ArrayFormula is breaking the getLastRow() function. Possible workarounds?

In my spreadsheet, I have a script that relies on the getLastRow() function as an essential part of its logic.

Ever since I added an array formula to one of my columns, getLastRow() no longer works as expected. The array formula seems to apply all the way to the bottom of the sheet, even where the other columns hold no values, so getLastRow() returns the last row reached by the array formula instead of the last row that actually contains data.

Writing a slow function that checks which cells are empty is not an option for me: the sheet has tens of thousands of rows, and the script would run out of time.

Does anyone have any suggestions for a workaround?

Here is the ARRAYFORMULA:

=ArrayFormula(IF(A2:A="",,WEEKNUM(A2:A, 2)))

Solution 1:

Issue:

  • The traditional ARRAYFORMULA(IF(A:A="",...)) pattern adds empty strings to all the available rows, which is undesirable here

Solution:

  • Using ARRAYFORMULA with INDEX/COUNTA (to determine the last row that's needed) ensures the formula is filled only up to the needed row, instead of merely hiding its output in the remaining rows

  • INDEX/COUNTA: INDEX returns a value as well as a cell reference. A2:INDEX(A2:A,COUNTA(A2:A)) => If COUNTA(...) returns 10 => A2:INDEX(A2:A,10) => A2:A11 is the final reference fed to WEEKNUM

  • Assuming there are no blanks in between your data,

    =ARRAYFORMULA(WEEKNUM(A2:INDEX(A2:A,COUNTA(A2:A)),2))
    
  • Another alternative is to use ARRAY_CONSTRAIN/COUNTA:

    =ARRAY_CONSTRAIN(ARRAYFORMULA(WEEKNUM(A2:A, 2)),COUNTA(A2:A),1)
    
  • The usage of COUNTA assumes there are no blank cells in between. If there are any, you may need to add an offset manually. For example, if there are two blank cells, add 2 to COUNTA:

    A2:INDEX(A2:A,COUNTA(A2:A)+2)
    

Unless Google does inbuilt optimizations, INDEX/COUNTA is preferred over ARRAY_CONSTRAIN.


It might be hard to convert existing array formulas to INDEX/COUNTA manually, so I made a script. It is only a proof of concept of alpha quality, so test it on a copy of your spreadsheet rather than on the original. Having said that, I'm sure it'll handle most common cases without trouble.

/**
 * @see https://stackoverflow.com/a/46884012
 */
function fixArrayFormulas_so46884012() {
  const ss = SpreadsheetApp.getActive()/*.getSheetByName('Sheet1')*/,
    map = new Map([
      [
        // Normalize first part of range
        /* A:F */ String.raw`([a-z]+):([a-z]+)`,
        /* A1:F*/ String.raw`$11:$2`,
      ],
      [
        // Convert any previous index/counta to normal ranges
        /* A1:INDEX(F:F,COUNTA(F:F)) */ String.raw`([a-z]+\d+):INDEX\(([a-z]+)\d*:\w+,COUNTA\(\w+:\w+\)\)`,
        /*A1:F*/ String.raw`$1:$2`,
      ],
      [
        // Convert open-ended ranges to index/counta ranges
        /*A1:F*/ String.raw`([a-z]+\d+:)([a-z]+)`,
        /* A1:INDEX(F:F,COUNTA(F:F)) */ `$1INDEX($2:$2,COUNTA($2:$2))`,
      ],
    ]);
  map.forEach((v, k) =>
    ss
      .createTextFinder(k)
      .matchFormulaText(true)
      .useRegularExpression(true)
      .replaceAllWith(v)
  );
}
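
To see what the third rule in the map does to a formula, here is a minimal sketch in plain JavaScript. It assumes that a case-insensitive JavaScript RegExp behaves like the TextFinder regular expression for this pattern, which has not been checked against the spreadsheet engine. The name toIndexCounta is a hypothetical helper, not part of the script above.

```javascript
// Hypothetical helper: applies only the third rule from the map above
// (open-ended range -> INDEX/COUNTA range) to a formula string.
// Assumption: a case-insensitive JS RegExp stands in for TextFinder's regex.
function toIndexCounta(formula) {
  const openEndedRange = /([a-z]+\d+:)([a-z]+)/gi;
  return formula.replace(openEndedRange, '$1INDEX($2:$2,COUNTA($2:$2))');
}

const before = '=ARRAYFORMULA(WEEKNUM(A2:A,2))';
const after = toIndexCounta(before);
console.log(after);
```

In this sketch the pattern also matches the beginning of a closed range such as A2:B10, so formulas with closed ranges are worth checking by hand after a run.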

Solution 2:

Another solution is to temporarily remove the ArrayFormulas with

sheet.getRange("location of array formula").setValue('');

Then calculate lastRow

var lastRow = sheet.getLastRow();

Then restore the array formula

sheet.getRange("location of array formula").setFormula('the formula');
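
The three steps above could be wrapped in one function so the formula is always put back, even if reading the last row throws. This is a sketch under assumptions: getLastRowWithoutArrayFormula and its parameter formulaA1 are hypothetical names, and the call to SpreadsheetApp.flush() is a precaution rather than a confirmed requirement.

```javascript
// Hypothetical wrapper around the three steps above.
// sheet:     a Sheet object
// formulaA1: A1 address of the cell that holds the array formula
function getLastRowWithoutArrayFormula(sheet, formulaA1) {
  const cell = sheet.getRange(formulaA1);
  const formula = cell.getFormula(); // remember the formula text
  cell.setValue('');                 // step 1: remove it temporarily
  try {
    // Assumption: pending writes may need flushing before the read.
    if (typeof SpreadsheetApp !== 'undefined') SpreadsheetApp.flush();
    return sheet.getLastRow();       // step 2: read the last row
  } finally {
    cell.setFormula(formula);        // step 3: always restore the formula
  }
}
```

Reading the formula with getFormula() first avoids hard-coding the formula text in the script.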

Solution 3:

Here is a function you can use to determine the "true" lastRow and lastColumn values of a sheet. It handles both messy ArrayFormula() output and merged cells.

function getSheetBoundaries2(sheet) {
  var dim = { lastColumn: 1, lastRow: 1 };
  sheet.getDataRange().getMergedRanges()
    .forEach(function (e) {
      var lastColumn = e.getLastColumn();
      var lastRow = e.getLastRow();
      if (lastColumn > dim.lastColumn) dim.lastColumn = lastColumn;
      if (lastRow > dim.lastRow) dim.lastRow = lastRow;
    });
  var rowCount = sheet.getMaxRows();
  var columnCount = sheet.getMaxColumns();
  var dataRange = sheet.getRange(1, 1, rowCount, columnCount).getValues();
  for (var rowIndex = rowCount; rowIndex > 0; rowIndex -= 1) {
    var row = dataRange[rowIndex - 1];
    if (row.join('').length > 0) {
      for (var columnIndex = columnCount; columnIndex > dim.lastColumn; columnIndex -= 1) {
        if (("" + row[columnIndex - 1]).length > 0) dim.lastColumn = columnIndex;
      }
      if (dim.lastRow < rowIndex) dim.lastRow = rowIndex;
    }
  }
  return dim;
}
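
When a single column that is not filled by the array formula decides where the data ends (column A in the question), a lighter variant is to read only that column once and scan it from the bottom. This is a minimal sketch, assuming the data starts in row 1 of column A; countRowsWithData is a hypothetical helper name.

```javascript
// Hypothetical helper: given the 2D array returned by getValues() for a
// single key column, return the 1-based position of the last non-empty
// cell, or 0 if every cell is empty.
function countRowsWithData(columnValues) {
  for (let i = columnValues.length - 1; i >= 0; i -= 1) {
    if (String(columnValues[i][0]).length > 0) return i + 1;
  }
  return 0;
}

// Usage inside Apps Script (assumes the key column is A, starting at row 1):
// const values = sheet.getRange(1, 1, sheet.getMaxRows(), 1).getValues();
// const lastRow = countRowsWithData(values);
```

The scan itself runs in memory on one getValues() result, so it does not issue a spreadsheet call per cell.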