How to get the total number of rows of a GROUP BY query?

Here is the solution for you

$sql="SELECT count(*) FROM [tablename] WHERE key == ? ";
$sth = $this->db->prepare($sql);
$sth->execute(array($key));
$rows = $sth->fetch(PDO::FETCH_NUM);
echo $rows[0];

It's a little memory-inefficient but if you're using the data anyway, I use this frequently:

$rows = $q->fetchAll();
$num_rows = count($rows);

The method I ended up using is very simple:

$query = 'SELECT a, b, c FROM tbl WHERE oele = 2 GROUP BY boele';
$nrows = $db->query("SELECT COUNT(1) FROM ($query) x")->fetchColumn();

Might not be the most efficient, but it seems to be foolproof, because it actually counts the original query's results.


I don't use PDO for MySQL and PgSQL, but I do for SQLite. Is there a way (without completely changing the dbal back) to count rows like this in PDO?

Accordingly to this comment, the SQLite issue was introduced by an API change in 3.x.

That said, you might want to inspect how PDO actually implements the functionality before using it.

I'm not familiar with its internals but I'd be suspicious at the idea that PDO parses your SQL (since an SQL syntax error would appear in the DB's logs) let alone tries to make the slightest sense of it in order to count rows using an optimal strategy.

Assuming it doesn't indeed, realistic strategies for it to return a count of all applicable rows in a select statement include string-manipulating the limit clause out of your SQL statement, and either of:

  1. Running a select count() on it as a subquery (thus avoiding the issue you described in your PS);
  2. Opening a cursor, running fetch all and counting the rows; or
  3. Having opened such a cursor in the first place, and similarly counting the remaining rows.

A much better way to count, however, would be to execute the fully optimized query that will do so. More often than not, this means rewriting meaningful chunks of the initial query you're trying to paginate -- stripping unneeded fields and order by operations, etc.

Lastly, if your data sets are large enough that counts any kind of lag, you might also want to investigate returning the estimate derived from the statistics instead, and/or periodically caching the result in Memcache. At some point, having precisely correct counts is no longer useful...