Caching db results - need a nudge on how to start

Solution 1:

The MySQL query cache does indeed cache the results of queries: as long as a subsequent query is byte-for-byte identical to a cached one, the stored result is returned instead of the query actually being run. MySQL will also skip calculating the execution plan.

To use it, turn it on in your server's my.cnf by giving query_cache_size some memory and setting query_cache_type: with query_cache_type = 1, every cacheable SELECT is cached unless you put SQL_NO_CACHE in front of it, while with query_cache_type = 2 (DEMAND) only queries you mark with SQL_CACHE are cached.
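As a minimal sketch of using those hints from application code (assuming the mysql-connector-python package and made-up connection details and table names; note this only applies to MySQL 5.7 and earlier, since the query cache and its hints were removed in 8.0):

```python
# Sketch: issuing queries with query-cache hints from application code.
# Assumes MySQL 5.7 or earlier (the query cache was removed in 8.0) and the
# mysql-connector-python package; connection details and table names are made up.
import mysql.connector

conn = mysql.connector.connect(
    host="localhost", user="app", password="secret", database="shop"
)
cur = conn.cursor()

# With query_cache_type = 2 (DEMAND), only queries marked SQL_CACHE are cached.
cur.execute(
    "SELECT SQL_CACHE id, name FROM products WHERE category = %s", ("books",)
)
products = cur.fetchall()

# Skip the cache for a result that changes on every call.
cur.execute("SELECT SQL_NO_CACHE COUNT(*) FROM sessions")
(active_sessions,) = cur.fetchone()

cur.close()
conn.close()
```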

Any write (INSERT, UPDATE, DELETE, etc.) to a table will invalidate all of the entries in the query cache that came from that table.

One performance anomaly with the query cache: it uses an inefficient algorithm for finding entries within the cache, so creating a larger cache can actually result in worse performance. You should experiment with the size against your own data and query profiles, but the last time I ran this experiment I found that around 256MB was the sweet spot; larger or smaller than that and performance got worse. The manual suggests "tens of megabytes".
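If you want to run that experiment yourself, a rough sketch of doing it from Python (again assuming mysql-connector-python, a MySQL 5.7-or-earlier server, and an account allowed to set global variables; the 256MB figure is just a starting point):

```python
# Sketch: resize the query cache at runtime and watch the hit/miss counters.
# Assumes MySQL 5.7 or earlier and an account allowed to SET GLOBAL variables.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="admin", password="secret")
cur = conn.cursor()

# Try a candidate size (256MB here), then repeat the measurement with other sizes.
cur.execute("SET GLOBAL query_cache_size = %s", (256 * 1024 * 1024,))

# ... run a representative workload against the server here ...

# Qcache_hits vs. Com_select gives the hit rate; Qcache_lowmem_prunes shows
# how often entries were evicted because the cache was too small.
cur.execute("SHOW GLOBAL STATUS LIKE 'Qcache%'")
for name, value in cur.fetchall():
    print(name, value)

cur.close()
conn.close()
```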

You can also implement caching outside of MySQL using something like memcached. This is not transparent to the application: you have to add code that looks the key up in memcached first and, on a miss, runs the query against the database and writes the result back into memcached.
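As an illustration rather than a drop-in solution, here is a cache-aside sketch assuming the pymemcache and mysql-connector-python packages; the table, key name and TTL are invented:

```python
# Sketch of the cache-aside pattern: check memcached first, fall back to MySQL,
# then populate memcached so the next lookup is a hit.
# Assumes pymemcache and mysql-connector-python; all names are illustrative.
import json

import mysql.connector
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))
db = mysql.connector.connect(
    host="localhost", user="app", password="secret", database="shop"
)

def get_top_products(category, ttl=300):
    key = f"top_products:{category}"

    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: the database is never touched

    cur = db.cursor()
    cur.execute(
        "SELECT id, name FROM products WHERE category = %s "
        "ORDER BY sales DESC LIMIT 10",
        (category,),
    )
    rows = cur.fetchall()
    cur.close()

    # Cache miss: store the result with a TTL so stale entries age out on their own.
    cache.set(key, json.dumps(rows), expire=ttl)
    return rows
```

When the underlying table changes you either wait for the TTL to expire or delete the key explicitly (cache.delete(key)) from whatever code does the write.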

Solution 2:

You could go about this in several ways; there are probably others, but these two spring to mind instantly.

You could create another table to hold your cached results and, on a schedule, run the expensive queries which might join multiple tables with lots of records in them and dump the results into this table. Doing a SELECT * from a single table should be less expensive than doing a SELECT which joins 12 different tables, each with millions of records in them. While this doesn't exactly remove the work from the DB server, it reduces the computational load by only running the expensive queries on a schedule and having your regular clients fetch data from the cache table.
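A minimal sketch of such a scheduled refresh, meant to be kicked off by cron; the report_cache table, the join and every column name here are made up, and it assumes mysql-connector-python and InnoDB tables:

```python
# Sketch of a scheduled "cache table" refresh, intended to be run from cron.
# The report_cache table, the expensive join and all column names are made up;
# adapt both statements to the real schema. Assumes mysql-connector-python.
import mysql.connector

conn = mysql.connector.connect(
    host="localhost", user="app", password="secret", database="shop"
)
cur = conn.cursor()

# Rebuild the cache table inside one transaction so readers never see it
# half-empty (assuming InnoDB).
conn.start_transaction()
cur.execute("DELETE FROM report_cache")
cur.execute(
    """
    INSERT INTO report_cache (customer_id, customer_name, order_count, total_spent)
    SELECT c.id, c.name, COUNT(o.id), COALESCE(SUM(o.total), 0)
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    """
)
conn.commit()

cur.close()
conn.close()
```

A cron entry running this script (say, every 15 minutes) keeps the cache table reasonably fresh, and your regular clients only ever issue the cheap SELECT against report_cache.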

Alternatively, for a cache which is totally separate from your DB, you could implement something like Redis. This stores data in memory and should be super-quick, but it needs some extra logic in your application to use it as a data source instead of the real database. It scales well too - the Stack Exchange network utilises it quite effectively across their sites.
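The same cache-aside idea with Redis, again only as a sketch assuming the redis-py and mysql-connector-python packages, with an invented key scheme and query:

```python
# Sketch: cache-aside with Redis instead of memcached.
# Assumes the redis (redis-py) and mysql-connector-python packages;
# key names, TTL and the query are placeholders.
import json

import mysql.connector
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
db = mysql.connector.connect(
    host="localhost", user="app", password="secret", database="shop"
)

def get_order_summary(customer_id, ttl=300):
    key = f"order_summary:{customer_id}"

    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    cur = db.cursor()
    cur.execute(
        "SELECT COUNT(*), COALESCE(SUM(total), 0) FROM orders WHERE customer_id = %s",
        (customer_id,),
    )
    order_count, total_spent = cur.fetchone()
    cur.close()

    summary = {"order_count": order_count, "total_spent": float(total_spent)}
    # SETEX stores the value with an expiry, so stale entries age out automatically.
    r.setex(key, ttl, json.dumps(summary))
    return summary
```

One reason to pick Redis over memcached here is that it also offers persistence and richer data structures (hashes, sorted sets, and so on) if the cached data has more shape than a flat blob.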