What is caching?

I'm constantly hearing about person y had performance issue x which they solved through caching.

Or, how doing x,y,z in your programs code can hurt your caching ability.

Even in one of the latest podcasts, Jeff Atwood talks about how they cache certain values for speedy retrieval.

There seems to be some ambiguity in the terms "cache" and "caching" and it has led me to be confused about it's meaning in different cases. Whether you are referring to application or database caching, cpu, etc and what that means.

What is caching and what are the different types?

From context I can get a sense of it, to store an oft retrieved value into main memory and have quicklook up access to it. However, what is it really?

This word seems to be used in a lot of different contexts with slightly different meaning (cpu, database, application, etc) and I'm really looking to clear it up.

Is there a distinction between how caching works in your applications vs your database caching?

When someone says that they found a piece of code that would hurt caching and after they fixed it, it improved the speed of their app, what are they talking about?

Is the program's caching something that is done automatically? How do you allow values to be cached in your programs? I've often read users on this site say that they cached a value in their application, I sit here and wonder what they mean.

Also, what does it really mean when someone talks about database caching? Is this simply a feature they turn on in their database? Do you have to explicitly cache values or does the database pick which ones to cache for you?

How do I begin caching items myself to improve performance?

Can you give me some examples of how I can begin caching values in my applications? Or again, is this something that is already done, under the hood and I simply have to write my code in a particular way to allow "caching"?

What about database caching, how do I begin that? I've heard about things like memcache. Is this type of utility required to cache in databases?

I'm looking to get a good distinction between caching in applications vs databases, how they are used and how it is implemented in both cases.


Solution 1:

Caching is just the practice of storing data in and retrieving data from a high-performance store (usually memory) either explicitly or implicitly.

Let me explain. Memory is faster to access than a file, a remote URL (usually), a database or any other external store of information you like. So if the act of using one of those external resources is significant then you may benefit from caching to increase performance.

Knuth once said that premature optimization is the root of all evil. Well, premature caching is the root of all headaches as far as I'm concerned. Don't solve a problem until you have a problem. Every decision you make comes at a cost that you'll pay to implement it now and pay again to change it later so the longer you can put off making a deicsion and changing your system the better.

So first identify that you actually have a problem and where it is. Profiling, logging and other forms of performance testing will help you here. I can't stress enough how important this step is. The number of times I've seen people "optimize" something that isn't a problem is staggering.

Ok, so you have a performance problem. Say your pages are running a query that takes a long time. If it's a read then you have a number of options:

  • Run the query as a separate process and put the result into a cache. All pages simply access the cache. You can update the cached version as often as is appropriate (once a day, once a week, one every 5 seconds, whatever is appropriate);
  • Cache transparently through your persistence provider, ORM or whatever. Of course this depends on what technology you're using. Hibernate and Ibatis for example support query result caching;
  • Have your pages run the query if the result isn't in the cache (or it's "stale", meaning it is calculated longer ago than the specified "age") and put it into the cache. This has concurrency problems if two (or more) separate processes all decide they need to update the result so you end up running the same (expensive) query eight times at once. You can handle this locking the cache but that creates another performance problem. You can also fall back to concurrency methods in your language (eg Java 5 concurrency APIs).

If it's an update (or updates take place that need to be reflected in your read cache) then it's a little more complicated because it's no good having an old value in the cache and a newer value in the database such that you then provide your pages with an inconsistent view of the data. But broadly speaking there are four approaches to this:

  • Update the cache and then queue a request to update the relevant store;
  • Write through caching: the cache provider may provide a mechanism to persist the update and block the caller until that change is made; and
  • Write-behind caching: same as write-through caching but it doesn't block the caller. The update happens asynchronously and separately; and
  • Persistence as a Service models: this assumes your caching mechanism supports some kind of observability (ie cache event listeners). Basically an entirely separate process--unknown to the caller--listens for cache updates and persists them as necessary.

Which of the above methodologies you choose will depend a lot on your requirements, what technologies you're using and a whole host of other factors (eg is clustering and failover support required?).

It's hard to be more specific than that and give you guidance on what to do without knowing much more detail about your problem (like whether or not you have a problem).

Solution 2:

You will most likely read about caching in the context of web applications. Because of the nature of the Web, caching can make a big performance difference.

Consider the following:

A web page request gets to the web server, which passes the request on to the application server, which executes some code that renders the page, which needs to turn to the database to dynamically retrieve data.

This model does not scale well, because as the number of requests for the page goes up, the server has to do the same thing over and over again, for every request.

This becomes even more of an issue if web server, application server, and database are on different hardware and communicate over the network with each other.

If you have a large number of users hitting this page, it makes sense to not go all the way through to the database for every request. Instead, you resort to caching at different levels.

Resultset Cache

Resultset caching is storing the results of a database query along with the query in the application. Every time a web page generates a query, the applications checks whether the results are already cached, and if they are, pulls them from an in-memory data set instead. The application still has to render the page.

Component Cache

A web page is comprised of different components - pagelets, or whatever you may want to call them. A component caching strategy must know what parameters were used to request the component. For instance, a little "Latest News" bar on the site uses the user's geographical location or preference to show local news. Consequently, if the news for a location is cached, the component does not need to be rendered and can be pulled from a cache.

Page Cache

One strategy for caching entire pages is to store the query string and/or header parameters along with the completely renderered HTML. The file system is fast enough for this - it is still way less expensive for a web server to read a file than to make a call to the application server to have the page rendered. In this case, every user who sends the same query string will get the same cached content.

Combining these caching strategies intelligently is the only way to create really scalable web apps for large numbers of concurrent users. As you can easily see, the potential risk here is that if a piece of content in the cache cannot be uniquely identified by it's key, people will start to see the wrong content. This can get pretty complicated, particularly when users have sessions and there is security context.

Solution 3:

There are two meanings that I know of.


One is application caching. This is when, if data is slow to get from somewhere (e.g. from over the network) or slow to calculate, then the application caches a copy of the data (so that it doesn't need to get it again or to recalculate: it's already cached). Implementing a cache takes a bit of extra application software (logic to use the cache) and extra memory (in which to store the cached data).

That's "caching" being used as you're quoting here:

From context I can get a sense of it, to store an oft retrieved value into main memory and have quicklook up access to it.


Another is CPU caching, which is described in this Wikipedia article. CPU caching happens automatically. If you do a lot of reading from a small amount of memory, then the CPU can do most of those reads from its cache. OTOH if you read from a large amount of memory, it can't all fit in the cache and the CPU must spend more time working with the slower memory.

That's "caching" being used as you're quoting here:

When someone says that they found a piece of code that would hurt caching and after they fixed it, it improved the speed of their app, what are they talking about?

It means they found a way to rearrange their code to cause fewer Cache misses.


As for database caching, I don't know.