Background
The HTTP 1.1 RFC includes a lot of new features to support caching. One of the headers we found interesting was the Vary header. It allows an origin server to tell the caching proxy that the URI will have variants (different encodings, formats, etc) as different representations of the resource. Basically, it allows your service to say, "Hey, for a URI, I support JSON, XML, and PDF, or I can give that to you compressed or not". Of course the formats can be whatever you choose, and further you can vary on whatever else you want. The key is that the Vary header acts as kind of a pointer and says, "I'm listing other headers that are the variants, and they will enumerate the specifics of what's different".
Sounds interesting, but it's damn hard to visualize, so let's make an example. Let's say our service has the following resource URI for customer orders:
/customers/{cust_id}/orders
And that the service can return the order data in different formats (mime-type); for argument's sake we'll support JSON, XML and PDF. Thus, a typical HTTP invocation would look like this (using an abbreviated HTTP message format here):
c>
GET /customers/123/orders HTTP 1.1
<xml data ...... />
OK, so now you have two variants of your resource in the caching proxy (squid), but, interestingly, they will expire at different wall clock times. It is important to know that the two will not "invalidate together" when one variant's TTL expires; they essentially live independent lives. Let's take an example (while it's possible for the origin server to return different max-age values for the Cache-Control header, for sanity's sake I'll make it 1 hour across the board.):
- request 1 asks for json data at 10:00am; it will be live in cache until 11:00am
- request 2 asks for xml data at 10:59am; it will live in cache until 11:59am
If you are worried about freshness of data (at whatever time scale), this is something to be aware of in squid. (Note: I'll be covering resource invalidation in a future installment of this series.)
Lessons Learned
We initially introduced with Vary because it allowed us to introduce new, more compact data representations without disturbing existing clients (that is, forcing them to now parse a new format), but the cache miss rate due to different clients requesting different formats coupled with the cross-country latency became a problem. However, if the latency to generate each variant is nontrivial either because of expensive database reads or, in our case, cross-country/cross-data center calls, each URI variant that may be used is going to amount to callers eating the pricey cost of the cache miss for a specific variant, even if the other variants are fresh in cache. Further, it becomes a burden on the service's engineers to figure out why some teams see "fast responses" (cache hits) and other teams see "your slow-ass service" (cache misses) as service owners need to keep the matrix of "what team consumes which variant" to figure out the rough math of how likely are they to get a cache hit.
If the latency per variant is inexpensive and the load on back-end resources is bearable, using Vary in your HTTP caching can open up some interesting data format possibilities and ways in which you can service clients.