Making Posterous faster with Varnish
Posterous serves an enormous amount of data every day. As the site grows, we have had to find new ways to improve performance, so we spent the last couple of months speeding things up any way we could. We stripped out much of the inline JavaScript that our theming engine generated, started using asset bundling and compression, and audited the site for inefficient database queries.
Finally, we decided to add full page caching to Posterous blogs. This has resulted in one of our largest performance boosts to date. How we accomplished full page caching on Posterous is the subject of today's post.
Enter Varnish
Varnish, often called an "HTTP Accelerator" or a "Reverse Proxy", is the mechanism we chose to speed up Posterous. Before Varnish, we relied only on fragment caching to save precious CPU cycles. Now that Varnish is in place, we are caching entire pages. This takes a huge amount of load off our application and database servers, and results in noticeable speed increases for our users.
To be precise: our tests have shown that pages served out of Varnish see a ~67% speed improvement in total page load time.
Configuration
For those not familiar with Varnish, it is configured using a file known as a "VCL (Varnish Configuration Language)" file. VCL resembles nginx config files, and is divided into subroutines (e.g. vcl_hash, vcl_recv, vcl_pipe).
You can learn more about VCL here.
We have included the actual VCL file we are using at the bottom of this post.
Dynamic content
While implementing a full page cache, we had quite a few challenges. The first and most important one was how we were going to render dynamic content on pages that are statically cached. For example, in the upper right corner of Posterous sites, there are elements that are unique to each user (like the list of their Posterous sites, their name, etc.). Since we only cache one version of a page, we cannot include this personalized information in the page that gets cached.
We addressed this by gathering information about the static page being served and making a secondary AJAX call to our servers to fetch the user's logged-in state, post view counts, and other dynamic content. Since this AJAX call is significantly less processor-intensive, and is made only after the page finishes rendering, users see a significant increase in performance.
To make our lives a little easier, we also removed all inline JavaScript generated by theme elements. We used HTML5's new data- attributes to add information to HTML elements directly so they can be more efficiently interpreted by JavaScript.
Private posts on blog index pages
Another challenge we faced was allowing site owners and contributors to view private posts within their blog list pages. Since we want to store only one version of a page in the cache, our only alternative was to disable caching when users are looking at one of their own sites. Plain VCL didn't have a mechanism for accomplishing this, so we turned to a really powerful feature of VCL: the ability to embed C code directly in the configuration file.
This snippet did the trick:
C{
char *host = VRT_GetHdr(sp, HDR_REQ, "\005Host:");
char *cookie = VRT_GetHdr(sp, HDR_REQ, "\007Cookie:");
char* result = NULL;
if (cookie == NULL) {
cookie = "";
}
if (host == NULL) {
host = "";
}
result = strstr(cookie, host);
if (result != NULL) {
VRT_SetHdr(sp, HDR_REQ, "\013X-No-Cache:", "YES", vrt_magic_string_end);
}
}C
if (req.http.X-No-Cache ~ "YES") {
return(pass);
}
Essentially, we write the list of all sites a user owns to that user's cookie. If the currently-viewed hostname (the current site) appears in this cookie, we simply bypass the cache.
We did it this way because even if a malicious user spoofed this cookie, the worst that could happen is that they would see the non-cached version of the page, and since the malicious user isn't actually logged in as the site owner, they won't see any of the private posts.
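To make the matching rule concrete, here is a minimal standalone sketch of the same check in plain C, outside of Varnish. The function name `should_bypass_cache` and the cookie value in the usage note are hypothetical; the post does not show the real cookie name or format. The logic mirrors the inline C above: a missing header is treated as an empty string, and the cache is bypassed whenever the host appears anywhere in the cookie.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the inline C in the VCL snippet.
 * Returns 1 when the request should bypass the cache, 0 otherwise.
 * A NULL header is treated as an empty string, as in the original. */
static int should_bypass_cache(const char *host, const char *cookie)
{
    if (cookie == NULL) {
        cookie = "";
    }
    if (host == NULL) {
        host = "";
    }
    /* Plain substring search: the host only has to appear somewhere
     * in the Cookie header for the request to be passed. */
    return strstr(cookie, host) != NULL;
}
```

With an assumed cookie of `owned_sites=alice.posterous.com`, calling the function with the host `alice.posterous.com` returns 1 and with `bob.posterous.com` returns 0. Note that `strstr` with an empty needle always matches, so a request with no Host header also bypasses the cache. That errs in the safe direction, since the worst outcome is an uncached page.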
Lacquer
On the back-end, we are using a gem called Lacquer. The gem was originally developed by Russ Smith (original gem). Lacquer communicates with the Varnish administration port to deal with the purging of stale pages. It also makes it easy to instruct Varnish that an outbound page should (or should not) be cached.
We found there were several limitations and a few bugs with the original gem, so we forked it and added some enhancements, including the ability to purge to multiple Varnish servers in a performant way. You can check out our forked version here.
Wrapping up
At Posterous, we are always looking for awesome ways to improve our service. Varnish is just one of the many enhancements we have added, and we will be writing about other ones very soon.
As always, we love feedback! If you have any suggestions for us, or questions about Varnish, please don't hesitate to ask.
If you found this post interesting, know that Posterous is hiring Infrastructure Engineers, Front-end Engineers, and more.
Reference
Here is the Posterous VCL file, in its full glory:
# This is a basic VCL configuration file for varnish. See the vcl(7)
#man page for details on VCL syntax and semantics.
#
#Default backend definition. Set this to point to your content
#server.
#
C{
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
}C
backend default {
.host = "127.0.0.1";
.port = "8282";
.max_connections = 2000;
.connect_timeout = 600s;
.first_byte_timeout = 600s;
.between_bytes_timeout = 600s;
}
sub vcl_hash {
### these 2 entries are the default ones used for vcl. Below we add our own.
set req.hash += req.url;
set req.hash += req.http.host;
# This will make sure that the mobile version of our site gets cached under a different hash
if (req.http.cookie ~ "mobile_view=true") {
set req.hash += "mobile";
}
if (req.http.user-agent ~ "(?i)palm|blackberry|nokia|phone|midp|mobi|symbian|chtml|ericsson|minimo|audiovox|motorola|samsung|telit|upg1|windows ce|ucweb|astel|plucker|x320|x240|j2me|sgh|portable|sprint|docomo|kddi|softbank|android|mmp|pdxgw|netfront|xiino|vodafone|portalmmm|sagem|mot-|sie-|ipod|up\\.b|webos|amoi|novarra|cdm|alcatel|pocket|iphone|mobileexplorer|mobile" && !(req.http.user-agent ~ "(?i)ipad") && !(req.http.cookie ~ "full_site=true")) {
set req.hash += "mobile";
}
return(hash);
}
#
# Handling of requests that are received from clients.
# First decide whether or not to lookup data in the cache.
#
sub vcl_recv {
# Pipe requests that are non-RFC2616 or CONNECT which is weird.
if (req.request != "GET" &&
req.request != "HEAD" &&
req.request != "PUT" &&
req.request != "POST" &&
req.request != "TRACE" &&
req.request != "OPTIONS" &&
req.request != "DELETE") {
return(pipe);
}
# Pass requests that are not GET or HEAD
if (req.request != "GET" && req.request != "HEAD") {
return(pass);
}
# Pass requests for blog pages greater than page 3
if (req.url ~ "page=([4-9]|[1-9][0-9]+)$") {
return(pass);
}
# Never cache private posts
if (req.url ~ "\/private\/") {
return(pass);
}
# Don't cache the result of a redirect
if (req.http.Referer ~ "jumpto" || req.http.Origin ~ "poster") {
return(pass);
}
# Since we don't want site owners and contributors to view the
# cached version of their site, we match a special cookie we set
# with the current host. This assures that site owners see
# private posts, while other users do not.
C{
char *host = VRT_GetHdr(sp, HDR_REQ, "\005Host:");
char *cookie = VRT_GetHdr(sp, HDR_REQ, "\007Cookie:");
char* result = NULL;
if (cookie == NULL) {
cookie = "";
}
if (host == NULL) {
host = "";
}
result = strstr(cookie, host);
if (result != NULL) {
VRT_SetHdr(sp, HDR_REQ, "\013X-No-Cache:", "YES", vrt_magic_string_end);
}
}C
if (req.http.X-No-Cache ~ "YES") {
return(pass);
}
#
# Everything below here should be cached
#
# Handle compression correctly. Varnish treats headers literally, not
# semantically. So it is very well possible that there are cache misses
# because the headers sent by different browsers aren't the same.
# @see: http://varnish.projects.linpro.no/wiki/FAQ/Compression
if (req.http.Accept-Encoding) {
if (req.http.Accept-Encoding ~ "gzip") {
# if the browser supports it, we'll use gzip
set req.http.Accept-Encoding = "gzip";
} elsif (req.http.Accept-Encoding ~ "deflate") {
# next, try deflate if it is supported
set req.http.Accept-Encoding = "deflate";
} else {
# unknown algorithm. Probably junk, remove it
remove req.http.Accept-Encoding;
}
}
# Clear cookie and authorization headers, set grace time, lookup in the cache
#unset req.http.Cookie;
#unset req.http.Authorization;
set req.grace = 1s;
return(lookup);
}
#
# Called when entering pipe mode
#
sub vcl_pipe {
# If we don't set the Connection: close header, any following
# requests from the client will also be piped through and
# left untouched by varnish. We don't want that.
set req.http.connection = "close";
return(pipe);
}
#
# Called when the requested object has been retrieved from the
# backend, or the request to the backend has failed
#
sub vcl_fetch {
# Comments are now fetched via ESI.
esi;
# Do not cache the object if the backend application does not want us to.
if (beresp.http.Cache-Control ~ "(no-cache|no-store|private|must-revalidate)") {
return(pass);
}
# Do not cache the object if the status is not in the 200s
if (beresp.status >= 300) {
# Remove the Set-Cookie header
#remove beresp.http.Set-Cookie;
return(pass);
}
#
# Everything below here should be cached
#
# Don't cache the comments ESI
if (req.url ~ "\/posts\/comments") {
set beresp.ttl = 0s;
}
# Remove the Set-Cookie header
remove beresp.http.Set-Cookie;
# Set the grace time
set beresp.grace = 1s;
# Static assets aren't served out of Varnish just yet, but when they are, this will
# make sure the browser caches them for a long time.
if (req.url ~ "\.(css|js|jpg|jpeg|gif|ico|png)\??\d*$") {
/* Remove Expires from backend, it's not long enough */
unset beresp.http.expires;
/* Set the client's TTL on this object */
set beresp.http.cache-control = "public, max-age=31536000";
/* marker for vcl_deliver to reset Age: */
set beresp.http.magicmarker = "1";
} else {
set beresp.http.Cache-Control = "private, max-age=0, must-revalidate";
set beresp.http.Pragma = "no-cache";
}
# return(deliver); the object
return(deliver);
}
sub vcl_deliver {
if (resp.http.magicmarker) {
/* Remove the magic marker */
unset resp.http.magicmarker;
/* By definition we have a fresh object */
set resp.http.age = "0";
}
# Add a header to indicate a cache HIT/MISS
if (obj.hits > 0) {
set resp.http.X-Cache = "HIT";
} else {
set resp.http.X-Cache = "MISS";
}
}
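The Accept-Encoding handling in vcl_recv is worth a closer look, since it is what keeps different browsers from producing separate cache entries for the same page. As a rough illustration only, the same normalization can be written as a small C function. The name `normalize_accept_encoding` is hypothetical and is not part of Varnish or of our configuration; it simply restates the three branches of the VCL as a substring check.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical restatement of the Accept-Encoding branches in vcl_recv.
 * Returns the normalized header value, or NULL when the header is
 * absent or should be removed because no known algorithm is listed. */
static const char *normalize_accept_encoding(const char *value)
{
    if (value == NULL) {
        return NULL;            /* no header sent, nothing to normalize */
    }
    if (strstr(value, "gzip") != NULL) {
        return "gzip";          /* prefer gzip whenever it is offered */
    }
    if (strstr(value, "deflate") != NULL) {
        return "deflate";       /* otherwise fall back to deflate */
    }
    return NULL;                /* unknown algorithm: drop the header */
}
```

Every request ends up with one of three values (gzip, deflate, or no header at all), so header strings such as `gzip, deflate` and `gzip,deflate,sdch` are treated identically instead of fragmenting the cache.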