Last month, Jason Yee met with Fastly’s Senior Professional Services Engineer and Varnish Wizard Rogier “DocWilco” Mulhuijzen, and Lead Customer Engineer and Cat Herder, Austin Spires. They talked about everything from Varnish to CDNs and understanding performance. A lot of interesting things came up. Here are the highlights of that discussion:
Varnish implements a subset of the Edge Side Includes (ESI) language, which lets you mark fragments of a page to be cached (or not cached) separately from the rest of the page. Imagine, for instance, an ecommerce website — the classic example — where there's always a little shopping cart in the upper right corner. It would be amazing if you could cache the product pages but not the shopping cart, so that you can serve the product page to everyone who visits it and have the shopping cart updated separately. These days, a lot of people do that with AJAX calls back to the backend, but it's also possible to do it with ESI.

Another feature that we really like is VCL, the Varnish Configuration Language. Instead of a config file in which you set some flags or paths, it's actually a programming language, albeit a very domain-specific one. When you're working with HTTP, VCL is really powerful. On top of that, VCL is compiled to C and then compiled to a shared library, which is loaded into the Varnish process. Every request is processed through that little library, and because it's all machine code, it's very fast.
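As a minimal sketch of the shopping-cart scenario (the `/cart` and `/product/` paths are illustrative, and the syntax here is Varnish 4 VCL): the product page would embed the cart with an ESI tag such as `<esi:include src="/cart" />`, and the VCL would pass the cart fragment to the backend while enabling ESI processing on product pages:

```vcl
sub vcl_recv {
    # Never cache the shopping-cart fragment; always fetch it fresh
    # (the /cart path is an assumption for this example)
    if (req.url ~ "^/cart") {
        return (pass);
    }
}

sub vcl_backend_response {
    # Process ESI tags in product pages, so the cart include is
    # resolved per request while the page itself stays cached
    if (bereq.url ~ "^/product/") {
        set beresp.do_esi = true;
    }
}
```

With this in place, the product page is served from cache while each `<esi:include>` triggers a fresh, uncached fetch of the cart fragment.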
The main differentiator is globalization. Your website or mobile application will see improved performance if you can push content closer to your end users. We like to say that when it comes to content delivery, CDNs beat the speed of light. This especially matters when you take encrypted connections, such as TLS, into consideration. If you don't have an existing connection to the server, a request is going to take four round trips. If you're coming from Europe and getting routed to a server on the West Coast of the US, that's going to end up being almost 500 milliseconds for a single request. If you put that request through a CDN (and don't use any caching at this point), you're still going to get it down to about 200 milliseconds in the worst case. That's a big, big upside of using a CDN, even for your dynamic content.
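To make the round-trip arithmetic concrete, here's a back-of-envelope sketch; the round-trip times are assumed values for illustration, not measurements:

```python
# Back-of-envelope latency for an uncached HTTPS request.
# Both RTT values below are illustrative assumptions.
RTT_EU_TO_US_WEST = 0.125  # seconds, Europe <-> US West Coast
RTT_EU_TO_EDGE = 0.020     # seconds, Europe <-> a nearby CDN POP

# Without a CDN: TCP handshake (1 RTT) + TLS handshake (2 RTTs)
# + the request/response itself (1 RTT), all crossing the ocean.
direct = 4 * RTT_EU_TO_US_WEST

# With a CDN terminating TLS at the edge: the three handshake RTTs
# stay local, and only the request itself travels to the origin
# (assuming the edge keeps a warm connection to the origin).
via_cdn = 3 * RTT_EU_TO_EDGE + RTT_EU_TO_US_WEST

print(f"direct: {direct * 1000:.0f} ms")    # ~500 ms
print(f"via CDN: {via_cdn * 1000:.0f} ms")  # ~185 ms
```

The exact numbers depend on the client's distance to the POP and the TLS version in play, but the shape of the win is the same: the handshake round trips happen over a short local path instead of a transoceanic one.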
If you start using a CDN with fast purging, you can cache event-driven content as well as static content. You can cache that content locally to all of your users, which makes it load really fast, and when you update the content through a purge, the purge should propagate to all of the cache servers immediately.
For example, take a wiki for a popular TV show. Right around the time an episode airs, fans are going to update the wiki pages with new information about plot changes, characters, etc. Then, for about a week, there's not going to be a high rate of change. If you were to put that page on a CDN and purge immediately as changes occur, every edit a user makes would be visible to readers right away. Then, when the edits stop, the page stays cached until the next edit.
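The wiki scenario can be modeled with a toy in-process cache (purely illustrative — a real CDN purge goes through the CDN's purge API, not a Python dict):

```python
# Toy model of purge-on-edit caching. Not a real CDN API; it just
# illustrates the lifecycle: miss -> cache -> serve -> purge on edit.
class EdgeCache:
    def __init__(self, origin):
        self.origin = origin  # callable: path -> current content
        self.store = {}       # stands in for the edge cache

    def get(self, path):
        if path not in self.store:           # cache miss: go to origin
            self.store[path] = self.origin(path)
        return self.store[path]              # cache hit otherwise

    def purge(self, path):
        # Event-driven invalidation: the next get() refetches from origin
        self.store.pop(path, None)

# Hypothetical wiki page and its edit cycle
pages = {"/wiki/episode-42": "v1"}
cache = EdgeCache(pages.get)

cache.get("/wiki/episode-42")     # "v1": fetched from origin and cached
pages["/wiki/episode-42"] = "v2"  # a fan edits the page at the origin...
cache.purge("/wiki/episode-42")   # ...which triggers an immediate purge
cache.get("/wiki/episode-42")     # "v2": readers see the edit right away
```

Between edits the page is served entirely from cache; the purge is the only thing keeping it fresh, so no time-based expiry is needed.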
Applications and websites are not static anymore. Applications involve user interaction, either on a mobile device or on a desktop site, and that interaction impacts other users. It's what the internet has become: the changes that individuals make on a site or application ripple through the entire system. When you're caching applications like these, proper caching and invalidation are essential. There's increasingly more event-driven content to factor in when architecting your application.
Common examples of event-driven content are user-based (editing a news article, for example), but machine- or computation-driven changes make up a significant portion of it as well. That's where the current micro-services movement fits in. An event-driven service could be rooted in human activity, or it could be rooted in an external event. When architecting new services that leverage micro-services, invalidation and proper cache management matter even more.
As we mentioned, what CDNs are used for is changing — people are starting to think about using CDNs for whole pages. The next step is to realize that API calls are just HTTP requests, and their responses are just objects, so you can use a CDN for that as well. When you go down to the micro-service level, things are very manageable. Because of the smaller scale, it’s easier to keep track of relationships and caching rules. For example, if a change impacts a specific piece of data within a specialized micro-service, you know exactly what to purge from your CDN.
Configuration. You need to be able to influence your configuration in a very direct way. Also consider how to prepare your origin to better leverage a CDN's abilities: set proper Cache-Control headers and combine them with headers that give the CDN its own, different cache times, such as Edge-Control or Surrogate-Control. These settings are CDN-agnostic and shouldn't affect how you select a CDN, but they're definitely important, and they're great web standards to explore.
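For example (the lifetimes here are illustrative), an origin response could tell browsers to cache a page briefly while letting the CDN hold it much longer and rely on purges for freshness:

```http
HTTP/1.1 200 OK
Content-Type: text/html
Cache-Control: max-age=60
Surrogate-Control: max-age=86400
```

Browsers revalidate after a minute, while the CDN can serve the object for a day (or until it's purged); the Surrogate-Control header is generally consumed at the edge rather than forwarded to the client.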
Evaluate your favorite frameworks and tools. For example, Rails, WordPress, and Django all have similar but subtly different caching rules. Explore those; understand how they're going to affect your application's performance, and get a deeper understanding of your framework and of HTTP in general. That's the first step to dipping your toe in the water and really understanding HTTP performance.