Web Application Performance
Over the past few months, I've become curious about web application performance. This interest led me to read the book High-Performance Browser Networking, give a tech talk to our engineering department about the topic, and begin work on improving our web application load times. This post details some of the concepts I learned and action items that apply to most networked applications.
Why care about web application performance?
This is a bit of a rhetorical question but worth briefly highlighting. Faster websites bring a myriad of benefits: better engagement, better user retention, and higher conversion rates, to name a few. There are many examples of tech companies seeing huge ROI from improved performance. Simply put, speed is a feature and should be treated as such.
Speed and performance are relative terms, and time can be subjective. The following table, adapted from High-Performance Browser Networking, highlights delay and user perception.

Delay             User perception
0-100 ms          Instant
100-300 ms        Small perceptible delay
300-1000 ms       Machine is working
1,000+ ms         Likely mental context switch
10,000+ ms        Task is abandoned
For an application to feel instant, a perceptible response to the user input must be provided within hundreds of milliseconds. After a second or more, the user's flow and engagement with the initiated task is broken, and after 10 seconds have passed, unless progress feedback is provided, the task is frequently abandoned.
A few hundred milliseconds does not leave us with a lot of time. However, it's worth noting that we need to provide a "perceptible response" really quickly. This does not mean that we need to load our entire application within 300ms (although if you can do that, you should). We could, for example, render some static HTML and allow the user to scroll the page, while some asynchronous tasks are still running and before our site is fully interactive. This is one of the reasons why it's worth it to understand the networking stack. Let's get into that now.
Bandwidth and Latency
Two of the most critical components that impact the performance of all networked traffic are bandwidth and latency.
Bandwidth: Maximal throughput of a logical or physical communication path.

Latency: The time from the source sending a packet to the destination receiving it.
Bandwidth refers to the volume of data that we can send across the network. When your local ISP advertises faster internet speeds (100 Mbps+), they're referring to bandwidth. Latency is the time it takes for our individual packets to travel across the wire from the source to the destination. Here's a helpful visual of the difference.
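To see why latency matters so much for typical web assets, here's a rough back-of-the-envelope model. It deliberately ignores handshakes, TCP slow start, and queuing; the function name and numbers are illustrative, not measurements:

```python
# Rough model: fetching a single asset costs one round trip of latency
# plus the time to push the bytes through the pipe.
def fetch_time_ms(size_kb: float, bandwidth_mbps: float, latency_ms: float) -> float:
    # size_kb * 8 converts kilobytes to kilobits; bandwidth_mbps * 1000 is kbps.
    transfer_ms = (size_kb * 8) / (bandwidth_mbps * 1000) * 1000
    return latency_ms + transfer_ms

# A typical 50 KB asset on a 100 Mbps link with 50 ms of latency:
# the transfer itself takes only ~4 ms, so latency dominates.
print(fetch_time_ms(50, 100, 50))   # ~54 ms total
print(fetch_time_ms(50, 1000, 50))  # 10x the bandwidth barely helps
print(fetch_time_ms(50, 100, 10))   # cutting latency helps a lot
```

Doubling bandwidth shaves fractions of a millisecond off a small asset, while shaving latency moves the total almost one-for-one.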
As web engineers, our bottleneck (most of the time) is latency, not bandwidth. Bandwidth becomes your bottleneck if your goal is to send large volumes of data all at once (like video streaming). Most web applications instead send lots of smaller chunks of data (HTML, JS, CSS assets, etc.). For everyday web browsing, latency is the limiting factor. Also, as engineers, we have little control over the bandwidth that our end users have access to anyway.
The above image shows that increasing bandwidth is helpful, but you quickly reach a point of diminishing returns. Contrast that with decreasing latency, where there is a linear correlation between lower latency and faster page load times.
What happens when we visit a website?
Given that primer on bandwidth and latency, we can dive into what actually happens when you type a URL into the address bar and hit enter. From a high level, it's easy to think that the client sends a GET request to the server and the server responds with a status 200. However, if you pop the hood on the networking stack, you'll see that it's more complicated than that (my previous post talks more about getting behind the abstractions). Before we can send any application data across the network, a few things must first happen.
- The client must complete a DNS lookup.
- The client and server must complete a TCP three-way handshake.
- The client and server must complete a successful TLS negotiation.
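The steps above can be observed directly with Python's standard library. This is a hedged sketch, not production code: it times each phase with a wall clock, uses example.com as a placeholder host, and simplifies address selection by taking the first result from the resolver:

```python
import socket
import ssl
import time

def connection_phases(host: str, port: int = 443) -> dict:
    """Time the setup phases that happen before any application data is sent."""
    timings = {}

    # 1. DNS lookup: resolve the hostname to an IP address.
    t0 = time.perf_counter()
    ip = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)[0][4][0]
    timings["dns_ms"] = (time.perf_counter() - t0) * 1000

    # 2. TCP three-way handshake: a full round trip before any data can flow.
    t0 = time.perf_counter()
    sock = socket.create_connection((ip, port), timeout=5)
    timings["tcp_ms"] = (time.perf_counter() - t0) * 1000

    # 3. TLS negotiation: additional round trip(s) to establish encryption.
    t0 = time.perf_counter()
    ctx = ssl.create_default_context()
    tls_sock = ctx.wrap_socket(sock, server_hostname=host)
    timings["tls_ms"] = (time.perf_counter() - t0) * 1000

    tls_sock.close()
    return timings

print(connection_phases("example.com"))  # placeholder host
```

Running this against a real host makes the cost concrete: every millisecond reported here is spent before a single byte of HTML arrives.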
All of the above steps must happen before we can establish a connection and send data across the wire, and each one adds latency. Both the TCP handshake and the TLS negotiation add at least a full round trip. The DNS lookup is typically quicker, and the browser may already have the resolution cached. We can observe these requests using a tool such as Wireshark.
The following image is a snapshot of the waterfall view of loading www.amazon.com inside the testing tool WebPageTest.
Looking at the first request, we can observe the DNS, TCP, and TLS steps. Not only does the browser have to initiate these steps for the first request, but it also has to do them for each subsequent request to a different domain (see requests #2 and #3 in the above image). Luckily, modern browsers are smart enough to reuse existing TCP connections where possible. Requesting assets from multiple domains can be expensive and add unnecessary latency. This is something we experience at Grove due to the number of third-party analytics vendors that we integrate with. It's worth it to optimize (and eliminate if possible) your web application's TCP connections.
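The value of connection reuse can be sketched with Python's standard library http.client. This is a minimal illustration (example.com is a placeholder host): both requests go over one HTTPSConnection, so the second one pays none of the DNS/TCP/TLS setup cost:

```python
import http.client

# A single HTTPSConnection keeps the underlying TCP + TLS session open,
# so the second request skips DNS, the TCP handshake, and TLS entirely.
conn = http.client.HTTPSConnection("example.com")  # placeholder host

conn.request("GET", "/")
resp = conn.getresponse()
resp.read()  # drain the body so the socket can be reused

conn.request("GET", "/")  # reuses the same socket: no new connection setup
resp = conn.getresponse()
resp.read()
conn.close()
print(resp.status)
```

Browsers do the same thing automatically via HTTP keep-alive, but only per origin, which is why spreading assets across many third-party domains forces repeated setup costs.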
- The fastest byte is a byte not sent. If you can, send fewer bytes. This is by far the most effective optimization.
- Position your servers closer to your users (via a CDN). This will reduce roundtrip latency.
- TCP and TLS should be optimized and existing connections should be reused where possible.
- Audit your usage of third-party scripts. Each one adds additional latency.
- Try to avoid 302 redirects. Each one costs at least an extra round trip before the browser can fetch the real resource.
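To audit that last item, redirects can be traced by hand instead of letting a client follow them silently. This sketch is HTTPS-only, ignores query strings, and uses hypothetical helper and host names; it simply records each redirect hop, i.e., each extra round trip:

```python
import http.client
from urllib.parse import urljoin, urlparse

def redirect_chain(url: str, max_hops: int = 5) -> list:
    """Follow redirects by hand, recording the extra round trips they cost."""
    chain = []
    for _ in range(max_hops):
        parts = urlparse(url)
        conn = http.client.HTTPSConnection(parts.netloc)
        conn.request("GET", parts.path or "/")
        resp = conn.getresponse()
        status = resp.status
        location = resp.getheader("Location")  # may be relative
        conn.close()
        if status in (301, 302, 307, 308):
            url = urljoin(url, location)  # resolve relative redirects
            chain.append((status, url))
        else:
            return chain  # reached a non-redirect response
    return chain

# A well-configured URL should produce an empty (or very short) chain.
print(redirect_chain("https://example.com/"))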
Thanks for reading this far. I hope you find this post useful. I didn't get into too much detail about the specific TCP, TLS, and DNS steps. The main takeaway here is that they happen under the hood and they're abstracted away from us by the browser. It's worth it to understand, observe, and optimize them.