A general view on TTFB and latency
TTFB (Time To First Byte) is often misunderstood or misinterpreted. In fact, testing TTFB with different tools and websites usually yields different values. There is a myriad of reasons for this, from network stack and NIC configuration to TCP/UDP routing (and the number of hops in between), and much more. In practice it is impossible to measure TTFB exactly.
From my point of view it makes more sense to talk about latency, or more precisely web content delivery latency.
More about the discussion at: https://blog.cloudflare.com/ttfb-time-to-first-byte-considered-meaningles/ (contra TTFB) and https://plus.google.com/+IlyaGrigorik/posts/GTWYbYWP6xP (pro TTFB). A more general view covering both opinions is here: https://researchasahobby.com/time-to-first-byte-ttfb-hosting-speed-tests/
Web content delivery latency
To some people latency is simply another word for TTFB, the way that term is commonly used. To others, latency refers only to network latency, i.e. how long a network packet takes on average from A to B. Here we talk about the latency to deliver web content to the receiver after the initial request has been sent. In common testing tools like Pingdom or the KeyCDN web performance test, this is the time declared as "WAIT", especially in the waterfall diagrams. In bytecheck it is simply the TTFB column.
As it turns out, all those tests are merely indicators of what is happening, and even then the result often does not apply to real-world site browsing.
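Instead of relying only on third-party testing sites, you can break a single request down yourself with curl's write-out timing variables. A minimal sketch (the URL is just an example; the difference between time_appconnect and time_starttransfer roughly corresponds to the "WAIT" phase discussed above):

```shell
# Show where the time goes in one HTTPS request:
# dns  = name lookup finished
# tcp  = TCP connection established
# tls  = TLS handshake finished
# ttfb = first byte of the response received (the "WAIT" ends here)
curl -s -o /dev/null \
  -w 'dns=%{time_namelookup}s tcp=%{time_connect}s tls=%{time_appconnect}s ttfb=%{time_starttransfer}s total=%{time_total}s\n' \
  https://example.com/
```

Running this a few times in a row gives a feel for the variance, which is exactly why single-shot TTFB numbers from different testing sites rarely agree.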
For example, a decreased ssl_buffer_size does actually reduce latency in many cases. Yet the corresponding value (TLS) in the KeyCDN web performance test does not change, while the latency for (network-wise) nearby client devices dropped from approx. 44 ms to 22 ms, which is a 50% latency improvement.
While 22 ms might sound negligible, it actually is not. The number of items to load for a single web page increases steadily; even 100-150 items per page is no longer a lot, but actually common. In nested page loads from different HTTPS sources, this easily adds up to approx. 100 ms less delay for the viewer. Considering that 2 seconds should be the absolute maximum for the fully finished/rendered page on the client device, those 100 ms are already 5% of that budget.
Bottom line: we want to reduce the time it takes to deliver payload (data) to the client device as much as possible. This can be achieved by:
- Reduce web content delivery latency
- Reduce the amount of data to be transferred
- Flatten page load hierarchy
- Parallelize item load (directly connects to the previous point)
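As a concrete illustration of the last two points, here is a hypothetical nginx server block (certificate paths and server name are placeholders): HTTP/2 multiplexes many item loads over a single connection, and gzip reduces the amount of data on the wire.

```nginx
# Sketch only; paths and server_name are placeholders, not from this setup.
server {
    listen 443 ssl http2;            # HTTP/2: parallel item loads over one connection
    server_name example.com;

    ssl_certificate     /etc/ssl/example.com.crt;   # placeholder
    ssl_certificate_key /etc/ssl/example.com.key;   # placeholder

    gzip on;                         # reduce the amount of data to be transferred
    gzip_types text/css application/javascript application/json image/svg+xml;
}
```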
CentOS and tuned
CentOS comes with an elegant basic solution for web server tuning: tuned. Usually installed by default, tuned lets you set the OS performance profile. For a web server we want the network-latency profile.
Show the currently active profile:
> tuned-adm active
List all available profiles:
> tuned-adm list
Set network-latency profile:
> tuned-adm profile network-latency
An alternative profile would be latency-performance, which is in fact quite similar to network-latency.
SSL in nginx
In-depth introduction to SSL-latency: http://www.semicomplete.com/blog/geekery/ssl-latency.html
Dynamic SSL buffer-size: https://blog.cloudflare.com/optimizing-tls-over-tcp-to-reduce-latency/
More in-depth on SSL/TLS:
Check results of SSL: https://www.ssllabs.com/ssltest/
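Tying this back to the ssl_buffer_size example above, the relevant nginx directive looks as follows. The 4k value is an assumption for a latency-oriented setup; nginx's default is 16k, which favors throughput over latency:

```nginx
# In the http or server context.
# Default is 16k; a smaller buffer lets the first TLS record
# (and thus the first bytes of the response) leave the server earlier.
ssl_buffer_size 4k;
```

After changing it, re-measure the WAIT time from a nearby client, since testing sites may not reflect the improvement, as noted above.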