A Balanced Look at GZIP and the User Experience
With Google recently starting to factor load time into search rankings, there has been a lot of talk about GZIP. Google Webmaster Tools exclaims "Compressing the following resources with gzip could reduce their transfer size by XKB:". Many people listen to this and go looking for a quick fix, without realizing that a poor GZIP implementation can have a strongly negative effect on the overall user experience.
There are two main ways to implement GZIP in PHP. The first is ob_gzhandler. All that is really involved in setting it up is adding ob_start("ob_gzhandler"); anywhere before headers are sent, and it will blindly handle the compression for you.
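In its simplest form, that looks something like this (a minimal sketch, not production code):

<?php
// Start output buffering with the built-in gzip handler before any output is sent.
ob_start("ob_gzhandler");
?>
<html>
<body>
    <p>Everything echoed from here on is buffered, compressed, and sent when the script ends.</p>
</body>
</html>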
The second is zlib.output_compression, which is entirely transparent, and which Zend states is preferred over ob_gzhandler().
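Enabling it is just a configuration change; a minimal sketch, assuming you can set it either in php.ini or (as I understand it) with ini_set() before any output is sent:

<?php
// Equivalent to "zlib.output_compression = On" in php.ini.
// Must run before anything is sent to the browser.
ini_set('zlib.output_compression', '1');
?>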
Either approach can have a huge downside for users. It can, for instance, greatly exacerbate the sluggishness of an already slow page or even site. Right now you're probably saying to yourself, "What? I thought GZIP made everything faster. This guy is crazy." Allow me to explain.
Let’s say that you have a site with no all-encompassing output buffer. As PHP works its way through your script, it occasionally flushes its internal buffer out to Apache, which in turn flushes it out to the browser, allowing the browser to begin rendering the incomplete document. Also, let’s say as an example that you have a hold-up slowing down the footer of your page. The user would already have the rest of the content and could begin reading and enjoying the site.
Now let’s compare this to a site GZIPed with an output buffer callback, e.g. ob_gzhandler. After the output buffer is started, your code begins to execute. Everything echoed is held in the buffer until the page is completely done and the output buffer closes, at which point it is GZIPed and flushed out to the user. The user receives fewer bytes, but that hold-up in the footer we spoke of before stalls the entire rendering process, and the user cannot see anything until the footer completes.
For your viewing pleasure, a demonstration:
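Roughly, the comparison looks like the sketch below (a sleep() call stands in for the slow footer, and the padding trick from the note further down triggers Apache's flush); toggle the ob_start() line to switch between the two versions:

<?php
// GZIPed version: uncomment the next line. Nothing reaches the browser until the script ends.
// ob_start("ob_gzhandler");
?>
<html>
<body>
    <p>Header and main content, available immediately.</p>
    <?php
    // Padding to push past Apache's buffer so the flush actually reaches the browser.
    echo str_repeat(' ', 100000);
    flush();
    // The slow footer: sleep() stands in for whatever is holding the page up.
    sleep(5);
    ?>
    <p>Footer, five seconds later.</p>
</body>
</html>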
Take Note
I am not stating that all GZIP is bad. On the contrary, GZIP can be very beneficial, and I absolutely sympathize with Google wanting to cut down on their bandwidth. If you have Firebug or something similar, you can note that the request for the non-GZIPed example is a whopping 100 kilobytes, whereas the GZIPed example's request is a measly 323 bytes! The reason for this is that I did not have enough content to get Apache to flush, so to get around this I added the following line:
echo str_repeat(' ', 100000); // 100 KB of data, enough to trigger my copy of Apache to flush
Not only does this trick cause Apache to flush, but it exaggerates the positive effect of GZIP. Simple patterns, or in this case large amounts of whitespace, compress quite wonderfully.
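You can see how extreme that is with a quick sketch (gzencode() applies roughly the same compression the handlers do):

<?php
$padding    = str_repeat(' ', 100000);  // 100 KB of spaces
$compressed = gzencode($padding);       // gzip-compress it
printf("%d bytes -> %d bytes\n", strlen($padding), strlen($compressed));
// A single repeated character compresses almost perfectly, so the result
// comes out to only a few hundred bytes at most.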
Uncompressed static files, JavaScript and CSS being two of the best examples, compress quite well. On an Apache server, the best way to handle them is with mod_deflate:
<IfModule mod_deflate.c>
    <FilesMatch "\.(js|css)$">
        SetOutputFilter DEFLATE
    </FilesMatch>
</IfModule>
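Static files also sidestep the problem described above: the whole file already exists on disk, so Apache can compress and stream it right away instead of waiting on a slow script.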
Looking for a Solution
I've posted a number of questions to Stack Overflow [1] [2] while researching this article and actually found very limited help.
It appears murky whether the protocol even supports chunking GZIP output into separately compressed blocks, which would be required to decompress portions of the page before it is done; you can send a complete GZIP response in chunks, but the advantage of that is limited.
Lastly, I have seen "promises" on a number of blogs that ob_gzhandler caches its results, leading to a speed boost; this is purely fiction and not to be trusted.
Moral of the Story
Understand what you’re doing before you do it. Blindly using output buffering for GZIP can have adverse effects on your site.