Is the presence of whitespace and newlines in HTML bad for download speed? The question is relevant because a developer creating a PHP page with HTML content can either echo the HTML or step out of PHP altogether and just type the HTML. Advantage of method number one: no whitespace or newlines are sent to the browser. Advantage of method number two: the IDE can assist in creating and maintaining the page with intelligent code completion (keeping track of all the div tags, for example). Compare, for example, the source code of wikipedia.org (no newlines) with that of bol.com (lots of newlines). Just like DNA carries around a lot of junk DNA (99% by the latest estimate), so can an HTTP response carry around a lot of junk space.

Testing is fairly straightforward.
* Page 1 (363kb) has 5000 lines, each consisting of 50 times the letter x followed by an HTML line-break element. The source code is a single line. Average download time: 34 / 393 ms
* Page 2 (273kb) has an added line feed (\n) at the end of each line. The source code has 5000 lines. Average download time: 37 / 434 ms
* Page 3 (283kb) has an additional line feed. The source code now has 10000 lines. Average download time: 39 / 428 ms
* Page 4 (376kb) has each line starting with an additional 20 spaces. Average download time: 35 / 425 ms
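
The four pages can be approximated offline to see how much of the junk space survives compression. The following is a minimal Python sketch, not the PHP used for the test. It assumes the line-break element is `<br>`, that page 4 builds on page 3, and that `zlib` is a fair stand-in for the server's deflate compression. The names `build_page` and `sizes` are made up for illustration, and the byte counts will not match the sizes quoted in the list.

```python
import zlib

LINES = 5000      # number of content lines, as in the test pages
LINE = "x" * 50   # 50 times the letter x
BREAK = "<br>"    # assumed markup for the line-break element


def build_page(line_feeds: int = 0, indent: int = 0) -> bytes:
    """Return a test page body with the given junk space on every line."""
    row = " " * indent + LINE + BREAK + "\n" * line_feeds
    return (row * LINES).encode("ascii")


# Assumption: each page adds its junk space on top of the previous one.
pages = {
    "page 1": build_page(),
    "page 2": build_page(line_feeds=1),
    "page 3": build_page(line_feeds=2),
    "page 4": build_page(line_feeds=2, indent=20),
}


def sizes(body: bytes) -> tuple[int, int]:
    """Return the raw size and the zlib-compressed size in bytes."""
    return len(body), len(zlib.compress(body))


for name, body in pages.items():
    raw, packed = sizes(body)
    print(f"{name}: {raw} bytes raw, {packed} bytes compressed")
```

Under these assumptions the raw body grows by 110,000 bytes from page 1 to page 4, while the compressed bodies should all remain a small fraction of that, because repeated spaces and line feeds compress very well.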

The two download times (measured in Firefox with Firebug) are with default HTTP compression (deflate) [1] and with compression disabled, respectively. In all measurements the deviations are large (8 / 70 ms) and none of the differences are statistically significant.
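
To make the significance claim concrete, here is a rough Python sketch of Welch's t statistic applied to the largest gaps in the list. It treats the quoted deviations (8 ms and 70 ms) as standard deviations that apply to every page, and it assumes 10 downloads per page. That sample size is hypothetical, since the number of measurements is not stated. The function name `welch_t` is made up for illustration.

```python
import math


def welch_t(mean_a: float, sd_a: float, n_a: int,
            mean_b: float, sd_b: float, n_b: int) -> float:
    """Welch's t statistic for two independent samples."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_b - mean_a) / standard_error


N = 10  # hypothetical number of downloads per page; not given in the text

# Largest gap with compression: page 1 (34 ms) vs page 3 (39 ms).
t_compressed = welch_t(34, 8, N, 39, 8, N)

# Largest gap without compression: page 1 (393 ms) vs page 2 (434 ms).
t_uncompressed = welch_t(393, 70, N, 434, 70, N)

print(f"with compression:    t = {t_compressed:.2f}")
print(f"without compression: t = {t_uncompressed:.2f}")
```

With these assumed inputs both statistics stay below about 2.1, the two-sided 5% threshold for 18 degrees of freedom, which agrees with the observation that the differences are not significant. A different sample size would change the numbers.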

No, junk space does not affect download speed.

[1] To disable compression in Firefox, type about:config in the URL bar, search for the network.http.accept-encoding option, double-click it, and clear its value (gzip, deflate).
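
The same switch can be made outside the browser by setting the Accept-Encoding request header yourself. The following is a minimal Python sketch and not how the measurements above were taken. The helper names `build_request` and `download_ms` and the example URL are hypothetical, and timings from a script will differ from what Firebug reports.

```python
import time
import urllib.request


def build_request(url: str, compressed: bool) -> urllib.request.Request:
    """Build a GET request that allows or refuses compressed responses."""
    # "identity" asks the server to send the body without any content coding.
    encoding = "gzip, deflate" if compressed else "identity"
    return urllib.request.Request(url, headers={"Accept-Encoding": encoding})


def download_ms(url: str, compressed: bool) -> float:
    """Download the page once and return the elapsed time in milliseconds."""
    request = build_request(url, compressed)
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        response.read()  # read the body as sent; urllib does not decompress it
    return (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    url = "http://localhost/page1.php"  # hypothetical address of a test page
    for compressed in (True, False):
        print(f"compressed={compressed}: {download_ms(url, compressed):.0f} ms")
```

Repeating each download several times and averaging, as in the test above, helps to smooth out the large deviations between individual requests.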
