HTML Waste
By Paul Scanlon
One of the great things about HubPages is that it lets you concentrate on what you want to say, rather than how you get it out there; that is, you don’t need to know any HTML or web page programming. If this is you, then maybe you won’t be interested in this article. However, you may be concerned about the waste involved in getting a web page to your browser. In these times of climate change and increased awareness of energy consumption, it does concern you as a web user.
So what is HTML waste? All web pages are written in a mark-up language called HTML (Hyper Text Mark-up Language). This dates back to the very early days of the Web, when it was still within the confines of research labs. It is a standard language that defines how a web page is displayed and what additional media is required, such as pictures and video. Go to any page (using Firefox; the option can be turned off in IE), right click on the page and select ‘view source’. You will get something like this:
Section of BBC News Web Page
<div class="relatedbbcsites"> <h3>Related BBC sites</h3> <ul> <li><a href="http://news.bbc.co.uk/sport1/hi/default.stm" title="Home of BBC Sport on the internet">Sport</a></li> <li><a href="http://www.bbc.co.uk/weather/" title="Weather information from around the world">Weather</a></li> <li><a href="http://news.bbc.co.uk/democracylive/hi/" title="Democracy Live">Democracy Live</a></li> <li><a href="http://www.bbc.co.uk/newsbeat" title="Radio 1 Newsbeat">Radio 1 Newsbeat</a></li> <li><a href="http://news.bbc.co.uk/cbbcnews/default.stm" title="CBBC Newsround">CBBC Newsround</a></li> <li><a href="http://www.bbc.co.uk/onthisday" title="BBC On This Day">On This Day</a></li> <li><a href="http://www.bbc.co.uk/blogs/theeditors/" title="Editors' Blog">Editors' Blog</a></li> </ul> </div> <!-- SiteVersions --> <div class="lang">Languages</div> <ul id="languages"> <li id="newyddionLoz"><a href="http://news.bbc.co.uk/welsh/default.stm" title="BBCNewyddion.com">Newyddion</a></li>
If you don’t know HTML, don’t worry; you don’t have to for this article. Take a look at the above. It is from the news.bbc.co.uk page a few nights ago. Anything strike you about it? No, probably not. It’s almost exactly like any other page: well laid out with spaces.
And it’s the space that is the problem and waste. Web pages treat spaces differently to other characters. Multiple space characters and tabs are usually ignored by the browser. Perhaps you have tried spacing out a piece of text to get it in line, only to find that it doesn’t matter how many times you hit the space bar, it makes no difference. And yet many websites still send out needless spaces in their web pages.
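A rough way to see this is to mimic the collapsing rule in a few lines of Python. This is only a sketch: the function name `collapse_whitespace` is made up for illustration, and real browsers follow more detailed rules (for example, spacing inside a `<pre>` block is kept).

```python
import re

def collapse_whitespace(text: str) -> str:
    """Turn each run of spaces, tabs and newlines into one space and trim
    the ends, roughly as a browser does for ordinary text."""
    return re.sub(r"[ \t\r\n\f]+", " ", text).strip()

# The same sentence, once padded out with extra spacing and once without.
padded = "Hello        world,\n\n\t\tthis   is    spaced   out."
tight = "Hello world, this is spaced out."

# Both versions end up as the same displayed text...
print(collapse_whitespace(padded) == tight)
# ...but the padded one costs more characters to send.
print(len(padded), "characters sent versus", len(tight))
```

The extra characters in `padded` travel over the network and are then thrown away by the browser.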
BBC News Page
A Simple Exercise
As an exercise, I had a closer look at the BBC page and tried to strip out the needless and unused characters. The results I got are below:
Original BBC HTML page size: 89,085 bytes (or characters)
Stripped BBC HTML page size: 59,822 bytes
That’s a saving of almost 33%. That figure doesn’t include the image media for the page; if that’s included, the saving is still about 6%, which is not something to be sniffed at. The stripped page will look exactly like the original in a browser. No information is lost!
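For anyone who wants to try a similar exercise, here is a minimal Python sketch. It is not necessarily the method used for the figures above: the function names `strip_between_tags` and `saving_percent` are hypothetical, the sample is a short fragment re-indented by hand, and the approach deliberately ignores cases where spacing matters (such as `<pre>` blocks, scripts, or spaces between inline elements).

```python
import re

def strip_between_tags(html: str) -> str:
    """Remove whitespace between tags, then collapse any other runs."""
    html = re.sub(r">\s+<", "><", html)   # gaps between one tag and the next
    html = re.sub(r"\s{2,}", " ", html)   # any remaining run becomes one space
    return html.strip()

def saving_percent(original: str, stripped: str) -> float:
    """Percentage of bytes saved, measured on the UTF-8 encoded text."""
    before = len(original.encode("utf-8"))
    after = len(stripped.encode("utf-8"))
    return 100.0 * (before - after) / before

# A small, neatly indented fragment in the style of the BBC page.
sample = """
<ul>
    <li><a href="http://www.bbc.co.uk/weather/">Weather</a></li>
    <li><a href="http://www.bbc.co.uk/newsbeat">Radio 1 Newsbeat</a></li>
</ul>
"""

small = strip_between_tags(sample)
print(small)
print(f"{saving_percent(sample, small):.1f}% smaller")
```

Run against a whole saved page instead of `sample`, the same two functions give a before-and-after comparison like the one above.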
So what, I hear some of you ask? Well, the BBC news page is ranked 44 by Alexa, and on high news days it can be serving 40,000 requests per second (values for the 7/7 London bombings). The infrastructure required to provide this sort of service is not small; there will be banks and banks of servers running day and night, with associated air conditioning and so on. And 6% of this is running just to transmit nothing useful!
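To get a feel for the scale, here is a back-of-envelope calculation in Python using the HTML sizes measured earlier. It rests on simplifying assumptions that are mine, not measured: every one of the 40,000 requests is for the HTML front page, and nothing is compressed or cached along the way.

```python
ORIGINAL_BYTES = 89_085        # original BBC HTML page size
STRIPPED_BYTES = 59_822        # size after stripping unused characters
REQUESTS_PER_SECOND = 40_000   # peak figure quoted for the BBC news page

# Bytes sent per request that the browser simply ignores.
wasted_per_request = ORIGINAL_BYTES - STRIPPED_BYTES

# Assumption: every request fetches the uncompressed HTML front page.
wasted_per_second = wasted_per_request * REQUESTS_PER_SECOND

print(f"Wasted per request: {wasted_per_request:,} bytes")
print(f"Wasted per second at peak: {wasted_per_second / 1e9:.2f} GB")
```

Under those assumptions the unused characters alone come to over a gigabyte every second at peak load.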
But the BBC is not the only one. Look at these:
| Web site | Original Size (bytes) | Smallest Size (bytes) | Possible Saving |
|---|---|---|---|
| UN Framework Convention on Climate Change | 47,878 | 31,125 | 35% |
| Slashdot | 111,407 | 91,191 | 18% |
| My Home Page | 13,048 | 12,488 | 4% |
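The percentages in the table follow directly from the two size columns. A short Python check, using only the figures from the table (the variable names are just for illustration):

```python
# (original size, smallest size) in bytes, taken from the table above.
sites = {
    "UN Framework Convention on Climate Change": (47_878, 31_125),
    "Slashdot": (111_407, 91_191),
    "My Home Page": (13_048, 12_488),
}

for name, (original, smallest) in sites.items():
    # Saving is the share of the original size that was removed.
    saving = 100 * (original - smallest) / original
    print(f"{name}: {saving:.0f}% saving")
```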
And if you want to see how it can be done, have a look at the HTML Google sends for its search page form.
Now, when you take each web page in isolation, there is not much saving to be had. Add together all the web pages served every day, though, and a lot of electricity generation could be avoided and money saved.
Quick Poll
Do you think the web industry should write web pages that are more efficient?
Is it time for a rethink about how the Internet works? The technology for transmitting a web page has not changed much in over 20 years, and it is very inefficient. We are all being told to use energy-efficient light bulbs, travel less in our cars, and turn things off rather than leave them on stand-by, so is it time for the Internet to catch up? Should we expect programmers to be more aware of what they produce, or is this another example of a web page that has nothing useful? Comments please.
Links to Sites Mentioned in Article
- BBC NEWS | News Front Page
Get the latest BBC World news: international news, features and analysis from Africa, Americas, South Asia, Asia-Pacific, Europe and the Middle East.
- Slashdot - News for nerds, stuff that matters
- United Nations Framework Convention on Climate Change
- Paul Scanlon Home Page | Getting there. Slowly
Paul Scanlon homepage. News, views and opinions of Paul Scanlon. Come right on in, you are most welcome.
jayjay40 2 years ago
Gosh, you certainly opened my eyes I didn't even know that space could be wasted in this way a very interesting hub