Multifail! - 'Twas brillig, and the slithy toves did gyre and gimble in the wabe
Thomas

Multifail! [Friday 1st June 2012 at 9:20 pm]
Thomas

boggyb
[Where |Welcome aboard the Southern service to London Victoria calling at Arundel, Amberley, Pulborough, Bil]
[Feeling |hungry]
[Playing |Digitally Imported: Vocal Trance]

Somewhere within the depths of Amazon is an RSS client made of several kinds of fail, judging by the logs on a server I admin.

Fail the first: it doesn't appear to support receiving compressed feeds. This is somewhat forgivable, as decompressing the feed does require a little more processing, but compression saves massively on bandwidth. Enabling compression on one site I admin reduced the bandwidth needed for most textual content (webpages, stylesheets, scripts, and so on) by over 50%.
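
For the curious, here's a minimal sketch of what supporting compressed feeds involves on the client side, assuming gzip is the encoding on offer. The function names and the sample feed are made up for illustration, and nothing here is what Amazon's client (or any particular library) actually does. The client advertises gzip in its request, then unpacks the body if the server's Content-Encoding header says it was compressed.

```python
import gzip
from typing import Dict, Optional


def request_headers() -> Dict[str, str]:
    """Headers telling the server that a gzip-compressed reply is acceptable."""
    return {"Accept-Encoding": "gzip"}


def decode_body(raw: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo the compression named in the Content-Encoding response header."""
    if content_encoding and content_encoding.strip().lower() == "gzip":
        return gzip.decompress(raw)
    # No (or unknown) encoding: hand the bytes back untouched.
    return raw


# A made-up, highly repetitive feed standing in for a real one (assumption).
item = "<item><title>Post</title><description>Some text</description></item>"
sample_feed = ("<rss><channel>" + item * 200 + "</channel></rss>").encode("utf-8")

# Simulate what a compressing server would send, then decode it client-side.
compressed = gzip.compress(sample_feed)
saving = 1 - len(compressed) / len(sample_feed)
print(f"{len(sample_feed)} bytes -> {len(compressed)} bytes ({saving:.0%} saved)")
```

The sample feed is far more repetitive than a real one, so the saving it prints will flatter gzip somewhat; real text tends to land nearer the "over 50%" figure above.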

Fail the second: it apparently has no concept of caching. HTTP provides many ways to specify how a page should be cached and when it must be reloaded, but the simplest is to provide a date and a unique identifier for the current version of the page (the Last-Modified and ETag headers). The client can then make a request to the server, saying that it only wants a new copy of the page if it's changed since the last date/version it saw. Supporting this on your site is essential if you don't want RSS readers to eat your bandwidth, but it does rather rely on the client supporting it too. Clients that don't understand caching are the exception.
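
As a rough sketch of the client half of that exchange: the date and identifier arrive as the Last-Modified and ETag response headers, the client echoes them back as If-Modified-Since and If-None-Match, and a 304 Not Modified reply means "keep using what you've got". The class and function names below are hypothetical, and the network is left out entirely so the logic stands on its own.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedFeed:
    """What the client remembers from its last successful fetch."""
    body: bytes
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def conditional_headers(cached: Optional[CachedFeed]) -> Dict[str, str]:
    """Build the request headers that make the fetch conditional."""
    headers: Dict[str, str] = {}
    if cached is None:
        return headers  # first fetch: nothing to compare against
    if cached.etag:
        headers["If-None-Match"] = cached.etag
    if cached.last_modified:
        headers["If-Modified-Since"] = cached.last_modified
    return headers


def apply_response(cached: Optional[CachedFeed], status: int,
                   headers: Dict[str, str], body: bytes) -> CachedFeed:
    """Update the cache from a response; a 304 carries no body to store."""
    if status == 304:
        if cached is None:
            raise ValueError("got 304 without a cached copy to fall back on")
        return cached
    return CachedFeed(body=body,
                      etag=headers.get("ETag"),
                      last_modified=headers.get("Last-Modified"))


# Simulated exchange: a full fetch first, then a poll that finds nothing new.
first = apply_response(None, 200,
                       {"ETag": '"v1"',
                        "Last-Modified": "Fri, 01 Jun 2012 20:20:00 GMT"},
                       b"<rss>...</rss>")
print(conditional_headers(first))
second = apply_response(first, 304, {}, b"")
```

With this in place, a poll that finds nothing new costs a few hundred bytes of headers rather than the whole feed.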

Fail the third: it polls this particular feed every 30 minutes, despite the fact that the blog is actually only updated once a week. Now, polling every half hour is not unreasonable, but combined with the first and second fails it means Amazon is responsible for about 70MB of data transfer a month. That's a fair bit when the cap is only 400MB. It'd be nice if there was some way to control the polling interval.
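
To put rough numbers on that, here's a back-of-envelope calculation. It assumes a 30-day month and that every poll downloads the whole uncompressed feed, neither of which comes from the server logs; only the 30 minutes, 70MB and 400MB figures do.

```python
POLL_INTERVAL_MINUTES = 30   # from the logs
MONTHLY_TRANSFER_MB = 70     # what this one client accounts for
MONTHLY_CAP_MB = 400         # the bandwidth cap
DAYS_PER_MONTH = 30          # assumption, to keep the sums simple

# Number of polls in a month at one every half hour.
polls_per_month = DAYS_PER_MONTH * 24 * 60 // POLL_INTERVAL_MINUTES

# Implied size of each fetch, if every poll pulls the full feed (assumption).
mb_per_poll = MONTHLY_TRANSFER_MB / polls_per_month

# Fraction of the cap eaten by this one client.
share_of_cap = MONTHLY_TRANSFER_MB / MONTHLY_CAP_MB

# What the same client would cost if it polled once a day instead.
daily_polling_mb = DAYS_PER_MONTH * mb_per_poll

print(f"{polls_per_month} polls a month, about {mb_per_poll:.3f} MB each")
print(f"{share_of_cap:.1%} of the cap")
print(f"polling daily would need about {daily_polling_mb:.2f} MB a month")
```

That works out to 1,440 polls a month and 17.5% of the cap for a blog that changes about four times in the same period.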

Fail the fourth: it doesn't actually obey the 301 response code. A 301 response code means that the page requested has moved permanently, and the client should update any bookmarks or similar. It does at least follow the redirect, so it's not a complete fail.
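
The difference between following a redirect and obeying it can be sketched like so. This is only an illustration with hypothetical names and example URLs, not how any particular client does it: a permanent redirect changes the address the client stores for next time, while a temporary one only changes where this one request goes.

```python
from typing import Optional, Tuple
from urllib.parse import urljoin

PERMANENT = {301, 308}        # "moved for good": update the stored address
TEMPORARY = {302, 303, 307}   # "look over there for now": keep the old one


def handle_redirect(stored_url: str, status: int,
                    location: Optional[str]) -> Tuple[str, str]:
    """Return (url_to_fetch_next, url_to_remember_for_future_polls)."""
    if location is None or status not in PERMANENT | TEMPORARY:
        return stored_url, stored_url
    # The Location header may be relative, so resolve it against the old URL.
    target = urljoin(stored_url, location)
    if status in PERMANENT:
        return target, target
    return target, stored_url


# A client that merely follows the redirect behaves like the 302 case even
# when it is handed a 301, and so asks for the old address on every poll.
print(handle_redirect("http://example.org/feed", 301, "/blog/feed.xml"))
print(handle_redirect("http://example.org/feed", 302, "/blog/feed.xml"))
```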

Fail the fifth: their client identifies itself as "RPT-HTTPClient/0.3". Well-behaved bots identify themselves as what they really are, and provide some sort of contact address. The trouble is that while I could apply the banhammer to this bot, it's likely doing something useful. I have a sneaking suspicion that it's what Amazon are using to pull the feed for the Kindle version of the site.
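
For comparison, here's a sketch of the sort of User-Agent string a well-behaved bot might send. The product name, URL and address are all invented, and the "+URL" form is just one common convention among crawlers rather than anything mandated.

```python
def build_user_agent(product: str, version: str,
                     info_url: str, contact_email: str) -> str:
    """Compose a User-Agent that says what the bot is and who to contact."""
    return f"{product}/{version} (+{info_url}; {contact_email})"


# Every value below is hypothetical.
user_agent = build_user_agent("ExampleFeedFetcher", "1.0",
                              "http://example.com/bot.html",
                              "bots@example.com")
headers = {"User-Agent": user_agent}
print(headers["User-Agent"])
```

A string like that gives a server admin somewhere to send a polite complaint before reaching for the banhammer.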

Comments:
From: (Anonymous)
Saturday 2nd June 2012 at 9:08 am (UTC)

are we talking about my jungle here?

I assume this post is about www.themusicjungle.co.uk (shameless bit of advertising here, but EVERYONE should know how brilliant it is (more shameless hyperbole)). Well, as the world and his wife - sorry - I mean partner - aren't queueing up to subscribe, maybe we should say "yay boo" to Kindle and take our unvalued custom elsewhere. That'll make them sorry, won't it? What's that? You think they won't even notice? Huh.