Blogroll Timestamps with Perl and Bloglines
07 Dec 2004
A few weeks ago I mentioned that I was retrieving my blogroll with timestamps from Bloglines using their web services API, but that both my scripts and their API were still buggy. Both are now mostly fixed, and I can get timestamps for 99.5% of my blogroll with the API. The necessary files are in my personal CVS repository: a database schema (schema.sql); three Perl scripts that read the blogroll into the database (bl.getblogroll.pl), read the timestamps into the database (bl.read.pl), and write an RSS file from the database (bl.write.pl); and a PHP script using MagpieRSS that reads the RSS file (blogroll-simple.php).
These scripts are not specific to any weblog software. The Perl scripts require the modules WebService::Bloglines, XML::Parser and LWP, among others. I run the Perl scripts every 15 minutes with cron (saving the results of bl.write.pl to a file), and include() the PHP script on my home page. That's all the documentation I'm going to give for now, but if you can understand it, feel free to play with the scripts and send me your improvements.
Well, maybe a few notes are in order.
1. The Bloglines API returns all feeds, whether they are marked as public or not. I put my private feeds into separate folders which are ignored by the notifier. My script then fetches only the feeds in folders that are not ignored.
2. If the database field link_alias is set, it overrides the weblog title in the RSS output.
3. The WebService::Bloglines method getitems() will die if the HTTP status is not 200. However, it is perfectly valid for a feed to return a status of 304 (i.e. no newer entries are available). Therefore I check the status of each feed with LWP first.
4. Similarly, getitems() will also die if the input is not valid XML. This accounts for the 0.5% of cases where the script does not work (namely, for Aaron Swartz: his RSS is valid, but Bloglines appears to have truncated a field for his feed). So a blog is simply skipped if the XML is not valid.
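As a rough illustration of notes 3 and 4, here is a minimal sketch of the two guards. It is not the code from bl.read.pl. The helper names classify_status and safe_fetch, the subscription IDs, and the stubbed fetcher are all hypothetical, made up so the example runs without a Bloglines account. In a real script the status code would come from an LWP request and the fetcher would wrap getitems().

```perl
use strict;
use warnings;

# Hypothetical helper: decide what to do with a feed based on the HTTP
# status seen before calling getitems(). 304 just means "nothing new".
sub classify_status {
    my ($code) = @_;
    return 'fetch'     if $code == 200;
    return 'unchanged' if $code == 304;
    return 'error';
}

# Hypothetical helper: run a fetcher that may die (as getitems() does on
# a non-200 status or on invalid XML) and skip the feed instead of
# aborting the whole run.
sub safe_fetch {
    my ($fetch, $subid) = @_;
    my $result;
    my $ok = eval { $result = $fetch->($subid); 1 };
    unless ($ok) {
        my $err = $@ || 'unknown error';
        chomp $err;
        warn "Skipping feed $subid: $err\n";
        return;
    }
    return $result;
}

# Stub data standing in for real HTTP responses (assumed values).
my %status = (101 => 200, 102 => 304, 103 => 200);

# Stub standing in for something like:
#   sub { $bloglines->getitems($_[0]) }
# Feed 103 simulates the truncated-XML case from note 4.
my $fake_getitems = sub {
    my ($subid) = @_;
    die "not well-formed XML\n" if $subid == 103;
    return "items for $subid";
};

my @fetched;
for my $subid (sort keys %status) {
    next unless classify_status($status{$subid}) eq 'fetch';
    my $items = safe_fetch($fake_getitems, $subid);
    push @fetched, $subid if defined $items;
}

print "fetched: @fetched\n";
```

With the stub data above, feed 102 is passed over because of its 304 status, feed 103 is skipped with a warning because the fetcher dies, and only feed 101 is processed.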