So, I’m not the only one looking for a solution to this problem.
Basically, I want my RSS reader to fetch everything (images, for example) needed to display each entry during updates, so I can read the entries offline. Images in most feed entries are referenced remotely (http://) and are usually not downloaded until the entry is actually viewed. Some feeds use enclosures, but those work more like attachments than inline content.
I’ve tried quite a few RSS readers, and Straw seems to be the only one that automatically fetches all images during updates. However, Straw’s development has stalled, and the latest version seems to be quite unstable.
Liferea has been my RSS reader for quite a while, so I’ve decided to do it myself in (hopefully) the simplest way possible: a Liferea conversion filter that parses a feed and fetches things for offline reading.
At the moment it works by looking for <img> tags, fetching each image using wget, and then rewriting the original image src to point to the local copy.
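To make the rewriting step concrete, here is a rough Python sketch of the idea. It is not the actual Perl script: the regular expression, the hashed file names, the `local_path_for` and `rewrite_images` helpers and the default save directory are all assumptions made for illustration. It also only handles literal <img> tags, while real feeds often entity-escape their HTML.

```python
import hashlib
import os
import re

# Assumed save location; the real script has its own $SAVE_PATH setting.
SAVE_PATH = os.path.expanduser("~/.offline_filter")

# Matches <img ... src="http://..."> and captures the prefix, the quote
# character and the remote URL.
IMG_SRC = re.compile(
    r'(<img\b[^>]*?\ssrc\s*=\s*)(["\'])(https?://[^"\']+)\2',
    re.IGNORECASE,
)


def local_path_for(url, save_path=SAVE_PATH):
    """Map a remote URL to a stable local file name (hypothetical scheme)."""
    ext = os.path.splitext(url.split("?", 1)[0])[1]
    name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ext
    return os.path.join(save_path, name)


def rewrite_images(feed_text, save_path=SAVE_PATH):
    """Return (rewritten_text, urls); each remote src becomes a file:// URL."""
    urls = []

    def swap(match):
        prefix, quote, url = match.group(1), match.group(2), match.group(3)
        urls.append(url)
        return f"{prefix}{quote}file://{local_path_for(url, save_path)}{quote}"

    return IMG_SRC.sub(swap, feed_text), urls
```

The returned list of URLs is what would then be handed to the downloader, so the rewriting and the fetching stay separate.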
It’s a pretty simple Perl script. I have written it so that it can be extended to parse and fetch other things in the future, embedded videos for example. It currently downloads all images, one by one. It also checks whether a file has already been downloaded. You can change $SAVE_PATH in the script as needed.
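The "already downloaded" check could look roughly like the Python sketch below. Again, this is an illustration rather than the Perl code: `fetch_once` and `fetch_with_wget` are hypothetical names, and downloading to a temporary ".part" file first is my own assumption, added so that a failed download does not get mistaken for a finished one.

```python
import os
import subprocess


def fetch_with_wget(url, dest):
    """Download url to dest with wget (-q: quiet, -O: output file)."""
    subprocess.run(["wget", "-q", "-O", dest, url], check=True)


def fetch_once(url, dest, fetch=fetch_with_wget):
    """Download url to dest unless dest exists. Returns True if it fetched."""
    if os.path.exists(dest):
        return False  # already downloaded during an earlier update
    directory = os.path.dirname(dest)
    if directory:
        os.makedirs(directory, exist_ok=True)
    partial = dest + ".part"
    fetch(url, partial)        # raises if wget fails, leaving dest absent
    os.replace(partial, dest)  # only a complete file gets the final name
    return True
```

Calling `fetch_once` for each image in turn matches the one-by-one behaviour described above; the `fetch` parameter is only there so the downloader can be swapped out.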
You can git (yes, git) the script at git://pigeond.net/offline_filter.git. Alternatively, get the latest version here, or browse the repo at http://pigeond.net/git/?p=offline_filter.git.
To use it, set the script as the conversion filter for the feed you want to have things downloaded for, and it should just work.
Now I can read all the really important stuff on the train, like xkcd and failblog ;).