Changes: general improvements; use the date from the <time> tag if present; fix for GNU diff and change the characters used in paths.
Purpose: Download articles from those annoying RSS/Atom feeds which don't include the full article.
Included: A shell script and a JavaScript program that extract the article URLs, download the articles, and convert them using Mozilla's readability.js.
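The conversion step can be pictured roughly as below. This is only a sketch of how jsdom and @mozilla/readability are typically combined, not the code shipped here; the function name toReadable and the shape of its return value are assumptions made for illustration, and both packages must already be installed.

```javascript
// Sketch only: toReadable is a hypothetical name, not taken from the
// included JavaScript program.
function toReadable(html, url) {
  // Loaded inside the function so this file can be loaded before
  // `npm install` has been run.
  const { JSDOM } = require('jsdom');
  const { Readability } = require('@mozilla/readability');

  // Passing the page URL lets jsdom resolve relative links and images.
  const dom = new JSDOM(html, { url });

  // parse() returns null when Readability cannot find article content.
  const article = new Readability(dom.window.document).parse();
  if (article === null) {
    return null;
  }

  // title is the extracted headline, content is the cleaned-up HTML.
  return { title: article.title, content: article.content };
}
```

Calling toReadable(pageHtml, 'https://example.com/post') would return either null or an object whose content field holds the cleaned-up article HTML.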
Install the dependencies:
$ npm install jsdom
$ npm install @mozilla/readability
Put HTTP/S RSS feed URLs in a file called feed-urls, one URL per line.
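To make the expected file format concrete, here is a minimal sketch of how a one-URL-per-line file could be read. The helper name parseFeedUrls is hypothetical, and skipping blank lines and non-HTTP/S entries is an assumption; the included script may handle its input differently.

```javascript
// Hypothetical helper: turn the text of feed-urls into a list of URLs.
function parseFeedUrls(text) {
  const urls = [];
  for (const line of text.split(/\r?\n/)) {
    const candidate = line.trim();
    if (candidate === '') {
      continue; // skip blank lines
    }
    let parsed;
    try {
      parsed = new URL(candidate);
    } catch (err) {
      continue; // not a valid absolute URL
    }
    // Assumption: only HTTP and HTTPS feeds are wanted.
    if (parsed.protocol === 'http:' || parsed.protocol === 'https:') {
      urls.push(candidate);
    }
  }
  return urls;
}
```

For example, parseFeedUrls(require('fs').readFileSync('feed-urls', 'utf8')) would give an array of the feed URLs in file order.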
$ chmod +x get-articles.sh
$ ./get-articles.sh
Articles will be in the articles/ folder. The dates reflect when each article was downloaded, not necessarily when it was published.