hn-scrapper
Get all of the latest links from Hacker News into a single page.
Installation and basic usage as CLI
Pre-requisites
Installation
# Clone this repository locally
mkdir -p ~/projects
git clone https://github.com/agilecreativity/hn-scrapper.git ~/projects/hn-scrapper
cd ~/projects/hn-scrapper
# Create the `~/bin` folder to hold the executable
mkdir -p ~/bin
# Generate the standalone using `lein bin`
lein bin
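Once `lein bin` finishes, a quick check can confirm that the executable is where the Usage section expects it. This is a minimal sketch, assuming the binary ends up at `~/bin/hn-scrapper` (the path used when running the tool in this README); the function name `check_hn_binary` is hypothetical and not part of the project.

```shell
# Hypothetical helper: report whether an executable exists at the given path.
# Defaults to ~/bin/hn-scrapper, the path this README uses when running the tool.
check_hn_binary() {
  bin_path="${1:-$HOME/bin/hn-scrapper}"
  if [ -x "$bin_path" ]; then
    echo "found: $bin_path"
  else
    echo "missing: $bin_path"
    return 1
  fi
}

# Print a hint if the default location has no executable.
check_hn_binary || echo "re-run 'lein bin' from the project directory"
```

If the check reports `missing`, the executable may have been written somewhere else; pass that path as the first argument to check it instead.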
Usage
To see the help, run the command with no arguments:
~/bin/hn-scrapper
This should print help text like the following:
Extract the lastest Hacker News index to a single file
Usage: hn-scrapper [options]
-p, --page-count PAGE-COUNT 20
-o, --output-file OUTPUT-FILE hacker-news.md
-h, --help
Options:
--p PAGE-COUNT the number of pages to be extracted default to 20
--o OUTPUT-FILE the output file name default to 'hacker-news.md'
Now fetch the list of news from Hacker News:
# Get only the first page from the site
~/bin/hn-scrapper --page-count 1 --output-file hacker-news-front-page.md
# Get all of the news (20 pages) using the short options
~/bin/hn-scrapper -p 20 -o hacker-news-top-20-pages.md
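For repeated runs, it can help to write each extraction to a dated file so earlier results are not overwritten. The following is a minimal sketch that uses only the `--page-count` and `--output-file` options shown above; the helper name `hn_output_name` and the file naming scheme are assumptions for illustration, not part of the tool.

```shell
# Hypothetical helper: build an output file name that includes a date.
# Uses today's date unless one is passed as the first argument.
hn_output_name() {
  day="${1:-$(date +%Y-%m-%d)}"
  printf 'hacker-news-%s.md\n' "$day"
}

# Run only if the executable is present, so the snippet is safe to paste anywhere.
if [ -x "$HOME/bin/hn-scrapper" ]; then
  "$HOME/bin/hn-scrapper" --page-count 1 --output-file "$(hn_output_name)"
fi
```

Calling `hn_output_name 2016-01-31` yields `hacker-news-2016-01-31.md`, which makes the naming easy to check before wiring it into a scheduled job.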
Example Sessions and Outputs
Sample sessions
Sample Markdown Output
Sample Markdown Output viewed in a GitHub Gist
The actual result in Markdown format
Feature ideas
- Export/print the first-level content of Hacker News to PDF or EPUB
- Group the results in some way (topics, keywords, link to YouTube?)
- Persist the results to HTML pages and store each link only once
Useful Links
License
Copyright © 2016 Burin Choomnuan
Distributed under the Eclipse Public License, either version 1.0 or (at your option) any later version.