delicmakaveli / Simple-Quora-Backup

Simple Quora Backup

A web scraper and crawler for backing up Quora Bookmarks and Answers.

Note 1: The scripts may stop working if Quora changes its HTML.

Note 2: Quora forbids scraping and makes it difficult by lazy loading its content. These scripts get around that, but they are still slow compared to scrapers that only need to send an HTTP request and parse the response.

Simple Quora Backup is a project I made for personal use to back up my Bookmarks and Answers on Quora. I made it in case I need to access the content without an internet connection, or in case Quora, for whatever reason, disappears :).

The basic idea is to run the scripts and let them log in to Quora automatically, crawl through the pages, scroll through the content, scrape it once it has loaded, and save it as text in a simple, minimalistic fashion.
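The scrolling step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `scroll_until_loaded` and its parameters are made up for this example. It only assumes a driver object with Selenium's `execute_script` method, and keeps scrolling to the bottom until the page height stops growing.

```python
import time


def scroll_until_loaded(driver, pause=1.0, max_rounds=50):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is expected to behave like a Selenium WebDriver, i.e. to offer
    execute_script(). The name and parameters are hypothetical.
    Returns the number of scroll rounds that were performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        # Jump to the bottom so the page requests the next batch of content.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Give the lazy loader some time to append new items.
        time.sleep(pause)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Nothing new appeared, so assume everything has loaded.
            break
        last_height = new_height
    return rounds
```

With a real browser, this would be called after the page has been opened with `driver.get(...)`. The `max_rounds` cap is there so that a page that never stops growing cannot keep the script running forever.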

Built With

  • Python 3.5 - The only language used in the project
  • Selenium - Used for automating clicking and scrolling to get around lazy loading
  • BeautifulSoup4 - Used for Web Scraping
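As a rough illustration of how BeautifulSoup4 fits in, the sketch below pulls plain text out of HTML that has already been loaded (for example, the page source obtained through Selenium). The function name `extract_text` and the `div` / `answer` tag and class names are hypothetical; Quora's real markup is not described here and is likely to differ.

```python
from bs4 import BeautifulSoup


def extract_text(html, tag, css_class):
    """Return the plain text of every <tag class="css_class"> element.

    The tag and class to look for are passed in by the caller because the
    real ones depend on Quora's current markup (an assumption here).
    """
    soup = BeautifulSoup(html, "html.parser")
    texts = []
    for element in soup.find_all(tag, class_=css_class):
        # Collapse the element's nested markup into a single line of text.
        texts.append(element.get_text(separator=" ", strip=True))
    return texts


if __name__ == "__main__":
    sample = '<div class="answer"><p>Hello <b>world</b></p></div>'
    print(extract_text(sample, "div", "answer"))
```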

Prerequisites

What you need to install in order to run the software:

Getting Started

This will get you a copy of the project up and running on your local machine for development and testing purposes.

Just download and extract the project master-folder.

Make sure you have everything from the Prerequisites above.

After that just run the scripts:

After that...

Each script creates its own folder in the same directory. That folder holds the text files to which the content was saved in plain, minimalistic text form.
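The saving step might look something like the sketch below. The function name `save_items`, the folder name, and the file naming scheme are assumptions made for illustration, not the layout the scripts actually produce.

```python
import re
from pathlib import Path


def save_items(folder, items):
    """Write each (title, body) pair to its own UTF-8 text file in `folder`.

    The numbering and file naming are hypothetical choices for this sketch.
    Returns the list of paths that were written.
    """
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, (title, body) in enumerate(items, start=1):
        # Keep only characters that are safe in file names on most systems.
        safe = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")[:80] or "untitled"
        path = out_dir / "{:03d}_{}.txt".format(index, safe)
        path.write_text(title + "\n\n" + body + "\n", encoding="utf-8")
        written.append(path)
    return written
```

For example, `save_items("Answers", [("What is Python?", "A language.")])` would create an `Answers` folder containing `001_What_is_Python.txt`.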

Deployment

Although this was tested only on Windows and 64-bit Linux (Ubuntu), it should run on any platform that supports Python and all the modules used.

Author

License

This project is licensed under the MIT License. See the LICENSE.md file for details.

Acknowledgments

  • Hats off to the good people who wrote the code for all the modules that made building stuff with Python easier and faster.
