bernorieder / reddit-tools

a bunch of scripts for investigating reddit

Reddit Tools

This is a collection of PHP command line scripts to grab data from Reddit and transform it into CSV files.

  • grab_list.php gets a list of posts from any subreddit via the (hot|new|rising|controversial|top|gilded) sorting options;
  • grab_threads.php grabs the top 200 comments (as returned by Reddit's API) for each post in the file generated by grab_list.php;
  • tocsv_list.php converts the file generated by grab_list.php to a CSV;
  • tocsv_comments.php converts all of the comments retrieved by grab_threads.php into a CSV.
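
As a rough illustration of the JSON-to-CSV step, here is a minimal sketch. It is not code from this repository: the function names `listing_to_rows` and `rows_to_csv`, the chosen columns and the sample values are assumptions made for the example. It only relies on the general shape of a Reddit listing response (`data.children[].data`) and on PHP's built-in `fputcsv()`.

```php
<?php
// Hypothetical helper (not from the repository): flattens a decoded Reddit
// listing into a header row plus one row per post.
function listing_to_rows(array $listing)
{
    $rows = array(array('id', 'title', 'score', 'num_comments'));
    $children = isset($listing['data']['children']) ? $listing['data']['children'] : array();
    foreach ($children as $child) {
        $post = $child['data'];
        $rows[] = array($post['id'], $post['title'], $post['score'], $post['num_comments']);
    }
    return $rows;
}

// Hypothetical helper: serializes the rows as CSV text. fputcsv() handles the
// quoting of fields that contain commas, quotes or whitespace.
function rows_to_csv(array $rows)
{
    $handle = fopen('php://temp', 'r+');
    foreach ($rows as $row) {
        fputcsv($handle, $row, ',', '"', '\\');
    }
    rewind($handle);
    $csv = stream_get_contents($handle);
    fclose($handle);
    return $csv;
}

// Made-up sample in the shape of a Reddit listing response.
$json = '{"kind":"Listing","data":{"children":[{"kind":"t3","data":'
      . '{"id":"abc123","title":"Hello, world","score":42,"num_comments":7}}]}}';

echo rows_to_csv(listing_to_rows(json_decode($json, true)));
```

The actual scripts may export different or additional columns; the point is only that each post becomes one CSV row.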

All scripts require a working PHP command line interpreter. The subreddits or files to work with, along with a few other settings, can be edited directly in the top section of each file.
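
To give an idea of what such a settings section might feed into, here is a hedged sketch. The variable names `$subreddit`, `$sorting` and `$limit` and the function `listing_url` are hypothetical and not taken from the scripts; the URL follows Reddit's public JSON listing pattern, which the scripts may or may not use in exactly this form.

```php
<?php
// Hypothetical settings, named for illustration only; the real names are the
// ones found in the top section of each script.
$subreddit = 'dataisbeautiful';
$sorting   = 'top';
$limit     = 100;

// Hypothetical helper: builds a listing URL for one of the sorting options
// mentioned in this README.
function listing_url($subreddit, $sorting, $limit)
{
    $allowed = array('hot', 'new', 'rising', 'controversial', 'top', 'gilded');
    if (!in_array($sorting, $allowed, true)) {
        throw new InvalidArgumentException("Unknown sorting option: $sorting");
    }
    return 'https://www.reddit.com/r/' . rawurlencode($subreddit) . '/' . $sorting
        . '.json?' . http_build_query(array('limit' => (int) $limit));
}

echo listing_url($subreddit, $sorting, $limit), PHP_EOL;
```

Validating the sorting option up front means a typo in the settings fails immediately instead of producing an empty or unexpected download.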

How do I use this?

  1. Make sure that you have PHP installed (to test: open your command line and type "php -v" - if you get version info showing at least PHP 5.5.4, you're good to go; if not => google "install php YOUROS");
  2. Download the files and put them into a directory (the scripts need permission to write to this directory => google "directory write permission YOUROS");
  3. Edit grab_list.php and change the first couple of lines to match the data you want;
  4. Type "php grab_list.php" into your command line, which should create a folder and a file with a list of the posts you specified;
  5. Edit grab_threads.php same as above;
  6. Type "php grab_threads.php" and wait; this should download a file with the comment thread per post;
  7. Edit and run tocsv_list.php (same as before "php toc...") to transform the post list into a CSV;
  8. Edit and run tocsv_comment.php to transform all of the comments into a CSV (this file also includes data and text for the post each comment is related to, for easier handling);
  9. If there is a problem, consider asking someone who knows this kind of stuff; if that fails, submit an issue on GitHub; do not contact the author of this script.
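
The first two steps (PHP version and write permission) can also be checked from PHP itself. The following is a minimal sketch and not part of the repository; the function name `preflight` is hypothetical, and the 5.5.4 minimum is simply the version named in step 1.

```php
<?php
// Hypothetical pre-flight check: returns a list of problems, or an empty
// array if the PHP version and the target directory look usable.
function preflight($directory, $minimumVersion = '5.5.4')
{
    $problems = array();
    if (version_compare(PHP_VERSION, $minimumVersion, '<')) {
        $problems[] = 'PHP ' . PHP_VERSION . ' is older than ' . $minimumVersion;
    }
    if (!is_dir($directory)) {
        $problems[] = "Not a directory: $directory";
    } elseif (!is_writable($directory)) {
        $problems[] = "Directory is not writable: $directory";
    }
    return $problems;
}

// Check the directory this file lives in, i.e. where the scripts would write.
$problems = preflight(__DIR__);
echo $problems ? implode(PHP_EOL, $problems) : 'Looks fine', PHP_EOL;
```

Saved next to the downloaded scripts under any name and run with "php FILENAME", it reports either the problems found or "Looks fine".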

License:The Unlicense


Languages

Language:PHP 100.0%