luistelmocosta / datacollect

A collection of tools to collect and download various data.

Often, I write simple scripts and tools to collect data for various "data science" tasks. I thought that it might be worthwhile to collect them in a central repository since they might be useful to others!

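Contents

- Collect Lyrics
- Twitter Timeline
- Collect Popular Music Tags
- PDB Info Table
- ZINC Molecule Downloader
- Collect English Premier League Soccer Data
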
Important Note
I developed and tested these tools in Python 3.x. The scripts may not work flawlessly in Python 2.7.x because Unicode handling is more error-prone there.



Collect Lyrics

A command line tool to download song lyrics given artist names and song titles.



Twitter Timeline

A command line tool that downloads your personal Twitter timeline in CSV format, with an optional keyword filter.

A tutorial for turning your Twitter timeline into a word cloud is also included.
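
The keyword filter and CSV export mentioned above could look roughly like the sketch below. It is not the tool's actual code: the function names `filter_tweets` and `tweets_to_csv`, the `created_at` and `text` fields, and the sample tweets are all hypothetical, and fetching the timeline from Twitter is left out entirely.

```python
import csv
import io


def filter_tweets(tweets, keyword=None):
    """Keep tweets whose text contains `keyword` (case-insensitive).

    With no keyword, every tweet is kept.
    """
    if not keyword:
        return list(tweets)
    needle = keyword.lower()
    return [t for t in tweets if needle in t["text"].lower()]


def tweets_to_csv(tweets, fieldnames=("created_at", "text")):
    """Serialize tweet dicts to a CSV string with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",   # silently drop fields we did not ask for
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(tweets)
    return buffer.getvalue()


# Hypothetical sample data standing in for a downloaded timeline.
sample = [
    {"created_at": "2015-01-01", "text": "Learning Python today"},
    {"created_at": "2015-01-02", "text": "Coffee time"},
]

csv_text = tweets_to_csv(filter_tweets(sample, keyword="python"))
print(csv_text)
```

Using `csv.DictWriter` rather than joining strings by hand means commas and quotes inside tweet text are escaped correctly.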



Collect Popular Music Tags

A command line tool to download popular tags for a list of songs from last.fm, e.g., for various data mining projects.



PDB Info Table

A command line tool that creates an info table from a list of PDB files.
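
As a rough illustration of what building such an info table involves, the sketch below reads the fixed-column `HEADER` record of the PDB file format and counts `ATOM` records. The columns shown (ID, classification, deposition date, atom count) are an assumption chosen for the example and may differ from what the actual tool reports; `pdb_info` and `format_table` are hypothetical names.

```python
def pdb_info(pdb_text):
    """Extract a few summary fields from the text of one PDB file."""
    info = {"id": "", "classification": "", "deposited": "", "n_atoms": 0}
    for line in pdb_text.splitlines():
        if line.startswith("HEADER"):
            # PDB format, HEADER record (1-based columns):
            # 11-50 classification, 51-59 deposition date, 63-66 ID code.
            info["classification"] = line[10:50].strip()
            info["deposited"] = line[50:59].strip()
            info["id"] = line[62:66].strip()
        elif line.startswith("ATOM  "):
            info["n_atoms"] += 1
    return info


def format_table(rows, columns=("id", "classification", "deposited", "n_atoms")):
    """Render a list of info dicts as a tab-separated table with a header."""
    lines = ["\t".join(columns)]
    for row in rows:
        lines.append("\t".join(str(row[c]) for c in columns))
    return "\n".join(lines)


# Made-up minimal PDB content; the header is padded to the fixed columns.
header = "HEADER".ljust(10) + "HYDROLASE".ljust(40) + "01-JAN-00" + "   " + "1ABC"
example = "\n".join([
    header,
    "ATOM      1  N   MET A   1",
    "ATOM      2  CA  MET A   1",
    "END",
])

table = format_table([pdb_info(example)])
print(table)
```

For a list of files, one would call `pdb_info` on the contents of each file and pass the resulting list to `format_table`.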

ZINC Molecule Downloader

A command line tool for downloading 3D structures of small chemical molecules from http://zinc.docking.org.



Collect English Premier League Soccer Data

A command line tool to collect fantasy soccer data from the English Premier League.

About

License: GNU General Public License v3.0


Languages

Jupyter Notebook 77.8%, HTML 19.6%, Python 2.6%