agarmu / datamine-scraper

Scrapes Project information from the Data Mine Website into a Jupyter Notebook

The Datamine Scraper

The Datamine Scraper (tdmscrape) scrapes project information from the Data Mine Website into a Jupyter Notebook. This project is licensed under version 3 (or any later version) of the GNU AFFERO GENERAL PUBLIC LICENSE. A copy of version 3 of this license may be found in the LICENSE file at the root of the repository associated with this project.

Usage

$ tdmscrape <url>

That's it! An interactive wizard will ask you for a few details about your project, and the .ipynb skeleton for your notebook will be generated automatically.

Acknowledgment

My only request is that the line acknowledging me (as shown below) is left in both your notebook and any derivatives created from it.

This skeleton for this file was generated by the TDM Scraper made by Mukul Agarwal ...

You can also execute the following command to get more help/information regarding tdmscrape:

$ tdmscrape

Scrapes Project information from the Data Mine Website into a Jupyter Notebook

Usage:
  tdmscrape [URL] [flags]
  tdmscrape [command]

Examples:

To use this program, simply pass the requisite url as an argument.

E.g., to import the url "https://the-examples-book.com/projects/current-projects/10100-2023-project01", run:

$ tdmscrape "https://the-examples-book.com/projects/current-projects/10100-2023-project01"
	
The appropriate .ipynb file will be created in your current directory.

Available Commands:
  completion  Generate the autocompletion script for the specified shell
  help        Help about any command
  info        Get information about the program
  license     Prints the license

Flags:
  -h, --help                           help for tdmscrape
  -n, --name string                    name to use for document
  -i, --number int                     project number (default -1)
  -o, --overwrite                      Overwrite existing notebook
  -s, --sub-sub-questions-own-blocks   sub-sub-questions get their own code blocks and response area

Use "tdmscrape [command] --help" for more information about a command.
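If you want to drive tdmscrape from another Go program, the flags listed above can be assembled into an argument list. The following is a rough sketch under stated assumptions: the Options type and buildArgs helper are hypothetical, the sample name value is made up, and the help text does not say whether passing these flags skips the matching wizard prompts. The program only prints the command line; it does not run tdmscrape.

```go
package main

import (
	"fmt"
	"os/exec"
	"strconv"
)

// Options (hypothetical) mirrors some of the flags in the help text.
type Options struct {
	Name      string // --name; an empty string means "do not pass the flag"
	Number    int    // --number; -1 is the documented default
	Overwrite bool   // --overwrite
}

// buildArgs turns a URL and Options into a tdmscrape argument list,
// leaving out any flag that is still at its default value.
func buildArgs(url string, o Options) []string {
	args := []string{url}
	if o.Name != "" {
		args = append(args, "--name", o.Name)
	}
	if o.Number != -1 {
		args = append(args, "--number", strconv.Itoa(o.Number))
	}
	if o.Overwrite {
		args = append(args, "--overwrite")
	}
	return args
}

func main() {
	url := "https://the-examples-book.com/projects/current-projects/10100-2023-project01"
	opts := Options{Name: "my-project", Number: 1, Overwrite: true}

	// exec.Command only prepares the command. Call cmd.Run() to execute it,
	// which requires tdmscrape to be on your PATH.
	cmd := exec.Command("tdmscrape", buildArgs(url, opts)...)
	fmt.Println(cmd.Args)
}
```

Because the wizard is interactive, a real invocation would probably also need cmd.Stdin, cmd.Stdout, and cmd.Stderr connected to the terminal.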

Installation

Binary installation

Currently, prebuilt binaries (built with GoReleaser) are distributed only for the following systems.

  • Windows: x86-64, i386, and arm-64 architectures
  • Darwin (macOS): x86-64 and arm-64 architectures
  • Linux: x86-64, i386, and arm-64 architectures

These binaries can be found in the GitHub Releases for this repository.

The tdmscrape binary is not distributed through any package managers at this time.
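To check whether your machine falls into the list above, a small Go program can compare its own platform against it. This is only a sketch: the released table and the hasBinary helper are hypothetical names, and the table simply restates the list above using Go's platform names (x86-64 is amd64, i386 is 386, arm-64 is arm64).

```go
package main

import (
	"fmt"
	"runtime"
)

// released maps each GOOS to the GOARCH values in the list above.
var released = map[string][]string{
	"windows": {"amd64", "386", "arm64"},
	"darwin":  {"amd64", "arm64"},
	"linux":   {"amd64", "386", "arm64"},
}

// hasBinary reports whether the list above covers the given platform.
func hasBinary(goos, goarch string) bool {
	for _, arch := range released[goos] {
		if arch == goarch {
			return true
		}
	}
	return false
}

func main() {
	if hasBinary(runtime.GOOS, runtime.GOARCH) {
		fmt.Printf("%s/%s is in the list of prebuilt binaries\n", runtime.GOOS, runtime.GOARCH)
	} else {
		fmt.Printf("%s/%s is not listed; build from source instead\n", runtime.GOOS, runtime.GOARCH)
	}
}
```

Running it requires Go itself, so in practice it is mostly useful for confirming which release asset matches your platform.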

Building tdmscrape locally

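After installing Go on your system, run the following command to build the program from source and install it. Recent Go releases require a version suffix such as @latest when go install is run outside a module.

$ go install github.com/agarmu/datamine-scraper@latest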
