kirubel-web / ScriptMiner

Movie Transcript Scraper

This Python script scrapes movie transcripts from subslikescript.com. It collects the links to movie scripts from the site's main page, then saves each script to its own text file.
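The link-extraction step can be sketched roughly as follows. This is a minimal illustration, not the code in scraper.py: the function name `extract_script_links`, the `/movie/` path prefix, and the sample HTML are all assumptions made for the example. It parses an inline HTML string so it runs offline; in a real run the HTML would presumably come from something like `requests.get(url).text`.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://subslikescript.com"  # site named in this README


def extract_script_links(html, base_url=BASE_URL):
    """Return absolute URLs for every <a href> that looks like a movie page.

    Hypothetical helper for illustration; scraper.py may do this differently.
    """
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"]
        # Assumption: movie pages live under a "/movie/" path prefix.
        if href.startswith("/movie/"):
            # Turn the site-relative path into a full URL.
            links.append(urljoin(base_url, href))
    return links


# Made-up markup standing in for the main page.
sample = """
<ul>
  <li><a href="/movie/Example_One-123">Example One</a></li>
  <li><a href="/about">About</a></li>
  <li><a href="/movie/Example_Two-456">Example Two</a></li>
</ul>
"""

for link in extract_script_links(sample):
    print(link)
```

Filtering on the path prefix keeps navigation links such as `/about` out of the result; the actual selector depends on the site's markup at the time of scraping.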

Dependencies

Install dependencies using:

pip install beautifulsoup4 requests

Usage

  1. Clone the repository:
git clone https://github.com/kirubel-web/ScriptMiner.git
cd ScriptMiner
  2. Run the script:
python scraper.py

The script will fetch movie transcript links from the main page and save individual transcripts to text files.

File Structure

  • scraper.py: The main Python script for scraping movie transcripts.
  • requirements.txt: List of Python dependencies.
  • README.md: Documentation for the project.

Output

The scraped movie transcripts are saved in individual text files. The file name corresponds to the movie title.
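Movie titles often contain characters that are not valid in file names (for example `:` or `?`), so a title usually needs cleaning before it is used as one. The sketch below shows one possible approach; `title_to_filename` and `save_transcript` are hypothetical helpers for illustration, and the set of replaced characters is an assumption, not necessarily what scraper.py does.

```python
import re
from pathlib import Path


def title_to_filename(title):
    """Build a .txt file name from a movie title (hypothetical helper)."""
    # Assumption: replace characters that are invalid on common filesystems.
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", title).strip()
    # Fall back to a placeholder if nothing usable is left.
    return f"{cleaned or 'untitled'}.txt"


def save_transcript(title, transcript, out_dir="."):
    """Write one transcript to <out_dir>/<title>.txt and return the path."""
    path = Path(out_dir) / title_to_filename(title)
    path.write_text(transcript, encoding="utf-8")
    return path


print(title_to_filename("Mission: Impossible"))
```

Writing with an explicit `encoding="utf-8"` avoids platform-dependent defaults when a transcript contains non-ASCII characters.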

Contributing

To contribute to this project or report a problem, submit a pull request or open an issue.

License

This project is licensed under the MIT License - see the LICENSE file for details.
