Downloads and parses the subtitle dataset from opensubtitles.org.
python3 parse_opensubtitle_xml.py
The command above downloads a zip archive containing the English OpenSubtitles corpus and extracts the text from every XML file in it, discarding the metadata.
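The extraction step could look roughly like the sketch below. It is not the code in parse_opensubtitle_xml.py, which is not shown here. It assumes each subtitle file wraps its sentences in `<s>` elements (with optional `<w>` token and `<time>` children), and the function names `extract_sentences` and `extract_zip` are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET


def extract_sentences(xml_bytes):
    """Return one plain-text string per <s> element, dropping all markup.

    Assumption: sentences live in <s> elements; adjust the tag name if the
    corpus files use a different layout.
    """
    root = ET.fromstring(xml_bytes)
    sentences = []
    for s in root.iter("s"):
        # itertext() yields only text nodes, so tags and attributes
        # (ids, timestamps) are left out. The split/join collapses whitespace.
        text = " ".join(" ".join(s.itertext()).split())
        if text:
            sentences.append(text)
    return sentences


def extract_zip(zip_file):
    """Yield (member_name, sentences) for every .xml member of a zip archive."""
    with zipfile.ZipFile(zip_file) as archive:
        for name in archive.namelist():
            if name.endswith(".xml"):
                yield name, extract_sentences(archive.read(name))
```

`extract_zip` accepts a path or a file-like object, so it can be pointed at the downloaded archive directly, and it reads members one at a time instead of unpacking the whole corpus to disk.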