Benudek / fastai-transcript

Video Transcripts of fast.ai MOOC courses made into searchable ebooks

Home Page: http://course.fast.ai/

About

I took the available course outlines, video timelines, and auto-generated YouTube subtitles and spliced them together to make searchable HTML/PDF files.

Since YouTube's subtitles are just an endless sequence of lowercase words, I used a neural-net project, http://bark.phon.ioc.ee/punctuator, to automatically create sentences and add punctuation. It did a pretty good job.

The outcome is usable, but it can become much more useful if the community makes an effort to proofread the transcripts while watching the videos and correct them where necessary.

Download PDFs

You can download the PDFs here:

  1. fast.ai - Deep Learning Part 1 (v2) Transcript (2018).pdf
  2. fast.ai - Deep Learning Part 2 (v2) Transcript (2018).pdf
  3. fast.ai - Intro To Machine Learning 1 (v1) Transcript (2018).pdf

Your Proofreading Help Is Needed

If you'd like to contribute, you can proofread just a single section of a few paragraphs, or several sections. And if you feel inspired, even more ;)

The git repository contains multiple files, but please ignore the files in the build directories and only work on the "lesson NN.html" files.

The following course transcripts currently need work:

  1. fast.ai - Deep Learning Part 1 (v2)
  2. fast.ai - Deep Learning Part 2 (v2)
  3. fast.ai - Intro To Machine Learning 1 (v1)

Proofreading Notes

Since the files, including their paragraph breaks and punctuation, were generated automatically, sentences often got split in the wrong place, so you may need to move parts of a sentence around to make it whole again. At times the first half of a sentence is in one section of the document and the second half is in the following section. Please move them together to where the whole sentence belongs.
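
To help locate split sentences like the ones described above, here is a minimal Python sketch of a heuristic scanner. It is not part of this repository; it assumes the editable sections use bare <p>...</p> tags, and the function name `suspicious_paragraphs` is made up for illustration. It only flags candidates, so every hit still needs a human look.

```python
import re

# Assumption: editable sections are wrapped in bare <p>...</p> tags.
P_TAG = re.compile(r"<p>(.*?)</p>", re.DOTALL)
TERMINAL = ".?!\"')"  # characters that plausibly end a complete sentence


def suspicious_paragraphs(html: str) -> list[tuple[int, str]]:
    """Return (paragraph index, reason) for paragraphs that look like fragments."""
    flagged = []
    for index, body in enumerate(P_TAG.findall(html)):
        text = body.strip()
        if not text:
            continue
        reasons = []
        if text[0].islower():
            reasons.append("starts lowercase")
        if text[-1] not in TERMINAL:
            reasons.append("no terminal punctuation")
        if reasons:
            flagged.append((index, ", ".join(reasons)))
    return flagged
```

Running it on the text of one lesson file returns the positions of paragraphs that probably begin or end mid-sentence; paragraphs that legitimately start with a lowercase word (such as "wget") will show up as false positives.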

Sometimes the subtitle generator made a consistent misspelling, and it's easier to fix those across all files at once. If you find multiple occurrences of the same misspelling, please let me know. For example, I have already applied the following fixes (Perl regex syntax):

# common misspellings
s#Cagle|Cargill#Kaggle#ig;
s#Kressel#Crestle#ig;
s#fast,? AI|first AI|FASTA guy#fastai#ig;
s#panda's#pandas#ig;
s#Curly's#curlies#ig;
s#Pi ?torch|paytorch|hi ?torch#pytorch#ig;
s#SK learns?#scikit-learn#ig;
s#SJD#SGD#ig;
s#W get#wget#ig;
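
If you would rather try candidate substitutions in Python before proposing them, the list above can be ported roughly as follows. This is a sketch, not the script used for this repository: the Perl `i` flag is mapped to `re.IGNORECASE`, the `g` flag is the default behaviour of `re.sub`, and the names `FIXES`, `apply_fixes`, and `fix_file` are hypothetical.

```python
import re
from pathlib import Path

# Python port of the Perl substitutions above: (pattern, replacement).
FIXES = [
    (r"Cagle|Cargill", "Kaggle"),
    (r"Kressel", "Crestle"),
    (r"fast,? AI|first AI|FASTA guy", "fastai"),
    (r"panda's", "pandas"),
    (r"Curly's", "curlies"),
    (r"Pi ?torch|paytorch|hi ?torch", "pytorch"),
    (r"SK learns?", "scikit-learn"),
    (r"SJD", "SGD"),
    (r"W get", "wget"),
]


def apply_fixes(text: str) -> str:
    """Apply every substitution in order, case-insensitively."""
    for pattern, replacement in FIXES:
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text


def fix_file(path: Path) -> None:
    """Rewrite one file in place (assumes UTF-8 encoding)."""
    path.write_text(apply_fixes(path.read_text(encoding="utf-8")), encoding="utf-8")
```

For example, `apply_fixes("first AI uses sjd")` gives `"fastai uses SGD"`. Because the patterns are not anchored to word boundaries, it is worth reviewing the resulting diff before sending it.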

You can send me your fixes as a git pull request or as a good ol' patch, which you can email me as an attachment or share via a link to Dropbox, Google Drive, etc.

When you edit the HTML files, please use a plain-text editor. An HTML editor is likely to inject a whole bunch of HTML tags, in which case the merge won't be easy and your changes might get rejected. The sections that need editing are in almost plain-text format with just <p> and </p> tags, so you don't need to know HTML to edit them.
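
One way to check that your editor did not inject extra tags is to compare tag counts before and after editing. The following is a rough Python sketch under that assumption; the helper names `tag_counts` and `injected_tags` are invented here and are not part of this repository.

```python
import re
from collections import Counter

# Matches the name of an opening or closing tag, e.g. "p", "/p", "span".
TAG_NAME = re.compile(r"<(/?[A-Za-z][A-Za-z0-9]*)")


def tag_counts(html: str) -> Counter:
    """Count how often each tag name occurs in the text."""
    return Counter(name.lower() for name in TAG_NAME.findall(html))


def injected_tags(original: str, edited: str) -> Counter:
    """Tags that occur more often in the edited text than in the original."""
    return tag_counts(edited) - tag_counts(original)
```

An empty result means the edit changed only text. Anything else, such as extra `span` tags, suggests the editor added markup that would complicate the merge.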

Build PDF

To build the "lesson NN.html" files into a PDF I use a simple ./makepdf script, which requires wkhtmltopdf (with patched Qt). If you don't have it, just use the PDFs that are part of this git repository; I will keep those up to date as I receive improvements from you and others.
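
The contents of ./makepdf are not reproduced here, so the following is only an illustrative Python sketch of what a build step of this kind could look like. It relies on wkhtmltopdf accepting one or more input pages followed by the output file; the function names and the file names in the usage note are hypothetical.

```python
import shutil
import subprocess


def build_pdf_command(html_files, output_pdf, tool="wkhtmltopdf"):
    """Return the argument list: <tool> <input>... <output>."""
    return [tool, *map(str, html_files), str(output_pdf)]


def build_pdf(html_files, output_pdf):
    """Run wkhtmltopdf if it is installed; raise a clear error otherwise."""
    if shutil.which("wkhtmltopdf") is None:
        raise RuntimeError("wkhtmltopdf not found on PATH")
    subprocess.run(build_pdf_command(html_files, output_pdf), check=True)
```

A call such as `build_pdf(["lesson 01.html", "lesson 02.html"], "transcript.pdf")` would merge the listed lessons into one PDF, in the order given. Passing the arguments as a list avoids shell quoting problems with the spaces in the file names.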

The build folders contain the source data I used to create the mashups. You don't need to touch or proofread any files there. "lesson NN.html" files are the only files to be edited.

Thank you

Thank you for your contributions.

License: Apache License 2.0


Languages: HTML 99.2%, Perl 0.8%, Shell 0.1%