norvig / paip-lisp

Lisp code for the textbook "Paradigms of Artificial Intelligence Programming"

Convert cleaner version to markdown

pronoiac opened this issue

There's a much cleaner version of the book available on O'Reilly Safari that's text instead of scanned bitmaps.

Getting that version: Unpaid trial accounts are available, and you can use safaribooks to download and generate an epub file, or grab my copy. I'd love to compare against another download.

Much of the formatting, like italics, is missing when rendered but present in the source HTML. Note: @hourann also archived the Safari book as a forest of HTML files in a zip file; I found the epub easier to work with, as each chapter is only one file.

Converting it to markdown: The epub is a zip file with a mix of HTML / XML / XHTML. I found a tool for converting epub to markdown, and worked on it a bit: epub2markdown. The markdown generated is still hairy - and only one file! - and needs manual cleanup, but I hope it signals where cleanup is needed. The generated markdown file is in another branch.
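Since the epub is just a zip archive, a quick way to see what is inside before converting is to tally its members by file extension. This is a minimal sketch using only the Python standard library. The function name `count_epub_members` is made up for illustration and is not part of epub2markdown or safaribooks.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_epub_members(epub_path):
    """Return a Counter of file extensions found inside an epub (a zip archive)."""
    with zipfile.ZipFile(epub_path) as archive:
        return Counter(
            # Files without an extension (such as "mimetype") are grouped as "(none)".
            PurePosixPath(name).suffix.lower() or "(none)"
            for name in archive.namelist()
            if not name.endswith("/")  # skip directory entries
        )
```

Calling `count_epub_members("book.epub")` on a local download (the filename is a placeholder) should show roughly how many chapter files there are versus stylesheets, images, and metadata.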

I'm interested in what other people think of this.

Tim O'Reilly says we have permission to use the epub from Safari. I agree that it looks like the best source to start from.

Tremendous! As someone who is interested in making a clean HTML version eventually, this HTML version is a LOT more suitable to start from than (almost?) any other format such as Markdown, especially given my usual fairly manual conversion methods.

(Download it here.)

I added the Safari epub to this project.

We imported chapters from the cleaner source in #58. To do:

  • Filter out the likes of #l0015 and .unnumlist
  • Salvage previous work on chapters
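For the filtering item, here is a rough sketch of one way to do it. It assumes the residue shows up in the generated markdown as pandoc-style attribute blocks such as `{#l0015}` or `{.unnumlist}`; I haven't confirmed that every chapter looks like that, so treat the pattern as a starting point. The name `strip_attribute_blocks` is hypothetical.

```python
import re

# Assumption: leftover ids and classes appear as pandoc-style attribute blocks,
# e.g. "{#l0015}", "{.unnumlist}", or "{#l0015 .unnumlist}".
ATTR_BLOCK = re.compile(r"[ \t]*\{(?:\s*[#.][\w:-]+)+\s*\}")


def strip_attribute_blocks(markdown):
    """Remove attribute blocks (and the spaces before them) from markdown text."""
    return ATTR_BLOCK.sub("", markdown)
```

Braces that do not start with `#` or `.` are left alone, but the pattern does not know about fenced code, so it is worth diffing the output against the input before committing.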

I've been going over and reincorporating previous work in my salvage branch. Code blocks are a pain! Plz help!

Done: preface
In progress: chapter 1
To do: Chapters 2, 3, 5, 10, 21; perhaps 11, 12, 16, 19, 20

Chapters 2, 3, and 5 are salvaged and merged into master.

I'm considering another source for comparison, the Apple iBooks version. I think O'Reilly Safari is introducing some errors for fingerprinting, and that the epub downloaders are flawed, and I think iBooks sidesteps both of those.

While it's still a work in progress, this part is done.