Syntactically highlighting PDF Documents in Haskell.
This is a functional toy program and is still experimental.
It also serves as an example of a Haskell GUI (gi-gtk).
The Japanese page is here.
Recommended: Ubuntu 20.04 LTS (Focal Fossa) Desktop or Ubuntu 18.04 LTS (Bionic Beaver) Desktop.
Lubuntu 20.04 Desktop also works.
Instructions for Windows 10 (Home / Pro) are here (separate page).
For other distros or operating systems, an equivalent process may work.
sudo apt update
wget -qO- https://get.haskellstack.org/ | sh
https://github.com/haskell-gi/haskell-gi
sudo apt install libgirepository1.0-dev libwebkit2gtk-4.0-dev libgtksourceview-3.0-dev
sudo apt install libpoppler-dev libpoppler-glib-dev
sudo apt install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"
sudo apt install docker-ce
sudo docker pull nlpbox/corenlp:2018-10-27
sudo docker pull graham3333/corenlp-complete
See Stanford CoreNLP and Graham MacDonald
git clone https://github.com/polymonyrks/poppyS.git
cd poppyS
stack build
The rest of this page assumes you have installed poppyS at $HOME/poppyS, for example:
/home/username/poppyS
You can check your $HOME with the following command.
echo $HOME
- Prepare some PDF Files.
- Run Stanford CoreNLP Server.
- Execute poppyS with the target PDF file's path as its argument. Only English documents are supported. (Support for Japanese documents, which is more experimental, is here.)
For more details, see below.
poppyS is suited to hard-to-read documents, the kind that read a bit like Greek or Latin.
Examples of such documents are listed below.
Try reading them yourself with poppyS.
- Computer Science
- Haskell Wikibooks
- Basic Category Theory
- Category Theory for Programmers
- SICP
- Practical Foundations for Programming Languages
- Homotopy Type Theory
- Legal / Financial
- Copyright Law of the United States
- License Agreement Templates
- Annual Report (2019) (Apple Inc.)
- Host City Contract - Tokyo 2020 Olympic Games
- Biology / Medical
- User's Manual
sudo docker run -p 9000:9000 nlpbox/corenlp
poppyS assumes the Docker container is reachable at localhost. If your environment differs, you may get an error; in that case, modify the line below in fromPDF.hs.
command = "http://localhost:9000/?annotators=parse&outputFormat=json&timeout=50000"
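If the server is not on localhost, one way to think about the change is to build the URL from a host and a port. The following is only a minimal sketch: `coreNlpUrl` is a hypothetical helper that does not exist in fromPDF.hs, and the second address is a made-up example of a Docker host on another machine.

```haskell
-- Hypothetical helper, not taken from fromPDF.hs: builds the CoreNLP
-- request URL from a host and a port instead of hard-coding localhost.
module Main where

coreNlpUrl :: String -> Int -> String
coreNlpUrl host port =
  "http://" ++ host ++ ":" ++ show port
    ++ "/?annotators=parse&outputFormat=json&timeout=50000"

main :: IO ()
main = do
  -- Default setup: Docker publishes the server on localhost:9000.
  putStrLn (coreNlpUrl "localhost" 9000)
  -- Assumed example of a Docker host reachable at another address.
  putStrLn (coreNlpUrl "192.168.99.100" 9000)
```

The query string is copied unchanged from the line above, so only the host and port vary.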
stack exec poppyS-exe TARGETPDFPATH
For example: (e.g.1) when you are at $HOME/poppyS.
stack exec poppyS-exe "pdfs/SICP.pdf"
(e.g.2) A full path is also O.K.
stack exec poppyS-exe "/home/username/poppyS/pdfs/SICP.pdf"
After launching poppyS, wait a few seconds. Some words will be colored yellow.
If no word is colored, the Stanford CoreNLP Server has probably timed out, so run poppyS again.
The key bindings are similar to Vim's.
command | effect |
---|---|
j | increase page (2 pages) |
k | decrease page (2 pages) |
Down | crop the white margins and adjust the page size |
Right | increase page (1 page) |
Left | decrease page (1 page) |
w | toggle the number of pages displayed (2 pages (maximized) or 1 page) |
dd | decolor all phrases |
p | paste (recover) coloring |
x | cut coloring (enter or leave Deleting Mode) |
gg | go to the first page |
v | toggle mode (Vanilla) |
n | toggle mode (backward) |
m | toggle mode (forward) |
:w Enter | save the state |
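The page-navigation keys in the table above can be read as a small pure function. This is a sketch under assumptions, not the program's actual code: the `NavKey` type, the 0-based page index, and the clamping to the valid page range are all invented for illustration.

```haskell
module Main where

-- Navigation keys from the table above (constructor names are illustrative).
data NavKey = KeyJ | KeyK | KeyRight | KeyLeft | KeyGG
  deriving (Show, Eq)

-- Keep a page index inside [0, total - 1].
-- Clamping is an assumption of this sketch.
clamp :: Int -> Int -> Int
clamp total = max 0 . min (total - 1)

-- Compute the new page (0-based) from the total page count,
-- the current page, and the key pressed.
navigate :: Int -> Int -> NavKey -> Int
navigate total page key = clamp total $ case key of
  KeyJ     -> page + 2
  KeyK     -> page - 2
  KeyRight -> page + 1
  KeyLeft  -> page - 1
  KeyGG    -> 0

main :: IO ()
main =
  -- Replay a key sequence on a 10-page document, starting at page 0.
  print (scanl (navigate 10) 0 [KeyJ, KeyJ, KeyRight, KeyK, KeyGG])
```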
action | effect |
---|---|
left click a word | color the corresponding phrases (toggle forward) |
right click a word | color the corresponding phrases (toggle backward) |
left click blank space | increase page (2 pages) |
right click blank space | decrease page (2 pages) |
Click a word, and the corresponding phrases are colored. Clicking the same word repeatedly cycles through the colors.
The cycle is Red -> Blue -> Green -> Purple -> Orange -> Pink -> Aqua -> Cyan -> Red, and so on.
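The cycle can be modeled as an enumeration that wraps around at both ends. This is a minimal sketch, not the program's actual code: the `Color` type and both functions are illustrative names, and it assumes that a right click (toggle backward) walks the same cycle in reverse.

```haskell
module Main where

-- Colors in the order of the toggle schedule described above.
data Color = Red | Blue | Green | Purple | Orange | Pink | Aqua | Cyan
  deriving (Show, Eq, Enum, Bounded)

-- Left click: step forward, wrapping from the last color back to Red.
nextColor :: Color -> Color
nextColor c
  | c == maxBound = minBound
  | otherwise     = succ c

-- Right click (assumed): step backward, wrapping from Red to the last color.
prevColor :: Color -> Color
prevColor c
  | c == minBound = maxBound
  | otherwise     = pred c

main :: IO ()
main = print (take 9 (iterate nextColor Red))
```

Deriving `Enum` and `Bounded` keeps the order in one place, so adding a color to the cycle only means adding a constructor.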
Pressing dd decolors all phrases. Some special words remain yellow.
Yellow words are special with respect to how large an area is colored when you select them.
After pressing dd, press p to recover the previous state.
(Caveat: if you click another word after dd, the previous state is overwritten and lost.)
Even after you decolor all the phrases, the tuples (e.g. (Red, word1), (Green, word2), (Aqua, word3) ..) are remembered.
The next time you click such a word, it starts from its previous color.
Temporarily decoloring (dd), recovering (p), and resuming from previous colors help keep the page readable.
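The behavior of dd, p, and the remembered tuples can be pictured as a small state record. The sketch below is a guess based only on the description above, not on the actual source: `Viewer`, its three fields, and `clickHidden` are hypothetical names, and pairs are stored as (word, color) so they can be looked up by word.

```haskell
module Main where

import Data.Maybe (fromMaybe)

type Coloring = [(String, String)]  -- (word, color name)

data Viewer = Viewer
  { memory :: Coloring  -- last color of every word colored so far
  , shown  :: Coloring  -- colorings currently drawn
  , saved  :: Coloring  -- snapshot that p restores
  } deriving (Show, Eq)

-- dd: snapshot the visible colorings, then hide them all.
decolorAll :: Viewer -> Viewer
decolorAll v = v { shown = [], saved = shown v }

-- p: restore the snapshot taken by the last dd.
recover :: Viewer -> Viewer
recover v = v { shown = saved v }

-- First click on a hidden word: resume from its remembered color
-- (Red if it was never colored). The snapshot is overwritten, which
-- is how this sketch models the caveat above.
clickHidden :: String -> Viewer -> Viewer
clickHidden w v = v { memory = entry : filter ((/= w) . fst) (memory v)
                    , shown  = newShown
                    , saved  = newShown }
  where
    color    = fromMaybe "Red" (lookup w (memory v))
    entry    = (w, color)
    newShown = entry : filter ((/= w) . fst) (shown v)

main :: IO ()
main = do
  let mem   = [("functor", "Green"), ("monad", "Aqua")]
      start = Viewer mem mem []
  -- dd followed by p brings every coloring back.
  print (shown (recover (decolorAll start)))
  -- dd followed by a click resumes that word from its old color.
  print (shown (clickHidden "monad" (decolorAll start)))
```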
Pressing x enters Deleting Mode. Click a word, and the corresponding phrases are decolored.
Pressing x again leaves Deleting Mode.
Coloring words makes phrases easier to pick out, but too much coloring increases entropy (the page becomes chaotic).
Use some tactics and train yourself to use multiple colors (the latter is like a VR game). I'll show some tactics soon.
The art of reading natural language with coloring is still incomplete and experimental (especially active, ad hoc coloring by readers rather than by the writer), but I believe it will someday be a common art.
(See also the LICENSE file included with the Haskell source.)
Modified BSD3. Personal use and educational use are O.K.
I also have some patent applications relating to these source files and the technology used therein.
If you comply with the LICENSE terms, I will never assert such intellectual property against the usage described in the LICENSE terms.
If you are interested in commercial use, please contact me.
- Functional Toy Manufacturing (Japanese Homepage) (https://www.polymony.net)
- Email: polymonyrks@polymony.net or polymonyrks@gmail.com