robertmartin8 / KindleClippings

Extract kindle highlights into organised text files

python   MIT license   maintained  

One of the many great things about kindles is that you can highlight parts of your book to go back to later. However, it is perhaps surprising that there is no good way of aggregating the highlights (even per book).

KindleClippings is a utility born out of personal need. It fetches the highlights that you have made on your kindle and organises them into plain text files, one per book. It is run from the command line using:

python KindleClippings.py

You can also specify a format, which creates additional files of that type alongside the text files. pdf and docx are currently supported.

The result is a new folder containing one text file per book.

In my workflow, I then copy these into Evernote, but the whole point is that you are now free to do whatever you want.

Background

When you make highlights or add bookmarks on your kindle, they are saved to a text file on the kindle called My Clippings.txt. This file has a regular format, which means that it can be parsed:

==========
The Selfish Gene: 30th Anniversary Edition (Richard Dawkins)
- Your Highlight on page 92 | location 1406-1407 | Added on Saturday, 26 March 2016 14:59:39

Perhaps consciousness arises when the brain's simulation of the world becomes so complete that it must include a model of itself.(4)

==========
Fahrenheit 451 (Ray Bradbury)
- Your Bookmark at location 346 | Added on Saturday, 26 March 2016 15:46:21


==========
Fahrenheit 451 (Ray Bradbury)
- Your Highlight at location 784-785 | Added on Saturday, 26 March 2016 18:37:26

Who knows who might be the target of the well-read man?
==========
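To make the format concrete, here is a minimal sketch of how a single entry (the text between two separator lines) could be parsed. It is not the code from KindleClippings.py: the names `parse_entry`, `META_RE` and `TITLE_RE` are hypothetical, and the patterns are inferred only from the sample entries above, so other kindle firmware versions or interface languages may need different ones.

```python
import re

# Metadata pattern inferred only from the sample entries above; other kindle
# firmware versions or interface languages may word this line differently.
META_RE = re.compile(
    r"^- Your (?P<kind>\w+) (?:on|at) "
    r"(?:page (?P<page>\d+) \| )?"
    r"location (?P<location>[\d-]+) \| "
    r"Added on (?P<added>.+)$"
)

# "Title (Author)": the author is assumed to be the last parenthesised group.
TITLE_RE = re.compile(r"^(?P<title>.*)\((?P<author>[^()]*)\)$")


def parse_entry(entry):
    """Parse the text between two separator lines into a dict, or None."""
    # Dropping blank lines first means a stray empty line does no harm.
    lines = [line.strip() for line in entry.splitlines() if line.strip()]
    if len(lines) < 2:
        return None
    meta = META_RE.match(lines[1])
    if meta is None:
        return None
    title = TITLE_RE.match(lines[0])
    return {
        "title": title.group("title").strip() if title else lines[0],
        "author": title.group("author").strip() if title else None,
        "kind": meta.group("kind"),          # e.g. "Highlight" or "Bookmark"
        "page": meta.group("page"),          # None when no page is given
        "location": meta.group("location"),
        "added": meta.group("added"),
        "text": " ".join(lines[2:]),         # empty for bookmarks
    }
```

Bookmarks come back with an empty `text` field, so a caller that only wants highlights can filter on that.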

Prerequisites

The only requirement for this project is to have Python (either Python 2 or Python 3) installed on your system. On macOS, Python has traditionally come preinstalled, although recent macOS releases may require you to install it separately. On Windows, Python can be installed following the instructions here.

pdf and docx conversion requires additional libraries, which are listed in requirements.txt and can be installed with the following pip command:

pip install -r requirements.txt

Basic usage

It is recommended that you download KindleClippings.py and place it either in your home directory or on the desktop. Connect your kindle, and make sure it shows up in your filesystem. Then, open up your terminal/shell.

If you're on a mac, you might just be able to run:

python KindleClippings.py

However, most users will need to specify the path to the kindle and, optionally, the path to the destination. By default, the script creates a folder called KindleClippings in the current directory and places the resulting text files there (though this likely only works on mac). For example:

python KindleClippings.py -source /Volumes/Kindle/

On Windows, this might look something like:

python KindleClippings.py -source C:\Kindle -destination \
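For readers curious how flags like these can be handled, the following is a rough sketch using the standard `argparse` module. The flag names `-source`, `-destination` and `-format` come from the commands in this README; everything else (the function name `build_parser`, the help strings, and the `/Volumes/Kindle` default for the source) is an assumption made for illustration and may not match what KindleClippings.py actually does.

```python
import argparse
import os


def build_parser():
    """Build a parser for the three flags shown in this README (sketch only)."""
    parser = argparse.ArgumentParser(
        description="Export kindle highlights to one text file per book"
    )
    parser.add_argument(
        "-source",
        default="/Volumes/Kindle",  # assumed default; matches the mac example
        help="path to the mounted kindle",
    )
    parser.add_argument(
        "-destination",
        # The README says output goes to a KindleClippings folder in the
        # current directory when no destination is given.
        default=os.path.join(os.getcwd(), "KindleClippings"),
        help="folder in which to write the text files",
    )
    parser.add_argument(
        "-format",
        choices=["pdf", "docx"],
        default=None,
        help="optional extra output format",
    )
    return parser


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["-source", "/Volumes/Kindle/", "-format", "pdf"])
print(args.source, args.destination, args.format)
```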

If the parsing is successful, the script will print all of the exported titles:

Exported titles:

To Kill a Mockingbird - Harper Lee.txt
A Clockwork Orange - 50th Anniversary Edition - Anthony Burgess.txt
The Road - Cormac McCarthy.txt
Fahrenheit 451 - Ray Bradbury.txt
Heart of Darkness - Joseph Conrad.txt
The Meaning of It All - Richard P Feynman.txt
The Selfish Gene - 30th Anniversary Edition - Richard Dawkins.txt
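The file names above differ from the title lines in My Clippings.txt: the colon and the parentheses around the author are gone. A sketch of one way to get from one to the other is shown below. The function `title_to_filename` is hypothetical, and the exact rules (including dropping full stops, which is a guess based on "Richard P Feynman") are inferred from the listing rather than taken from the script.

```python
import re


def title_to_filename(title_line):
    """Turn a 'Title (Author)' line into a filesystem-friendly .txt name."""
    name = title_line.strip()
    # "Title (Author)" -> "Title - Author"
    match = re.match(r"^(.*?)\s*\(([^()]*)\)$", name)
    if match:
        name = "{} - {}".format(match.group(1), match.group(2))
    # Colons are not allowed in Windows file names; use " - " instead.
    name = re.sub(r"\s*:\s*", " - ", name)
    # Drop other characters that are awkward in file names (assumed rule).
    name = re.sub(r'[\\/*?"<>|.]', "", name)
    # Collapse any runs of whitespace left behind.
    name = re.sub(r"\s+", " ", name).strip()
    return name + ".txt"


print(title_to_filename("Fahrenheit 451 (Ray Bradbury)"))
```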

Format usage: the formatter runs after the text files have been created, so in addition to the text files it also creates files of the specified type:

python KindleClippings.py -source C:\Kindle -format pdf

About

I originally forked firewood, but I realised that my fork was fundamentally different to firewood – to the extent that it has become a different solution.

If you play around with firewood enough, you'll find that sometimes it can just completely break. This is because firewood relies on the regular order of the My Clippings.txt file from the kindle. For the most part, this is a fair assumption. However, I have found that very occasionally, kindle will insert an extra blank line that will prevent the whole program from functioning.

My solution still relies on some regularity, but it is much more robust to irregularities. It first splits the text file into individual highlights, then proceeds from there.
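The split-first idea can be illustrated with a short sketch. This is an assumption about how such an approach might look, not the script's actual code: the names `split_clippings` and `group_by_book` are made up, and the example assumes Python 3 and the `==========` separator shown in the Background section.

```python
from collections import OrderedDict

SEPARATOR = "=========="


def split_clippings(raw):
    """Split the raw file contents into entries (lists of non-blank lines)."""
    entries = []
    for chunk in raw.split(SEPARATOR):
        # Blank lines are discarded, so an extra one cannot shift the layout.
        lines = [line.strip() for line in chunk.splitlines() if line.strip()]
        if lines:
            # Some files may begin with a byte-order mark; strip it if present.
            lines[0] = lines[0].lstrip("\ufeff")
            entries.append(lines)
    return entries


def group_by_book(entries):
    """Map each title line to its list of highlight texts, in file order."""
    books = OrderedDict()
    for lines in entries:
        title = lines[0]
        text = " ".join(lines[2:])  # lines[1] is the metadata line
        if text:  # bookmarks carry no text and are skipped
            books.setdefault(title, []).append(text)
    return books
```

Because each entry is handled on its own after the split, a malformed entry can be skipped without affecting the ones around it.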

Sometimes when you make a highlight on kindle and then delete it, it is still stored in the clippings file. So if you make a wrong highlight and redo it, you'll end up with multiple very similar highlights. I haven't yet decided whether this is worth fixing, but in my workflow it's not very important.
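If the duplicates do bother you, one possible way to filter them afterwards is sketched below. This is not part of KindleClippings: the functions `is_near_duplicate` and `drop_near_duplicates` and the 0.8 threshold are assumptions chosen for illustration, using the standard library's `difflib`. The sketch treats a later highlight as the corrected version and lets it replace the earlier one.

```python
from difflib import SequenceMatcher


def is_near_duplicate(a, b, threshold=0.8):
    """Return True if one text contains the other or they are very similar."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return False
    if a in b or b in a:
        return True
    return SequenceMatcher(None, a, b).ratio() >= threshold


def drop_near_duplicates(highlights, threshold=0.8):
    """Keep one highlight per group of near-duplicates (the latest one)."""
    kept = []
    for text in highlights:
        for i, existing in enumerate(kept):
            if is_near_duplicate(existing, text, threshold):
                kept[i] = text  # assume the later highlight is the redo
                break
        else:
            kept.append(text)
    return kept
```

The comparison is quadratic in the number of highlights per book, which should be fine for typical clippings files but is worth keeping in mind for very large ones.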

License:MIT License

