pettarin / export-kobo

A Python tool to export annotations and highlights from a Kobo SQLite file.

Option to have .txt output match formatting of Kindle clippings?

curiositry opened this issue

Is there any chance you’d consider adding the option to export to a text file formatted like Kindle’s “My Clippings.txt”?

It would be handy for people who have both devices, who want to keep their notes consistent when they switch platforms, or who want to use one of the many tools available for managing Kindle clippings.

Sure, I have no problem with that.

Two routes:

  1. if you can, fork, add the function, and open a pull request, or
  2. send me (via email or as a reply here) an example of the new output format.

If you go with option 1, I suggest adding a new command-line switch (e.g. --kindle) to select the new format, similar to the existing --csv.
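As a rough illustration of what such a switch could look like, here is a minimal sketch using Python's standard argparse module. The build_parser function, the positional db argument, and the choice to make --csv and --kindle mutually exclusive are all assumptions made for this example, not a description of how export-kobo actually parses its options.

```python
import argparse


def build_parser():
    # Hypothetical parser: the real tool's options and their names may differ.
    parser = argparse.ArgumentParser(prog="export-kobo")
    parser.add_argument("db", help="path to the Kobo SQLite file")

    # Assumption: only one output format can be selected at a time.
    formats = parser.add_mutually_exclusive_group()
    formats.add_argument("--csv", action="store_true",
                         help="output in CSV format")
    formats.add_argument("--kindle", action="store_true",
                         help="output in Kindle 'My Clippings.txt' format")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("kindle" if args.kindle else "csv" if args.csv else "text")
```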

Great! Since I’m not familiar with the codebase and am a bit sloppy with Python, I’d vote for the easy (for me) route 😄


A bookmark:

==========
The Odyssey / Rendered into English prose for the use of those who cannot read the original (Homer)
- Your Bookmark on page 10 | location 139 | Added on Friday, 19 December 2014 19:54:11


A highlight:

==========
Meditations (Aurelius, Marcus)
- Your Highlight on page 10 | location 874-874 | Added on Friday, 26 December 2014 13:09:52

These things thou must always have in mind: What is the nature of the universe, and what is mine?

A note:

==========
Essays by Ralph Waldo Emerson (Ralph Waldo Emerson)
- Your Note at location 261 | Added on Sunday, 8 February 2015 17:25:12

True for me. How do we become humans who think, rather than just thinkers ... thinking machines?

Here’s somebody else’s full file if that's easier:

https://www.mobileread.com/forums/attachment.php?attachmentid=97944&d=1355946838

Thanks.

I see your snippets and the linked file are a bit different. For example, you have "- Your Bookmark on page 10" while the file has "- Bookmark on Page 62" or "- Bookmark Loc. 508", but I guess I can just code your version, which looks newer.

I am not sure about the page and location values, which I think are not recoverable from the Kobo data. I will probably just output dummy values.

Yes, it looks like they changed the format at some point. The Mobileread file is from 2012, and it also has a slightly different date format. My clippings file, which ran from 2014 to the end of 2016, seems to be internally consistent.

Thanks so much Alberto!

(If the page numbers were in the SQLite somewhere, it’d be possible to roughly translate the locations to page numbers (1 "loc" ≈ 125 characters). It also might be possible to make a script, for highlights at least, that would search for the highlight in your ebook library and nab the location — but that would be a different project :)
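To make the idea concrete, here is a small sketch of the arithmetic, assuming the rough figure of 125 characters per location quoted above and assuming the book's plain text is already available as a string. Both function names are hypothetical, and numbering locations from 1 is also an assumption.

```python
CHARS_PER_LOCATION = 125  # rough figure from the comment above, not an exact Kindle constant


def estimate_location(char_offset):
    """Return an approximate location for a 0-based character offset.

    Assumption: locations are numbered starting from 1.
    """
    if char_offset < 0:
        raise ValueError("char_offset must be non-negative")
    return char_offset // CHARS_PER_LOCATION + 1


def find_highlight_location(book_text, highlight):
    """Search the book text for a highlight and estimate its location.

    Returns None if the highlight is not found verbatim.
    """
    index = book_text.find(highlight)
    if index == -1:
        return None
    return estimate_location(index)
```

An exact string search like this would miss highlights whose whitespace or punctuation differs from the ebook source, so a real script would probably need some normalization first.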

I added the --kindle option in commit 4af8eea and tagged it as (tentative) v2.1.0.

Please see if the output is good. If so, just feel free to close this issue. If not, let me know.

The problem with pages/locations is that the Kobo SQLite file contains locations expressed using (roughly) EPUB CFI, which are not easily translatable into pages/locations in the Kindle sense. Moreover, each record also has a field with a value in [0,1], denoting the location within a chapter (XHTML file?), presumably to ease the (re)computation of the progress indicator in the ereader UI. For now, my script always outputs "location 1 on page 1".
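For readers who want to see the shape of such an entry, here is a minimal sketch of a formatter that follows the sample snippets posted earlier in this thread and always uses the dummy page and location values. It is not the code from the commit: the function names, the parameters, and the exact wording of the metadata line are assumptions. Day and month names are hardcoded so the output does not depend on the system locale.

```python
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def format_kindle_date(dt):
    """Format a datetime like 'Friday, 26 December 2014 13:09:52'."""
    return (f"{WEEKDAYS[dt.weekday()]}, {dt.day} "
            f"{MONTHS[dt.month - 1]} {dt.year} {dt:%H:%M:%S}")


def format_kindle_entry(title, author, kind, added, text=""):
    """Build one clippings entry; kind is 'Highlight', 'Note' or 'Bookmark'.

    Assumption: page and location are always the dummy value 1,
    since they are not recoverable from the Kobo data.
    """
    lines = [
        "==========",
        f"{title} ({author})",
        f"- Your {kind} on page 1 | location 1 | "
        f"Added on {format_kindle_date(added)}",
        "",
    ]
    if text:
        lines.append(text)
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_kindle_entry(
        "Meditations", "Aurelius, Marcus", "Highlight",
        datetime(2014, 12, 26, 13, 9, 52),
        "These things thou must always have in mind: What is the nature "
        "of the universe, and what is mine?"))
```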

@curiositry does it work? Shall I close the issue?

Yes, it works great — thanks so much!

I noticed one very minor issue: "December" is spelled "Dicember". I was planning to submit a PR for this but never got around to it.

EDIT: looks like this is fixed already :)

Sure, no problem. Yes, the misspelling has been fixed in bc7229e (current master).