ahyatt / ekg

The emacs knowledge graph, app for notes and structured data.

Suggestion to add much greater functionality

vigilancetech-com opened this issue

If you were to drop a UUID of the note into the database instead of the note text itself, and put the note into a directory hierarchy somewhat like what org does with its attachments, then the user could (subject to some customization variable) add that hierarchy to org's agenda files variable and actually have TODOs, search for properties, use org-ql, etc. on them.

Is there really much to be gained by having the note text itself in the database?

I suppose there might be some intertext search that sqlite might provide but there's also plenty of great tools for that already available from emacs (considering how long it's been doing that kind of thing with plain text files).

Don't get me wrong, I love the idea of using sqlite and triples as a quickly searchable INDEX to textual/org/md notes, but maybe putting the actual notes in there too is a step too far.

Also, it might be a good idea, when such note files are created, to add some kind of file properties to them so that if the sqlite database gets hosed it could be recreated in some automated way from the note files themselves. In other words, put the tags, UUIDs, titles, etc. in there too. Maybe in such a way that they can also be searched with org-ql (but that's probably a stretch because I don't think that works at the file level).
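To make the "files as backing store, database as index" idea concrete, here is a minimal Python sketch. It is not ekg's actual schema or API: the `#+ekg_id:` keyword, the two-character directory sharding, and the `triples` table layout are all assumptions made up for illustration. Each note file carries its own title, tags and UUID in org-style header keywords, and the index is rebuilt purely by walking the files.

```python
import sqlite3
import uuid
from pathlib import Path


def note_path(root: Path, note_id: str) -> Path:
    """Shard notes by the first two characters of the UUID (hypothetical layout,
    loosely modelled on how org lays out attachment directories)."""
    return root / note_id[:2] / f"{note_id}.org"


def write_note(root: Path, title: str, tags: list[str], body: str) -> str:
    """Write one note file whose header carries all the metadata the index needs."""
    note_id = str(uuid.uuid4())
    path = note_path(root, note_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    header = (
        f"#+title: {title}\n"
        f"#+filetags: :{':'.join(tags)}:\n"
        f"#+ekg_id: {note_id}\n"  # hypothetical keyword, not something ekg defines
    )
    path.write_text(header + "\n" + body + "\n", encoding="utf-8")
    return note_id


def parse_header(path: Path) -> dict[str, str]:
    """Read the leading '#+key: value' lines; stop at the first other line."""
    meta = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.startswith("#+"):
            break
        key, _, value = line[2:].partition(":")
        meta[key.strip().lower()] = value.strip()
    return meta


def rebuild_index(root: Path, db: sqlite3.Connection) -> int:
    """Recreate a (subject, predicate, object) index from the note files alone."""
    db.execute("DROP TABLE IF EXISTS triples")
    db.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
    count = 0
    for path in sorted(root.rglob("*.org")):
        meta = parse_header(path)
        note_id = meta.get("ekg_id")
        if not note_id:
            continue  # not one of our notes; leave it alone
        rows = [(note_id, "title", meta.get("title", "")),
                (note_id, "path", str(path))]
        rows += [(note_id, "tag", tag)
                 for tag in meta.get("filetags", "").split(":") if tag]
        db.executemany("INSERT INTO triples VALUES (?, ?, ?)", rows)
        count += 1
    db.commit()
    return count
```

Since the database only holds derived data in this sketch, losing it costs one call to `rebuild_index`. The price is that every metadata edit has to go through a writer that keeps the file header up to date.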

That's reasonable. I do think we get a few things from everything being in the database, though. The search, as you mention (full text search coming up in an upcoming release). And just being extremely fast. To me, it seems significantly faster than interacting with the filesystem, although I haven't attempted to quantify that. But also, when I think of the system, I don't see it as a system of organizing files, but as organizing notes - small little bits of text. Yes, these can be theoretically equivalent, but my hypothesis is that files encourage storing lots of information in one file, since if you are storing files you kind of want those files to have names and not just be a ton of small randomly named files. And then the files might need to be understandable on their own, so they inevitably, as you suggest, have their own metadata on them somehow. And that would be a pain to keep in sync. Eventually, after a lot of effort, I'd wind up with org-roam, which would be a shame because that already is a great solution for a file-based system.

I'm going to keep the existing design for now, I think it's an interesting direction to explore. But there is one thing: a note can have an included file, so you can sort of build a system that does what you want, although I haven't added anything that makes it particularly easy to do this for all notes.

I've been struggling with this same choice for years now. My daily notes app these days is logseq, which is mostly file based but keeps a db for fast indexing, and has a full-custom UI so it can support editable transclusion, editable search results, etc. But it is not a true markdown (or org-mode) system. In the past I've used full db systems like OneNote and Evernote; I much prefer being able to edit my notes in Emacs.
I do think whatever the ideal system is based on, the basic atomic unit should be a block of text with properties, not an entire file with multiple blocks. Each block (in my ideal world) has properties including parent, siblings as well as type (header, list item, paragraph) and metadata (ID, timestamps, tags, etc.). Then a note display can be composed of any sequence of blocks, including transclusions, references, search results, code-block executions, etc. This naturally leads to a db-based solution, like ekg.
BUT: I also really love plain-text because of Emacs but also all the wonderful CLI tools we have to manipulate text.
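As a rough illustration of the "block with properties" unit described above, here is a small Python sketch. The `Block` fields, the kind names, and the `render` function are hypothetical and not taken from ekg or logseq. The point is only that a note display becomes a sequence of block IDs, and a transclusion is just another block that points at a target.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Block:
    id: str
    kind: str                      # "header", "list-item", "paragraph" or "transclusion"
    text: str = ""
    parent: Optional[str] = None   # id of the enclosing block, if any
    tags: list[str] = field(default_factory=list)
    target: Optional[str] = None   # for transclusions: id of the block to embed


def render(block_ids: list[str], store: dict[str, Block], depth: int = 0) -> list[str]:
    """Compose a display from any sequence of blocks, expanding transclusions."""
    lines = []
    for block_id in block_ids:
        block = store[block_id]
        if block.kind == "transclusion":
            if depth >= 8:  # crude guard against transclusion cycles
                lines.append(f"[transclusion of {block.target} not expanded]")
            else:
                lines.extend(render([block.target], store, depth + 1))
        elif block.kind == "header":
            lines.append(f"* {block.text}")
        elif block.kind == "list-item":
            lines.append(f"- {block.text}")
        else:
            lines.append(block.text)
    return lines
```

Sibling order is carried here by the order of the ID list rather than stored on each block, which is one of several possible choices.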

I think the "holy grail" for me would be a database storing everything at a block level, but the system writes out org files on every update, and has a filesystem watcher to ingest any changes from those files. (Org supports property drawers so it could represent all the db metadata losslessly.) So a bidirectional sync, giving the best of both worlds. Logseq is close to this, but (a) it's not Emacs and (b) it is missing basic things like paragraphs of text (it only supports lists, as an "outliner"). Also its org-mode support is poor and markdown doesn't have properties so the conversion is lossy.
Wouldn't it be nice to compose a note display (single or multiple notes, org-mode hierarchy or whatever) in an ekg-like system, say "open as buffer", edit the org-mode buffer as desired, and then when you save, the db gets updated?
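The "open as buffer, edit, save back" loop could look roughly like this in Python. It is a sketch under several assumptions: the `blocks` table, the `:EKG_TAGS:` property and both function names are invented for illustration, and the parser only handles top-level headings with a property drawer directly beneath them. `:ID:` is the standard org property; everything else here is hypothetical.

```python
import re
import sqlite3
import uuid

HEADING = re.compile(r"^\* (.*)$")
PROPERTY = re.compile(r"^:([A-Za-z_]+):\s*(.*)$")


def create_schema(db: sqlite3.Connection) -> None:
    db.execute("CREATE TABLE IF NOT EXISTS blocks "
               "(id TEXT PRIMARY KEY, title TEXT, tags TEXT, body TEXT)")


def export_org(db: sqlite3.Connection) -> str:
    """Write every block as a heading plus a property drawer holding its metadata."""
    chunks = []
    for block_id, title, tags, body in db.execute(
            "SELECT id, title, tags, body FROM blocks ORDER BY id"):
        chunks.append(f"* {title}\n:PROPERTIES:\n:ID: {block_id}\n"
                      f":EKG_TAGS: {tags}\n:END:\n{body}\n")
    return "".join(chunks)


def import_org(db: sqlite3.Connection, text: str) -> int:
    """Parse edited org text and upsert each heading back into the database."""
    blocks, current, in_drawer = [], None, False
    for line in text.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = {"title": heading.group(1), "props": {}, "body": []}
            blocks.append(current)
            in_drawer = False
        elif current is None:
            continue  # text before the first heading is ignored in this sketch
        elif line == ":PROPERTIES:":
            in_drawer = True
        elif line == ":END:":
            in_drawer = False
        elif in_drawer:
            prop = PROPERTY.match(line)
            if prop:
                current["props"][prop.group(1).upper()] = prop.group(2)
        else:
            current["body"].append(line)
    for block in blocks:
        # A heading without an ID is treated as a new block and gets a fresh UUID.
        block_id = block["props"].get("ID") or str(uuid.uuid4())
        db.execute("INSERT OR REPLACE INTO blocks VALUES (?, ?, ?, ?)",
                   (block_id, block["title"], block["props"].get("EKG_TAGS", ""),
                    "\n".join(block["body"])))
    db.commit()
    return len(blocks)
```

A real bidirectional sync would also need deletion handling, conflict detection when both sides changed, and a file watcher to trigger `import_org`. None of that is attempted here.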

With this kind of architecture, cross-machine syncing can be done via syncthing or Google Drive on the filesystem view, but a db-based sync protocol could also be designed. It would be a significant project for sure though!

That's reasonable. I do think we get a few things from everything being in the database, though. The search, as you mention (full text search coming up in an upcoming release). And just being extremely fast. To me, it seems significantly faster than interacting with the filesystem, although I haven't attempted to quantify that.

Most file systems are extremely performant these days (and, more importantly, robust -- I'd think far more so than most database systems, plus there is a raft of commonly understood tools to do surgery on the former).
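Neither claim about speed has been measured in this thread, so here is one rough way to put numbers on it. This is only a sketch: the note count, note size, table layout and function name are arbitrary assumptions, everything runs against a warm OS cache, and the sqlite inserts share a single transaction, so the results say little about real editing workloads.

```python
import sqlite3
import tempfile
import time
from pathlib import Path


def benchmark(n_notes: int = 2000, note_size: int = 200) -> dict[str, float]:
    """Time writing and reading many small notes as files and as sqlite rows."""
    body = "x" * note_size
    timings = {}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        files_dir = root / "notes"
        files_dir.mkdir()

        # One file per note.
        start = time.perf_counter()
        for i in range(n_notes):
            (files_dir / f"{i}.txt").write_text(body, encoding="utf-8")
        timings["files_write"] = time.perf_counter() - start

        start = time.perf_counter()
        total = sum(len((files_dir / f"{i}.txt").read_text(encoding="utf-8"))
                    for i in range(n_notes))
        timings["files_read"] = time.perf_counter() - start
        assert total == n_notes * note_size

        # One row per note, all inserted in a single transaction.
        db = sqlite3.connect(str(root / "notes.db"))
        db.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
        start = time.perf_counter()
        with db:
            db.executemany("INSERT INTO notes VALUES (?, ?)",
                           ((i, body) for i in range(n_notes)))
        timings["sqlite_write"] = time.perf_counter() - start

        start = time.perf_counter()
        total = sum(len(row[0]) for row in db.execute("SELECT body FROM notes"))
        timings["sqlite_read"] = time.perf_counter() - start
        assert total == n_notes * note_size
        db.close()
    return timings
```

Calling `benchmark()` returns a dict of elapsed seconds per phase. Running it on the actual machine and filesystem in question matters more than any figure quoted from elsewhere.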

But also, when I think of the system, I don't see it as a system of organizing files, but as organizing notes - small little bits of text. Yes, these can be theoretically equivalent, but my hypothesis is that files encourage storing lots of information in one file, since if you are storing files you kind of want those files to have names and not just be a ton of small randomly named files.

  1. At least on most Unix/Linux-based systems, having tons of small "randomly" named files isn't really an issue, and in this case, placing the "notes" in their own files would serve two purposes:

a. act as a backing store for the database (and a place to rebuild it from - i.e. they're not really there for "human consumption"), and

b. allow a ton of pre-existing Emacs (and OS-level) tools to operate on them.

  2. On some Unix/Linux filesystems (ext4 with its inline data feature, per the link below), if the file is small enough, the contents are actually stored IN the inode record instead of even opening up a new disk sector (which may be configurable), so in these cases the performance/economy would be hard to beat!

https://unix.stackexchange.com/questions/197633/how-to-use-the-new-ext4-inline-data-feature-storing-data-directly-in-the-inode?rq=1

And then the files might need to be understandable on their own, so they inevitably, as you suggest, have their own metadata on them somehow. And that would be a pain to keep in sync.

Well, since I would think the only source of modifications to the metadata would be ekg, it should not be a very difficult process. Using the backing store and its metadata to rebuild the index could come down the pike much later.

Eventually, after a lot of effort, I'd wind up with org-roam, which would be a shame because that already is a great solution for a file-based system.

I think what you (and I with N-Angulator) are thinking is still far superior as the notes/files do not have to have the concept of a "primary" key (which all the other "solutions" out there currently seem to have, which creates unnecessary friction to enter/modify the notes and their structure).

I'm going to keep the existing design for now, I think it's an interesting direction to explore. But there is one thing: a note can have an included file, so you can sort of build a system that does what you want, although I haven't added anything that makes it particularly easy to do this for all notes.

Yeah that "included" file would need some kind of utility to assign generic random names because it is beyond my desire to concoct names for them.