Google Docs changelog channel

Question

plexish opened this issue 2 years ago

plex commented 2 years ago

Pull diffs from each doc updated in the answers folder and subfolders every x minutes, put them into a Discord channel.

plex · Answer 1 · Sat Feb 04 2023 00:32:52 GMT+0800 (China Standard Time)

TBD: Handling suggestions?

Chris Rimmer · Answer 2 · Sat Feb 04 2023 21:18:51 GMT+0800 (China Standard Time)

Ways this could go

This concept could (in theory, given adequate work, etc.) mean any of 4 options, since there are two independent choices:

Either we're reading information about the actual content of Docs, or we're reading information about the suggestions
Either we're reading information about the current state, or we're reading information about specific change events

For all combinations, there are of course varying levels of detail we might imagine viewing and varying degrees of aggregation we might desire / tolerate. I've tried to enumerate most of the useful points in this possibility space below.

Reading snapshot information about doc content

This feels useless so I haven't thought much about how hard it would be.

Reading transitions on doc content

Easy:
"Chris Rimmer updated 'Name of document goes here' at https://docs.google.com/etc"
Probably very hard:
"Chris Rimmer changed the sentence 'lorem ipsum' to 'dolore sit amet' in 'Name of [...]' at https://docs.google.com/etc"

This difference is due to Google's Drive API making it easy to get attribution data (and some other little bits) but not easy to get the actual content of the change at any particular revision. Brute-force approaches like "download and parse both entire versions of the answer, and provide a diff between the versions" are possible, as plain-text exports of Docs are available for historic versions, but this feels sorta tricky for small benefit, as the lack of context around a text change could often limit the utility anyway.
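
For illustration only, here is a minimal sketch of that brute-force diff, assuming the two plain-text exports have already been downloaded into strings. The function name `text_diff` and the sample text are made up for this example, and fetching the revisions from Drive is deliberately left out.

```python
import difflib


def text_diff(old_text, new_text, context=1):
    """Return unified-diff lines between two plain-text exports of a Doc."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="previous revision",
            tofile="current revision",
            n=context,      # lines of unchanged context kept around each change
            lineterm="",    # inputs have no trailing newlines, so add none
        )
    )


# Hypothetical sample content standing in for two exported revisions.
old = "Intro paragraph.\nlorem ipsum\nClosing paragraph."
new = "Intro paragraph.\ndolore sit amet\nClosing paragraph."

for line in text_diff(old, new):
    print(line)
```

The `context` parameter controls how many surrounding lines are shown, which is where the "lack of context" concern would show up in practice.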

Reading snapshot information about suggestions

Easy:
"'Name of document goes here' at https://docs.google.com/etc currently has x pending suggestions totalling y words"

Reading transitions on suggestions

This seems like the most useful case so I've done a bit of digging.

There are some challenges around this because I haven't yet found a way to get information about the creation of a suggestion, and attribution also seems difficult. This means even this simple-looking information seems difficult to get or derive:
"Chris Rimmer submitted a suggestion to 'Name of document goes here' at https://docs.google.com/etc"
And this seems very difficult:
"Chris Rimmer suggested changing the sentence 'lorem ipsum' to 'dolore sit amet' in 'Name of [...]' at https://etc"

Thankfully though, I don't think that is exactly the kind of output we want anyway. The data we have readily available is quite compatible with aggregations that preserve the information we do care about, like suggestion count and total size, so the following is actually quite easy to implement:
"Since we last looked, 3 new suggestions totalling about 200 words have been added to 'Name of document goes here' at https://docs.google.com/etc"

The approach?

So far it looks like doing anything with the actual document content is tricky, useless, or often both, so the most useful versions of this (and, coincidentally, the easiest ones!) will provide insight into suggestions. Instantaneous snapshot views and state transitions both seem useful to look at, so I've outlined some ideas for how these might work below.
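
As a rough sketch of how a state transition could be derived from two snapshots, the snippet below compares a previous and a current (count, words) pair and produces a "Since we last looked" style message. `SuggestionSnapshot` and `summarise_change` are hypothetical names, and the sketch only sees net changes, so suggestions accepted or rejected between polls would be undercounted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SuggestionSnapshot:
    """Aggregate view of a Doc's pending suggestions at one point in time."""
    count: int
    words: int


def summarise_change(title, url, previous, current) -> Optional[str]:
    """Describe the net growth in suggestions, or return None if there is none."""
    new_count = current.count - previous.count
    if new_count <= 0:
        return None  # nothing new to report (or suggestions were resolved)
    new_words = max(current.words - previous.words, 0)
    noun = "suggestion" if new_count == 1 else "suggestions"
    verb = "has" if new_count == 1 else "have"
    return (
        f"Since we last looked, {new_count} new {noun} totalling about "
        f"{new_words} words {verb} been added to '{title}' at {url}"
    )


print(
    summarise_change(
        "Name of document goes here",
        "https://docs.google.com/etc",
        SuggestionSnapshot(count=1, words=50),
        SuggestionSnapshot(count=4, words=250),
    )
)
```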

Streaming state transitions into Discord in real-ish time

We can't easily do Actually Real Time:tm: streaming of suggestion notifications into Discord as there seems to be no way to subscribe to a streaming feed of suggestion events. But we can approximate this behaviour (latency of 0-10 minutes) as follows:

Periodically polling Google for recent changes to Docs
Running the parser against them whenever "new" changes are detected
Having the parser post its summary of the changes to that Doc's suggestions
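
The steps above could be sketched roughly as follows. This is not the actual implementation: `list_changed_docs`, `summarise_doc` and `post` are hypothetical callables standing in for the Google polling, the parser and the Discord posting respectively, and the 600 second interval is just an assumption matching the 0-10 minute latency mentioned.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple


def poll_once(
    list_changed_docs: Callable[[], Iterable[Tuple[str, str]]],
    summarise_doc: Callable[[str], Optional[str]],
    post: Callable[[str], None],
    last_seen: Dict[str, str],
) -> int:
    """Run one polling pass and return how many summaries were posted.

    list_changed_docs yields (doc_id, modified_marker) pairs; a Doc counts
    as "new" when its marker differs from the one recorded in last_seen.
    """
    posted = 0
    for doc_id, modified in list_changed_docs():
        if last_seen.get(doc_id) == modified:
            continue  # already processed this version of the Doc
        last_seen[doc_id] = modified
        summary = summarise_doc(doc_id)  # run the parser against the Doc
        if summary:
            post(summary)  # e.g. send the summary to Discord
            posted += 1
    return posted


def run(list_changed_docs, summarise_doc, post, interval_seconds=600):
    """Poll forever, sleeping between passes (assumed 10 minute interval)."""
    last_seen: Dict[str, str] = {}
    while True:
        poll_once(list_changed_docs, summarise_doc, post, last_seen)
        time.sleep(interval_seconds)
```

Keeping `poll_once` separate from the loop means a single pass can be exercised with fake callables, without waiting on the sleep.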

This also has the convenient side effect of reducing the latency of getting new answer Doc content into Coda, which seems nice!

User-prompted posting of info snapshots into Discord on demand

We could also make this sort of data interactive. Stampy the bot could query our questions API in Coda for something like, "which Docs have the most suggestions" or "which Docs have the largest combined word count across all their suggestions" and post appropriate responses into Discord.
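
A minimal sketch of that kind of ranking is below, assuming the per-Doc stats have already been fetched into a plain dictionary. The data shape, the field names `suggestions` and `words`, and the sample values are invented for illustration and do not reflect the actual Coda schema.

```python
from typing import Dict, List, Tuple


def top_docs(stats: Dict[str, Dict[str, int]], key: str, limit: int = 3) -> List[Tuple[str, int]]:
    """Return up to `limit` (title, value) pairs, largest value of `key` first."""
    ranked = sorted(stats.items(), key=lambda item: item[1][key], reverse=True)
    return [(title, values[key]) for title, values in ranked[:limit]]


# Hypothetical data; in practice this would come from the questions API.
example_stats = {
    "Doc A": {"suggestions": 2, "words": 65},
    "Doc B": {"suggestions": 5, "words": 40},
    "Doc C": {"suggestions": 1, "words": 300},
}

print(top_docs(example_stats, "suggestions", limit=2))
print(top_docs(example_stats, "words", limit=1))
```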

This behaviour might be triggered in a manner roughly similar to how we drop questions into Discord, where an editor could manually trigger it with something like "Hey Stampy, which answer Docs do we need to check out?" and Stampy could also periodically push an appropriate answer Doc into some editor channel.

Chris Rimmer · Answer 3 · Sun Feb 05 2023 02:27:40 GMT+0800 (China Standard Time)

Update on this: We now have the current number of suggestions and the current total size of the markdown those suggestions would yield stored in Coda for the example question. It currently disregards the size of any suggested deletions, so deleting 200 bytes of text and replacing it with 150 bytes appears as a 150 byte change rather than 350. Deleting 200 bytes and replacing it with nothing would register as a 0 byte change so this is still incomplete, but it's relatively close.
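
To make the arithmetic concrete, here is a small sketch comparing the current insert-only measure with one that also counts deletions. Both function names are hypothetical, and counting deletions this way is only one possible fix, not something that has been implemented.

```python
def insert_only_size(inserted_bytes: int, deleted_bytes: int) -> int:
    """Current behaviour as described: deletions are disregarded."""
    return inserted_bytes


def total_change_size(inserted_bytes: int, deleted_bytes: int) -> int:
    """Possible alternative: count both inserted and deleted bytes."""
    return inserted_bytes + deleted_bytes


# Delete 200 bytes and replace them with 150 bytes.
print(insert_only_size(150, 200), total_change_size(150, 200))  # 150 350

# Delete 200 bytes and replace them with nothing.
print(insert_only_size(0, 200), total_change_size(0, 200))  # 0 200
```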

As soon as I have a Discord webhook URL that I can send messages to (this is going to become a blocker shortly @robertskmiles so if you can generate one or give Plex permission to do so, that would be A+), I'm going to start feeding what I have into the wiki-feed channel in Discord to see what happens.

Peter Hozák · Answer 4 · Sun Feb 05 2023 16:23:44 GMT+0800 (China Standard Time)

IMHO the most important notifications from GDocs are about new suggestions ("Suggestions were added to <title>, currently X open suggestions.") and when content loads into Coda after suggestions are approved ("New version of <title> loaded to Coda.") - perhaps no more details needed in notifications, but content diffs would be nice in Coda.

Chris Rimmer · Answer 5 · Mon Feb 06 2023 23:45:28 GMT+0800 (China Standard Time)

@plexish Are you happy with the data we're seeing come through in #wiki-feed in Discord? If so we can probably close this as completed.

plex · Answer 6 · Thu Feb 09 2023 21:26:54 GMT+0800 (China Standard Time)

Data looks good, but let's make the layout a little more conversational?

Instead of

There are 2 open suggestions on the Google Doc for the question "Example with all the formatting" - does anyone have a minute to review them?
What subjects should I study at university to prepare myself for alignment research?
Number of suggestions
Was 0, now 2
Total size of suggestions
Was 0, now 65

How about

There are 2 suggestions (previously 0), with a size of 65 (previously 0) for the question "Example with all the formatting" - does anyone have a minute to review them?
What subjects should I study at university to prepare myself for alignment research?
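
One way the proposed layout might be generated is sketched below. `format_notification` and its parameters are hypothetical names, and it only builds the first line of the message, not the linked question title underneath.

```python
def format_notification(question: str, count: int, prev_count: int, size: int, prev_size: int) -> str:
    """Build the single-sentence, conversational form of the notification."""
    return (
        f"There are {count} suggestions (previously {prev_count}), "
        f"with a size of {size} (previously {prev_size}) "
        f'for the question "{question}" - does anyone have a minute to review them?'
    )


print(format_notification("Example with all the formatting", 2, 0, 65, 0))
```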

plex · Answer 7 · Thu Feb 09 2023 21:41:05 GMT+0800 (China Standard Time)

oh, and does this note when people make direct edits to the GDocs rather than suggestions @ChrisRimmer? Just prev length and new length would be pretty helpful, even if exact diffs are more hassle than they're worth.