`dolt diff` ... that only shows the tables changed in a simpler format
verdverm opened this issue
I'm looking to feed the `dolt diff` output into some automated processes, but only want to know which tables have changed.
I'm using `dolt diff --summary`, but it contains a bunch of formatting text:
$ dolt diff b6u0b8g9crutgpaummtla6rbft91o2ue --summary -r json
+---------------------------+-----------+-------------+---------------+
| Table name | Diff type | Data change | Schema change |
+---------------------------+-----------+-------------+---------------+
| cbp_apprehensions_monthly | modified | true | false |
+---------------------------+-----------+-------------+---------------+
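Until there is a plainer output mode, the table above can be scraped. A minimal sketch, assuming the tabular layout shown above (pipe-separated columns, table name in the first column) does not change; `summary_table_names` is a made-up helper name, not part of dolt:

```shell
# Hypothetical helper: read `dolt diff --summary` tabular output on stdin
# and print one table name per line.
summary_table_names() {
  # Border lines contain no '|' and the header row says "Table name";
  # every other row carries the table name in the second '|'-separated field.
  awk -F'|' 'NF > 2 && $2 !~ /Table name/ {
    gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim the padding around the name
    print $2
  }'
}
```

Usage would be something like `dolt diff <commit> --summary | summary_table_names`. This is brittle by nature, since it depends on formatting meant for humans.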
`dolt diff --stat -r json` produces invalid JSON (`schema_diff` has invalid content), but could work:
$ dolt diff b6u0b8g9crutgpaummtla6rbft91o2ue --stat -r json
{"tables":[{"name":"cbp_apprehensions_monthly","schema_diff":[prev size: 29676, new size: 29676, adds: 0, deletes: 0, modifications: 24990
4,686 Rows Unmodified (15.79%)
0 Rows Added (0.00%)
0 Rows Deleted (0.00%)
24,990 Rows Modified (84.21%)
0 Cells Added (0.00%)
0 Cells Deleted (0.00%)
49,980 Cells Modified (24.06%)
(29,676 Row Entries vs 29,676 Row Entries)
}]}
Ideally I could have something like `git diff --name-only`:
$ dolt diff b6u0b8g9crutgpaummtla6rbft91o2ue --table-only
cbp_apprehensions_monthly
A workaround here would be `dolt sql -q "select table_name from dolt_diff where commit_hash='WORKING'"`.
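For scripting, the same query can be wrapped so that it prints bare table names. A rough sketch, assuming `dolt sql -r csv` emits a header row followed by one value per line; the function name `changed_tables_working` is invented for illustration:

```shell
# Hypothetical wrapper around the workaround query: prints the tables
# that differ in the working set, one per line.
changed_tables_working() {
  # -r csv asks for CSV output; tail drops the assumed "table_name" header row.
  dolt sql -r csv -q "select table_name from dolt_diff where commit_hash='WORKING'" \
    | tail -n +2
}
```

It has to be run from inside the dolt database directory, like the original command.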
Good feature request for the CLI though.
I'll make a separate bug for the invalid JSON.
Is there a workaround for a diff between HEAD and the previous commit?
(i.e. the two fields that come in a webhook payload)
Something like:
dolt sql -q "select table_name from dolt_diff where commit_hash ='STAGED' or commit_hash ='WORKING' or commit_hash='<CURRENT COMMIT>'"
I don't think HEAD will work.
Edit: I had STAGED twice; I meant WORKING.
I don't think there would be anything in STAGED or WORKING, as it would be a fresh clone after a push.
We have a `hashof()` function that can take `HEAD` or `HEAD~1` as an argument, so maybe something like this?
tmp/main> select * from dolt_diff where commit_hash=hashof('HEAD') or commit_hash=hashof('HEAD~1');
+----------------------------------+------------+-----------+-------------------+---------------------+---------+-------------+---------------+
| commit_hash | table_name | committer | email | date | message | data_change | schema_change |
+----------------------------------+------------+-----------+-------------------+---------------------+---------+-------------+---------------+
| 616qa6ngvisk6notemsafdij9huqtiod | t | James | james@dolthub.com | 2024-04-30 18:27:12 | asdf | false | true |
| bjghbfbsbns8t8kh7ku7a3qm0ghbqrfi | t1 | root | root@localhost | 2024-04-30 18:28:08 | fdas | false | true |
+----------------------------------+------------+-----------+-------------------+---------------------+---------+-------------+---------------+
2 rows in set (0.00 sec)
Is there a way to say "give me the diff between these two specific commits"?
I think it's `dolt_diff_stat()`. So `select table_name from dolt_diff_stat(<from_revision>, <to_revision>)`. If you want the rows in a table, you use the `dolt_diff()` table function and pass in the table name.
Docs for dolt_diff_stat(): https://docs.dolthub.com/sql-reference/version-control/dolt-sql-functions#dolt_diff_stat
Docs for dolt_diff(): https://docs.dolthub.com/sql-reference/version-control/dolt-sql-functions#dolt_diff
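Putting the `dolt_diff_stat()` suggestion into a script for the two-commit case might look like the sketch below. It assumes `dolt sql -r csv` prints a header row first and that both revisions come from a trusted source; the function name `tables_changed_between` and the character check are illustrative additions, not something dolt provides:

```shell
# Hypothetical helper: print the tables that differ between two revisions.
tables_changed_between() {
  from="$1"
  to="$2"
  # The revisions are pasted into the SQL text, so reject anything that
  # does not look like a plain revision spec (hash, branch, HEAD~1, ...).
  case "$from$to" in
    *[!A-Za-z0-9~^_./-]*)
      echo "unexpected characters in revision" >&2
      return 1
      ;;
  esac
  # tail drops the assumed "table_name" CSV header row.
  dolt sql -r csv -q "select table_name from dolt_diff_stat('$from', '$to')" \
    | tail -n +2
}
```

For example, `tables_changed_between 'HEAD~1' HEAD` would list the tables touched by the latest commit, and the two hashes from a webhook payload could be passed in the same way.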