Record file structure/details in project metadata
Tom Pollard commented
Currently, as far as I'm aware, we don't formally document the structure/details of files in the metadata of published projects.
For example, we have no structured record of details such as:
- Folder structure
- Lists of files
- File types
- File sizes
- File contents (e.g. what columns does a CSV file contain)
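As a rough illustration of what such a record might look like, here is a minimal sketch that walks a project directory and collects the details listed above. The function name `build_file_manifest` and the record keys (`path`, `size_bytes`, `extension`, `columns`) are hypothetical, not part of any existing schema. It also assumes that the file extension is an acceptable proxy for file type and that CSV files are UTF-8 with a header row.

```python
import csv
from pathlib import Path


def build_file_manifest(project_root):
    """Walk project_root and return one metadata record per file.

    Hypothetical sketch: the record keys are illustrative only.
    """
    root = Path(project_root)
    records = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        record = {
            # The relative path also captures the folder structure.
            "path": path.relative_to(root).as_posix(),
            "size_bytes": path.stat().st_size,
            # Assumption: the extension stands in for the file type.
            "extension": path.suffix.lower(),
        }
        if record["extension"] == ".csv":
            # Assumption: UTF-8 CSV whose first row holds the column names.
            with path.open(newline="", encoding="utf-8") as f:
                record["columns"] = next(csv.reader(f), [])
        records.append(record)
    return records
```

The resulting list could be serialized with `json.dumps(records, indent=2)` and stored alongside the rest of the project metadata.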
There may be value in documenting these kinds of details. For example, we could refer to the metadata to:
- Support file-level data discovery.
- Assist with loading data into appropriate cloud tools (e.g. relational databases).
- Offer data summaries in the project description.
This issue relates to #2184, which highlights a format for documenting this kind of metadata.
Presumably we would want to generate the metadata around the time of publication. The metadata would also need to be easy to regenerate in the rare cases where files are modified post-publication.
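One possible way to tell when regeneration is needed is to compare a snapshot taken at publication with the files currently on disk. The sketch below is only an assumption about how that could work: `snapshot` and `diff_snapshots` are hypothetical helpers, and the use of SHA-256 checksums is my own choice rather than anything specified in this issue.

```python
import hashlib
from pathlib import Path


def snapshot(project_root):
    """Map each file's relative path to a (size, SHA-256) fingerprint.

    Hypothetical helper. Reads each file fully into memory, so very
    large files would need chunked hashing instead.
    """
    root = Path(project_root)
    return {
        p.relative_to(root).as_posix(): (
            p.stat().st_size,
            hashlib.sha256(p.read_bytes()).hexdigest(),
        )
        for p in root.rglob("*")
        if p.is_file()
    }


def diff_snapshots(old, new):
    """Return the paths added, removed, or modified between two snapshots."""
    shared = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(k for k in shared if old[k] != new[k]),
    }
```

If all three lists returned by `diff_snapshots` are empty, the stored metadata still matches the files. Otherwise only the listed paths would need their records rebuilt.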