The-Academic-Observatory / academic-observatory-workflows

Telescopes, Workflows and Data Services for the Academic Observatory

Home Page: https://academic-observatory-workflows.readthedocs.io

Define a method for users to easily create new Groups of institutions for analysis and comparison

rhosking opened this issue · comments

It would be really beneficial for end users (researchers, analysts, etc.) to be able to easily define a list of institutions (using their grid_id or another identifier system) that our data aggregation workflows can automatically use to produce 'group'-level aggregations, the same way we create institution-, country-, funder- and publisher-level aggregations.

The current method is slow and requires knowledge of SQL and BigQuery. Other methods must be investigated, and the chosen solution should be documented and easy to use.

Was just coming here to add an issue like this. Following some discussions, I think either adding a bunch more groups the hard way or figuring out a way for users to expand this table could rapidly become important. Happy to go either way in the short term.

How is the Groups workflow currently working? I've got some groups we really ought to add and can do that, but I need to know where they should currently go (or whether this is likely to be addressed shortly).

Currently you give me a list of GRIDs, and I do it manually. That being said, if you can produce the data in the following schema, that covers about 95% of the time it takes to get it in: https://console.cloud.google.com/bigquery?project=open-knowledge-datasets&authuser=3&organizationId=670938844927&p=open-knowledge-datasets&d=mappings&t=groupings&page=table

I'm thinking about a Google Form approach as a next step, or an API (but that might limit the number of people who can contribute).

I had a brief look to see whether a Google Form (or spreadsheet) would be able to represent the list of GRIDs, but couldn't figure that out. One route would be to generate JSON-NL from a form output, I guess. That wouldn't be too hard. The main bit is figuring out what the form should look like (a lookup for GRID via a drop-down would be useful, but I couldn't see how to do that...).
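
For what it's worth, the JSON-NL route could look roughly like the sketch below. It assumes the form (or spreadsheet) is exported as CSV with one row per group, and that the GRID IDs are pasted into a single free-text cell separated by commas or whitespace. The column names `group_id`, `group_name` and `grid_ids` and the function `form_csv_to_jsonl` are hypothetical placeholders, not the actual schema of the groupings table, so they would need renaming to match it.

```python
import csv
import io
import json


def form_csv_to_jsonl(csv_text: str) -> str:
    """Convert a form's CSV export into newline-delimited JSON, one group per line.

    Column names are assumptions for illustration only; rename them to match
    the real groupings table schema.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # Accept GRID IDs separated by commas, whitespace, or a mix of both.
        raw_ids = row["grid_ids"].replace(",", " ").split()
        # Drop duplicates while keeping the order in which they were entered.
        grid_ids = list(dict.fromkeys(raw_ids))
        record = {
            "group_id": row["group_id"].strip(),
            "group_name": row["group_name"].strip(),
            "grid_ids": grid_ids,
        }
        lines.append(json.dumps(record))
    # JSON-NL: one JSON object per line, with a trailing newline if non-empty.
    return "\n".join(lines) + "\n" if lines else ""


# Made-up example row; the GRID IDs here are placeholders, not real identifiers.
example = (
    "group_id,group_name,grid_ids\n"
    'example_group,Example Group,"grid.0000.0a, grid.0000.0b grid.0000.0a"\n'
)
print(form_csv_to_jsonl(example), end="")
```

This keeps the list of GRIDs as a repeated field per group, which a flat spreadsheet row cannot represent directly. It does not validate the IDs, so a lookup against the GRID dataset would still be a separate step.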