torodb / stampede

The ToroDB solution to provide better analytics on top of MongoDB and make it easier to migrate from MongoDB to SQL

Home Page:https://www.torodb.com/stampede/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Remove data type suffix?

RobertLowe opened this issue · comments

Is there or can you point me where I could remove the data type suffix torodb creates? I know torodb does this by design to handle conflicting data types as mentioned in #226, but here's my use case:

I use Metabase which is a popular BI tool (https://github.com/metabase/metabase) by default it transforms columns by replacing underscores to spaces and Title Casing field names (created_at_t => Created At T), this makes metabase far less user friendly. I know I could create views to handle this, but my issue with that is since our data domain is dense it really creates a lot of tables to create views for.

I'm not sure if I should patch metabase or stampede, but it feels more at place here for my use case.

This would also clean up things for Hasura a GraphQL endpoint https://github.com/hasura/graphql-engine

Perhaps in the case of a conflict the suffix could remain

Thanks,

  • Rob

Hi @RobertLowe, despite your explanation I would still bet in creating views is a better solution. You can easily automate the view creation using a bit of pl/pgsql like:

DO $$DECLARE sql text; r record;
BEGIN
    FOR r IN SELECT table_catalog, table_schema, table_name FROM information_schema.tables
             WHERE table_catalog = 'postgres' AND table_type = 'BASE TABLE' AND table_schema = 'public'       
    LOOP
        sql := 'CREATE OR REPLACE VIEW view_schema.' || quote_ident(r.table_name) || ' AS SELECT ' || (
SELECT string_agg(quote_ident(c.column_name) || ' AS ' || quote_ident(CASE WHEN c.column_name LIKE '%\__' 
    THEN substr(c.column_name, 1, length(c.column_name) - 2) 
    ELSE c.column_name END), ',')
  FROM information_schema.columns c 
  WHERE c.table_catalog = r.table_catalog 
  AND c.table_schema = r.table_schema 
  AND c.table_name = r.table_name
) || ' FROM ' || quote_ident(r.table_schema) || '.' || quote_ident(r.table_name);
        RAISE NOTICE '%', sql;
        EXECUTE sql;
    END LOOP;
END$$;

If you want to touch ToroDB Stampede you should have a look on how it generated fields names in the d2r part of the engine. The problem is that if you have collisions in same column name with different type you will need to create a logic to hadle that (currently the code is highly coupled with the assumtion that the identifier will de unique). You can start looking at classes in https://github.com/torodb/engine/blob/master/core/src/main/java/com/torodb/core/d2r and in particular in the class DefaultIdentifierFactory.java

Awesome, thanks!