lyzs90 / MstrERD

Pretty schema diagrams for MicroStrategy

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Pretty schema diagrams for MicroStrategy

This notebook demonstrates how to convert the generally unhelpful MicroStrategy schema into human-readable JSON. We can then use a diagramming library such as JointJS or GoJS to draw aesthetically pleasing entity relationship diagrams.

Export MicroStrategy project schema

We will use the schema from the MicroStrategy Tutorial. The schema can be obtained in Developer from 'Schema' > 'Export Project Schema'. Export the Table Catalog - Logical View which contains a list of all warehouse tables and their attributes and facts.

from pyexcel_xls import get_data
import json

data = get_data("schema.xls")
logical_view = data["Logical View"]
temp = [[item[0], item[1], item[2]] for item in logical_view]
WARNING *** OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero
import pandas as pd
df = pd.DataFrame(temp[1:], columns = logical_view[0])
df.head(10)
Table Name Object Type
0 CITY_CTR_SLS Call Center Attribute (K)
1 CITY_CTR_SLS Customer City Attribute (K)
2 CITY_CTR_SLS Cost Fact
3 CITY_CTR_SLS Profit Fact
4 CITY_CTR_SLS Revenue Fact
5 CITY_CTR_SLS Units Sold Fact
6 CITY_CTR_SLS Gross Revenue Fact
7 CITY_MNTH_SLS Month Attribute (K)
8 CITY_MNTH_SLS Customer City Attribute (K)
9 CITY_MNTH_SLS Cost Fact

Node list

# form node list

graph_spec = {"nodes": [], "links": []}

df2 = df["Table Name"].drop_duplicates()

for table_id in df2:
    props_list = []
    temp = pd.DataFrame()
    temp = df[df["Table Name"] == table_id]
    for index, row in temp.iterrows():
        props_list.append(row["Object"])
    graph_spec["nodes"].append({"id": table_id, "props": props_list})

Edge List

# cartesian product to find possible edges
df3 = pd.merge(df, df, how='left', on='Object')

# drop rows if Type eq Fact because can't join on Facts
df3 = df3[df3.Type_x != "Fact"]

# drop rows if same Table Name because it is a duplicate record
df3 = df3[df3["Table Name_x"] != df3["Table Name_y"]]

df3.head(10)
Table Name_x Object Type_x Table Name_y Type_y
1 CITY_CTR_SLS Call Center Attribute (K) DAY_CTR_SLS Attribute (K)
2 CITY_CTR_SLS Call Center Attribute (K) F_TUTORIAL_TARGETS Attribute (K)
3 CITY_CTR_SLS Call Center Attribute (K) ITEM_CCTR_MNTH_SLS Attribute (K)
4 CITY_CTR_SLS Call Center Attribute (K) LU_CALL_CTR Attribute (K)
5 CITY_CTR_SLS Call Center Attribute (K) LU_EMPLOYEE Attribute
6 CITY_CTR_SLS Call Center Attribute (K) SUBCATEG_MNTH_CTR_SLS Attribute (K)
8 CITY_CTR_SLS Customer City Attribute (K) CITY_MNTH_SLS Attribute (K)
9 CITY_CTR_SLS Customer City Attribute (K) CITY_SUBCATEG_SLS Attribute (K)
10 CITY_CTR_SLS Customer City Attribute (K) LU_CUST_CITY Attribute (K)
11 CITY_CTR_SLS Customer City Attribute (K) LU_CUSTOMER Attribute
# only keep unique pairs of nodes

unique_pairs = set()
temp = []
for index, row in df3.iterrows():
    if ((row[0], row[3]) in unique_pairs):
        pass
    else:
        unique_pairs.add((row[0], row[3]))
        temp.append(row)
        
df4 = pd.DataFrame(temp)

for index, row in df4.iterrows():
    graph_spec["links"].append({"source": row[0], "target": row[3]})

Export JSON

import json

with open('data.json', 'w') as f:
    json.dump(graph_spec, f)

About

Pretty schema diagrams for MicroStrategy

License:MIT License


Languages

Language:JavaScript 74.8%Language:HTML 13.2%Language:CSS 12.1%