bkamapantula / chart-recommender-gui

A rule-based chart recommendation service. Useful for teams to self-host or as an internal service.

Chart recommender

This project is adapted from suggest-chart by @tejesh0.

How it works

data file -> column type detection -> map against rules -> recommend charts
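The column type detection step can be illustrated with a minimal sketch. This is not the code in recommend.py: the function name detect_column_types and the list-of-dicts input are assumptions made for illustration, and only the standard library is used.

```python
from numbers import Number


def detect_column_types(rows):
    """Classify each column of a list-of-dicts table as 'num' or 'cat'.

    Hypothetical helper for illustration; not part of recommend.py.
    """
    types = {}
    columns = rows[0].keys() if rows else []
    for col in columns:
        # Ignore missing values when deciding the column type
        values = [row[col] for row in rows if row.get(col) is not None]
        # bool is a subclass of int, so exclude it explicitly
        is_numeric = bool(values) and all(
            isinstance(v, Number) and not isinstance(v, bool) for v in values
        )
        types[col] = "num" if is_numeric else "cat"
    return types


rows = [
    {"ID": "a", "c1": 10},
    {"ID": "b", "c1": 12.5},
    {"ID": "c", "c1": 7},
]
column_types = detect_column_types(rows)
print(column_types)
```

The resulting mapping of column names to num or cat is what the rules described below would be matched against.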

Rules

Rules are configured to work for a combination of numeric and categorical columns. num and cat refer to numeric and categorical columns, respectively.


The following charts are recommended for combinations of numeric and categorical columns:

| Combination | Distribution of columns | Uniqueness | Charts |
| --- | --- | --- | --- |
| Numeric and Categorical columns | 1 num, 1 cat | Unique values in categorical column | ['lollipop', 'bar_plot', 'circular_bar_plot', 'treemap', 'circlepack'] |
| Numeric and Categorical columns | 1 num, 1 cat | Non-unique values in categorical column | ['boxplot', 'violin_plot'] |
| Numeric and Categorical columns | >1 num, 1 cat | Unique values in categorical column | ['multi_line', 'parallel_plot', 'stacked_bar_plot', 'grouped_bar_plot'] |
| Numeric and Categorical columns | >1 num, 1 cat | Non-unique values in categorical column | ['grouped_scatterplot'] |
| Numeric and Categorical columns | 1 num, >1 cat | Unique values in categorical column | ['multi_line', 'parallel_plot', 'stacked_bar_plot', 'grouped_bar_plot']. A variation of this is a nested view with these charts as possibilities: ['lollipop', 'bar_plot', 'circular_bar_plot', 'treemap', 'circlepack'] |
| Numeric and Categorical columns | 1 num, >1 cat | Non-unique values in categorical column | ['box_plot', 'violin_plot'] |
| Numeric and Categorical columns | 1 num, >1 cat | Non-unique values in categorical column | Using Adjacency as a principle, these are the chart possibilities: ['network', 'sankey', 'chord', 'arc'] |
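One way to picture how the numeric and categorical rules could be applied is a plain dictionary lookup. The sketch below is an assumption about structure, not the repository's implementation: NUM_CAT_RULES, bucket and recommend_num_cat are hypothetical names, and the nested and adjacency variations are left out for brevity.

```python
# Keys: (numeric-column bucket, categorical-column bucket, categorical values unique?)
# Chart lists are copied from the rules for numeric and categorical columns.
NUM_CAT_RULES = {
    ("1", "1", True): ['lollipop', 'bar_plot', 'circular_bar_plot', 'treemap', 'circlepack'],
    ("1", "1", False): ['boxplot', 'violin_plot'],
    (">1", "1", True): ['multi_line', 'parallel_plot', 'stacked_bar_plot', 'grouped_bar_plot'],
    (">1", "1", False): ['grouped_scatterplot'],
    ("1", ">1", True): ['multi_line', 'parallel_plot', 'stacked_bar_plot', 'grouped_bar_plot'],
    ("1", ">1", False): ['box_plot', 'violin_plot'],
}


def bucket(count):
    """Map a column count to the '1' or '>1' bucket used by the rules."""
    return "1" if count == 1 else ">1"


def recommend_num_cat(n_num, n_cat, cat_values):
    """Look up charts for n_num numeric and n_cat categorical columns.

    cat_values holds the values of one categorical column; the column counts
    as unique when no value repeats. Combinations that have no rule (for
    example >1 num with >1 cat) return an empty chart list.
    """
    if n_num < 1 or n_cat < 1:
        raise ValueError("need at least one numeric and one categorical column")
    unique = len(set(cat_values)) == len(cat_values)
    key = (bucket(n_num), bucket(n_cat), unique)
    return {"chart_list": NUM_CAT_RULES.get(key, [])}


print(recommend_num_cat(1, 1, ["a", "b", "c"]))
```

With one numeric column and one categorical column whose values are all distinct, the lookup returns the first rule's chart list.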


The following charts are recommended for numeric columns alone:

| Combination | Distribution of columns | Uniqueness | Charts |
| --- | --- | --- | --- |
| Numeric | 1 num | - | ['histogram', 'density_plot'] |
| Numeric | 2 num | Unordered, few data points | ['facet_box_plot', 'scatterplot'] |
| Numeric | 2 num | Unordered, many data points | ['facet_violin_plot', 'facet_density_plot'] |
| Numeric | 2 num | Ordered | ['line_chart', 'area_chart', 'connected_scatterplot'] |
| Numeric | 3 num | Unordered | ['box_plot', 'violin_plot', 'bubble_plot'] |
| Numeric | 3 num | Ordered | ['stacked_area_plot', 'line_graph'] |
| Numeric | >3 num | Unordered | ['box_plot', 'violin_plot', 'heatmap', 'correlogram'] |
| Numeric | >3 num | Ordered | ['stacked_area_plot', 'line_graph'] |
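The numeric-only rules can be sketched in the same spirit. The README does not say how "ordered" or "few data points" are decided, so both are assumptions here: a dataset counts as ordered when its first column is sorted in non-decreasing order, and many_points is a hypothetical threshold. The function names are illustrative only.

```python
def is_ordered(values):
    """Assumption: a column is 'ordered' if it is sorted in non-decreasing order."""
    return all(a <= b for a, b in zip(values, values[1:]))


def recommend_numeric(columns, many_points=1000):
    """Pick charts for a list of numeric columns (each a list of numbers).

    many_points is a hypothetical cut-off between 'few' and 'many' data points.
    """
    n = len(columns)
    if n == 0:
        raise ValueError("need at least one numeric column")
    if n == 1:
        return ['histogram', 'density_plot']

    first = columns[0]
    ordered = is_ordered(first)

    if n == 2:
        if ordered:
            return ['line_chart', 'area_chart', 'connected_scatterplot']
        if len(first) < many_points:
            return ['facet_box_plot', 'scatterplot']
        return ['facet_violin_plot', 'facet_density_plot']

    # Three or more numeric columns share the same 'ordered' recommendation
    if ordered:
        return ['stacked_area_plot', 'line_graph']
    if n == 3:
        return ['box_plot', 'violin_plot', 'bubble_plot']
    return ['box_plot', 'violin_plot', 'heatmap', 'correlogram']


print(recommend_numeric([[1, 2, 3], [5, 3, 4]]))
print(recommend_numeric([[3, 1, 2], [5, 3, 4]]))
```

The first call treats the data as ordered because its first column is sorted; the second falls into the unordered, few data points rule.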


The following charts are recommended for categorical columns alone:

| Combination | Distribution of columns | Type | Charts |
| --- | --- | --- | --- |
| Categorical | 1 cat | - | ['barplot', 'lollipop', 'donut', 'treemap', 'circlepack'] |
| Categorical | >1 cat | Nested | ['sunburst', 'treemap'] |
| Categorical | >1 cat | Subgroup | ['grouped_scatterplot', 'parallel_plot', 'stacked_bar_plot', 'grouped_bar_plot'] |
| Categorical | >1 cat | Adjacency | ['heatmap', 'network', 'sankey'] |


This work follows a subset of the excellent rule set defined by the "from Data to Viz" project.

Recommended charts

Here is the list of all possible chart recommendations:

  • bar chart
  • box plot
  • bubble plot
  • circlepack
  • connected scatterplot
  • correlogram
  • density plot
  • donut
  • facet box plot
  • facet density plot
  • grouped scatterplot
  • heatmap
  • histogram
  • line chart
  • lollipop
  • parallel coordinates
  • network
  • stacked bar chart
  • stacked area chart
  • sankey
  • scatterplot
  • sunburst
  • treemap
  • violin plot

Usage

Application setup

  • Install Gramex 1.x
  • Clone this repository
  • From the repo folder, run gramex setup .
  • From the repo folder, run gramex

Command-line interface (CLI)

Run recommend.py

python recommend.py

This reads the data.xlsx file and prints a few recommended charts as output:

{'chart_list': ['lollipop', 'bar_plot', 'circular_bar_plot', 'treemap', 'circlepack']}

It runs the initiate() function, which uses two columns to recommend charts: ID (a categorical column with unique values) and c1 (a numeric column).

As a service

import recommend

recommend.initiate('')

TODO

CLI tool

  • Accept Google spreadsheet URL as input
  • Accept local data file as input
  • Accept a subset of columns

Standalone tool

  • On running Gramex, provide an option to input a spreadsheet URL
  • Recommend charts subsequently

Contributions
