[FEATURE REQUEST] SummaryDataPane
jc-healy opened this issue
John Healy commented
This is a function that takes a summary function and outputs a data frame pane. We will pass in a function which takes a selection (and potentially other information) and generates a data frame to display to the user.
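As a rough illustration of the contract described above, the sketch below treats a summary function as any callable that maps a selection to a pandas DataFrame. The names `SummaryFunction`, `SummaryDataPaneSketch`, `on_selection` and `make_describe_summarizer` are hypothetical and are not part of any existing API. Representing the selection as a list of row positions is also an assumption.

```python
from typing import Callable, Sequence

import pandas as pd

# Assumed shape of a summary function: selected row positions in, DataFrame out.
SummaryFunction = Callable[[Sequence[int]], pd.DataFrame]


class SummaryDataPaneSketch:
    """Hypothetical stand-in for the pane: it holds a summary function and
    the DataFrame produced for the most recent selection."""

    def __init__(self, summarizer: SummaryFunction):
        self.summarizer = summarizer
        self.dataframe = pd.DataFrame()

    def on_selection(self, selected: Sequence[int]) -> pd.DataFrame:
        # Recompute the displayed table whenever the selection changes.
        self.dataframe = self.summarizer(selected)
        return self.dataframe


def make_describe_summarizer(data: pd.DataFrame) -> SummaryFunction:
    """Build a summary function that closes over the extra information
    (here, the full data frame) it needs besides the selection."""

    def summarize(selected: Sequence[int]) -> pd.DataFrame:
        return data.iloc[list(selected)].describe()

    return summarize


points = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "y": [10.0, 20.0, 30.0, 40.0]})
pane = SummaryDataPaneSketch(make_describe_summarizer(points))
summary = pane.on_selection([0, 2])
```

Using a closure keeps the pane's side of the contract down to "selection in, data frame out", so each summarizer can carry whatever extra data it needs.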
We will build a small set of pre-canned summary functions to make it easy for users to get started with this pane. These functions will be packaged into their own module to keep things tidy.
1. value_counts_summarizer
   - Takes a selection and a series.
   - The series is the column from which to select rows and compute a value count.
2. Sparse matrix largest columns, passed to a value_counts summarizer
   - Takes a selection, a sparse matrix, and a column_index_dictionary.
   - Column-sum the selected rows of the passed-in sparse matrix.
   - Then find the k columns with the largest sums.
   - Compute a value_counts to display how many of these columns are present within our selection.
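The two summarizers listed above might look roughly like the following. This is a minimal sketch under several assumptions: the selection is a list of row positions, `column_index_dictionary` maps a column index to a display label, and the final "value_counts" step is read as counting how many selected rows contain each of the top columns. The function names, signatures and output column names are illustrative only.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix


def value_counts_summarizer(selected, series, top_k=10):
    """Count the values of `series` among the selected rows."""
    counts = series.iloc[list(selected)].value_counts().head(top_k)
    return counts.rename_axis("value").reset_index(name="count")


def sparse_top_columns_summarizer(selected, matrix, column_index_dictionary, k=5):
    """Column-sum the selected rows and report the k largest columns."""
    rows = matrix.tocsr()[list(selected)]
    column_sums = np.asarray(rows.sum(axis=0)).ravel()
    # Stable sort on negated sums gives the largest columns first.
    top = np.argsort(-column_sums, kind="stable")[:k]
    # Number of selected rows with a stored entry in each top column.
    presence = rows[:, top].getnnz(axis=0)
    return pd.DataFrame(
        {
            "column": [column_index_dictionary[int(i)] for i in top],
            "sum": column_sums[top],
            "n_selected_rows": presence,
        }
    )


labels = pd.Series(["a", "b", "a", "c", "a"])
label_counts = value_counts_summarizer([0, 1, 2], labels)

matrix = csr_matrix(np.array([[1, 0, 2], [0, 0, 3], [1, 1, 0], [0, 5, 0]]))
vocab = {0: "apple", 1: "banana", 2: "cherry"}
top_columns = sparse_top_columns_summarizer([0, 1, 2], matrix, vocab, k=2)
```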
John Healy commented
Extra summary functions:
3. Weighted sparse matrix summarizer, for when we are working with a deduplicated data set and so need to pass in a selection along with counts.
4. Cluster interpretability summarizer
   - Regularized logistic regression on an interpretable feature space, comparing the selected points against a sample of the other points.
   - Cluster interpretability high-dimensional space centroid (top2vec summarizer)
     - Compute the centroid of the selected points in the high-dimensional space and return a weighted nearest neighbour set from an interpretable joint embedding.
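Two of the ideas above, the weighted sparse matrix summarizer and the centroid-based summarizer, can be sketched as follows. The assumptions here are that `counts` holds how many original records each deduplicated row represents, that the interpretable items (for example words) are embedded in the same space as the points, and that the neighbour "weight" is the cosine similarity to the centroid. All names and signatures are hypothetical. The logistic regression variant is not shown.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix


def weighted_sparse_top_columns(selected, counts, matrix, column_index_dictionary, k=5):
    """Column sums over the selected rows, each row weighted by its count."""
    selected = list(selected)
    rows = matrix.tocsr()[selected]
    weights = np.asarray(counts, dtype=float)[selected]
    # rows.T @ weights is the count-weighted column sum of the selected rows.
    weighted_sums = np.asarray(rows.T @ weights).ravel()
    top = np.argsort(-weighted_sums, kind="stable")[:k]
    return pd.DataFrame(
        {
            "column": [column_index_dictionary[int(i)] for i in top],
            "weighted_sum": weighted_sums[top],
        }
    )


def centroid_neighbour_summarizer(selected, point_vectors, label_vectors, labels, k=5):
    """Nearest interpretable items to the centroid of the selected points."""
    centroid = np.asarray(point_vectors)[list(selected)].mean(axis=0)
    centroid = centroid / np.linalg.norm(centroid)
    label_vectors = np.asarray(label_vectors, dtype=float)
    unit = label_vectors / np.linalg.norm(label_vectors, axis=1, keepdims=True)
    similarity = unit @ centroid  # cosine similarity, used as the weight
    top = np.argsort(-similarity, kind="stable")[:k]
    return pd.DataFrame(
        {"label": [labels[int(i)] for i in top], "similarity": similarity[top]}
    )


matrix = csr_matrix(np.array([[1, 0, 2], [0, 0, 3], [1, 1, 0], [0, 5, 0]]))
vocab = {0: "apple", 1: "banana", 2: "cherry"}
weighted = weighted_sparse_top_columns([0, 1, 2], [1, 1, 10, 1], matrix, vocab, k=2)

point_vectors = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]])
label_vectors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
neighbours = centroid_neighbour_summarizer(
    [0, 1], point_vectors, label_vectors, ["sports", "cooking", "mixed"], k=2
)
```

In the weighted example, the third row stands for ten duplicates, which is enough to move "apple" and "banana" ahead of "cherry" even though "cherry" has the largest unweighted sum.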