There are 8 repositories under the high-dimensional-data topic.
Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors
A Python toolbox for gaining geometric insights into high-dimensional data
Fast Best-Subset Selection Library
A collection of small-sample, high-dimensional microarray data sets to assess machine-learning algorithms and models.
High-dimensional medians (medoid, geometric median, etc.). Fast implementations in Python.
Poisson pseudo-likelihood regression with multiple levels of fixed effects
A Toolkit for Interactive Statistical Data Visualization
Deep distance-based outlier detection, published at KDD18: learning representations specifically for distance-based outlier detection. Also covers few-shot outlier detection.
A Python package for hubness analysis and high-dimensional data mining
Statistical quality evaluation of dimensionality reduction algorithms
An interactive 3D web viewer that displays up to a million data points on one screen. Provides interaction for viewing high-dimensional data that has previously been embedded in 3D or 2D. Based on graphosaurus.js and three.js. For a Linux release of a complete embedding+visualization pipeline, please visit https://github.com/sonjageorgievska/Embed-Dive.
The DPA package is a scikit-learn-compatible implementation of the Density Peaks Advanced clustering algorithm. The algorithm provides robust and visual information about the clusters, their statistical reliability, and their hierarchical organization.
Hubness analysis and removal functions
A fast high-dimensional near-neighbor search algorithm based on group testing and locality-sensitive hashing
t-viSNE: Interactive Assessment and Interpretation of t-SNE Projections
Sparse and Regularized Discriminant Analysis in R
Feature Selection by Optimized LASSO algorithm
MATLAB code for Unsupervised Feature Selection with Multi-Subspace Randomization and Collaboration (SRCFS) (KBS 2019)
A simple library for t-SNE animation, with a zoom-in feature that applies t-SNE to the zoomed-in region
Fortran bindings to the FLANN library for performing fast approximate nearest neighbor searches in high-dimensional spaces.
A general purpose Snakemake workflow to perform unsupervised analyses (dimensionality reduction & cluster analysis) and visualizations of high-dimensional data.
R package to implement high-dimensional confounding adjustment using continuous spike and slab priors
A variant of K-Means that uses particle swarm optimization to cluster high-dimensional data sets, aiming for faster convergence than standard K-Means.