danoneata / fast-nca

Fast implementation of the Neighborhood Component Analysis algorithm

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Fast NCA

Fast implementation of the Neighborhood Component Analysis algorithm in Python. The ideas behind some of the choices are further expanded in Master's thesis, Fast low-rank metric learning.

Features:

  • Sklearn-like API
  • Same gradient cost as the objective function
  • Avoid overflows when the scale of the metric is large
  • WIP Mini-batch version

Examples

Sample usage from Python:

from nca import NCA
n = NCA()
n.fit(X, y)
X = n.transform(X)

For an example, run the example.py script. Among others the script accepts the type of model and the dataset:

python example.py --model nca --data wine

For a complete description of the available options, just invoke the help prompt:

python example.py -h

Installation

The code depends on the usual Python scientific environment: NumPy, SciPy, Scikit-learn. The required packages are listed in the requirements.txt file and can be installed in a virtual environment as follows:

virtualenv -p python3 venv
source venv/bin/activate
pip install -r requirements.txt

Metric learning

Benchmarks

Related work

  • Other NCA implementations Python + MATLAB
  • Other metric learning algorithms

Acknowledgements

Special thanks to Iain Murray for writing a first version of this code, teaching me about automatic differentiation and supervising my Master's thesis project.

TODO

  • Add requirements
  • Add examples
  • Add example using NCA with Nearest Neighbour
  • Test numerical stability
  • Add argument parsing for example script
  • Add some visualizations
  • Add PCA to the list of models
  • Add concentric circles and noise to the list of datasets
  • Create separate modules, for example: data, models
  • Package the code
  • Do not compute gradients when only the cost function is required
  • Add example on MNIST
  • Add tests
  • Add gradient check tests
  • Compute timings
  • Big O notation for memory and computation
  • Write documentation
  • Implement version with mini-batches
  • Provide links to other implementations and outline differences
  • Motivate metric learning
  • Implement nearest mean metric learning

About

Fast implementation of the Neighborhood Component Analysis algorithm

License:MIT License


Languages

Language:Python 100.0%