imStudd / Self-Organizing-Map

Implementation of artificial neural network Self-Organizing Map (SOM).

Self-Organizing Map

Prerequisites

  • Python 3.8 (OPTIONAL)
    • nltk
    • numpy

Python is optional: it is only used for preprocessing when the data to be processed is text.
Preprocessing uses TF-IDF, which is a fairly basic method.
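
As a rough illustration of what TF-IDF weighting does, here is a minimal sketch in plain Python. It is not the repository's preprocessing script: the function name `tf_idf`, the whitespace tokenizer, and the unsmoothed `log(N / df)` formula are all assumptions made for this example.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return (vocabulary, matrix); matrix[i][j] is the TF-IDF weight
    of vocabulary[j] in documents[i]. Hypothetical helper for illustration."""
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        row = []
        for term in vocabulary:
            tf = counts[term] / total if total else 0.0
            idf = math.log(n_docs / doc_freq[term])
            row.append(tf * idf)
        matrix.append(row)
    return vocabulary, matrix

vocabulary, weights = tf_idf(["the cat sat", "the dog sat", "the cat ran"])
```

A term that occurs in every document (here `the`) gets a weight of zero, so only terms that help tell documents apart contribute to the vectors fed to the map.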

Usage

make
./som.out [OPTIONS]

Options

Option             Argument(s)                     Description
-h, --help         -                               Display options.
-d, --data         <path>                          Set data file (If not specified here, the data ).
-t, --text         <path> <max_features> <0|1>     Use a text data file; it will be preprocessed.
                                                   Mode 0 keeps the m most recurrent terms to reduce
                                                   dimensionality; mode 1 uses random projection.
-c, --config       <path>                          Set configuration file (config.ini by default).
-l, --load         <path>                          Load neurons file (.dat file).
-v, --verbose      -                               Print progress.
--print_data       -                               Print all data.
--print_classes    -                               Print all classes.
--no_print_map     -                               Do not print the map.
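
The two dimensionality-reduction modes of the text option can be pictured with the sketch below. It is only an approximation under stated assumptions: the function names are hypothetical, "most recurrent" is read as "largest column sum", and the random projection uses a Gaussian matrix, none of which is confirmed by the project.

```python
import random

def keep_top_terms(matrix, m):
    """Mode-0-style reduction (assumed): keep the m columns with the largest sums."""
    n_cols = len(matrix[0])
    totals = [sum(row[j] for row in matrix) for j in range(n_cols)]
    # Indices of the m heaviest columns, restored to their original order.
    ranked = sorted(range(n_cols), key=lambda j: totals[j], reverse=True)
    kept = sorted(ranked[:m])
    return [[row[j] for j in kept] for row in matrix]

def random_projection(matrix, m, seed=0):
    """Mode-1-style reduction (assumed): multiply by a random n_cols x m Gaussian matrix."""
    rng = random.Random(seed)
    n_cols = len(matrix[0])
    scale = 1.0 / m ** 0.5
    projection = [[rng.gauss(0.0, 1.0) * scale for _ in range(m)]
                  for _ in range(n_cols)]
    return [[sum(row[i] * projection[i][k] for i in range(n_cols))
             for k in range(m)]
            for row in matrix]

term_matrix = [[1, 0, 3, 0], [2, 0, 1, 1]]
reduced_top = keep_top_terms(term_matrix, 2)
reduced_rp = random_projection(term_matrix, 2)
```

Keeping the top terms preserves interpretable columns, while a random projection mixes all columns into m new ones.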

Configuration

Name                     Type            Description
DATA_PATH                string          Path of the data file.
DATA_NAMES               string          Names of each data column (CSV format). (OPTIONAL)
NEURONS_NUMBER           unsigned int    Number of neurons.
MAP_WIDTH                unsigned int    Neuron map width.
MAP_HEIGHT               unsigned int    Neuron map height.
NEIGHBORHOOD_RADIUS      unsigned int    Initial neighborhood radius.
NB_ITERATION_PHASE_2     unsigned int    Number of training iterations for the second phase.
NB_ITERATION_PHASE_1     unsigned int    Number of training iterations for the first phase.
LEARNING_RATE_PHASE_1    unsigned int    Learning rate for the first phase.
LEARNING_RATE_PHASE_2    unsigned int    Learning rate for the second phase.
RANDOM_MIN               double          Minimum random value for neuron initialization. (OPTIONAL)
RANDOM_MAX               double          Maximum random value for neuron initialization. (OPTIONAL)
SHUFFLE                  boolean         Present the data in random order during training.
GAUSSIAN                 boolean         Use a Gaussian neighborhood function.

The configuration file is parsed with the iniparser library.
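
To show how the configuration values typically interact in a SOM, here is a small Python sketch of two-phase training. It is not the project's C implementation: the function names, the linear decay of learning rate and radius within each phase, the floor of 0.5 on the radius, and the use of floating-point learning rates are assumptions made for illustration.

```python
import math
import random

def train_som(data, width, height, radius, phases, gaussian=True,
              shuffle=True, rand_min=0.0, rand_max=1.0, seed=0):
    """Train a width x height map. `phases` is a list of
    (iterations, learning_rate) pairs, run in order (hypothetical API)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per map cell, initialised uniformly at random.
    neurons = [[rng.uniform(rand_min, rand_max) for _ in range(dim)]
               for _ in range(width * height)]
    for iterations, rate0 in phases:
        for it in range(iterations):
            progress = it / iterations
            rate = rate0 * (1.0 - progress)              # assumed linear decay
            sigma = max(radius * (1.0 - progress), 0.5)  # shrinking neighborhood
            order = list(range(len(data)))
            if shuffle:
                rng.shuffle(order)
            for idx in order:
                sample = data[idx]
                bmu = best_matching_unit(neurons, sample)
                bx, by = bmu % width, bmu // width
                for n, weights in enumerate(neurons):
                    # Squared grid distance between neuron n and the winner.
                    d2 = (n % width - bx) ** 2 + (n // width - by) ** 2
                    if gaussian:
                        influence = math.exp(-d2 / (2.0 * sigma ** 2))
                    else:
                        influence = 1.0 if d2 <= sigma ** 2 else 0.0
                    # Pull the weights toward the sample.
                    for k in range(dim):
                        weights[k] += rate * influence * (sample[k] - weights[k])
    return neurons

def best_matching_unit(neurons, sample):
    """Index of the neuron whose weights are closest to the sample."""
    return min(range(len(neurons)),
               key=lambda n: sum((w - s) ** 2 for w, s in zip(neurons[n], sample)))

points = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_som(points, width=3, height=2, radius=2,
                phases=[(5, 0.5), (10, 0.1)])
```

In this sketch the first phase uses a larger learning rate to order the map roughly and the second phase refines it with a smaller one, which mirrors the split between the PHASE_1 and PHASE_2 settings.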

Data format

Data must be in CSV format.
Every value must be numeric except the last one, which is the class label and may be a string.

4.9, 3.0, 1.4, 0.2, Iris-setosa
4.7, 3.2, 1.3, 0.2, Iris-setosa
4.6, 3.1, 1.5, 0.2, Iris-setosa
...
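
For readers who want to inspect such a file outside the program, the format above can be read with a few lines of Python. This is a sketch only: `load_samples` is a hypothetical helper, not part of the project, and it assumes every non-empty row ends with exactly one label field.

```python
import csv
import io

def load_samples(stream):
    """Parse rows of numeric features followed by one class label."""
    vectors, labels = [], []
    # skipinitialspace drops the blank that follows each comma.
    for row in csv.reader(stream, skipinitialspace=True):
        if not row:  # skip blank lines
            continue
        *features, label = row
        vectors.append([float(value) for value in features])
        labels.append(label.strip())
    return vectors, labels

sample = io.StringIO(
    "4.9, 3.0, 1.4, 0.2, Iris-setosa\n"
    "4.7, 3.2, 1.3, 0.2, Iris-setosa\n"
)
vectors, labels = load_samples(sample)
```

With a real file, `open(path, newline="")` can be passed in place of the `io.StringIO` object.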

Example

Example with Fisher's Iris data set (downloadable here).

Result

License

This project is licensed under the MIT License - see the LICENSE file for details.
