desbma / firstname-chooser

Overengineered child naming

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

firstname-chooser

Build status License

There are only two hard things in Computer Science: cache invalidation and naming things.

-- Phil Karlton

Expecting a child, and feeling overwhelmed by the task of finding a name that he/she will bear for the rest of his/her life?

I have you covered. Actually Rust & graph theory do.

Overview

  • This is a tool to explore name ideas interactively, rather than pick a single name for you
  • firstname-chooser suggests names, you rate them and the next suggestions will be taylored according to your previous choices
  • No machine learning or opaque algorithms here, just good old graph theory
  • Choices are stored on filesystem, so you can stop and resume later

Algorithm explanation

  • Build a graph where nodes are names, and edges are Levenshtein distances between names
  • The first name is picked randomly in the graph
  • Pick the next name by maximizing or minimizing distance to names previously chosen, depending on whether they were liked or not
  • "commonness factor": this is a float value in the [0.0; 1.0] range to tweak behavior around name commonness (according to previous years statistics)
    • 0 will only consider your previous choices ignoring name commonness
    • 1 will ignore your choices and suggest names only according to their commonness
    • obviously values in between will consider both, to a degree that depends on the value
    • in my experience the sweet spot is between 0.4 and 0.5.

Implementation

  • Graph in memory is a 2D vector of distances, to represent a "half matrix" (distance between a and b is the same as between b and a, so no need to store it twice)
  • Graph is built using a thread pool for performance
  • Once the graph construction is fast enough, most of the tuning is actually in the optional source filtering, ie:
    • computing name weighting from source data
    • removing names given before year X
    • removing compound names
    • removing too short names

Usage

You need a Rust build environment for example from rustup.

Set toolchain to nightly with rustup override set nightly and run cargo run --release -- -h to build and get command line help.

Data source

For now this only contains a source for French names.

Before adding other languages or data source, I should probably add a NameSource trait. PR welcome if you care enough.

License

GPLv3

About

Overengineered child naming

License:GNU General Public License v3.0


Languages

Language:Rust 100.0%