Genetic Algorithms

Reproducing images by evolving populations of polygons

Visit the site to learn more and try it yourself!

Genetic Algorithms
Genetic Algorithms Overview
- Metaheuristics
- Genetic Algorithms
  - Goal
  - Setup
  - Evaluation
  - Selection
  - Reproduction
  - Replacement
Things I've Learned Trying to Recreate the Mona Lisa
This Site
- Site Tour
- Site Architecture
Running The Code
- Troubleshooting
  - Mac
Resources

Genetic Algorithms Overview

Metaheuristics

A set of techniques that are useful for solving a hard problem that typically has these characteristics:

It's not obvious how to find a solution to the problem
It's easy to check if a solution is good, or at least better than another one
The solution space is so large that simple brute force, trial and error isn't going to work

Metaheuristics are a type of stochastic optimization, algorithms that use some randomness to find a solution.
Many of them can be boiled down to a simple concept called hill climbing:

Pick a random solution as your starting point (it'll probably be bad, that's okay)
Randomly tweak it a bit to get a slightly different solution
Is the new one better? Awesome, keep it, otherwise keep the original
Repeat until you've "climbed the hill" to a good solution

Genetic Algorithms

A subcategory of Metaheuristics that borrows concepts from evolutionary biology.
Rather than testing a single solution at a time, these use a population of candidate solutions and through a process of not-so-natural selection "evolve" a good solution.
Let's walk through what that means below using this project as an example.

Goal

The algorithm stops when one of two things happens: we reach our ideal solution or (more likely) we run out of time, here measured in generations.

Given a target image, e.g. The Mona Lisa, we want to recreate it as closely as possible using a number of colored polygons.

Setup

We start with a population of candidate solutions called organisms.
Each organism has a set of chromosomes that represent its attempt at a solution.
The first generation is created randomly.

In our case, each chromosome encodes a single polygon - a list of (x,y) points and a color. Each organism has a list of chromosomes that comprise its solution.

Evaluation

We then apply a fitness function to each organism, assigning it a score based on how good of a solution it is.

Here that means rendering an organism's polygons to a canvas and then comparing that canvas pixel by pixel to the target image. At each pixel we subtract the difference between what the value should be and what the organism produced.

Selection

Once everybody has a score, it's time to reproduce!
Using a selection algorithm, we choose organisms from the population, typically two, as parent(s) and breed them to produce two new offspring. There are a variety of selection methods out there.

In this project you can choose between Tournament Selection, Roulette Selection, and Stochastic Universal Sampling. More info on each can be found in the Resources section.

Reproduction

Breeding our newly chosen parents consists of two operations:
Crossover - a recombination of the parents' chromosomes to produce two new sets in their offspring.

We randomly choose an index (or multiple indices) in the list of polygons as a crossover point and swap the polygons after that point.

Mutation - randomly tweaking values in the offspring's chromosomes to introduce more variety

Here we have a number of ways to mutate chromosomes:

Tweaking the position and color of each polygon

Adding or removing sides from a polygon

Adding or removing a polygon all together

Permuting the order of polygons

Replacement

Lastly, we replace the previous generation of organisms with the new generation of children.
This process repeats until an organism in the population hits our target or run out of time.

Things I've Learned Trying to Recreate the Mona Lisa

I spent a lot of time trying to tune the hyperparameters of this particular Genetic Algorithm demo in order reach a higher maximum. I knew I wouldn't ever get a small number of polygons to perfectly recreate a painting (also there's something almost anticlimactic about it being a photocopy of the original, like "here's the image you gave me, but with a million more steps"), but it seemed like there was another percent or two to eek out of the model if I could only get the settings right. Here are a few of my findings along the way.

Plateaus

Most images of any complexity seem to experience a fitness plateau around the 95 - 97% mark, depending on the image. This was good, but I had a strong suspicion it could be a percent or so better. Partly because it just didn't look good enough to me, and partly because even if I added 5 or 10 more polygons, the percent didn't seem to increase much at all. If the model was actually utilizing all the polygons decently, adding that many more should have made a difference. I tried so many things to avoid this and all of them just made things worse...

Speciation

I tried splitting the population in to subgroups and letting them evolve separately. Then when the fitness stagnated I'd have the two populations mate with each other.

Disruption Events

I tried adding disruption events if stagnation was detected, i.e. a brief period of increased mutation rates in an attempt to shake the population out of a local optimum.

Mitosis mechanism

If a chromosome tried to add a point, but was already at the max number, it would instead split into two new polygons. Conversely, if it tried to remove a point and was already at the minimum it would delete the chromosome. Here I thought I was being clever... I was not, it mostly seemed to cause these cyclical dips in fitness variation every few generations.

I think when it's that close to the target, most mutations are going to be deleterious, so achieving those last few percentage points are going to be slow no matter what.

Crossover

My tests showed that having crossover was important, but it didn't really seem to matter which type. I could get comparable results with all of them. Probably this is related to the fact that mutation includes permutation of the chromosome order.

Population Size

Increasing population size wasn't as impactful as I was hoping. Often the results from a 200 run would look indistinguishable from a 300 or 400 run. Generally larger sizes make the increase over time smoother, but it definitely isn't a linear relationship.

This Site

This is an interactive demo of Genetic Algorithms. Our organisms' DNA is an array of semi-transparent polygons and we're trying to evolve them to look like the Mona Lisa (or any other image of your choosing).

Site Tour

Gallery

This page shows a collection of example runs, along with some of their stats. You can download the gif timelapses of any.

Your Art

The results from your experiments will be displayed here. You can download the gif timelapses here as well.

Experiment

This page lets you set up and run new simulations. Queue up as many as you like and hit "play" to watch them go. You can pause and end the run early or let it evolve until it hits a stopping point - either it reaches the target fitness (yay) or max number of generations.

Site Architecture

This entirely client-side app was written in React using Redux for state management and React Sagas for asynchronous logic handling.

To help speed up some of the most time intensive calculations I used React18's new webworker feature to split up parallelizable work into separate workers.

Data from previous and pending simulation runs is stored in IndexedDB using Dexie. This allows it to persist between page reloads.

All of the Genetic Algorithm code I've tried to keep isolated in the population folder.

Running The Code

Install the project's packages by running npm install.

Run the project locally with npm start, by default it will be served on localhost:3000

Use npm run build to build a production version of the site for deployment.

Troubleshooting

Mac

The first time you run npm install you might come across a node-canvas error:

node-pre-gyp ERR!

Using the advice on this thread I was able to fix that by installing cairo on OSX, it's pretty straight forward and this page walks you through it.

Resources

Full credit to Robert Johansson for this project idea. Genetic Programming: Evolution of the Mona Lisa

Here's a list of other resources I used while working on this project:

Essentials of Metaheuristics

Analyzing Mutation Schemes for Real-Parameter Genetic Algorithms

Choosing Mutation and Crossover Ratios for Genetic Algorithms

Analyzing the Performance of Mutation Operators to Solve the Traveling Salesman Problem

Initial Population for Genetic Algorithms: A Metric Approach

Self-Adaptive Simulated Binary Crossover for Real-Parameter Optimization

Genetic Programming Needs Better Benchmarks

A Genetic Algorithm for Image Recreation — Can it Paint the Mona Lisa?

Shealynntate / genetic-algorithms

Genetic Algorithms

Reproducing images by evolving populations of polygons

Genetic Algorithms Overview

Metaheuristics

Genetic Algorithms

Goal

Setup

Evaluation

Selection

Reproduction

Replacement

Things I've Learned Trying to Recreate the Mona Lisa

Plateaus

Speciation

Disruption Events

Mitosis mechanism

Crossover

Population Size

This Site

Site Tour

Gallery

Your Art

Experiment

Site Architecture

Running The Code

Troubleshooting

Mac

Resources

About

Languages