Comic Book Artist Identifier

This project attempts to identify the comic book artist behind an unknown art sample.

Comic book art presents a number of unique difficulties in identification.

Typically, one artist pencils the art, which is then inked by a second artist and colored by a third. In addition, word balloons and narration may obscure the art or falsely register as part of it.

The artist’s style will also change over time or with the subject matter. To further complicate the process, newer artists can be influenced by another artist’s style.

Data

The project uses Jack Kirby, co-creator of “The Black Panther,” “The Avengers” and many other popular series, as the target artist. For demonstration purposes, his work is compared against that of Randall Munroe, known for the webcomic XKCD. XKCD is drawn primarily with stick figures, a contrast chosen to give the neural network a chance of success.

Exploratory Data Analysis

Image files (jpg and png) were collected from Google Images, Pinterest and Tumblr. The images were black and white and usually inked; lighter pencil drawings were discarded. The initial pool consists of 1025 jpg and png images, proportionally resized and cropped to 200x200 pixels.
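The resize-and-crop step can be sketched as a small geometry helper. This is a minimal illustration, not the project's actual code: the function name `resize_and_crop_box` is hypothetical, and it assumes the shorter side is scaled to 200 pixels and the crop is taken from the center, neither of which the write-up specifies.

```python
def resize_and_crop_box(width, height, target=200):
    """Return (new_size, crop_box) for a proportional resize plus center crop.

    Assumption: the shorter side is scaled to `target` and the crop is centered.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Scale so that the shorter side becomes exactly `target` pixels.
    scale = target / min(width, height)
    new_w = max(target, round(width * scale))
    new_h = max(target, round(height * scale))
    # Center the target x target window inside the resized image.
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)


# With Pillow, the result could be applied as:
#   new_size, box = resize_and_crop_box(*img.size)
#   thumb = img.resize(new_size).crop(box)
new_size, box = resize_and_crop_box(800, 600)
print(new_size, box)
```

Scaling the shorter side first keeps the aspect ratio intact, so the only content lost is whatever falls outside the centered square.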

To prepare the dataset, photos and “homages” were manually removed. File names containing spaces or illegal characters were renamed so the files could be loaded. The training data takes its labels from the directory names, which makes adding artists easier.
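The two preparation steps above (cleaning file names and labeling by directory) might look something like the following sketch. The helper names `safe_name` and `labeled_images`, the allowed character set, and the one-folder-per-artist layout are assumptions made for illustration.

```python
import re
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def safe_name(filename):
    """Replace spaces and any character outside [A-Za-z0-9._-] with underscores."""
    return re.sub(r"[^A-Za-z0-9._-]+", "_", filename)


def labeled_images(root):
    """Return sorted (path, label) pairs; the label is the parent directory name.

    Assumed layout: root/<artist_name>/<image file>.
    """
    pairs = []
    for path in Path(root).glob("*/*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            pairs.append((path, path.parent.name))
    return sorted(pairs)


print(safe_name("kirby cover #1.jpg"))
```

Because the label comes from the folder, adding another artist under this layout only requires creating a new directory of images.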

Artist	1	2	3	4
Jack Kirby
Randall Munroe

Process

Initially, the project tried to distinguish among three artists, but this proved too difficult given the mixture of artists and the addition of color. Sample images were enlarged to their current size from 50x50 pixels; at that size, the pieces were grouped by original image and voted on the class. Larger, complete images were expected to show patterns better than smaller pieces. Various models were tested in an attempt to eliminate overfitting, but for this analysis the simpler LeNet convolutional network (developed for handwriting recognition) was used.
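The earlier 50x50 approach, in which pieces are grouped by original image and vote on the class, can be illustrated with a simple majority vote. This is a sketch under assumptions: the function name `vote_by_image` and the `(image_id, predicted_label)` input format are hypothetical, and the write-up does not say how ties were handled.

```python
from collections import Counter, defaultdict


def vote_by_image(patch_predictions):
    """Pick the most common predicted label among the patches of each image.

    patch_predictions: iterable of (image_id, predicted_label) pairs.
    Ties go to the label that was seen first for that image.
    """
    votes = defaultdict(Counter)
    for image_id, label in patch_predictions:
        votes[image_id][label] += 1
    return {
        image_id: counts.most_common(1)[0][0]
        for image_id, counts in votes.items()
    }


predictions = [
    ("page_01", "kirby"),
    ("page_01", "kirby"),
    ("page_01", "munroe"),
    ("page_02", "munroe"),
]
print(vote_by_image(predictions))
```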

Tracking Filters Through the Architecture

The model was built with Keras on a TensorFlow backend. The architecture is a LeNet convolutional neural network. It returned a poor accuracy of ~50%. With the model overfitting, what is the impact of each layer and its settings? Here are the layers and their cumulative effects on the image.

Layers	Image
Original
Convolution: Conv2D, filters: 20, kernel: 5x5
Activation: ReLU
Pooling: MaxPooling2D, pool: 2x2, strides: 2x2
Convolution: Conv2D, filters: 50, kernel: 5x5
Activation: ReLU
Pooling: MaxPooling2D, pool: 2x2, strides: 2x2
Flatten + Dense(500) + Activation: ReLU
Dense(2) + Activation: Softmax
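To follow how the image changes size through these layers, the output shapes and weight counts can be worked out by hand. The sketch below does that arithmetic in plain Python rather than calling Keras. It assumes a single-channel 200x200 input and "valid" padding (the Keras default for Conv2D); the write-up states neither, so the numbers are illustrative only. The function names are hypothetical.

```python
def conv_out(size, kernel, padding="valid"):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1


def pool_out(size, pool=2, stride=2):
    """Spatial size after max pooling without padding."""
    return (size - pool) // stride + 1


def lenet_summary(side=200, channels=1, padding="valid", classes=2):
    """Return (layer, output_shape, parameter_count) rows for the layer list."""
    rows = []
    shape = (side, side, channels)
    for filters in (20, 50):
        h, w, c = shape
        # Each filter has 5*5*c weights plus one bias.
        params = (5 * 5 * c + 1) * filters
        shape = (conv_out(h, 5, padding), conv_out(w, 5, padding), filters)
        rows.append((f"Conv2D({filters}, 5x5) + ReLU", shape, params))
        h, w, c = shape
        shape = (pool_out(h), pool_out(w), c)
        rows.append(("MaxPooling2D(2x2, strides 2x2)", shape, 0))
    flat = shape[0] * shape[1] * shape[2]
    rows.append(("Flatten", (flat,), 0))
    rows.append(("Dense(500) + ReLU", (500,), (flat + 1) * 500))
    rows.append((f"Dense({classes}) + Softmax", (classes,), (500 + 1) * classes))
    return rows


for name, shape, params in lenet_summary():
    print(f"{name:32} {str(shape):16} {params:>12,}")
```

Under these assumptions, nearly all of the weights sit in the Dense(500) layer that follows Flatten. With roughly a thousand training images, that imbalance may be one contributor to the overfitting described above.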