rhaussmann / ds-capstone-2

Comic Book Artist Identifier

This project focuses on distinguishing a comic book artist’s work when presented with an unknown art sample.

Comic book art presents a number of unique difficulties in identification.

Typically, one artist pencils the art, another inks over it, and yet another colors it. In addition, word balloons and narration may further obscure the art or falsely register as part of it.

The artist’s style will also change over time or with the subject matter. To further complicate the process, newer artists can be influenced by another artist’s style.

Data

The project uses Jack Kirby, co-creator of “The Black Panther,” “The Avengers” and many other popular series, as the target artist. For demonstration purposes, his work is compared with that of cartoonist Randall Munroe, known for his web comic XKCD. XKCD is primarily composed of stick figures, a contrast chosen in the interest of giving the neural network a chance of success.

Exploratory Data Analysis

Image files (jpg and png) were collected from Google Images, Pinterest and Tumblr. The images were black and white and usually inked; lighter pencil drawings were discarded. The initial pool consists of 1025 jpg and png images, proportionally resized and cropped to 200x200 pixels.
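
The resize-and-crop step can be sketched as plain arithmetic. The snippet below is a minimal illustration, not the project's actual preprocessing code: the function name `fit_geometry` is hypothetical, and it assumes the shorter side is scaled to 200 pixels and the crop is centered, neither of which is specified here.

```python
def fit_geometry(width, height, target=200):
    """Return (resized_size, crop_box) for a proportional resize and centered crop.

    Assumption: the shorter side is scaled to `target` and the crop is centered.
    """
    # Scale so the shorter side equals the target; the longer side overshoots.
    scale = target / min(width, height)
    new_w = max(target, round(width * scale))
    new_h = max(target, round(height * scale))
    # Center a target x target window inside the resized image.
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)


size, box = fit_geometry(800, 600)
print(size, box)
```

With an imaging library such as Pillow, the returned size and box could then be passed to `Image.resize` and `Image.crop`.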

To prepare the dataset, photos and “homages” were manually removed. File names with spaces and illegal characters were modified so the files could be read into the system. Each training image takes its label from the name of its directory, which makes adding additional artists easier.
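
As a rough sketch of the file-name cleanup and directory-based labeling, assuming a layout with one directory per artist: the helper names `safe_name` and `labeled_files` are hypothetical, and the set of characters treated as legal is an assumption, since the exact rule used is not stated.

```python
import re
from pathlib import Path


def safe_name(filename):
    """Replace spaces and characters outside [A-Za-z0-9._-] with underscores.

    Assumption: this character set stands in for whatever the project treated as legal.
    """
    path = Path(filename)
    stem = re.sub(r"[^A-Za-z0-9._-]+", "_", path.stem).strip("_")
    return (stem or "image") + path.suffix.lower()


def labeled_files(root):
    """Yield (path, label) pairs, taking the label from the parent directory name."""
    for path in sorted(Path(root).rglob("*")):
        if path.suffix.lower() in {".jpg", ".png"}:
            yield path, path.parent.name


print(safe_name("Jack Kirby #1 (inked).JPG"))
```

Because the label is just the directory name, adding another artist under this scheme only requires adding another directory of images.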

Artist 1 2 3 4
Jack Kirby kirby1 kirby2 kirby3 kirby1
Randall Munroe munroe1 munroe2 munroe3 munroe4

Process

Initially, the project tried to distinguish between three artists, but this proved too difficult given the mixture not only of artists but of color images as well. Sample images were increased to their current size from 50x50 patches (which were grouped by original image, with each patch returning a vote for the class). Larger, complete images were expected to show patterns better than smaller pieces. Various models were tested to try to eliminate overfitting, but for this analysis the simpler LeNet convolutional network (developed for handwriting recognition) was used.
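
The earlier patch-based approach, in which 50x50 pieces were grouped by original image and voted on the class, might look something like the following. This is only a sketch: the function name `vote_by_image` and the input format are hypothetical, a simple majority vote is assumed, and ties are not handled specially.

```python
from collections import Counter, defaultdict


def vote_by_image(patch_predictions):
    """Map each source image to the class that most of its patches voted for.

    `patch_predictions` is an iterable of (image_id, predicted_class) pairs,
    one per patch (a hypothetical format chosen for this sketch).
    """
    votes = defaultdict(Counter)
    for image_id, predicted_class in patch_predictions:
        votes[image_id][predicted_class] += 1
    # most_common(1) returns the top (class, count) pair for each image.
    return {image_id: counts.most_common(1)[0][0] for image_id, counts in votes.items()}


patches = [("img1", "kirby"), ("img1", "kirby"), ("img1", "munroe"), ("img2", "munroe")]
print(vote_by_image(patches))
```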

Tracking Filters Through the Architecture

The model was built with Keras using a TensorFlow backend. The architecture is a LeNet convolutional neural network. This returned a poor accuracy of ~50%. With the model overfitting, what are the impacts of each of the layers and their settings? Here are the layers and their cumulative effects on the image.

Layers Image
Original 0
Convolution: Conv2D : filter:20 kernel: 5x5 1
Activation: ReLU 2
Pooling: MaxPooling2D: Pool:2x2 Strides: 2x2 3
Convolution: Conv2D : filter:50 kernel: 5x5 4
Activation: ReLU 5
Pooling: MaxPooling2D: Pool:2x2 Strides: 2x2 6
Flatten + Dense(500) + Activation:ReLU
Dense(2)+Activation: Softmax
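
To get a feel for how the layers above transform a 200x200 input, the output shape and weight count of each layer can be traced by hand. The sketch below is an illustration under stated assumptions rather than the project's code: the function names are hypothetical, the input is assumed to have one channel (black and white), and the convolutions are assumed to use "same" padding with stride 1, none of which is confirmed here. Passing `padding="valid"` or `channels=3` shows the alternatives.

```python
def conv_out(size, kernel, padding):
    # "same" keeps the spatial size at stride 1; "valid" trims kernel - 1 pixels.
    return size if padding == "same" else size - kernel + 1


def trace_lenet(size=200, channels=1, padding="same"):
    """Return a list of (layer, output_shape, parameter_count) rows.

    Assumptions: single-channel input, stride-1 convolutions, "same" padding.
    """
    rows = []
    for filters in (20, 50):
        size = conv_out(size, 5, padding)
        params = (5 * 5 * channels + 1) * filters  # weights plus one bias per filter
        rows.append((f"Conv2D({filters}, 5x5) + ReLU", (size, size, filters), params))
        size //= 2  # 2x2 max pooling with 2x2 strides halves each side
        rows.append(("MaxPooling2D(2x2)", (size, size, filters), 0))
        channels = filters
    flat = size * size * channels
    rows.append(("Flatten", (flat,), 0))
    rows.append(("Dense(500) + ReLU", (500,), (flat + 1) * 500))
    rows.append(("Dense(2) + Softmax", (2,), (500 + 1) * 2))
    return rows


for name, shape, params in trace_lenet():
    print(f"{name:24s} {str(shape):16s} {params:>12,d}")
```

Under these assumptions the flattened feature vector has 125,000 values, so the Dense(500) layer alone holds about 62.5 million weights. That is a very large number relative to a pool of 1025 images, which may be one contributor to the overfitting.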

tensorboard graph

Second Model

The next model uses L2 regularization, ensembles and batch normalization. The results: 0.9744 - acc: 0.8072.
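
How the ensemble combines its members is not described, so the following is only one plausible reading: a minimal sketch of soft voting, where each model's softmax output is averaged and the class with the highest mean probability wins. The function name `ensemble_predict` and the example probabilities are hypothetical.

```python
def ensemble_predict(model_probabilities):
    """Average per-class probabilities across models.

    `model_probabilities` holds one softmax output (a list of class
    probabilities) per model. Returns (class_index, averaged_probabilities).
    Assumption: soft voting; the project may have combined models differently.
    """
    n_models = len(model_probabilities)
    n_classes = len(model_probabilities[0])
    averaged = [
        sum(probs[c] for probs in model_probabilities) / n_models
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=averaged.__getitem__)
    return best, averaged


# Three hypothetical models scoring one image as [P(Kirby), P(Munroe)].
print(ensemble_predict([[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]))
```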

Results

Figures: accuracy (acc) and loss (loss) plots.

Predictions

Artist 1 2 3 4
Jack Kirby kirby1 kirby2 kirby3 kirby1
Randall Munroe munroe1 munroe2 munroe3 munroe4
Jim Lee lee1 lee2

Next Steps

Additional fine-tuning techniques: Batch Normalization, L2 Regularization, Ensembles, Early Stopping

Additional CV2 filters: Canny Edge Detection
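
Of the techniques listed, early stopping is simple enough to sketch directly. The function below is a hypothetical illustration of the usual patience rule (stop once the validation loss has failed to improve for a set number of epochs). In Keras this would normally be handled by the `EarlyStopping` callback rather than hand-written code.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has gone `patience` epochs
    in a row without beating its best value so far.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: ran through every epoch


# Hypothetical validation losses, one per epoch.
print(early_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5]))
```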

References

AI Machine Attempts to Understand Comic Books ... and Fails

Detecting comic strip dialogue bubble regions in images

Rosebrock, Adrian. Image classification with Keras and deep learning

Wilkins, Benjamin. Sketching Interfaces: Generating code from low fidelity wireframes

Ueno, Miki. Comic Book Interpretation based on Deep Neural Networks (http://www.ttic.edu/SNL2017/papers/SNL-2017_paper_19.pdf)

Ueno, Miki, et al. Classification of Two Comic Books based on Convolutional Neural Networks

License: MIT License