JamesMcGuigan / kaggle-arc

Kaggle - Abstraction and Reasoning Challenge

Kaggle ARC Abstraction and Reasoning Challenge

This codebase contains my competition entry for, and ongoing research into, the Abstraction and Reasoning Corpus (ARC).

This work was added to an ensemble of different approaches as part of the "Mathematicians + Experts" team

Public Leaderboard (22/914 = top 3% = Silver Medal)

Install and Execute

./requirements.sh
source venv/bin/activate

python3 ./submission/kaggle_compile.py src/solver_multimodel/main.py                 | tee ./submission/submission.py
python3 ./submission/kaggle_compile.py src/ensemble/sample_sub/sample_sub_combine.py | tee ./submission/submission.py

Gallery

Jupyter Notebook visualization of the solved and unsolved results for each of the Solvers

Data Model

This is an object-oriented data model wrapped around the dataset, allowing for static typing and easy navigation between related datatypes.

Conceptual Mapping:

  • Competition: The collection of all Datasets in the competition
  • Dataset: An array of all Tasks in the competition
  • Task: The entire contents of a JSON file; each Task outputs 1-3 lines of CSV
  • ProblemSet: An array of either test or training Problems
  • Problem: An input + output Grid pair
  • Grid: An individual grid represented as a numpy array
  • CSV: Export data model to submission.csv
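
As a rough illustration of the lower half of this mapping, the sketch below builds Task, ProblemSet and Problem objects from the standard ARC JSON layout (a dict with "train" and "test" lists of "input"/"output" grids). The class fields and the task_from_json() helper are assumptions made for this example and are not taken from the repository; Competition, Dataset and CSV are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional

import numpy as np


@dataclass
class Problem:
    """An input + output Grid pair; output is None when it is hidden."""
    input: np.ndarray
    output: Optional[np.ndarray] = None


@dataclass
class ProblemSet:
    """An array of either test or training Problems."""
    problems: List[Problem]


@dataclass
class Task:
    """The contents of one JSON file, split into train and test sets."""
    filename: str
    train: ProblemSet
    test: ProblemSet


def task_from_json(filename: str, spec: dict) -> Task:
    """Hypothetical helper: convert a parsed ARC JSON dict into a Task."""
    def problem_set(key: str) -> ProblemSet:
        return ProblemSet([
            Problem(
                input=np.array(p["input"], dtype=np.int8),
                output=np.array(p["output"], dtype=np.int8) if "output" in p else None,
            )
            for p in spec[key]
        ])
    return Task(filename, problem_set("train"), problem_set("test"))
```

Each Grid is stored as a numpy array, so solvers can work with array operations directly rather than nested lists.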

Solver Abstract

Proof of concept: use inspect.signature() to work out all valid permutations of f(g(h(x))) and implement an IoC (inversion of control) dependency-injection solver.
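
The idea can be sketched with type annotations: a chain of functions is valid when each function's return annotation matches the parameter annotation of the next. This is a minimal sketch and not the repository's implementation; the toy functions parse(), double() and label() and the helpers below are hypothetical names introduced for the example.

```python
import inspect
from itertools import permutations


# Toy single-argument functions with type annotations (hypothetical examples).
def parse(text: str) -> int:
    return int(text)


def double(n: int) -> int:
    return n * 2


def label(n: int) -> str:
    return f"value={n}"


def accepts(func, arg_type) -> bool:
    """True if func takes exactly one parameter annotated with arg_type."""
    params = list(inspect.signature(func).parameters.values())
    return len(params) == 1 and params[0].annotation is arg_type


def valid_chains(funcs, input_type, output_type, length):
    """Return every ordering of `length` functions whose types line up."""
    chains = []
    for chain in permutations(funcs, length):
        current = input_type
        for func in chain:
            if not accepts(func, current):
                break
            current = inspect.signature(func).return_annotation
        else:
            # Loop finished without a type mismatch; check the final type.
            if current is output_type:
                chains.append(chain)
    return chains


def run(chain, value):
    """Apply the chain first-to-last, i.e. chain[-1](...chain[0](value))."""
    for func in chain:
        value = func(value)
    return value
```

With these three functions, only the ordering parse, double, label maps a str to a str, so the search space shrinks from six permutations to one before any function is executed.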

Solver MultiModel

This is the main codebase.

Core

Solver implements an object-oriented base class that handles the common code for looping over, testing and generating solutions for the dataset, allowing subclasses to override lifecycle methods such as detect(), fit() and solve_grid().
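
A minimal sketch of how such a lifecycle could fit together is shown below. Only detect(), fit() and solve_grid() are named in this README; the test() and solve() methods, and the plain-dict task layout, are assumptions made for the example.

```python
import numpy as np


class Solver:
    """Sketch of a base class; subclasses override the lifecycle hooks."""

    def detect(self, task) -> bool:
        """Cheap check: is this task worth attempting at all?"""
        return True

    def fit(self, task) -> None:
        """Learn any parameters from the training problems."""

    def solve_grid(self, grid):
        """Transform one input grid into a predicted output grid."""
        raise NotImplementedError

    def test(self, task) -> bool:
        """(Assumed name) True if every training problem is reproduced."""
        return all(
            np.array_equal(self.solve_grid(p["input"]), p["output"])
            for p in task["train"]
        )

    def solve(self, task):
        """(Assumed name) Run the lifecycle; return test predictions or None."""
        if not self.detect(task):
            return None
        self.fit(task)
        if not self.test(task):
            return None
        return [self.solve_grid(p["input"]) for p in task["test"]]


class DoNothingSolver(Solver):
    """Simplest possible subclass: return the input grid unchanged."""

    def solve_grid(self, grid):
        return grid
```

Keeping the loop and the train-side check in the base class means a new solver only has to supply the grid transformation itself.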

ProblemSetSolver is a Solver subclass designed for algorithms requiring data access to the Task rather than just the current input Grid.

ProblemSetEncoder is a base class for autogenerating a feature map from a list of typed univariate functions

Solvers

In order of complexity:

DoNothingSolver just returns the input grid

GlobSolver indexes the training dataset and returns verbatim any problems seen in the training dataset
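
One way to build such an index, sketched under the assumption that a problem counts as "seen" when its input grid matches exactly, is to key a dictionary on the grid's shape and raw bytes. GlobIndex and grid_key() are hypothetical names and not the repository's API.

```python
import numpy as np


def grid_key(grid):
    """Hashable key for a grid: its shape plus its bytes at a fixed dtype."""
    grid = np.asarray(grid, dtype=np.int8)
    return (grid.shape, grid.tobytes())


class GlobIndex:
    """Remembers the output for every input grid seen during fit()."""

    def __init__(self):
        self.seen = {}

    def fit(self, problems):
        """problems: iterable of (input_grid, output_grid) pairs."""
        for input_grid, output_grid in problems:
            self.seen[grid_key(input_grid)] = np.asarray(output_grid, dtype=np.int8)

    def lookup(self, grid):
        """Return the stored output for an exact input match, else None."""
        return self.seen.get(grid_key(grid))
```

Including the shape in the key prevents two grids with the same flattened values but different dimensions from colliding.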

ZoomSolver applies cv2.resize() and skimage.measure.block_reduce() to problems whose input/output grid sizes are an integer multiple of each other
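
The integer-multiple check and the two zoom directions can be sketched with plain NumPy. This is a stand-in written for illustration: np.repeat plays the role of a nearest-neighbour upscale and a reshape-and-reduce plays the role of a block reduction, rather than calling cv2 or skimage. All function names here are assumptions.

```python
import numpy as np


def integer_ratio(small_shape, large_shape):
    """Return k if large is exactly k times small in both axes, else None."""
    (h1, w1), (h2, w2) = small_shape, large_shape
    if h2 % h1 == 0 and w2 % w1 == 0 and h2 // h1 == w2 // w1:
        return h2 // h1
    return None


def zoom_in(grid, k):
    """Upscale by repeating every pixel k times along both axes."""
    return np.repeat(np.repeat(grid, k, axis=0), k, axis=1)


def zoom_out(grid, k, func=np.max):
    """Downscale by reducing each k x k block to one value with func."""
    h, w = grid.shape
    blocks = grid.reshape(h // k, k, w // k, k)
    return func(blocks, axis=(1, 3))
```

The choice of reduction function (max, min, mean and so on) changes which colour survives a downscale, so a solver would typically try several and keep whichever reproduces the training outputs.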

SingleColorSolver + BorderSolver test a list of functions to answer single-color problems

GeometrySolver performs a brute force search of numpy array functions
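
A brute-force search of this kind can be sketched as a loop over candidate array functions, keeping the first one that reproduces every training output. The candidate list below is an assumption chosen for illustration; the repository's actual list of functions is not given in this README.

```python
import numpy as np

# Candidate transformations to try, in order (an illustrative selection).
CANDIDATES = {
    "identity": lambda g: g,
    "rot90": np.rot90,
    "rot180": lambda g: np.rot90(g, 2),
    "rot270": lambda g: np.rot90(g, 3),
    "flipud": np.flipud,
    "fliplr": np.fliplr,
    "transpose": np.transpose,
}


def find_transform(train_pairs):
    """Return (name, func) for the first candidate matching all pairs, else None.

    train_pairs: iterable of (input_grid, output_grid) numpy arrays.
    """
    train_pairs = list(train_pairs)
    for name, func in CANDIDATES.items():
        if all(np.array_equal(func(i), o) for i, o in train_pairs):
            return name, func
    return None
```

np.array_equal() returns False for arrays of different shapes, so candidates that change the grid dimensions incorrectly are rejected without any extra shape check.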

TessellationSolver applies nested geometry solutions to tessellation problems

XGBGridSolver generates a large multi-dimensional feature map to be solved by XGBoost. The feature map includes each pixel's "view" of neighbouring pixels. This was able to autosolve a surprising number of problem cases, but it also produced a large number of incorrect or near-miss guesses that nevertheless tested correctly against the train side of the task.
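
To make the per-pixel "view" concrete, the sketch below builds one feature row per pixel containing its coordinates, the grid size, and a flattened window of surrounding pixels. The feature choice, the radius parameter and the -1 padding value are assumptions for this example; the real feature map is described above as considerably larger.

```python
import numpy as np


def pixel_features(grid, radius=1, fill=-1):
    """Return an array of shape (h*w, 4 + (2*radius+1)**2).

    Each row describes one pixel: [y, x, h, w] followed by the flattened
    (2*radius+1) square window centred on that pixel. Positions outside
    the grid are filled with `fill`.
    """
    grid = np.asarray(grid)
    h, w = grid.shape
    padded = np.pad(grid, radius, mode="constant", constant_values=fill)
    size = 2 * radius + 1
    rows = []
    for y in range(h):
        for x in range(w):
            window = padded[y:y + size, x:x + size]
            rows.append(np.concatenate(([y, x, h, w], window.ravel())))
    return np.array(rows)
```

When input and output grids share the same shape, the flattened output grid can serve as the per-pixel label vector for a classifier such as XGBoost, with one prediction per row.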

Hyperopt Bayesian Hyperparameter Optimization was also performed on XGBoost.

XGBSingleColorSolver solves simple problems using XGBoost in a subclassable way

Utils

Various utility functions including plot_task() and @np_cache()
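
Numpy arrays are not hashable, so functools.lru_cache cannot be applied to functions that take them directly. A caching decorator in the spirit of @np_cache() might therefore convert array arguments into hashable keys first. The implementation below is a guess at the general approach, not the repository's code, and it only handles positional arguments.

```python
import functools

import numpy as np


def np_cache():
    """Decorator factory: memoise a function that takes numpy array arguments."""
    def decorator(func):
        cache = {}

        @functools.wraps(func)
        def wrapper(*args):
            # Replace each array with a hashable (shape, dtype, bytes) triple.
            key = tuple(
                (a.shape, a.dtype.str, a.tobytes()) if isinstance(a, np.ndarray) else a
                for a in args
            )
            if key not in cache:
                cache[key] = func(*args)
            return cache[key]

        wrapper.cache = cache  # exposed for inspection in this sketch
        return wrapper
    return decorator


@np_cache()
def count_colors(grid):
    """Example cached function: number of distinct values in the grid."""
    return len(np.unique(grid))
```

Because the key is built from array contents rather than object identity, an equal copy of a grid hits the same cache entry.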

Functions

A range of different numpy.array queries and transformations

Kaggle Compile

Kaggle Compile is a custom Python concatenator that resolves local import statements and allows a multi-file IDE codebase to be compiled into a single-file Kaggle Kernel Script
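
The core of such a concatenator can be sketched as a recursive pass that replaces each local "from module import ..." line with the body of that module, inlining every module at most once. To stay self-contained, this example keeps module sources in an in-memory dict instead of reading files; inline_imports() and that dict layout are assumptions and do not reflect how kaggle_compile.py is written.

```python
import re
from typing import Dict, List, Optional, Set

# Matches lines such as "from src.util import helper" and captures "src.util".
LOCAL_IMPORT = re.compile(r"^from\s+([A-Za-z_][\w.]*)\s+import\s+.+$")


def inline_imports(
    name: str,
    sources: Dict[str, str],
    seen: Optional[Set[str]] = None,
) -> List[str]:
    """Return the lines of module `name` with local imports inlined.

    sources maps dotted module names to source text. Imports of modules
    that are not in `sources` (e.g. numpy) are left untouched.
    """
    seen = set() if seen is None else seen
    if name in seen:
        return []  # already inlined earlier in the output
    seen.add(name)

    lines: List[str] = []
    for line in sources[name].splitlines():
        match = LOCAL_IMPORT.match(line)
        if match and match.group(1) in sources:
            lines.extend(inline_imports(match.group(1), sources, seen))
        else:
            lines.append(line)
    return lines
```

Because dependencies are expanded at the point of import, each definition appears in the output before the code that uses it, which is what a single-file script needs.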
