jleinonen / ikea-names

Generating fake names of IKEA furniture with an LSTM network

Disclaimer: The author is not affiliated with IKEA. The results shown here are intended for educational and entertainment purposes only and do not imply endorsement.

I had wanted to experiment with natural language processing using LSTM networks for a while when I saw some discussion on social media about the quirky names of IKEA furniture. I realized that IKEA has a few thousand products, which should be enough to train a simple network to generate completely made-up product names.

Fortunately, the hard work of data collection had already been done for me: I mined a dataset of product names from the IKEA dictionary. I then wrote some processing code in Python and implemented the LSTM model in Keras.
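As a rough illustration of what the processing step for a character-level model can look like, here is a minimal pure-Python sketch. It is not the code from this repository: the function names `build_vocab` and `make_pairs`, the newline end-of-name marker, and the three-name sample list are all assumptions made for the example. It maps each character to an integer index and turns a name into (prefix, next character) pairs, which is the kind of input a network predicting the next letter would typically be trained on.

```python
# Tiny illustrative sample; the real dataset has a few thousand names.
names = ["BILLY", "KALLAX", "MALM"]

END = "\n"  # assumed end-of-name marker, not taken from the repository


def build_vocab(names):
    """Map every character seen in the names (plus END) to an integer index."""
    chars = sorted(set("".join(names)) | {END})
    return {ch: i for i, ch in enumerate(chars)}


def make_pairs(name, vocab):
    """Return (prefix_indices, next_char_index) pairs for one name."""
    seq = [vocab[ch] for ch in name + END]
    return [(seq[:i], seq[i]) for i in range(len(seq))]


vocab = build_vocab(names)
pairs = make_pairs("MALM", vocab)

for prefix, target in pairs:
    print(prefix, "->", target)
```

Each pair asks the model to predict one character from the characters before it; the final pair teaches it when a name should end.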

To see how the model is run, check out the code and this Jupyter notebook that gives an example of using it.

You can also clone the code and use the command-line interface. Running the following in the ikeanames folder should generate 100 names with the pre-trained model included with the code:

    python ikeanames.py generate 100

Here is one example output of generated names:

GÅNGAL		KORT*		TABMAN		TOMMEN		SORSA
GÄSJÖ		TITTADIA	UTBY*		ILIOMA		LEDANA
GÖTA		PRAKTIT		PRASKE		BILJARL		PRUMÖR
STULD		MAJKÖ		VARRELUDDA	ARNÖM		TÄLLENG
BOLSVAT		RADERA		KAAMBY		JONNE		BLADA
REMONH		ANGELF		MÅNGEN		BERNNET		INNARIS
SÄLLAR		EFBY		TRULS*		LAGNI		SKRÄCKÖ
BERENN		JYGGJAA		MIRGA		EDRIKMÄN	INDAL
LJULDAN		GYSING		BEVJÄT		TRYPS		FLYK
SNÖRS		MARAK		VID		BELA		LONDED
KOJE		INDO*		GRÖKIG		KNASS		BERRSKAM
VOFTER		ESKME		TÄLLHOLM	RÖNE		SKODVEN
GRÄSÖR		SSÖRLIN		TULLA		FLABO		LURGER
KRUTT		MORA		HALLCEHYLL	DANTA		KLEVBAR
FENDIK		TOLLA		TROSURDA	IMEBAL		SKÄMDA
BORK		TROMMEN		ÖFJA		ORGELFIKÄR	GRÖNA
LASTAN		ENBAR		GAMOLIS		DRIGG		KORDENELL
UMMEREN		FRANKLIG	ONRELS		SERJÖ		BIMÅS
JOVERR		BARRKA		SANE		ÅNARLUS		BEDUDDER
KORPETIN	ESTER		EKIS		SLÄNG		PLÄL

The results are quite interesting: the model has clearly learned enough about the relationships between letters in Scandinavian languages to generate halfway decent Scandinavian-sounding Pig Latin. A Swedish speaker should have no problem pronouncing most of these words, even though most of them are nonsense. The names ending with an asterisk were present in the training set. Clearly the model does much more than repeat examples from its training data. It even manages to generate some real words that are not in the training set, for example "Göta" and "Gröna".
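The asterisk marking described above can be reproduced with a simple set lookup. The sketch below is an assumption about how such a check might be written, not the repository's implementation; `mark_known` and `novelty_rate` are hypothetical helper names, and the training set here contains only two names that the sample output marks as known.

```python
def mark_known(generated, training_names):
    """Append '*' to every generated name that also occurs in the training set."""
    known = {name.upper() for name in training_names}
    return [g + "*" if g.upper() in known else g for g in generated]


def novelty_rate(generated, training_names):
    """Fraction of generated names that do not occur in the training set."""
    known = {name.upper() for name in training_names}
    novel = [g for g in generated if g.upper() not in known]
    return len(novel) / len(generated)


# Stand-in for the full training set, using two names marked in the sample.
training_names = ["KORT", "UTBY"]
generated = ["KORT", "TABMAN", "UTBY", "SORSA"]

marked = mark_known(generated, training_names)
rate = novelty_rate(generated, training_names)
print(marked, rate)
```

A low share of marked names is a quick sanity check that the model is generalizing rather than memorizing its training data.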

License: MIT License