ygorg / Emoticon_Regex

A regex that matches emoticons and kaomojis

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Emoticon_Regex :-)

@}-'-,--

The regexes matches emoticons and kaomojis that can be found in emoticons.json.

>°)))>

Emoticon

Human readable =)

pixelart = {
	"yinyang": r"69|\(\.\\°\)",
	"buttock": r"\([_ ][Y¤)()][_ ]\)",
	"breast": r"\( [o\.] (?:\)\(|Y) [o\.] \)",
	"penis": r"(?:\(_\)_\)|[c83])=+[D3]",
	"vulva": r"{\('\)}",
	"rose": r"@}[->',]+",
	"snail": r"_@_/|i@_"
}

components = {
	"hat": r"(?:\*?[03<>]|\(\)|3>)",
	"eyes": r"(?:[¦|:;=38BEXxz@%+]|\(:\))",
	"nose": r"[-'o^~]",
	"mouth": r"[\)\(\|\]\[}{>3DdQPpOoXxSs@€*$#/\\°~]",
	"pixelart": r"<(?:/|\\)?3|&\[ \]|\[]==\[]|\(___\(--#|>°\)+><|{yinyang}|{buttock}|{breast}|{penis}|{vulva}|{snail}|{rose}".format(**pixelart)
}

emoticon = r"{hat}?{eyes}'?{nose}?{mouth}|{mouth}{nose}?'?{eyes}{hat}?|{pixelart}".format(**components)

Expanded

emoticon = r"(?:\*?[03<>]|\(\)|3>)?(?:[¦|:;=38BEXxz@%+]|\(:\))'?[-'o^~]?[\)\(\|\]\[}{>3DdQPpOoXxSs@€*$#/\\°~]|[\)\(\|\]\[}{>3DdQPpOoXxSs@€*$#/\\°~][-'o^~]?'?(?:[¦|:;=38BEXxz@%+]|\(:\))(?:\*?[03<>]|\(\)|3>)?|<(?:/|\\)?3|&\[ \]|\[]==\[]|\(___\(--#|>°\)+><|69|\(\.\\°\)|\([_ ][Y¤)()][_ ]\)|\( [o\.] (?:\)\(|Y) [o\.] \)|(?:\(_\)_\)|[c83])=+[D3]|{\('\)}|_@_/|i@_|@}[->',]+"

Kaomoji ^o^

Human readable

components = {
	"eye": r"\[?[éèn-u=µ\-`´;\"@vQTPç0ÔOo.ôUX'x*¤+$^°><~¬]\]?",
	"mouth": r"([_,0Oµou.-3AwpmQqWvV\-]|[_/]+)",
	"sweat": r"[\"']",
	"cheek": r"[)(|?]",
	"arm": r"[\\/)(o]",
	"head": r"[o°]"
}
kaomoji = r"{arm}?{cheek}?{eye} ?{mouth}? ?{eye}{sweat}?{cheek}?{arm}?|{arm}{head}{arm}".format(**components)

Expanded

kaomoji = r"[\\/)(o]?[)(|?]?\[?[éèn-u=µ\-`´;\"@vQTPç0ÔOo.ôUX\'x*¤+$^°><~¬]\]? ?([_,0Oµou.-3AwpmQqWvV\-]|[_/]+)? ?\[?[éèn-u=µ\-`´;\"@vQTPç0ÔOo.ôUX\'x*¤+$^°><~¬]\]?[\"\']?[)(|?]?[\\/)(o]?|[\\/)(o][o°][\\/)(o]"

To go further

These repos are about emoticons too. Check them out :

A method to detect kaomojis :

Steven Bedrick, Russell Beckley, Brian Roark, and Richard Sproat. 2012. Robust Kaomoji Detection in Twitter. In Proceedings of the Second Workshop on Language in Social Media (LSM '12). Association for Computational Linguistics, Stroudsburg, PA, USA, 56-64.

About

A regex that matches emoticons and kaomojis

License:MIT License