asdfjkl / kanjicanvas

Online Kanji (Japanese Character) Recognition in Javascript

Home Page: https://asdfjkl.github.io/kanjicanvas/

Adding new refPatterns

SethClydesdale opened this issue

Hello,

Firstly, I'd like to thank you for the work you've done on this project! I found it the most suitable for what I wanted to accomplish, which was to create stroke order exercises for the Genki textbooks, and it has worked incredibly well thus far! You can view it online here. (This is also a debug link, so you can view output in the web console when clicking "next".) I've also mentioned the project in the readme as thanks.

Anyway, on to my question! I was wondering how to add new refPatterns, or what method you used to add them? I want to add the following kanji, as they are the only patterns I'm missing.

小, 自, 々, 彼, 供, 服, 夕

I tried using the copyStuff() function after drawing the kanji; however, the patterns returned seem to be inaccurate and pollute the candidates list. For example, this was the output for 小:

["小", 3, [[101.5,72.125,101.5,72.125,101.5,73.125,101.5,73.125,101.5,74.125,101.5,74.125,101.5,75.125,101.5,75.125,101.5,76.125,101.5,76.125,101.5,77.125,101.5,77.125,101.5,78.125,101.5,78.125,101.5,79.125,101.5,79.125,101.5,80.125,101.5,80.125,101.5,81.125,101.5,81.125,101.5,82.125,101.5,82.125,101.5,83.125,101.5,83.125,101.5,84.125,101.5,84.125,101.5,85.125,101.5,85.125,101.5,86.125,101.5,86.125,101.5,87.125,101.5,87.125,101.5,88.125,101.5,88.125,101.5,89.125,101.5,89.125,101.5,90.125,101.5,90.125,101.5,91.125,101.5,91.125,101.5,92.125,101.5,92.125,101.5,93.125,101.5,93.125,101.5,94.125,101.5,94.125,101.5,95.125,101.5,95.125,101.5,96.125,101.5,96.125,101.5,97.125,101.5,97.125,101.5,98.125,101.5,98.125,101.5,99.125,101.5,99.125,101.5,100.125,101.5,100.125,101.5,101.125,101.5,101.125,101.5,102.125,101.5,102.125,100.5,102.125,100.5,102.125,100.5,103.125,100.5,103.125,100.5,104.125,100.5,104.125,100.5,105.125,100.5,105.125,100.5,106.125,100.5,106.125,100.5,107.125,100.5,107.125,100.5,108.125,100.5,108.125,100.5,109.125,100.5,109.125,99.5,109.125,99.5,109.125,99.5,110.125,99.5,110.125,98.5,110.125,98.5,110.125,98.5,111.125,98.5,111.125,97.5,111.125,97.5,111.125,97.5,112.125,97.5,112.125,96.5,112.125,96.5,112.125,96.5,113.125,96.5,113.125,95.5,114.125,95.5,114.125,94.5,115.125,94.5,115.125,93.5,115.125,93.5,115.125,93.5,116.125,93.5,116.125,92.5,116.125,92.5,116.125,91.5,116.125,91.5,116.125,91.5,117.125,91.5,117.125,91.5,118.125,91.5,118.125,90.5,118.125],[87.5,81.125,87.5,81.125,87.5,82.125,87.5,82.125,87.5,83.125,87.5,83.125,87.5,84.125,87.5,84.125,87.5,85.125,87.5,85.125,87.5,86.125,87.5,86.125,87.5,87.125,87.5,87.125,87.5,88.125,87.5,88.125,86.5,88.125,86.5,88.125,86.5,89.125,86.5,89.125,86.5,90.125,86.5,90.125,86.5,91.125,86.5,91.125,85.5,91.125,85.5,91.125,85.5,92.125,85.5,92.125,84.5,92.125,84.5,92.125,84.5,93.125,84.5,93.125,84.5,94.125,84.5,94.125,83.5,94.125,83.5,94.125,83.5,95.125,83.5,95.125,83.5,96.125,83.5,96.125,82.5,96.125,82.5,96.125,81.5,96.125,81.5,9
6.125,81.5,97.125,81.5,97.125,80.5,97.125,80.5,97.125,80.5,98.125,80.5,98.125,79.5,98.125,79.5,98.125,79.5,99.125,79.5,99.125,78.5,99.125,78.5,99.125,77.5,99.125,77.5,99.125,77.5,100.125,77.5,100.125,76.5,100.125,76.5,100.125,75.5,100.125,75.5,100.125,75.5,101.125,75.5,101.125,75.5,102.125],[111.5,81.125,111.5,81.125,112.5,81.125,112.5,81.125,112.5,82.125,112.5,82.125,112.5,83.125,112.5,83.125,113.5,83.125,113.5,83.125,113.5,84.125,113.5,84.125,114.5,85.125,114.5,85.125,114.5,86.125,114.5,86.125,114.5,87.125,114.5,87.125,115.5,87.125,115.5,87.125,115.5,88.125,115.5,88.125,116.5,89.125,116.5,89.125,116.5,90.125,116.5,90.125,117.5,90.125,117.5,90.125,117.5,91.125,117.5,91.125,118.5,91.125,118.5,91.125,119.5,92.125,119.5,92.125,119.5,93.125,119.5,93.125,120.5,93.125,120.5,93.125,120.5,94.125,120.5,94.125,121.5,94.125,121.5,94.125,121.5,95.125,121.5,95.125,121.5,96.125,121.5,96.125,122.5,96.125,122.5,96.125,123.5,96.125,123.5,96.125,124.5,96.125,124.5,96.125,124.5,97.125,124.5,97.125,124.5,98.125,124.5,98.125,125.5,98.125,125.5,98.125,126.5,98.125]]]

With this in refPatterns, drawing 一, 二, 三, or one to four random lines, for example, always returns 小 as the first candidate, which feels off.

I'm thinking perhaps I may have done something wrong, so I figured I'd ask what method you used to add the current refPatterns.
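
For reference, the raw output above can be inspected with a few lines of Python. This is only a sketch: it assumes each stroke is a flat list alternating x and y values, which is how the output reads, and the helper names are made up for illustration. It pairs the values into points and drops consecutive repeats, which shows how densely sampled the raw strokes are.

```python
def to_points(flat):
    """Pair a flat [x0, y0, x1, y1, ...] list into (x, y) tuples."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of values")
    return list(zip(flat[0::2], flat[1::2]))


def drop_repeats(points):
    """Remove points that are identical to the one directly before them."""
    result = []
    for p in points:
        if not result or result[-1] != p:
            result.append(p)
    return result


# First few values of the first stroke from the output above.
stroke = [101.5, 72.125, 101.5, 72.125, 101.5, 73.125, 101.5, 73.125, 101.5, 74.125]
points = to_points(stroke)
unique = drop_repeats(points)
print(len(points), "raw points,", len(unique), "after dropping repeats")
```

Dropping repeats only tidies the raw data; it does not replace the normalization and feature extraction steps.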

It's more complicated unfortunately:

  • Each kanji is stored in an XML file, the filename is UTF-8 code as hex + ".xml"
  • The XML file contains the strokes. These strokes are drawn within a "virtual box" of size 256x256
  • The Python script read_all.py is then used to read all input kanji, apply moment normalization, and extract feature points. It then prints out the array 'var refPatterns = [...]'. This array can then be copied and pasted into the JavaScript source file
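
The steps above can be sketched roughly as follows. This is not the code in read_all.py, just a minimal illustration under stated assumptions. The attached example 5bc6.zip suggests the file name is the character's code point in lowercase hex, so `xml_filename` assumes that. `moment_normalize` shows a simplified, aspect-preserving variant of moment normalization (centre on the centroid, scale by the standard deviation of the points) mapped into the 256x256 box. The function names and the `spread` parameter are hypothetical.

```python
import math

BOX = 256  # side length of the "virtual box" mentioned above


def xml_filename(char):
    """File name for one character, assuming code point as lowercase hex."""
    return format(ord(char), "x") + ".xml"


def moment_normalize(strokes, box=BOX, spread=4.0):
    """Centre strokes on their centroid and scale by their second moments.

    strokes: list of strokes, each a list of (x, y) tuples.
    spread:  hypothetical parameter; how many standard deviations the box spans.
    """
    pts = [p for s in strokes for p in s]
    if not pts:
        return []
    n = len(pts)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    # Second central moments, used as a size estimate per axis.
    sx = math.sqrt(sum((x - cx) ** 2 for x, _ in pts) / n)
    sy = math.sqrt(sum((y - cy) ** 2 for _, y in pts) / n)
    # One scale for both axes keeps the aspect ratio; guard against zero.
    sigma = max(sx, sy) or 1.0
    scale = box / (spread * sigma)
    half = box / 2
    out = []
    for s in strokes:
        # Shift to the box centre and clamp outliers to the box.
        out.append([
            (min(max((x - cx) * scale + half, 0), box - 1),
             min(max((y - cy) * scale + half, 0), box - 1))
            for x, y in s
        ])
    return out


for ch in "小自々彼供服夕":
    print(ch, xml_filename(ch))

normalized = moment_normalize([[(100, 70), (100, 120)], [(85, 80), (75, 100)]])
print(normalized[0][0])
```

Using a single scale for both axes is a design choice made here for simplicity; the actual script may scale each axis separately or use a different spread.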

I'll add the XML sources to this repo in the next few days. You can then augment the files with your own characters and apply the above steps.

This is an example .xml (zipped).
5bc6.zip

I see, I figured the XML file and the python script had something to do with it when I was looking over the source code again earlier today, but I wasn't too sure of how to go about editing the XML file. I'll keep an eye out for the XML sources.

Thanks a bunch 👍

xmls.zip

All patterns attached. Of course, if you add additional ones, it'll be appreciated if you could submit them to this project...

Ah, thanks once again! I'll take a look over the files and see if I can go about adding the seven ones I need, and when I do I'll make sure to create a pull request.

Just one more thing though: How did you go about getting and setting all the x/y coordinates in these files? I have a few ideas, but just wanted to check with you before I go all out. Sorry for all the questions, haha.

Right. Way, way back, I built a small Java program for drawing and saving patterns. I quickly built a jar and attached it. Not really tested, hope it works for you. File -> Save Image will write an .xml file. Don't apply any normalization techniques in the Java program, as the Python script will take care of that.
jTegaki.zip

Just tested it and it works incredibly well! After running it through read_all.py I applied it to the refPatterns and it appears to be working properly now!

Thanks for all the help, @asdfjkl! When I get all the new patterns set up, I'll make a pull request. 👍