knobel-dk / docblock-reader

Comment your PHP code using neural networks

The Personal Rasmus Lerdorf

Imagine that you had Rasmus Lerdorf sitting next to you. Rasmus can comment PHP code for you.

This proof of concept uses 443 PHP libraries from GitHub to train a sequence-to-sequence deep learning model with TensorFlow.

Facts about the dataset

We prepared the data by writing comments and PHP snippets into two files, comments.dat and snippets.dat. The files are aligned line by line: for example, the snippet on line 615 of snippets.dat has its comment on line 615 of comments.dat.

Preview of data
Read "Obtaining the data yourself" if you are interested in how we got this data.
  • A total of 34,105 PHP files in the dataset
  • 33,116 (97.10%) had accompanying PSR Docblocks that could be parsed
  • 989 files (2.90%) were skipped due to incomplete or missing PSR Docblocks
  • The resulting dataset is 181,937 rows (88 MB)
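
The line-by-line pairing of snippets.dat and comments.dat can be sketched in Python as follows. This is a minimal illustration, not the project's own loader: the helper names read_pairs and pair_at are hypothetical, and it assumes both files are UTF-8 text with exactly one record per line.

```python
from itertools import islice
from typing import Iterator, Tuple


def read_pairs(snippets_path: str, comments_path: str) -> Iterator[Tuple[str, str]]:
    """Yield (snippet, comment) tuples, pairing the two files line by line.

    Assumption: both files are UTF-8 and hold one record per line.
    zip() stops at the end of the shorter file, so a length mismatch
    silently truncates the result.
    """
    with open(snippets_path, encoding="utf-8") as snippets, \
            open(comments_path, encoding="utf-8") as comments:
        for snippet, comment in zip(snippets, comments):
            yield snippet.rstrip("\n"), comment.rstrip("\n")


def pair_at(snippets_path: str, comments_path: str, line_number: int) -> Tuple[str, str]:
    """Return the (snippet, comment) pair found at a 1-based line number."""
    pairs = read_pairs(snippets_path, comments_path)
    # islice skips the first line_number - 1 pairs without loading the files.
    return next(islice(pairs, line_number - 1, None))
```

Under these assumptions, pair_at("snippets.dat", "comments.dat", 615) would return the snippet on line 615 together with its comment. Streaming the pairs avoids holding the whole 88 MB dataset in memory at once.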

Mining the data

Test

Languages

Shell 92.2%, PHP 5.4%, Python 2.3%