ThilinaRajapakse / simpletransformers

Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI

Home Page: https://simpletransformers.ai/


Feedback on new documentation hosted on Github Pages

ThilinaRajapakse opened this issue · comments

As Simple Transformers grows, the single page README documentation has gotten quite bloated and difficult to use. Because of this, I've decided that it's time (if not a little late already) to move the documentation to a more user-friendly Github Pages hosted website at the link below.

https://thilinarajapakse.github.io/simpletransformers/

As of now, only the text classification section is live, but it should be enough to give an idea of what the final documentation will look like. If you have any feedback, ideas, concerns, or mistakes/typos to report, I'd love to hear from you. Since the docs are still being written, incorporating feedback and fixing issues will be much easier at this stage!

I think this is great and it was the right move. I would now redo the README to strip out most of its content and keep a clean file that links to the documentation chapters, for people who want in-depth explanations of specific topics.

Great!

Yes, I agree. Once the website is ready, the README should be trimmed down to the basics, with links to the docs. As a rough idea, I think the setup instructions, a clear link to the docs, some of the minimal start examples (not sure about these), the acknowledgements, and the contributors' section should be enough.

It would be helpful to have the sample scripts log something to console that we can verify our results against. Currently not sure if my setup is working because I don't know what values to expect.

Good point. I'll add the expected outputs to the scripts so that users can check against them. The outputs probably won't match exactly, though. I'll also add links to the Medium articles, as they are real-world examples with verifiable results.
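A minimal sketch of what that could look like in a sample script (the `eval_model` call shown in the comment follows the Simple Transformers classification API; the metric names and numbers here are purely illustrative assumptions):

```python
# Hypothetical sketch: end each sample script by printing its final metrics
# so users can sanity-check their setup against reference values in the docs.
# With Simple Transformers, evaluation returns a metrics dict, e.g.:
#   result, model_outputs, wrong_predictions = model.eval_model(eval_df)
result = {"mcc": 0.48, "eval_loss": 0.52}  # illustrative numbers only

# Print metrics in a stable order so runs are easy to compare.
for metric, value in sorted(result.items()):
    print(f"{metric}: {value:.4f}")
```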

Hello sir, I am facing trouble while running the ConvAI code on Google Colab.
I am unable to run model.train_model(); the root cause is CUDA running out of memory.

I think it would be great if you also added a few words about unbalanced datasets. I'm new to this, and I would like to understand whether my dataset for multi-class classification needs to be balanced or not. Thank you!

> Hello sir, I am facing trouble while running the ConvAI code on Google Colab. I am unable to run model.train_model(); the root cause is CUDA running out of memory.

You can try lowering the train_batch_size.

P.S. Please make your comment on a related issue (or a new issue if no related issue exists)
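A hedged sketch of the suggestion above (the ConvAI model name and the gradient accumulation trade-off are assumptions; check the docs for the defaults in your version):

```python
# Hypothetical sketch: shrink train_batch_size to avoid CUDA OOM on Colab,
# and raise gradient_accumulation_steps to keep the effective batch size.
train_args = {
    "train_batch_size": 2,             # smaller batches use less GPU memory
    "gradient_accumulation_steps": 4,  # effective batch size = 2 * 4 = 8
    "max_seq_length": 128,             # shorter sequences also save memory
}

# Requires simpletransformers and the ConvAI data; shown for illustration:
# from simpletransformers.conv_ai import ConvAIModel
# model = ConvAIModel("gpt", "gpt_personachat_cache", args=train_args)
# model.train_model()

effective_batch_size = (
    train_args["train_batch_size"] * train_args["gradient_accumulation_steps"]
)
```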

> I think it would be great if you also added a few words about unbalanced datasets. I'm new to this, and I would like to understand whether my dataset for multi-class classification needs to be balanced or not. Thank you!

Thank you for your suggestion! While I agree that it will be useful, that sort of information is generic to deep learning and not specific to Simple Transformers. Because of this, I feel that adding this kind of information is going to make the whole thing too complicated.

Regarding unbalanced datasets, it really depends on a lot of factors. Generally speaking, if your classes can be clearly differentiated and you don't have too many labels, you can usually get away with unbalanced data. If one or more of the classes only have a handful of samples, the model might not learn to predict those. One way to deal with such issues is to use class weights as described here.
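For illustration, one common recipe is inverse-frequency class weights; the `weight` parameter in the comment follows the Simple Transformers classification API, but treat the exact call as an assumption for your version:

```python
from collections import Counter

# Toy multi-class label list with a heavily under-represented class 2.
labels = [0, 0, 0, 0, 1, 1, 2]
counts = Counter(labels)
n_samples, n_classes = len(labels), len(counts)

# Inverse-frequency weighting: rarer classes get proportionally larger
# weights, so mistakes on them cost more during training.
weights = [n_samples / (n_classes * counts[c]) for c in sorted(counts)]

# Requires simpletransformers; shown for illustration:
# from simpletransformers.classification import ClassificationModel
# model = ClassificationModel("roberta", "roberta-base",
#                             num_labels=n_classes, weight=weights)
```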

@ThilinaRajapakse: Oh, I'm sorry. This is exactly what I needed; I just wasn't searching for the right term! Thank you!!

Nothing to be sorry about, we've all been there! 🤷‍♂️

@ThilinaRajapakse any reason not to use sphinx and readthedoc?

There's no objective reason. But, subjectively and in no particular order,

  • I don't want to deal with sphinx
  • Jekyll seems to have the best support on GitHub pages
  • sphinx + readthedocs looks a little dated 🤷‍♂️

Hey! It'd be helpful for the installation page to specify how to do a minimal isolated install (for a container or GitHub Actions), ideally without using anaconda.

This is specifically for a forward-propagation-only workflow that will run CPU only and only for a handful of inputs, so dealing with GPU/drivers/etc isn't important, and having a quick install is important as the environment needs to be recreated from scratch each go.
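For what it's worth, a minimal CPU-only install along those lines might look like this (the CPU wheel index URL is the one PyTorch publishes; pin versions as needed for reproducible CI runs):

```shell
# Minimal, CPU-only install in an isolated virtual environment (no conda).
python -m venv .venv
. .venv/bin/activate

# Install the CPU-only PyTorch wheels first so pip doesn't pull CUDA builds.
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install simpletransformers
```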

Hey. Not sure if this is the right space or if I should raise an issue elsewhere. In the new docs on the site, the 'Configuring the classification model' section needs a small correction. For the arguments lazy_text_a_column and lazy_text_b_column, the description should read "for lazy loading sentence pair datasets" instead of "single sentence datasets", if I'm not mistaken.

I'm trying to contribute to the docs on the GitHub Pages site, but I'm struggling to figure out how to render them locally to see my changes. I think the final version of the README (with the docs removed) should contain the steps to render the docs locally.


I figured out how to do it. If you want, I can open a PR adding the instructions to the README, @ThilinaRajapakse.

@pablonm3 I would like to contribute to the docs too. How would I do it?

> I'm trying to contribute to the docs on the GitHub Pages site, but I'm struggling to figure out how to render them locally to see my changes. I think the final version of the README (with the docs removed) should contain the steps to render the docs locally.

> I figured out how to do it. If you want, I can open a PR adding the instructions to the README, @ThilinaRajapakse.

That would be great! I agree that it's a little confusing. My web development skills are pretty mediocre so I've had trouble setting it up myself. 😅

We can put it in a proper contributions guideline later on.
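For reference, the local-preview workflow for a Jekyll-based GitHub Pages site typically looks roughly like this (assuming the site sources live in a docs/ folder with a Gemfile; adjust paths to the repo's actual layout):

```shell
# Serve the Jekyll docs locally at http://127.0.0.1:4000 for previewing edits.
cd docs
gem install bundler        # once, if bundler isn't already installed
bundle install             # install the gems pinned in the Gemfile
bundle exec jekyll serve   # rebuilds automatically as files change
```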


@ThilinaRajapakse I plan to write a small guide on how to edit the docs. Do you think it should be included in the repo's README or in the Jekyll docs?

I think it's better to have it in the repo as that's the place where people will look when they want to contribute. I'm open to other suggestions though.

I think the same

> @pablonm3 I would like to contribute to the docs too. How would I do it?

@aakashdusane I just opened a PR with the instructions: #605

@ThilinaRajapakse what are the remaining tasks for getting rid of the docs from the readme?

  • Multi-Modal Classification
  • Language Generation
  • ConvAI

I think that's all the tasks.

Thanks @ThilinaRajapakse, I'll try to work on a PR this week to start moving some of the remaining docs.

Sounds good, thanks. Just a heads up, but I might make my own changes to any submitted docs.


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.