rsangole / capstone_project

Predict 498 Capstone Project

Develop data dictionary

rsangole opened this issue

commented

We need to put together a data dictionary covering all the data we will be consuming. This can be an Excel sheet or an R data frame.

  • data dictionary - what data, column definitions, usage comments etc
  • entity relationship diagram
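
As a starting point for the first item, here is a minimal sketch of how a dictionary skeleton could be generated from a table and then filled in by hand. It uses only the Python standard library. The function names, the sample rows, and the column names (`trap_id`, `test_date`, `result`) are all made up for illustration and are not taken from our actual files.

```python
import csv
import io


def build_data_dictionary(rows, source_name):
    """Summarise each column of `rows` (a list of dicts) as one dictionary entry."""
    if not rows:
        return []
    entries = []
    for column in rows[0].keys():
        values = [row.get(column) for row in rows]
        non_missing = [v for v in values if v not in (None, "")]
        entries.append({
            "source": source_name,
            "column": column,
            "inferred_type": type(non_missing[0]).__name__ if non_missing else "unknown",
            "n_missing": len(values) - len(non_missing),
            "example": non_missing[0] if non_missing else "",
            "definition": "",      # to be written by hand
            "usage_comments": "",  # to be written by hand
        })
    return entries


def to_csv(entries):
    """Render the entries as CSV text so they can be opened in Excel."""
    if not entries:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(entries[0].keys()))
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


# Hypothetical sample rows, not real project data.
sample_rows = [
    {"trap_id": "T001", "test_date": "2017-06-01", "result": 1},
    {"trap_id": "T002", "test_date": "2017-06-01", "result": None},
]
dictionary = build_data_dictionary(sample_rows, "trap_test_results (hypothetical)")
print(to_csv(dictionary))
```

The `definition` and `usage_comments` columns are left blank on purpose, since those need a person who knows the data.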
commented

I have created a quick view into the data I see in processed:

https://github.com/rsangole/capstone_project/blob/master/src/eda/00-Data%20Dictionary.ipynb

I must admit, @andrew3cooper, I am a bit confused by some of the files and how we use them. When you're back online, could you please help me out?

commented

@andrew3cooper, given that you've created the new datasets, I'm assigning this task to you.

I made some updates, focusing on the main trap test result dataset. I'm also working on an entity-relationship diagram in LucidChart but may end up just sketching one on paper and uploading a scanned image.

commented

Sounds good, @andrew3cooper. A sketch works for now. An easy, code-based way of creating these diagrams is plantuml.com. We use it at work and it's damn simple to use. If you create a sketch, I can code it into PlantUML.
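
To give an idea of what the code-based route could look like, here is a rough sketch that builds PlantUML entity-relationship text from a small Python structure. The entity names, columns, and the one-to-many relationship are placeholders I made up, not our real schema, and `to_plantuml` is a hypothetical helper. The output can be pasted into the PlantUML editor to render.

```python
def to_plantuml(entities, relationships):
    """Build PlantUML ER diagram text.

    entities: dict mapping entity name -> list of (column, is_key) tuples.
    relationships: list of (one_side, many_side) entity-name pairs.
    """
    lines = ["@startuml"]
    for name, columns in entities.items():
        lines.append(f"entity {name} {{")
        for column, is_key in columns:
            # PlantUML marks an attribute as mandatory with a leading "*".
            prefix = "* " if is_key else ""
            lines.append(f"  {prefix}{column}")
        lines.append("}")
    for one_side, many_side in relationships:
        # "||--o{" is crow's-foot notation for exactly one to zero-or-many.
        lines.append(f"{one_side} ||--o{{ {many_side}")
    lines.append("@enduml")
    return "\n".join(lines)


# Placeholder schema for illustration only.
entities = {
    "trap": [("trap_id", True), ("location", False)],
    "trap_test_result": [("result_id", True), ("trap_id", False), ("result", False)],
}
relationships = [("trap", "trap_test_result")]

diagram = to_plantuml(entities, relationships)
print(diagram)
```

Keeping the schema as plain data means the diagram can be regenerated whenever the data dictionary changes.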

commented

Just saw your data dictionary Excel file. It looks nice and detailed, Andrew. Good work! I'm marking the first to-do above as complete.

commented

Made a few changes to the data dictionary Excel file and pushed them.

commented

@andrew3cooper, the only thing pending on this task is the ERD. I will close it once that is complete.