TensorMatics-opensource / ai-deeplearning

ai-deeplearning

Tutorials for implementing deep learning in the following scenarios:-

  1. Text Classification - please visit here
  2. Text NER - please visit here
  3. Text Sentiment Analysis - Coming Soon

A little background on the project

India is a vast country with problems in proportion to its size. The government’s official think tank, NITI Aayog, released a paper last year identifying five sectors for AI intervention in India: healthcare, agriculture, education, smart cities and smart mobility.

"AI can help solve some of the most difficult social and environmental challenges in areas like healthcare, disaster prediction, environmental conservation, agriculture, or cultural preservation," said Google's AI head Jeff Dean. Read more: https://timesofindia.indiatimes.com/business/international-business/the-next-big-frontier-using-ai-to-solve-social-issues/articleshow/70167752.cms

I (www.linkedin.com/in/puneetjindaisb) have spent 8+ years dealing with the 5 Vs of big data in different domains and firmly believe in AI for social good. I wanted to crowdsource efforts toward that change, which led to setting up the Eduwaive Foundation to drive further thinking. After interacting with 2k+ researchers and beginners, Labellerr came to understand that this is not just a business problem. At its core it is a technological and educational barrier that has to be removed if we truly want to change the AI ecosystem. What better way than for passionate researchers to come together as a community, making it easy for even the most inexperienced users to solve problems literally at the click of a button, while letting them keep control over customization if they want it?

This open-source program is funded by the www.labellerr.com team under the TensorMatics community initiative for the research community (academia and industry), to boost the AI ecosystem and make AI highly accessible and easy to use for problem solving.

People who are interested in learning and contributing can do so in 3 ways:-

  1. User Experience (Product Management)
  2. Framework Evangelist (Sales, Marketing and Operations)
  3. Software Contributor (Data Science and Software Engineering)

You can choose an IC or mentor role in any one of the above departments.

Some of the roles and responsibilities might include:-

  1. Devising and executing primary and secondary data collection strategies, including for statistical analysis (hypothesis testing, regression analysis) and UX design research
  2. Data management and engineering, including ETL
  3. Leveraging statistical models and various ML and deep learning techniques to build ML models
  4. Learning software engineering best practices and applying them to shape your use case
  5. Learning AI and UX through a single lens
  6. Using front-end technologies like HTML, CSS, JS and D3.js to deliver data visualizations in the most intuitive way for users
  7. Documenting all the work done so that it is presentable and enables thousands of researchers like you to solve the same or a similar use case easily in 1/30 of the time

A typical example, taking the object detection use case:- Object detection is one of the most popular use cases and is in heavy use right now, but on its own it is not as valuable as it is when combined with image segmentation for OCR. For such a use case, a contributor would work through the following:

  1. Identify how many blogs have already been written on this.
  2. What pain points do these blogs address, and which do they not?
  3. Which user personas need to build this, and in what scenarios?
  4. Is any partial or full solution already available, and what pain points does it have?
  5. List the assumptions.
  6. Do a feasibility study on costs, effort, users and technology.

While solving AI use cases, we need to keep AI design principles in mind, for which we will take inspiration from Google's AI principles. In short, we go beyond just importing libraries and training models on big data.

  1. Data collection is not easy.
  2. ETLs are not easy to write.
  3. Data management is a pain.
  4. Which library to use for model training is not an easy question to answer.
  5. Model deployment and debugging are all Greek to the majority of novice consumers of deep learning.

Solutions have recently come to market, but they still do not fill the above gaps. None yet lets a researcher run an end-to-end process in a day and focus on the research problem instead of struggling to learn deep learning in detail, especially when time constraints mean most of the available time should go to the domain rather than to software technology.

The million-dollar question is: how can I solve my problem in a day instead of 30 to 90 days or, even worse, giving up without solving it?

To solve this, we have come up with a base-level framework on which we shall develop the usable layer. Here you will play the role of a full-stack data scientist who does the following, whether they already know how or learn along the way:-

  1. Think about and study the domain and the possible ways to collect data. It is not easy, but your job is to make the framework user's job easy by writing bots, crawlers and APIs using tools like Puppeteer, Scrapy, Selenium, Node.js, Django and proxies.
  2. Data management and model building, using third-party clouds to handle this easily, including free credits for research work and IaaS/PaaS/SaaS services that make this middle layer, including deployment, very easy.
  3. Data labelling with open-source frameworks like LabelImg, Labellerr, Labelbox, etc.
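The steps above can be pictured as a chain of small stages. The following is only a minimal sketch using the Python standard library, not the project's actual framework: the function names (`collect`, `clean`, `label`, `train`, `predict`), the hard-coded sample texts and the keyword-based labeller are all hypothetical stand-ins for a real crawler, ETL job, labelling tool and model.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Record:
    text: str
    label: Optional[str] = None  # filled in by the labelling step


def collect() -> List[str]:
    # Stand-in for a crawler or API client (assumption): hard-coded raw strings.
    return [
        "  Crop yield FELL after drought ",
        "New clinic opens in village",
        "",
        "Clinic adds new doctors",
    ]


def clean(raw: List[str]) -> List[Record]:
    # ETL step: trim whitespace, lowercase, drop empty rows.
    return [Record(text=t.strip().lower()) for t in raw if t.strip()]


def label(records: List[Record], labeller: Callable[[str], str]) -> List[Record]:
    # Labelling step: a function stands in for a human using a labelling tool.
    return [Record(text=r.text, label=labeller(r.text)) for r in records]


def train(records: List[Record]) -> Dict[str, Counter]:
    # Toy "model": word counts per label, not a deep learning model.
    counts: Dict[str, Counter] = defaultdict(Counter)
    for r in records:
        counts[r.label].update(r.text.split())
    return dict(counts)


def predict(model: Dict[str, Counter], text: str) -> str:
    # Score each label by how often it has seen the words in the text.
    words = text.lower().split()
    scores = {lab: sum(c[w] for w in words) for lab, c in model.items()}
    return max(scores, key=scores.get)


def run_pipeline() -> Dict[str, Counter]:
    # Hypothetical keyword rule used in place of manual labelling.
    def toy_labeller(text: str) -> str:
        return "healthcare" if "clinic" in text else "agriculture"

    return train(label(clean(collect()), toy_labeller))


if __name__ == "__main__":
    model = run_pipeline()
    print(predict(model, "clinic doctors"))
```

Keeping each stage behind its own function means any one of them can later be swapped for a real tool without touching the others.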

If the above makes sense to you, please register at www.eduwaive.org so that we can proceed to the next steps: applying to the open-source internship run by the TensorMatics Labellerr team.

Please note that the above details are subject to change. For any further queries, write to us at puneet.jindal@tensormatics.com