AashiDutt / Sarvadnya

This repo is a collection of various PoCs (Proof-of-Concepts) to interface custom data using LLMs.

Sarvadnya (सर्वज्ञ), an All-Knowing Micro SaaS!!

Chatbots can be a real WoW!! The recent evidence is ChatGPT: with the latest LLMs (Large Language Models), chatbots have become far more human-like. But these LLMs are pretrained on their own (HUGE) data, and mere mortals lack the means ($$, time, expertise) to train their own. Some LLMs can be fine-tuned on a custom corpus, but only to a limited extent, and many providers now offer custom fine-tuning on text documents. This repo is a journey to master the powerful and sought-after paradigm of RAG (Retrieval Augmented Generation), along with multi-modal fine-tuning techniques and Knowledge Graphs (KG). The combination of LLM and RAG (with KG) is an IKIGAI: something the world needs, is willing to pay for, and that you are good at and enjoy. This aligns with Naval's suggestion of cultivating 'Specific Knowledge': a unique skill set that cannot be trained for and is possessed by few.

What?

PoC demos

  • Cover various modalities, use cases and domains for chatbots
  • Prepare videos, Medium posts (GDE/TH), LinkedIn posts and a YouTube channel
  • Make a webpage to store them as a portfolio, open source on GitHub [Contrary thought: let the LinkedIn page itself be the webpage, so there is no extra URL to maintain]

MicroSaaS

  • Learn an end-to-end hosting platform (LangServe? LangChain with Azure AI?)
  • Convert the above demos to accept user input (with disclaimers, limited uploads, $$)

Trainings

  • Make intro videos of all the workshops and courses (use GDE video material)
  • Show open source course material, always updating

Research

  • MidcurveNN: graph to graph transformation learning
  • Indic Languages: OCR for the Sharada script, tokenizer for Sanskrit

Why?

Chatbot on Own Data

  • Fine-tuning LLMs with own data using LoRA etc
  • Retrieval Augmented Generation (RAG) on own data
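
To make the LoRA bullet above concrete, here is a minimal sketch in plain Python (no ML library) of the low-rank update idea: the pretrained weight matrix `W` stays frozen and only two small matrices `A` and `B` are trained. The dimensions, the `alpha` scaling value and all function names are illustrative assumptions, not code from this repo.

```python
import random

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha):
    """Compute W x + (alpha / r) * B (A x), where r is the adapter rank."""
    r = len(A)                        # A has shape r x d_in
    base = matvec(W, x)               # frozen pretrained path
    update = matvec(B, matvec(A, x))  # trainable low-rank path
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

def parameter_counts(d_in, d_out, r):
    """Return (parameters in full W, trainable parameters in A and B)."""
    return d_in * d_out, r * (d_in + d_out)

# Illustrative sizes only (assumptions for this sketch).
random.seed(0)
d_in, d_out, r, alpha = 8, 6, 2, 4.0
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]  # B starts at zero, so the adapter is a no-op at first
x = [1.0] * d_in

adapted = lora_forward(W, A, B, x, alpha)
full, trainable = parameter_counts(4096, 4096, 8)
```

With `B` initialised to zero the adapted output equals the base output, and for a hypothetical 4096 x 4096 layer at rank 8 the trainable parameter count drops from 16,777,216 to 65,536. That reduction is why this style of fine-tuning is within reach of small budgets.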

Retrieval Augmented Generation (RAG)

  • WHY?: The world needs it and is ready to pay for it, I am good at it, and I like making wow chatbots
  • Domain:
    • on knowledge graphs, more grounding
    • tabular financial data, representation and similarity
    • MidcurveNN geometric serialisation and retrieval
    • active loop idea of fine-tuning your data
    • LangChain and LlamaIndex with any new LLM
    • BharatGPT, Bhashini with Sanskrit; do a prototype on Arthashastra principles
  • Specific Knowledge - LLMs, Graphs, Sanskrit
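
As a rough illustration of what RAG on own data means, the sketch below retrieves the most relevant snippet for a question and places it into the prompt as grounding context. It is a toy under stated assumptions: word-count vectors stand in for real embeddings, the documents are made up, and the final LLM call is left out. A real pipeline would typically use an embedding model and a vector store instead.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, docs, k=2):
    """Return up to k documents most similar to the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]

def build_prompt(query, context_docs):
    """Assemble a grounded prompt; the LLM call itself is out of scope here."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical "own data" for illustration only.
docs = [
    "Invoices are payable within 30 days of the billing date.",
    "The midcurve of a thin polygon is a one-dimensional skeleton.",
    "Refunds are issued within 7 days of a cancellation request.",
]
query = "When are invoices payable?"
top = retrieve(query, docs, k=1)
prompt = build_prompt(query, top)
```

The grounding comes from the prompt carrying the retrieved text, so the model answers from the supplied documents without having been trained on them.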

Tech Stacks

  • Enterprise: Google Doc AI, Vertex AI, Microsoft Azure Language AI Services
  • Open Source: LangChain (Serve/Smith/Graph), HuggingFace, Streamlit for UI

Why LangChain? Unofficial Developer Advocate

  • Local (secure), no over-the-net API/web calls
  • Open source, Free via HuggingFace, Contrib possible
  • PoC to Prod, end-to-end
  • Python!! end-to-end, with Streamlit as UI
  • Huge support, community, opportunities
  • Coach: write/talk about it via Medium Stories, Webinars, LinkedIn posts (Mvp ++, advocu ++)
  • Passive MicroSaaS income, pay per use, Integrations

Product Specs:

Checklist: MicroSaaS

  • Do you have an unfair advantage?
    • Network of founders and influencers, for further reach
    • Audience: folks who want this app and can pay
    • Being early
  • Start With a Problem or many problems (don’t tell me your ideas)
  • Move from Problems to Solutions, easy, debuggable
  • Evaluate Your Solutions
  • How is Your Solution Different?
  • Talk to Potential Customers
  • Start Marketing Before Coding
  • Build MVP
  • Does it solve a specific need (pain point), and not anything-and-everything?
  • Is it for specific people: 1000 true (paying) fans at, say, $30 or $3 a month?
  • Is it a daily need?
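
The "1000 true fans" item in the checklist above is simple arithmetic, sketched here so the two price points can be compared. The function names are hypothetical, and the figures ignore churn, fees and taxes.

```python
import math

def monthly_recurring_revenue(subscribers, price_per_month):
    """Gross monthly revenue, ignoring churn, fees and taxes."""
    return subscribers * price_per_month

def subscribers_needed(target_monthly_revenue, price_per_month):
    """Smallest number of subscribers that reaches the target revenue."""
    return math.ceil(target_monthly_revenue / price_per_month)

mrr_at_30 = monthly_recurring_revenue(1000, 30)   # 30000 dollars a month
mrr_at_3 = monthly_recurring_revenue(1000, 3)     # 3000 dollars a month
fans_at_3_for_same = subscribers_needed(mrr_at_30, 3)
```

At $3 a month it takes ten times as many paying fans to match the revenue of the $30 tier, which is one way to read the checklist's emphasis on a specific, willing-to-pay audience.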

LangChain MicroSaaS

  1. Focused Development: Micro SaaS businesses focus on serving a niche market or a specific customer segment with a highly targeted software solution¹⁰. This allows for focused development and targeted marketing⁹.

  2. Cost-Effective: Micro SaaS businesses operate with minimal resources, leveraging cloud infrastructure and automation tools to streamline operations and keep costs low¹⁰. RAG offers an affordable, secure, and explainable alternative to general-purpose LLMs, drastically reducing the likelihood of hallucination⁴.

  3. Customized Solutions: RAG allows businesses to achieve customized solutions while maintaining data relevance and optimizing costs⁶. By adopting RAG, companies can use the reasoning capabilities of LLMs, utilizing their existing models to process and generate responses based on new data⁴.

  4. Integration with LangChain: LangChain is a framework designed to simplify the creation of applications using LLMs¹. It can dynamically connect different systems, chains, and modules to use data and functions from many sources, like different LLMs¹. This allows businesses to develop language model-powered software applications that can carry out various activities, including code analysis, document analysis, and summarization².

  5. Data-Aware and Agentic: LangChain is data-aware and agentic, enabling connections with various data sources for richer, personalized experiences³. This allows for better interoperability across the board, offering various valuable tools that allow businesses to connect to different vendors (including other LLMs) and integrations with a comprehensive collection of open-source components¹.

  6. Access to Various LLM Providers: LangChain offers access to LLMs from various providers such as OpenAI, Hugging Face, Cohere and AI21 Labs, among others¹. These models can be accessed through API calls using platform-specific API tokens, allowing developers to leverage their advanced capabilities to build as they see fit¹.

  7. Recurring Profits and Low Risk: With recurring profits, lower capital needs, low risk, dedicated customers and minimal operating expenses, Micro SaaS has attracted many entrepreneurs in recent years¹⁴.

  8. Stable Recurring Income: Micro-SaaS businesses are usually location-independent and can be a source of stable recurring income once the product has achieved a loyal user base¹¹.
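
Points 4 and 6 above describe connecting steps into chains and swapping LLM providers. The sketch below illustrates that idea in plain Python only. It is not LangChain's API: the `chain` helper, the `PROVIDERS` registry and the two stub "models" are hypothetical stand-ins, so that the example runs without any API token.

```python
def chain(*steps):
    """Compose steps left to right: the output of one is the input of the next."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run

def make_prompt(inputs):
    """Turn a dict of inputs into a prompt string."""
    return f"Summarize in one line: {inputs['text']}"

def stub_model(prompt):
    """Hypothetical stand-in for a hosted LLM; echoes the text after the instruction."""
    return "SUMMARY: " + prompt.split(": ", 1)[1]

def shout_model(prompt):
    """A second hypothetical provider, to show that providers are swappable."""
    return "SUMMARY: " + prompt.split(": ", 1)[1].upper()

def parse(output):
    """Strip the model's prefix, leaving only the answer text."""
    prefix = "SUMMARY: "
    return output[len(prefix):] if output.startswith(prefix) else output

# Registry of providers; real ones would wrap API calls made with a token.
PROVIDERS = {"stub": stub_model, "shout": shout_model}

def build_app(provider_name):
    """Build prompt -> model -> parser for the chosen provider."""
    return chain(make_prompt, PROVIDERS[provider_name], parse)

result_stub = build_app("stub")({"text": "hello"})
result_shout = build_app("shout")({"text": "hello"})
```

Only the middle step changes between the two apps, which is the practical benefit the list attributes to a chaining framework: the prompt and parsing logic stay the same when the provider is replaced.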

Sources:

  (1) What is Micro SaaS And How to Create One In 2024. https://bufferapps.com/blog/what-is-micro-saas/
  (2) Complete Guide to Micro-Saas: Build a Profitable Business. https://blog.payproglobal.com/micro-saas-guide
  (3) RAG and LLM business process automation: A technical strategy. https://blog.griddynamics.com/retrieval-augmented-generation-llm/
  (4) Retrieval Augmented Generation using Azure Machine Learning prompt flow ... https://learn.microsoft.com/en-us/azure/machine-learning/concept-retrieval-augmented-generation?view=azureml-api-2
  (5) What is LangChain: How It Enables Businesses to Do More with LLMs. https://www.bluelabellabs.com/blog/what-is-langchain/
  (6) LangChain: A New Era of Business Innovation - Medium. https://medium.com/@tvs_next/langchain-a-new-era-of-business-innovation-7207a44382c9
  (7) What is LangChain? A Beginners Guide With Examples - Enterprise DNA Blog. https://blog.enterprisedna.co/what-is-langchain-a-beginners-guide-with-examples/
  (8) Top 25 Profitable Micro SaaS Business Ideas in 2022 - StartupTalky. https://startuptalky.com/micro-saas-business-ideas/
  (9) Building a Micro-SaaS: Best Tools and Platforms In 2022 - Saastitute. https://www.saastitute.com/blog/building-a-micro-saas-best-tools-and-platforms
  (10) Improve LLM responses in RAG use cases by interacting with the user. https://aws.amazon.com/blogs/machine-learning/improve-llm-responses-in-rag-use-cases-by-interacting-with-the-user/
  (11) An introduction to RAG and simple/ complex RAG - Medium. https://medium.com/enterprise-rag/an-introduction-to-rag-and-simple-complex-rag-9c3aa9bd017b
  (12) Concept of RAG (Retreival-Augmented Generation) in LLM. https://blog.devgenius.io/concept-of-rag-retreival-augmented-generation-in-llm-4f878251b4d1
  (13) How To Build a Profitable Micro-SaaS Business in 2024 - BufferApps. https://bufferapps.com/blog/how-to-build-a-micro-saas/
  (14) Top 10 Micro SaaS Ideas To Build a Profitable Business in 2024. https://controlhippo.com/blog/micro-saas/

Bottom-line

  • Not looking for Success, but Wonder!!
  • तमसो मा ज्योतिर्गमय : From Dark (hidden in text data) to Light (insights)

Folks to Follow

Publications so far

References

License: MIT License

Languages: Jupyter Notebook 90.2%, HTML 5.2%, Python 4.6%, Dockerfile 0.0%, Shell 0.0%