aditikhare007 / AI_Research_Junction_Aditi_Khare

AI_Research_Junction@Aditi_Khare - Research paper summaries capturing the latest advancements in Generative AI, Quantum AI, and Computer Vision

Home Page: https://github.com/aditikhare007/AI_Research_Junction_Aditi_Khare

Greetings AI Community 👋 About me

** AWS & AI Research Specialist - Principal Applied AI Product Engineer [Product-Owner] & Enterprise Architect @PepsiCo | IIM-A | Community Member @Landing AI | AI Research Specialist [Portfolio] | Author | Quantum AI | Mojo | Next JS | 7+ Years of Experience in Fortune 50 Product Companies | **

** Global Top AI Community Member @Landing.AI @MLOPS Community @Pandas AI @Full Stack Deep Learning @H2o.ai Generative AI @Modular & @Cohere AI @Hugging Face Research Papers Group @Papers with Code @DAIR.AI **

** Completed 90+ Online Technical Paid Courses from Udemy & Coursera as I believe in Continuous Learning and a Growth Mindset **

** AI Research Junction @Aditi Khare - Research Paper Summaries @Generative AI @Computer Vision @Quantum AI **

** My AI Newsletter-AI Research Junction @Research Papers Summaries @Generative AI, @Computer Vision @Quantum AI **

** Aditi Khare @ AI Research Junction Newsletter **

If you find my content useful, please subscribe to my AI Research Junction Newsletter to support my work. Thank you!

** Welcome to AI Research Junction@Aditi Khare-Research Papers Summaries @Generative AI @Computer Vision @Quantum AI **

** Gen AI Research Papers Summaries **

** JAN 2024 **

  1. Amazon's Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models - https://arxiv.org/abs/2401.13795 - Computer Vision and Pattern Recognition
  2. Quantum-Inspired Machine Learning for Molecular Docking - https://arxiv.org/abs/2401.12999 - Quantum Computing
  3. ChatQA - NVIDIA's GPT-4 Level Conversational QA Models - https://arxiv.org/pdf/2401.10225v1.pdf - Generative AI
  4. Meta's Self-Rewarding Language Models - https://arxiv.org/abs/2401.10020 - Generative AI
  5. Chainpoll - A high efficacy method for LLM hallucination detection - https://arxiv.org/pdf/2310.18344v1.pdf
  6. AI-Optimized-Catheter-Design-could-prevent-urinary-tract-infections-without-drugs/ - https://www.scientificamerican.com/article/ai-optimized-catheter-design-could-prevent-urinary-tract-infections-without-drugs/
  7. TrustLLM - Trustworthiness in Large Language Models - https://arxiv.org/abs/2308.05374
  8. LLaMA Pro: Progressive LLaMA with Block Expansion - https://arxiv.org/abs/2401.02415
  9. DeepSeekMoE - Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models - https://arxiv.org/abs/2401.06066
  10. VAST AI releases Triplane Meets Gaussian Splatting on Hugging Face - Fast and Generalizable Single-View 3D Reconstruction with Transformers Demo - https://arxiv.org/abs/2312.09147
  11. Masked Audio Generation using a Single Non-Autoregressive Transformer - https://arxiv.org/abs/2401.04577

** DEC 2023 **

  1. 11th Dec 2023 - ** Mistral-embed - An embedding model with a 1024 embedding dimension; achieves 55.26 on MTEB ** - https://mistral.ai/news/mixtral-of-experts/
  2. 11th Dec 2023 - ** LLM360 - Fully Transparent Open-Source LLMs ** - https://arxiv.org/pdf/2312.06550.pdf
  3. 12th Dec 2023 - ** Mathematical Language Models: A Survey ** - https://arxiv.org/abs/2312.07622
  4. 13th Dec 2023 - ** PromptBench: A Library for Evaluation of Large Language Models ** - https://arxiv.org/pdf/2312.07910.pdf
  5. 1st Dec 2023 - ** Mamba: Linear-Time Sequence Modeling with Selective State Spaces ** - https://arxiv.org/ftp/arxiv/papers/2312/2312.00752.pdf
  6. 14th Dec 2023 - ** Distributed Representations of Words and Phrases and their Compositionality (Word2vec) ** - https://papers.nips.cc/paper_files/paper/2013/file/9aa42b31882ec039965f3c4923ce901b-Paper.pdf
  7. 11th Dec 2023 - ** Beyond Human Data - Scaling Self-Training for Problem-Solving with Language Models ** - https://arxiv.org/abs/2312.06585
    • Approach for self-training with feedback that can substantially reduce dependence on human-generated data.
    • Combining model-generated data with a reward function improves the performance of LLMs on problem-solving tasks.

** NOV 2023 **

  1. 9th Nov 2023 - A Survey of Large Language Models in Medicine: Principles, Applications, and Challenges - https://arxiv.org/abs/2311.05112
  2. System 2 Attention - Leverages the reasoning and instruction-following capabilities of LLMs to decide what to attend to; it regenerates the input context to include only the relevant portions before attending to the regenerated context to elicit the final response from the model; increases factuality and outperforms standard attention-based LLMs on tasks such as QA and math word problems (a minimal sketch of the two-pass idea follows this list) - https://arxiv.org/abs/2311.11829
  3. Advancing Long-Context LLMs - Overview of the methodologies for enhancing Transformer architecture modules that optimize long-context capabilities across all stages from pre-training to inference - https://arxiv.org/abs/2311.12351
  4. Parallel Speculative Sampling - Approach to reduce inference time of LLMs based on a variant of speculative sampling and parallel decoding; achieves significant speed-ups (up to 30%) by learning as little as O(d_emb) additional parameters - https://arxiv.org/abs/2311.13581
  5. Mirasol3B - Multimodal model for learning across audio, video, and text which decouples the multimodal modeling into separate, focused autoregressive models; the inputs are processed according to their modalities; this approach can handle longer videos than other models and outperforms state-of-the-art approaches on video QA, long video QA, and an audio-video-text benchmark - https://arxiv.org/abs/2311.05698
  6. GPQA - Proposes a graduate-level, Google-proof QA benchmark consisting of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry; the strongest GPT-4-based baseline achieves 39% accuracy; this benchmark enables scalable-oversight experiments that can help obtain reliable and truthful information from modern AI systems that surpass human capabilities - https://arxiv.org/abs/2311.12022
  7. Chain-of-Thought Reasoning to Language Agents - Summary of CoT reasoning, the foundational mechanics underpinning CoT techniques, and their application to language agent frameworks - https://arxiv.org/abs/2311.11797
  8. GAIA - A benchmark for general AI assistants consisting of real-world questions that require a set of fundamental abilities such as reasoning, multimodal handling, web browsing, and general tool-use proficiency; shows that human respondents obtain 92% vs. 15% for GPT-4 equipped with plugins - https://arxiv.org/abs/2311.12983
  9. LLMs for Scientific Discovery - Explores the impact of large language models, particularly GPT-4, across various scientific fields including drug discovery, biology, and computational chemistry; assesses GPT-4's understanding of complex scientific concepts, its problem-solving capabilities, and its potential to advance scientific research through expert-driven case assessments and benchmark testing - https://arxiv.org/abs/2311.07361
  10. Fine-Tuning LLMs for Factuality - Fine-tunes a language model for factuality without requiring human labeling; it learns from automatically generated factuality preference rankings and targets open-ended generation settings; significantly improves the factuality of Llama-2 on held-out topics compared with RLHF or decoding strategies targeted at factuality - https://arxiv.org/abs/2311.08401
  11. Contrastive CoT Prompting - Proposes a contrastive chain-of-thought method to enhance language model reasoning; the approach provides both valid and invalid reasoning demonstrations to guide the model to reason step by step while reducing reasoning mistakes; also proposes an automatic method to construct contrastive demonstrations and demonstrates improvements over CoT prompting - https://arxiv.org/abs/2311.09277
  12. A Survey on Language Models for Code - Provides an overview of LLMs for code, including a review of 50+ models, 30+ evaluation tasks, and 500 related works - https://arxiv.org/abs/2311.07989v1
  13. JARVIS-1 - Open-world agent that can perceive multimodal input - https://arxiv.org/abs/2311.05997
  14. Learning to Filter Context for RAG - Proposes a method that improves the quality of the context provided to the generator via two steps: 1) identifying useful context based on lexical and information-theoretic approaches, and 2) training context-filtering models that can filter retrieved contexts at inference; outperforms existing approaches on extractive question answering - https://arxiv.org/abs/2311.08377v1
  15. MART - Proposes an approach for improving LLM safety with multi-round automatic red-teaming; incorporates automatic adversarial prompt writing and safe response generation, which increases red-teaming scalability and the safety of LLMs; the violation rate of an LLM with limited safety alignment is reduced by up to 84.7% after 4 rounds of MART, achieving comparable performance to LLMs with extensive adversarial prompt writing - https://arxiv.org/abs/2311.07689
  16. LLMs can Deceive Users - Explores the use of an autonomous stock-trading agent powered by LLMs; finds that the agent acts upon insider tips and hides the reason behind the trading decision; shows that helpful and safe LLMs can strategically deceive users in a realistic situation without direct instructions or training for deception - https://arxiv.org/abs/2311.07590
  17. Hallucination in LLMs - A comprehensive survey - https://arxiv.org/abs/2311.05232
  18. GPT4All - Outlines technical details of the GPT4All model family along with the open-source repository that aims to democratize access to LLMs - https://arxiv.org/abs/2311.04931
  19. FreshLLMs - Proposes a dynamic QA benchmark - https://arxiv.org/abs/2310.03214
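
A minimal sketch of the two-pass idea behind System 2 Attention (item 2 above), assuming a hypothetical `llm()` completion function standing in for any chat/completion API; this is an illustration of the technique, not the paper's reference implementation:

```python
def llm(prompt: str) -> str:
    """Placeholder for any chat/completion API call."""
    raise NotImplementedError


def system2_attention(context: str, question: str) -> str:
    # Pass 1: ask the model to regenerate the context, keeping only the parts
    # that are relevant (and not biasing) for the question.
    regenerate_prompt = (
        "Extract the parts of the following text that are relevant and "
        f"unbiased for answering the question.\n\nText:\n{context}\n\n"
        f"Question: {question}\n\nRelevant text:"
    )
    filtered_context = llm(regenerate_prompt)

    # Pass 2: answer using only the regenerated context.
    answer_prompt = f"Context:\n{filtered_context}\n\nQuestion: {question}\nAnswer:"
    return llm(answer_prompt)
```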

** OCT 2023 **

  1. Spectron - Approach for spoken language modeling trained end-to-end to directly process spectrograms; it can be fine-tuned to generate high-quality, accurate spoken language; the method surpasses existing spoken language models in speaker preservation and semantic coherence - https://arxiv.org/abs/2305.15255
  2. LLMs Meet New Knowledge - Presents a benchmark to assess LLMs' abilities in knowledge understanding, differentiation, and association, along with benchmark results - https://arxiv.org/abs/2310.14820
  3. Detecting Pretraining Data from LLMs - Explores the problem of pretraining data detection, which aims to determine if a black-box model was trained on a given text; proposes a detection method named Min-K% Prob as an effective tool for benchmark example contamination detection, privacy auditing of machine unlearning, and copyrighted text detection in LMs' pretraining data - https://arxiv.org/abs/2310.16789
  4. ConvNets Match Vision Transformers - Evaluates a performant ConvNet architecture pretrained on JFT-4B at scale; observes a log-log scaling law between the held-out loss and compute budget; after fine-tuning on ImageNet, NFNets match the reported performance of Vision Transformers with comparable compute budgets - https://arxiv.org/abs/2310.16764
  5. Managing AI Risks - Short paper outlining risks from upcoming and advanced AI systems, including an examination of social harms, malicious uses, and other potential societal issues emerging from the rapid adoption of autonomous AI systems - https://managing-ai-risks.com/managing_ai_risks.pdf
  6. Branch-Solve-Merge Reasoning in LLMs - LLM program that consists of branch, solve, and merge modules parameterized with specific prompts to the base LLM; this enables an LLM to plan a decomposition of the task into multiple parallel sub-tasks, independently solve them, and fuse the solutions to the sub-tasks; improves evaluation correctness and consistency for multiple LLMs - https://arxiv.org/abs/2310.15123
  7. LLMs for Software Engineering - A comprehensive survey of LLMs for software engineering, including open research and technical challenges - https://arxiv.org/abs/2310.03533
  8. Self-RAG - Presents a new retrieval-augmented framework that enhances an LM's quality and factuality through retrieval and self-reflection; trains an LM that adaptively retrieves passages on demand, and generates and reflects on the passages and its own generations using special reflection tokens; it significantly outperforms SoTA LLMs - https://arxiv.org/abs/2310.11511
  9. Retrieval-Augmentation for Long-form Question Answering - Explores retrieval-augmented language models on long-form question answering; finds that retrieval is an important component but evidence documents should be carefully added to the LLM; finds that attribution errors happen more frequently when retrieved documents lack sufficient information/evidence for answering the question - https://arxiv.org/abs/2310.12150
  10. A Study of LLM-Generated Self-Explanations - Assesses an LLM's capability to self-generate feature-attribution explanations; self-explanation is useful to improve performance and truthfulness in LLMs; this capability can be used together with chain-of-thought prompting - https://arxiv.org/abs/2310.11207
  11. OpenAgents - Open platform for using and hosting language agents in the wild; includes three agents: a Data Agent for data analysis, a Plugins Agent with 200+ daily API tools, and a Web Agent for autonomous web browsing - https://arxiv.org/abs/2310.10634v1
  12. LLMs can Learn Rules - Presents a two-stage framework that learns a rule library for reasoning with LLMs - https://arxiv.org/abs/2310.07064
  13. Meta Chain-of-Thought Prompting - A generalizable chain-of-thought prompting approach - https://arxiv.org/abs/2310.06692
  14. Improving Retrieval-Augmented LMs with Compressors - Presents two approaches to compress retrieved documents into text summaries before prepending them in-context: 1) an extractive compressor that selects useful sentences from retrieved documents, and 2) an abstractive compressor that generates summaries by synthesizing information from multiple documents; achieves a compression rate as low as 6% with minimal loss in performance on language modeling and open-domain question answering tasks; the proposed training scheme performs selective augmentation, which helps to generate empty summaries when retrieved docs are irrelevant or unhelpful for a task - https://arxiv.org/abs/2310.04408
  15. Retrieval meets Long Context LLMs - Compares retrieval augmentation and long-context windows for downstream tasks to investigate whether the methods can be combined to get the best of both worlds; an LLM with a 4K context window using simple RAG can achieve comparable performance to a fine-tuned LLM with a 16K context; retrieval can significantly improve the performance of LLMs regardless of their extended context window sizes; a retrieval-augmented LLaMA2-70B with a 32K context window outperforms GPT-3.5-turbo-16k on seven long-context tasks including question answering and query-based summarization - https://arxiv.org/abs/2310.03025
  16. StreamingLLM - Framework that enables efficient streaming LLMs with attention sinks, a phenomenon where keeping the KV states of initial tokens largely recovers the performance of window attention; the emergence of the attention sink is due to strong attention scores towards the initial tokens; this approach enables LLMs trained with finite-length attention windows to generalize to infinite sequence length without any additional fine-tuning (a toy illustration of the cache policy follows this list) - https://arxiv.org/abs/2309.17453
  17. The Dawn of LMMs - Comprehensive analysis of GPT-4V to deepen the understanding of large multimodal models - https://arxiv.org/abs/2309.17421
  18. Training LLMs with Pause Tokens - Performs training and inference on LLMs with a learnable token which helps to delay the model's answer generation and attain performance gains on general understanding tasks such as commonsense QA and math word problem-solving; experiments show that this is only beneficial provided that the delay is introduced in both pretraining and downstream fine-tuning - https://arxiv.org/abs/2310.02226
  19. Analogical Prompting - New prompting approach to automatically guide the reasoning process of LLMs; the approach differs from chain-of-thought in that it doesn't require labeled exemplars of the reasoning process; it is inspired by analogical reasoning and prompts LMs to self-generate relevant exemplars or knowledge in the context - https://arxiv.org/abs/2310.01714
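
A toy illustration of the StreamingLLM cache policy mentioned in item 16: keep a few initial "attention sink" tokens plus a sliding window of recent tokens and evict everything in between. Real implementations operate on per-layer key/value tensors; this sketch only tracks which token positions would stay cached, and the sink and window sizes are illustrative:

```python
def streaming_cache_indices(seq_len: int, n_sinks: int = 4, window: int = 1020) -> list[int]:
    """Return the token positions kept in the KV cache under the sink + window policy."""
    if seq_len <= n_sinks + window:
        return list(range(seq_len))                      # nothing to evict yet
    sinks = list(range(n_sinks))                         # initial tokens act as attention sinks
    recent = list(range(seq_len - window, seq_len))      # sliding window of most recent tokens
    return sinks + recent


print(len(streaming_cache_indices(100_000)))  # 1024 cached positions, regardless of sequence length
```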

** SEPT 2023 **

  1. AlphaMissense - AI model classifying missense variants to help pinpoint the cause of diseases; the model is used to develop a catalogue of genetic mutations; it can categorize 89% of all 71 million possible missense variants as either likely pathogenic or likely benign - https://www.science.org/doi/10.1126/science.adg7492
  2. Chain-of-Verification Reduces Hallucination in LLMs - Develops a method to enable LLMs to "deliberate" on responses to correct mistakes; it includes the following steps: 1) draft an initial response, 2) plan verification questions to fact-check the draft, 3) answer the questions independently to avoid bias from other responses, and 4) generate a final verified response (a hedged sketch of this loop follows the list) - https://arxiv.org/abs/2309.11495
  3. Contrastive Decoding Improves Reasoning in Large Language Models - Shows that contrastive decoding leads Llama-65B to outperform Llama 2 and other models on commonsense reasoning and reasoning benchmarks - https://arxiv.org/abs/2309.09117
  4. LongLoRA - Efficient fine-tuning approach to significantly extend the context windows of pre-trained LLMs; implements shift short attention, a substitute that approximates the standard self-attention pattern during training; it has lower GPU memory cost and training time compared to full fine-tuning while not compromising accuracy - https://arxiv.org/abs/2309.12307
  5. LLMs for Generating Structured Data - Studies the use of LLMs for generating complex structured data; proposes a structure-aware fine-tuning method, applied to Llama-7B, which significantly outperforms other models like GPT-3.5/4 and Vicuna-13B - https://arxiv.org/abs/2309.08963
  6. Textbooks Are All You Need II - New 1.3-billion-parameter model trained on 30 billion tokens; the dataset consists of "textbook-quality" synthetically generated data; phi-1.5 competes with or outperforms other larger models on reasoning tasks, suggesting that data quality plays a more important role than previously thought - https://arxiv.org/abs/2309.05463
  7. The Rise and Potential of LLM Based Agents - A comprehensive overview of LLM-based agents; covers everything from how to construct these agents to how to harness them for good - https://arxiv.org/abs/2309.07864
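
A hedged sketch of the Chain-of-Verification loop summarized in item 2 above, with a hypothetical `llm()` call standing in for any completion API; the prompts and parsing are illustrative only:

```python
def llm(prompt: str) -> str:
    """Placeholder for any chat/completion API call."""
    raise NotImplementedError


def chain_of_verification(question: str) -> str:
    # 1) Draft an initial response.
    draft = llm(f"Question: {question}\nAnswer:")

    # 2) Plan verification questions that fact-check the draft.
    plan = llm(
        "List short fact-checking questions for this draft answer, one per line.\n"
        f"Question: {question}\nDraft: {draft}\nVerification questions:"
    )
    verification_questions = [q.strip() for q in plan.splitlines() if q.strip()]

    # 3) Answer each verification question independently (the draft is left out of
    #    the prompt to avoid copying its mistakes).
    checks = [(q, llm(f"Answer concisely: {q}")) for q in verification_questions]

    # 4) Generate the final, verified response conditioned on the fact checks.
    evidence = "\n".join(f"Q: {q}\nA: {a}" for q, a in checks)
    return llm(
        f"Question: {question}\nDraft: {draft}\nFact checks:\n{evidence}\n"
        "Rewrite the draft so it is consistent with the fact checks. Final answer:"
    )
```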

** AUG 2023 **

  1. Open Problems and Limitations of RLHF - Provides an overview of open problems and the limitations of RLHF - https://arxiv.org/abs/2307.15217
  2. Skeleton-of-Thought - Proposes a prompting strategy that first generates an answer skeleton and then performs parallel API calls to generate the content of each skeleton point; reports quality improvements in addition to speed-ups of up to 2.39x (a rough sketch follows this list) - https://arxiv.org/abs/2307.15337
  3. MetaGPT - A framework involving LLM-based multi-agents that encodes human standardized operating procedures (SOPs) to extend complex problem-solving capabilities that mimic efficient human workflows; this enables MetaGPT to perform multifaceted software development, code generation tasks, and even data analysis using tools like AutoGPT and LangChain - https://arxiv.org/abs/2308.00352v2
  4. OpenFlamingo - Introduces a family of autoregressive vision-language models ranging from 3B to 9B parameters; the technical report describes the models, training data, and evaluation suite - https://arxiv.org/abs/2308.01390
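
A rough sketch of the Skeleton-of-Thought strategy from item 2 above: one call produces a skeleton, then every point is expanded with parallel calls. `llm()` is a hypothetical, thread-safe completion function and the prompts are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor


def llm(prompt: str) -> str:
    """Placeholder for any chat/completion API call (assumed thread-safe)."""
    raise NotImplementedError


def skeleton_of_thought(question: str) -> str:
    # Stage 1: generate a short bullet-point skeleton of the answer.
    skeleton = llm(f"Give a concise bullet-point skeleton for answering: {question}")
    points = [p.strip("-• ").strip() for p in skeleton.splitlines() if p.strip()]

    # Stage 2: expand all skeleton points with parallel API calls.
    def expand(point: str) -> str:
        return llm(f"Question: {question}\nExpand this point in one or two sentences: {point}")

    with ThreadPoolExecutor() as pool:
        expansions = list(pool.map(expand, points))

    return "\n".join(expansions)
```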

** JULY 2023 **

  1. Universal Adversarial LLM Attacks - Finds universal and transferable adversarial attacks that cause aligned models like ChatGPT and Bard to generate objectionable behaviors; the approach automatically produces adversarial suffixes using greedy and gradient search - https://arxiv.org/abs/2307.15043
  2. A Survey on Evaluation of LLMs - a comprehensive overview of evaluation methods for LLMs focusing on what to evaluate, where to evaluate, and how to evaluate-https://arxiv.org/abs/2307.03109
  3. How Language Models Use Long Contexts - finds that LM performance is often highest when relevant information occurs at the beginning or end of the input context; performance degrades when relevant information is provided in the middle of a long context-https://arxiv.org/abs/2307.03172.
  4. LLMs as Effective Text Rankers - Proposes a prompting technique that enables open-source LLMs to perform state-of-the-art text ranking on standard benchmarks- https://arxiv.org/abs/2306.17563
  5. Multimodal Generation with Frozen LLMs - introduces an approach that effectively maps images to the token space of LLMs; enables models like PaLM and GPT-4 to tackle visual tasks without parameter updates; enables multimodal tasks and uses in-context learning to tackle various visual tasks-https://arxiv.org/abs/2306.17842.
  6. CodeGen2.5 - releases a new code LLM trained on 1.5T tokens; the 7B model is on par with >15B code-generation models and it’s optimized for fast sampling-https://arxiv.org/abs/2305.02309.
  7. InterCode - introduces a framework of interactive coding as a reinforcement learning environment; this is different from the typical coding benchmarks that consider a static sequence-to-sequence process- https://arxiv.org/abs/2306.14898.

** JUNE 2023 **

  1. LeanDojo - Open-source Lean playground consisting of toolkits, data, models, and benchmarks for theorem proving; also develops ReProver, a retrieval-augmented LLM-based prover for theorem solving using premises from a vast math library - https://arxiv.org/abs/2306.15626
  2. Extending Context Window of LLMs-Extends the context window of LLMs like LLaMA to up to 32K with minimal fine-tuning (within 1000 steps); previous methods for extending the context window are inefficient but this approach attains good performance on several tasks while being more efficient and cost-effective-https://arxiv.org/abs/2306.15595.
  3. Computer Vision Through the Lens of Natural Language - proposes a modular approach for solving computer vision problems by leveraging LLMs; the LLM is used to reason over outputs from independent and descriptive modules that provide extensive information about an image-https://arxiv.org/abs/2306.16410.
  4. Understanding Theory-of-Mind in LLMs with LLMs - a framework for procedurally generating evaluations with LLMs; proposes a benchmark to study the social reasoning capabilities of LLMs with LLMs. https://arxiv.org/abs/2306.15448.
  5. Evaluations with No Labels - A framework for self-supervised evaluation of LLMs by analyzing their sensitivity or invariance to transformations on input text; can be used to monitor LLM behavior on datasets streamed during live model deployment - https://arxiv.org/abs/2306.13651v1
  6. Long-range Language Modeling with Self-Retrieval - an architecture and training procedure for jointly training a retrieval-augmented language model from scratch for long-range language modeling tasks. - https://arxiv.org/abs/2306.13421.
  7. Scaling MLPs-A Tale of Inductive Bias - Shows that the performance of MLPs improves with scale and highlights that lack of inductive bias can be compensated- https://arxiv.org/abs/2306.13575
  8. Textbooks Are All You Need - Introduces a new 1.3B parameter LLM called phi-1; it’s significantly smaller in size and trained for 4 days using a selection of textbook-quality data and synthetic textbooks and exercises with GPT-3.5; achieves promising results on the HumanEval benchmark-https://arxiv.org/abs/2306.11644
  9. RoboCat - New foundation agent that can operate different robotic arms and can solve tasks from as few as 100 demonstrations; the self-improving AI agent can self-generate new training data to improve its technique and get more efficient at adapting to new tasks - https://arxiv.org/abs/2306.11706
  10. ClinicalGPT - Language model optimized through extensive and diverse medical data, including medical records, domain-specific knowledge, and multi-round dialogue consultations - https://arxiv.org/abs/2306.09968
  11. An Overview of Catastrophic AI Risks - Provides an overview of the main sources of catastrophic AI risks; the goal is to foster more understanding of these risks and ensure AI systems are developed in a safe manner - https://arxiv.org/abs/2306.12001v1
  12. AudioPaLM - Fuses text-based and speech-based LMs, PaLM-2 and AudioLM, into a multimodal architecture that supports speech understanding and generation; outperforms existing systems on speech translation tasks with zero-shot speech-to-text translation capabilities - https://arxiv.org/abs/2306.12925v1

** MAY 2023 **

  1. Gorilla - Finetuned LLaMA-based model that surpasses GPT-4 on writing API calls; this capability can help identify the right API, boosting the ability of LLMs to interact with external tools to complete specific tasks - https://arxiv.org/abs/2305.15334
  2. The False Promise of Imitating Proprietary LLMs - Provides a critical analysis of models that are finetuned on the outputs of a stronger model; argues that model imitation is a false promise and that the higher-leverage action for improving open-source models is to develop better base models - https://arxiv.org/abs/2305.15717
  3. InstructBLIP-Explores visual-language instruction tuning based on the pre-trained BLIP-2 models; achieves state-of-the-art zero-shot performance on 13 held-out datasets, outperforming BLIP-2 and Flamingo. https://arxiv.org/abs/2305.06500
  4. Active Retrieval Augmented LLMs - Introduces FLARE, an approach to retrieval-augmented generation that improves the reliability of LLMs; FLARE actively decides when and what to retrieve over the course of the generation; demonstrates superior or competitive performance on long-form knowledge-intensive generation tasks (a rough sketch of the idea follows this list) - https://arxiv.org/abs/2305.06983
  5. AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head - connects ChatGPT with audio foundational models to handle challenging audio tasks and a modality transformation interface to enable spoken dialogue-https://arxiv.org/abs/2304.12995
  6. DataComp: In search of the next generation of multimodal datasets - Releases a new multimodal dataset benchmark containing 12.8B image-text pairs - https://arxiv.org/abs/2304.14108
  7. ChatGPT for Information Extraction - Provides a deeper assessment of ChatGPT's performance on the important information extraction task - https://arxiv.org/abs/2304.11633
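
A very loose sketch of the active-retrieval idea behind FLARE (item 4 above): generate a tentative next sentence and, if the model is not confident about it, retrieve evidence and regenerate. `llm_next_sentence` and `retrieve` are hypothetical helpers, and the confidence threshold is illustrative:

```python
def llm_next_sentence(prompt: str) -> tuple[str, float]:
    """Return (next sentence, minimum token probability). Placeholder."""
    raise NotImplementedError


def retrieve(query: str) -> str:
    """Return concatenated retrieved passages for the query. Placeholder."""
    raise NotImplementedError


def flare_generate(question: str, max_sentences: int = 10, threshold: float = 0.6) -> str:
    answer = ""
    for _ in range(max_sentences):
        sentence, confidence = llm_next_sentence(f"{question}\n{answer}")
        if confidence < threshold:
            # Low-confidence sentence: use it as a search query, then regenerate
            # the sentence grounded in the retrieved evidence.
            docs = retrieve(sentence)
            sentence, _ = llm_next_sentence(f"{docs}\n\n{question}\n{answer}")
        if not sentence:
            break
        answer += sentence + " "
    return answer.strip()
```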

** MARCH 2023 **

  1. GPT-4 Technical Report - GPT-4 is a large multimodal model with broader general knowledge and problem-solving abilities - https://arxiv.org/abs/2303.08774v2
  2. LERF: Language Embedded Radiance Fields - Method for grounding language embeddings from models like CLIP into NeRF; this enables open-ended language queries in 3D.
  3. An Overview on Language Models: Recent Developments and Outlook - An overview of language models covering recent developments and future directions; it also covers topics like linguistic units, structures, training methods, evaluation, and applications - https://arxiv.org/abs/2303.05759
  4. Eliciting Latent Predictions from Transformers with the Tuned Lens - Method for transformer interpretability that can trace a language model's predictions as they develop layer by layer - https://arxiv.org/abs/2303.08112

** FEB 2023 **

  1. Multimodal Chain-of-Thought Reasoning in Language Models - Uses vision features to elicit chain-of-thought reasoning in multimodality, enabling the model to generate effective rationales that contribute to answer inference-https://arxiv.org/abs/2302.00923.
  2. Dreamix: Video Diffusion Models are General Video Editors - a diffusion model that performs text-based motion and appearance editing of general videos.
  3. Benchmarking Large Language Models for News Summarization - https://arxiv.org/abs/2301.13848

** JAN 2023 **

  1. Rethinking with Retrieval: Faithful Large Language Model Inference - shows the potential of enhancing LLMs by retrieving relevant external knowledge based on decomposed reasoning steps obtained through chain-of-thought prompting- https://arxiv.org/abs/2301.00303.
  2. SparseGPT: Massive Language Models Can Be Accurately Pruned In One-Shot - Presents a technique for compressing large language models while not sacrificing performance; models can be "pruned to at least 50% sparsity in one-shot, without any retraining" - https://arxiv.org/abs/2301.00774
  3. ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders - Performant model based on a fully convolutional masked autoencoder framework and other architectural improvements; ConvNets are striking back - https://arxiv.org/abs/2301.00808

** Open AI's Prompt Engineering Handbook **

https://platform.openai.com/docs/guides/prompt-engineering

** OpenAI Cookbook - Github **

https://github.com/openai/openai-cookbook/blob/main/techniques_to_improve_reliability.md#how-to-improve-reliability-on-complex-tasks
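
A minimal illustration of one technique from that cookbook page, asking the model to reason step by step before stating its final answer, using the OpenAI Python SDK (v1-style client); the model name is illustrative and an `OPENAI_API_KEY` environment variable is assumed:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system", "content": "Think through the problem step by step, then give the final answer on its own line."},
        {"role": "user", "content": "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?"},
    ],
)
print(response.choices[0].message.content)
```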

** Lilianweng's Gitub Resources & Blogs **

https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/

** Real-time machine learning: challenges and solutions by @Chip Huyen **

https://huyenchip.com/2022/01/02/real-time-machine-learning-challenges-and-solutions.htm

** Building LLM applications for production by @Chip Huyen **

https://huyenchip.com/2023/04/11/llm-engineering.html

** Full Stack LLM Bootcamp by @Charles Frye @Sergey Karayev @Josh Tobin **

https://fullstackdeeplearning.com/llm-bootcamp/

** Prompt Engineering Guide ** - Excellent Resources for Latest Research Papers on Prompt Engineering by @Elvis Saravia

https://www.promptingguide.ai/

** Awesome Resources for AI, Machine Learning, MLOps, and Productionizing Machine Learning Models at Scale **

** MadeWithML - @Goku Mohandas** https://madewithml.com/courses/mlops/testing/

** Awesome Production Machine Learning ** https://github.com/EthicalML/awesome-production-machine-learning

** Transformers from Scratch - Awesome Transformers Explanation ** https://e2eml.school/transformers.html#resources

** Attention is all you need; Attentional Neural Network Models | Łukasz Kaiser | Masterclass Video Lecture Explanation on Transformers ** - https://www.youtube.com/watch?v=rBCqOTEfxvg

The Annotated Transformer - Code Implementation along with Model Architecture - https://nlp.seas.harvard.edu/2018/04/03/attention.html
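
As a quick reference alongside these resources, the scaled dot-product attention at the core of the Transformer, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, written out in plain NumPy:

```python
import numpy as np


def scaled_dot_product_attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)           # query/key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)            # softmax over key positions
    return weights @ V                                        # weighted sum of values


Q = np.random.randn(2, 5, 64)   # (batch, query positions, d_k)
K = np.random.randn(2, 7, 64)   # (batch, key positions, d_k)
V = np.random.randn(2, 7, 64)
print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 5, 64)
```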

** Hugging Face Transformers Library Github **

https://github.com/huggingface/transformers
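
A quick-start sketch for the library linked above using the high-level pipeline API; the model choice (`gpt2`) is just an illustrative small checkpoint:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Retrieval-augmented generation is", max_new_tokens=30)
print(result[0]["generated_text"])
```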

** FullStack DeepLearning Prof - @Sergey Karayev - Good Read on GPT Architecture **

https://dugas.ch/artificial_curiosity/GPT_architecture.html

** About - An awesome & curated list of best LLMOps tools for developers ** https://github.com/tensorchord/Awesome-LLMOps

** A list of open LLMs available for commercial use by @eugeneyan **

https://github.com/eugeneyan/open-llms

** @ModularAI - @Mojo 🔥 — AI developers **

Mojo combines the usability of Python with the performance of C, unlocking unparalleled programmability of AI hardware and extensibility of AI models.

** Replit - Training Large Language Models **

https://replit.com/@MckayWrigley/Takeoff-School-LangChain-101-Models?v=1

** Lean Copilot: LLMs as Copilots for Theorem Proving in Lean **

Lean Copilot allows large language models (LLMs) to be used in Lean for proof automation, e.g., suggesting tactics/premises and searching for proofs. You can use our built-in models from LeanDojo or bring your own models that run either locally (w/ or w/o GPUs) or on the cloud.

** Cohere AI - LLM University **

https://docs.cohere.com/docs/llmu?_gl=1*1k8vxo5*_gcl_au*MTE2MjEzMDAwNi4xNzAyODE1ODA5*_ga*NjUwODA2NDQ4LjE2ODQxNDkwMTU.*_ga_CRGS116RZS*MTcwMjgxNTgwOC40LjEuMTcwMjgxNTgyOS4zOS4wLjA.

** Cohere.AI - Playground using RAG **

https://cohere.com/

** @Niels Rogge's Slides of his Recent talk on how to train and deploy open-source large language models (LLMs) **

The talk is about the rise of open LLMs we saw this year, from Llama-2 by Meta to the Mixtral-8x7b model by Mistral AI, which is already on par with GPT-3.5 and even preferred over Gemini Pro according to recent evaluations - https://docs.google.com/presentation/d/1yBWLNzlrrIsfNprbEnmYdqckAMrwFZB-/edit#slide=id.p1

Strategies for Effective and Efficient Text Ranking Using Large Language Models

https://blog.reachsumit.com/posts/2023/12/towards-ranking-aware-llms/

How to make LLMs go fast

https://vgel.me/posts/faster-inference/

LLM360: Towards Fully Transparent Open-Source LLMs

  1. Amber - Fully open-source LLM developed by LLM360 with 7B parameters.
  2. AmberChat - Instruction-tuned LLM based on Amber (7B).
  3. AmberSafe - Safety-finetuned LLM based on AmberChat.

LLM360 - Towards Fully Transparent Open-Source LLMs
  • AmberChat model: https://huggingface.co/LLM360/AmberChat
  • AmberSafe model: https://huggingface.co/LLM360/AmberSafe
  • Amber paper: https://arxiv.org/abs/2312.06550
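
A hedged sketch of loading the AmberChat checkpoint above with the Hugging Face Transformers library; the exact chat prompt format may differ, so check the model card before relying on the raw prompt used here:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LLM360/AmberChat"  # model card: https://huggingface.co/LLM360/AmberChat
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("What does a fully transparent open-source LLM mean?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```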

Awesome Research Github Repositories

  1. Neural Networks: Zero to Hero by @Andrejkarpathy - https://github.com/karpathy/nn-zero-to-hero

  2. Awesome Machine Learning Models in Production https://github.com/EthicalML/awesome-production-machine-learning

Speech Recognition

Choosing a Python speech recognition package - the following packages exist on PyPI (a minimal usage example follows the list):

  1. apiai
  2. assemblyai
  3. google-cloud-speech
  4. pocketsphinx
  5. SpeechRecognition
  6. watson-developer-cloud
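
A minimal example with the SpeechRecognition package from the list above (`pip install SpeechRecognition`); the audio file name is a placeholder:

```python
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("sample.wav") as source:      # any PCM WAV/AIFF/FLAC file
    audio = recognizer.record(source)

# Uses Google's free web API by default; other recognizers (Sphinx, Whisper, ...)
# are available on the same Recognizer object.
print(recognizer.recognize_google(audio))
```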

** Open Source AI Libraries **

ChatBots Browser Extension AI Libraries

  1. ChatHub - An all-in-one chatbot client. https://github.com/aditikhare007/chathub
  • Chatbots in one app, currently supporting ChatGPT, new Bing Chat, Google Bard, Claude, and open-source models including LLaMA 2, Vicuna, ChatGLM, etc.
  • Support for ChatGPT API and GPT-4 Browsing.
  • Shortcut to quickly activate the app anywhere in the browser
  • Markdown and code highlight support
  • Prompt Library for custom prompts and community prompts
  • Conversation history saved locally
  • Export and Import all your data.
  2. Vald - Open-source, cloud-native distributed vector search engine.
  • Features horizontal scaling.
  • Customizable filtering.
  • Auto-indexing/backup.

Large Language Models - AI Libraries

Awesome LLM Interpretability

https://github.com/JShollaj/awesome-llm-interpretability

Parul Pandey's Research Paper Collection

  1. Three Pass Method.
  2. Andrew Ng's Lecture on reading research paper.
  3. Connected Papers.
  4. Research Rabbit.
  5. Arxiv Sanity Preserver.
  6. Arxiv Vanity.

** Quantum AI Research Papers **

** JAN 2024 **

** DEC 2023 **

  1. 21st Dec 2023 - ** Exploiting Novel GPT-4 APIs ** - https://arxiv.org/abs/2312.14302
  2. 18th Dec 2023 - ** Design of Quantum Machine Learning Course for a Computer Science Program ** - https://ieeexplore.ieee.org/document/10313632
  3. 2nd Dec 2023 - ** Hybrid Quantum Neural Network in High-dimensional Data Classification ** - https://arxiv.org/abs/2312.01024

** Thank you so much for visiting my AI Research Junction@Aditi Khare - Research Papers Summaries @Generative AI @Computer Vision @Quantum AI **

** If you find my AI Research Junction@Aditi Khare useful, please star my repository to support my work - Happy Learning! **