aditikhare007 / AI_Research_Junction_Aditi_Khare

AI_Research_Junction@Aditi_Khare - Research paper summaries capturing the latest advancements in Generative AI, Quantum AI, and Computer Vision

Home Page: https://github.com/aditikhare007/AI_Research_Junction_Aditi_Khare

Greetings AI Community 👋 About me

** AWS & AI Research Specialist - Principal Applied AI Product Engineer [Product-Owner] & Enterprise Architect @PepsiCo | IIM-A | Community Member @Landing AI | AI Research Specialist [Portfolio] | Author | Quantum AI | Mojo | Next JS | 7+ Years of Experience in Fortune 50 Product Companies | **

** Global Top AI Community Member @Landing.AI @MLOPS Community @Pandas AI @Full Stack Deep Learning @H2o.ai Generative AI @Modular & @Cohere AI @Hugging Face Research Papers Group @Papers with Code @DAIR.AI **

** Completed 90+ Online Technical Paid Courses from Udemy & Coursera as I believe in Continuous Learning and a Growth Mindset **

** AI Research Junction @Aditi Khare - Research Paper Summaries @Generative AI @Computer Vision @Quantum AI **

** My AI Newsletter-AI Research Junction @Research Papers Summaries @Generative AI, @Computer Vision @Quantum AI **

** Aditi Khare @ AI Research Junction Newsletter **

If you find my content useful, please subscribe to my AI Research Junction Newsletter to support my work. Thank you!

** Welcome to AI Research Junction@Aditi Khare-Research Papers Summaries @Generative AI @Computer Vision @Quantum AI **

** Gen AI Research Papers Summaries **

** JAN 2024 **

  1. Amazon's Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models - https://arxiv.org/abs/2401.13795 - Computer Vision and Pattern Recognition
  2. Quantum-Inspired Machine Learning for Molecular Docking - https://arxiv.org/abs/2401.12999 - Quantum Computing
  3. ChatQA - NVIDIA's GPT-4 Level Conversational QA Models - https://arxiv.org/pdf/2401.10225v1.pdf - Generative AI
  4. Meta's Self-Rewarding Language Models - https://arxiv.org/abs/2401.10020 - Generative AI
  5. Chainpoll - A high efficacy method for LLM hallucination detection - https://arxiv.org/pdf/2310.18344v1.pdf
  6. AI-Optimized-Catheter-Design-could-prevent-urinary-tract-infections-without-drugs/ - https://www.scientificamerican.com/article/ai-optimized-catheter-design-could-prevent-urinary-tract-infections-without-drugs/
  7. TrustLLM - Trustworthiness in Large Language Models - https://arxiv.org/abs/2308.05374
  8. LLaMA Pro: Progressive LLaMA with Block Expansion - https://arxiv.org/abs/2401.02415
  9. DeepSeekMoE - Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models - https://arxiv.org/abs/2401.06066
  10. VAST AI releases Triplane Meets Gaussian Splatting on Hugging Face - Fast and Generalizable Single-View 3D Reconstruction with Transformers Demo - https://arxiv.org/abs/2312.09147
  11. Masked Audio Generation using a Single Non-Autoregressive Transformer - https://arxiv.org/abs/2401.04577

** DEC 2023 **

  1. 11th Dec 2023 - ** Mistral-embed - An embedding model with a 1024 embedding dimension; achieves 55.26 on MTEB ** - https://mistral.ai/news/mixtral-of-experts/
  2. 11th Dec 2023 - ** LLM360 - Fully Transparent Open-Source LLMs ** - https://arxiv.org/pdf/2312.06550.pdf
  3. 12th Dec 2023 - ** Mathematical Language Models: A Survey ** - https://arxiv.org/abs/2312.07622
  4. 13th Dec 2023 - ** PromptBench: A Library for Evaluation of Large Language Models ** - https://arxiv.org/pdf/2312.07910.pdf
  5. 1st Dec 2023 - ** Mamba: Linear-Time Sequence Modeling with Selective State Spaces ** - https://arxiv.org/ftp/arxiv/papers/2312/2312.00752.pdf
  6. 14th Dec 2023 - ** Distributed Representations of Words and Phrases and their Compositionality (Word2vec) ** - https://papers.nips.cc/paper_files/paper/2013/file/9aa42b31882ec039965f3c4923ce901b-Paper.pdf
  7. 11th Dec 2023 - ** Beyond Human Data - Scaling Self-Training for Problem-Solving with Language Models ** - https://arxiv.org/abs/2312.06585
    • Approach for self-training with feedback that can substantially reduce dependence on human-generated data.
    • Combining model-generated data with a reward function improves the performance of LLMs on problem-solving tasks.

** NOV 2023 **

  1. 9th Nov 2023 - A Survey of Large Language Models in Medicine: Principles, Applications, and Challenges - https://arxiv.org/abs/2311.05112
  2. System 2 Attention - Leverages the reasoning and instruction-following capabilities of LLMs to decide what to attend to; it regenerates the input context to include only the relevant portions before attending to the regenerated context to elicit the final response from the model; increases factuality and outperforms standard attention-based LLMs on tasks such as QA and math word problems (a minimal sketch of the two-pass idea follows this list) - https://arxiv.org/abs/2311.11829
  3. Advancing Long-Context LLMs - Overview of the methodologies for enhancing Transformer architecture modules that optimize long-context capabilities across all stages from pre-training to inference - https://arxiv.org/abs/2311.12351
  4. Parallel Speculative Sampling - Approach to reduce inference time of LLMs based on a variant of speculative sampling and parallel decoding; achieves significant speed-ups (up to 30%) by learning as little as O(d_emb) additional parameters - https://arxiv.org/abs/2311.13581
  5. Mirasol3B - Multimodal model for learning across audio, video, and text which decouples the multimodal modeling into separate, focused autoregressive models; the inputs are processed according to their modalities; this approach can handle longer videos than other models and outperforms state-of-the-art approaches on video QA, long video QA, and an audio-video-text benchmark - https://arxiv.org/abs/2311.05698
  6. GPQA - Proposes a graduate-level, Google-proof QA benchmark consisting of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry; the strongest GPT-4-based baseline achieves 39% accuracy; this benchmark enables scalable-oversight experiments that can help obtain reliable and truthful information from modern AI systems that surpass human capabilities - https://arxiv.org/abs/2311.12022
  7. Chain-of-Thought Reasoning to Language Agents - Summary of CoT reasoning, the foundational mechanics underpinning CoT techniques, and their application to language agent frameworks - https://arxiv.org/abs/2311.11797
  8. GAIA - A benchmark for general AI assistants consisting of real-world questions that require a set of fundamental abilities such as reasoning, multimodal handling, web browsing, and general tool-use proficiency; shows that human respondents obtain 92% vs. 15% for GPT-4 equipped with plugins - https://arxiv.org/abs/2311.12983
  9. LLMs for Scientific Discovery - Explores the impact of large language models, particularly GPT-4, across various scientific fields including drug discovery, biology, and computational chemistry; assesses GPT-4's understanding of complex scientific concepts, its problem-solving capabilities, and its potential to advance scientific research through expert-driven case assessments and benchmark testing - https://arxiv.org/abs/2311.07361
  10. Fine-Tuning LLMs for Factuality - Fine-tunes a language model for factuality without requiring human labeling; it learns from automatically generated factuality preference rankings and targets open-ended generation settings; significantly improves the factuality of Llama-2 on held-out topics compared with RLHF or decoding strategies targeted at factuality - https://arxiv.org/abs/2311.08401
  11. Contrastive CoT Prompting - Proposes a contrastive chain-of-thought method to enhance language model reasoning; the approach provides both valid and invalid reasoning demonstrations to guide the model to reason step by step while reducing reasoning mistakes; also proposes an automatic method to construct contrastive demonstrations and demonstrates improvements over CoT prompting - https://arxiv.org/abs/2311.09277
  12. A Survey on Language Models for Code - Provides an overview of LLMs for code, including a review of 50+ models, 30+ evaluation tasks, and 500 related works - https://arxiv.org/abs/2311.07989v1
  13. JARVIS-1 - Open-world agent that can perceive multimodal input - https://arxiv.org/abs/2311.05997
  14. Learning to Filter Context for RAG - Proposes a method that improves the quality of the context provided to the generator via two steps: 1) identifying useful context based on lexical and information-theoretic approaches, and 2) training context-filtering models that can filter retrieved contexts at inference; outperforms existing approaches on extractive question answering - https://arxiv.org/abs/2311.08377v1
  15. MART - Proposes an approach for improving LLM safety with multi-round automatic red-teaming; incorporates automatic adversarial prompt writing and safe response generation, which increases red-teaming scalability and the safety of LLMs; the violation rate of an LLM with limited safety alignment is reduced by up to 84.7% after 4 rounds of MART, achieving comparable performance to LLMs with extensive adversarial prompt writing - https://arxiv.org/abs/2311.07689
  16. LLMs can Deceive Users - Explores the use of an autonomous stock-trading agent powered by LLMs; finds that the agent acts upon insider tips and hides the reason behind the trading decision; shows that helpful and safe LLMs can strategically deceive users in a realistic situation without direct instructions or training for deception - https://arxiv.org/abs/2311.07590
  17. Hallucination in LLMs - A comprehensive survey - https://arxiv.org/abs/2311.05232
  18. GPT4All - Outlines technical details of the GPT4All model family along with the open-source repository that aims to democratize access to LLMs - https://arxiv.org/abs/2311.04931
  19. FreshLLMs - Proposes a dynamic QA benchmark - https://arxiv.org/abs/2310.03214
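
A minimal sketch of the two-pass idea behind System 2 Attention (item 2 above), assuming a hypothetical `llm()` completion function standing in for any chat/completion API; this is an illustration of the technique, not the paper's reference implementation:

```python
def llm(prompt: str) -> str:
    """Placeholder for any chat/completion API call."""
    raise NotImplementedError


def system2_attention(context: str, question: str) -> str:
    # Pass 1: ask the model to regenerate the context, keeping only the parts
    # that are relevant (and not biasing) for the question.
    regenerate_prompt = (
        "Extract the parts of the following text that are relevant and "
        f"unbiased for answering the question.\n\nText:\n{context}\n\n"
        f"Question: {question}\n\nRelevant text:"
    )
    filtered_context = llm(regenerate_prompt)

    # Pass 2: answer using only the regenerated context.
    answer_prompt = f"Context:\n{filtered_context}\n\nQuestion: {question}\nAnswer:"
    return llm(answer_prompt)
```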

** OCT 2023 **

  1. Spectron - Approach for spoken language modeling trained end-to-end to directly process spectrograms; it can be fine-tuned to generate high-quality, accurate spoken language; the method surpasses existing spoken language models in speaker preservation and semantic coherence - https://arxiv.org/abs/2305.15255
  2. LLMs Meet New Knowledge - Presents a benchmark to assess LLMs' abilities in knowledge understanding, differentiation, and association, along with benchmark results - https://arxiv.org/abs/2310.14820
  3. Detecting Pretraining Data from LLMs - Explores the problem of pretraining data detection, which aims to determine if a black-box model was trained on a given text; proposes a detection method named Min-K% Prob as an effective tool for benchmark example contamination detection, privacy auditing of machine unlearning, and copyrighted text detection in LMs' pretraining data - https://arxiv.org/abs/2310.16789
  4. ConvNets Match Vision Transformers - Evaluates a performant ConvNet architecture pretrained on JFT-4B at scale; observes a log-log scaling law between the held-out loss and compute budget; after fine-tuning on ImageNet, NFNets match the reported performance of Vision Transformers with comparable compute budgets - https://arxiv.org/abs/2310.16764
  5. Managing AI Risks - Short paper outlining risks from upcoming and advanced AI systems, including an examination of social harms, malicious uses, and other potential societal issues emerging from the rapid adoption of autonomous AI systems - https://managing-ai-risks.com/managing_ai_risks.pdf
  6. Branch-Solve-Merge Reasoning in LLMs - LLM program that consists of branch, solve, and merge modules parameterized with specific prompts to the base LLM; this enables an LLM to plan a decomposition of the task into multiple parallel sub-tasks, independently solve them, and fuse the solutions to the sub-tasks; improves evaluation correctness and consistency for multiple LLMs - https://arxiv.org/abs/2310.15123
  7. LLMs for Software Engineering - A comprehensive survey of LLMs for software engineering, including open research and technical challenges - https://arxiv.org/abs/2310.03533
  8. Self-RAG - Presents a new retrieval-augmented framework that enhances an LM's quality and factuality through retrieval and self-reflection; trains an LM that adaptively retrieves passages on demand, and generates and reflects on the passages and its own generations using special reflection tokens; it significantly outperforms SoTA LLMs - https://arxiv.org/abs/2310.11511
  9. Retrieval-Augmentation for Long-form Question Answering - Explores retrieval-augmented language models on long-form question answering; finds that retrieval is an important component but evidence documents should be carefully added to the LLM; finds that attribution errors happen more frequently when retrieved documents lack sufficient information/evidence for answering the question - https://arxiv.org/abs/2310.12150
  10. A Study of LLM-Generated Self-Explanations - Assesses an LLM's capability to self-generate feature-attribution explanations; self-explanation is useful to improve performance and truthfulness in LLMs; this capability can be used together with chain-of-thought prompting - https://arxiv.org/abs/2310.11207
  11. OpenAgents - Open platform for using and hosting language agents in the wild; includes three agents: a Data Agent for data analysis, a Plugins Agent with 200+ daily API tools, and a Web Agent for autonomous web browsing - https://arxiv.org/abs/2310.10634v1
  12. LLMs can Learn Rules - Presents a two-stage framework that learns a rule library for reasoning with LLMs - https://arxiv.org/abs/2310.07064
  13. Meta Chain-of-Thought Prompting - A generalizable chain-of-thought prompting approach - https://arxiv.org/abs/2310.06692
  14. Improving Retrieval-Augmented LMs with Compressors - Presents two approaches to compress retrieved documents into text summaries before prepending them in-context: 1) an extractive compressor that selects useful sentences from retrieved documents, and 2) an abstractive compressor that generates summaries by synthesizing information from multiple documents; achieves a compression rate as low as 6% with minimal loss in performance on language modeling and open-domain question answering tasks; the proposed training scheme performs selective augmentation, which helps to generate empty summaries when retrieved docs are irrelevant or unhelpful for a task - https://arxiv.org/abs/2310.04408
  15. Retrieval meets Long Context LLMs - Compares retrieval augmentation and long-context windows for downstream tasks to investigate whether the methods can be combined to get the best of both worlds; an LLM with a 4K context window using simple RAG can achieve comparable performance to a fine-tuned LLM with a 16K context; retrieval can significantly improve the performance of LLMs regardless of their extended context window sizes; a retrieval-augmented LLaMA2-70B with a 32K context window outperforms GPT-3.5-turbo-16k on seven long-context tasks including question answering and query-based summarization - https://arxiv.org/abs/2310.03025
  16. StreamingLLM - Framework that enables efficient streaming LLMs with attention sinks, a phenomenon where keeping the KV states of initial tokens largely recovers the performance of window attention; the emergence of the attention sink is due to strong attention scores towards the initial tokens; this approach enables LLMs trained with finite-length attention windows to generalize to infinite sequence length without any additional fine-tuning (a toy illustration of the cache policy follows this list) - https://arxiv.org/abs/2309.17453
  17. The Dawn of LMMs - Comprehensive analysis of GPT-4V to deepen the understanding of large multimodal models - https://arxiv.org/abs/2309.17421
  18. Training LLMs with Pause Tokens - Performs training and inference on LLMs with a learnable token which helps to delay the model's answer generation and attain performance gains on general understanding tasks such as commonsense QA and math word problem-solving; experiments show that this is only beneficial provided that the delay is introduced in both pretraining and downstream fine-tuning - https://arxiv.org/abs/2310.02226
  19. Analogical Prompting - New prompting approach to automatically guide the reasoning process of LLMs; the approach differs from chain-of-thought in that it doesn't require labeled exemplars of the reasoning process; it is inspired by analogical reasoning and prompts LMs to self-generate relevant exemplars or knowledge in the context - https://arxiv.org/abs/2310.01714
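
A toy illustration of the StreamingLLM cache policy mentioned in item 16: keep a few initial "attention sink" tokens plus a sliding window of recent tokens and evict everything in between. Real implementations operate on per-layer key/value tensors; this sketch only tracks which token positions would stay cached, and the sink and window sizes are illustrative:

```python
def streaming_cache_indices(seq_len: int, n_sinks: int = 4, window: int = 1020) -> list[int]:
    """Return the token positions kept in the KV cache under the sink + window policy."""
    if seq_len <= n_sinks + window:
        return list(range(seq_len))                      # nothing to evict yet
    sinks = list(range(n_sinks))                         # initial tokens act as attention sinks
    recent = list(range(seq_len - window, seq_len))      # sliding window of most recent tokens
    return sinks + recent


print(len(streaming_cache_indices(100_000)))  # 1024 cached positions, regardless of sequence length
```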

** SEPT 2023 **

  1. AlphaMissense - AI model classifying missense variants to help pinpoint the cause of diseases; the model is used to develop a catalogue of genetic mutations; it can categorize 89% of all 71 million possible missense variants as either likely pathogenic or likely benign - https://www.science.org/doi/10.1126/science.adg7492
  2. Chain-of-Verification Reduces Hallucination in LLMs - Develops a method to enable LLMs to "deliberate" on responses to correct mistakes; it includes the following steps: 1) draft an initial response, 2) plan verification questions to fact-check the draft, 3) answer the questions independently to avoid bias from other responses, and 4) generate a final verified response (a hedged sketch of this loop follows the list) - https://arxiv.org/abs/2309.11495
  3. Contrastive Decoding Improves Reasoning in Large Language Models - Shows that contrastive decoding leads Llama-65B to outperform Llama 2 and other models on commonsense reasoning and reasoning benchmarks - https://arxiv.org/abs/2309.09117
  4. LongLoRA - Efficient fine-tuning approach to significantly extend the context windows of pre-trained LLMs; implements shift short attention, a substitute that approximates the standard self-attention pattern during training; it has lower GPU memory cost and training time compared to full fine-tuning while not compromising accuracy - https://arxiv.org/abs/2309.12307
  5. LLMs for Generating Structured Data - Studies the use of LLMs for generating complex structured data; proposes a structure-aware fine-tuning method, applied to Llama-7B, which significantly outperforms other models like GPT-3.5/4 and Vicuna-13B - https://arxiv.org/abs/2309.08963
  6. Textbooks Are All You Need II - New 1.3-billion-parameter model trained on 30 billion tokens; the dataset consists of "textbook-quality" synthetically generated data; phi-1.5 competes with or outperforms other larger models on reasoning tasks, suggesting that data quality plays a more important role than previously thought - https://arxiv.org/abs/2309.05463
  7. The Rise and Potential of LLM Based Agents - A comprehensive overview of LLM-based agents; covers everything from how to construct these agents to how to harness them for good - https://arxiv.org/abs/2309.07864
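
A hedged sketch of the Chain-of-Verification loop summarized in item 2 above, with a hypothetical `llm()` call standing in for any completion API; the prompts and parsing are illustrative only:

```python
def llm(prompt: str) -> str:
    """Placeholder for any chat/completion API call."""
    raise NotImplementedError


def chain_of_verification(question: str) -> str:
    # 1) Draft an initial response.
    draft = llm(f"Question: {question}\nAnswer:")

    # 2) Plan verification questions that fact-check the draft.
    plan = llm(
        "List short fact-checking questions for this draft answer, one per line.\n"
        f"Question: {question}\nDraft: {draft}\nVerification questions:"
    )
    verification_questions = [q.strip() for q in plan.splitlines() if q.strip()]

    # 3) Answer each verification question independently (the draft is left out of
    #    the prompt to avoid copying its mistakes).
    checks = [(q, llm(f"Answer concisely: {q}")) for q in verification_questions]

    # 4) Generate the final, verified response conditioned on the fact checks.
    evidence = "\n".join(f"Q: {q}\nA: {a}" for q, a in checks)
    return llm(
        f"Question: {question}\nDraft: {draft}\nFact checks:\n{evidence}\n"
        "Rewrite the draft so it is consistent with the fact checks. Final answer:"
    )
```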

** AUG 2023 **

  1. Open Problems and Limitations of RLHF - Provides an overview of open problems and the limitations of RLHF - https://arxiv.org/abs/2307.15217
  2. Skeleton-of-Thought - Proposes a prompting strategy that first generates an answer skeleton and then performs parallel API calls to generate the content of each skeleton point; reports quality improvements in addition to speed-ups of up to 2.39x (a rough sketch follows this list) - https://arxiv.org/abs/2307.15337
  3. MetaGPT - A framework involving LLM-based multi-agents that encodes human standardized operating procedures (SOPs) to extend complex problem-solving capabilities that mimic efficient human workflows; this enables MetaGPT to perform multifaceted software development, code generation tasks, and even data analysis using tools like AutoGPT and LangChain - https://arxiv.org/abs/2308.00352v2
  4. OpenFlamingo - Introduces a family of autoregressive vision-language models ranging from 3B to 9B parameters; the technical report describes the models, training data, and evaluation suite - https://arxiv.org/abs/2308.01390
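
A rough sketch of the Skeleton-of-Thought strategy from item 2 above: one call produces a skeleton, then every point is expanded with parallel calls. `llm()` is a hypothetical, thread-safe completion function and the prompts are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor


def llm(prompt: str) -> str:
    """Placeholder for any chat/completion API call (assumed thread-safe)."""
    raise NotImplementedError


def skeleton_of_thought(question: str) -> str:
    # Stage 1: generate a short bullet-point skeleton of the answer.
    skeleton = llm(f"Give a concise bullet-point skeleton for answering: {question}")
    points = [p.strip("-• ").strip() for p in skeleton.splitlines() if p.strip()]

    # Stage 2: expand all skeleton points with parallel API calls.
    def expand(point: str) -> str:
        return llm(f"Question: {question}\nExpand this point in one or two sentences: {point}")

    with ThreadPoolExecutor() as pool:
        expansions = list(pool.map(expand, points))

    return "\n".join(expansions)
```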

** JULY 2023 **

  1. Universal Adversarial LLM Attacks - Finds universal and transferable adversarial attacks that cause aligned models like ChatGPT and Bard to generate objectionable behaviors; the approach automatically produces adversarial suffixes using greedy and gradient search - https://arxiv.org/abs/2307.15043
  2. A Survey on Evaluation of LLMs - a comprehensive overview of evaluation methods for LLMs focusing on what to evaluate, where to evaluate, and how to evaluate-https://arxiv.org/abs/2307.03109
  3. How Language Models Use Long Contexts - finds that LM performance is often highest when relevant information occurs at the beginning or end of the input context; performance degrades when relevant information is provided in the middle of a long context-https://arxiv.org/abs/2307.03172.
  4. LLMs as Effective Text Rankers - Proposes a prompting technique that enables open-source LLMs to perform state-of-the-art text ranking on standard benchmarks- https://arxiv.org/abs/2306.17563
  5. Multimodal Generation with Frozen LLMs - introduces an approach that effectively maps images to the token space of LLMs; enables models like PaLM and GPT-4 to tackle visual tasks without parameter updates; enables multimodal tasks and uses in-context learning to tackle various visual tasks-https://arxiv.org/abs/2306.17842.
  6. CodeGen2.5 - releases a new code LLM trained on 1.5T tokens; the 7B model is on par with >15B code-generation models and it’s optimized for fast sampling-https://arxiv.org/abs/2305.02309.
  7. InterCode - introduces a framework of interactive coding as a reinforcement learning environment; this is different from the typical coding benchmarks that consider a static sequence-to-sequence process- https://arxiv.org/abs/2306.14898.

** JUNE 2023 **

  1. LeanDojo - Open-source Lean playground consisting of toolkits, data, models, and benchmarks for theorem proving; also develops ReProver, a retrieval-augmented LLM-based prover for theorem solving using premises from a vast math library - https://arxiv.org/abs/2306.15626
  2. Extending Context Window of LLMs-Extends the context window of LLMs like LLaMA to up to 32K with minimal fine-tuning (within 1000 steps); previous methods for extending the context window are inefficient but this approach attains good performance on several tasks while being more efficient and cost-effective-https://arxiv.org/abs/2306.15595.
  3. Computer Vision Through the Lens of Natural Language - proposes a modular approach for solving computer vision problems by leveraging LLMs; the LLM is used to reason over outputs from independent and descriptive modules that provide extensive information about an image-https://arxiv.org/abs/2306.16410.
  4. Understanding Theory-of-Mind in LLMs with LLMs - a framework for procedurally generating evaluations with LLMs; proposes a benchmark to study the social reasoning capabilities of LLMs with LLMs. https://arxiv.org/abs/2306.15448.
  5. Evaluations with No Labels - A framework for self-supervised evaluation of LLMs by analyzing their sensitivity or invariance to transformations on input text; can be used to monitor LLM behavior on datasets streamed during live model deployment - https://arxiv.org/abs/2306.13651v1
  6. Long-range Language Modeling with Self-Retrieval - an architecture and training procedure for jointly training a retrieval-augmented language model from scratch for long-range language modeling tasks. - https://arxiv.org/abs/2306.13421.
  7. Scaling MLPs-A Tale of Inductive Bias - Shows that the performance of MLPs improves with scale and highlights that lack of inductive bias can be compensated- https://arxiv.org/abs/2306.13575
  8. Textbooks Are All You Need - Introduces a new 1.3B parameter LLM called phi-1; it’s significantly smaller in size and trained for 4 days using a selection of textbook-quality data and synthetic textbooks and exercises with GPT-3.5; achieves promising results on the HumanEval benchmark-https://arxiv.org/abs/2306.11644
  9. RoboCat - New foundation agent that can operate different robotic arms and can solve tasks from as few as 100 demonstrations; the self-improving AI agent can self-generate new training data to improve its technique and get more efficient at adapting to new tasks - https://arxiv.org/abs/2306.11706
  10. ClinicalGPT - Language model optimized through extensive and diverse medical data, including medical records, domain-specific knowledge, and multi-round dialogue consultations - https://arxiv.org/abs/2306.09968
  11. An Overview of Catastrophic AI Risks - Provides an overview of the main sources of catastrophic AI risks; the goal is to foster more understanding of these risks and ensure AI systems are developed in a safe manner - https://arxiv.org/abs/2306.12001v1
  12. AudioPaLM - Fuses text-based and speech-based LMs, PaLM-2 and AudioLM, into a multimodal architecture that supports speech understanding and generation; outperforms existing systems on speech translation tasks with zero-shot speech-to-text translation capabilities - https://arxiv.org/abs/2306.12925v1

** MAY 2023 **

  1. Gorilla - Finetuned LLaMA-based model that surpasses GPT-4 on writing API calls; this capability can help identify the right API, boosting the ability of LLMs to interact with external tools to complete specific tasks - https://arxiv.org/abs/2305.15334
  2. The False Promise of Imitating Proprietary LLMs - Provides a critical analysis of models that are finetuned on the outputs of a stronger model; argues that model imitation is a false promise and that the higher-leverage action for improving open-source models is to develop better base models - https://arxiv.org/abs/2305.15717
  3. InstructBLIP-Explores visual-language instruction tuning based on the pre-trained BLIP-2 models; achieves state-of-the-art zero-shot performance on 13 held-out datasets, outperforming BLIP-2 and Flamingo. https://arxiv.org/abs/2305.06500
  4. Active Retrieval Augmented LLMs - Introduces FLARE, an approach to retrieval-augmented generation that improves the reliability of LLMs; FLARE actively decides when and what to retrieve over the course of the generation; demonstrates superior or competitive performance on long-form knowledge-intensive generation tasks (a rough sketch of the idea follows this list) - https://arxiv.org/abs/2305.06983
  5. AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head - connects ChatGPT with audio foundational models to handle challenging audio tasks and a modality transformation interface to enable spoken dialogue-https://arxiv.org/abs/2304.12995
  6. DataComp: In search of the next generation of multimodal datasets - Releases a new multimodal dataset benchmark containing 12.8B image-text pairs - https://arxiv.org/abs/2304.14108
  7. ChatGPT for Information Extraction - Provides a deeper assessment of ChatGPT's performance on the important information extraction task - https://arxiv.org/abs/2304.11633
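
A very loose sketch of the active-retrieval idea behind FLARE (item 4 above): generate a tentative next sentence and, if the model is not confident about it, retrieve evidence and regenerate. `llm_next_sentence` and `retrieve` are hypothetical helpers, and the confidence threshold is illustrative:

```python
def llm_next_sentence(prompt: str) -> tuple[str, float]:
    """Return (next sentence, minimum token probability). Placeholder."""
    raise NotImplementedError


def retrieve(query: str) -> str:
    """Return concatenated retrieved passages for the query. Placeholder."""
    raise NotImplementedError


def flare_generate(question: str, max_sentences: int = 10, threshold: float = 0.6) -> str:
    answer = ""
    for _ in range(max_sentences):
        sentence, confidence = llm_next_sentence(f"{question}\n{answer}")
        if confidence < threshold:
            # Low-confidence sentence: use it as a search query, then regenerate
            # the sentence grounded in the retrieved evidence.
            docs = retrieve(sentence)
            sentence, _ = llm_next_sentence(f"{docs}\n\n{question}\n{answer}")
        if not sentence:
            break
        answer += sentence + " "
    return answer.strip()
```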

** MARCH 2023 **

  1. GPT-4 Technical Report - GPT-4 is a large multimodal model with broader general knowledge and problem-solving abilities - https://arxiv.org/abs/2303.08774v2
  2. LERF: Language Embedded Radiance Fields - Method for grounding language embeddings from models like CLIP into NeRF; this enables open-ended language queries in 3D.
  3. An Overview on Language Models: Recent Developments and Outlook - An overview of language models covering recent developments and future directions; it also covers topics like linguistic units, structures, training methods, evaluation, and applications - https://arxiv.org/abs/2303.05759
  4. Eliciting Latent Predictions from Transformers with the Tuned Lens - Method for transformer interpretability that can trace a language model's predictions as they develop layer by layer - https://arxiv.org/abs/2303.08112

** FEB 2023 **

  1. Multimodal Chain-of-Thought Reasoning in Language Models - Uses vision features to elicit chain-of-thought reasoning in multimodality, enabling the model to generate effective rationales that contribute to answer inference-https://arxiv.org/abs/2302.00923.
  2. Dreamix: Video Diffusion Models are General Video Editors - a diffusion model that performs text-based motion and appearance editing of general videos.
  3. Benchmarking Large Language Models for News Summarization - https://arxiv.org/abs/2301.13848

** JAN 2023 **

  1. Rethinking with Retrieval: Faithful Large Language Model Inference - shows the potential of enhancing LLMs by retrieving relevant external knowledge based on decomposed reasoning steps obtained through chain-of-thought prompting- https://arxiv.org/abs/2301.00303.
  2. SparseGPT: Massive Language Models Can Be Accurately Pruned In One-Shot - Presents a technique for compressing large language models while not sacrificing performance; models can be "pruned to at least 50% sparsity in one-shot, without any retraining" - https://arxiv.org/abs/2301.00774
  3. ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders - Performant model based on a fully convolutional masked autoencoder framework and other architectural improvements; ConvNets are striking back - https://arxiv.org/abs/2301.00808

** Open AI's Prompt Engineering Handbook **

https://platform.openai.com/docs/guides/prompt-engineering

** OpenAI Cookbook - Github **

https://github.com/openai/openai-cookbook/blob/main/techniques_to_improve_reliability.md#how-to-improve-reliability-on-complex-tasks
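
A minimal illustration of one technique from that cookbook page, asking the model to reason step by step before stating its final answer, using the OpenAI Python SDK (v1-style client); the model name is illustrative and an `OPENAI_API_KEY` environment variable is assumed:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system", "content": "Think through the problem step by step, then give the final answer on its own line."},
        {"role": "user", "content": "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?"},
    ],
)
print(response.choices[0].message.content)
```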

** Lilianweng's Gitub Resources & Blogs **

https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/

** Real-time machine learning: challenges and solutions by @Chip Huyen **

https://huyenchip.com/2022/01/02/real-time-machine-learning-challenges-and-solutions.htm

** Building LLM applications for production by @Chip Huyen **

https://huyenchip.com/2023/04/11/llm-engineering.html

** Full Stack LLM Bootcamp by @Charles Frye @Sergey Karayev @Josh Tobin **

https://fullstackdeeplearning.com/llm-bootcamp/

** Prompt Engineering Guide ** - Excellent Resources for Latest Research Papers on Prompt Engineering by @Elvis Saravia

https://www.promptingguide.ai/

** Awesome Resources for AI, Machine Learning, MLOps, and Productionizing Machine Learning Models at Scale **

** MadeWithML - @Goku Mohandas** https://madewithml.com/courses/mlops/testing/

** Awesome Production Machine Learning ** https://github.com/EthicalML/awesome-production-machine-learning

** Transformers from Scratch - Awesome Transformers Explanation ** https://e2eml.school/transformers.html#resources

** Attention is all you need; Attentional Neural Network Models | Łukasz Kaiser | Masterclass Video Lecture Explanation on Transformers ** - https://www.youtube.com/watch?v=rBCqOTEfxvg

The Annotated Transformer - Code Implementation along with Model Architecture - https://nlp.seas.harvard.edu/2018/04/03/attention.html
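
As a quick reference alongside these resources, the scaled dot-product attention at the core of the Transformer, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, written out in plain NumPy:

```python
import numpy as np


def scaled_dot_product_attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)           # query/key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)            # softmax over key positions
    return weights @ V                                        # weighted sum of values


Q = np.random.randn(2, 5, 64)   # (batch, query positions, d_k)
K = np.random.randn(2, 7, 64)   # (batch, key positions, d_k)
V = np.random.randn(2, 7, 64)
print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 5, 64)
```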

** Hugging Face Transformers Library Github **

https://github.com/huggingface/transformers
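
A quick-start sketch for the library linked above using the high-level pipeline API; the model choice (`gpt2`) is just an illustrative small checkpoint:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Retrieval-augmented generation is", max_new_tokens=30)
print(result[0]["generated_text"])
```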

** FullStack DeepLearning Prof - @Sergey Karayev - Good Read on GPT Architecture **

https://dugas.ch/artificial_curiosity/GPT_architecture.html

** About - An awesome & curated list of best LLMOps tools for developers ** https://github.com/tensorchord/Awesome-LLMOps

** A list of open LLMs available for commercial use by @eugeneyan **

https://github.com/eugeneyan/open-llms

** @ModularAI - @Mojo 🔥 — AI developers **

Mojo combines the usability of Python with the performance of C, unlocking unparalleled programmability of AI hardware and extensibility of AI models.

** Replit - Training Large Language Models **

https://replit.com/@MckayWrigley/Takeoff-School-LangChain-101-Models?v=1

** Lean Copilot: LLMs as Copilots for Theorem Proving in Lean **

Lean Copilot allows large language models (LLMs) to be used in Lean for proof automation, e.g., suggesting tactics/premises and searching for proofs. You can use our built-in models from LeanDojo or bring your own models that run either locally (w/ or w/o GPUs) or on the cloud.

** Cohere AI - LLM University **

https://docs.cohere.com/docs/llmu?_gl=1*1k8vxo5*_gcl_au*MTE2MjEzMDAwNi4xNzAyODE1ODA5*_ga*NjUwODA2NDQ4LjE2ODQxNDkwMTU.*_ga_CRGS116RZS*MTcwMjgxNTgwOC40LjEuMTcwMjgxNTgyOS4zOS4wLjA.

** Cohere.AI - Playground using RAG **

https://cohere.com/

** @Niels Rogge's Slides of his Recent talk on how to train and deploy open-source large language models (LLMs) **

The talk is about the rise of open LLMs we saw this year, from Llama-2 by Meta to the Mixtral-8x7b model by Mistral AI, which is already on par with GPT-3.5 and even preferred over Gemini Pro according to recent evaluations - https://docs.google.com/presentation/d/1yBWLNzlrrIsfNprbEnmYdqckAMrwFZB-/edit#slide=id.p1

Strategies for Effective and Efficient Text Ranking Using Large Language Models

https://blog.reachsumit.com/posts/2023/12/towards-ranking-aware-llms/

How to make LLMs go fast

https://vgel.me/posts/faster-inference/

LLM360: Towards Fully Transparent Open-Source LLMs

  1. Amber - Fully open-source LLM developed by LLM360 with 7B parameters.
  2. AmberChat - Instruction-tuned LLM based on Amber (7B).
  3. AmberSafe - Safety-finetuned LLM based on AmberChat.

LLM360 - Towards Fully Transparent Open-Source LLMs
  • AmberChat model: https://huggingface.co/LLM360/AmberChat
  • AmberSafe model: https://huggingface.co/LLM360/AmberSafe
  • Amber paper: https://arxiv.org/abs/2312.06550
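
A hedged sketch of loading the AmberChat checkpoint above with the Hugging Face Transformers library; the exact chat prompt format may differ, so check the model card before relying on the raw prompt used here:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LLM360/AmberChat"  # model card: https://huggingface.co/LLM360/AmberChat
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("What does a fully transparent open-source LLM mean?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```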

Awesome Research Github Repositories

  1. Neural Networks: Zero to Hero by @Andrejkarpathy - https://github.com/karpathy/nn-zero-to-hero

  2. Awesome Machine Learning Models in Production https://github.com/EthicalML/awesome-production-machine-learning

Speech Recognition

Choosing a Python speech recognition package - the following packages exist on PyPI (a minimal usage example follows the list):

  1. apiai
  2. assemblyai
  3. google-cloud-speech
  4. pocketsphinx
  5. SpeechRecognition
  6. watson-developer-cloud
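
A minimal example with the SpeechRecognition package from the list above (`pip install SpeechRecognition`); the audio file name is a placeholder:

```python
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("sample.wav") as source:      # any PCM WAV/AIFF/FLAC file
    audio = recognizer.record(source)

# Uses Google's free web API by default; other recognizers (Sphinx, Whisper, ...)
# are available on the same Recognizer object.
print(recognizer.recognize_google(audio))
```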

** Open Source AI Libraries **

ChatBots Browser Extension AI Libraries

  1. ChatHub - An all-in-one chatbot client. https://github.com/aditikhare007/chathub
  • Chatbots in one app, currently supporting ChatGPT, new Bing Chat, Google Bard, Claude, and open-source models including LLaMA 2, Vicuna, ChatGLM, etc.
  • Support for ChatGPT API and GPT-4 Browsing.
  • Shortcut to quickly activate the app anywhere in the browser
  • Markdown and code highlight support
  • Prompt Library for custom prompts and community prompts
  • Conversation history saved locally
  • Export and Import all your data.
  2. Vald - Open-source, cloud-native distributed vector search engine.
  • Features horizontal scaling.
  • Customizable filtering.
  • Auto-indexing/backup.

Large Language Models - AI Libraries

Awesome LLM Interpretability

https://github.com/JShollaj/awesome-llm-interpretability

Parul Pandey's Research Paper Collection

  1. Three Pass Method.
  2. Andrew Ng's Lecture on reading research paper.
  3. Connected Papers.
  4. Research Rabbit.
  5. Arxiv Sanity Preserver.
  6. Arxiv Vanity.

** Quantum AI Research Papers **

** JAN 2024 **

** DEC 2023 **

  1. 21st Dec 2023 - ** Exploiting Novel GPT-4 APIs ** - https://arxiv.org/abs/2312.14302
  2. 18th Dec 2023 - ** Design of Quantum Machine Learning Course for a Computer Science Program ** - https://ieeexplore.ieee.org/document/10313632
  3. 2nd Dec 2023 - ** Hybrid Quantum Neural Network in High-dimensional Data Classification ** - https://arxiv.org/abs/2312.01024

** Thank you so much for visiting my AI Research Junction@Aditi Khare - Research Papers Summaries @Generative AI @Computer Vision @Quantum AI **

** If you find my AI Research Junction@Aditi Khare useful, please star my repository to support my work - Happy Learning! **