Fefe (frldj)

Fefe's starred repositories

private-gpt

Interact with your documents using the power of GPT, 100% privately, no data leaks

Language: Python | License: Apache-2.0 | Stargazers: 53149 | Issues: 0
Language: Python | Stargazers: 8 | Issues: 0
Language: Python | Stargazers: 2 | Issues: 0

Python-Implementation-of-LSA

A Jupyter notebook implementing Latent Semantic Analysis (a topic modelling algorithm) in Python.

Language: Jupyter Notebook | License: MIT | Stargazers: 8 | Issues: 0

InformationRetrieval

Uses real Yelp review data to rank reviews against a query review by cosine similarity in a TF-IDF vector space model. Implements unigram and bigram language models with linear interpolation smoothing, absolute discounting smoothing, and Dirichlet smoothing, plus perplexity analysis. Evaluates six ranking models: Boolean, TF-IDF, Okapi BM25, Pivoted Length Normalization, Jelinek-Mercer smoothing, and Dirichlet Prior smoothing. The evaluation metrics include Mean Average Precision (MAP), P@K, reciprocal rank, and Normalized Discounted Cumulative Gain (NDCG).

Language: Java | Stargazers: 2 | Issues: 0
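
For readers unfamiliar with the evaluation metrics named in the InformationRetrieval entry, here is a minimal Python sketch of P@K, reciprocal rank, average precision (MAP is its mean over queries), and NDCG@K. It is not taken from the repository, which is written in Java; the function names and the input shapes (a ranked list of document IDs, a set of relevant IDs, a dict of graded gains) are assumptions made for illustration.

```python
import math

def precision_at_k(relevant, ranked, k):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k

def reciprocal_rank(relevant, ranked):
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, d in enumerate(ranked, start=1):
        if d in relevant:
            return 1.0 / rank
    return 0.0

def average_precision(relevant, ranked):
    """Mean of the precision values at each rank where a relevant document appears."""
    hits, total = 0, 0.0
    for rank, d in enumerate(ranked, start=1):
        if d in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def ndcg_at_k(gains, ranked, k):
    """DCG of the top-k ranking divided by the DCG of the ideal ordering.

    `gains` maps document ID -> graded relevance (hypothetical input shape).
    """
    dcg = sum(gains.get(d, 0) / math.log2(rank + 1)
              for rank, d in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0
```

Mean Average Precision would then be the average of `average_precision` over all evaluated queries.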

BM25

A complete implementation of Okapi BM25 with five evaluation methods (precision, recall, MAP, P at N and NDCG at N), using only standard Python libraries.

Language: Python | License: MIT | Stargazers: 1 | Issues: 0
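
As a rough illustration of what an Okapi BM25 scorer built on standard Python libraries can look like, here is a short sketch. It is not the code from the BM25 repository: the function name `bm25_scores`, the pre-tokenized input format, the default parameters `k1=1.5` and `b=0.75`, and the non-negative IDF variant are all assumptions chosen for this example.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against a tokenized `query`."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs  # average document length
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)  # term frequencies within this document
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # A common IDF variant that adds 1 inside the log so it stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores
```

Calling `bm25_scores(["cat"], [["cat", "sat"], ["dog", "sat", "down"]])` returns one score per document, with a positive score only for the document containing the query term.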

pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Language: Python | License: Apache-2.0 | Stargazers: 1574 | Issues: 0

nlp

Natural Language Processing

Language: Python | Stargazers: 96 | Issues: 0

search-engine-tfidf

Search engine implementation using the TF-IDF algorithm, built with Python, Flask, and MySQL

Language: Python | Stargazers: 7 | Issues: 0

Intelligent_Document_Finder

Document Search Engine Tool

Language: Python | License: MIT | Stargazers: 70 | Issues: 0

TopicBERT

Implementation of the EMNLP 2020 paper "TopicBERT: Topic-aware BERT for Efficient Document Classification"

Language: Python | Stargazers: 42 | Issues: 0
Language: SAS | Stargazers: 1 | Issues: 0

deep-learning-keras-tf-tutorial

Learn deep learning from scratch with TensorFlow 2.0, Keras, and Python through this comprehensive tutorial series for beginners.

Language: Jupyter Notebook | Stargazers: 811 | Issues: 0

CamembertForFun

Small sentiment classification project using CamemBERT trained on Allociné reviews, with a web app interface

Language: Python | Stargazers: 2 | Issues: 0

nyt-article-summarizer

New York Times Article Summarization Tool

Language: Jupyter Notebook | Stargazers: 16 | Issues: 0

NLP-image-to-text

Code to extract text from images

Language: Python | License: MIT | Stargazers: 35 | Issues: 0

cord19

A repo for the CORD-19 challenge

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 32 | Issues: 0

annoy

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

Language: C++ | License: Apache-2.0 | Stargazers: 12975 | Issues: 0

Checkbox-Table-cell-detection-using-OpenCV-Python

Extracts relevant information from unstructured data sources such as OMR sheets, scanned invoices, and bills into structured data, using computer vision and natural language processing. The primary steps are Optical Character Recognition (OCR) and Document Layout Analysis. OCR detects the text in the image, and layout analysis recovers additional metadata from the documents, such as headers, paragraphs, lines, words, tables, and key-value pairs.

Language: Jupyter Notebook | Stargazers: 2 | Issues: 0

Search_Engine_for_Wikipedia

A search engine for the French Wikipedia, implemented from scratch

Language: Jupyter Notebook | Stargazers: 11 | Issues: 0

French-Word-Embeddings

French word embeddings from series subtitles

Language: Jupyter Notebook | License: MIT | Stargazers: 22 | Issues: 0

bert_semantic_matching

Chinese semantic matching with BERT, based on AllenNLP.

Language: Python | Stargazers: 5 | Issues: 0
Language: Python | License: MIT | Stargazers: 3144 | Issues: 0

Topic-Modeling-BERT-LDA

Topic modeling with BERT, LDA, and clustering. Uses Latent Dirichlet Allocation (LDA) probabilistic topic assignment and pre-trained sentence embeddings from BERT/RoBERTa.

Language: Jupyter Notebook | Stargazers: 49 | Issues: 0

TopicModelling-LSA-LDA

Retrieving topics (concepts) from a corpus using (1) Latent Dirichlet Allocation (Gensim), with perplexity and coherence scores as evaluation metrics, and (2) Latent Semantic Analysis using Term Frequency-Inverse Document Frequency and truncated Singular Value Decomposition.

Language: Jupyter Notebook | Stargazers: 12 | Issues: 0

semantic-search-through-wikipedia-with-weaviate

Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine

Language: Python | License: MIT | Stargazers: 241 | Issues: 0

vector_engine

Build a semantic search engine with Transformers and Faiss

Language: Jupyter Notebook | Stargazers: 143 | Issues: 0

Information_retrieval_system

Information retrieval system: Python, text mining, bag of words, web mining, word2vec, Jupyter, IPYNB

Language: Jupyter Notebook | Stargazers: 1 | Issues: 0

DGMS

Code for "Deep Graph Matching and Searching for Semantic Code Retrieval"

Language: Python | Stargazers: 24 | Issues: 0