chienhung1519's starred repositories
Python-AI-Book
Volume 2 of 《少年Py的大冒險》 (The Great Adventures of Young Py): an introduction to deep learning!
Fengshenbang-LM
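Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center at the IDEA Research Institute, intended to serve as infrastructure for Chinese AIGC and cognitive intelligence.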
Transformer_classifier_pytorch
A notebook demonstrating the use of a Transformer encoder block for classification, using only PyTorch!
awesome-chatgpt
Curated list of high-quality ChatGPT-related resources, tools, prompts, and apps.
information-retrieval
Neural information retrieval / Semantic search / Bi-encoders
Text-Attention-Heatmap-Visualization
Plot the vector graph of attention-based text visualisation
deep-significance
Enabling easy statistical significance testing for deep neural networks.
annotated_deep_learning_paper_implementations
🧑🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
speech-nlp-datasets
Contains links to publicly available datasets for modeling health outcomes using speech and language.
lstm-crf-pytorch
LSTM-CRF in PyTorch
unilm-pytorch
PyTorch version of the UniLM model
Lifelog-VidLife
VidLife contains personal life events in triple form from The Big Bang Theory, e.g. (Leonard, visit, Penny), and is designed for training and evaluating personal life event extraction systems. Part of the life event annotations is released in this repository and can be downloaded. The complete dataset will be made available online after our paper is accepted.
Lifelog-LiveKB
People often forget things in daily life, so there is a growing need for information recall support at the right time and place. Constructing personal knowledge bases for individuals is important for memory recall and living assistance applications. We collected tweets from 18 users who set their tweets as public, posted between 2009 and 2017. We aim to extract life events from tweets shared on Twitter and construct personal knowledge bases for individuals.
Lifelog-PKBQAC-Dataset
A Dataset for Personal Knowledge Base Question Answering and Unanswerable Question Correction
Lifelog-VisLife
Recently, people have tended to record their daily lives by filming video weblogs (vlogs), which contain visual and audio data. This large-scale multimodal data can be used to support an information recall service that lets users query their past experiences. To this end, we construct a visual lifelogging dataset for investigating personal life event extraction from vlogs shared on YouTube and for constructing a personal knowledge base (PKB) for individuals. The dataset contains 1,733 videos from three selected YouTubers, ranging from 2016 to 2019. All of the crawled videos are about traveling.
Finance-NumClaim
Numerals provide important information in financial narratives. Our statistics on financial analysis reports show that over 58.47% of sentences contain at least one numeral. Without the numerals, much of the fine-grained information in the analysis reports would be lost, which shows how important numerals are in financial narratives. Based on our observation, investors always make a claim with an estimation. This estimation can be a cue for detecting the investor's fine-grained claim. Therefore, we propose an expert-annotated dataset, NumClaim, for probing argument mining in financial narratives. Among the 5,144 instances in the NumClaim dataset, 23.78% and 76.22% of instances containing numerals are annotated as "In-claim" and "Out-of-claim", respectively.
Finance-ICRD
There are two tasks in the ICRD. The datasets are split into three parts: Train/Dev/Test. (1) Premise Detection: the aim is to identify whether the given sentence is a premise. There are two keys for each instance. "sentence" is the given sentence. If the value of "ans" is 0, the given sentence is not a premise. If the value of "ans" is 1, the given sentence is a premise. (2) Claim-Premise Inference: given a claim and a sentence, models are asked to predict whether the given sentence is a premise of the claim. There are three keys for each instance. "claim" is the given claim and "compare_sent" is the other given sentence. If the value of "ans" is 0, the given sentence is not a premise of the given claim. If the value of "ans" is 1, the given sentence is a premise of the given claim.
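The two instance formats described for ICRD can be illustrated with a minimal Python sketch. Only the key names ("sentence", "claim", "compare_sent", "ans") and the 0/1 label meanings come from the description above; the example sentences, the `describe` helper, and the representation of instances as Python dicts are assumptions made for illustration, not the repository's actual files or loader.

```python
# Hypothetical instances: the texts are invented, only the keys and
# the 0/1 meaning of "ans" follow the ICRD description.
premise_instance = {
    "sentence": "Revenue grew 12% year over year.",
    "ans": 1,
}
inference_instance = {
    "claim": "We expect the stock to outperform.",
    "compare_sent": "The office moved to a new building.",
    "ans": 0,
}


def describe(instance):
    """Return a readable interpretation of an ICRD-style instance."""
    label = instance["ans"]
    if label not in (0, 1):
        raise ValueError(f"unexpected 'ans' value: {label!r}")

    if "claim" in instance and "compare_sent" in instance:
        # Task 2: Claim-Premise Inference (three keys per instance).
        if label == 1:
            return "sentence is a premise of the claim"
        return "sentence is not a premise of the claim"

    if "sentence" in instance:
        # Task 1: Premise Detection (two keys per instance).
        if label == 1:
            return "sentence is a premise"
        return "sentence is not a premise"

    raise ValueError("instance matches neither ICRD task format")


print(describe(premise_instance))
print(describe(inference_instance))
```

The helper tells the two tasks apart purely by which keys are present, which is enough here because the two formats share only the "ans" key.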