Web IR / NLP Group @ NUS (WING-NUS)

Location: Singapore

Home Page: wing.comp.nus.edu.sg

Twitter: @wing_nus

Web IR / NLP Group @ NUS's repositories

SG-Deep-Question-Generation

This repository contains code and models for the paper: Semantic Graphs for Generating Deep Questions (ACL 2020).

Language: Python | License: MIT | Stargazers: 64 | Issues: 3 | Issues: 0

nus-sms-corpus

This is the distribution point for the NUS SMS Corpus, a corpus of SMS (Short Message Service) messages collected for research at the Department of Computer Science at the National University of Singapore. This dataset consists of 67,093 SMS messages taken from the corpus on Mar 9, 2015. The messages largely originate from Singaporeans, mostly students attending the University. They were collected from volunteers who were made aware that their contributions would be made publicly available. The data collectors opportunistically collected as much metadata about the messages and their senders as possible, to enable different types of analyses. This corpus was collected by Tao Chen and Min-Yen Kan. If you use this data, please cite the following paper (for more details, refer to the Citation field): Tao Chen and Min-Yen Kan (2013). Creating a Live, Public Short Message Service Corpus: The NUS SMS Corpus. Language Resources and Evaluation, 47(2), pages 299-355. URL: https://link.springer.com/article/10.1007%2Fs10579-012-9197-9

Language: Python | License: NOASSERTION | Stargazers: 18 | Issues: 4 | Issues: 17

SSID

Student Submission Integrity Diagnosis

ELCo

The Dataset and Official Implementation for <The ELCo Dataset: Bridging Emoji and Lexical Composition> @ LREC-COLING 2024

Language: Python | Stargazers: 11 | Issues: 3 | Issues: 0

sciwing

SciWING is a modern toolkit for scientific document processing from WING-NUS

Language: Python | License: MIT | Stargazers: 6 | Issues: 4 | Issues: 3

Summarization-Papers

Summarization Papers

Language: TeX | Stargazers: 6 | Issues: 2 | Issues: 0

FormatEval

[Preprint '24] LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias of LLMs

Language: Python | Stargazers: 2 | Issues: 0 | Issues: 0

lib4moocdata

Library for processing MOOC data dumps. Currently limited to Coursera data.

Language: Perl | License: GPL-3.0 | Stargazers: 2 | Issues: 2 | Issues: 0

AdvFM

Adversarial Deep Factorization Machine

Language: Python | Stargazers: 1 | Issues: 2 | Issues: 0

AutomaticKeyphraseExtraction

Data for Automatic Keyphrase Extraction Task

Language: Python | Stargazers: 1 | Issues: 2 | Issues: 0

FormatBiasEval

[Preprint '24] LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias of LLMs

Language: Python | Stargazers: 1 | Issues: 0 | Issues: 0
License: NOASSERTION | Stargazers: 1 | Issues: 2 | Issues: 0

QACheck

Data and code for the EMNLP 2023 System Demo paper "QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking"

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 2 | Issues: 0

QMSum

Dataset for NAACL 2021 paper: "QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization"

Language: Jupyter Notebook | License: MIT | Stargazers: 1 | Issues: 2 | Issues: 0

Sealing

[NAACL 2024] Official implementation of the paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"

Language: Python | License: MIT | Stargazers: 1 | Issues: 1 | Issues: 0
Language: Python | License: MIT | Stargazers: 1 | Issues: 2 | Issues: 0

SemanticTokenizer

Item Tokenization: the future of recommender systems

Language: Python | Stargazers: 1 | Issues: 1 | Issues: 0

wing-website

Hugo Blox WING Website pilot

Language: TeX | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

CoAnnotating

This is the official repository for "CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation"

Language: Jupyter Notebook | Stargazers: 0 | Issues: 2 | Issues: 0

ControllableLyricTranslation

Code for the paper "Songs Across Borders: Singable and Controllable Neural Lyric Translation"

Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

DiSQ-Score

The Dataset and Official Implementation for <Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models’ Understanding of Discourse Relations> @ ACL 2024

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

LLM-Misinfo-QA

This repository contains the data and code used for "On the Risk of Misinformation Pollution with Large Language Models" (to appear in Findings of EMNLP 2023).

Language: Python | Stargazers: 0 | Issues: 2 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

nnose

Codebase for NNOSE: Nearest Neighbor Occupational Skill Extraction

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

RL-for-Question-Generation

This repository contains code and models for the paper: Exploring Question-Specific Rewards for Generating Deep Questions (COLING 2020).

Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

SciTab

The project page for "SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables"

License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0