GaoShen's repositories
DistributeCrawler
A Map/Reduce-based crawler that extracts article text from major news websites and then classifies and clusters it
stickerchat
Dataset for WWW 2020 paper "Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog"
proto-summ
Dataset proposed by "How to Write Summaries with Patterns? Learning towards Abstractive Summarization through Prototype Editing"
DistributedCrawler
Maven version of DistributeCrawler
table-summ
BioGen: Generating Biography Summary under Table Guidance on Wikipedia
char-rnn-tensorflow
Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using TensorFlow
dialog-summ
Code for ACL 2023 paper "Dialogue Summarization with Structure-aware Graph Modeling via Static-Dynamic Fused Graph"
Beijing_Daxuexi_Simple
Automatically completes the Beijing 青年大学习 (Youth Study) sessions using GitHub Actions
DateConverter
Converts dates in Excel
DistributedWebSearcher
Web search site for DistributedCrawler
LuceneStudy
Learning Lucene 3.0
PKUAutoSubmit
A small tool for one-click filing of PKU campus entry and exit registrations
TextBox
TextBox 2.0 is a text generation library with pre-trained language models
webmagic
A scalable web crawler framework.