ptaszynski's repositories
yacis-corpus
This repository holds the YACIS corpus, along with information on how to obtain the whole corpus and its annotations.
mlask
This page contains information about ML-Ask, a system for Affect Analysis of textual input in Japanese. It is based on the linguistic assumption that a speaker's emotional states are conveyed by the emotional expressions used in emotive utterances. ML-Ask first separates emotive utterances from non-emotive ones, and then searches the emotive utterances for expressions of specific emotion types.
cyberbullying-Polish
This dataset contains Polish-language tweets annotated with cyberbullying and hate-speech labels, together with scoring scripts that report Precision, Recall, balanced F-score, and Accuracy.
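For readers unfamiliar with the four measures named above, here is a minimal Python sketch of how they are commonly computed for a binary labelling task. It is not the repository's scoring script: the function name `binary_scores`, the label encoding (1 for the positive class), and the convention of returning 0.0 when a denominator is zero are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def binary_scores(gold: Sequence[int], pred: Sequence[int], positive: int = 1) -> Dict[str, float]:
    """Precision, recall, balanced F-score (F1) and accuracy for one positive class.

    Hypothetical helper; the label encoding is an assumption, not the dataset's format.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    tn = sum(1 for g, p in zip(gold, pred) if g != positive and p != positive)

    # Assumed convention: a score with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(gold) if len(gold) else 0.0

    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}
```

Calling `binary_scores(gold_labels, predicted_labels)` returns a dictionary with the four scores; the official scripts in the repository remain the reference for any reported results.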
cabocha-extractor
A tool for preprocessing Japanese text data for machine learning. It uses MeCab for tokenization and part-of-speech tagging, and CaboCha for shallow and deep parsing.
COVID-19
Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
fetag
Simple feature tagger / Simple feature extractor
googleforms_automation
https://youtu.be/LzYZKS9DjH8
tfidf
A simple tf*idf calculator.
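As a rough illustration of what a tf*idf calculation involves, the Python sketch below weights each term by its relative frequency in a document multiplied by the natural logarithm of the inverse document frequency. The description does not say which weighting variant the repository uses, so the function name `tf_idf`, the unsmoothed formula, and the pre-tokenized input format are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} dictionary per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df), no smoothing.
    """
    n_docs = len(docs)

    # Document frequency: the number of documents containing each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

With this variant, a term that occurs in every document receives a weight of zero, because ln(N / N) = 0; other variants add smoothing to avoid that.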
visualizer.coffee-shot-downloader
Download your shot data as JSON files from https://visualizer.coffee