Model Card for `numbda-webnews`

numbda-webnews is a news classification model fine-tuned from roberta-base-finetuned-ifeng-chinese on a new dataset of approximately 40k news articles crawled from news websites in China. It is a sub-project of the AI-Testing project.

The dataset includes, but is not limited to, the following 14 categories:

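资讯 (general news)
财经 (finance)
体育 (sports)
时政 (politics and current affairs)
娱乐 (entertainment)
社会 (society)
科技 (technology)
汽车 (automobiles)
健康 (health)
萌宠 (pets)
国际 (international)
生活 (lifestyle)
美食 (food)
游戏 (gaming)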

The 14 categories above contain 26k samples in total.

Model Details

Model Sources

Repository: https://github.com/wenh06/numbda-webnews
Hugging Face Hub: https://huggingface.co/wenh06/numbda-webnews

Uses

Direct Use

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# "wenh06/numbda-webnews" can be replaced with the local path to the model directory
tokenizer = AutoTokenizer.from_pretrained("wenh06/numbda-webnews")
model = AutoModelForSequenceClassification.from_pretrained("wenh06/numbda-webnews")

# Use a distinct name so the imported `pipeline` function is not shadowed
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
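Once a text-classification pipeline has been built, it can be called on an article to get the most likely categories. The sketch below is illustrative rather than part of the repository: `load_classifier` and `top_categories` are hypothetical helper names, and it assumes a transformers version whose text-classification pipeline accepts `top_k` and `truncation` at call time. The label strings returned depend on the model's configuration, which is not documented here.

```python
from typing import Any, Callable, List, Tuple

MODEL_ID = "wenh06/numbda-webnews"  # from the model card; a local path also works


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so that top_categories() can be used with any callable.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def top_categories(
    classifier: Callable[..., Any], text: str, k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (label, score) pairs for one article."""
    # truncation=True cuts long articles down to the model's maximum input length.
    result = classifier(text, top_k=k, truncation=True)
    # Some versions wrap the result for a single input in an extra list.
    if result and isinstance(result[0], list):
        result = result[0]
    return [(item["label"], float(item["score"])) for item in result]


# Example usage (requires transformers and network access):
# classifier = load_classifier()
# print(top_categories(classifier, "Text of a Chinese news article", k=3))
```

Requesting several labels rather than one matches how the model is evaluated below, where the top-3 and top-5 accuracies are considerably higher than the top-1 accuracy.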

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

Training Details

Training Data

This model was fine-tuned on a new dataset of approximately 40k news articles crawled from news websites in China. The dataset will be released at a later date.

Evaluation

Evaluation results and software/hardware information can be found in Weights & Biases.

Metric	Score
top1-accuracy	0.768
top3-accuracy	0.944
top5-accuracy	0.981
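For readers unfamiliar with the metric, top-n accuracy counts a prediction as correct when the true category is among the n highest-scoring classes. The following is a minimal sketch of that definition in plain Python, not the project's evaluation code. The function name `top_k_accuracy` and the toy scores are illustrative assumptions.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the largest scores for this sample.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example: 4 samples, 5 classes (made-up scores, not model output).
toy_scores = [
    [0.60, 0.20, 0.10, 0.06, 0.04],  # true class 0 is ranked 1st
    [0.10, 0.50, 0.30, 0.06, 0.04],  # true class 2 is ranked 2nd
    [0.05, 0.10, 0.15, 0.30, 0.40],  # true class 1 is ranked 4th
    [0.70, 0.12, 0.08, 0.06, 0.04],  # true class 0 is ranked 1st
]
toy_labels = [0, 2, 1, 0]

for k in (1, 3, 5):
    print(f"top{k}-accuracy: {top_k_accuracy(toy_scores, toy_labels, k):.3f}")
```

By construction top-n accuracy never decreases as n grows, which is consistent with the ordering of the three scores in the table.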

Curves of Top n Accuracy

[Figures: Top1 Accuracy, Top3 Accuracy, and Top5 Accuracy curves]
