license | language | metrics | pipeline_tag |
---|---|---|---|
apache-2.0 | | | text-classification |
numbda-webnews is a news classification model fine-tuned from roberta-base-finetuned-ifeng-chinese on a new dataset of approximately 40k news articles crawled from news websites in China. It is a sub-project of the AI-Testing project.
The dataset includes, but is not limited to, the following 14 categories:
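- 资讯 (general news)
- 财经 (finance)
- 体育 (sports)
- 时政 (current politics)
- 娱乐 (entertainment)
- 社会 (society)
- 科技 (technology)
- 汽车 (automobiles)
- 健康 (health)
- 萌宠 (pets)
- 国际 (international)
- 生活 (lifestyle)
- 美食 (food)
- 游戏 (games)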
Together, these 14 categories account for about 26k samples.
- Repository: https://github.com/wenh06/numbda-webnews
- Huggingface Hub: https://huggingface.co/wenh06/numbda-webnews
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
# "wenh06/numbda-webnews" can be replaced with a local path to the model directory
tokenizer = AutoTokenizer.from_pretrained("wenh06/numbda-webnews")
model = AutoModelForSequenceClassification.from_pretrained("wenh06/numbda-webnews")
# Use a distinct name so the imported `pipeline` function is not shadowed
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
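Once the pipeline is built, classifying an article is a single call. The following is a minimal sketch rather than the author's own usage: the helper names `top_labels` and `classify`, the default of `k=3`, and the sample headline in the trailing comment are all illustrative assumptions. The `transformers` import is deferred into `classify` so that the formatting helper can be used without loading the model.

```python
from typing import Dict, List, Union


def top_labels(predictions: List[Dict[str, Union[str, float]]], k: int = 3) -> List[str]:
    """Return the k highest-scoring labels from a list of
    {"label": ..., "score": ...} dicts, the shape a text-classification
    pipeline returns for a single input string."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [str(p["label"]) for p in ranked[:k]]


def classify(text: str, k: int = 3, model_id: str = "wenh06/numbda-webnews") -> List[str]:
    """Load the model and return its top-k predicted categories for `text`.

    Needs `transformers` installed and either network access or a local
    model directory passed as `model_id`.
    """
    from transformers import pipeline  # deferred: heavy import

    classifier = pipeline("text-classification", model=model_id)
    # truncation=True guards against articles longer than the model's input limit
    predictions = classifier(text, top_k=k, truncation=True)
    return top_labels(predictions, k)


# Example call (hypothetical headline; downloads the model on first use):
# classify("国足在世界杯预选赛中主场取胜")
```

Asking for several candidates instead of only the best one matches how the model is evaluated, since the card reports top-3 and top-5 accuracy alongside top-1.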
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
This model was fine-tuned on a new dataset of approximately 40k news articles crawled from news websites in China; the dataset will be released at a later date.
Evaluation results and software/hardware information can be found in Weights & Biases.
Metric | Score |
---|---|
top1-accuracy | 0.768 |
top3-accuracy | 0.944 |
top5-accuracy | 0.981 |
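The card does not say how these scores were computed. Assuming the usual definition, top-k accuracy is the fraction of samples whose true category appears among the model's k highest-scoring predictions. The sketch below illustrates that definition on made-up toy data; the function name `top_k_accuracy` and the sample predictions are hypothetical and unrelated to the actual evaluation run.

```python
from typing import Sequence


def top_k_accuracy(ranked_predictions: Sequence[Sequence[str]],
                   true_labels: Sequence[str],
                   k: int) -> float:
    """Fraction of samples whose true label appears among the first k
    entries of that sample's ranked prediction list."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy data (made up for illustration): each inner list is ordered from
# the most likely to the least likely category.
ranked = [
    ["体育", "娱乐", "社会"],
    ["财经", "科技", "时政"],
    ["国际", "时政", "社会"],
    ["游戏", "科技", "娱乐"],
]
truth = ["体育", "科技", "社会", "汽车"]

print(top_k_accuracy(ranked, truth, k=1))  # 0.25
print(top_k_accuracy(ranked, truth, k=3))  # 0.75
```

Under this definition the score can only stay the same or rise as k grows, which is consistent with the reported progression from 0.768 to 0.944 to 0.981.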
[Figures: Top1 Accuracy, Top3 Accuracy, Top5 Accuracy]