biningo / coca-splitter

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Introduction

This script can split COCA vocabulary into small groups to be imported into dictionary app (e.g. Eudic) for studying.

Please refer to COCA 词频表使用 and 快速掌握 COCA 词汇表.

Requirements

  • Python 2

This script is orginally written in Python 2, now has updated to Python3, so please make sure you have Python 3 installed in your environment. For my it's /usr/bin/python3.

Usage

python split.py coca20000.txt 15

by default, the Output file is coca20000_batch_import.txt.

Note:

The last number 15 is the group size, it means each group contains 15 words, you can change it to your need.

Files

  • coca20000.txt contains the origianl vocabulary list
  • coca_refinded.txt contains the final refined vocabulary list according to this article 快速掌握 COCA 词汇表

About


Languages

Language:Python 100.0%