This is a simple tool that uses ChatGPT to translate text into a specified target language while staying faithful to the original. It takes a text file (.pdf, .txt, .md, .html, or .rtf) or a folder of text files as input, and outputs either a translated text file or a bilingual text file with the original and translated text side by side. It is specially optimized for parsing and translating academic paper PDFs.
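To make the bilingual output concrete, here is a minimal sketch of pairing each original paragraph with its translation. The function name `make_bilingual` and the layout (translation placed directly under each source paragraph) are assumptions for illustration; the tool's actual output format may differ.

```python
def make_bilingual(original_paragraphs, translated_paragraphs):
    """Pair each original paragraph with its translation (hypothetical layout)."""
    if len(original_paragraphs) != len(translated_paragraphs):
        raise ValueError("each original paragraph needs exactly one translation")
    # Put the translation right below its source paragraph,
    # and separate the pairs with a blank line.
    blocks = [
        f"{source}\n{translation}"
        for source, translation in zip(original_paragraphs, translated_paragraphs)
    ]
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(make_bilingual(["Hello.", "Goodbye."], ["Bonjour.", "Au revoir."]))
```

Keeping the pairs aligned paragraph by paragraph makes it easy to check a translation against its source.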
Use this on Google Colab (recommended). See here
# Install
git clone https://github.com/Raychanan/ChatGPT-for-Translation.git
cd ./ChatGPT-for-Translation/
pip install -r requirements.txt --quiet
# Run
python ChatGPT-translate.py --input_path=input.txt --openai_key=password
This command translates the text in input.txt into Simplified Chinese using ChatGPT. You can also specify any other target language, for example --target_language="Japanese". See this txt as an example.
python ChatGPT-translate.py --input_path=./folder/ --openai_key=password
python ChatGPT-translate.py --bilingual --input_path=input.txt --openai_key=password
Azure:
python ChatGPT-translate.py --input_path=input.txt --use_azure --azure_endpoint=endpoint_uri --azure_deployment_name=deployment_name --openai_key=your_AOAI_key
GPT-4:
python ChatGPT-translate.py --input_path=input.txt --model=gpt-4 --openai_key=password
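If you call the script from another Python program, the commands above can be assembled programmatically. The following is a minimal sketch that uses only flags shown in this README; the helper `build_command` and the environment variable name `OPENAI_API_KEY` are assumptions for illustration and are not part of the tool.

```python
import os
import shlex


def build_command(input_path, api_key, target_language=None,
                  bilingual=False, model=None, num_threads=None):
    """Assemble an argument list for ChatGPT-translate.py (hypothetical helper)."""
    cmd = [
        "python",
        "ChatGPT-translate.py",
        f"--input_path={input_path}",
        f"--openai_key={api_key}",
    ]
    if target_language:
        cmd.append(f"--target_language={target_language}")
    if bilingual:
        cmd.append("--bilingual")
    if model:
        cmd.append(f"--model={model}")
    if num_threads is not None:
        cmd.append(f"--num_threads={num_threads}")
    return cmd


if __name__ == "__main__":
    # Assumption: the key is kept in an environment variable named OPENAI_API_KEY.
    key = os.environ.get("OPENAI_API_KEY", "password")
    cmd = build_command("input.txt", key, target_language="Japanese", bilingual=True)
    # Print the command instead of running it; pass `cmd` to subprocess.run to execute.
    print(shlex.join(cmd))
```

Reading the key from the environment keeps it out of scripts and shell history.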
You need an OpenAI API key (https://beta.openai.com/signup/)
--num_threads: The number of threads to use for translation (default: 5).
--only_process_this_file_extension: Process only files with the given extension. For example, --only_process_this_file_extension="txt".
--not_to_translate_references: Do not translate the references section. By default, references are not translated.
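The sketch below illustrates what a thread count and an extension filter could mean in practice. It is not the tool's actual implementation: `collect_files`, `translate_all`, and `fake_translate` are hypothetical names, the folder scan is assumed to be non-recursive, and the real API call is replaced by a stand-in function.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def collect_files(folder, extension=None):
    """List files directly inside `folder`, optionally keeping one extension (e.g. "txt")."""
    files = sorted(p for p in Path(folder).iterdir() if p.is_file())
    if extension is not None:
        wanted = "." + extension.lstrip(".").lower()
        files = [p for p in files if p.suffix.lower() == wanted]
    return files


def translate_all(paragraphs, translate_fn, num_threads=5):
    """Translate paragraphs concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(translate_fn, paragraphs))


def fake_translate(text):
    # Hypothetical stand-in for a real translation API call.
    return f"[translated] {text}"


if __name__ == "__main__":
    print(translate_all(["Hello.", "World."], fake_translate, num_threads=2))
```

Threads suit this workload because each request spends most of its time waiting on the network, and `pool.map` returns results in input order, so paragraphs do not get shuffled.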
The PDF parser is based on the scipdf project on GitHub. Some adjustments were made so that users can parse PDFs without having to initialize a server locally.