Byaidu / PDFMathTranslate

[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; available via CLI/GUI/MCP/Docker/Zotero

Home Page: https://pdf2zh.com

Repository from GitHub: https://github.com/Byaidu/PDFMathTranslate

pdf2zh 2.0

awwaawwa opened this issue

Note

2.0 Moved to a new repository under the organization: PDFMathTranslate/PDFMathTranslate-next

Background

PDF2ZH consists of the following parts:

  1. Core: I am completely rewriting it; the related code is at https://github.com/funstory-ai/BabelDOC
  2. Configuration System
  3. GUI
  4. Translator Support

The current PDF2ZH code is difficult to maintain. I hope to completely rewrite all parts of PDF2ZH except for the translator to improve both user and developer experience.

BabelDOC Preview

[Image: BabelDOC preview]

Project Positioning

BabelDOC serves as a PDF translation library providing core translation functionality.

PDF2ZH handles self-deployment related tasks, such as GUI, better configuration system, better initial guidance, multi-language localization, etc.

Roadmap

  • Experimental support for BabelDOC in CLI and GUI
  • Stop adding new features to the legacy backend
  • Refactor the configuration system
  • Improve GUI
  • Switch documentation to mkdocs
  • Provide a more user-friendly http api
  • Integration Testing
  • Completely remove the legacy backend after BabelDOC is perfected
  • ...

Specific Details

Requires further research & discussion.

For the configuration system, I want:

  1. Support CLI parameters
  2. Support environment variables
  3. Support configuration files
  4. Support modifying all configurations in GUI
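
As a rough illustration of how these sources could be layered, the sketch below resolves one setting by checking CLI arguments first, then environment variables, then a configuration file, then built-in defaults. The precedence order, the `resolve_setting` helper, the `APP_` prefix, and the option names are all assumptions made for this example; they are not taken from the pdf2zh code.

```python
from typing import Any, Mapping

# Hypothetical defaults; the real option names in pdf2zh may differ.
DEFAULTS = {"lang_out": "zh-CN", "gui": False}


def resolve_setting(
    name: str,
    cli_args: Mapping[str, Any],
    env: Mapping[str, str],
    file_config: Mapping[str, Any],
    env_prefix: str = "APP_",
) -> Any:
    """Return the value for `name` from the highest-priority source that sets it.

    Assumed precedence: CLI > environment > config file > defaults.
    """
    if cli_args.get(name) is not None:
        return cli_args[name]
    env_key = env_prefix + name.upper()
    if env_key in env:
        return env[env_key]
    if name in file_config:
        return file_config[name]
    return DEFAULTS[name]


file_config = {"lang_out": "ja"}
env = {"APP_LANG_OUT": "fr"}

print(resolve_setting("lang_out", {}, {}, {}))            # defaults only
print(resolve_setting("lang_out", {}, {}, file_config))   # file overrides default
print(resolve_setting("lang_out", {}, env, file_config))  # env overrides file
print(resolve_setting("lang_out", {"lang_out": "de"}, env, file_config))  # CLI wins
```

A GUI that edits settings could write to the file layer in this model, which keeps all four sources consistent without a separate code path.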

For the GUI, I hope to have a comprehensive beginner's guide. And it would be best to have a complete packaging solution, aiming to achieve one-click startup with a double-click .exe file, and then complete all operations within the GUI.

GUI needs good localization support. Export all UI strings.

weblate is all you need.

After pdf2zh 2.0 is released, explain the relationship between memory usage and page count in the readme.

GUI may need to consider supporting automatic PDF splitting for translation, and then merging back together.

Progress Update:

  1. Technical stack: pydantic+pydantic-settings+nicegui+fastapi

  2. Configuration system will support: multi-level toml config, environment variables, cli, gui

  3. Except for half of translator.py and the complete cache.py, all other code has been completely rewritten

  4. For docker/exe/cli, both webui + http api will be run simultaneously.

  5. Project positioning: BabelDOC will serve as a core library for PDF translation, mainly to be imported by pdf2zh. Secondary development should use the http or python api provided by pdf2zh 2.0. BabelDOC only guarantees compatibility with pdf2zh. BabelDOC itself does not promise any additional compatibility. After pdf2zh 2.0rc is released, all BabelDOC apis should be considered internal apis and should not be called directly under normal circumstances.

This will take some time, so please be patient.

Documentation refactoring.
Some of the README content is out of sync; the README needs to be revised again for 2.0.

When will 2.0 be released?

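Same question: when will 2.0 be released? The project is superbly written, but I want to call it through an API and swap in my own local translation engine, and the BabelDOC project has no documentation on this.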

Soon, one to two weeks? @awwaawwa

Optimistic estimate: 1-2 weeks.

Early 2.0 releases only support calling a local OpenAI-compatible API.

Just add it; it's nothing complicated.

API is planned to support two types. The first API reads all settings using the config system. The second API receives all configurations through the API request.

It's a bit difficult to partially override config from the API.
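
One plausible reason partial overrides are awkward is that configuration is usually nested, so a request that sets a single field has to be merged into the stored configuration rather than replacing a whole section. The sketch below shows such a recursive merge over plain dictionaries. The `deep_merge` helper and the configuration keys are hypothetical and do not reflect the pdf2zh API.

```python
from copy import deepcopy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where values in `override` replace those in `base`.

    Nested dicts are merged key by key instead of being replaced wholesale.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical stored configuration and a request that overrides one field.
stored = {"translation": {"lang_in": "en", "lang_out": "zh-CN"}, "qps": 4}
request = {"translation": {"lang_out": "ja"}}

effective = deep_merge(stored, request)
print(effective)
```

With typed, validated settings objects the merge must also re-run validation on the result, which is part of what makes the "receive everything in the request" API simpler to specify.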

https://github.com/awwaawwa/PDFMathTranslate/actions/runs/14634166438

You can download win64-exe and win64-exe-with-assets

The first preview version of 2.0~

Second pdf2zh 2.0 Preview Version

Download Links

Updates

  • Synchronized with the latest BabelDOC 0.3.31
  • Now supports 100+ languages (see the complete list at BabelDOC Supported Languages)
  • Windows version utilizes DirectML acceleration for layout recognition

2025-5-5:
sync BabelDOC 0.3.32 GitHub Actions Build

sync BabelDOC 0.3.33 GitHub Actions Build

fix ollama GitHub Actions Build

sync BabelDOC 0.3.34 GitHub Actions Build

2025-5-6:
sync BabelDOC 0.3.35 GitHub Actions Build

Support disabling siliconcloud qwen3 model thinking GitHub Actions Build

sync BabelDOC 0.3.36 GitHub Actions Build

2025-5-9:
sync BabelDOC 0.3.39 GitHub Actions Build

funstory-ai/BabelDOC#344 added a --custom-system-prompt option, for example --custom-system-prompt "/no_think You are a professional, authentic machine translation engine.", which is mainly used to add Qwen 3's /no_think instruction to the prompt.

Technically, it only replaces the first sentence of the prompt generated by BabelDOC, nothing more.
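
To make the "replace the first sentence" behavior concrete, here is a minimal sketch of that kind of substitution. The `replace_first_sentence` helper, the sentence-splitting rule, and the sample prompt text are assumptions for illustration; BabelDOC's actual implementation may work differently.

```python
import re


def replace_first_sentence(prompt: str, custom_opening: str) -> str:
    """Swap the first sentence of `prompt` for `custom_opening`.

    A sentence is assumed to end at the first '.', '!' or '?' followed by
    whitespace; everything after it is kept unchanged.
    """
    parts = re.split(r"(?<=[.!?])\s+", prompt, maxsplit=1)
    rest = parts[1] if len(parts) > 1 else ""
    return custom_opening if not rest else f"{custom_opening} {rest}"


# Illustrative prompt text, not the actual BabelDOC prompt.
generated = "You are a translation engine. Do not translate placeholders such as {v3}."
custom = "/no_think You are a professional, authentic machine translation engine."

print(replace_first_sentence(generated, custom))
```

The rest of the generated prompt (rules, placeholders, context) is left untouched, which matches the description that nothing beyond the opening sentence changes.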

2025-5-10:

  1. sync BabelDOC 0.3.40 GitHub Actions Build

Fixed the issue where some file homepages were not translated & improved compatibility

Fixed the issue where translation cache was not effective

  2. GitHub Actions Build

Add OpenAICompatible translator type, similar to the previous openai-liked

Fix az openai

2025-5-12:

  1. sync BabelDOC 0.3.42 https://github.com/awwaawwa/PDFMathTranslate/actions/runs/14971278304

Optimized the prompts and improved the parsing stability of formula and rich-text placeholders. Fixed infinite recursion in placeholder generation during multi-segment translation fallback.

2025-5-13
https://github.com/awwaawwa/PDFMathTranslate/actions/runs/14994072557
https://github.com/awwaawwa/PDFMathTranslate/pkgs/container/pdfmathtranslate

Synchronized BabelDOC 0.3.44, preliminary support for pdf rotation attribute (please note that some pages require vertical layout support for proper translation, which is not supported in the current version of BabelDOC)

2025-5-14

https://github.com/awwaawwa/PDFMathTranslate/actions/runs/15010998457
https://github.com/awwaawwa/PDFMathTranslate/pkgs/container/pdfmathtranslate

Fixed the issue where using GUI translation would automatically set the "gui" option in the default configuration file to true (it will now automatically be set to false).

@awwaawwa Is the Gradio username/password authentication feature still present in 2.0? I see that the setup_gui function is called without any arguments.

It's still there: the auth_file parameter. But there is no corresponding setting for it.

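Right, there is no setting, so the logic has no effect. I thought the feature had been deprecated 😂. 2.0 changed so much; I did a quick code review (in order to guess the environment variables), and it is practically a different piece of software. Are PRs welcome? If it's something I can fix along the way, I'll just open a PR.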

awwaawwa#9

https://github.com/awwaawwa/PDFMathTranslate/actions/runs/15016041352
https://github.com/awwaawwa/PDFMathTranslate/pkgs/container/pdfmathtranslate

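The documentation is still on its way; I recently started working on it. Once the docs are ready, the official 2.0 release will be published.

2.0 can hardly be called a refactor anymore; "rewrite" is the more fitting word.

The environment variable should be PDF2ZH_ followed by the CLI parameter name in all caps, with - replaced by _.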
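
Taking that naming rule at face value, the mapping from a CLI flag to its environment variable can be sketched as below. The `cli_flag_to_env_var` helper is hypothetical, and the flags passed to it are only sample inputs; whether a given flag exists in pdf2zh is not confirmed here.

```python
def cli_flag_to_env_var(flag: str, prefix: str = "PDF2ZH_") -> str:
    """Map a CLI flag such as '--lang-out' to an environment variable name."""
    name = flag.lstrip("-")  # drop the leading dashes
    return prefix + name.replace("-", "_").upper()


print(cli_flag_to_env_var("--custom-system-prompt"))
print(cli_flag_to_env_var("--lang-out"))
```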

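I just added it, hahaha.
PRs are of course welcome; you can also wait until this is merged into the current repository and open the PR then.
: )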

2025-5-15

https://github.com/awwaawwa/PDFMathTranslate/actions/runs/15049233880
https://github.com/awwaawwa/PDFMathTranslate/pkgs/container/pdfmathtranslate

Sync BabelDOC 0.3.48, fix recent issues with some files having ridiculous stacking problems

2025-5-16

https://github.com/awwaawwa/PDFMathTranslate/actions/runs/15065732395
https://github.com/awwaawwa/PDFMathTranslate/pkgs/container/pdfmathtranslate

Sync BabelDOC 0.3.49, optimize prompts to reduce placeholder error rate, and enhance compatibility.

Will it support http api?

HTTP API is expected in version 2.1 or 2.2

2025-5-17

https://github.com/awwaawwa/PDFMathTranslate/actions/runs/15073554383
https://github.com/awwaawwa/PDFMathTranslate/pkgs/container/pdfmathtranslate

Synchronized BabelDOC 0.3.50, now supports Python 3.13. The exe and docker versions are also built based on Python 3.13. However, I have only tested the source code + Python 3.13 locally on my computer and it runs normally. I would appreciate it if the community could help test the exe and docker versions~

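It reports that no paragraphs were found. I tried several PDFs with the same result, including files that an earlier, working 2.0 build (from around May 9) had translated successfully.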
traceback.txt

Could you send the original file?

Please just open a new issue; remember to use the correct template and fill it out completely.

2025-5-20

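Corrected the SiliconFlow name
Removed deeplx support
This is a breaking update: the configuration file format has changed in this version (deeplx-related configuration removed, tmt configuration adjusted, SiliconFlow configuration adjusted).
The default configuration file is now ~/.config/pdf2zh/config.v3.toml

~/.config/pdf2zh/default/2.0.0.rc0.toml stores the default configuration for the current version and can be used as a reference.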

https://github.com/awwaawwa/PDFMathTranslate/actions/runs/15134272617
https://github.com/awwaawwa/PDFMathTranslate/pkgs/container/pdfmathtranslate

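Does 2.0 currently support, or plan to support: 1. PDFs that are scanned images; 2. layouts that alternate one line of Chinese with one line of English?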

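  1. Please run OCR yourself. If the file has already been OCR-processed, there is preliminary support.
  2. Not supported, and there are no plans to support it.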

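I tried the May 20 preview build and found that the qwmt module has no mapping for "zh-CN": "Chinese", so translation fails with:

File "D:\Programs\pdf2zh_beta\site-packages\pdf2zh\translator\translator_impl\qwenmt.py", line 58, in lang_mapping
    return langdict[input_lang]
           ~~~~~~~~^^^^^^^^^^^^
KeyError: 'zh-CN'

After adding the entry, qwen-mt-plus runs fine, but with qwen-mt-turbo the output is, for some reason, nothing but the prompt (rendered in Chinese; translated here):

You are a professional and reliable machine translation engine responsible for translating the input text into zh-CN. When translating, refer to the following information to improve translation quality: 0. First title in the full text: 1 | Introduction 1. Most similar title in the full text: Abstract. When translating, follow these rules: 1. Do not translate style tags such as "<style id='1'>xxx"! 2. Do not translate formula placeholders such as "{v3}". The system will automatically replace the placeholders with the corresponding formulas. 3. If no translation is needed (for example proper nouns or code), return the original text. 4. Output only the translation result, without explanations or annotations. 5. Translate the text into zh-CN. Now read the following text to be translated carefully and output the translation result directly.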
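
The KeyError reported above comes from a direct dictionary lookup on a code that is not in the table. A more forgiving lookup could be sketched as follows. The `LANG_NAMES` table is a made-up excerpt, and the fallback to the base language is an assumption for illustration, not the behavior of the real qwenmt.py.

```python
# Hypothetical excerpt of a language table; the real one in qwenmt.py is larger.
LANG_NAMES = {
    "en": "English",
    "zh": "Chinese",
    "zh-CN": "Chinese",  # the entry reported missing above
}


def lang_mapping(input_lang: str) -> str:
    """Return the display name for a language code.

    Falls back to the base language (e.g. 'en' for 'en-US') and raises a
    descriptive error instead of a bare KeyError when nothing matches.
    """
    if input_lang in LANG_NAMES:
        return LANG_NAMES[input_lang]
    base = input_lang.split("-", 1)[0]
    if base in LANG_NAMES:
        return LANG_NAMES[base]
    raise ValueError(f"Unsupported language code: {input_lang!r}")


print(lang_mapping("zh-CN"))
print(lang_mapping("en-US"))
```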

#951

Waiting for a kind soul from the community to help.

https://github.com/awwaawwa/PDFMathTranslate/actions/runs/15294364940
https://github.com/awwaawwa/PDFMathTranslate/pkgs/container/pdfmathtranslate

sync BabelDOC 0.3.53

Optimize prompts to reduce the error rate of rich text translation

Optimize translation error logs

sync BabelDOC 0.3.54

https://github.com/awwaawwa/PDFMathTranslate/actions/runs/15295271576

Fix a minor layout bug

https://github.com/awwaawwa/PDFMathTranslate/actions/runs/15344872982
https://github.com/awwaawwa/PDFMathTranslate/pkgs/container/pdfmathtranslate
Sync BabelDOC 0.3.55

Added automatic terminology extraction (enabled by default, current version of pdf2zh does not support disabling)


https://github.com/awwaawwa/PDFMathTranslate/actions/runs/15351474074
https://github.com/awwaawwa/PDFMathTranslate/pkgs/container/pdfmathtranslate

Sync BabelDOC 0.3.56

integrate hyperscan for efficient term matching and improve regex handling

Significantly improve terminology matching performance

@awwaawwa Hello, I noticed that the current version in the repo does not entirely match the one in the Docker image, especially gui.py. Are you going to publish updates through the Docker image rather than updating the repo?

#965
You can also take note that there is a PR here.

2.0 Moved to a new repository under the organization: https://github.com/PDFMathTranslate/PDFMathTranslate-next

Version 2.0 official release has been published.

Thank you for this excellent work. I've rewritten a GUI that supports bilingual preview and large model Q&A, which might be of some help.
https://github.com/zstar1003/FreePDF