ip-rw / translate_code

Translate source code without breaking it (too badly). Supports Chinese, Russian and really anything that isn't ASCII.

Source code translator

Updated to include a GPT version with significantly better performance and accuracy than the original. You can see an example at https://github.com/ip-rw/yakit_english/, a large, mature Electron GUI translated from Chinese into English without manual intervention.

A rough and ready way to translate source code into English. It extracts and translates blocks of non-ASCII text, then writes it all back to the file. It expects a list of file paths to translate, either piped in or passed as args. It uses GoogleTranslator from the deep_translator library and works best via a pool of rotating proxies. There's no reason the other deep_translator backends wouldn't work; they're just untested.
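The extract, translate and write-back idea can be sketched roughly like this. It is a minimal illustration, not the code in main.py: the regex, the function name `translate_non_ascii` and the `translate` callable are all assumptions made for the example. In real use the callable could be something like `GoogleTranslator(source="auto", target="en").translate` from deep_translator; a stub dictionary stands in here so the sketch runs offline.

```python
import re

# A run of non-ASCII characters, allowing single spaces between runs
# (assumed pattern for this sketch, not taken from main.py).
NON_ASCII_RUN = re.compile(r"[^\x00-\x7f]+(?:[ ][^\x00-\x7f]+)*")


def translate_non_ascii(text, translate):
    """Replace every non-ASCII run in `text` with `translate(run)`."""
    def _swap(match):
        original = match.group(0)
        translated = translate(original)
        # If the translator returns nothing, keep the original text.
        return translated if translated else original

    return NON_ASCII_RUN.sub(_swap, text)


# Hypothetical stand-in for a real translation backend.
STUB = {"域名": "Domain name", "输出文件名": "output file name"}


def stub_translate(chunk):
    return STUB.get(chunk, chunk)


line = 'Usage: "域名", // 输出文件名'
print(translate_non_ascii(line, stub_translate))
# prints: Usage: "Domain name", // output file name
```

Only the non-ASCII runs are touched, so the surrounding quotes, commas and comment markers stay where they were, which is what keeps the file from breaking (too badly).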

It's been tested with Russian and Chinese source and does as good a job as one could hope. YMMV but it seems to be okay at not mangling files.

It batches things up and handles Google's antics as best it can. There's a fair bit of juggling, but it should go as quickly as it can without filling the files with nonsense.
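The batching and retrying might look something like the sketch below. Everything in it is an assumption for illustration rather than what main.py actually does: the 4000-character budget, the helper names, the backoff schedule, and `translate_batch`, a hypothetical callable that takes a list of strings and returns a list of translations.

```python
import time


def make_batches(chunks, max_chars=4000):
    """Group strings into batches whose combined length stays within max_chars."""
    batches, current, size = [], [], 0
    for chunk in chunks:
        # Start a new batch when this chunk would push the current one over budget.
        if current and size + len(chunk) > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append(chunk)
        size += len(chunk)
    if current:
        batches.append(current)
    return batches


def translate_with_retry(batch, translate_batch, retries=3, delay=1.0, sleep=time.sleep):
    """Try translate_batch(batch) up to `retries` times with exponential backoff.

    If every attempt fails, the original strings are returned unchanged,
    so a flaky backend never gets to write nonsense into a file.
    """
    for attempt in range(retries):
        try:
            result = translate_batch(batch)
            # Reject responses that lost or gained items along the way.
            if len(result) == len(batch):
                return list(result)
        except Exception:
            # Crude on purpose: any backend error just triggers another attempt.
            pass
        if attempt < retries - 1:
            sleep(delay * (2 ** attempt))
    return list(batch)
```

Falling back to the untranslated text is a deliberate choice in this sketch: an untranslated string is annoying, a half-translated or misaligned one can break the source.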

I made it because the Chinese particularly release a lot of interesting code now, and unfortunately it's just squiggles to me.

Usage

I use it like this:

find ~/ksubdomain -type f | grep -v 'git\|svg' | python3 main.py

If you want to supercharge things (and have a rotating proxy handy), use xargs, but beware of unescaped file paths (I avoid spaces):

find ~/ksubdomain -type f | grep -v 'git\|svg' | xargs -n 30 -P5 python3 main.py
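Both invocations work because the paths can arrive either on stdin (the plain pipe) or as arguments (via xargs). A minimal sketch of how a script might accept both; `collect_paths` is a hypothetical name and this is not necessarily how main.py handles it.

```python
import io


def collect_paths(argv, stdin):
    """Return paths from the argument list if there are any, else one per stdin line."""
    if argv:
        return list(argv)
    # Strip only the trailing newline so spaces inside a path survive pipe mode.
    return [line.rstrip("\n") for line in stdin if line.strip()]


# Argument mode, as when xargs appends the paths to the command line.
print(collect_paths(["cmd/main.go", "core/scan.go"], io.StringIO("")))

# Pipe mode, as when find writes one path per line to stdin.
print(collect_paths([], io.StringIO("cmd/main.go\ncore/my file.go\n\n")))
```

In a real script the call would be `collect_paths(sys.argv[1:], sys.stdin)`. Note that in pipe mode a path containing a space arrives intact, whereas xargs splits on whitespace by default, which is where the caveat about unescaped paths comes from.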

Prerequisites

You'll need Python 3 along with the following packages:

  • argparse (already included in the Python 3 standard library)
  • cypunct
  • charset-normalizer
  • deep_translator
  • thefuzz

Before:

user@flex:~/ksubdomain$ go run cmd/ksubdomain/*.go e
NAME:
   cmd enum - 枚举域名

USAGE:
   cmd enum [command options] [arguments...]

OPTIONS:
   --domain value, -d value        域名
   --band value, -b value          宽带的下行速度,可以5M,5K,5G (default: "2m")
   --resolvers value, -r value     dns服务器文件路径,一行一个dns地址,默认会使用内置dns
   --output value, -o value        输出文件名
   --silent                        使用后屏幕将仅输出域名 (default: false)
   --retry value                   重试次数,当为-1时将一直重试 (default: 3)
   --timeout value                 超时时间 (default: 6)
   --stdin                         接受stdin输入 (default: false)
   --only-domain, --od             只打印域名,不显示ip (default: false)
   --not-print, --np               不打印域名结果 (default: false)
   --dns-type value                dns类型 可以是a,aaaa,ns,cname,txt (default: "a")
   --domainList value, --dl value  从文件中指定域名
   --filename value, -f value      字典路径
   --skip-wild                     跳过泛解析域名 (default: false)
   --ns                            读取域名ns记录并加入到ns解析器中 (default: false)
   --level value, -l value         枚举几级域名,默认为2,二级域名 (default: 2)
   --level-dict value, --ld value  枚举多级域名的字典文件,当level大于2时候使用,不填则会默认
   --help, -h                      show help (default: false)

After:

user@flex:~/ksubdomain$ go run cmd/ksubdomain/*.go e
NAME:
   cmd enum - Enumerate the domain name

USAGE:
   cmd enum [command options] [arguments...]

OPTIONS:
   --domain value, -d value        Domain name
   --band value, -b value          broadband downlink speed, can be 5M, 5K, 5G (default: "2m")
   --resolvers value, -r value     dns server file path, one dns address per line, the default will use the built-in dns
   --output value, -o value        output file name
   --silent                        After using it, the screen will only output the domain name (default: false)
   --retry value                   retry times, when it is -1, (default: 3)
   --timeout value                 timeout Time (default: 6)
   --stdin                         accepts stdin input (default: false)
   --only-domain, --od             only prints the domain name, does not display the ip (default: false)
   --not-print, --np               does not print the domain name result (default: false)
   --dns-type value                dns type can be a, aaaa, ns, cname, txt (default: "a")
   --domainList value, --dl value  Specify the domain name
   --filename value, -f value      dictionary path
   --skip-wild                     skip pan-analysis domain name (default: false)
   --ns                            Read the ns record of the domain name and add it to the ns parser (default: false)
   --level value, -l value         Enumerate several levels of domain names, the default is 2, the second-level domain name (default: 2)
   --level-dict value, --ld value  Enumerate the dictionary file of multi-level domain names, used when the level is greater than 2, if not filled, it will default to
   --help, -h                      show help (default: false)

Languages

Language:Python 100.0%