Google Translate - Translation Exploit
Vulnerabilities in the translation algorithm that leak some weird data.
Description
Basically, this exploit breaks the Google Translate service (and the Cloud Translation API too) using the word por (by in English) followed by an id (typed as a number) and a few keywords such as people, downloads, posts, message, etc.
This is suspected to be a query/code injection exploit that apparently interacts with the databases of Google Maps, YouTube, Blogger, Google Play and others, and leaks information that search engines have not indexed (e.g. "csp03607292"), so it could be internal information.
Some results
More Info Stock Photo Information Photo ID: csp03607292
downloads and downloads for Android applications and games, and get the latest updates and corrections by ANDROID android.permission.INTERNET android.permission.ACCESS_NETWORK_STATE android.permission.INTERNET android.permission.ACCESS_NETWORK_STATE android.permission.WRITE_EXTERNAL_STORAGE android.permission.WRITE_EXTERNAL_STORAGE android.permission.WRITE_EXTERNAL_STORAGE android.permission.WRITE_EXTERNAL_STORAGE
3 Downloads,,,,,,,,,,,,,,,,,,,,, by bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
On Cloud Translation API
This also works on Google Cloud Translation API.
Request: https://translation.googleapis.com/language/translate/v2/?q=por%2011141566661212131314689999312345797365%20downloads&source=pt&target=en&key=YOUR_API_KEY_HERE
{
"data": {
"translations": [
{
"translatedText": "More tracks from this album"
}
]
}
}
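The request above can also be assembled programmatically. The following is a minimal sketch, assuming only the endpoint and the q, source, target and key parameters shown in the request; the helper names build_translate_url and extract_translations are hypothetical and not part of any Google client library. It only builds the URL and parses a response body, it does not send anything.

```python
import json
from urllib.parse import urlencode, quote

# Endpoint taken from the request shown above.
API_ENDPOINT = "https://translation.googleapis.com/language/translate/v2/"


def build_translate_url(text, api_key, source="pt", target="en"):
    """Build the GET URL for a translation request (hypothetical helper)."""
    params = {"q": text, "source": source, "target": target, "key": api_key}
    # quote_via=quote encodes spaces as %20, matching the request above.
    return API_ENDPOINT + "?" + urlencode(params, quote_via=quote)


def extract_translations(response_body):
    """Return every translatedText value from a JSON response body."""
    payload = json.loads(response_body)
    return [item["translatedText"] for item in payload["data"]["translations"]]


if __name__ == "__main__":
    url = build_translate_url(
        "por 11141566661212131314689999312345797365 downloads",
        "YOUR_API_KEY_HERE",
    )
    print(url)
```

Passing the JSON response shown above to extract_translations returns a list with the single string "More tracks from this album".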
Example of reproduction
- Open https://translate.google.com/
- Choose Portuguese as the source language and any other language as the target language.
- Try to translate por 3232312231236872122321344 message.
- Check out your weird translation! (You can change the number to change the database query.)
Advanced queries
Queries can be extended: AND keywords, ; separators, text in the middle of the number, or even negative numbers at the end all affect the result.
Examples:
por 11141566661212131314689999312345797365 downloads AND por 11141566661212131314689999312345797365 downloadspor 11141566661212131314689999312345797365 -23467
por 11141566661212131314689999312345797365 downloads AND; por 111415666612345212131314asdf689999312345797365 downloads -234672345234523452345
por 111415666612345212131314asdf689999312345797365 downloads -234672345234523452345
se 1890743402834712390487 posts' AND par 1890743402834712390487 posts' GOTO por 1890743402834712390487 posts'
;DROP por 1890743402834712390487 posts
FROM por 1890743402834712390487 posts
Keyword usage
At the beginning
por
by
se
par
Middle info
- signed/unsigned integers
- non spaced text with digits at the beginning and end
Separators
AND
;
'
OR
DROP
FETCH
GOTO
TO
XOR
FROM
ROM
TOP
(Works with a lot of SQL keywords.)
At the end
downloads
pessoas
posts
message
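The pattern described in this section (a starting keyword, a numeric middle part, an ending keyword, optionally chained with a separator) can be sketched as a small string builder. This is only an illustration of the input format, assuming the keyword lists given here; the names build_query and chain_queries are hypothetical, and nothing below contacts Google Translate.

```python
# Keywords taken from the lists in this section.
PREFIXES = ("por", "by", "se", "par")
SUFFIXES = ("downloads", "pessoas", "posts", "message")


def build_query(prefix, middle, suffix):
    """Compose one query: starting keyword, middle info, ending keyword."""
    if prefix not in PREFIXES:
        raise ValueError(f"unknown starting keyword: {prefix!r}")
    if suffix not in SUFFIXES:
        raise ValueError(f"unknown ending keyword: {suffix!r}")
    # The middle part is kept as a string so very long ids, embedded text
    # and leading signs are preserved exactly as typed.
    return f"{prefix} {middle} {suffix}"


def chain_queries(queries, separator="AND"):
    """Join several queries with a separator such as AND, OR or FROM."""
    return f" {separator} ".join(queries)


if __name__ == "__main__":
    single = build_query("por", "3232312231236872122321344", "message")
    print(single)
    print(chain_queries([single, single], separator="AND"))
```

Keeping the id as a string rather than an integer is a deliberate choice, since the examples include ids with text in the middle that would not parse as numbers.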
Formal Report
At this point, I have received no official contact from Google. I already reported this issue on the Google Issue Tracker (#119504713 on Tue, 13 Nov 2018, 20:27), but the issue was marked as Intended Behavior. A new issue (#127179818) with clearer and newer information was reported on Sun, 3 Mar 2019, 22:34, but unfortunately it was also marked as Intended Behavior (now with a human message).
Hi,
Thanks for report! It seems like, while it might be surprising, this is actually working as intended and is a feature of the product. This particular bug looks like that the translate ML model is producing a garbage data (possibly the inputs to it were not validated enough). It looks security-relevant, but it's just a coincidence here.
That said - if you think we misunderstood your report, and you see a well defined security risk, please let us know what we missed.