wtekiela / opensub4j

Java library for communicating with opensubtitles.org XML-RPC API

Downloading subtitles with custom charset / encoding

VeiZhang opened this issue · comments

There seems to be no parameter for setting the charset in the XML-RPC download, but according to the sample, the download link can specify one:

https://dl.opensubtitles.org/en/download/src-api/vrf-19be0c59/sid-Es8Yw0zrLBHcaLjKikJ-2rHWo99/filead/1953189057.gz
https://dl.opensubtitles.org/en/download/subencoding-utf8/src-api/vrf-19be0c59/sid-Es8Yw0zrLBHcaLjKikJ-2rHWo99/filead/1953189057.gz

I tried both download links, and they work over HTTP, but I don't know how to do the same with XML-RPC.

I checked your method for setting the charset, but it didn't work.

private Content getSubtitleFileContent(String charsetName) {

Hi @VeiZhang
Right now the library doesn't provide a way to customize the download link with a specific encoding. The method you mentioned does not set the encoding; it gets the content using the provided encoding (which is used in the toString method after unzipping), so ideally you should pass the original encoding of the subtitle file.

So allowing subtitles to be downloaded with a specific encoding would be an enhancement.
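To illustrate why the charset passed at read time has to match the file's original encoding, here is a minimal sketch in plain Java. It is not the library's code: the class GzipDecodeSketch and both helper methods are hypothetical names used only for this example, and it assumes Java 9+ for readAllBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration class; not part of opensub4j.
final class GzipDecodeSketch {

    // Unzips the payload and decodes the raw bytes with the given charset.
    // The charset only controls how bytes become characters; it does not
    // change what the server sent.
    static String gunzipToString(byte[] gzipped, String charsetName) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            return new String(in.readAllBytes(), Charset.forName(charsetName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper that builds a gzip payload so the example runs without network access.
    static byte[] gzip(String text, String charsetName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(Charset.forName(charsetName)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // "caf\u00e9" stored as ISO-8859-1 stands in for a non-UTF-8 subtitle file.
        byte[] payload = gzip("caf\u00e9", "ISO-8859-1");
        System.out.println(gunzipToString(payload, "ISO-8859-1")); // matching charset
        System.out.println(gunzipToString(payload, "UTF-8"));      // mismatched charset
    }
}
```

Decoding with the matching charset returns the original text, while the mismatched charset produces a different, garbled string.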

@wtekiela Thanks for your reply.

It didn't work, and the reason is the download link. So I tried adding this code:

Map<String, String> videoProperties = new HashMap<>();

videoProperties.put("subencoding", "utf8");

It still doesn't work.
So if there are no other parameters to set, I will replace the link with one that includes the encoding to fix the garbled text.

Can you provide some concrete steps to reproduce and expected outcomes? I'm not sure if I follow your train of thought with those changes.

Yes, I made a query:

ListResponse<SubtitleInfo> response = mOpenSubtitlesClient.searchSubtitles("all", null, null, "", "The avengers", "1", "1", null);

I chose the first subtitle, and the content from its download link comes out garbled; its encoding is UTF-8.

As you said, there is no encoding parameter to set in the query request. I have an idea: I can get the download link after a query, apply a rule that adds the encoding to create a new link, and then download from the new link myself:

https://dl.opensubtitles.org/en/download/src-api/vrf-19be0c59/sid-Es8Yw0zrLBHcaLjKikJ-2rHWo99/filead/1953189057.gz
// add the /subencoding-utf8/
https://dl.opensubtitles.org/en/download/subencoding-utf8/src-api/vrf-19be0c59/sid-Es8Yw0zrLBHcaLjKikJ-2rHWo99/filead/1953189057
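The rewrite rule above could be sketched roughly as follows. This is only an illustration based on the two sample links: the class SubtitleLinks and the method withSubEncoding are hypothetical names, and it assumes every download link contains a "/download/" path segment.

```java
// Hypothetical helper; not part of opensub4j.
final class SubtitleLinks {

    private static final String DOWNLOAD_SEGMENT = "/download/";

    // Inserts "subencoding-<encoding>/" right after "/download/".
    // Links that already carry a subencoding segment are returned unchanged.
    static String withSubEncoding(String downloadLink, String encoding) {
        if (downloadLink.contains("/subencoding-")) {
            return downloadLink;
        }
        int index = downloadLink.indexOf(DOWNLOAD_SEGMENT);
        if (index < 0) {
            throw new IllegalArgumentException("No /download/ segment in: " + downloadLink);
        }
        int insertAt = index + DOWNLOAD_SEGMENT.length();
        return downloadLink.substring(0, insertAt)
                + "subencoding-" + encoding + "/"
                + downloadLink.substring(insertAt);
    }

    public static void main(String[] args) {
        String original = "https://dl.opensubtitles.org/en/download/src-api/vrf-19be0c59/"
                + "sid-Es8Yw0zrLBHcaLjKikJ-2rHWo99/filead/1953189057.gz";
        System.out.println(withSubEncoding(original, "utf8"));
    }
}
```

The rewritten link would then be fetched with an ordinary HTTP client instead of going through XML-RPC.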

I hope that explains it clearly.

@wtekiela Sorry, I found another way to solve it: using SubtitleInfo.getEncoding() as the charset when getting the content makes the garbled text disappear.
Thanks!
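For anyone taking the same route, it may help to guard against an encoding name that the JVM does not recognise before handing it to the content method. The sketch below uses only the standard java.nio.charset API; the class EncodingNames and the method resolve are hypothetical names, and falling back to UTF-8 is an assumption, not something the library prescribes.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper; not part of opensub4j.
final class EncodingNames {

    // Turns the encoding name reported for a subtitle into a Charset,
    // falling back to UTF-8 (an assumed default) when the name is
    // missing, malformed, or unsupported on this JVM.
    static Charset resolve(String reportedEncoding) {
        if (reportedEncoding == null || reportedEncoding.trim().isEmpty()) {
            return StandardCharsets.UTF_8;
        }
        try {
            return Charset.forName(reportedEncoding.trim());
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return StandardCharsets.UTF_8;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("UTF-8"));
        System.out.println(resolve("not-a-real-charset"));
    }
}
```

The name of the resolved charset (for example resolve(name).name()) can then be passed wherever a charset name string is expected.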