nijel / enca

Extremely Naive Charset Analyser

Home Page: https://cihar.com/software/enca/

enconv fails with "enconv: Cannot seek in file : Bad file descriptor"

mwestphal opened this issue · comments

I've been using enconv to convert subtitles from a Chinese encoding to UTF-8. It used to work perfectly.
Now it always fails with the following:

>$ enca -L zh twin.peaks.s02e17.1080p.bluray.x264-reward.chs.srt 
Simplified Chinese National Standard; GB2312
  CRLF line terminators
>$ enconv -L zh -x UTF8 twin.peaks.s02e17.1080p.bluray.x264-reward.chs.srt 
enconv: Cannot seek in file `/tmp/encaJqirv6': Bad file descriptor
failed to rename temporary file back
>$ ls /tmp/
ompi.violet.1000  serverauth.wCtKrOO4mv  tmp.Vt74BtXPf1  tmp.Z7Ei03ozve

iconv has no problem and converts the file (but it is not as practical to use):

$ iconv -f GB2312 -t UTF8 < twin.peaks.s02e17.1080p.bluray.x264-reward.chs.srt > a.srt

I've uploaded a file to convert, but I think it will fail with any file that needs conversion: https://ufile.io/5eati

I'm using ArchLinux. I get this error with the ArchLinux packaged version as well as with one I built myself from master.

I looked into it. The failure seems to be caused by the call to recode_perform_task (src/convert_recode.c:120), which closes the streams it is given. To work around this, the streams should be duplicated, e.g.:

--- a/src/convert_recode.c
+++ b/src/convert_recode.c
@@ -101,7 +101,8 @@ convert_recode(File *file,
       return ERR_IOFAIL;
     file->buffer->pos = 0;
 
-    if ((tempfile = file_temporary(file->buffer, 1)) == NULL
+    /* We do not unlink tempfile, because we want to reopen it later */
+    if ((tempfile = file_temporary(file->buffer, 0)) == NULL
         || file_seek(file, 0, SEEK_SET) != 0) {
       file_free(tempfile);
       return ERR_IOFAIL;
@@ -112,9 +113,20 @@ convert_recode(File *file,
     task->fail_level = enca_recode_fail_level;
     task->abort_level = RECODE_SYSTEM_ERROR;
     task->input.name = NULL;
-    task->input.file = file->stream;
     task->output.name = NULL;
-    task->output.file = tempfile->stream;
+    /* recode_perform_task closes given streams, so we need to duplicate them */
+    task->input.file = fopen(file->name, "rb");
+    if (task->input.file == NULL) {
+        fprintf(stderr, "failed to reopen `%s'\n", file->name);
+        file_free(tempfile);
+        return ERR_IOFAIL;
+    }
+    task->output.file = fopen(tempfile->name, "wb");
+    if (task->output.file == NULL) {
+        fprintf(stderr, "failed to reopen `%s'\n", tempfile->name);
+        file_free(tempfile);
+        return ERR_IOFAIL;
+    }
 
     /* Now run conversion original -> temporary file. */
     success = recode_perform_task(task);

I do not know why it broke; maybe something changed recently in librecode. It looks like enca has the same problem when reading from stdin (#29).
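
For anyone who wants to try the "duplicate the stream" idea in isolation, here is a minimal sketch. It is not enca or librecode code: `consume_and_close` is a hypothetical stand-in for a library call that closes the stream it receives, and `dup_stream_rewound` and `demo_roundtrip` are names made up for this example. It also uses POSIX `dup`/`fdopen` rather than reopening by name with `fopen` as the patch above does, so it assumes a POSIX system. Note that a duplicated descriptor shares its file offset with the original, so the caller has to seek again afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for a library call that reads a stream to the end
 * and then closes it, as recode_perform_task is reported to do. */
int consume_and_close(FILE *in)
{
    int n = 0;

    while (fgetc(in) != EOF)
        n++;
    fclose(in);
    return n;
}

/* Return a second FILE* on a dup()ed descriptor, positioned at offset 0.
 * Closing the returned stream leaves `orig` open. Returns NULL on failure.
 * The caller must have flushed `orig` beforehand. */
FILE *dup_stream_rewound(FILE *orig, const char *mode)
{
    FILE *copy;
    int fd = dup(fileno(orig));

    if (fd < 0)
        return NULL;
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1
        || (copy = fdopen(fd, mode)) == NULL) {
        close(fd);
        return NULL;
    }
    return copy;
}

/* Write a few bytes to a temporary file, let the "library" consume and close
 * a duplicate, then check that the original stream can still seek and read.
 * Returns 0 on success, -1 on any failure. */
int demo_roundtrip(void)
{
    FILE *orig = tmpfile();
    FILE *copy;
    int consumed, first;

    if (orig == NULL)
        return -1;
    if (fputs("hello", orig) == EOF || fflush(orig) != 0) {
        fclose(orig);
        return -1;
    }
    copy = dup_stream_rewound(orig, "rb");
    if (copy == NULL) {
        fclose(orig);
        return -1;
    }
    consumed = consume_and_close(copy);   /* closes only the duplicate */

    /* The offset is shared, so rewind before reading through `orig`. */
    if (fseek(orig, 0L, SEEK_SET) != 0) {
        fclose(orig);
        return -1;
    }
    first = fgetc(orig);
    fclose(orig);
    return (consumed == 5 && first == 'h') ? 0 : -1;
}
```

Calling `demo_roundtrip()` from a small `main` should return 0 if the original stream survives the close of its duplicate. Whether `dup`/`fdopen` or reopening by name is the better fit for enca is a separate question; the sketch only shows that handing over a duplicate keeps the caller's stream usable.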

This seems to be resolved on the recode side by rrthomas/recode#4, which also solves #29.

I have the same issue right now. It also used to work perfectly, but now, when trying to convert in.srt (extension changed so it could be uploaded), I get either this exact error or free(): invalid pointer. When trying to convert to ISO-8859-2, it freezes instead.