embedded nul in string
ChristelSwift opened this issue · comments
I'm trying to download an Excel file from an SFTP site. The file itself looks fine to me, although I'd have to skip a couple of lines to import it as a dataset. When I run:
data <- getURL(
url = url,
userpwd = userpwd,
verbose = TRUE
)
I get
* SSH authentication methods available: password
* Initialized password authentication
* Authentication complete
Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) :
embedded nul in string: 'PK\003\004\024'
Any idea what I can do to fix this?
Hi Christel
I'd try
tmp <- getURLContent(url = url, userpwd = userpwd, verbose = TRUE, binary = TRUE)
data = rawToChar(tmp)
Please let us know if that solves the problem.
Thank you. I tried that, but unfortunately I still got the same error:
> tmp <- getURLContent(url = url, userpwd = userpwd, verbose = TRUE, binary = TRUE)
* Trying xxxxxx...
* TCP_NODELAY set
* Connected to sftp.xxxx.com (xxxx) port 22 (#0)
* SSH MD5 fingerprint: xxxxx
* SSH authentication methods available: password
* Initialized password authentication
* Authentication complete
Error in curlPerform(url = url, curl = curl, .opts = .opts) :
embedded nul in string: 'PK\003\004\024'
Can you please try
data = getBinaryURL(url = url, userpwd = userpwd, verbose = TRUE)
and hopefully that will work or give a different problem.
It has imported, but the result is a raw (binary) vector, so it looks nothing like the original Excel file:
* SSH authentication methods available: password
* Initialized password authentication
* Authentication complete
* Failed to close libssh2 file: -31 SFTP Protocol Error
* Connection #0 to host sftp.grouptechedge.com left intact
> data
[1] 50 4b 03 04 14 00 00 00 08 00 5b 8b 7e 57 1c 07 04 7d 2d 01 00 00 3a 02 00 00 11 00 00 00 64 6f 63 50 72 6f 70 73 2f
[40] 63 6f 72 65 2e 78 6d 6c 8d 91 cd 4e c3 30 10 84 9f 80 77 88 7c 4f 36 4e d4 82 ac a6 95 00 f5 44 25 24 8a 40 dc 2c 7b
[79] db 5a c4 3f b2 0d 69 df 1e 37 69 43 a5 72 e0 68 cf f8 db d9 f1 6c b1 d7 6d f6 8d 3e 28 6b 1a 42 8b 92 64 68 84 95 ca
[118] 6c 1b f2 ba 5e e6 77 24 0b 91 1b c9 5b 6b b0 21 07 0c 64 31 bf 99 09 c7 84 f5 f8 ec ad 43 1f 15 86 2c 81 4c 60 c2 35
Can I convert this back into the original Excel file?
Yes, it is a zip archive corresponding to an xlsx file, I imagine.
It will contain numerous XML files with a very specific format.
You can save the raw vector to a file (e.g. writeBin()) and use something like readxl::read_excel() to read it.
(You can work with the zip archive directly with a package such as Rcompression and also with the XML files in the zip archive using a variety of packages.)
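As a rough illustration of the points above, the sketch below uses only base R and the first 47 bytes printed in the `data` output earlier in this thread. It checks for the zip signature `PK\003\004`, decodes the first entry name from the zip local file header, shows that `rawToChar()` fails on these bytes with the same kind of error, and confirms that `writeBin()` stores the bytes unchanged. The variable names (`header`, `name_len`, `entry_name`, `tmp_path`) are made up for this example, and the final `read_excel()` step is not run here because only a fragment of the file is available.

```r
# Leading bytes of the download, copied from the `data` output in this thread.
header <- as.raw(c(
  0x50, 0x4b, 0x03, 0x04, 0x14, 0x00, 0x00, 0x00, 0x08, 0x00,
  0x5b, 0x8b, 0x7e, 0x57, 0x1c, 0x07, 0x04, 0x7d, 0x2d, 0x01,
  0x00, 0x00, 0x3a, 0x02, 0x00, 0x00, 0x11, 0x00, 0x00, 0x00,
  0x64, 0x6f, 0x63, 0x50, 0x72, 0x6f, 0x70, 0x73, 0x2f,
  0x63, 0x6f, 0x72, 0x65, 0x2e, 0x78, 0x6d, 0x6c
))

# A zip local file header starts with the four bytes "PK\003\004".
is_zip <- identical(header[1:4], as.raw(c(0x50, 0x4b, 0x03, 0x04)))

# Bytes 27-28 hold the entry-name length (little-endian); the name itself
# starts at byte 31 because the extra-field length in bytes 29-30 is zero here.
name_len <- as.integer(header[27]) + 256L * as.integer(header[28])
entry_name <- rawToChar(header[31:(30 + name_len)])
entry_name  # "docProps/core.xml", an entry found in xlsx workbooks

# Converting the whole vector to a string fails because of the 0x00 bytes.
char_error <- tryCatch(
  rawToChar(header),
  error = function(e) conditionMessage(e)
)

# writeBin() stores the raw vector byte for byte, so nothing is lost on disk.
tmp_path <- tempfile(fileext = ".xlsx")
writeBin(header, tmp_path)
round_trip <- readBin(tmp_path, what = "raw", n = file.size(tmp_path))
identical(round_trip, header)  # TRUE
```

With the complete download in place of `header`, the file at `tmp_path` could then be passed to `readxl::read_excel()`, as suggested above.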
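For future reference, this worked:
library(RCurl)     # getBinaryURL()
library(readxl)    # read_xlsx()
library(magrittr)  # %>%

# Download the workbook as raw bytes and write them unchanged to a temp file
my_tmp_file = tempfile()
getBinaryURL(
  url = my_url,
  userpwd = my_user_pwd,
  ftp.use.epsv = FALSE,
  crlf = TRUE
) %>%
  writeBin(con = my_tmp_file)

# Read the first sheet from the saved file
db = read_xlsx(
  path = my_tmp_file,
  sheet = 1
)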