Rewrite `distant-ssh2` using `russh` (native Rust)
chipsenkbeil opened this issue · comments
There are a lot of problems with the ssh libraries we're using today. They're unreliable, error-prone, and inconsistent, and that's before counting the build complexity they introduce.
russh is a Rust-native implementation of ssh, which should ideally work as a client to other SSHD implementations. The core library is lacking sftp support, but we could use russh-sftp for inspiration, even though it only supports the server-side portion of sftp.
The specification for sftp (version 3) doesn't seem that complex to implement, so this could be worth pursuing.
@chipsenkbeil So, is there any plan to continue with this?
@baoyachi if you are asking if this is planned, then yes, it is. It is in the 1.0 milestone, meaning that it could be worked on at any point between now and the release of 1.0.
Also look at https://www.rfc-editor.org/rfc/rfc4254#page-14, which is the connection protocol part of the ssh v2 spec. It highlights things like extended data for stderr.
And https://github.com/Miyoshi-Ryota/async-ssh2-tokio/blob/main/src/client.rs as an example of authentication, server validation, and process execution. If we can extend this to support sftp, it should cover all needs.
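To make the stderr point concrete, here is a rough sketch of how channel data and extended data are laid out on the wire per RFC 4254 section 5.2. The message numbers and the stderr type code come from the RFC; the `ChannelOutput` enum and the function names are made up for illustration. In practice the ssh library would decode these messages for us, so this only shows what we need to distinguish.

```rust
// Message numbers and type code as defined in RFC 4254.
pub const SSH_MSG_CHANNEL_DATA: u8 = 94;
pub const SSH_MSG_CHANNEL_EXTENDED_DATA: u8 = 95;
pub const SSH_EXTENDED_DATA_STDERR: u32 = 1;

/// Hypothetical type for this sketch: which stream a chunk belongs to.
#[derive(Debug, PartialEq)]
pub enum ChannelOutput {
    Stdout(Vec<u8>),
    Stderr(Vec<u8>),
}

/// Read a big-endian uint32 from the front of `input` and advance it.
fn take_u32(input: &mut &[u8]) -> Option<u32> {
    if input.len() < 4 {
        return None;
    }
    let (head, rest) = input.split_at(4);
    *input = rest;
    Some(u32::from_be_bytes([head[0], head[1], head[2], head[3]]))
}

/// Parse a CHANNEL_DATA or CHANNEL_EXTENDED_DATA payload.
/// Returns the recipient channel and the decoded output, or `None`
/// if the payload is truncated or is some other message type.
pub fn parse_channel_output(mut payload: &[u8]) -> Option<(u32, ChannelOutput)> {
    let (&msg, rest) = payload.split_first()?;
    payload = rest;
    let channel = take_u32(&mut payload)?;
    let is_stderr = match msg {
        SSH_MSG_CHANNEL_DATA => false,
        SSH_MSG_CHANNEL_EXTENDED_DATA => {
            // Extended data carries a data_type_code before the string.
            if take_u32(&mut payload)? != SSH_EXTENDED_DATA_STDERR {
                return None;
            }
            true
        }
        _ => return None,
    };
    // The data itself is an ssh `string`: uint32 length, then the bytes.
    let len = take_u32(&mut payload)? as usize;
    let data = payload.get(..len)?.to_vec();
    let output = if is_stderr {
        ChannelOutput::Stderr(data)
    } else {
        ChannelOutput::Stdout(data)
    };
    Some((channel, output))
}
```

The only difference between the two messages is the extra `uint32` type code, so routing stdout and stderr separately should not need much on our side.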
Check out https://www.rfc-editor.org/rfc/rfc4251#page-8 for data type formats.
Data Type Representations Used in the SSH Protocols
byte
A byte represents an arbitrary 8-bit value (octet). Fixed length
data is sometimes represented as an array of bytes, written
byte[n], where n is the number of bytes in the array.
boolean
A boolean value is stored as a single byte. The value 0
represents FALSE, and the value 1 represents TRUE. All non-zero
values MUST be interpreted as TRUE; however, applications MUST NOT
store values other than 0 and 1.
uint32
Represents a 32-bit unsigned integer. Stored as four bytes in the
order of decreasing significance (network byte order). For
example: the value 699921578 (0x29b7f4aa) is stored as 29 b7 f4
aa.
uint64
Represents a 64-bit unsigned integer. Stored as eight bytes in
the order of decreasing significance (network byte order).
string
Arbitrary length binary string. Strings are allowed to contain
arbitrary binary data, including null characters and 8-bit
characters. They are stored as a uint32 containing its length
(number of bytes that follow) and zero (= empty string) or more
bytes that are the value of the string. Terminating null
characters are not used.
Strings are also used to store text. In that case, US-ASCII is
used for internal names, and ISO-10646 UTF-8 for text that might
be displayed to the user. The terminating null character SHOULD
NOT normally be stored in the string. For example: the US-ASCII
string "testing" is represented as 00 00 00 07 t e s t i n g. The
UTF-8 mapping does not alter the encoding of US-ASCII characters.
mpint
Represents multiple precision integers in two's complement format,
stored as a string, 8 bits per byte, MSB first. Negative numbers
have the value 1 as the most significant bit of the first byte of
the data partition. If the most significant bit would be set for
a positive number, the number MUST be preceded by a zero byte.
Unnecessary leading bytes with the value 0 or 255 MUST NOT be
included. The value zero MUST be stored as a string with zero
bytes of data.
By convention, a number that is used in modular computations in
Z_n SHOULD be represented in the range 0 <= x < n.
Examples:
value (hex)        representation (hex)
-----------        --------------------
0                  00 00 00 00
9a378f9b2e332a7    00 00 00 08 09 a3 78 f9 b2 e3 32 a7
80                 00 00 00 02 00 80
-1234              00 00 00 02 ed cc
-deadbeef          00 00 00 05 ff 21 52 41 11
name-list
A string containing a comma-separated list of names. A name-list
is represented as a uint32 containing its length (number of bytes
that follow) followed by a comma-separated list of zero or more
names. A name MUST have a non-zero length, and it MUST NOT
contain a comma (","). As this is a list of names, all of the
elements contained are names and MUST be in US-ASCII. Context may
impose additional restrictions on the names. For example, the
names in a name-list may have to be a list of valid algorithm
identifiers (see Section 6 below), or a list of RFC3066 language
tags. The order of the names in a name-list may or may not be
significant. Again, this depends on the context in which the list
is used. Terminating null characters MUST NOT be used, neither
for the individual names, nor for the list as a whole.
Examples:
value                      representation (hex)
-----                      --------------------
(), the empty name-list    00 00 00 00
("zlib")                   00 00 00 04 7a 6c 69 62
("zlib,none")              00 00 00 09 7a 6c 69 62 2c 6e 6f 6e 65
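The encodings quoted above are small enough to sketch directly. This is a minimal, std-only take on `uint32`, `string`, `name-list`, and non-negative `mpint`; the function names are hypothetical, and validation (for example, rejecting empty names or names containing commas) is left out to keep it short.

```rust
/// Encode a `uint32` in network byte order (big-endian).
pub fn put_u32(buf: &mut Vec<u8>, v: u32) {
    buf.extend_from_slice(&v.to_be_bytes());
}

/// Encode a `string`: a uint32 length followed by the raw bytes.
/// Assumes the data is shorter than 4 GiB.
pub fn put_string(buf: &mut Vec<u8>, s: &[u8]) {
    put_u32(buf, s.len() as u32);
    buf.extend_from_slice(s);
}

/// Encode a `name-list`: the names joined by commas, stored as a string.
/// Does not check that names are non-empty, comma-free US-ASCII.
pub fn put_name_list(buf: &mut Vec<u8>, names: &[&str]) {
    put_string(buf, names.join(",").as_bytes());
}

/// Encode a non-negative `mpint` from its big-endian magnitude bytes.
/// Negative values are not handled in this sketch.
pub fn put_mpint_unsigned(buf: &mut Vec<u8>, magnitude: &[u8]) {
    // Unnecessary leading zero bytes must not be included.
    let first = magnitude
        .iter()
        .position(|&b| b != 0)
        .unwrap_or(magnitude.len());
    let m = &magnitude[first..];
    let mut body = Vec::with_capacity(m.len() + 1);
    // If the top bit is set, prepend a zero byte so the value stays positive.
    if m.first().map_or(false, |&b| b & 0x80 != 0) {
        body.push(0);
    }
    body.extend_from_slice(m);
    put_string(buf, &body);
}

/// Decode a `string`, returning `(value, remaining_input)`.
/// Returns `None` if the input is truncated.
pub fn get_string(input: &[u8]) -> Option<(&[u8], &[u8])> {
    if input.len() < 4 {
        return None;
    }
    let (len_bytes, rest) = input.split_at(4);
    let len =
        u32::from_be_bytes([len_bytes[0], len_bytes[1], len_bytes[2], len_bytes[3]]) as usize;
    if rest.len() < len {
        return None;
    }
    Some(rest.split_at(len))
}
```

Feeding in the RFC's own examples (`"testing"`, `("zlib,none")`, the mpint `80`) should reproduce the byte sequences listed in the quoted tables, which makes them handy as unit tests.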
Some parts of sftp were wrapped to be more comfortable, particularly reading a file, which sftp exposes through specifying a maximum length to read and an offset. Two things:
- I'd been considering refactoring reading to be a stream, so maybe that's something we do to support reading subsets of a file. See #164.
- If we want to support reading everything, we need to wrap this.
See how libssh does it by checking out sftp_read and sftp_readdir.
Our own wrapper around wezterm-ssh uses the implementation of `AsyncRead` to invoke `read_to_string` to continue reading until EOF is reached, reading up to buf len bytes at a time. We can probably keep that logic as long as our wrapper supports `AsyncRead`.
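As a rough illustration of the "read everything" wrapper, here is a synchronous, std-only sketch of looping over offset-plus-max-length reads until EOF. The `ReadAt` trait and `MemFile` are hypothetical stand-ins for whatever sftp file handle we end up with; the real thing would be async and sit behind `AsyncRead`, but the offset bookkeeping would be the same.

```rust
use std::io;

/// Hypothetical stand-in for an sftp file handle: read up to `max_len`
/// bytes starting at `offset`. `Ok(None)` signals EOF. A returned chunk
/// may be shorter than `max_len`.
pub trait ReadAt {
    fn read_at(&mut self, offset: u64, max_len: u32) -> io::Result<Option<Vec<u8>>>;
}

/// Keep issuing reads, advancing the offset by however many bytes came
/// back, until the handle reports EOF.
pub fn read_to_end<R: ReadAt>(file: &mut R, chunk_len: u32) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    let mut offset = 0u64;
    loop {
        match file.read_at(offset, chunk_len)? {
            None => return Ok(out),
            // Defensive: treat an empty chunk as EOF to avoid spinning forever.
            Some(chunk) if chunk.is_empty() => return Ok(out),
            Some(chunk) => {
                offset += chunk.len() as u64;
                out.extend_from_slice(&chunk);
            }
        }
    }
}

/// In-memory fake used to exercise the loop without a server.
pub struct MemFile(pub Vec<u8>);

impl ReadAt for MemFile {
    fn read_at(&mut self, offset: u64, max_len: u32) -> io::Result<Option<Vec<u8>>> {
        let start = offset as usize;
        if start >= self.0.len() {
            return Ok(None);
        }
        let end = (start + max_len as usize).min(self.0.len());
        Ok(Some(self.0[start..end].to_vec()))
    }
}
```

Advancing by the number of bytes actually returned, rather than by `chunk_len`, matters because a server is allowed to return short reads.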
Good. 👍
See AspectUnk/russh-sftp#4 (comment). If you need any help with russh integration, let me know. I'm very interested in your project.
Thanks! Still haven't gotten to it yet. Was reading how to handle a pty using russh since I also need to support that.