twitter-archive / cloudhopper-smpp

Efficient, scalable, and flexible Java implementation of the Short Messaging Peer to Peer Protocol (SMPP)

Packed GSM encoding should pad with CR when required

stela opened this issue

When using packed 7-bit GSM encoding, no CR padding is added when the packed output ends with 7 unused bits. Those 7 zero bits are indistinguishable from a trailing "@" (GSM code 0x00), which causes "interesting" effects where different inputs are encoded to the same output bytes. In the test case below, "same" is true, so the assertion fails:

byte[] requiresPadding = CharsetUtil.CHARSET_PACKED_GSM.encode("1234567");
byte[] endsWithAt = CharsetUtil.CHARSET_PACKED_GSM.encode("1234567@");
boolean same = java.util.Arrays.equals(requiresPadding, endsWithAt);
Assert.assertFalse(same);

Similarly, such a padding CR character should be ignored when decoding.

Of course one can manually patch up the padding after encoding/decoding, but that's not very convenient. Whether padding is needed depends on whether the first character starts exactly on a byte boundary or at some other bit offset (as sometimes happens when the text follows a User Data Header, UDH). Maybe the encode method should be extended to take a variable number of prefix bits for packed GSM encoding, with a matching start-bit offset specified when decoding.
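
To make the proposal concrete, here is a minimal sketch of what such an encode/decode pair could look like. It is not cloudhopper-smpp code: the class PackedGsmPaddingSketch, its pack/unpack methods and the fillBits parameter are hypothetical names, and it works on raw 7-bit GSM code values rather than Java strings so that no alphabet table is needed. It assumes the usual LSB-first septet packing, and it does not handle the case where the text itself legitimately ends with a CR on a byte boundary.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of cloudhopper-smpp. */
final class PackedGsmPaddingSketch {
    static final int CR = 0x0D; // carriage return in the GSM 7-bit default alphabet

    /** Packs septets LSB-first after fillBits (0..6) leading zero bits. */
    static byte[] pack(int[] septets, int fillBits) {
        if (fillBits < 0 || fillBits > 6) {
            throw new IllegalArgumentException("fillBits must be 0..6");
        }
        int usedBits = fillBits + 7 * septets.length;
        int octets = (usedBits + 7) / 8;
        int spareBits = octets * 8 - usedBits;
        int[] in = septets;
        // Exactly 7 spare bits would decode as an extra '@' (0x00): fill them with CR.
        if (septets.length > 0 && spareBits == 7) {
            in = Arrays.copyOf(septets, septets.length + 1);
            in[in.length - 1] = CR;
        }
        byte[] out = new byte[octets];
        int bitPos = fillBits;
        for (int s : in) {
            int v = s & 0x7F;
            int idx = bitPos / 8;
            int shift = bitPos % 8;
            out[idx] |= (byte) (v << shift);
            if (shift > 1) { // septet spills into the next octet
                out[idx + 1] |= (byte) (v >> (8 - shift));
            }
            bitPos += 7;
        }
        return out;
    }

    /** Unpacks septets starting at bit offset fillBits and drops a padding CR. */
    static int[] unpack(byte[] octets, int fillBits) {
        if (fillBits < 0 || fillBits > 6) {
            throw new IllegalArgumentException("fillBits must be 0..6");
        }
        int count = Math.max(0, (octets.length * 8 - fillBits) / 7);
        int[] out = new int[count];
        int bitPos = fillBits;
        for (int i = 0; i < count; i++) {
            int idx = bitPos / 8;
            int shift = bitPos % 8;
            int v = (octets[idx] & 0xFF) >> shift;
            if (shift > 1) {
                v |= (octets[idx + 1] & 0xFF) << (8 - shift);
            }
            out[i] = v & 0x7F;
            bitPos += 7;
        }
        // A CR in the final septet that ends on an octet boundary is treated as padding.
        boolean endsOnBoundary = count > 0 && (fillBits + 7 * count) % 8 == 0;
        if (endsOnBoundary && out[count - 1] == CR) {
            return Arrays.copyOf(out, count - 1);
        }
        return out;
    }
}
```

With this sketch, packing the code values of "1234567" (0x31 to 0x37) with fillBits 0 yields 7 octets whose last octet is 0x1A (the CR shifted left by one bit), while "1234567@" yields 7 octets ending in 0x00, so the two inputs no longer collide. With fillBits set to 1, the padding case moves to 8-character inputs, which is why the offset has to be passed to both methods.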

Cleaning up old issues. Open to a PR if you think it would be helpful.