eclipse-ee4j / jaxb-fi

Exception thrown when decoding the attribute-values sequence within an initial vocabulary where an attribute value length is > 9

Tomas-Kraus opened this issue · comments

When decoding a Fast Infoset file that contains an initial vocabulary, the Decoder class throws an exception if the initial vocabulary contains attribute values with length > 8. From some investigation, it appears that the decoder parses the attribute-values sequence incorrectly: it treats each string value as a "NonEmptyOctetString starting on the second bit of an octet" (C.22) rather than a "NonEmptyOctetString starting on the fifth bit of an octet" (C.23). Because C.22 and C.23 encode the string length differently, the decoder cannot properly determine the length of a string once a value with length > 8 is encountered.
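
To make the length mismatch concrete, here is a minimal, self-contained sketch of the two length layouts. It is not the library's code: the class and method names are hypothetical, only the small and medium forms are modelled, and it assumes the leading bits of the first octet are zero (UTF-8 encoding format, no other flags set). The bit patterns and limits follow my reading of C.22 and C.23.

```java
public class OctetStringLengthDemo {

    /** Decoded length plus the number of header octets that carried it. */
    public static final class Header {
        public final int length;
        public final int octetsConsumed;

        Header(int length, int octetsConsumed) {
            this.length = length;
            this.octetsConsumed = octetsConsumed;
        }
    }

    /** C.22 layout: the length field starts on the 2nd bit (small and medium forms only). */
    public static Header readOnSecondBit(int[] octets) {
        final int b = octets[0] & 0x7F;           // drop the 1st bit
        if ((b & 0x40) == 0) {                    // '0' + 6-bit value: lengths 1..64
            return new Header((b & 0x3F) + 1, 1);
        }
        if ((b & 0x60) == 0x40) {                 // '10' + padding, then one octet: lengths 65..320
            return new Header(octets[1] + 65, 2);
        }
        throw new UnsupportedOperationException("large form omitted from this sketch");
    }

    /** C.23 layout: the length field starts on the 5th bit (small and medium forms only). */
    public static Header readOnFifthBit(int[] octets) {
        final int b = octets[0] & 0x0F;           // drop the first four bits
        if ((b & 0x08) == 0) {                    // '0' + 3-bit value: lengths 1..8
            return new Header((b & 0x07) + 1, 1);
        }
        if ((b & 0x0C) == 0x08) {                 // '10' + padding, then one octet: lengths 9..264
            return new Header(octets[1] + 9, 2);
        }
        throw new UnsupportedOperationException("large form omitted from this sketch");
    }

    private static void show(String label, int[] octets) {
        Header c22 = readOnSecondBit(octets);
        Header c23 = readOnFifthBit(octets);
        System.out.println(label + ": C.22 -> length " + c22.length + ", header " + c22.octetsConsumed
                + "; C.23 -> length " + c23.length + ", header " + c23.octetsConsumed);
    }

    public static void main(String[] args) {
        show("8 octets", new int[] {0x07});         // C.23 header of an 8-octet string
        show("10 octets", new int[] {0x08, 0x01});  // C.23 header of a 10-octet string
        // prints:
        // 8 octets: C.22 -> length 8, header 1; C.23 -> length 8, header 1
        // 10 octets: C.22 -> length 9, header 1; C.23 -> length 10, header 2
    }
}
```

Under these assumptions the two readings agree for lengths 1 to 8, which would explain why short attribute values decode fine. From length 9 onward the C.22 reading takes the medium-length marker as a small length and starts the string one octet early, so every following item is misaligned.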

Some more background:
Each entry of attribute-values in the initial vocabulary is an EncodedCharacterString (C.19). The code currently parses attribute-values by calling the following function from Decoder.decodeInitialVocabulary(), because the ParserVocabulary class stores attributeValue as a StringArray:

private void decodeTableItems(StringArray array) throws FastInfosetException, IOException {
    final int noOfItems = decodeNumberOfItemsOfSequence();

    for (int i = 0; i < noOfItems; i++) {
        array.add(decodeNonEmptyOctetStringOnSecondBitAsUtf8String());
    }
}

The call to decodeNonEmptyOctetStringOnSecondBitAsUtf8String() is the problem: it decodes each string in the sequence as a NonEmptyOctetString starting on the second bit (C.22) instead of a NonEmptyOctetString starting on the fifth bit (C.23).

The code to decode a NonEmptyOctetString starting on the fifth bit (C.23) already exists, but it is intermixed within decodeNonIdentifyingStringOnFirstBit(), which decodes a NonIdentifyingStringOrIndex (C.14). That type includes strings encoded as a NonEmptyOctetString starting on the fifth bit.
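
As a rough illustration of why the C.14 routine can be reused, the sketch below splits the first octet of a NonIdentifyingStringOrIndex into the fields that precede the fifth-bit length. The class and method names are hypothetical, and the bit assignments are my reading of C.14 and C.19 rather than code taken from the library. The special index value that marks an empty string is not modelled.

```java
public class NonIdentifyingStringFirstOctet {

    private static final String[] FORMATS = {
        "utf-8", "utf-16", "restricted-alphabet", "encoding-algorithm"
    };

    /** Describes the fields carried by the first four bits of the first octet. */
    public static String describe(int firstOctet) {
        if ((firstOctet & 0x80) != 0) {
            // 1st bit set: the rest is a table index, not a literal string
            return "string-index";
        }
        final boolean addToTable = (firstOctet & 0x40) != 0;   // 2nd bit
        final String format = FORMATS[(firstOctet & 0x30) >> 4]; // 3rd and 4th bits
        // The remaining four bits begin the C.23 length field.
        return "literal, add-to-table=" + addToTable + ", format=" + format;
    }

    public static void main(String[] args) {
        System.out.println(describe(0x08)); // literal, add-to-table=false, format=utf-8
        System.out.println(describe(0x58)); // literal, add-to-table=true, format=utf-16
        System.out.println(describe(0x80)); // string-index
    }
}
```

If the first two bits of an attribute-values item are zero, the item looks like a C.14 literal with add-to-table unset, so a routine derived from decodeNonIdentifyingStringOnFirstBit() would read its encoding format and fifth-bit length correctly.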

There are probably a handful of ways to fix this, but one shortcut is to create a new function based upon decodeNonIdentifyingStringOnFirstBit() which removes the code that's specific to the unique properties of C.14. I'll post a snippet of a fix shortly.

Affected Versions

[1.2.12]

@glassfishrobot Commented
Reported by khiggins

@glassfishrobot Commented
khiggins said:
Here is the fix that I did, although it's not perfect by any means:

Step 1. Added a new decodeTableItems function to Decoder.java that takes a boolean as a second argument. This new function in turn calls another new function:
private void decodeTableItems(StringArray array, boolean fifthBit) throws FastInfosetException, IOException {
    final int noOfItems = decodeNumberOfItemsOfSequence();

    for (int i = 0; i < noOfItems; i++) {
        // C.23 (fifth bit) when the flag is set, C.22 (second bit) otherwise
        array.add(fifthBit
                ? decodeNonEmptyOctetStringOnFifthBitAsString()
                : decodeNonEmptyOctetStringOnSecondBitAsUtf8String());
    }
}

Step 2. Added decodeNonEmptyOctetStringOnFifthBitAsString() to Decoder.java:
protected final String decodeNonEmptyOctetStringOnFifthBitAsString() throws FastInfosetException, IOException {
    decodeNonEmptyOctetStringOnFifthBit();
    return new String(_charBuffer, 0, _charBufferLength);
}

Step 3: Added new function decodeNonEmptyOctetStringOnFifthBit() to Decoder.java, which is mostly based on decodeNonIdentifyingStringOnFirstBit():
protected final int decodeNonEmptyOctetStringOnFifthBit() throws FastInfosetException, IOException {
    final int b = read();
    switch(DecoderStateTables.NISTRING(b)) {
        case DecoderStateTables.NISTRING_UTF8_SMALL_LENGTH:
            _octetBufferLength = (b & EncodingConstants.OCTET_STRING_LENGTH_5TH_BIT_SMALL_MASK) + 1;
            decodeUtf8StringAsCharBuffer();
            return NISTRING_STRING;
        case DecoderStateTables.NISTRING_UTF8_MEDIUM_LENGTH:
            _octetBufferLength = read() + EncodingConstants.OCTET_STRING_LENGTH_5TH_BIT_SMALL_LIMIT;
            decodeUtf8StringAsCharBuffer();
            return NISTRING_STRING;
        case DecoderStateTables.NISTRING_UTF8_LARGE_LENGTH:
        {
            final int length = (read() << 24) | (read() << 16) | (read() << 8) | read();
            _octetBufferLength = length + EncodingConstants.OCTET_STRING_LENGTH_5TH_BIT_MEDIUM_LIMIT;
            decodeUtf8StringAsCharBuffer();
            return NISTRING_STRING;
        }
        case DecoderStateTables.NISTRING_UTF16_SMALL_LENGTH:
            _octetBufferLength = (b & EncodingConstants.OCTET_STRING_LENGTH_5TH_BIT_SMALL_MASK) + 1;
            decodeUtf16StringAsCharBuffer();
            return NISTRING_STRING;
        case DecoderStateTables.NISTRING_UTF16_MEDIUM_LENGTH:
            _octetBufferLength = read() + EncodingConstants.OCTET_STRING_LENGTH_5TH_BIT_SMALL_LIMIT;
            decodeUtf16StringAsCharBuffer();
            return NISTRING_STRING;
        case DecoderStateTables.NISTRING_UTF16_LARGE_LENGTH:
        {
            final int length = (read() << 24) | (read() << 16) | (read() << 8) | read();
            _octetBufferLength = length + EncodingConstants.OCTET_STRING_LENGTH_5TH_BIT_MEDIUM_LIMIT;
            decodeUtf16StringAsCharBuffer();
            return NISTRING_STRING;
        }
        case DecoderStateTables.NISTRING_RA:
        {
            // Decode restricted alphabet integer
            _identifier = (b & 0x0F) << 4;
            final int b2 = read();
            _identifier |= (b2 & 0xF0) >> 4;
            decodeOctetsOnFifthBitOfNonIdentifyingStringOnFirstBit(b2);
            decodeRestrictedAlphabetAsCharBuffer();
            return NISTRING_STRING;
        }
        case DecoderStateTables.NISTRING_EA:
        {
            // Decode encoding algorithm integer
            _identifier = (b & 0x0F) << 4;
            final int b2 = read();
            _identifier |= (b2 & 0xF0) >> 4;
            decodeOctetsOnFifthBitOfNonIdentifyingStringOnFirstBit(b2);
            return NISTRING_ENCODING_ALGORITHM;
        }
        case DecoderStateTables.NISTRING_INDEX_SMALL:
            _integer = b & EncodingConstants.INTEGER_2ND_BIT_SMALL_MASK;
            return NISTRING_INDEX;
        case DecoderStateTables.NISTRING_INDEX_MEDIUM:
            _integer = (((b & EncodingConstants.INTEGER_2ND_BIT_MEDIUM_MASK) << 8) | read())
                    + EncodingConstants.INTEGER_2ND_BIT_SMALL_LIMIT;
            return NISTRING_INDEX;
        case DecoderStateTables.NISTRING_INDEX_LARGE:
            _integer = (((b & EncodingConstants.INTEGER_2ND_BIT_LARGE_MASK) << 16) | (read() << 8) | read())
                    + EncodingConstants.INTEGER_2ND_BIT_MEDIUM_LIMIT;
            return NISTRING_INDEX;
        case DecoderStateTables.NISTRING_EMPTY:
            return NISTRING_EMPTY_STRING;
        default:
            throw new FastInfosetException(CommonResourceBundle.getInstance().getString("message.NonEmptyOctetStringLengthOnFifthBit"));
    }
}

Step 4. Adjusted decodeInitialVocabulary() in the Decoder class to call the new decodeTableItems function created in step 1 above:
if ((b2 & EncodingConstants.INITIAL_VOCABULARY_ATTRIBUTE_VALUES_FLAG) > 0) {
    decodeTableItems(_v.attributeValue, true);
}

@glassfishrobot Commented
khiggins said:
meant to say > 8 in the title - sorry, typo.

@glassfishrobot Commented
This issue was imported from java.net JIRA FI-49