Runtime error on TextEncoder/TextDecoder
Autokaka opened this issue · comments
I'm running the @bufbuild/protobuf module on a JavaScript runtime that provides only ECMAScript built-ins (for example, QuickJS::Eval), and the runtime throws this exception:
Error message:TextDecoder is not defined
SourceCode:
this.textDecoder = textDecoder !== null && textDecoder !== void 0 ? textDecoder : new TextDecoder();
^
Stacktrace:
at BinaryReader (oh_modules/.ohpm/@bufbuild+protobuf@1.6.0/oh_modules/@bufbuild/protobuf/dist/esm/binary-encoding.js:281:95)
at readerFactory (oh_modules/.ohpm/@bufbuild+protobuf@1.6.0/oh_modules/@bufbuild/protobuf/dist/esm/private/binary-format-common.js:25:31)
at fromBinary (oh_modules/.ohpm/@bufbuild+protobuf@1.6.0/oh_modules/@bufbuild/protobuf/dist/esm/message.js:45:34)
at fromBinary (oh_modules/.ohpm/@bufbuild+protobuf@1.6.0/oh_modules/@bufbuild/protobuf/dist/esm/google/protobuf/descriptor_pb.js:1680:16)
at func_main_0 (oh_modules/.ohpm/@bufbuild+protobuf@1.6.0/oh_modules/@bufbuild/protobuf/dist/esm/private/feature-set.js:19:35)
This issue occurs at 'feature-set.ts':
export const featureSetDefaults = FeatureSetDefaults.fromBinary(protoBase64.dec(
/*upstream-inject-feature-defaults-start*/ "ChESDAgBEAIYAiABKAEwAhjmBwoREgwIAhABGAEgAigBMAEY5wcKERIMCAEQARgBIAIoATABGOgHIOYHKOgH" /*upstream-inject-feature-defaults-end*/)); // This method uses TextDecoder...
TextEncoder/TextDecoder are only available on platforms such as browsers and Node.js; they are not part of ECMAScript itself.
According to the source code, there is no way to make configurations for TextEncoder/TextDecoder.
Should we implement the TextEncoder/TextDecoder using pure JavaScript since this library can run in any ECMAScript runtime?
Or will there be a configuration for this?
According to the source code, there is no way to make configurations for TextEncoder/TextDecoder.
In addition to various polyfills, the toBinary/fromBinary methods take an optional BinaryWriteOptions/BinaryReadOptions object, which can specify a writerFactory/readerFactory function respectively, as mentioned in the docs. Those functions need to produce an IBinaryWriter/IBinaryReader, which can be implemented using the BinaryWriter/BinaryReader classes. You simply need to implement TextEncoderLike/TextDecoderLike:
protobuf-es/packages/protobuf/src/binary-encoding.ts
Lines 73 to 74 in 246e6df
// my-binary-format.ts
import {BinaryWriter, BinaryReader} from '@bufbuild/protobuf';
import type {IBinaryReader, IBinaryWriter} from '@bufbuild/protobuf';
const myTextDecoderLike = {
  decode: (input?: Uint8Array): string => {
    /* Implement this yourself */
    throw new Error('decode is not implemented');
  }
};
const myTextEncoderLike = {
  encode: (input?: string): Uint8Array => {
    /* Implement this yourself */
    throw new Error('encode is not implemented');
  }
};
// Factories that hand the custom codecs to the stock reader/writer classes
export const readerFactory = (bytes: Uint8Array): IBinaryReader => new BinaryReader(bytes, myTextDecoderLike);
export const writerFactory = (): IBinaryWriter => new BinaryWriter(myTextEncoderLike);
// usage
import {FeatureSetDefaults, protoBase64} from '@bufbuild/protobuf';
import {readerFactory, writerFactory} from './my-binary-format';
// read
export const featureSetDefaults = FeatureSetDefaults.fromBinary(
  protoBase64.dec(...),
  { readerFactory }
);
// write
const featureSetDefaultsBin = FeatureSetDefaults.toBinary(
  featureSetDefaults,
  { writerFactory }
);
Should we implement the TextEncoder/TextDecoder using pure JavaScript since this library can run in any ECMAScript runtime?
Probably not, since UTF-8 encoding/decoding is notoriously difficult to do both correctly and fast. TextEncoder/TextDecoder are available in all browsers along with Node, Deno, and Bun, and you can polyfill or customize the binary options as shown above if your runtime does not have them available.
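For anyone who wants something to start from, here is a minimal sketch of what the two "Implement this yourself" placeholders could look like in plain ECMAScript. The names `simpleUtf8Encoder`, `simpleUtf8Decoder`, and the two local interfaces are made up for this example; the interfaces only assume the single-method encode/decode shape used in the snippet above. It is not tuned for speed, does not strip a byte order mark, and on malformed input it may emit a different number of U+FFFD characters than a platform TextDecoder would, so treat it as a starting point rather than a drop-in replacement.

```typescript
// Local stand-ins (hypothetical names) for the encoder/decoder shapes,
// assuming a single encode or decode method each.
interface Utf8EncoderLike {
  encode(input?: string): Uint8Array;
}
interface Utf8DecoderLike {
  decode(input?: Uint8Array): string;
}

const simpleUtf8Encoder: Utf8EncoderLike = {
  encode(input: string = ""): Uint8Array {
    const out: number[] = [];
    for (let i = 0; i < input.length; i++) {
      let cp = input.charCodeAt(i);
      // Combine a valid surrogate pair into one code point.
      if (cp >= 0xd800 && cp <= 0xdbff && i + 1 < input.length) {
        const low = input.charCodeAt(i + 1);
        if (low >= 0xdc00 && low <= 0xdfff) {
          cp = 0x10000 + ((cp - 0xd800) << 10) + (low - 0xdc00);
          i++;
        }
      }
      // A lone surrogate cannot be encoded; substitute U+FFFD.
      if (cp >= 0xd800 && cp <= 0xdfff) cp = 0xfffd;
      if (cp < 0x80) {
        out.push(cp);
      } else if (cp < 0x800) {
        out.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
      } else if (cp < 0x10000) {
        out.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
      } else {
        out.push(
          0xf0 | (cp >> 18),
          0x80 | ((cp >> 12) & 0x3f),
          0x80 | ((cp >> 6) & 0x3f),
          0x80 | (cp & 0x3f),
        );
      }
    }
    return new Uint8Array(out);
  },
};

const simpleUtf8Decoder: Utf8DecoderLike = {
  decode(input: Uint8Array = new Uint8Array(0)): string {
    let out = "";
    let i = 0;
    while (i < input.length) {
      const lead = input[i++];
      if (lead < 0x80) {
        out += String.fromCharCode(lead);
        continue;
      }
      let need = 0; // number of continuation bytes expected
      let cp = 0;   // code point being assembled
      let min = 0;  // smallest code point allowed for this length (rejects overlongs)
      if (lead >= 0xc2 && lead <= 0xdf) { need = 1; cp = lead & 0x1f; min = 0x80; }
      else if (lead >= 0xe0 && lead <= 0xef) { need = 2; cp = lead & 0x0f; min = 0x800; }
      else if (lead >= 0xf0 && lead <= 0xf4) { need = 3; cp = lead & 0x07; min = 0x10000; }
      else { out += "\ufffd"; continue; } // invalid lead byte
      let j = i;
      let ok = true;
      for (let k = 0; k < need; k++, j++) {
        if (j >= input.length || (input[j] & 0xc0) !== 0x80) { ok = false; break; }
        cp = (cp << 6) | (input[j] & 0x3f);
      }
      if (!ok || cp < min || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) {
        out += "\ufffd"; // malformed sequence; resume right after the lead byte
        continue;
      }
      i = j;
      if (cp < 0x10000) {
        out += String.fromCharCode(cp);
      } else {
        const c = cp - 0x10000; // split into a UTF-16 surrogate pair
        out += String.fromCharCode(0xd800 + (c >> 10), 0xdc00 + (c & 0x3ff));
      }
    }
    return out;
  },
};
```

Objects like these could be passed to BinaryReader/BinaryWriter in place of myTextDecoderLike/myTextEncoderLike. If the runtime lets you expose native code, a well-tested C implementation is still likely to be the safer and faster choice.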
@Autokaka, we've just updated the docs to explain this a bit better:
Internally, the classes use TextEncoder and TextDecoder from the text encoding API to encode and decode text as UTF-8. In an environment where this API is unavailable, you need to bring your own UTF-8 encoder. To do so, you can use the serialization options writerFactory and readerFactory to provide your own implementation.
For the specific case of feature-set.ts, it's currently not possible to provide your own codecs. I think we should try to fix this up. You can still set your own with globalThis.TextEncoder = ... though, in the meantime.
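A rough sketch of that global workaround, under a few assumptions: `installTextCodecGlobals`, `AsciiOnlyEncoder`, and `AsciiOnlyDecoder` are hypothetical names invented for this example, and the ASCII-only classes are placeholders that only keep the snippet self-contained. Real code would supply full UTF-8 implementations instead.

```typescript
// Placeholder codec classes (hypothetical): they handle ASCII only and
// throw on anything else. Swap in real UTF-8 implementations.
class AsciiOnlyEncoder {
  encode(input: string = ""): Uint8Array {
    const out = new Uint8Array(input.length);
    for (let i = 0; i < input.length; i++) {
      const c = input.charCodeAt(i);
      if (c > 0x7f) throw new RangeError("AsciiOnlyEncoder: non-ASCII input");
      out[i] = c;
    }
    return out;
  }
}

class AsciiOnlyDecoder {
  decode(input: Uint8Array = new Uint8Array(0)): string {
    let s = "";
    for (let i = 0; i < input.length; i++) {
      if (input[i] > 0x7f) throw new RangeError("AsciiOnlyDecoder: non-ASCII input");
      s += String.fromCharCode(input[i]);
    }
    return s;
  }
}

type CodecCtor = new () => object;

// Install the given constructors on globalThis, but only where the runtime
// does not already provide one. Returns the names that were installed.
function installTextCodecGlobals(EncoderCtor: CodecCtor, DecoderCtor: CodecCtor): string[] {
  const g = globalThis as unknown as Record<string, unknown>;
  const installed: string[] = [];
  if (typeof g["TextEncoder"] === "undefined") {
    g["TextEncoder"] = EncoderCtor;
    installed.push("TextEncoder");
  }
  if (typeof g["TextDecoder"] === "undefined") {
    g["TextDecoder"] = DecoderCtor;
    installed.push("TextDecoder");
  }
  return installed;
}

installTextCodecGlobals(AsciiOnlyEncoder, AsciiOnlyDecoder);
```

Because the decoder is needed as soon as the library's module is evaluated, the installation has to happen first. With ES modules, imports are evaluated before the body of the importing module, so the polyfill would need to live in its own module that is imported ahead of @bufbuild/protobuf.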
For QuickJS, the best approach is probably to do the work in C with a solid library, and make it available to the JS world conforming to the TextDecoder / TextEncoder interfaces. A quick search suggests that it may have been done before: https://github.com/rsenn/qjs-modules/blob/main/quickjs-textcode.h
Thanks, these answers help a lot! I'll give it a try.
v1.7.1 includes a fix - you will still have to bring your own implementation of the text encoding API, but we don't try to access it at module init time.