imkos / cbor

CBOR codec (RFC 8949) with CBOR tags, Go struct tags (toarray, keyasint, omitempty), float64/32/16, big.Int, and fuzz tested billions of execs.

CBOR Codec in Go

fxamacker/cbor is a library for encoding and decoding CBOR and CBOR Sequences.

CBOR is a trusted alternative to JSON, MessagePack, Protocol Buffers, etc.  CBOR is an Internet Standard defined by IETF STD 94 (RFC 8949) and is designed to be relevant for decades.

fxamacker/cbor is used in projects by Arm Ltd., Cisco, Dapper Labs, EdgeX Foundry, Fraunhofer‑AISEC, Linux Foundation, Microsoft, Mozilla, Oasis Protocol, Tailscale, Teleport, and others.

See Quick Start.

fxamacker/cbor is a CBOR codec in full conformance with IETF STD 94 (RFC 8949). It also supports CBOR Sequences (RFC 8742) and Extended Diagnostic Notation (Appendix G of RFC 8610).

Features include full support for CBOR tags, Core Deterministic Encoding, duplicate map key detection, etc.

Struct tags (toarray, keyasint, omitempty) reduce encoded size of structs.

API is mostly same as encoding/json, plus interfaces that simplify concurrency for CBOR options.

CBOR Security

Configurable limits help defend against malicious inputs.

Decoding 10 bytes of malicious data directly into []byte is efficiently rejected.

Codec | Speed (ns/op) | Memory | Allocs
fxamacker/cbor 2.5.0 | 43.95n ± 5% | 32 B/op | 2 allocs/op
ugorji/go 1.2.11 | 5353261.00n ± 4% | 67111321 B/op | 13 allocs/op

More Details and Prior Comparisons

Latest comparison used:

  • Input: []byte{0x9B, 0x00, 0x00, 0x42, 0xFA, 0x42, 0xFA, 0x42, 0xFA, 0x42}
  • go1.19.10, linux/amd64, i5-13600K (disabled all e-cores, DDR4 @2933)
  • go test -bench=. -benchmem -count=20
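
To see why this 10-byte input is hostile, it helps to read its first nine bytes as a CBOR head per RFC 8949. The sketch below uses only the Go standard library. The helper name declaredArrayLen is hypothetical, and this is not how fxamacker/cbor parses input. It only shows that the input claims to be an array with trillions of elements while carrying a single trailing byte.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// declaredArrayLen (hypothetical helper) reads a CBOR head whose initial
// byte has major type 4 (array) and additional info 27 (an 8-byte
// big-endian length follows). It returns the element count the input
// claims to contain.
func declaredArrayLen(data []byte) (uint64, bool) {
	if len(data) < 9 {
		return 0, false
	}
	majorType := data[0] >> 5 // high 3 bits of the initial byte
	addInfo := data[0] & 0x1f // low 5 bits of the initial byte
	if majorType != 4 || addInfo != 27 {
		return 0, false
	}
	return binary.BigEndian.Uint64(data[1:9]), true
}

func main() {
	input := []byte{0x9B, 0x00, 0x00, 0x42, 0xFA, 0x42, 0xFA, 0x42, 0xFA, 0x42}
	n, ok := declaredArrayLen(input)
	fmt.Printf("declared elements: %d (parsed: %v), bytes after head: %d\n",
		n, ok, len(input)-9)
}
```

A decoder that preallocates based on the declared length would try to reserve a huge amount of memory. A decoder that checks the declared length against a limit first can reject the input almost immediately.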

Prior comparisons

Codec | Speed (ns/op) | Memory | Allocs
fxamacker/cbor 2.5.0-beta2 | 44.33 ± 2% | 32 B/op | 2 allocs/op
fxamacker/cbor 0.1.0 - 2.4.0 | ~44.68 ± 6% | 32 B/op | 2 allocs/op
ugorji/go 1.2.10 | 5524792.50 ± 3% | 67110491 B/op | 12 allocs/op
ugorji/go 1.1.0 - 1.2.6 | 💥 runtime: | out of memory: | cannot allocate

  • Input: []byte{0x9B, 0x00, 0x00, 0x42, 0xFA, 0x42, 0xFA, 0x42, 0xFA, 0x42}
  • go1.19.6, linux/amd64, i5-13600K (DDR4)
  • go test -bench=. -benchmem -count=20

Design and Feature Highlights

Design balances tradeoffs between speed, security, memory, encoded data size, usability, etc.

Highlights

🚀  Speed

Encoding and decoding is fast without using Go's unsafe package. Slower settings are opt-in. Default limits allow very fast and memory efficient rejection of malformed CBOR data.

🔒  Security

Decoder has configurable limits that defend against malicious inputs. Duplicate map key detection is supported. By contrast, encoding/gob is not designed to be hardened against adversarial inputs.

Codec passed multiple confidential security assessments in 2022. No vulnerabilities found in subset of codec in a nonconfidential security assessment prepared by NCC Group for Microsoft Corporation.

🗜️  Data Size

Struct tags (toarray, keyasint, omitempty) automatically reduce size of encoded structs. Encoding optionally shrinks float64→32→16 when values fit.

🧩  Usability

API is mostly same as encoding/json plus interfaces that simplify concurrency for CBOR options. Encoding and decoding modes can be created at startup and reused by any goroutines.

Presets include Core Deterministic Encoding, Preferred Serialization, CTAP2 Canonical CBOR, etc.

📆  Extensibility

Features include CBOR extension points (e.g. CBOR tags) and extensive settings. API has interfaces that allow users to create custom encoding and decoding without modifying this library.

Quick Start

Install: go get github.com/fxamacker/cbor/v2 and import "github.com/fxamacker/cbor/v2".

Key Points

  • Encoding and decoding modes are created from options (settings).
  • Modes can be created at startup and reused.
  • Modes are safe for concurrent use.

Default Mode

Package level functions only use default settings.
They provide the "default mode" of encoding and decoding.

// API matches encoding/json.
b, err := cbor.Marshal(v)        // encode v to []byte b
err = cbor.Unmarshal(b, &v)      // decode []byte b to v
encoder := cbor.NewEncoder(w)    // create encoder with io.Writer w
decoder := cbor.NewDecoder(r)    // create decoder with io.Reader r

Some CBOR-based formats or protocols may require non-default settings.

For example, WebAuthn uses "CTAP2 Canonical CBOR" settings. It is available as a preset.

Presets

Presets can be used as-is or as a starting point for custom settings.

// EncOptions is a struct of encoder settings.
func CoreDetEncOptions() EncOptions              // RFC 8949 Core Deterministic Encoding
func PreferredUnsortedEncOptions() EncOptions    // RFC 8949 Preferred Serialization
func CanonicalEncOptions() EncOptions            // RFC 7049 Canonical CBOR
func CTAP2EncOptions() EncOptions                // FIDO2 CTAP2 Canonical CBOR

Presets are used to create custom modes.

Custom Modes

Modes are created from settings. Once created, modes have immutable settings.

💡 Create the mode at startup and reuse it. It is safe for concurrent use.

// Create encoding mode.
opts := cbor.CoreDetEncOptions()   // use preset options as a starting point
opts.Time = cbor.TimeUnix          // change any settings if needed
em, err := opts.EncMode()          // create an immutable encoding mode

// Reuse the encoding mode. It is safe for concurrent use.

// API matches encoding/json.
b, err := em.Marshal(v)            // encode v to []byte b
encoder := em.NewEncoder(w)        // create encoder with io.Writer w
err = encoder.Encode(v)            // encode v to io.Writer w

Default mode and custom modes automatically apply struct tags.

Struct Tags

Struct tags (toarray, keyasint, omitempty) reduce encoded size of structs.

Example using struct tags

Struct tags simplify use of CBOR-based protocols that require CBOR arrays or maps with integer keys.

CBOR Tags

CBOR tags are specified in a TagSet.

Custom modes can be created with a TagSet to handle CBOR tags.

em, err := opts.EncMode()                  // no CBOR tags
em, err := opts.EncModeWithTags(ts)        // immutable CBOR tags
em, err := opts.EncModeWithSharedTags(ts)  // mutable shared CBOR tags

TagSet and modes using it are safe for concurrent use. Equivalent API is available for DecMode.

Example using TagSet and TagOptions

// Use signedCWT struct defined in "Decoding CWT" example.

// Create TagSet (safe for concurrency).
tags := cbor.NewTagSet()
// Register tag COSE_Sign1 18 with signedCWT type.
if err := tags.Add(
	cbor.TagOptions{EncTag: cbor.EncTagRequired, DecTag: cbor.DecTagRequired},
	reflect.TypeOf(signedCWT{}),
	18); err != nil {
	return err
}

// Create DecMode with immutable tags.
dm, _ := cbor.DecOptions{}.DecModeWithTags(tags)

// Unmarshal to signedCWT with tag support.
var v signedCWT
if err := dm.Unmarshal(data, &v); err != nil {
	return err
}

// Create EncMode with immutable tags.
em, _ := cbor.EncOptions{}.EncModeWithTags(tags)

// Marshal signedCWT with tag number, using the tag-aware mode.
data, err := em.Marshal(v)
if err != nil {
	return err
}

Functions and Interfaces

Functions and interfaces at a glance

Common functions with same API as encoding/json:

  • Marshal, Unmarshal
  • NewEncoder, (*Encoder).Encode
  • NewDecoder, (*Decoder).Decode

NOTE: Unmarshal returns ExtraneousDataError if bytes remain after the first CBOR data item, because RFC 8949 treats a CBOR data item followed by remaining bytes as malformed.

  • 💡 Use UnmarshalFirst to decode first CBOR data item and return any remaining bytes.

Other useful functions:

  • Diagnose, DiagnoseFirst produce human-readable Extended Diagnostic Notation from CBOR data.
  • UnmarshalFirst decodes the first CBOR data item and returns any remaining bytes.
  • Wellformed returns true if the CBOR data item is well-formed.

Interfaces identical or comparable to Go encoding packages include:
Marshaler, Unmarshaler, BinaryMarshaler, and BinaryUnmarshaler.

The RawMessage type can be used to delay CBOR decoding or precompute CBOR encoding.

Security Tips

🔒 Use Go's io.LimitReader to limit size when decoding very large or indefinite size data.

Default limits may need to be increased for systems handling very large data (e.g. blockchains).

DecOptions can be used to modify default limits for MaxArrayElements, MaxMapPairs, and MaxNestedLevels.

Status

v2.5.0 was released on Sunday, August 13, 2023. It is fuzz tested and production quality.

IMPORTANT: Before upgrading from prior release, please read the notable changes highlighted in the release notes.

See latest releases and v2.5.0 release notes for list of new features and improvements.

Who uses fxamacker/cbor

fxamacker/cbor is used in projects by Arm Ltd., Berlin Institute of Health at Charité, Chainlink, Cisco, Confidential Computing Consortium, ConsenSys, Dapper Labs, EdgeX Foundry, F5, Fraunhofer‑AISEC, Linux Foundation, Microsoft, Mozilla, National Cybersecurity Agency of France (govt), Netherlands (govt), Oasis Protocol, Smallstep, Tailscale, Taurus SA, Teleport, TIBCO, and others.

GitHub's report of around 200 dependent repos covers only v1 (old version). For v2 (current version), GitHub reports 2000+ repositories depend on fxamacker/cbor.

fxamacker/cbor passed multiple confidential security assessments. A nonconfidential security assessment (prepared by NCC Group for Microsoft Corporation) includes a subset of fxamacker/cbor v2.4.0 in its scope.

Standards

This library is a full-featured generic CBOR (RFC 8949) encoder and decoder. Notable CBOR features include:

CBOR Feature | Description
CBOR tags | API supports built-in and user-defined tags.
Preferred serialization | Integers encode to fewest bytes. Optional float64 → float32 → float16.
Map key sorting | Unsorted, length-first (Canonical CBOR), and bytewise-lexicographic (CTAP2).
Duplicate map keys | Always forbid for encoding and option to allow/forbid for decoding.
Indefinite length data | Option to allow/forbid for encoding and decoding.
Well-formedness | Always checked and enforced.
Basic validity checks | Optionally check UTF-8 validity and duplicate map keys.
Security considerations | Prevent integer overflow and resource exhaustion (RFC 8949 Section 10).

Known limitations are noted in the Limitations section.

Go nil values for slices, maps, pointers, etc. are encoded as CBOR null. Empty slices, maps, etc. are encoded as empty CBOR arrays and maps.

Decoder checks for all required well-formedness errors, including all "subkinds" of syntax errors and too little data.

After well-formedness is verified, basic validity errors are handled as follows:

  • Invalid UTF-8 string: Decoder has option to check and return invalid UTF-8 string error. This check is enabled by default.
  • Duplicate keys in a map: Decoder has options to ignore or enforce rejection of duplicate map keys.

When decoding well-formed CBOR arrays and maps, decoder saves the first error it encounters and continues with the next item. Options to handle this differently may be added in the future.

By default, decoder treats time values of floating-point NaN and Infinity as if they are CBOR Null or CBOR Undefined.

Duplicate Map Keys

This library provides options for fast detection and rejection of duplicate map keys based on applying a Go-specific data model to CBOR's extended generic data model in order to determine duplicate vs distinct map keys. Detection relies on whether the CBOR map key would be a duplicate "key" when decoded and applied to the user-provided Go map or struct.

DupMapKeyQuiet turns off detection of duplicate map keys. It tries to use a "keep fastest" method by choosing either "keep first" or "keep last" depending on the Go data type.

DupMapKeyEnforcedAPF enforces detection and rejection of duplicate map keys. Decoding stops immediately and returns DupMapKeyError when the first duplicate key is detected. The error includes the duplicate map key and the index number.

APF suffix means "Allow Partial Fill" so the destination map or struct can contain some decoded values at the time of error. It is the caller's responsibility to respond to the DupMapKeyError by discarding the partially filled result if that's required by their protocol.

Tag Validity

This library checks tag validity for built-in tags (currently tag numbers 0, 1, 2, 3, and 55799):

  • Inadmissible type for tag content
  • Inadmissible value for tag content

Unknown tag data items (not tag number 0, 1, 2, 3, or 55799) are handled in two ways:

  • When decoding into an empty interface, unknown tag data item will be decoded into cbor.Tag data type, which contains tag number and tag content. The tag content will be decoded into the default Go data type for the CBOR data type.
  • When decoding into other Go types, unknown tag data item is decoded into the specified Go type. If Go type is registered with a tag number, the tag number can optionally be verified.

Decoder also has an option to forbid tag data items (treat any tag data item as error) which is specified by protocols such as CTAP2 Canonical CBOR.

For more information, see decoding options and tag options.

Limitations

If any of these limitations prevent you from using this library, please open an issue along with a link to your project.

  • CBOR Undefined (0xf7) value decodes to Go's nil value. CBOR Null (0xf6) more closely matches Go's nil.
  • CBOR map keys with data types not supported by Go for map keys are ignored and an error is returned after continuing to decode remaining items.
  • When decoding registered CBOR tag data to interface type, decoder creates a pointer to registered Go type matching CBOR tag number. Requiring a pointer for this is a Go limitation.

Fuzzing and Code Coverage

Code coverage must not fall below 95% when tagging a release. Code coverage is above 96% (go test -cover) for fxamacker/cbor v2.5.

Coverage-guided fuzzing must pass billions of execs before tagging a release. Fuzzing is done using nonpublic code which may eventually get merged into this project. Until then, reports like OpenSSF Scorecard can't detect fuzz tests being used by this project.


Versions and API Changes

This project uses Semantic Versioning, so the API is always backwards compatible unless the major version number changes.

These functions have signatures identical to encoding/json and they will likely never change even after major new releases:
Marshal, Unmarshal, NewEncoder, NewDecoder, (*Encoder).Encode, and (*Decoder).Decode.

Exclusions from SemVer:

  • Newly added API documented as "subject to change".
  • Newly added API in the master branch that has never been release tagged.
  • Bug fixes that change behavior (e.g. return error that was missed in prior version) if function parameters are unchanged. We try to highlight these in the release notes.

Code of Conduct

This project has adopted the Contributor Covenant Code of Conduct. Contact faye.github@gmail.com with any questions or comments.

Contributing

Please open an issue before beginning work on a PR. The improvement may have already been considered, etc.

For more info, see How to Contribute.

Security Policy

Security fixes are provided for the latest released version of fxamacker/cbor.

For the full text of the Security Policy, see SECURITY.md.

Acknowledgements

Many thanks to all the contributors on this project!

I'm especially grateful to Bastian Müller and Dieter Shirley for suggesting and collaborating on CBOR stream mode, and much more.

I'm very grateful to Stefan Tatschner, Yawning Angel, Jernej Kos, x448, ZenGround0, and Jakob Borg for their contributions or support in the very early days.

This library clearly wouldn't be possible without Carsten Bormann authoring CBOR RFCs.

Special thanks to Laurence Lundblade and Jeffrey Yasskin for their help on IETF mailing list or at 7049bis.

This library uses x448/float16 which used to be included. Now as a standalone package, x448/float16 is useful to other projects as well.

License

Copyright © 2019-2023 Faye Amacker.

fxamacker/cbor is licensed under the MIT License. See LICENSE for the full license text.

