uuid-rs / uuid

Generate and parse UUIDs.

Home Page:https://www.crates.io/crates/uuid

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

`Timestamp::from_rfc4122` panics when using timestamp before unix epoch

zheland opened this issue · comments

Timestamp::from_rfc4122 method panics with attempt to subtract with overflow error when passing ticks value less than UUID_TICKS_BETWEEN_EPOCHS.

I'm not sure that the intentional use of a UUID with the wrong timestamp can be considered acceptable by the standard. But even if this is considered unacceptable, a # Panics section describing that the Timestamp::from_rfc4122 method panics at certain values would be useful in method description.

The UUID crate allowed these values up to and including version 1.1.2. In UUID v1.1.2 the timestamp stored RFC4122 ticks. After UUID v1.2.1 Timestamp started to use unix time instead.

use uuid;

#[test]
fn any_timestamp_value_supported() {
    uuid::Timestamp::from_rfc4122(u64::MAX, 0); // ok
    uuid::Timestamp::from_rfc4122(122192928000000000, 0); // ok
    uuid::Timestamp::from_rfc4122(122192927999999999, 0); // panics
    uuid::Timestamp::from_rfc4122(0, 0); // panics
}
thread 'any_timestamp_value_supported' panicked at 'attempt to subtract with overflow', /playground/.cargo/registry/src/github.com-1ecc6299db9ec823/uuid-1.3.2/src/timestamp.rs:135:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Code from the UUID crate for reference:

uuid/src/timestamp.rs

Lines 25 to 27 in 6188ecf

/// The number of 100 nanosecond ticks between the RFC4122 epoch
/// (`1582-10-15 00:00:00`) and the Unix epoch (`1970-01-01 00:00:00`).
pub const UUID_TICKS_BETWEEN_EPOCHS: u64 = 0x01B2_1DD2_1381_4000;

uuid/src/timestamp.rs

Lines 75 to 89 in 6188ecf

pub const fn from_rfc4122(ticks: u64, counter: u16) -> Self {
#[cfg(not(any(feature = "v1", feature = "v6")))]
{
let _ = counter;
}
let (seconds, nanos) = Self::rfc4122_to_unix(ticks);
Timestamp {
seconds,
nanos,
#[cfg(any(feature = "v1", feature = "v6"))]
counter,
}
}

uuid/src/timestamp.rs

Lines 133 to 138 in 6188ecf

const fn rfc4122_to_unix(ticks: u64) -> (u64, u32) {
(
(ticks - UUID_TICKS_BETWEEN_EPOCHS) / 10_000_000,
((ticks - UUID_TICKS_BETWEEN_EPOCHS) % 10_000_000) as u32 * 100,
)
}

Thanks for the report @zheland. I think we've got two options here:

  1. Enshrine the panic by making it explicit (not just debug) and add a section to the docs.
  2. Truncate the timestamp so those before the epoch will be set to the epoch, along with a section in the docs.

Do either of those options appeal to you more than the other?

@KodrAus,

  1. I don't like panic variant. I do not think that generating UUID with a time value smaller than unix epoch is a good reason for a panic. It seems to me that such things should be forgiven quietly in most UUID generation usecases.
  2. Truncating seems much better to me, but with an incorrect time value it may result in a very similar generated UUIDs for UUIDv1 and UUIDv6. However, if considering that the use of incorrect time is rather wrong, then this option is basically completely fine.
  3. Alternatively, I might suggest using wrapping operations when converting between rfc4122 and unix time. This will make it possible to obtain some unix time value from any rfc4122. And if necessary convert it back to the same rfc4122 timestamp.
    Pros: UUIDv1 will support any valid rfc4122 timestamp and UUIDv7 will support any valid unix timestamp.
    Cons: Possible silent wrapping operations when using invalid unix time to generate UUIDv1 or silent wrapping operations when using invalid rfc4122 to generate UUIDv7. If mentioned in the documentation it looks quite tolerable behavior.
// draft example, I have not checked the implementation in detail

pub const UUID_TICKS_BETWEEN_EPOCHS: u64 = 0x01B2_1DD2_1381_4000;
pub const UUID_SECS_BETWEEN_EPOCHS: u64 = UUID_TICKS_BETWEEN_EPOCHS / 10_000_000;

const fn unix_to_rfc4122_ticks(seconds: u64, nanos: u32) -> u64 {
    seconds
        .wrapping_add(UUID_SECS_BETWEEN_EPOCHS)
        .wrapping_mul(10_000_000)
        .wrapping_add(nanos as u64 / 100)
}

const fn rfc4122_to_unix(ticks: u64) -> (u64, u32) {
    (
        (ticks / 10_000_000).wrapping_sub(UUID_SECS_BETWEEN_EPOCHS),
        (ticks % 10_000_000) as u32 * 100,
    )
}

#[test]
fn any_rfc4122_can_be_mapped_to_the_corresponding_unixtime_and_back() {
    for (rfc4122, unix) in [
        (u64::MAX, (0x0000_01AA_A6D6_0F4A, 955_161_500)),
        (u64::MAX / 2 + 1, (0x0000_00D3_E741_3965, 477_580_800)),
        (UUID_TICKS_BETWEEN_EPOCHS, (0, 0)),
        (UUID_TICKS_BETWEEN_EPOCHS - 1, (u64::MAX, 999_999_900)),
        (10_000_000, (0xFFFF_FFFD_27AC_6381, 0)),
        (1, (0xFFFF_FFFD_27AC_6380, 100)),
        (0, (0xFFFF_FFFD_27AC_6380, 0)),
    ] {
        assert_eq!(rfc4122_to_unix(rfc4122), unix);
        assert_eq!(unix_to_rfc4122_ticks(unix.0, unix.1), rfc4122);
    }
}

#[test]
fn not_any_unixtime_can_be_mapped_to_the_corresponding_rfc4122() {
    assert_eq!(unix_to_rfc4122_ticks(0, 0), UUID_TICKS_BETWEEN_EPOCHS);
    assert_eq!(unix_to_rfc4122_ticks(0, 99), UUID_TICKS_BETWEEN_EPOCHS);
    assert_eq!(
        unix_to_rfc4122_ticks(0xFE00_0000_0000_0000, 0),
        UUID_TICKS_BETWEEN_EPOCHS
    );
}

I’m also more inclined to a non-panicking variant here. For anybody that’s concerned about quietly changing behaviour through wrapping or truncation we can offer a checked_from_rfc4122 that returns an Option.

I do like that the wrapping variant gives you a more uniquely useful timestamp. I’d be happy with an implementation that did that. It might also be worth checking the current draft spec that introduces v6-v8 and see if it recommends any behaviour converting RRC4122 timestamps to Unix time.

Thanks @zheland 👍 I've opened #691 that addresses this by explicitly wrapping, basically as you've suggested here. All we're guaranteeing is that those methods won't panic, not that they'll wrap to any particular value, so I've reflected that in the test by only ensuring overflowing inputs don't panic.