Crell / Serde

Robust Serde (serialization/deserialization) library for PHP 8.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Using string-backed enums in type maps

AegirLeet opened this issue · comments

Detailed description

I would like to deserialize a string value into a string-backed Enum instance while also using the same string as part of a type map. Like this:

enum Version: string
{
    case V1 = 'v1';
}

#[StaticTypeMap(key: 'version', map: [Version::V1->value => V1::class])]
interface Payload
{
    //
}

class V1 implements Payload
{
    public function __construct(
        public readonly Version $version,
        public readonly string  $foo,
        public readonly int     $bar,
    ) {
        //
    }
}

$json = <<<JSON
{
    "version": "v1",
    "foo": "asd",
    "bar": 123
}
JSON;


$serde   = new SerdeCommon();
$payload = $serde->deserialize(serialized: $json, from: 'json', to: Payload::class);
var_dump($payload);

This currently leads to an error:

TypeError: Crell\Serde\Attributes\StaticTypeMap::findClass(): Argument #1 ($id) must be of type string, Version given, called in vendor/crell/serde/src/TypeMapper.php on line 61 and defined in vendor/crell/serde/src/Attributes/StaticTypeMap.php:38

Context

Removing the $version property or changing its type to string fixes this issue. However, I think using enums to represent a set of known discriminator values is generally a good idea.
The current behavior is pretty unintuitive.

Possible implementation

We could either patch TypeMapper like this:

diff --git a/src/TypeMapper.php b/src/TypeMapper.php
index 6559322..b974198 100644
--- a/src/TypeMapper.php
+++ b/src/TypeMapper.php
@@ -58,6 +58,10 @@ class TypeMapper
             return null;
         }

+        if ($key instanceof \BackedEnum) {
+            $key = $key->value;
+        }
+
         if (!$class = $map->findClass($key)) {
             throw NoTypeMapDefinedForKey::create($key, $field->phpName ?? $field->phpType);
         }

Or add a separate TypeMap implementation - something like this:

#[Attribute(Attribute::TARGET_CLASS | Attribute::TARGET_PROPERTY)]
class BackedEnumTypeMap extends StaticTypeMap
{
    public function findClass(string|BackedEnum $id): ?string
    {
        if ($id instanceof BackedEnum) {
            $id = $id->value;
        }

        return parent::findClass($id);
    }
}

int-backed enums would remain unsupported. I don't think this is an issue, as ints are already unsupported in static type maps anyway.

Your environment

$ php -v
PHP 8.2.9 (cli) (built: Aug 16 2023 19:49:37) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.2.9, Copyright (c) Zend Technologies
    with Zend OPcache v8.2.9, Copyright (c), by Zend Technologies
    with Xdebug v3.2.1, Copyright (c) 2002-2023, by Derick Rethans
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.3 LTS
Release:        22.04
Codename:       jammy
$ composer show -t crell/serde
crell/serde dev-master A general purpose serialization and deserialization library
├──crell/attributeutils ~0.8.2
│  ├──crell/fp ~0.4.0
│  │  └──php ~8.1
│  └──php ~8.1
├──crell/fp >= 0.3.3
│  └──php ~8.1
└──php ~8.1

Interesting! I can see where Enums make sense here, conceptually. If we're going to go that far, however, does it make sense to go even farther?

enum Version: string 
{
    case v1 = V1::class;
    case v2 = V2::class;
}

Put the entire lookup table in the enum itself.

Now... I'm not actually sure if that's a good approach, but it's worth considering. There's probably other options as well.

In general I'd favor a separate TypeMapper for enums specifically, although I'd prefer to not extend from StaticTypeMap. Just make it its own class.

In general I'd favor a separate TypeMapper for enums specifically, although I'd prefer to not extend from StaticTypeMap. Just make it its own class.

Sounds good to me. Should I start working on a PR?

Well, I'm still undecided about how the enums should be treated. If we go with the "enum defines everything" approach, then it becomes as simple as:

#[EnumMap(Version::class)]

However, I'm not sure if using the enum name as the key and value as the class is wise; that's somewhat abusing enums. The alternative would be a defined method that takes a value and returns the corresponding class, like so:

enum Version
{
    case v1;
    case v2;

    public function getClass() {
        return match ($this) {
             self::v1 => V1::class,
             self::v2 => V2::class,
        }
    }
}

Which then seems to suggest that an enum could itself just implement TypeMap and implement the corresponding methods (it's a class, after all), but that creates a problem because TypeMap currently requires the identifier to be a string. That... may be something to change, to support an enum here?

I'm not sure which way forward is best, so I'd hate for you to try to implement one and then we decide to do a different one. (Though, if you're up for it, it could be a useful experiment.)

Edit: Wait, no, an Enum cannot be a TypeMap because it cannot be instantiated arbitrarily. Dur.

However, I'm not sure if using the enum name as the key and value as the class is wise; that's somewhat abusing enums.

Agreed. That would be an odd and unnecessary limitation. I think a more generic solution where users aren't forced to use class names as their Enum values would be better.

but that creates a problem because TypeMap currently requires the identifier to be a string. That... may be something to change, to support an enum here?

Yes, and now that I've thought about it some more, I'm not sure a new TypeMap is a good idea.

I don't think we should be splitting the TypeMaps by key type (StaticTypeMap for string keys, EnumTypeMap for Enum keys, ...).
They should be split by behavior. StaticTypeMap and ClassNameTypeMap are already perfect examples of this.

In terms of behavior, what I'm trying to achieve is pretty much exactly what the StaticTypeMap already does. Take a value, do a map lookup, get the target class.
The only issue is that the value gets wrapped in an Enum before that lookup happens.

I think requiring identifiers to be strings is fine for now. We just need some mechanism that can turn non-strings (back) into strings.

Ideas

1

Patch the TypeMapper as suggested above. Bit of a hack and it only solves this specific issue (string-backed Enums), but also straightforward to implement.

2

Add a new interface with a getTypeMapIdentifier(): string method. The StaticTypeMap can do an instanceof TypeMapIdentifier check and resolve the underlying identifier through $id->getTypeMapIdentifier().

interface TypeMapIdentifier
{
    public function getTypeMapIdentifier(): string;
}

enum Version: string implements TypeMapIdentifier
{
    case V1 = 'v1';
    case V2 = 'v2';

    public function getTypeMapIdentifier(): string
    {
        return $this->value;
    }
}

3

Add some kind of resolver/transformer strategy system. Similar to the RenamingStrategy.

interface TypeMapResolver
{
    public function resolve(mixed $id): string;
}

class StringBackedEnumResolver implements TypeMapResolver
{
    /**
     * @param BackedEnum $id
     *
     * @return string
     */
    public function resolve(mixed $id): string
    {
        return $id->value;
    }
}

#[StaticTypeMap(resolver: new StringBackedEnumResolver(), key: 'version', map: [Version::V1->value => V1::class, Version::V2->value => V2::class])]
class Payload {}

What do you think?

(Thinking aloud...)

Hm. I'm not a big fan of option 3, because it requires manually doing all the enum->value stuff manually. It still feels like if you're using an enum, the enum itself should contain the mapping table. It's basically using the enum as a mapping table object. Which I think is closer to option 2.

Going back to the original question, though:

I would like to deserialize a string value into a string-backed Enum instance while also using the same string as part of a type map.

This is subtly different. It's not just using an enum as the map, it's also using a real property of the object as the type map key, whereas normally that value is stripped. So there's actually two different issues tied up here.

One would be the ability to do this:

#[StaticTypeMap('type', [
    'paper' => PaperBook::class,
    'digital' => DigitalBook::class,
]);
interface Book {}

class PaperBook implements Book
{
    protected string $type = 'paper';
    protected string $title;
    protected int $pages;
}

class DigitalBook implements Book
{
    protected string $type = 'digital';
    protected string $title;
    protected int $bytes;
}

And the other would be the ability to do this:

enum BookType: string
{
    case Paper;
    case Digital;
    // And some more stuff to specify class names.
}

#[StaticTypeMap('type', BookType::class);
interface Book {}

class PaperBook implements Book
{
    protected string $title;
    protected int $pages;
}

class DigitalBook implements Book
{
    protected string $title;
    protected int $bytes;
}

And then to combine them:

enum BookType: string
{
    case Paper;
    case Digital;
    // And some more stuff to specify class names.
}

#[StaticTypeMap('type', BookType::class);
interface Book {}

class PaperBook implements Book
{
    protected BookType $type = BookType::Paper;
    protected string $title;
    protected int $pages;
}

class DigitalBook implements Book
{
    protected BookType $type = BookType::Digital;
    protected string $title;
    protected int $bytes;
}

So I think we need to address those separately. In particular, the first one is risky because what happens if you do $b = new PaperBook(); $b->type = 'digital'; $serde->serialize($b, 'json')? The data is invalid at that point, so it's unclear what the correct behavior is. I'm almost inclined to say we shouldn't support that to avoid that bug.

So that leaves just the enum-as-map logic. Since enums are classes, that's sort of equivalent to specifying a class that has a buit-in mapping table. But... outside of enums that's honestly a pretty silly thing to do, so I don't know that a generalized version makes sense.

So... I fall back to using the enum cases themselves as the lookup table and a dedicated mapper type for that, which I don't especially like, but I don't see an alternative.

Tricky.