allegro / php-protobuf

PHP Protobuf - Google's Protocol Buffers for PHP

proto3 default

sm2017 opened this issue · comments

According to https://developers.google.com/protocol-buffers/docs/proto3#default

When a message is parsed, if the encoded message does not contain a particular singular element, the corresponding field in the parsed object is set to the default value for that field. These defaults are type-specific:

For strings, the default value is the empty string.
For bytes, the default value is empty bytes.
For bools, the default value is false.
For numeric types, the default value is zero.
For enums, the default value is the first defined enum value, which must be 0.
For message fields, the field is not set. Its exact value is language-dependent. See the generated code guide for details.

But in php-protobuf the default value is null, and now I have a problem with enums.
Why is the default value of an enum in php-protobuf null when Google says it must be 0?

php-protobuf does not adhere to this rule because it has never been needed so far. I don't know exactly what problem you're facing, but PHP is a loosely typed language:

var_dump(null == 0); // bool(true)
var_dump(null == ''); // bool(true)
var_dump(null == false); // bool(true)
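The loose comparisons above hold, but strict comparison, type checks and encoders still tell null apart from a type default, which is typically where the difference surfaces. A minimal sketch in plain PHP (no php-protobuf calls are involved; the array keys are made up for illustration):

```php
<?php
// Loose comparison treats null like the type-specific default...
var_dump(null == 0);       // bool(true)

// ...but strict comparison and type checks do not.
var_dump(null === 0);      // bool(false)
var_dump(is_int(null));    // bool(false)
var_dump(is_string(null)); // bool(false)

// Encoders preserve the difference as well ('enum' and 'name' are illustrative keys).
echo json_encode(['enum' => null, 'name' => null]), "\n"; // {"enum":null,"name":null}
echo json_encode(['enum' => 0, 'name' => '']), "\n";      // {"enum":0,"name":""}
```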

Still, it's worth implementing "default values" as prescribed by Google.

Assume this code

switch ($myProto->getEnum()) {
    case FIRST:
        // Validation rules for $myProto->getName()
        break;
    case SECOND:
        // Validation rules for $myProto->getName()
        break;
    case THIRD:
        // Validation rules for $myProto->getName()
        break;
}

In the following lines I assume that all inputs are valid, but I must add this tedious code after the validations:

$myProto->setEnum((int)$myProto->getEnum());
$myProto->setName((string)$myProto->getName());
$myProto->setOther((bool)$myProto->getOther());

Another example

Assume we want to insert into a database; we must always remember that a value may be NULL:

INSERT INTO user (id,name) VALUES (null,$myProto->getName())

If name is declared NOT NULL in the DB schema, you may get an error,

so I always have to do this:

if(is_null($myProto->getName()))
    $myProto->setName('');

Also, I checked what happens when I parse a proto message packed by JavaScript: when a value is set to its default, I get null.
That means a '' value for a string or 0 for an enum coming from JavaScript is converted to null.

In the first switch statement, an additional case for null is redundant: null is matched with FIRST (I assume its value is zero).
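The reason is that `switch` compares with `==` rather than `===`. A small sketch, assuming hypothetical constants where FIRST is 0 (the actual enum values are not shown in this thread):

```php
<?php
// Hypothetical enum constants; FIRST is assumed to be 0.
const FIRST = 0;
const SECOND = 1;

// Returns the name of the case that a value lands in.
function matchedCase($value): string
{
    switch ($value) {   // switch uses loose comparison (==)
        case FIRST:
            return 'FIRST';
        case SECOND:
            return 'SECOND';
        default:
            return 'none';
    }
}

var_dump(matchedCase(null)); // string(5) "FIRST"
var_dump(matchedCase(1));    // string(6) "SECOND"
```

So an unset enum is handled by the FIRST branch, but the value itself is still null afterwards.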

The second one is more interesting because you need to do extra work. Still, you can simplify the code by using a cast instead of an if statement:

$name = (string)$myProto->getName(); // resolves to '' for null
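On PHP 7 and later, the null coalescing operator is another option. The sketch below uses a stand-in class, `UserStub`, which is hypothetical and not what php-protobuf generates; it only mimics a getter that returns null for an unset field:

```php
<?php
// Stand-in for a generated message class (hypothetical, for illustration only).
class UserStub
{
    private $name = null;

    public function getName()
    {
        return $this->name;
    }
}

$user = new UserStub();

// A cast converts null to '', and would also convert any other type to a string.
$viaCast = (string) $user->getName();

// ?? substitutes '' only when the value is null and leaves other values untouched.
$viaCoalesce = $user->getName() ?? '';

var_dump($viaCast);     // string(0) ""
var_dump($viaCoalesce); // string(0) ""
```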

Sorry, I know the null case is redundant; I made a mistake in the first switch example, so I edited my post. I should have said that this switch is a validator in my scenario: depending on getEnum() I apply different validation rules, but I must validate getEnum() too, and it must never be NULL in the following lines. So I have to add this additional code:

//After switch case validation
$myProto->setEnum((int)$myProto->getEnum());
$myProto->setName((string)$myProto->getName());
$myProto->setOther((bool)$myProto->getOther());
...
...
...

This is very tedious code.

I agree this is redundant work you shouldn't be forced to do. It must be fixed.

Looks like the exact definition of default values was introduced in proto3. Not sure it's the same for version 2.

My guess is that it's something Google has adhered to from the start; along with syntax 3 they simply included information about it in the docs. I ran a quick test with the Python implementation using a syntax 2 proto file and the result is the same.

I want to know: do you want to adhere to these rules for default values or not? Do you want to fix it?

The default value is defined in both proto2 and proto3.
For proto2: https://developers.google.com/protocol-buffers/docs/proto#optional

If the default value is not specified for an optional element, a type-specific default value is used instead: for strings, the default value is the empty string. For bools, the default value is false. For numeric types, the default value is zero. For enums, the default value is the first value listed in the enum's type definition. This means care must be taken when adding a value to the beginning of an enum value list.

For proto3: https://developers.google.com/protocol-buffers/docs/proto3#default

@socketman2016 I have just fixed that.

Thanks a lot. Please also merge it into the php7 branch of serggp/php-protobuf.

@hjagodzinski The solution of using type casting is not very flexible. In my php7 branch I store ints which exceed PHP_INT_MAX as double values (for x86 systems). This is standard PHP behavior. Why haven't you initialized all the default values in the reset() method?

If I did it as you propose the values would be serialized even though a user didn't set them. I could check if a value is null and only then cast it to proper type.

Sure, you are right. It's better to check if a value is null.
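The idea agreed on here can be sketched roughly as follows. This is not the actual php-protobuf code: `MessageSketch`, its field layout and `explicitlySetFields()` are hypothetical names used only to show the principle of keeping null internally and substituting the default in the getter:

```php
<?php
// Hypothetical message class; names and layout are illustrative only.
class MessageSketch
{
    // null means "never set by the user".
    private $values = ['enum' => null, 'name' => null];

    public function setName($name)
    {
        $this->values['name'] = (string) $name;
    }

    // Getters substitute the type default only when nothing was set.
    public function getEnum()
    {
        return $this->values['enum'] === null ? 0 : $this->values['enum'];
    }

    public function getName()
    {
        return $this->values['name'] === null ? '' : $this->values['name'];
    }

    // Stand-in for serialization: only explicitly set fields are emitted.
    public function explicitlySetFields()
    {
        return array_filter($this->values, function ($v) {
            return $v !== null;
        });
    }
}

$m = new MessageSketch();
var_dump($m->getEnum());             // int(0)
var_dump($m->getName());             // string(0) ""
var_dump($m->explicitlySetFields()); // empty array: nothing would be serialized
```

Initializing the defaults in reset() instead would fill `$values` with 0 and '', so the last call would report both fields as set.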

Thank you

I am waiting for the bug fix in the php7 branch.

I saw that in the Java and JavaScript implementations of Protocol Buffers, if you set a field to its default value (a string to '', an enum to 0, and so on), the value isn't serialized, which reduces the output size.

It is just an idea, but I think it would be good to implement it in php-protobuf too.

I was considering this approach. What you say is interesting, because I ran a similar test with the official Python implementation and a field set to a default value is serialized. I imagine you might find yourself in a situation where the information whether a field was actually set is important. If you skip fields set to default values during serialization, you lose this information.
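A rough sketch of that trade-off, using plain arrays in place of messages. `skipDefaults()` is a hypothetical helper written for this illustration and is not part of any Protocol Buffers implementation:

```php
<?php
// Hypothetical helper: drops fields equal to their type default before encoding.
function skipDefaults(array $fields, array $defaults): array
{
    $out = [];
    foreach ($fields as $key => $value) {
        if ($value !== $defaults[$key]) {
            $out[$key] = $value;
        }
    }
    return $out;
}

$defaults = ['enum' => 0, 'name' => ''];

// One sender sets enum to 0 explicitly, the other never sets it.
$explicitZero = skipDefaults(['enum' => 0, 'name' => 'bob'], $defaults);
$neverSet     = skipDefaults(['name' => 'bob'], $defaults);

// Both produce the same payload, so the receiver cannot tell them apart.
var_dump($explicitZero === $neverSet); // bool(true)
```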