Performance

Question

lidatong opened this issue 4 years ago · comments

Performance in general is on my radar as the next thing to tackle as this library gains traction, and it is at the top of the 1.0 release checklist.

In general, after some thought, I don't think caching / memoization is the right way to tackle this. A few reasons why:

it requires careful thought about how it behaves under concurrency, specifically with respect to memory visibility
it could have a big memory footprint on large codebases with a lot of composite dataclasses, potentially duplicated across threads!
immutability -- should the cached object be mutable, and if so, how can we protect it from changes?

Instead, I think an approach involving code generation is the way to go -- similar to how the dataclasses core module itself is implemented. When you think about it, a schema is only generated once and is known at "module-load time". In other languages we might call this "compile-time". We can see the code-generation approach used in codec schema libraries in other languages, be it for JSON or for other data-interchange formats like protobuf.

Going this route, the schema now is loaded as just more code, so to speak, instead of living in memory.
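To make the idea concrete, here is a minimal sketch of what generating a serializer at class-definition time could look like. It is not how dataclasses_json works today; the decorator name add_to_dict and the generated to_dict method are hypothetical, and the sketch assumes nested dataclasses are decorated the same way and that field types are real classes rather than string annotations.

```python
from dataclasses import dataclass, fields, is_dataclass


def add_to_dict(cls):
    """Hypothetical decorator: build a specialised to_dict once, when the class is defined."""
    lines = ["def to_dict(self):", "    return {"]
    for f in fields(cls):
        if is_dataclass(f.type):
            # Nested dataclass: delegate to its own generated to_dict (assumed to exist).
            lines.append(f"        {f.name!r}: self.{f.name}.to_dict(),")
        else:
            # Plain value: emit a direct attribute access, no per-call type inspection.
            lines.append(f"        {f.name!r}: self.{f.name},")
    lines.append("    }")

    namespace = {}
    exec("\n".join(lines), namespace)  # compile the generated source exactly once
    cls.to_dict = namespace["to_dict"]
    return cls


@add_to_dict
@dataclass
class Point:
    x: int
    y: int


@add_to_dict
@dataclass
class Segment:
    start: Point
    end: Point


print(Segment(Point(0, 1), Point(2, 3)).to_dict())
# {'start': {'x': 0, 'y': 1}, 'end': {'x': 2, 'y': 3}}
```

All inspection of the fields happens inside the decorator, so each call to to_dict only runs the generated function body.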

yacc143 · Answer 1 · Tue Jun 30 2020 00:13:50 GMT+0800 (China Standard Time)

Just some observations from benchmarking some code dumping 1000 simple objects resulting in ~3.4 MB of JSON:

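from dataclasses import dataclass
from dataclasses_json import dataclass_json

# TESTSTR is a long test string defined elsewhere in the script (not shown here).

@dataclass_json
@dataclass
class Test:
    id: int
    value: str
    second: str

testvalue = [Test(i, TESTSTR, TESTSTR[0:200]) for i in range(1000)]
testvalue2 = [dict(id=i, value=TESTSTR, second=TESTSTR[0:200]) for i in range(1000)]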

I've created a number of functions to dump these lists of objects:

def callDCJS():
    len(Test.schema().dumps(testvalue, many=True))

import ujson

def callJS():
    len(ujson.dumps(testvalue2))

import json
def callJS2():
    len(json.dumps(testvalue2))

def callDCJS2(schema=Test.schema()):
    len(schema.dumps(testvalue, many=True))

def callDCJS3(schema=Test.schema()):
    len(ujson.dumps(schema.dump(testvalue, many=True)))

As you can see, the callJS functions are the ones that dump the native list of Python dictionaries, while the DCJS ones use dataclasses_json.

And the astounding numbers suggest to me that dataclasses_json (marshmallow? I'm not sure whether it uses it under the hood, as I haven't looked at the code yet) has room for optimization:

(uber38) andreas@obelix:~/work/venvs/uber38/NLP/test38.py$ time python3.8 test38.py 
callJS               0.009778 [0.010089821879984112, 0.010208235128905747, 0.009778391534543678]
callJS2              0.018111 [0.01811142571168384, 0.019572446219253917, 0.02105408304198939]
callDCJS             0.028419 [0.02841886689791361, 0.03097145833031697, 0.032973052026961255]
callDCJS2            0.026938 [0.0269384963994471, 0.029055634546708932, 0.034756338603475746]
callDCJS3            0.018359 [0.01837075536919607, 0.018359050057148812, 0.021072761750676565]

The first number is the minimum of the three runs. As you can see, the best strategy you can currently use with dataclasses_json seems to be to serialize to Python data structures with it, and then use the fastest JSON package you can find for your data. (ujson seems to be fast, beating the standard json module by a factor of 2. And the time difference between DCJS2 and DCJS3 suggests that dataclasses_json uses the default json module.)
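The timing script itself is not shown above. As a rough sketch, output of the same shape (name, minimum, then the individual runs) can be produced with the standard timeit module. The helper names bench and report, the placeholder TESTSTR, and the choice of one call per run are assumptions for illustration, not the original script.

```python
import json
import timeit

TESTSTR = "x" * 3000  # placeholder; the real test string is not shown in the thread
testvalue2 = [dict(id=i, value=TESTSTR, second=TESTSTR[0:200]) for i in range(1000)]


def callJS2():
    len(json.dumps(testvalue2))


def bench(fn, repeat=3, number=1):
    """Time fn `repeat` times and return (minimum, all timings)."""
    times = timeit.repeat(fn, repeat=repeat, number=number)
    return min(times), times


def report(name, fn):
    """Print one result line: name, best time, then the raw timings."""
    best, times = bench(fn)
    line = f"{name:<20} {best:.6f} {times}"
    print(line)
    return line


report("callJS2", callJS2)
```

Reporting the minimum rather than the mean is the usual choice for this kind of micro-benchmark, since slower runs mostly reflect interference from other processes.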

JsBergbau · Answer 2 · Thu Apr 29 2021 19:06:09 GMT+0800 (China Standard Time)

Even building the dict by accessing each field of the dataclass, like dict(id=test.id, value=test.value, second=test.second),
and then dumping it to JSON is, in my tests, about twice as fast as using dataclass_json.
Building the string manually, like json = '{"id": ' + str(test.id) and so on, takes only about half the time of json.dumps, so dataclass_json is about 4 times slower than building the string manually.
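For reference, the two hand-written variants described here might look like the sketch below. The function names are made up for illustration, and the manual version is only valid JSON as long as the string fields contain nothing that needs escaping (quotes, backslashes, control characters), which is part of why it can be faster.

```python
import json
from dataclasses import dataclass


@dataclass
class Test:
    id: int
    value: str
    second: str


def dumps_via_dict(t):
    # Build a plain dict field by field, then let json.dumps encode it.
    return json.dumps(dict(id=t.id, value=t.value, second=t.second))


def dumps_manual(t):
    # Plain string concatenation: no escaping is done, so this is only
    # safe for strings without quotes, backslashes or control characters.
    return ('{"id": ' + str(t.id)
            + ', "value": "' + t.value
            + '", "second": "' + t.second + '"}')


t = Test(1, "hello", "world")
print(dumps_manual(t) == dumps_via_dict(t))  # True for this simple input
```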

Is there any timetable for improving performance?

Daniel Golding · Answer 3 · Tue May 04 2021 02:19:53 GMT+0800 (China Standard Time)

Hi, it's interesting that you mention code generation.
I didn't think to suggest it, as I thought maybe you were trying to stay dynamic.
I implemented essentially the from_dict part of the API using some code generation in March last year, in case my approach is of interest to you:
https://github.com/cakemanny/fastclasses-json

Edit: there is now a release on PyPI, with to_dict and some configurable options.