travisstaloch / simdjzon

simdjson port to zig

address new ci failure

travisstaloch opened this issue · comments

I created a minimal reproduction and filed an issue here: ziglang/zig#17996

This fix made it into the tarballs today: ziglang/zig#18729.

I was able to run `zig build test` again locally 🚀

Here is a perf data point. This is with new versions of simdjson.cpp/h from https://github.com/simdjson/simdjson/blob/master/singleheader/

~/.../zig/simdjzon $ zig version
0.12.0-dev.2540+776cd673f
~/.../zig/simdjzon $ g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
~/.../zig/simdjzon $ zig build -Doptimize=ReleaseFast
~/.../zig/simdjzon $ cat main.cpp
#include <cstdlib>   // exit
#include <iostream>  // std::cout, std::cerr
#include "simdjson.h"
using namespace simdjson;
int main(int argc, char** argv) {
    if(argc != 2) {
        std::cout << "USAGE: ./simdjson <file.json>" << std::endl;
        exit(1);
    }
    dom::parser parser;
    try {
        const dom::element doc = parser.load(argv[1]);
        (void)doc; // parse only; we just measure load + parse time
    } catch(const std::exception& e) {
        std::cerr << e.what() << '\n';
        return 1;
    }
    return 0;
}

~/.../zig/simdjzon $ g++ main.cpp simdjson.cpp -o simdjson -O3 -march=native
~/.../zig/simdjzon $ poop "./simdjson test/twitter.json" "zig-out/bin/simdjzon test/twitter.json"
Benchmark 1 (2408 runs): ./simdjson test/twitter.json
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          2.03ms ±  117us    1.78ms … 2.90ms         77 ( 3%)        0%
  peak_rss           4.98MB ± 2.67KB    4.85MB … 4.98MB          1 ( 0%)        0%
  cpu_cycles         2.16M  ±  102K     2.07M  … 3.32M         225 ( 9%)        0%
  instructions       4.87M  ± 77.8      4.87M  … 4.87M         599 (25%)        0%
  cache_references    125K  ± 2.46K      113K  …  167K         164 ( 7%)        0%
  cache_misses       19.7K  ±  885      17.5K  … 26.3K          80 ( 3%)        0%
  branch_misses      19.2K  ±  255      17.8K  … 20.3K          47 ( 2%)        0%
Benchmark 2 (3103 runs): zig-out/bin/simdjzon test/twitter.json
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          1.58ms ± 94.6us    1.33ms … 2.09ms         50 ( 2%)        ⚡- 22.2% ±  0.3%
  peak_rss           1.70MB ± 3.33KB    1.57MB … 1.70MB          2 ( 0%)        ⚡- 65.8% ±  0.0%
  cpu_cycles         2.05M  ± 45.6K     1.93M  … 2.77M         225 ( 7%)        ⚡-  5.1% ±  0.2%
  instructions       4.49M  ± 0.55      4.49M  … 4.49M           9 ( 0%)        ⚡-  7.8% ±  0.0%
  cache_references   73.3K  ± 2.44K     68.1K  …  125K         102 ( 3%)        ⚡- 41.3% ±  0.1%
  cache_misses       4.91K  ± 1.36K     1.43K  … 12.4K          92 ( 3%)        ⚡- 75.0% ±  0.3%
  branch_misses      4.83K  ±  677      3.28K  … 7.40K         146 ( 5%)        ⚡- 74.8% ±  0.1%

cc @Validark

I'm surprised at how big the difference is! Is simdjson doing a lot more work? Have you made more improvements?

No, I haven't done anything. I'm surprised too. It makes me think something might be amiss with my benchmark, or that we're skipping some work.

The only significant changes I've made recently were adding some initExisting() methods to make dom memory re-use possible. But that shouldn't affect main(). 🤔

@Validark if you want to run the benchmark yourself, I've added bench/twitter in a new commit just now. You should be able to at least build the benchmark binaries by running bench/twitter/run.sh, and then there is a poop command which you may have to run manually (it fails to find poop for me).
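The manual invocation should look roughly like the one used earlier in this thread (binary locations are a guess; adjust to wherever run.sh puts them, and make sure poop is on your PATH):

```shell
# Hypothetical manual run, assuming both binaries were built by run.sh;
# the paths below mirror the earlier invocation in this thread.
poop "./simdjson test/twitter.json" "zig-out/bin/simdjzon test/twitter.json"
```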

> No, I haven't done anything. I'm surprised too. It makes me think something might be amiss with my benchmark, or that we're skipping some work.
>
> The only significant changes I've made recently were adding some initExisting() methods to make dom memory re-use possible. But that shouldn't affect main(). 🤔

Were the results this extreme before those changes? Is the memory reduction just from your new strategy or was it there before?

No, I don't remember the results being this different. This is the first time I've noticed such a large difference. It's been quite a while since I ran any benchmarks, so I'm not sure when it might have changed.

I don't think the memory reuse changes should affect this at all since those should only help in situations when the parser is being re-used to parse different documents.