neon-sunset / related_post_gen

Data Processing benchmark featuring Rust, Go, Swift, Zig etc.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Problem:

Given a list of posts, compute the top 5 related posts for each post based on the number of shared tags.

Steps:

  • Read the posts JSON file.
  • Iterate over the posts and populate a map containing: tag -> List<int>, with the int representing the post index of each post with that tag.
  • Iterate over the posts and for each post:
    • Create a map: PostIndex -> int to track the number of shared tags
    • For each tag, Iterate over the posts that have that tag
    • For each post, increment the shared tag count in the map.
  • Sort the related posts by the number of shared tags.
  • Write the top 5 related posts for each post to a new JSON file.

Run Benchmark

./run.sh go | rust | python | all

# windows (powershell)
./run.ps1 go | rust | python | all

or
pwsh ./run.ps1 go | rust | python | all
Rules

No:

  • FFI (including assembly inlining)
  • Unsafe code blocks
  • Custom benchmarking
  • Disabling runtime checks (bounds etc)
  • Specific hardware targeting

Must:

  • Parse json at runtime
  • Not hardcode number of posts
  • Support up to 100 tags
  • Use a stable release of the compiler/runtime

Updated Results from github workflow (raw data)

VM Specs NB: The benchmark runs on the free tier of github workflow.
  • CPU: 2 vCPUs
  • RAM: 7GB
  • OS: Ubuntu 22.04

Source

Language Processing Time Total (PT + I/O)
Go 26.39 ms 58.1 ms
Zig 38.00 ms 79.5 ms
Rust 38.91 ms 56.7 ms
Java (GraalVM) 40.00 ms 70.4 ms
Julia 42.67 ms 2.717 s
F# 45.86 ms 304.2 ms
Odin 46.52 ms 294.3 ms
Nim 56.00 ms 85.6 ms
Vlang 59.34 ms 400.6 ms
Swift 65.42 ms 442.3 ms
Crystal 68.53 ms 126.0 ms
C# 73.48 ms 283.7 ms
Dart VM 103.75 ms 573.1 ms
LuaJIT 118.00 ms 400.1 ms
Dart AOT 141.38 ms 283.8 ms
JS (Deno) 183.20 ms 272.4 ms
JS (Node) 202.20 ms 278.4 ms
Java (JIT) 256.69 ms 548.1 ms
Numpy 0.41 s 640.1 ms
JS (Bun) 764.40 ms 838.6 ms
Lua 2362.72 ms 3.037 s
Python 2.83 s 2.904 s

Multicore Results

Language Processing Time Total (PT + I/O)
Go Concurrent 18.89 ms 49.6 ms
Rust Concurrent 23.64 ms 41.8 ms
Swift Concurrent 40.02 ms 421.3 ms
F# Concurrent 40.57 ms 854.3 ms
Old Results with details (on my machine)
Language Processing Time Total (+ I/O) Details
Rust - 4.5s Initial
Rust v2 - 2.60s Replace std HashMap with fxHashMap by phazer99
Rust v3 - 1.28s Preallocate and reuse map and unstable sort by vdrmn and Darksonn
Rust v4 - 0.13s Use Post index as key instead of Pointer and Binary Heap by RB5009
Rust v5 38ms 52ms Rm hashing from loop and use vec[count] instead of map[index]count by RB5009
Rust v6 23ms 36ms Optimized Binary Heap Ops by scottlamb
Rust Rayon 9ms 22ms Parallelize by masmullin2000
Rust Rayon 8ms 22ms Remove comparison out of hot loop
Go - 1.5s Initial
Go v2 - 80ms Add rust optimizations
Go v3 56ms 70ms Use goccy/go-json
Go v3 34ms 55ms Use generic binaryheap by DrBlury
Go v4 26ms 50ms Replace binary heap with custom priority queue
Go v5 20ms 43ms Remove comparison out of hot loop
Go Con 10ms 33ms Go concurrency by tirprox and DrBlury
Go Con v2 5ms 29ms Use arena, use waitgroup, rm binheap by DrBlury
Python - 7.81s Initial
Python v2 1.35s 1.53s Add rust optimizations by dave-andersen
Numpy 0.57s 0.85s Numpy implementation by Copper280z
Crystal 50ms 96ms Inital w/ previous optimizations
Crystal v2 33ms 72ms Replace binary heap with custom priority queue
Odin 110ms 397ms Ported from golang code
Odin v2 104ms 404ms Remove comparison out of hot loop
Dart VM 125ms 530ms Ported frog golang code
Dart bin 274ms 360ms Compiled executable
Vlang 339ms 560ms Ported from golang code
Zig 80ms 110ms Provided by akhildevelops

About

Data Processing benchmark featuring Rust, Go, Swift, Zig etc.

License:MIT License


Languages

Language:C++ 90.8%Language:Shell 1.4%Language:Lua 1.1%Language:Python 0.9%Language:Rust 0.7%Language:Dart 0.6%Language:Swift 0.6%Language:Go 0.5%Language:F# 0.5%Language:Java 0.4%Language:Crystal 0.4%Language:Zig 0.3%Language:C# 0.3%Language:PowerShell 0.3%Language:JavaScript 0.3%Language:Julia 0.2%Language:Odin 0.2%Language:Nim 0.2%Language:V 0.2%Language:jq 0.0%