alphadose / haxmap

Fastest and most memory efficient golang concurrent hashmap

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

now we can use xxh3 to speed up

welldonexing opened this issue · comments

https://github.com/zeebo/xxh3

// your custom hash function
func customStringHasher(s string) uintptr {
return uintptr(xxh3.HashString(s))
}

@welldonexing thanks for the update, xxh3 is indeed faster and I am also planning to make a switch

on another note how does your xxh3 library compare with Cyan4973/xxHash performance wise for xxh3 algo only ?

yes ,by my simple test , it 's almost the same to old one ,but that 's almost the only one xxh3 writen in go .

const sizeOfUintPtr = unsafe.Sizeof(uintptr(0))

func uintptrToBytes(u *uintptr) []byte {
	return (*[sizeOfUintPtr]byte)(unsafe.Pointer(u))[:]
}
func customStringHasher(s uintptr) uintptr {

	return uintptr(xxh3.Hash(uintptrToBytes(&s)))
}
func setupHaxMapWithXX3() *haxmap.Map[uintptr, uintptr] {
	m := haxmap.New[uintptr, uintptr](mapSize)
	m.SetHasher(customStringHasher)
	for i := uintptr(0); i < epochs; i++ {
		m.Set(i, i)
	}
	return m
}

the question is https://github.com/zeebo/xxh3/blob/master/hash64.go use the str type
func hashAny(s str) (acc u64)

maybe faster when directly using pointer.or []byte

nice, I will change to xxh3 hashing and publish a new release possibly this weekend

PS:- I think xxh3 will give a considerable speed boost for long string key types but for int, uint, float etc there shouldn't be any difference

xxhash3 outperforms xxhash in string hashing, according to benchmark results on my machine.

goos: darwin
goarch: arm64
pkg: github.com/alphadose/haxmap/benchmarks
BenchmarkXXHASH_sting16bytes
BenchmarkXXHASH_sting16bytes-10         	  100000	         6.527 ns/op
BenchmarkXXHASH_sting128bytes
BenchmarkXXHASH_sting128bytes-10        	  100000	        18.11 ns/op
BenchmarkXXHASH_sting1024bytes
BenchmarkXXHASH_sting1024bytes-10       	  100000	       109.9 ns/op
BenchmarkXXHASH_sting8192bytes
BenchmarkXXHASH_sting8192bytes-10       	  100000	       709.1 ns/op
BenchmarkXXHASH3_string16_bytes
BenchmarkXXHASH3_string16_bytes-10      	  100000	         3.285 ns/op
BenchmarkXXHASH3_string128_bytes
BenchmarkXXHASH3_string128_bytes-10     	  100000	        12.12 ns/op
BenchmarkXXHASH3_string1024_bytes
BenchmarkXXHASH3_string1024_bytes-10    	  100000	       107.6 ns/op
BenchmarkXXHASH3_string8192_bytes
BenchmarkXXHASH3_string8192_bytes-10    	  100000	       588.2 ns/op
PASS
ok  	github.com/alphadose/haxmap/benchmarks	25.658s

why don't you want to use the FNV family of algorithms?
There is an implementation in the standard library.
FNV64 is 4-5 times faster than xxhash.

prime := 17861029589083791777
h := 14695981039346656037
for _, b := range bytes {
    h ^= uint64(b)
    h *= prime
}
return h

And that's it

@mvestnik can you run a benchmark test with the FNV hash algorithm similar to this one https://github.com/alphadose/haxmap/blob/main/benchmarks/map_test.go ?

You can use haxmap.SetHasher() function to override the default xxhash algo with your FNV algo

also if possible randomize the input dataset so that there is less bias in the final benchmark results

I wrote a simple benchmark. You can check the results:

package experiment

import (
  "testing"

  "github.com/cespare/xxhash/v2"
)

const (
  prime       uint64 = 17861029589083791777
  offsetBasis uint64 = 14695981039346656037
)

func fnv64(bytes []byte) uint64 {
  h := offsetBasis
  for _, b := range bytes {
    h ^= uint64(b)
    h *= prime
  }
  return h
}

func xxh64(bytes []byte) uint64 {
  return xxhash.Sum64(bytes)
}

func BenchmarkFNV64(b *testing.B) {
  for i := 0; i < b.N; i++ {
    fnv64([]byte{102, 105, 120, 116, 117, 114, 101})
  }
}

func BenchmarkXXH64(b *testing.B) {
  for i := 0; i < b.N; i++ {
    xxh64([]byte{102, 105, 120, 116, 117, 114, 101})
  }
}
commented

I guess xxh3 performs in all tested scenarios of direct hashing and when using the digest interface. (5 bytes, 100 bytes, 4KB, 10MB). it provies higher throughput than xxhash .

Also, xxh3 is generally faster than FNV-1a, especially as the data size increases. (in my pc env)

goos: windows
goarch: amd64
cpu: Intel(R) Core(TM) i5-10600KF CPU @ 4.10GHz
BenchmarkHashes/xxhash,direct,bytes,n=5B-12             236713702                5.118 ns/op     976.90 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxhash,direct,string,n=5B-12            200201168                5.836 ns/op     856.78 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxhash,digest,bytes,n=5B-12             82036696                15.08 ns/op      331.52 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxhash,digest,string,n=5B-12            72117117                18.36 ns/op      272.27 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxhash,direct,bytes,n=100B-12           85273906                13.88 ns/op     7202.97 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxhash,direct,string,n=100B-12          76765117                17.48 ns/op     5721.49 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxhash,digest,bytes,n=100B-12           58091688                20.21 ns/op     4949.07 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxhash,digest,string,n=100B-12          55638498                20.50 ns/op     4876.92 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxhash,direct,bytes,n=4KB-12             5077185               235.3 ns/op      17002.58 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxhash,direct,string,n=4KB-12            5169603               234.8 ns/op      17037.33 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxhash,digest,bytes,n=4KB-12             5079498               242.7 ns/op      16482.49 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxhash,digest,string,n=4KB-12            4893588               243.6 ns/op      16419.50 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxhash,direct,bytes,n=10MB-12               1532            753191 ns/op        13276.84 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxhash,direct,string,n=10MB-12              1604            740705 ns/op        13500.65 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxhash,digest,bytes,n=10MB-12               1546            760969 ns/op        13141.15 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxhash,digest,string,n=10MB-12              1522            753009 ns/op        13280.05 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxh3,direct,bytes,n=5B-12               266165270                4.479 ns/op    1116.33 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxh3,direct,string,n=5B-12              277091206                4.251 ns/op    1176.26 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxh3,digest,bytes,n=5B-12               38009204                32.61 ns/op      153.32 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxh3,digest,string,n=5B-12              37305080                32.37 ns/op      154.48 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxh3,direct,bytes,n=100B-12             120515226                9.917 ns/op    10083.44 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxh3,direct,string,n=100B-12            124008230                9.624 ns/op    10390.52 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxh3,digest,bytes,n=100B-12             32545372                37.36 ns/op     2676.88 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxh3,digest,string,n=100B-12            31345341                45.12 ns/op     2216.53 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxh3,direct,bytes,n=4KB-12              14153563                80.95 ns/op     49411.66 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxh3,direct,string,n=4KB-12             13895498               104.2 ns/op      38392.30 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxh3,digest,bytes,n=4KB-12               9207777               131.1 ns/op      30504.39 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxh3,digest,string,n=4KB-12              8794700               126.3 ns/op      31662.59 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxh3,direct,bytes,n=10MB-12                 3490            335840 ns/op        29776.06 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxh3,direct,string,n=10MB-12                3570            348551 ns/op        28690.17 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxh3,digest,bytes,n=10MB-12                 2556            471610 ns/op        21203.94 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/xxh3,digest,string,n=10MB-12                2581            458128 ns/op        21827.96 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/CRC-32,direct,bytes,n=5B-12             150739432                7.740 ns/op     645.97 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/CRC-32,direct,string,n=5B-12            44208664                25.49 ns/op      196.19 MB/s           8 B/op          1 allocs/op
BenchmarkHashes/CRC-32,digest,bytes,n=5B-12             29507592                40.41 ns/op      123.74 MB/s          16 B/op          1 allocs/op
BenchmarkHashes/CRC-32,digest,string,n=5B-12            18258642                67.34 ns/op       74.26 MB/s          24 B/op          2 allocs/op
BenchmarkHashes/CRC-32,direct,bytes,n=100B-12           59111844                20.27 ns/op     4932.49 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/CRC-32,direct,string,n=100B-12          16029044                67.97 ns/op     1471.18 MB/s         112 B/op          1 allocs/op
BenchmarkHashes/CRC-32,digest,bytes,n=100B-12           20980450                57.49 ns/op     1739.40 MB/s          16 B/op          1 allocs/op
BenchmarkHashes/CRC-32,digest,string,n=100B-12          12361206                99.31 ns/op     1006.95 MB/s         128 B/op          2 allocs/op
BenchmarkHashes/CRC-32,direct,bytes,n=4KB-12             7089018               163.9 ns/op      24406.59 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/CRC-32,direct,string,n=4KB-12            1000000              1074 ns/op        3725.40 MB/s        4096 B/op          1 allocs/op
BenchmarkHashes/CRC-32,digest,bytes,n=4KB-12             5815550               206.0 ns/op      19421.88 MB/s         16 B/op          1 allocs/op
BenchmarkHashes/CRC-32,digest,string,n=4KB-12            1000000              1081 ns/op        3699.84 MB/s        4112 B/op          2 allocs/op
BenchmarkHashes/CRC-32,direct,bytes,n=10MB-12               1963            593497 ns/op        16849.29 MB/s          0 B/op          0 allocs/op
BenchmarkHashes/CRC-32,direct,string,n=10MB-12               670           1644013 ns/op        6082.68 MB/s    10002443 B/op          1 allocs/op
BenchmarkHashes/CRC-32,digest,bytes,n=10MB-12               1858            609605 ns/op        16404.08 MB/s         16 B/op          1 allocs/op
BenchmarkHashes/CRC-32,digest,string,n=10MB-12               620           1780932 ns/op        5615.04 MB/s    10002453 B/op          2 allocs/op
BenchmarkHashes/FNV-1a,digest,bytes,n=5B-12             288622639                4.059 ns/op    1231.97 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/FNV-1a,digest,string,n=5B-12            264271659                4.472 ns/op    1118.00 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/FNV-1a,digest,bytes,n=100B-12           14979963                78.85 ns/op     1268.23 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/FNV-1a,digest,string,n=100B-12          15046964                79.35 ns/op     1260.31 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/FNV-1a,digest,bytes,n=4KB-12              328664              3575 ns/op        1118.82 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/FNV-1a,digest,string,n=4KB-12             334381              3553 ns/op        1125.76 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/FNV-1a,digest,bytes,n=10MB-12                129           9070709 ns/op        1102.45 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/FNV-1a,digest,string,n=10MB-12               130           9109688 ns/op        1097.73 MB/s           0 B/op          0 allocs/op
-- 1GB test
BenchmarkHashes/xxhash,direct,bytes,n=1GB-12                   1        6710021500 ns/op        1490.31 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxhash,direct,string,n=1GB-12                  1        19206542400 ns/op        520.66 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxhash,digest,bytes,n=1GB-12                   1        2771892900 ns/op        3607.64 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxhash,digest,string,n=1GB-12                  1        1085735900 ns/op        9210.34 MB/s           8 B/op          1 allocs/op

BenchmarkHashes/xxh3,direct,bytes,n=1GB-12                     1        4397203700 ns/op        2274.17 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxh3,direct,string,n=1GB-12                    1        3627537900 ns/op        2756.69 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxh3,digest,bytes,n=1GB-12                     1        1581577400 ns/op        6322.80 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/xxh3,digest,string,n=1GB-12                    2        1018246600 ns/op        9820.80 MB/s           0 B/op          0 allocs/op

BenchmarkHashes/FNV-1a,digest,bytes,n=1GB-12                   1        9578540900 ns/op        1044.00 MB/s           0 B/op          0 allocs/op
BenchmarkHashes/FNV-1a,digest,string,n=1GB-12                  1        9733404100 ns/op        1027.39 MB/s           0 B/op          0 allocs/op

Could guys also benchmark wyhash? https://github.com/zeebo/wyhash readme says it a little faster(?) than https://github.com/zeebo/xxh3

For ref: https://bloomberg.github.io/bde/articles/wyhash.html