Test implementations for: http://vanillajava.blogspot.com/2013/04/low-gc-coding-efficient-listeners.html
Attempting to make a thread-safe, zero-GC observer that gives each observer a chance at being "first" in the iteration order, for fairness.
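A minimal sketch of the idea, not the code that produced the numbers below: a fixed observer array plus a start index that is advanced with a CAS, so each call begins one slot later than the previous one. The names `RotatingObservers`, `Observer` and `notifyObservers` are made up for illustration, and the observers themselves are assumed to be safe to call from several threads.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: rotating start index so every observer gets a turn at being first. */
final class RotatingObservers {
    interface Observer { void onEvent(int value); }

    private final Observer[] observers;                 // fixed at construction, never copied on notify
    private final AtomicInteger next = new AtomicInteger();

    RotatingObservers(Observer... observers) {
        this.observers = observers.clone();
    }

    /** Notifies every observer once; returns the index that went first, or -1 if there are none. */
    int notifyObservers(int value) {
        final int n = observers.length;
        if (n == 0) return -1;
        int start;
        // CAS loop: claim the current start index and advance it, wrapping at n.
        do {
            start = next.get();
        } while (!next.compareAndSet(start, start + 1 == n ? 0 : start + 1));
        // Only primitives and array reads here, so nothing is allocated per call.
        for (int i = 0; i < n; i++) {
            int idx = start + i;
            if (idx >= n) idx -= n;
            observers[idx].onEvent(value);
        }
        return start;
    }
}
```

With 10 observers and 1,000,000 calls, each index would be returned as "first" 100,000 times, which is the kind of even spread the `first=[...]` arrays in the output are checking for.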
Note: the stat collections are there to verify the results. Run on my iMac (OS X 10.8.3, 3.4 GHz i7, 16 GB RAM):
10000x100 Iterations used 8,008 bytes, last 10k 571,000 ns
ArraySynchronized{sz=10, idx=1000000, first=[100000, 100000, 100000, 100000, 100000, 100000, 100000, 100000, 100000, 100000], total=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000]}
10000x100 Iterations used 0 bytes, last 10k 605,000 ns
CasCalls{sz=10, idx=1000000, first=[100000, 100000, 100000, 100000, 100000, 100000, 100000, 100000, 100000, 100000], total=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000]}
10000x100 Iterations used 1,840 bytes, last 10k 687,000 ns
ArrayLocked{sz=10, idx=1000000, first=[100000, 100000, 100000, 100000, 100000, 100000, 100000, 100000, 100000, 100000], total=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000]}
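For reference, a "used N bytes" figure like the ones above can be approximated by sampling heap usage before and after the loop. This is a sketch under assumptions: `AllocationProbe`, `usedBytes` and `bytesUsedBy` are hypothetical names, and the actual test harness may measure differently.

```java
/** Hypothetical helper: reports how much the used heap grew while a task ran. */
final class AllocationProbe {
    /** Heap currently in use, as seen by the JVM. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Runs the task the given number of times and returns the change in used heap. */
    static long bytesUsedBy(Runnable task, int iterations) {
        long before = usedBytes();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        // Can be negative if a GC ran in between; treat the value as an estimate.
        return usedBytes() - before;
    }

    public static void main(String[] args) {
        int[] counter = new int[1];
        long used = bytesUsedBy(() -> counter[0]++, 1_000_000);
        System.out.println(counter[0] + " iterations used " + used + " bytes");
    }
}
```

On HotSpot, free memory tends to move in thread-local allocation buffer sized steps, so small allocations may not show up in this reading unless TLABs are turned off with -XX:-UseTLAB.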
Running on Ubuntu 12 (3.8 GHz i7, 24 GB) with -XX:-UseTLAB, the last two runs appear as:
10000000 Iterations used 0 bytes, took 130 ns per loop
ArraySynchronized{sz=10, idx=10000000, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
10000000 Iterations used 0 bytes, took 141 ns per loop
CasCalls{sz=10, idx=10000000, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
10000000 Iterations used 56 bytes, took 151 ns per loop
ArrayLocked{sz=10, idx=10000000, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
10000000 Iterations used 0 bytes, took 52 ns per loop
VanillaObservers{sz=10, idx=0, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
Note: using just one loop, VanillaObservers runs like this:
10000000 Iterations used 0 bytes, took 122 ns per loop
VanillaObservers{sz=10, idx=0, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
More stats: Ubuntu 12.04.1 LTS, Xeon E5540 @ 2.53 GHz, 96 GB with -XX:-UseTLAB
Run as is:
10000000 Iterations used 0 bytes, took 77 ns per loop
ArraySynchronized{sz=10, idx=10000000, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
10000000 Iterations used 0 bytes, took 80 ns per loop
CasCalls{sz=10, idx=10000000, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
10000000 Iterations used 0 bytes, took 84 ns per loop
ArrayLocked{sz=10, idx=10000000, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
10000000 Iterations used 0 bytes, took 29 ns per loop
VanillaObservers{sz=10, idx=0, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
Changed CasCalls to use dual loop:
10000000 Iterations used 0 bytes, took 77 ns per loop
ArraySynchronized{sz=10, idx=10000000, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
10000000 Iterations used 0 bytes, took 53 ns per loop
CasCalls{sz=10, idx=0, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
10000000 Iterations used 0 bytes, took 85 ns per loop
ArrayLocked{sz=10, idx=10000000, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
10000000 Iterations used 0 bytes, took 29 ns per loop
VanillaObservers{sz=10, idx=0, first=[1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000, 1000000], total=[10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000, 10000000]}
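The "one loop" versus "dual loop" difference is not shown in these notes, so the following is only a guess at the two shapes. It assumes the single loop wraps the index with a modulo on every step, while the dual loop walks from the start index to the end of the array and then from 0 up to the start index. `LoopShapes` and its method names are hypothetical.

```java
/** Hypothetical sketch of two ways to visit every observer beginning at a rotating start index. */
final class LoopShapes {
    interface Observer { void onEvent(int value); }

    /** One loop: the index is wrapped with a modulo on every step. Assumes start >= 0. */
    static void notifySingleLoop(Observer[] obs, int start, int value) {
        final int n = obs.length;
        for (int i = 0; i < n; i++) {
            obs[(start + i) % n].onEvent(value);
        }
    }

    /** Dual loop: start..end, then 0..start, with no wrap arithmetic inside either loop. */
    static void notifyDualLoop(Observer[] obs, int start, int value) {
        for (int i = start; i < obs.length; i++) {
            obs[i].onEvent(value);
        }
        for (int i = 0; i < start; i++) {
            obs[i].onEvent(value);
        }
    }
}
```

Both methods visit the observers in the same order; the dual form just trades the per-step modulo for two plain counted loops, which may be part of why the dual-loop runs above come out faster.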