atomic-counter
Mutable counters that can be modified with atomic operatinos
https://github.com/sergv/atomic-counter
LTS Haskell 23.2: | 0.1.2.3 |
Stackage Nightly 2024-12-28: | 0.1.2.3 |
Latest on Hackage: | 0.1.2.3 |
atomic-counter-0.1.2.3@sha256:81a105aed2ee2f5e479448e44252b24cdfacf81a5a2106aabdd217bad94b6f40,3388
Module documentation for 0.1.2.3
- Control
- Control.Concurrent
- Control.Concurrent.Counter
- Control.Concurrent.Counter.Lifted
- Control.Concurrent.Counter.Unlifted
- Control.Concurrent.Counter
- Control.Concurrent
Synopsis
Mutable cells that hold an integer value and can be safely modified from multiple threads. Support only few operations: read, write, +, -, and bitwise and, or, xor and nand.
Good use case is a shared counter that multiple threads increment.
Operations translate to atomic CPU instructions which involve memory
barrier. For limited set of operation this package provides in
concurrent setting they tend to be faster than atomic modify operation
on IORef
s and other shared var types. Value inside the Counter
type is unboxed which contributes to outperforming all other shared
vars because they box the value they store. For integers boxing is
especially bad.
For single-threaded use case this package will likely not outperform
vanilla IORef
s or STRef
s (it depends on whether GHC adds memory
barrier for them or not, cf
https://gitlab.haskell.org/ghc/ghc/-/issues/22764, if it does end up
adding the same memory barrier then both this package and IORef
should be equal). All in all, operations this package provides will
incur guaranteed memory barrier which will serve no purpose in
single-threaded settings so for that use case this package is probably
not the best. The intended use of this package is in the concurrent
setting where it does seem like a clear winner (please see benchmarks
below).
Still even is single-threaded scenario the Counter
can serve as an
efficient mutable integer cell that does not box the integer.
Benchmark
Summary
Depending on number of threads Counter
from this package can be
up to 10 times faster than next best TVar
from the stm
package.
Details
The benchmark is to spawn N threads each of which will increment the same counter by 1 for a number of iterations.
Test setup: Intel i5-9600K CPU @ 3.70GHz, 6 cores, no hyperthreading, GHC 9.6.1.
NB IORef inconsistent
is the benchmark that just reads and writes
IORef
with the fastest functions the IORef
supports. It gives
wrong results when multiple threads access IORef
this way since
these reads and writes are not synchronized. It’s included only for
speed comparison purposes. All others do proper synchronization of
increments between the threads. Current package is shown as Counter
.
$ cabal run bench -- -j1 --timeout 300 --stdev 2
All
Correctness: OK (13.97s)
+++ OK, passed 10000 tests.
Read/write contention with 10 iterations and 1 threads
Counter: OK (14.23s)
839 ns ± 33 ns
IORef inconsistent: OK (1.08s)
1.03 μs ± 36 ns, 1.23x
IORef atomic: OK (1.30s)
1.25 μs ± 40 ns, 1.48x
MVar: OK (10.53s)
1.24 μs ± 13 ns, 1.48x
TMVar: OK (0.65s)
1.18 μs ± 30 ns, 1.41x
TVar: OK (0.33s)
1.13 μs ± 45 ns, 1.35x
Addr: OK (3.37s)
818 ns ± 28 ns, 0.97x
Read/write contention with 100 iterations and 1 threads
Counter: OK (1.47s)
1.35 μs ± 24 ns
IORef inconsistent: OK (17.27s)
2.09 μs ± 47 ns, 1.55x
IORef atomic: OK (0.45s)
3.22 μs ± 122 ns, 2.38x
MVar: OK (3.15s)
2.97 μs ± 30 ns, 2.20x
TMVar: OK (1.24s)
4.65 μs ± 64 ns, 3.44x
TVar: OK (2.43s)
4.61 μs ± 158 ns, 3.41x
Addr: OK (1.16s)
1.09 μs ± 22 ns, 0.80x
Read/write contention with 1000 iterations and 1 threads
Counter: OK (0.47s)
6.48 μs ± 235 ns
IORef inconsistent: OK (0.78s)
2.76 μs ± 88 ns, 0.43x
IORef atomic: OK (0.60s)
17.5 μs ± 399 ns, 2.69x
MVar: OK (2.09s)
15.8 μs ± 208 ns, 2.43x
TMVar: OK (0.40s)
48.3 μs ± 1.9 μs, 7.46x
TVar: OK (0.55s)
32.4 μs ± 713 ns, 5.00x
Addr: OK (0.34s)
4.73 μs ± 189 ns, 0.73x
Read/write contention with 10000 iterations and 1 threads
Counter: OK (0.50s)
58.4 μs ± 2.1 μs
IORef inconsistent: OK (0.42s)
23.8 μs ± 829 ns, 0.41x
IORef atomic: OK (0.73s)
168 μs ± 3.8 μs, 2.87x
MVar: OK (0.66s)
157 μs ± 4.6 μs, 2.69x
TMVar: OK (2.09s)
512 μs ± 7.2 μs, 8.78x
TVar: OK (1.39s)
321 μs ± 6.7 μs, 5.50x
Addr: OK (47.12s)
43.8 μs ± 562 ns, 0.75x
Read/write contention with 10 iterations and 2 threads
Counter: OK (1.10s)
7.85 μs ± 144 ns
IORef inconsistent: OK (1.93s)
7.65 μs ± 201 ns, 0.97x
IORef atomic: OK (0.54s)
8.16 μs ± 235 ns, 1.04x
MVar: OK (7.81s)
7.81 μs ± 72 ns, 0.99x
TMVar: OK (0.49s)
7.98 μs ± 292 ns, 1.02x
TVar: OK (1.91s)
7.98 μs ± 181 ns, 1.02x
Addr: OK (2.01s)
7.64 μs ± 213 ns, 0.97x
Read/write contention with 100 iterations and 2 threads
Counter: OK (1.00s)
8.50 μs ± 290 ns
IORef inconsistent: OK (8.00s)
8.13 μs ± 133 ns, 0.96x
IORef atomic: OK (2.80s)
13.3 μs ± 358 ns, 1.56x
MVar: OK (4.64s)
19.2 μs ± 473 ns, 2.25x
TMVar: OK (5.52s)
32.3 μs ± 1.1 μs, 3.80x
TVar: OK (83.27s)
30.5 μs ± 279 ns, 3.59x
Addr: OK (4.38s)
8.71 μs ± 333 ns, 1.03x
Read/write contention with 1000 iterations and 2 threads
Counter: OK (28.57s)
43.7 μs ± 106 ns
IORef inconsistent: OK (0.91s)
17.2 μs ± 669 ns, 0.39x
IORef atomic: OK (130.48s)
223 μs ± 10 μs, 5.09x
MVar: OK (1.19s)
5.11 ms ± 173 μs, 116.88x
TMVar: OK (0.33s)
295 μs ± 7.7 μs, 6.75x
TVar: OK (1.44s)
337 μs ± 10 μs, 7.71x
Addr: OK (0.47s)
49.3 μs ± 915 ns, 1.13x
Read/write contention with 10000 iterations and 2 threads
Counter: OK (0.51s)
466 μs ± 17 μs
IORef inconsistent: OK (13.49s)
190 μs ± 1.2 μs, 0.41x
IORef atomic: OK (71.77s)
1.80 ms ± 33 μs, 3.86x
MVar: OK (7.23s)
60.1 ms ± 505 μs, 128.83x
TMVar: OK (1.50s)
2.39 ms ± 39 μs, 5.13x
TVar: OK (108.24s)
3.13 ms ± 405 μs, 6.70x
Addr: OK (0.29s)
531 μs ± 20 μs, 1.14x
Read/write contention with 10 iterations and 4 threads
Counter: OK (2.33s)
15.5 μs ± 384 ns
IORef inconsistent: OK (2.41s)
15.6 μs ± 52 ns, 1.01x
IORef atomic: OK (1.56s)
19.5 μs ± 487 ns, 1.26x
MVar: OK (1.51s)
28.9 μs ± 647 ns, 1.87x
TMVar: OK (2.88s)
19.9 μs ± 439 ns, 1.29x
TVar: OK (0.38s)
18.2 μs ± 332 ns, 1.18x
Addr: OK (0.59s)
15.3 μs ± 477 ns, 0.99x
Read/write contention with 100 iterations and 4 threads
Counter: OK (1.63s)
25.3 μs ± 723 ns
IORef inconsistent: OK (0.35s)
17.6 μs ± 451 ns, 0.69x
IORef atomic: OK (0.46s)
154 μs ± 2.1 μs, 6.08x
MVar: OK (0.62s)
1.20 ms ± 38 μs, 47.27x
TMVar: OK (0.12s)
191 μs ± 7.4 μs, 7.54x
TVar: OK (0.14s)
221 μs ± 6.6 μs, 8.71x
Addr: OK (0.75s)
21.0 μs ± 228 ns, 0.83x
Read/write contention with 1000 iterations and 4 threads
Counter: OK (0.17s)
242 μs ± 6.7 μs
IORef inconsistent: OK (0.15s)
103 μs ± 2.7 μs, 0.43x
IORef atomic: OK (2.28s)
1.71 ms ± 36 μs, 7.07x
MVar: OK (0.41s)
12.9 ms ± 338 μs, 53.16x
TMVar: OK (0.29s)
2.08 ms ± 65 μs, 8.61x
TVar: OK (0.17s)
2.23 ms ± 73 μs, 9.21x
Addr: OK (0.58s)
234 μs ± 1.9 μs, 0.97x
Read/write contention with 10000 iterations and 4 threads
Counter: OK (0.17s)
2.24 ms ± 60 μs
IORef inconsistent: OK (9.34s)
1.00 ms ± 31 μs, 0.45x
IORef atomic: OK (70.87s)
22.0 ms ± 820 μs, 9.80x
MVar: OK (0.74s)
126 ms ± 1.9 ms, 56.07x
TMVar: OK (0.33s)
19.0 ms ± 670 μs, 8.48x
TVar: OK (0.09s)
20.8 ms ± 732 μs, 9.27x
Addr: OK (0.18s)
2.45 ms ± 97 μs, 1.09x
Read/write contention with 10 iterations and 6 threads
Counter: OK (6.77s)
23.9 μs ± 813 ns
IORef inconsistent: OK (0.42s)
23.4 μs ± 712 ns, 0.98x
IORef atomic: OK (0.70s)
39.4 μs ± 747 ns, 1.65x
MVar: OK (1.32s)
50.0 μs ± 1.0 μs, 2.09x
TMVar: OK (0.52s)
35.3 μs ± 449 ns, 1.47x
TVar: OK (3.93s)
32.6 μs ± 93 ns, 1.37x
Addr: OK (0.42s)
23.4 μs ± 381 ns, 0.98x
Read/write contention with 100 iterations and 6 threads
Counter: OK (1.20s)
47.8 μs ± 866 ns
IORef inconsistent: OK (0.24s)
26.9 μs ± 983 ns, 0.56x
IORef atomic: OK (1.46s)
483 μs ± 17 μs, 10.10x
MVar: OK (2.01s)
1.86 ms ± 13 μs, 38.97x
TMVar: OK (0.49s)
519 μs ± 4.5 μs, 10.86x
TVar: OK (14.78s)
521 μs ± 14 μs, 10.91x
Addr: OK (0.56s)
43.1 μs ± 1.6 μs, 0.90x
Read/write contention with 1000 iterations and 6 threads
Counter: OK (13.77s)
459 μs ± 8.8 μs
IORef inconsistent: OK (0.23s)
207 μs ± 4.1 μs, 0.45x
IORef atomic: OK (2.10s)
3.48 ms ± 70 μs, 7.58x
MVar: OK (0.64s)
19.3 ms ± 555 μs, 42.03x
TMVar: OK (1.07s)
5.55 ms ± 161 μs, 12.08x
TVar: OK (0.27s)
4.85 ms ± 189 μs, 10.56x
Addr: OK (27.53s)
461 μs ± 4.8 μs, 1.00x
Read/write contention with 10000 iterations and 6 threads
Counter: OK (0.52s)
4.88 ms ± 142 μs
IORef inconsistent: OK (0.25s)
2.25 ms ± 24 μs, 0.46x
IORef atomic: OK (66.59s)
75.5 ms ± 11 ms, 15.46x
MVar: OK (2.85s)
193 ms ± 4.2 ms, 39.54x
TMVar: OK (0.16s)
54.9 ms ± 1.6 ms, 11.25x
TVar: OK (10.04s)
47.1 ms ± 339 μs, 9.64x
Addr: OK (0.52s)
4.94 ms ± 89 μs, 1.01x
Read/write contention with 10 iterations and 8 threads
Counter: OK (1.84s)
27.3 μs ± 1.0 μs
IORef inconsistent: OK (1.98s)
27.5 μs ± 1.0 μs, 1.01x
IORef atomic: OK (0.91s)
56.1 μs ± 787 ns, 2.06x
MVar: OK (0.69s)
94.9 μs ± 3.2 μs, 3.48x
TMVar: OK (1.35s)
56.1 μs ± 1.5 μs, 2.06x
TVar: OK (0.36s)
51.7 μs ± 1.5 μs, 1.89x
Addr: OK (0.24s)
26.3 μs ± 672 ns, 0.96x
Read/write contention with 100 iterations and 8 threads
Counter: OK (1.41s)
63.1 μs ± 1.2 μs
IORef inconsistent: OK (8.38s)
33.3 μs ± 503 ns, 0.53x
IORef atomic: OK (0.59s)
709 μs ± 21 μs, 11.25x
MVar: OK (1.21s)
2.31 ms ± 49 μs, 36.65x
TMVar: OK (4.67s)
667 μs ± 18 μs, 10.58x
TVar: OK (4.81s)
690 μs ± 6.5 μs, 10.94x
Addr: OK (0.69s)
59.6 μs ± 1.7 μs, 0.94x
Read/write contention with 1000 iterations and 8 threads
Counter: OK (0.16s)
583 μs ± 13 μs
IORef inconsistent: OK (1.13s)
304 μs ± 3.5 μs, 0.52x
IORef atomic: OK (93.87s)
7.12 ms ± 158 μs, 12.22x
MVar: OK (1.54s)
24.4 ms ± 541 μs, 41.83x
TMVar: OK (0.09s)
6.32 ms ± 241 μs, 10.84x
TVar: OK (0.67s)
6.36 ms ± 221 μs, 10.92x
Addr: OK (0.56s)
540 μs ± 20 μs, 0.93x
Read/write contention with 10000 iterations and 8 threads
Counter: OK (5.13s)
5.67 ms ± 68 μs
IORef inconsistent: OK (0.16s)
3.14 ms ± 69 μs, 0.55x
IORef atomic: OK (35.04s)
148 ms ± 319 μs, 26.14x
MVar: OK (1.58s)
230 ms ± 8.7 ms, 40.61x
TMVar: OK (26.99s)
65.9 ms ± 4.1 ms, 11.62x
TVar: OK (3.41s)
66.6 ms ± 779 μs, 11.75x
Addr: OK (0.16s)
5.84 ms ± 164 μs, 1.03x
Read/write contention with 10 iterations and 12 threads
Counter: OK (0.54s)
30.5 μs ± 452 ns
IORef inconsistent: OK (0.29s)
32.4 μs ± 820 ns, 1.06x
IORef atomic: OK (7.93s)
70.3 μs ± 436 ns, 2.30x
MVar: OK (0.44s)
232 μs ± 8.7 μs, 7.59x
TMVar: OK (13.74s)
80.4 μs ± 2.3 μs, 2.64x
TVar: OK (1.73s)
79.2 μs ± 2.7 μs, 2.60x
Addr: OK (0.26s)
29.6 μs ± 1.0 μs, 0.97x
Read/write contention with 100 iterations and 12 threads
Counter: OK (0.49s)
98.0 μs ± 1.1 μs
IORef inconsistent: OK (0.66s)
48.2 μs ± 561 ns, 0.49x
IORef atomic: OK (0.88s)
1.07 ms ± 39 μs, 10.89x
MVar: OK (0.86s)
3.18 ms ± 61 μs, 32.48x
TMVar: OK (0.21s)
1.00 ms ± 21 μs, 10.19x
TVar: OK (0.12s)
1.06 ms ± 29 μs, 10.77x
Addr: OK (0.13s)
94.0 μs ± 2.7 μs, 0.96x
Read/write contention with 1000 iterations and 12 threads
Counter: OK (6.78s)
955 μs ± 5.9 μs
IORef inconsistent: OK (0.22s)
466 μs ± 16 μs, 0.49x
IORef atomic: OK (2.46s)
11.1 ms ± 109 μs, 11.62x
MVar: OK (1.14s)
34.3 ms ± 923 μs, 35.98x
TMVar: OK (8.60s)
10.7 ms ± 226 μs, 11.22x
TVar: OK (4.18s)
9.82 ms ± 42 μs, 10.29x
Addr: OK (0.42s)
1.00 ms ± 40 μs, 1.05x
Read/write contention with 10000 iterations and 12 threads
Counter: OK (1.99s)
9.45 ms ± 116 μs
IORef inconsistent: OK (7.78s)
4.74 ms ± 7.4 μs, 0.50x
IORef atomic: OK (82.99s)
304 ms ± 66 ms, 32.15x
MVar: OK (11.29s)
343 ms ± 8.3 ms, 36.34x
TMVar: OK (0.33s)
116 ms ± 1.9 ms, 12.23x
TVar: OK (10.63s)
105 ms ± 1.5 ms, 11.15x
Addr: OK (3.97s)
9.47 ms ± 361 μs, 1.00x
Read/write contention with 10 iterations and 16 threads
Counter: OK (0.20s)
42.1 μs ± 1.3 μs
IORef inconsistent: OK (0.18s)
40.5 μs ± 1.4 μs, 0.96x
IORef atomic: OK (1.08s)
153 μs ± 662 ns, 3.63x
MVar: OK (1.56s)
408 μs ± 9.1 μs, 9.69x
TMVar: OK (2.43s)
128 μs ± 1.9 μs, 3.04x
TVar: OK (0.16s)
135 μs ± 5.0 μs, 3.19x
Addr: OK (0.32s)
40.6 μs ± 940 ns, 0.96x
Read/write contention with 100 iterations and 16 threads
Counter: OK (0.32s)
131 μs ± 2.1 μs
IORef inconsistent: OK (1.68s)
75.4 μs ± 1.6 μs, 0.58x
IORef atomic: OK (4.05s)
1.27 ms ± 45 μs, 9.70x
MVar: OK (4.53s)
4.39 ms ± 85 μs, 33.50x
TMVar: OK (0.16s)
1.48 ms ± 40 μs, 11.30x
TVar: OK (1.20s)
1.45 ms ± 5.5 μs, 11.06x
Addr: OK (4.85s)
131 μs ± 1.3 μs, 1.00x
Read/write contention with 1000 iterations and 16 threads
Counter: OK (0.15s)
1.05 ms ± 32 μs
IORef inconsistent: OK (0.28s)
632 μs ± 12 μs, 0.60x
IORef atomic: OK (53.85s)
16.3 ms ± 1.5 ms, 15.58x
MVar: OK (2.98s)
46.6 ms ± 672 μs, 44.57x
TMVar: OK (0.09s)
15.2 ms ± 472 μs, 14.54x
TVar: OK (0.36s)
14.6 ms ± 354 μs, 13.96x
Addr: OK (0.15s)
1.38 ms ± 47 μs, 1.32x
Read/write contention with 10000 iterations and 16 threads
Counter: OK (0.17s)
13.3 ms ± 301 μs
IORef inconsistent: OK (0.16s)
5.73 ms ± 119 μs, 0.43x
IORef atomic: OK (117.97s)
464 ms ± 38 ms, 34.81x
MVar: OK (14.95s)
481 ms ± 4.5 ms, 36.11x
TMVar: OK (1.80s)
155 ms ± 2.7 ms, 11.65x
TVar: OK (7.20s)
146 ms ± 663 μs, 10.93x
Addr: OK (5.34s)
12.8 ms ± 187 μs, 0.96x
Read/write contention with 10 iterations and 20 threads
Counter: OK (0.19s)
49.3 μs ± 1.9 μs
IORef inconsistent: OK (0.20s)
48.2 μs ± 1.7 μs, 0.98x
IORef atomic: OK (5.11s)
188 μs ± 5.2 μs, 3.82x
MVar: OK (4.08s)
536 μs ± 5.4 μs, 10.86x
TMVar: OK (1.44s)
159 μs ± 531 ns, 3.23x
TVar: OK (0.37s)
176 μs ± 3.0 μs, 3.57x
Addr: OK (0.82s)
50.6 μs ± 1.0 μs, 1.03x
Read/write contention with 100 iterations and 20 threads
Counter: OK (0.10s)
171 μs ± 5.7 μs
IORef inconsistent: OK (0.25s)
90.0 μs ± 2.1 μs, 0.52x
IORef atomic: OK (9.07s)
1.52 ms ± 59 μs, 8.87x
MVar: OK (2.90s)
5.61 ms ± 159 μs, 32.73x
TMVar: OK (1.57s)
1.85 ms ± 54 μs, 10.77x
TVar: OK (0.19s)
1.88 ms ± 28 μs, 10.98x
Addr: OK (0.18s)
169 μs ± 3.8 μs, 0.98x
Read/write contention with 1000 iterations and 20 threads
Counter: OK (0.74s)
1.68 ms ± 65 μs
IORef inconsistent: OK (0.18s)
881 μs ± 24 μs, 0.52x
IORef atomic: OK (73.70s)
21.3 ms ± 600 μs, 12.66x
MVar: OK (0.94s)
61.0 ms ± 1.4 ms, 36.31x
TMVar: OK (7.19s)
18.1 ms ± 472 μs, 10.75x
TVar: OK (7.26s)
18.1 ms ± 494 μs, 10.79x
Addr: OK (0.09s)
1.74 ms ± 43 μs, 1.04x
Read/write contention with 10000 iterations and 20 threads
Counter: OK (6.72s)
16.2 ms ± 174 μs
IORef inconsistent: OK (6.36s)
7.73 ms ± 247 μs, 0.48x
IORef atomic: OK (21.41s)
641 ms ± 264 μs, 39.55x
MVar: OK (38.82s)
609 ms ± 21 ms, 37.53x
TMVar: OK (4.56s)
192 ms ± 648 μs, 11.82x
TVar: OK (1.13s)
193 ms ± 7.5 ms, 11.91x
Addr: OK (0.43s)
16.8 ms ± 581 μs, 1.03x
Read/write contention with 10 iterations and 32 threads
Counter: OK (1.01s)
73.7 μs ± 1.3 μs
IORef inconsistent: OK (0.51s)
72.0 μs ± 777 ns, 0.98x
IORef atomic: OK (0.24s)
178 μs ± 3.8 μs, 2.41x
MVar: OK (0.91s)
919 μs ± 17 μs, 12.47x
TMVar: OK (2.27s)
277 μs ± 4.4 μs, 3.76x
TVar: OK (0.15s)
299 μs ± 10 μs, 4.06x
Addr: OK (0.50s)
71.8 μs ± 2.6 μs, 0.97x
Read/write contention with 100 iterations and 32 threads
Counter: OK (0.29s)
280 μs ± 6.4 μs
IORef inconsistent: OK (1.44s)
150 μs ± 3.2 μs, 0.53x
IORef atomic: OK (0.25s)
1.74 ms ± 56 μs, 6.22x
MVar: OK (1.19s)
9.06 ms ± 269 μs, 32.31x
TMVar: OK (4.80s)
2.98 ms ± 36 μs, 10.64x
TVar: OK (19.94s)
3.10 ms ± 27 μs, 11.04x
Addr: OK (0.55s)
272 μs ± 9.4 μs, 0.97x
Read/write contention with 1000 iterations and 32 threads
Counter: OK (2.27s)
2.69 ms ± 48 μs
IORef inconsistent: OK (2.18s)
1.27 ms ± 23 μs, 0.47x
IORef atomic: OK (62.21s)
35.2 ms ± 4.3 ms, 13.08x
MVar: OK (5.90s)
89.9 ms ± 233 μs, 33.44x
TMVar: OK (2.96s)
30.7 ms ± 470 μs, 11.40x
TVar: OK (5.98s)
30.4 ms ± 124 μs, 11.30x
Addr: OK (0.29s)
2.75 ms ± 93 μs, 1.02x
Read/write contention with 10000 iterations and 32 threads
Counter: OK (0.08s)
28.8 ms ± 1.0 ms
IORef inconsistent: OK (0.32s)
12.9 ms ± 351 μs, 0.45x
IORef atomic: OK (82.13s)
1.180 s ± 65 ms, 41.02x
MVar: OK (128.00s)
982 ms ± 14 ms, 34.15x
TMVar: OK (0.17s)
299 ms ± 7.6 ms, 10.41x
TVar: OK (7.51s)
316 ms ± 12 ms, 11.00x
Addr: OK (1.36s)
27.0 ms ± 871 μs, 0.94x
Read/write contention with 10 iterations and 64 threads
Counter: OK (0.84s)
132 μs ± 3.3 μs
IORef inconsistent: OK (0.44s)
132 μs ± 2.4 μs, 1.00x
IORef atomic: OK (0.18s)
323 μs ± 6.5 μs, 2.45x
MVar: OK (0.47s)
1.89 ms ± 74 μs, 14.33x
TMVar: OK (4.18s)
557 μs ± 1.7 μs, 4.22x
TVar: OK (2.19s)
575 μs ± 1.6 μs, 4.36x
Addr: OK (0.19s)
125 μs ± 2.8 μs, 0.95x
Read/write contention with 100 iterations and 64 threads
Counter: OK (0.28s)
568 μs ± 13 μs
IORef inconsistent: OK (0.32s)
284 μs ± 2.9 μs, 0.50x
IORef atomic: OK (0.21s)
2.62 ms ± 95 μs, 4.62x
MVar: OK (2.37s)
18.1 ms ± 688 μs, 31.81x
TMVar: OK (0.15s)
6.24 ms ± 173 μs, 10.98x
TVar: OK (5.00s)
6.30 ms ± 162 μs, 11.08x
Addr: OK (2.06s)
555 μs ± 7.4 μs, 0.98x
Read/write contention with 1000 iterations and 64 threads
Counter: OK (0.59s)
5.80 ms ± 111 μs
IORef inconsistent: OK (2.10s)
2.49 ms ± 96 μs, 0.43x
IORef atomic: OK (2.36s)
75.8 ms ± 396 μs, 13.06x
MVar: OK (2.88s)
188 ms ± 4.8 ms, 32.41x
TMVar: OK (5.95s)
61.7 ms ± 176 μs, 10.63x
TVar: OK (1.52s)
63.8 ms ± 2.4 ms, 11.00x
Addr: OK (0.28s)
5.52 ms ± 167 μs, 0.95x
Read/write contention with 10000 iterations and 64 threads
Counter: OK (2.74s)
56.6 ms ± 2.2 ms
IORef inconsistent: OK (10.26s)
26.0 ms ± 746 μs, 0.46x
IORef atomic: OK (100.70s)
3.018 s ± 328 ms, 53.35x
MVar: OK (30.56s)
1.994 s ± 16 ms, 35.25x
TMVar: OK (0.36s)
665 ms ± 3.4 ms, 11.76x
TVar: OK (1.79s)
653 ms ± 2.4 ms, 11.55x
Addr: OK (11.06s)
57.0 ms ± 546 μs, 1.01x
Read/write contention with 10 iterations and 128 threads
Counter: OK (0.67s)
240 μs ± 5.9 μs
IORef inconsistent: OK (1.43s)
247 μs ± 8.5 μs, 1.03x
IORef atomic: OK (1.38s)
642 μs ± 6.2 μs, 2.68x
MVar: OK (1.92s)
3.83 ms ± 34 μs, 15.94x
TMVar: OK (16.12s)
1.11 ms ± 23 μs, 4.63x
TVar: OK (4.27s)
1.15 ms ± 36 μs, 4.78x
Addr: OK (0.16s)
230 μs ± 6.9 μs, 0.96x
Read/write contention with 100 iterations and 128 threads
Counter: OK (0.26s)
1.15 ms ± 42 μs
IORef inconsistent: OK (0.31s)
560 μs ± 18 μs, 0.49x
IORef atomic: OK (0.31s)
7.80 ms ± 274 μs, 6.81x
MVar: OK (0.55s)
36.2 ms ± 1.2 ms, 31.54x
TMVar: OK (19.40s)
12.4 ms ± 20 μs, 10.78x
TVar: OK (0.62s)
13.1 ms ± 411 μs, 11.39x
Addr: OK (0.13s)
1.12 ms ± 24 μs, 0.98x
Read/write contention with 1000 iterations and 128 threads
Counter: OK (0.15s)
8.93 ms ± 264 μs
IORef inconsistent: OK (0.52s)
5.28 ms ± 104 μs, 0.59x
IORef atomic: OK (80.71s)
310 ms ± 57 ms, 34.68x
MVar: OK (5.86s)
378 ms ± 3.1 ms, 42.38x
TMVar: OK (1.51s)
131 ms ± 3.5 ms, 14.62x
TVar: OK (3.07s)
129 ms ± 981 μs, 14.44x
Addr: OK (4.47s)
11.3 ms ± 139 μs, 1.27x
Read/write contention with 10000 iterations and 128 threads
Counter: OK (1.37s)
112 ms ± 3.3 ms
IORef inconsistent: OK (5.13s)
53.3 ms ± 571 μs, 0.47x
IORef atomic: OK (11.14s)
6.952 s ± 54 ms, 61.84x
MVar: OK (12.70s)
4.075 s ± 154 ms, 36.25x
TMVar: OK (3.51s)
1.268 s ± 5.7 ms, 11.27x
TVar: OK (0.71s)
1.306 s ± 50 ms, 11.62x
Addr: OK (11.18s)
117 ms ± 1.7 ms, 1.04x
All 309 tests passed (2397.33s)
Memory overhead
In pure Haskell (i.e. with no-cmm
flag enabled) each unlifted value of
type Counter
is a singleton mutable array from GHC primitives under
the hood. Thus it occupies at least platform integer size
+ array size
+ header
bytes which should typically be at least 3 machine
words. Lifted values may occupy more depending on optimizations.
By default CMM will be enabled which should save one word of overhead because there would be no array any more hence no need to store trivial size.
Changes
0.1.2.3
- Remove internal ‘test-utils’ library which wasn’t supposed to be used by clients anyway but was confusing cabal dependency resolution
0.1.2.2
- Lower minimum supported GHC to 8.6 and base to 4.12
- Improve generated documentation
- Faster
stg_newCounterzh
0.1.2.1
- Disable cmm on 32 bit x86 architecture
0.1.2
- Add compare and swap operation
- Add flag to force non-CMM implementation (
no-cmm
), not that users should use it normally
0.1.1
- On non-javascript platforms the counter is implemented directly in CMM instead of using singleton primitive array. Likely to occupy less memory and perform faster (https://github.com/sergv/atomic-counter/pull/3).
0.1
Initial release