Implement LockFreeExponentiallyDecayingReservoir (#1656)
* Implement LockFreeExponentiallyDecayingReservoir

  This implementation has several advantages over the existing
  ExponentiallyDecayingReservoir:
  * It exclusively uses the precise system clock (nanotime/clock.tick)
    instead of a combination of nanotime and currentTimeMillis, so it's
    not vulnerable to unexpected NTP clock jumps.
  * Lock free for substantially better performance under concurrent
    load[1] and improved performance in uncontended use[2]
  * Allows the rescale threshold to be configured programmatically.

  Potential trade-offs:
  * Updates which occur concurrently with rescaling may be discarded if
    the orphaned state node is updated after rescale has replaced it.
  * In the worst case, all concurrent threads updating the reservoir may
    attempt to rescale rather than a single thread holding an exclusive
    write lock. It's expected that the configuration is set such that
    rescaling is substantially less common than updating at peak load.

  [1] substantially better performance under concurrent load
  32 concurrent update threads
  ```
  Benchmark                            (reservoirType)       Mode  Cnt     Score      Error  Units
  ReservoirBenchmarks.updateReservoir  EXPO_DECAY            avgt    5  8235.861 ± 1306.404  ns/op
  ReservoirBenchmarks.updateReservoir  LOCK_FREE_EXPO_DECAY  avgt    5   758.315 ±   36.916  ns/op
  ```

  [2] improved performance in uncontended use
  1 benchmark thread
  ```
  Benchmark                            (reservoirType)       Mode  Cnt   Score    Error  Units
  ReservoirBenchmarks.updateReservoir  EXPO_DECAY            avgt    5  92.845 ± 36.478  ns/op
  ReservoirBenchmarks.updateReservoir  LOCK_FREE_EXPO_DECAY  avgt    5  49.168 ±  1.033  ns/op
  ```

* Precompute alpha at nanosecond scale

  Avoid expensive division operations on each rescale and update

* Optimize weightedsample allocations on update

* Refactor `update` to avoid isEmpty

  ```
  Benchmark                            (reservoirType)       Mode  Cnt     Score      Error  Units
  ReservoirBenchmarks.updateReservoir  EXPO_DECAY            avgt    5  8365.308 ± 1880.683  ns/op
  ReservoirBenchmarks.updateReservoir  LOCK_FREE_EXPO_DECAY  avgt    5    73.966 ±    5.305  ns/op
  ```

* Code review from @spkrka

* remove unnecessary validation

* readable decimals

* require positive size

* optimize rescaleIfNeeded below the 35-byte inline threshold

  This makes the check more likely to be optimized without
  forcing analysis of the less common doRescale path.

  Before:
  ```
  private LockFreeExponentiallyDecayingReservoir$State rescaleIfNeeded(long)
     0: aload_0
     1: getfield      #56  // Field state:Lcom/codahale/metrics/LockFreeExponentiallyDecayingReservoir$State;
     4: astore_3
     5: aload_3
     6: invokestatic  #81  // Method com/codahale/metrics/LockFreeExponentiallyDecayingReservoir$State.access$600:(Lcom/codahale/metrics/LockFreeExponentiallyDecayingReservoir$State;)J
     9: lstore        4
    11: lload_1
    12: lload         4
    14: lsub
    15: aload_0
    16: getfield      #36  // Field rescaleThresholdNanos:J
    19: lcmp
    20: iflt          51
    23: aload_3
    24: lload_1
    25: invokevirtual #85  // Method com/codahale/metrics/LockFreeExponentiallyDecayingReservoir$State.rescale:(J)Lcom/codahale/metrics/LockFreeExponentiallyDecayingReservoir$State;
    28: astore        6
    30: getstatic     #88  // Field stateUpdater:Ljava/util/concurrent/atomic/AtomicReferenceFieldUpdater;
    33: aload_0
    34: aload_3
    35: aload         6
    37: invokevirtual #92  // Method java/util/concurrent/atomic/AtomicReferenceFieldUpdater.compareAndSet:(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Z
    40: ifeq          46
    43: aload         6
    45: areturn
    46: aload_0
    47: getfield      #56  // Field state:Lcom/codahale/metrics/LockFreeExponentiallyDecayingReservoir$State;
    50: areturn
    51: aload_3
    52: areturn
  ```

  After:
  ```
  private LockFreeExponentiallyDecayingReservoir$State rescaleIfNeeded(long)
     0: aload_0
     1: getfield      #56  // Field state:Lcom/codahale/metrics/LockFreeExponentiallyDecayingReservoir$State;
     4: astore_3
     5: lload_1
     6: aload_3
     7: invokestatic  #81  // Method com/codahale/metrics/LockFreeExponentiallyDecayingReservoir$State.access$600:(Lcom/codahale/metrics/LockFreeExponentiallyDecayingReservoir$State;)J
    10: lsub
    11: aload_0
    12: getfield      #36  // Field rescaleThresholdNanos:J
    15: lcmp
    16: iflt          26
    19: aload_0
    20: lload_1
    21: aload_3
    22: invokespecial #85  // Method doRescale:(JLcom/codahale/metrics/LockFreeExponentiallyDecayingReservoir$State;)Lcom/codahale/metrics/LockFreeExponentiallyDecayingReservoir$State;
    25: areturn
    26: aload_3
    27: areturn
  ```

* Avoid incrementing count when the value has reached size.

  ```
  Benchmark                            (reservoirType)       Mode  Cnt     Score      Error  Units
  ReservoirBenchmarks.updateReservoir  EXPO_DECAY            avgt    5  8309.300 ± 1900.398  ns/op
  ReservoirBenchmarks.updateReservoir  LOCK_FREE_EXPO_DECAY  avgt    5    70.028 ±    0.887  ns/op
  ```

* rescale cannot exceed size elements

* line length

* requireNonNull validation

* Include LockFreeExponentiallyDecayingReservoir in ReservoirBenchmark

  Results with 32 concurrent threads on a 14c/28t Xeon W-2175
  ```
  Benchmark                                                      Mode  Cnt   Score    Error  Units
  ReservoirBenchmark.perfExponentiallyDecayingReservoir          avgt    4   8.817 ±  0.310  us/op
  ReservoirBenchmark.perfLockFreeExponentiallyDecayingReservoir  avgt    4   0.076 ±  0.002  us/op
  ReservoirBenchmark.perfSlidingTimeWindowArrayReservoir         avgt    4  14.890 ±  0.489  us/op
  ReservoirBenchmark.perfSlidingTimeWindowReservoir              avgt    4  39.066 ± 27.583  us/op
  ReservoirBenchmark.perfSlidingWindowReservoir                  avgt    4   4.257 ±  0.187  us/op
  ReservoirBenchmark.perfUniformReservoir                        avgt    4   0.704 ±  0.040  us/op
  ```

Thanks @carterkozak, @schlosna, and @spkrka!
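
For readers following the rescaleIfNeeded bytecode above, the split between a small threshold check and a separate doRescale path might look roughly like the Java below. This is a minimal sketch reconstructed from the "After" listing, not the committed source: the class name `RescaleSketch`, the trimmed-down `State` (which here holds only a start tick and no samples), and the constructor are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

/** Illustrative sketch only; names other than rescaleIfNeeded/doRescale are hypothetical. */
final class RescaleSketch {

    /** Immutable snapshot. A real reservoir state would also hold the weighted samples. */
    static final class State {
        final long startTick; // nanosecond tick at which this state was created

        State(long startTick) {
            this.startTick = startTick;
        }
    }

    // CAS access to the volatile 'state' field without allocating an AtomicReference.
    private static final AtomicReferenceFieldUpdater<RescaleSketch, State> STATE_UPDATER =
            AtomicReferenceFieldUpdater.newUpdater(RescaleSketch.class, State.class, "state");

    private final long rescaleThresholdNanos;
    private volatile State state;

    RescaleSketch(long rescaleThresholdNanos, long startTick) {
        this.rescaleThresholdNanos = rescaleThresholdNanos;
        this.state = new State(startTick);
    }

    /** Hot path: one volatile read, a subtraction, and a compare, kept small so it can inline. */
    State rescaleIfNeeded(long currentTick) {
        State stateSnapshot = this.state;
        if (currentTick - stateSnapshot.startTick >= rescaleThresholdNanos) {
            return doRescale(currentTick, stateSnapshot);
        }
        return stateSnapshot;
    }

    /** Cold path: build the replacement state and try to publish it with a single CAS. */
    private State doRescale(long currentTick, State stateSnapshot) {
        State newState = new State(currentTick); // assumption: real code would rescale sample weights here
        if (STATE_UPDATER.compareAndSet(this, stateSnapshot, newState)) {
            return newState;
        }
        // Another thread replaced the state first; use whatever it published.
        return this.state;
    }
}
```

Under these assumptions, several threads can reach doRescale at once, but only one compareAndSet succeeds and the others fall back to the published state. A thread still writing to the replaced snapshot loses its update, which is the first trade-off listed in the commit message.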