stats: integrate real symbol table into stats system #4980

jmarantz · 2018-11-07T01:31:35Z

Description: Switches the default from using fake symbol tables to using real symbol tables.
Risk Level: medium -- every effort was made to eliminate risk of contention by symbolizing stat names during initialization phases. However, some code paths may have evaded this. Using real symbol tables does appear to be significantly faster. For example, a microbenchmark for the HTTP response code stats shows approximately 40% speed improvement, in addition to a significant reduction in per-cluster memory usage.

----------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations
----------------------------------------------------------------------
BM_AddResponsesFakeSymtab         9116 ns         9116 ns        66687
BM_AddResponsesRealSymtab         5613 ns         5613 ns       125048
BM_ResponseTimingFakeSymtab        678 ns          678 ns      1032669
BM_ResponseTimingRealSymtab        379 ns          379 ns      1840949
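
As a quick check on the "approximately 40%" figure, the sketch below computes the relative reduction in time per iteration from the table above. It reads "speed improvement" as time saved relative to the fake symbol table; the helper name `time_reduction_pct` is made up for illustration.

```python
def time_reduction_pct(fake_ns, real_ns):
    """Percent of per-iteration time saved relative to the fake symbol table."""
    return (fake_ns - real_ns) / fake_ns * 100.0


# Timings in nanoseconds, copied from the benchmark table above.
add_responses = time_reduction_pct(9116, 5613)
response_timing = time_reduction_pct(678, 379)

print(f"BM_AddResponses:   {add_responses:.1f}% less time per iteration")
print(f"BM_ResponseTiming: {response_timing:.1f}% less time per iteration")
```

The two benchmarks land on either side of 40%, which is consistent with the rounded figure in the description.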

Testing: //test/...
Docs Changes: should update stats.md to reflect the default change.
Release Notes: will need those still.
Fixes: #3585, #4196

Signed-off-by: Joshua Marantz <jmarantz@google.com>

jmarantz · 2018-11-07T01:31:56Z

@ambuc FYI

jmarantz · 2018-11-07T20:00:17Z

Got some encouraging perf results to motivate this large complicated scary PR :)

Memory usage is cut by 50% in HeapStatsThreadLocalStoreTest.Memory*:

--- a/test/common/stats/thread_local_store_test.cc
+++ b/test/common/stats/thread_local_store_test.cc
@@ -668,7 +668,7 @@ TEST_F(HeapStatsThreadLocalStoreTest, MemoryWithoutTls) {
-  EXPECT_LT(end_mem - start_mem, 28 * million); // actual value: 27203216 as of Oct 29, 2018
+  EXPECT_LT(end_mem - start_mem, 13 * million); // actual value: 12443472 as of Nov 7, 2018
 
 TEST_F(HeapStatsThreadLocalStoreTest, MemoryWithTls) {
@@ -691,7 +691,7 @@ TEST_F(HeapStatsThreadLocalStoreTest, MemoryWithTls) {
-  EXPECT_LT(end_mem - start_mem, 31 * million); // actual value: 30482576 as of Oct 29, 2018
+  EXPECT_LT(end_mem - start_mem, 16 * million); // actual value: 15722832 as of Nov 7, 2018
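
For reference, the "cut by 50%" claim can be checked directly against the actual values recorded in the diff comments. This is only arithmetic on the numbers quoted above; the variable names are illustrative.

```python
MILLION = 1_000_000

# (old actual bytes, new actual bytes, new EXPECT_LT bound) from the diff above.
cases = {
    "MemoryWithoutTls": (27_203_216, 12_443_472, 13 * MILLION),
    "MemoryWithTls": (30_482_576, 15_722_832, 16 * MILLION),
}


def reduction_pct(old, new):
    """Percent of memory saved relative to the old measurement."""
    return (old - new) / old * 100.0


for name, (old, new, bound) in cases.items():
    saved = old - new
    headroom = bound - new
    print(f"{name}: saved {saved} bytes ({reduction_pct(old, new):.1f}%), "
          f"{headroom} bytes under the new bound")
```

Both tests save the same absolute number of bytes, so the percentage differs only because the TLS case starts from a larger baseline.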

Speed tests look like they doubled as well, but I will confirm after a couple of corroborating runs.

jmarantz · 2018-11-07T22:23:37Z

Siege results (with a modified siege.py to include contention) tell an interesting story. I was expecting a perf improvement, but it was about a wash. But now that contentions are tracked explicitly we can see that in the current state of this PR, mutex contention increases significantly.

./siege.py ~/git4/envoy/bazel-bin/source/exe/envoy-static ~/git3/envoy/bazel-bin/source/exe/envoy-static  /tmp/envoy-perf/
Logging 10 runs to /tmp/envoy-perf/siege.log ...1...2...3...4...5...6...7...8...9...10

             Clean      Std Dev      Failures  Experimental  Std Dev      Failures  Improvement
             -----      -------      --------  ------------  -------      --------  -----------
Trans Rate   39682.54   3843.494     0         38759.69      2458.056     0         2.326%
Throughput   130.16     12.606       0         127.13        8.062        0         2.328%
Failed       0.0        0.0          0         0.0           0.0          0         0
EnvoyMem     4195776.0  3275.021     0         4162984.0     1233.033     0         0.782%
VSZ          343464.0   2.191        0         343496.0      2.191        0         -0.009%
RSS          51936.0    5102.665     0         54304.0       3349.689     0         -4.559%
Contentions  5.0        4.561        0         208.0         33.147       0         -4060.0%
WaitCycles   297543.0   2381894.537  0         57518417.0    11873445.75  0         -19231.128%
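
The Improvement column in the table above appears to be `(clean - experimental) / clean`, expressed as a percentage. That is inferred from the numbers, not from the source of siege.py, so treat the sketch below as an assumption; `improvement_pct` is a hypothetical name.

```python
def improvement_pct(clean, experimental):
    """(clean - experimental) / clean as a percentage (inferred convention)."""
    return (clean - experimental) / clean * 100.0


# (clean, experimental) pairs copied from the table above.
rows = {
    "Trans Rate": (39682.54, 38759.69),
    "Contentions": (5.0, 208.0),
    "WaitCycles": (297543.0, 57518417.0),
}

for name, (clean, experimental) in rows.items():
    print(f"{name:12s} {improvement_pct(clean, experimental):.3f}%")
```

Under this convention a positive value means the experimental build produced a smaller number. For lower-is-better rows such as Contentions that is a gain, but for Trans Rate and Throughput the positive 2.3% corresponds to a slightly lower experimental rate, which fits the "about a wash" reading.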

More work is required on the PR to remove the contentions, by symbolizing stat names on startup rather than composing strings on every request to look up a stat. Or, better yet, hold onto the stat objects themselves when possible.
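
To make the contention argument concrete, here is a toy model of the two patterns. It is a conceptual sketch only and not Envoy's SymbolTable API: `ToySymbolTable`, `ToyStore`, and the stat name `cluster.foo.upstream_rq_200` are illustrative assumptions. The table takes one shared lock per encode, so the count of lock acquisitions stands in for opportunities to contend.

```python
import threading


class ToySymbolTable:
    """Interns each dot-separated token of a stat name as a small integer."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ids = {}
        self.lock_acquisitions = 0  # instrumentation for this sketch only

    def encode(self, name):
        with self._lock:  # one lock shared by every caller
            self.lock_acquisitions += 1
            return tuple(self._ids.setdefault(tok, len(self._ids))
                         for tok in name.split("."))


class ToyStore:
    """Counters keyed by an already-encoded stat name."""

    def __init__(self, table):
        self.table = table
        self.counters = {}

    def inc(self, encoded_name):
        self.counters[encoded_name] = self.counters.get(encoded_name, 0) + 1


def per_request_lookup(requests):
    """Compose and encode the name on every request (the contended pattern)."""
    store = ToyStore(ToySymbolTable())
    for _ in range(requests):
        store.inc(store.table.encode("cluster.foo.upstream_rq_" + "200"))
    return store


def startup_symbolized(requests):
    """Encode once at startup, then reuse the encoded name on the hot path."""
    store = ToyStore(ToySymbolTable())
    rq_200 = store.table.encode("cluster.foo.upstream_rq_200")
    for _ in range(requests):
        store.inc(rq_200)
    return store


slow = per_request_lookup(1000)
fast = startup_symbolized(1000)
print(slow.table.lock_acquisitions, fast.table.lock_acquisitions)
```

Both variants end with the same counter value; only the number of trips through the shared lock differs. Holding onto the stat object itself would go one step further and skip the dictionary lookup in `inc` as well.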

jmarantz · 2018-11-07T22:37:00Z

Microbenchmark shows significant improvement. This differs from the siege results in that there will be no contention in the single-threaded benchmark, and of course it measures only the stat lookups and does not include all the other work needed to proxy a few K of lorem ipsum.

old:

--------------------------------------------------------------
Benchmark                       Time           CPU Iterations
--------------------------------------------------------------
BM_StatsNoTls_mean       17363665 ns   17363797 ns         31
BM_StatsNoTls_median     17291555 ns   17291754 ns         31
BM_StatsNoTls_stddev       362025 ns     362036 ns         31
BM_StatsWithTls_mean     20593810 ns   20593997 ns         27
BM_StatsWithTls_median   20169259 ns   20169508 ns         27
BM_StatsWithTls_stddev    1412912 ns    1412939 ns         27

new:

--------------------------------------------------------------
Benchmark                       Time           CPU Iterations
--------------------------------------------------------------
BM_StatsNoTls_mean        7010833 ns    7010901 ns         78
BM_StatsNoTls_median      6977452 ns    6977538 ns         78
BM_StatsNoTls_stddev       133710 ns     133713 ns         78
BM_StatsWithTls_mean      9030426 ns    9030522 ns         64
BM_StatsWithTls_median    8972650 ns    8972531 ns         64
BM_StatsWithTls_stddev     171496 ns     171491 ns         64

This is more than a 2x improvement in stats overhead (assuming we are looking up stats from symbolized names rather than from strings).
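
The speedup ratios behind the "more than 2x" statement, computed from the mean times in the two tables above (plain arithmetic on the reported numbers; the dictionary names are illustrative):

```python
# Mean wall times in nanoseconds, copied from the "old" and "new" tables.
old_mean_ns = {"BM_StatsNoTls": 17_363_665, "BM_StatsWithTls": 20_593_810}
new_mean_ns = {"BM_StatsNoTls": 7_010_833, "BM_StatsWithTls": 9_030_426}

speedup = {name: old_mean_ns[name] / new_mean_ns[name] for name in old_mean_ns}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x faster")
```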

mattklein123

Nice!

jmarantz · 2020-06-25T16:15:29Z

The TSAN test failures are due to timeouts, one of which is in eds_speed_test_benchmark_test. On my workstation, with TSAN:

//test/common/upstream:eds_speed_test_benchmark_test PASSED in 156.6s

So I'm trying to re-run that; I guess maybe I've triggered that already, but it will run after 'format'?

jmarantz · 2020-06-25T20:53:56Z

/azp run

azure-pipelines · 2020-06-25T20:54:10Z

Azure Pipelines successfully started running 1 pipeline(s), but failed to run 2 pipeline(s).

mattklein123 · 2020-06-26T15:38:09Z

@jmarantz sorry can you merge master again to fix the conflict? The TSAN flake should be fixed.

/wait

jmarantz · 2020-06-26T15:42:20Z

I think I have another TSAN timeout issue as well, which I am hoping will be mitigated by #11768, but let's see what happens after the merge.

jmarantz · 2020-06-27T19:27:11Z

@mattklein123 this looks like it passes now; I had to de-flake the http2 integration test with a wait-for-counter-ge, and increase the test-size for ads_integration_test. I suspect that tsan overhead disproportionately slows down the tests with real symbol table due to all the locks being taken, regardless of whether they contend.
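
The de-flaking idea of waiting for a counter to reach a threshold, rather than reading it once and racing the code that updates it, can be sketched generically as follows. This is not Envoy's integration-test helper; `wait_for_counter_ge` and `demo` are hypothetical names, and the polling interval and timeouts are arbitrary.

```python
import threading
import time


def wait_for_counter_ge(read_counter, target, timeout_s=5.0, poll_s=0.01):
    """Poll read_counter() until it is >= target; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if read_counter() >= target:
            return True
        time.sleep(poll_s)
    return read_counter() >= target  # one last check at the deadline


def demo():
    """A worker bumps the counter after a short delay; the test waits for it."""
    counters = {"timeouts": 0}

    def worker():
        time.sleep(0.05)
        counters["timeouts"] += 1

    thread = threading.Thread(target=worker)
    thread.start()
    reached = wait_for_counter_ge(lambda: counters["timeouts"], 1, timeout_s=2.0)
    thread.join()
    return reached
```

A single immediate read in `demo` would usually see 0 and fail; waiting with a bounded timeout makes the check independent of how long the worker takes, which matters when sanitizer overhead slows everything down.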

mattklein123

Awesome, LGTM with a small merge issue I think.

/wait

mattklein123 · 2020-06-28T02:44:48Z

docs/root/version_history/current.rst

@@ -107,6 +107,9 @@ New Features
 * metrics service: added added :ref:`API version <envoy_v3_api_field_config.metrics.v3.MetricsServiceConfig.transport_api_version>` to explicitly set the version of gRPC service endpoint and message to be used.
 * network filters: added a :ref:`postgres proxy filter <config_network_filters_postgres_proxy>`.
 * network filters: added a :ref:`rocketmq proxy filter <config_network_filters_rocketmq_proxy>`.
+* performance: stats symbol table implementation (enabled by default; to disable it, add
+  `--use-fake-symbol-table 1` to the command-line arguments when starting Envoy).
+* prometheus stats: fix the sort order of output lines to comply with the standard.


Merge issue?

mattklein123

Nice!

jmarantz · 2020-06-29T02:31:51Z

Thanks for all the help pushing that through @mattklein123! This has to be close to the record for longest PR issue->merge time delta.

jmarantz added 4 commits October 31, 2018 12:48

checkpoint

ed31d09

Signed-off-by: Joshua Marantz <jmarantz@google.com>

got heap_stat_data_test working.

1cd7a7b

Signed-off-by: Joshua Marantz <jmarantz@google.com>

all tests working; but still taking locks in hot-path.

d62f533

Signed-off-by: Joshua Marantz <jmarantz@google.com>

format

b3cad8a

Signed-off-by: Joshua Marantz <jmarantz@google.com>

jmarantz added 3 commits November 6, 2018 20:33

Merge branch 'master' into integration-symtab

63138e8

Signed-off-by: Joshua Marantz <jmarantz@google.com>

format

524cf02

Signed-off-by: Joshua Marantz <jmarantz@google.com>

fix botched merge.

bb84b90

Signed-off-by: Joshua Marantz <jmarantz@google.com>

jmarantz added 3 commits November 7, 2018 16:12

perf test and tweaks

f9c0441

Signed-off-by: Joshua Marantz <jmarantz@google.com>

formatting, comments, and tests.

7c5ac2f

Signed-off-by: Joshua Marantz <jmarantz@google.com>

Merge branch 'master' into integration-symtab

5e1c713

Signed-off-by: Joshua Marantz <jmarantz@google.com>

jmarantz added 3 commits November 8, 2018 08:59

use std::make_unique for clang-tidy

bdbf340

Signed-off-by: Joshua Marantz <jmarantz@google.com>

Make a class for Http::CodeUtility::chargeResponseTiming et al, plumb…

d70cb18

… it in, and preconstruct strings. Signed-off-by: Joshua Marantz <jmarantz@google.com>

Use join() rather than fmt::format, and hold static strings in string…

9c49dec

…_view rather than std::string. Signed-off-by: Joshua Marantz <jmarantz@google.com>

jmarantz mentioned this pull request Nov 8, 2018

stats: Capture all the strings used to accumulate http stats in an object, plumbed through the system. #4997

Merged

jmarantz added 2 commits November 8, 2018 13:30

fix router test (empty prefix) and use const string_view for member v…

df7d7a1

…ars. Signed-off-by: Joshua Marantz <jmarantz@google.com>

add speed-test.

da648b3

Signed-off-by: Joshua Marantz <jmarantz@google.com>

mattklein123 self-assigned this Nov 8, 2018

jmarantz added 9 commits November 8, 2018 14:01

Remove libraries not needed by speed test.

ec66aaf

Signed-off-by: Joshua Marantz <jmarantz@google.com>

Merge branch 'master' into integration-symtab

461471a

Signed-off-by: Joshua Marantz <jmarantz@google.com>

Merge branch 'http-code-stats-as-object' into integration-symtab

32ff476

Signed-off-by: Joshua Marantz <jmarantz@google.com>

checkpoint

b711b1b

Signed-off-by: Joshua Marantz <jmarantz@google.com>

checkpoint

7723309

Signed-off-by: Joshua Marantz <jmarantz@google.com>

partially symbolize http-response-code stats.

a892f70

Signed-off-by: Joshua Marantz <jmarantz@google.com>

Virtualize SymbolTable (but not symbols or StatName) for use in mocks…

5fadb96

…, to avoid plumbing one common SymbolTable through the object graph. Signed-off-by: Joshua Marantz <jmarantz@google.com>

Merge branch 'master' into integration-symtab

3dacb8e

Signed-off-by: Joshua Marantz <jmarantz@google.com>

Pre-allocate symbols for 200s and 404s, alloc file-system stats per f…

aaca920

…ile-system not per-file, and add histogramming of symbol-table encodes. Signed-off-by: Joshua Marantz <jmarantz@google.com>

mattklein123 previously approved these changes Jun 25, 2020

Merge branch 'master' into integration-symtab

c3bf36d

Signed-off-by: Joshua Marantz <jmarantz@google.com>

repokitteh-read-only bot added the waiting label Jun 26, 2020

Merge branch 'master' into integration-symtab

3193beb

Signed-off-by: Joshua Marantz <jmarantz@google.com>

jmarantz dismissed mattklein123’s stale review via 3193beb June 26, 2020 15:43

repokitteh-read-only bot removed the waiting label Jun 26, 2020

mattklein123 added the waiting label Jun 26, 2020

jmarantz added 2 commits June 26, 2020 17:19

Merge branch 'master' into integration-symtab

55e2060

Signed-off-by: Joshua Marantz <jmarantz@google.com>

Merge branch 'master' into integration-symtab

4d3043e

Signed-off-by: Joshua Marantz <jmarantz@google.com>

repokitteh-read-only bot removed the waiting label Jun 27, 2020

jmarantz added 2 commits June 27, 2020 13:06

try to avoid race by waiting for a timeout stat rather than just stro…

5695a17

…bing it. Signed-off-by: Joshua Marantz <jmarantz@google.com>

Declare ads_integregion_test as enormous to try to avoid tsan timeouts.

9d33cce

Signed-off-by: Joshua Marantz <jmarantz@google.com>

mattklein123 requested changes Jun 28, 2020

repokitteh-read-only bot added the waiting label Jun 28, 2020

remove superfluous line, that probably was from a merge.

96e2c6c

Signed-off-by: Joshua Marantz <jmarantz@google.com>

repokitteh-read-only bot removed the waiting label Jun 28, 2020

back out accidental commit

141c716

Signed-off-by: Joshua Marantz <jmarantz@google.com>

mattklein123 approved these changes Jun 28, 2020

Merge branch 'master' into integration-symtab

fabc533

Signed-off-by: Joshua Marantz <jmarantz@google.com>

mattklein123 merged commit 0c03a76 into envoyproxy:master Jun 28, 2020

jmarantz deleted the integration-symtab branch June 29, 2020 02:31
