Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Add support for wasm runtime metrics try #2 #4483

Merged
merged 67 commits into from
Dec 16, 2021
Merged
Show file tree
Hide file tree
Changes from 16 commits
Commits
Show all changes
67 commits
Select commit Hold shift + click to select a range
e5ac481
Add runtime metrics provider
sandreim Dec 7, 2021
f2fedf5
Runner changes
sandreim Dec 7, 2021
3932b89
Some sample metrics in paras_inherent
sandreim Dec 7, 2021
c9099e9
update cargo toml
sandreim Dec 7, 2021
21c967b
fmt
sandreim Dec 7, 2021
b6205b2
bug
sandreim Dec 7, 2021
e058dc1
more fmt after merge
sandreim Dec 7, 2021
0d9760a
Refactor metric prefix override
sandreim Dec 8, 2021
ae187f7
fmt
sandreim Dec 8, 2021
3189117
remove bug comment
sandreim Dec 8, 2021
561c414
Add runtime metric primitives
sandreim Dec 8, 2021
bf5e5eb
Impl trace event parsing
sandreim Dec 8, 2021
e3385de
Update metrics
sandreim Dec 8, 2021
267fc85
cargo lock
sandreim Dec 8, 2021
82356e5
fmt
sandreim Dec 8, 2021
41c4034
Fix target check
sandreim Dec 8, 2021
a6ed5d4
Merge branch 'master' of github.com:paritytech/polkadot into sandreim…
sandreim Dec 9, 2021
6d8cb15
Runtime metrics primitives
sandreim Dec 9, 2021
d2381ea
Review feedback
sandreim Dec 9, 2021
155c9b7
Runtime metrics crate
sandreim Dec 9, 2021
2f27805
Node side runtime metric changes
sandreim Dec 9, 2021
cfd62ab
use runtime CounterVec instead of macro
sandreim Dec 9, 2021
e6855e1
fmt nice
sandreim Dec 9, 2021
90b78c8
remove dead code
sandreim Dec 9, 2021
ff02191
base58 decoding
sandreim Dec 9, 2021
975e46b
base58 encoding
sandreim Dec 9, 2021
6be0e3b
fix warn
sandreim Dec 9, 2021
fd612f1
typo
sandreim Dec 9, 2021
609bb74
Review feedback
sandreim Dec 10, 2021
c07f5cc
Finish label support
sandreim Dec 10, 2021
06af03b
fmt
sandreim Dec 10, 2021
c446960
please compile
sandreim Dec 10, 2021
470b976
add feature gate
sandreim Dec 10, 2021
1342382
fmt
sandreim Dec 10, 2021
fe95d3d
Comment cargo toml
sandreim Dec 10, 2021
1c66365
Fix cargo toml description
sandreim Dec 10, 2021
b4aa11d
merge master
sandreim Dec 10, 2021
129aae2
merge master
sandreim Dec 13, 2021
cff2691
Update doc.
sandreim Dec 13, 2021
a97a742
switch to `runtime-metrics` feature
sandreim Dec 13, 2021
eab86b2
fmt
sandreim Dec 13, 2021
9d64fd0
cargo toml
sandreim Dec 13, 2021
2703729
fix tests
sandreim Dec 13, 2021
fdc99dc
fixes
sandreim Dec 13, 2021
2eb038d
better ux
sandreim Dec 13, 2021
ca4dd9f
from_utf8_unchecked is safe
sandreim Dec 13, 2021
977f84d
fmt
sandreim Dec 13, 2021
e7c5636
Add Counter and refactor
sandreim Dec 13, 2021
9ed04c6
Fixes
sandreim Dec 13, 2021
09e0978
review fixes
sandreim Dec 14, 2021
a6442af
more fixes
sandreim Dec 14, 2021
04f517c
add integration test
sandreim Dec 14, 2021
1c44ffe
dev deps
sandreim Dec 14, 2021
90dd6fb
gitlab script update
sandreim Dec 14, 2021
7303a4a
review fixes
sandreim Dec 15, 2021
796338f
fix merge damage
sandreim Dec 15, 2021
0b6f0f2
Run tests twice
sandreim Dec 15, 2021
bc01068
small fix
sandreim Dec 15, 2021
e818617
typo
sandreim Dec 15, 2021
5a1050d
cargo lock
sandreim Dec 15, 2021
2febc4c
tests
sandreim Dec 15, 2021
fecd12b
spellcheck happy ?
sandreim Dec 15, 2021
0b405e0
more fixes
sandreim Dec 16, 2021
78620f6
Guard tracing init
sandreim Dec 16, 2021
061efe2
missing copyright
sandreim Dec 16, 2021
6dae53e
Merge remote-tracking branch 'origin/master' into sandreim/runtime_me…
Dec 16, 2021
f1d515e
update lockfile for substrate
Dec 16, 2021
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -116,6 +116,7 @@ panic = "unwind"
runtime-benchmarks= [ "polkadot-cli/runtime-benchmarks" ]
try-runtime = [ "polkadot-cli/try-runtime" ]
disputes = [ "polkadot-cli/disputes" ]
with-tracing = [ "polkadot-cli/with-tracing" ]

# Configuration for building a .deb package - for use with `cargo-deb`
[package.metadata.deb]
Expand Down
3 changes: 3 additions & 0 deletions cli/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,8 @@ frame-benchmarking-cli = { git = "https://github.com/paritytech/substrate", bran
try-runtime-cli = { git = "https://github.com/paritytech/substrate", branch = "master", optional = true }
sc-cli = { git = "https://github.com/paritytech/substrate", branch = "master", optional = true }
sc-service = { git = "https://github.com/paritytech/substrate", branch = "master", optional = true }
sc-tracing = { git = "https://github.com/paritytech/substrate", branch = "master" }
polkadot-node-metrics = { path = "../node/metrics" }

# this crate is used only to enable `trie-memory-tracker` feature
# see https://github.com/paritytech/substrate/pull/6745
Expand Down Expand Up @@ -62,3 +64,4 @@ rococo-native = [ "service/rococo-native" ]

malus = [ "full-node", "service/malus" ]
disputes = [ "service/disputes" ]
with-tracing = ["service/with-tracing"]
8 changes: 8 additions & 0 deletions cli/src/cli.rs
Original file line number Diff line number Diff line change
Expand Up @@ -121,3 +121,11 @@ pub struct Cli {
#[structopt(flatten)]
pub run: RunCmd,
}

impl Cli {
/// Updates the prometheus metric prefix.
/// Must be called before `CliConfiguration::create_configuration()`.
pub fn update_prometheus_metric_prefix(&mut self, prefix: &'static str) {
self.run.base.prometheus_metric_prefix = prefix;
}
}
sandreim marked this conversation as resolved.
Show resolved Hide resolved
26 changes: 22 additions & 4 deletions cli/src/command.rs
Original file line number Diff line number Diff line change
Expand Up @@ -223,8 +223,17 @@ pub fn run_node(run: Cli, overseer_gen: impl service::OverseerGen) -> Result<()>
run_node_inner(run, overseer_gen)
}

fn run_node_inner(cli: Cli, overseer_gen: impl service::OverseerGen) -> Result<()> {
let runner = cli.create_runner(&cli.run.base).map_err(Error::from)?;
fn run_node_inner<F>(
cli: Cli,
overseer_gen: impl service::OverseerGen,
logger_hook: F,
) -> Result<()>
where
F: FnOnce(&mut sc_cli::LoggerBuilder, &sc_service::Configuration),
{
let runner = cli
.create_runner_with_logger_hook::<sc_cli::RunCmd, F>(&cli.run.base, logger_hook)
.map_err(Error::from)?;
let chain_spec = &runner.config().chain_spec;

set_default_ss58_version(chain_spec);
Expand Down Expand Up @@ -265,12 +274,21 @@ fn run_node_inner(cli: Cli, overseer_gen: impl service::OverseerGen) -> Result<(
})
}

/// TODO: Start the node with support for publishing runtime metrics.
ordian marked this conversation as resolved.
Show resolved Hide resolved
pub fn run_with_wasm_metrics() -> Result<()> {
Ok(())
}

/// Parses polkadot specific CLI arguments and run the service.
pub fn run() -> Result<()> {
let cli = Cli::from_args();
let mut cli: Cli = Cli::from_args();

// Override the default substrate metric prefix.
cli.update_prometheus_metric_prefix("polkadot");

match &cli.subcommand {
None => run_node_inner(cli, service::RealOverseerGen),
// TODO: gate by feature `runtime_metrics`.
None => run_node_inner(cli, service::RealOverseerGen, polkadot_node_metrics::logger_hook()),
Some(Subcommand::BuildSpec(cmd)) => {
let runner = cli.create_runner(cmd)?;
Ok(runner.sync_run(|config| cmd.run(config.chain_spec, config.network))?)
Expand Down
6 changes: 6 additions & 0 deletions node/client/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -45,3 +45,9 @@ polkadot = ["polkadot-runtime"]
kusama = ["kusama-runtime"]
rococo = ["rococo-runtime"]
westend = ["westend-runtime"]
with-tracing = [
"rococo-runtime/with-tracing",
"kusama-runtime/with-tracing",
"westend-runtime/with-tracing",
"polkadot-runtime/with-tracing",
]
7 changes: 7 additions & 0 deletions node/metrics/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -8,10 +8,17 @@ description = "Subsystem traits and message definitions"
[dependencies]
futures = "0.3.17"
futures-timer = "3.0.2"
tracing = "0.1.29"

metered-channel = { path = "../metered-channel" }

substrate-prometheus-endpoint = { git = "https://github.com/paritytech/substrate", branch = "master" }
sc-tracing = { git = "https://github.com/paritytech/substrate", branch = "master" }
sc-cli = { git = "https://github.com/paritytech/substrate", branch = "master" }
sc-service = { git = "https://github.com/paritytech/substrate", branch = "master" }
codec = { package = "parity-scale-codec", version = "2.2.0" }
sp-tracing = { git = "https://github.com/paritytech/substrate", branch = "master" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sp-tracing is in here twice

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually not the same :) but only sc-tracing is needed.

primitives = { package = "polkadot-primitives", path = "../../primitives/" }

[features]
default = []
2 changes: 2 additions & 0 deletions node/metrics/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,8 @@ pub use metered_channel as metered;
/// Cyclic metric collection support.
pub mod metronome;
pub use self::metronome::Metronome;
pub mod runtime;
pub use self::runtime::logger_hook;

/// This module reexports Prometheus types and defines the [`Metrics`] trait.
pub mod metrics {
Expand Down
152 changes: 152 additions & 0 deletions node/metrics/src/runtime.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,152 @@
// Copyright 2021 Parity Technologies (UK) Ltd.
// This file is part of Polkadot.

// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.

// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.

//! Runtime Metrics helpers.
//!
//! Builds on top of Substrate wasm tracing support.

use codec::Decode;
use primitives::v0::{RuntimeMetricOp, RuntimeMetricUpdate};
use std::{
collections::hash_map::HashMap,
sync::{Arc, Mutex},
};
use substrate_prometheus_endpoint::{register, CounterVec, Opts, PrometheusError, Registry, U64};

/// We only support CounterVec for now.
/// TODO: add more when needed.
sandreim marked this conversation as resolved.
Show resolved Hide resolved
#[derive(Clone, Default)]
pub struct Metrics {
counter_vecs: Arc<Mutex<HashMap<String, CounterVec<U64>>>>,
}

/// Runtime metrics wrapper.
#[derive(Clone)]
pub struct RuntimeMetricsProvider(Registry, Metrics);

/// A set of metric labels.
#[derive(Clone)]
pub struct RuntimeMetricLabels(Vec<RuntimeMetricLabel>);

/// A metric label value.
#[derive(Clone)]
pub struct RuntimeMetricLabel(&'static str);

impl RuntimeMetricLabel {
/// Returns the inner static string.
pub fn as_str(&self) -> &'static str {
self.0
}
}

impl From<&'static str> for RuntimeMetricLabel {
fn from(s: &'static str) -> Self {
Self(s)
}
}
impl RuntimeMetricsProvider {
/// Creates new instance.
pub fn new(metrics_registry: Registry) -> Self {
Self(metrics_registry, Metrics::default())
}

/// Register a counter vec metric.
pub fn register_countervec(
&self,
metric_name: &str,
description: &str,
label: RuntimeMetricLabel,
) -> Result<(), PrometheusError> {
if !self.1.counter_vecs.lock().expect("bad lock").contains_key(metric_name) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hold the lock for the entire scope

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, runtime never panics, handle it and print a warning

let counter_vec = register(
CounterVec::new(Opts::new(metric_name, description), &[label.as_str()])?,
&self.0,
)?;

self.1
.counter_vecs
.lock()
.expect("bad lock")
.insert(metric_name.to_owned(), counter_vec);
}

Ok(())
}

/// Increment a counter vec by a value.
pub fn inc_counter_by(&self, name: &str, value: u64, label: RuntimeMetricLabel) {
match self.register_countervec(name, "default description", label.clone()) {
Ok(_) => {
// The metric is in the hashmap, unwrap won't panic.
self.1
.counter_vecs
.lock()
.expect("bad lock")
.get_mut(name)
.unwrap()
.with_label_values(&[label.as_str()])
.inc_by(value);
},
Err(_) => {},
}
}
}

impl sc_tracing::TraceHandler for RuntimeMetricsProvider {
fn handle_span(&self, _span: &sc_tracing::SpanDatum) {}
fn handle_event(&self, event: &sc_tracing::TraceEvent) {
let target = event.values.string_values.get("target");
if target.is_none() || target.unwrap().ne("metrics") {
sandreim marked this conversation as resolved.
Show resolved Hide resolved
return
}

if let Some(update_op_str) = event.values.string_values.get("params").cloned() {
// TODO: Fix ugly hack because the payload comes in as a formatted string.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As said, adding a base64 encoding would avoid this entirely.

Copy link
Contributor Author

@sandreim sandreim Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added base58 encoding.

const SKIP_CHARS: usize = " { update_op: ".len();

match RuntimeMetricUpdate::decode(&mut update_op_str[SKIP_CHARS..].as_bytes()) {
Ok(update_op) => {
self.parse_metric_update(update_op);
},
Err(e) => {
tracing::error!("TraceEvent decode failed: {:?}", e);
},
}
}
}
}

impl RuntimeMetricsProvider {
fn parse_metric_update(&self, update: RuntimeMetricUpdate) {
match update.op {
RuntimeMetricOp::Increment(value) => {
self.inc_counter_by(update.metric_name(), value, "test_label".into());
},
}
}
}

/// Returns the custom profiling closure that we'll apply to the LoggerBuilder.
pub fn logger_hook() -> impl FnOnce(&mut sc_cli::LoggerBuilder, &sc_service::Configuration) -> () {
|logger_builder, config| {
if config.prometheus_registry().is_none() {
return
sandreim marked this conversation as resolved.
Show resolved Hide resolved
}
let registry = config.prometheus_registry().cloned().unwrap();
let metrics_provider = RuntimeMetricsProvider::new(registry);
logger_builder.with_custom_profiling(vec![Box::new(metrics_provider)]);
}
}
8 changes: 8 additions & 0 deletions node/service/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -173,3 +173,11 @@ try-runtime = [
]
malus = ["full-node"]
disputes = ["polkadot-node-core-dispute-coordinator/disputes"]
with-tracing = [
"polkadot-client/with-tracing",
"rococo-runtime/with-tracing",
"westend-runtime/with-tracing",
"kusama-runtime/with-tracing",
"polkadot-runtime/with-tracing",
"polkadot-runtime-parachains/with-tracing"
]
11 changes: 0 additions & 11 deletions node/service/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -272,15 +272,6 @@ impl IdentifyVariant for Box<dyn ChainSpec> {
}
}

// If we're using prometheus, use a registry with a prefix of `polkadot`.
fn set_prometheus_registry(config: &mut Configuration) -> Result<(), Error> {
if let Some(PrometheusConfig { registry, .. }) = config.prometheus_config.as_mut() {
*registry = Registry::new_custom(Some("polkadot".into()), None)?;
}

Ok(())
}

/// Initialize the `Jeager` collector. The destination must listen
/// on the given address and port for `UDP` packets.
#[cfg(any(test, feature = "full-node"))]
Expand Down Expand Up @@ -344,8 +335,6 @@ where
RuntimeApiCollection<StateBackend = sc_client_api::StateBackendFor<FullBackend, Block>>,
ExecutorDispatch: NativeExecutionDispatch + 'static,
{
set_prometheus_registry(config)?;

let telemetry = config
.telemetry_endpoints
.clone()
Expand Down
24 changes: 24 additions & 0 deletions primitives/src/v0.rs
Original file line number Diff line number Diff line change
Expand Up @@ -923,6 +923,30 @@ pub mod fisherman {
}
}

// TODO: add feature gate.
/// Runtime metric operations.
#[derive(Debug, PartialEq, Eq, Clone, Encode, Decode, TypeInfo)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think all of these are needed.

Copy link
Contributor

@drahnr drahnr Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v0 is deprecated and will be removed soon™, please move these to v1 - make sure to at the very least guard Debug

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Got it.

pub enum RuntimeMetricOp {
/// Increment metric by value.
Increment(u64),
}

/// Runtime metric update event.
#[derive(Debug, PartialEq, Eq, Clone, Encode, Decode, TypeInfo)]
pub struct RuntimeMetricUpdate {
/// The name of the metric.
pub metric_name: Vec<u8>,
/// The operation applied to the metric.
pub op: RuntimeMetricOp,
}

impl RuntimeMetricUpdate {
/// Returns the metric name.
pub fn metric_name(&self) -> &str {
unsafe { sp_std::str::from_utf8_unchecked(&self.metric_name) }
Copy link
Contributor

@drahnr drahnr Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a fan of this. Can't we use String::parse_from_utf8_lossy in the "std" case? That's where we need it right?

}
}

#[cfg(test)]
mod tests {
use super::*;
Expand Down
1 change: 1 addition & 0 deletions runtime/kusama/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -264,3 +264,4 @@ disable-runtime-api = []
on-chain-release-build = [
"sp-api/disable-logging",
]
with-tracing = ["frame-executive/with-tracing", "sp-io/with-tracing"]
2 changes: 2 additions & 0 deletions runtime/parachains/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ sp-session = { git = "https://github.com/paritytech/substrate", branch = "master
sp-staking = { git = "https://github.com/paritytech/substrate", branch = "master", default-features = false }
sp-core = { git = "https://github.com/paritytech/substrate", branch = "master", default-features = false }
sp-keystore = { git = "https://github.com/paritytech/substrate", branch = "master", optional = true }
sp-tracing = { version = "4.0.0-dev", branch = "master", git = "https://github.com/paritytech/substrate", default-features = false }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These dependencies should be optional = true


pallet-authority-discovery = { git = "https://github.com/paritytech/substrate", branch = "master", default-features = false }
pallet-authorship = { git = "https://github.com/paritytech/substrate", branch = "master", default-features = false }
Expand Down Expand Up @@ -98,3 +99,4 @@ try-runtime = [
"pallet-timestamp/try-runtime",
"pallet-vesting/try-runtime",
]
with-tracing = ["sp-tracing/with-tracing"]
Loading