Publish richer virtual disk statistics to oximeter #746

bnaecker · 2024-08-22T05:54:41Z

Use the virtual_disk:* timeseries defined in Omicron to publish statistics about the block operations performed on virtual disks.
Add concept of a completion callback to the existing block tracking object, and expose a way to set the callback on a block::Device.
Create object for managing all the statistics for one virtual disk backed by Crucible. This is shared between the existing oximeter producer, and the block devices, via the above completion callback mechanism. As I/Os complete, the stats are updated, and oximeter collects them periodically.
Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting unwieldy and confusing.
Use virtual_machine:reset definition from TOML
Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup. This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics, such as Viona link stats.
Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all.
Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.

- Use the `virtual_disk:*` timeseries defined in Omicron to publish statistics about the block operations performed on virtual disks. - Add concept of a completion callback to the existing block tracking object, and expose a way to set the callback on a `block::Device`. - Create object for managing all the statistics for one virtual disk backed by Crucible. This is shared between the existing oximeter producer, and the block devices, via the above completion callback mechanism. As I/Os complete, the stats are updated, and oximeter collects them periodically. - Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting unwieldy and confusing. - Use `virtual_machine:reset` definition from TOML - Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup. This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics, such as Viona link stats. - Simplify and cleanup the `ServerStats` types, and remove the kstat sampler from it, which never should have been there at all. - Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.

bnaecker · 2024-08-22T06:00:47Z

This still needs some testing, but it's a larger chunk than I intended so I'd like to get early eyes on it. I took this opportunity to rework a lot of the metrics initialization, which had become pretty Byzantine.

The main point here is to publish richer virtual disk metrics than what we currently get from Crucible. This adds the same counters we had before, and also failure counters and histograms tracking I/O latency and I/O sizes. These use the same block device tracking machinery that was added to handle USDT probes, adding a completion callback to that struct. We now set the completion callback to update the statistics for each virtual disk. Those stats are shared with the oximeter producer registry, so they're published along with everything else.

The extras that came along for the ride here are a mishmash of cleanup tasks, mostly listed in the commit message. This PR also relies on a forthcoming PR in Omicron that adds the schema for the new statistics. I'll link that once I post it, and follow up with testing notes as well.

bin/propolis-server/src/lib/stats/virtual_disk.rs

lib/propolis/src/block/crucible.rs

bnaecker · 2024-08-22T20:35:56Z

Talked offline with @ahl. He questions the value of putting the sled identifiers on this kind of timeseries at all, which I can understand. I might have been overzealous in my attempt to address oxidecomputer/omicron#5267.

ahl · 2024-08-22T20:37:44Z

To clarify: I honor the utility (ie to us in support in some situations), but I think it complicated the authz model we've been discussing.

bnaecker · 2024-08-22T21:59:06Z

Ok, I've done an initial round of testing. The Omicron side is still blocked on oxidecomputer/omicron#6409 and this WIP branch: https://github.com/oxidecomputer/omicron/tree/new-virtual-disk-timeseries-definition. But incorporating both of those locally and creating a VM does result in the expected timeseries in ClickHouse.

Here are all of them:

0x〉\l
oximeter_collector:collections
virtual_disk:bytes_read
virtual_disk:bytes_written
virtual_disk:failed_flushes
virtual_disk:failed_reads
virtual_disk:failed_writes
virtual_disk:flushes
virtual_disk:io_latency
virtual_disk:io_size
virtual_disk:reads
virtual_disk:writes
virtual_machine:pv_panic_guest_handled
virtual_machine:pv_panic_host_handled
virtual_machine:reset
virtual_machine:vcpu_usage

Digging into individual things looks like I expect:

0x〉get virtual_disk:reads | last 1

virtual_disk:reads

 attached_instance_id: 6d39bc42-74d3-4f34-a035-6a068a4e4da3
 block_size: 4096
 disk_id: d82074ff-7098-483f-90f3-d7fca35baba6
 project_id: d50c3c4d-0008-45a3-a285-3b9c8ef84d38
 silo_id: 7fec5b38-3c43-4ed1-ab05-5152401c46a8
 sled_id: faf169b9-bf53-4a53-8f52-68c94dc377bd
 sled_model: fake-gimlet
 sled_revision: 1
 sled_serial: fake-serial
   [2024-08-22 21:44:10.723454479, 2024-08-22 21:54:40.861305986]: [109]
0x〉get virtual_disk:io_size | filter io_kind == "read" | last 1

virtual_disk:io_size

 attached_instance_id: 6d39bc42-74d3-4f34-a035-6a068a4e4da3
 block_size: 4096
 disk_id: d82074ff-7098-483f-90f3-d7fca35baba6
 io_kind: read
 project_id: d50c3c4d-0008-45a3-a285-3b9c8ef84d38
 silo_id: 7fec5b38-3c43-4ed1-ab05-5152401c46a8
 sled_id: faf169b9-bf53-4a53-8f52-68c94dc377bd
 sled_model: fake-gimlet
 sled_revision: 1
 sled_serial: fake-serial
   [2024-08-22 21:44:10.723879738, 2024-08-22 21:55:10.862958317]: [0: 105, 4096: 1, 8192: 3, 16384: 0, 32768: 0, 65536: 0, 131072: 0, 262144: 0, 524288: 0, 1048576: 0, 2097152: 0, 4194304: 0, 8388608: 0, 16777216: 0, 33554432: 0, 67108864: 0, 134217728: 0, 268435456: 0, 536870912: 0, 1073741824: 0, min: 512, max: 8192, mean: 1211.8899082568807, std_dev: 1407.1530457991503, p50: 748.4546500654893, p90: 2197.500566706399, p99: 6740.062811662923]

So that shows we had 109 total reads, and the I/O size histogram shows that those were broken out into 105 in the range [0, 4KiB), 1 in [4KiB, 8KiB), and 3 in [8KiB, 16KiB). So there are the same number of items in both, meaning we're consistent at least.

- Use 512B as the lowest I/O size histogram bin - Update Crucible dep so we can test in CI - Fixups for some cfg directives around kstats

Cargo.toml

bnaecker · 2024-08-24T03:12:22Z

There's one kind of structural question my testing revealed. These virtual disk stats are updated all the time, but only sampled when oximeter asks for it. That means we get samples every 30s or 10s or whatever we pick. We could decouple these, by sampling the stats locally more frequently, and publishing more than one sample's worth of data to oximeter. That does require a different architecture though, since we'd need a small polling loop pushing data to a queue somewhere.

I think it largely depends on how densely we want to sample the data. I'd defer to @leftwo and @ahl on that probably. I reduced the collection interval to 10s, which still feels borderline, but I admit that's a WAG.

@bnaecker

This currently brings in all non virtual_disk stats/initializer/kstat changes from @bnaecker's #746 since that abstracts where and how we do start tracking kstats. Includes: - Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting unwieldy and confusing. - Use virtual_machine:reset definition from TOML - Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup. This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics, such as Viona link stats. - Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all. - Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.

@bnaecker

This currently brings in all non virtual_disk stats/initializer/kstat changes from @bnaecker's #746 since that abstracts where and how we do start tracking kstats. Includes: - Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting unwieldy and confusing. - Use virtual_machine:reset definition from TOML - Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup. This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics, such as Viona link stats. - Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all. - Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.

@bnaecker

This currently brings in all non virtual_disk stats/initializer/kstat changes from @bnaecker's #746 since that abstracts where and how we do start tracking kstats. Includes: - Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting unwieldy and confusing. - Use virtual_machine:reset definition from TOML - Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup. This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics, such as Viona link stats. - Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all. - Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.

@bnaecker

This currently brings in all non virtual_disk stats/initializer/kstat changes from @bnaecker's #746 since that abstracts where and how we do start tracking kstats. Includes: - Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting unwieldy and confusing. - Use virtual_machine:reset definition from TOML - Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup. This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics, such as Viona link stats. - Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all. - Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.

@bnaecker

This currently brings in all non virtual_disk stats/initializer/kstat changes from @bnaecker's #746 since that abstracts where and how we do start tracking kstats. Includes: - Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting unwieldy and confusing. - Use virtual_machine:reset definition from TOML - Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup. This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics, such as Viona link stats. - Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all. - Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.

@bnaecker

This currently brings in all non virtual_disk stats/initializer/kstat changes from @bnaecker's #746 since that abstracts where and how we do start tracking kstats. Includes: - Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting unwieldy and confusing. - Use virtual_machine:reset definition from TOML - Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup. This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics, such as Viona link stats. - Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all. - Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.

bin/propolis-server/src/lib/initializer.rs

bin/propolis-server/src/lib/stats.rs

bin/propolis-server/src/lib/initializer.rs

bin/propolis-server/src/lib/vm/ensure.rs

bin/propolis-server/src/lib/initializer.rs

leftwo

Does this leave in place the statistics that the crucible upstairs logs as well?

bnaecker · 2024-08-26T17:03:24Z

Does this leave in place the statistics that the crucible upstairs logs as well?

Yeah, the existing crucible_upstairs:* stats are still there. Those are currently used for the web console's view of instance stats. I think we should switch that over to these data, and then we can separately decide to remove the original data if we want.

- Clarify when we create kstat sampler - Style nits and comment cleanup - Don't return early from blockdev initialization if we fail to track stats, just continue instead

lib/propolis/src/block/tracking.rs

lib/propolis/src/block/mod.rs

lib/propolis/src/block/tracking.rs

- Remove default impl of `set_completion_callback()` - Cleanup internals of `Tracking` creation

bnaecker · 2024-08-26T22:52:03Z

Just a note on the sampling interval. I talked in chat with @leftwo, and we decided that the current sampling rate of 10s is sufficient. In this PR, the sampling rate of these statistics is defined by the rate at which oximeter polls -- Propolis only generates sample on a request from oximeter. If we wanted to change this, we have two choices. We could lower that polling rate from oximeter, or we could decouple the rate at which we internally generate samples from the block-dev producer and the rate at which oximeter asks Propolis for samples. That's an architectural change, since we'd need a task that continually generates samples from the block-dev. It's also fairly easy to do, and not necessary right now. We can decide that later without any breaking changes downstream.

bnaecker requested review from pfmooney and gjcolombo August 22, 2024 05:54

leftwo reviewed Aug 22, 2024

View reviewed changes

bin/propolis-server/src/lib/stats/virtual_disk.rs Outdated Show resolved Hide resolved

leftwo reviewed Aug 22, 2024

View reviewed changes

lib/propolis/src/block/crucible.rs Show resolved Hide resolved

Shorten collection interval, since disk stats are only polled

3ce6ca7

bnaecker mentioned this pull request Aug 23, 2024

Add virtual disk timeseries schema oxidecomputer/omicron#6420

Merged

Small fixes

316feb2

- Use 512B as the lowest I/O size histogram bin - Update Crucible dep so we can test in CI - Fixups for some cfg directives around kstats

bnaecker commented Aug 23, 2024

View reviewed changes

Cargo.toml Outdated Show resolved Hide resolved

bnaecker commented Aug 23, 2024

View reviewed changes

Cargo.toml Outdated Show resolved Hide resolved

bnaecker mentioned this pull request Aug 23, 2024

[network metrics] instance network interface metrics #747

Merged

3 tasks

gjcolombo reviewed Aug 26, 2024

View reviewed changes

leftwo reviewed Aug 26, 2024

View reviewed changes

bin/propolis-server/src/lib/initializer.rs Show resolved Hide resolved

leftwo reviewed Aug 26, 2024

View reviewed changes

Review feedback

15327f2

- Clarify when we create kstat sampler - Style nits and comment cleanup - Don't return early from blockdev initialization if we fail to track stats, just continue instead

bnaecker requested review from leftwo and gjcolombo August 26, 2024 17:08

Rename oximeter collection interval for clarity

1ecf959

gjcolombo approved these changes Aug 26, 2024

View reviewed changes

pfmooney reviewed Aug 26, 2024

View reviewed changes

lib/propolis/src/block/tracking.rs Outdated Show resolved Hide resolved

lib/propolis/src/block/mod.rs Outdated Show resolved Hide resolved

lib/propolis/src/block/tracking.rs Outdated Show resolved Hide resolved

Review feedback

8565b0c

- Remove default impl of `set_completion_callback()` - Cleanup internals of `Tracking` creation

bnaecker requested a review from pfmooney August 26, 2024 21:17

pfmooney approved these changes Aug 26, 2024

View reviewed changes

leftwo approved these changes Aug 26, 2024

View reviewed changes

Point back to Omicron / Crucible main branches

3382496

bnaecker merged commit 88fbde7 into master Aug 27, 2024
11 checks passed

bnaecker deleted the add-virtual-disk-stats branch August 27, 2024 17:29

gjcolombo mentioned this pull request Aug 27, 2024

parse_disk_from_request mishandles backend names #750

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Publish richer virtual disk statistics to oximeter #746

Publish richer virtual disk statistics to oximeter #746

bnaecker commented Aug 22, 2024

bnaecker commented Aug 22, 2024

bnaecker commented Aug 22, 2024

ahl commented Aug 22, 2024

bnaecker commented Aug 22, 2024

bnaecker commented Aug 24, 2024

leftwo left a comment

bnaecker commented Aug 26, 2024

bnaecker commented Aug 26, 2024

Publish richer virtual disk statistics to oximeter #746

Publish richer virtual disk statistics to oximeter #746

Conversation

bnaecker commented Aug 22, 2024

bnaecker commented Aug 22, 2024

bnaecker commented Aug 22, 2024

ahl commented Aug 22, 2024

bnaecker commented Aug 22, 2024

bnaecker commented Aug 24, 2024

leftwo left a comment

Choose a reason for hiding this comment

bnaecker commented Aug 26, 2024

bnaecker commented Aug 26, 2024