Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Publish richer virtual disk statistics to oximeter #746

Merged
merged 7 commits into from
Aug 27, 2024

Conversation

bnaecker
Copy link
Contributor

  • Use the virtual_disk:* timeseries defined in Omicron to publish statistics about the block operations performed on virtual disks.
  • Add concept of a completion callback to the existing block tracking object, and expose a way to set the callback on a block::Device.
  • Create object for managing all the statistics for one virtual disk backed by Crucible. This is shared between the existing oximeter producer, and the block devices, via the above completion callback mechanism. As I/Os complete, the stats are updated, and oximeter collects them periodically.
  • Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting unwieldy and confusing.
  • Use virtual_machine:reset definition from TOML
  • Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup. This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics, such as Viona link stats.
  • Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all.
  • Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.

- Use the `virtual_disk:*` timeseries defined in Omicron to publish
  statistics about the block operations performed on virtual disks.
- Add concept of a completion callback to the existing block tracking
  object, and expose a way to set the callback on a `block::Device`.
- Create object for managing all the statistics for one virtual disk
  backed by Crucible. This is shared between the existing oximeter
  producer, and the block devices, via the above completion callback
  mechanism. As I/Os complete, the stats are updated, and oximeter
  collects them periodically.
- Straighten out a bunch of the initialiazation logic for the
  pre-existing metrics, producer server, and registry. This grew pretty
  organically, and was getting unwieldy and confusing.
- Use `virtual_machine:reset` definition from TOML
- Move kstat sampler creation to machine initialization process, rather
  than buried in the producer server startup. This lets us use the
  sampler in-line when initializing the vCPUs, where it should be much
  easier to re-use when we want to add more kstat-based metrics, such as
  Viona link stats.
- Simplify and cleanup the `ServerStats` types, and remove the kstat
  sampler from it, which never should have been there at all.
- Bump kstat-rs dep, which includes the new version that is
  cross-platform. This gets rid of a bunch of cfg-soup in the stats
  modules.
@bnaecker
Copy link
Contributor Author

This still needs some testing, but it's a larger chunk than I intended so I'd like to get early eyes on it. I took this opportunity to rework a lot of the metrics initialization, which had become pretty Byzantine.

The main point here is to publish richer virtual disk metrics than what we currently get from Crucible. This adds the same counters we had before, and also failure counters and histograms tracking I/O latency and I/O sizes. These use the same block device tracking machinery that was added to handle USDT probes, adding a completion callback to that struct. We now set the completion callback to update the statistics for each virtual disk. Those stats are shared with the oximeter producer registry, so they're published along with everything else.

The extras that came along for the ride here are a mishmash of cleanup tasks, mostly listed in the commit message. This PR also relies on a forthcoming PR in Omicron that adds the schema for the new statistics. I'll link that once I post it, and follow up with testing notes as well.

@bnaecker
Copy link
Contributor Author

Talked offline with @ahl. He questions the value of putting the sled identifiers on this kind of timeseries at all, which I can understand. I might have been overzealous in my attempt to address oxidecomputer/omicron#5267.

@ahl
Copy link
Contributor

ahl commented Aug 22, 2024

To clarify: I honor the utility (ie to us in support in some situations), but I think it complicated the authz model we've been discussing.

@bnaecker
Copy link
Contributor Author

Ok, I've done an initial round of testing. The Omicron side is still blocked on oxidecomputer/omicron#6409 and this WIP branch: https://github.com/oxidecomputer/omicron/tree/new-virtual-disk-timeseries-definition. But incorporating both of those locally and creating a VM does result in the expected timeseries in ClickHouse.

Here are all of them:

0x〉\l
oximeter_collector:collections
virtual_disk:bytes_read
virtual_disk:bytes_written
virtual_disk:failed_flushes
virtual_disk:failed_reads
virtual_disk:failed_writes
virtual_disk:flushes
virtual_disk:io_latency
virtual_disk:io_size
virtual_disk:reads
virtual_disk:writes
virtual_machine:pv_panic_guest_handled
virtual_machine:pv_panic_host_handled
virtual_machine:reset
virtual_machine:vcpu_usage

Digging into individual things looks like I expect:

0x〉get virtual_disk:reads | last 1

virtual_disk:reads

 attached_instance_id: 6d39bc42-74d3-4f34-a035-6a068a4e4da3
 block_size: 4096
 disk_id: d82074ff-7098-483f-90f3-d7fca35baba6
 project_id: d50c3c4d-0008-45a3-a285-3b9c8ef84d38
 silo_id: 7fec5b38-3c43-4ed1-ab05-5152401c46a8
 sled_id: faf169b9-bf53-4a53-8f52-68c94dc377bd
 sled_model: fake-gimlet
 sled_revision: 1
 sled_serial: fake-serial
   [2024-08-22 21:44:10.723454479, 2024-08-22 21:54:40.861305986]: [109]
0x〉get virtual_disk:io_size | filter io_kind == "read" | last 1

virtual_disk:io_size

 attached_instance_id: 6d39bc42-74d3-4f34-a035-6a068a4e4da3
 block_size: 4096
 disk_id: d82074ff-7098-483f-90f3-d7fca35baba6
 io_kind: read
 project_id: d50c3c4d-0008-45a3-a285-3b9c8ef84d38
 silo_id: 7fec5b38-3c43-4ed1-ab05-5152401c46a8
 sled_id: faf169b9-bf53-4a53-8f52-68c94dc377bd
 sled_model: fake-gimlet
 sled_revision: 1
 sled_serial: fake-serial
   [2024-08-22 21:44:10.723879738, 2024-08-22 21:55:10.862958317]: [0: 105, 4096: 1, 8192: 3, 16384: 0, 32768: 0, 65536: 0, 131072: 0, 262144: 0, 524288: 0, 1048576: 0, 2097152: 0, 4194304: 0, 8388608: 0, 16777216: 0, 33554432: 0, 67108864: 0, 134217728: 0, 268435456: 0, 536870912: 0, 1073741824: 0, min: 512, max: 8192, mean: 1211.8899082568807, std_dev: 1407.1530457991503, p50: 748.4546500654893, p90: 2197.500566706399, p99: 6740.062811662923]

So that shows we had 109 total reads, and the I/O size histogram shows that those were broken out into 105 in the range [0, 4KiB), 1 in [4KiB, 8KiB), and 3 in [8KiB, 16KiB). So there are the same number of items in both, meaning we're consistent at least.

- Use 512B as the lowest I/O size histogram bin
- Update Crucible dep so we can test in CI
- Fixups for some cfg directives around kstats
Cargo.toml Outdated Show resolved Hide resolved
Cargo.toml Outdated Show resolved Hide resolved
@bnaecker
Copy link
Contributor Author

There's one kind of structural question my testing revealed. These virtual disk stats are updated all the time, but only sampled when oximeter asks for it. That means we get samples every 30s or 10s or whatever we pick. We could decouple these, by sampling the stats locally more frequently, and publishing more than one sample's worth of data to oximeter. That does require a different architecture though, since we'd need a small polling loop pushing data to a queue somewhere.

I think it largely depends on how densely we want to sample the data. I'd defer to @leftwo and @ahl on that probably. I reduced the collection interval to 10s, which still feels borderline, but I admit that's a WAG.

zeeshanlakhani added a commit that referenced this pull request Aug 26, 2024
This currently brings in all non virtual_disk stats/initializer/kstat changes from
@bnaecker's #746 since that abstracts where
and how we do start tracking kstats.

Includes:
- Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting
  unwieldy and confusing.
- Use virtual_machine:reset definition from TOML
- Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup.
  This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics,
  such as Viona link stats.
- Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all.
- Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.
zeeshanlakhani added a commit that referenced this pull request Aug 26, 2024
This currently brings in all non virtual_disk stats/initializer/kstat changes from
@bnaecker's #746 since that abstracts where
and how we do start tracking kstats.

Includes:
- Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting
  unwieldy and confusing.
- Use virtual_machine:reset definition from TOML
- Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup.
  This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics,
  such as Viona link stats.
- Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all.
- Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.
zeeshanlakhani added a commit that referenced this pull request Aug 26, 2024
This currently brings in all non virtual_disk stats/initializer/kstat changes from
@bnaecker's #746 since that abstracts where
and how we do start tracking kstats.

Includes:
- Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting
  unwieldy and confusing.
- Use virtual_machine:reset definition from TOML
- Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup.
  This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics,
  such as Viona link stats.
- Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all.
- Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.
zeeshanlakhani added a commit that referenced this pull request Aug 26, 2024
This currently brings in all non virtual_disk stats/initializer/kstat changes from
@bnaecker's #746 since that abstracts where
and how we do start tracking kstats.

Includes:
- Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting
  unwieldy and confusing.
- Use virtual_machine:reset definition from TOML
- Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup.
  This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics,
  such as Viona link stats.
- Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all.
- Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.
zeeshanlakhani added a commit that referenced this pull request Aug 26, 2024
This currently brings in all non virtual_disk stats/initializer/kstat changes from
@bnaecker's #746 since that abstracts where
and how we do start tracking kstats.

Includes:
- Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting
  unwieldy and confusing.
- Use virtual_machine:reset definition from TOML
- Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup.
  This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics,
  such as Viona link stats.
- Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all.
- Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.
zeeshanlakhani added a commit that referenced this pull request Aug 26, 2024
This currently brings in all non virtual_disk stats/initializer/kstat changes from
@bnaecker's #746 since that abstracts where
and how we do start tracking kstats.

Includes:
- Straighten out a bunch of the initialiazation logic for the pre-existing metrics, producer server, and registry. This grew pretty organically, and was getting
  unwieldy and confusing.
- Use virtual_machine:reset definition from TOML
- Move kstat sampler creation to machine initialization process, rather than buried in the producer server startup.
  This lets us use the sampler in-line when initializing the vCPUs, where it should be much easier to re-use when we want to add more kstat-based metrics,
  such as Viona link stats.
- Simplify and cleanup the ServerStats types, and remove the kstat sampler from it, which never should have been there at all.
- Bump kstat-rs dep, which includes the new version that is cross-platform. This gets rid of a bunch of cfg-soup in the stats modules.
bin/propolis-server/src/lib/initializer.rs Outdated Show resolved Hide resolved
bin/propolis-server/src/lib/stats.rs Show resolved Hide resolved
bin/propolis-server/src/lib/initializer.rs Outdated Show resolved Hide resolved
bin/propolis-server/src/lib/vm/ensure.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@leftwo leftwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this leave in place the statistics that the crucible upstairs logs as well?

@bnaecker
Copy link
Contributor Author

Does this leave in place the statistics that the crucible upstairs logs as well?

Yeah, the existing crucible_upstairs:* stats are still there. Those are currently used for the web console's view of instance stats. I think we should switch that over to these data, and then we can separately decide to remove the original data if we want.

- Clarify when we create kstat sampler
- Style nits and comment cleanup
- Don't return early from blockdev initialization if we fail to track
  stats, just continue instead
lib/propolis/src/block/tracking.rs Outdated Show resolved Hide resolved
lib/propolis/src/block/mod.rs Outdated Show resolved Hide resolved
lib/propolis/src/block/tracking.rs Outdated Show resolved Hide resolved
- Remove default impl of `set_completion_callback()`
- Cleanup internals of `Tracking` creation
@bnaecker bnaecker requested a review from pfmooney August 26, 2024 21:17
@bnaecker
Copy link
Contributor Author

Just a note on the sampling interval. I talked in chat with @leftwo, and we decided that the current sampling rate of 10s is sufficient. In this PR, the sampling rate of these statistics is defined by the rate at which oximeter polls -- Propolis only generates sample on a request from oximeter. If we wanted to change this, we have two choices. We could lower that polling rate from oximeter, or we could decouple the rate at which we internally generate samples from the block-dev producer and the rate at which oximeter asks Propolis for samples. That's an architectural change, since we'd need a task that continually generates samples from the block-dev. It's also fairly easy to do, and not necessary right now. We can decide that later without any breaking changes downstream.

@bnaecker bnaecker merged commit 88fbde7 into master Aug 27, 2024
11 checks passed
@bnaecker bnaecker deleted the add-virtual-disk-stats branch August 27, 2024 17:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants