Single file on disk format #128

wesleywiser · 2020-08-21T14:09:50Z

Right now, measureme records data into three files for each profiling session: one for event data, one for string data and one which maps string ids to positions in the string data file. We've often gotten requests to consolidate this down to a single file.

One idea I've been thinking about is to used fixed size pages to store the event data and string data:

--------------------------------------------------------------------
| event0 | event1 | eventN... |     stringN... | string1 | string0 |
--------------------------------------------------------------------

Events would be written from the beginning of the page to the end and strings would be written from the end of the page to the beginning. Once there is insufficient space in the page to write a new event or a new string, a new page would be created and the data would be written there. Of course, we'll want to know how many events there are and where the string data starts in the page, so we'll probably need a header block at the start of the page.

That takes care of two of the three data types but we still have the string index data. To handle that, we can have dedicated pages which just contain the string index data.

The resulting data file might look a bit like this:

----page0-----------------------------------------------------------
| header | is_string_index=1 | event_count=0 | string_start_idx=0
| idx0 | idx1 | idxN... |
--------------------------------------------------------------------

----page1-----------------------------------------------------------
| header | is_string_index=0 | event_count=n | string_start_idx=x
| event0 | event1 | eventN... |     stringN... | string1 | string0 |
--------------------------------------------------------------------

----pageN-----------------------------------------------------------
...
--------------------------------------------------------------------

On top of this, there are quite a few enhancements we can add:

String index pages can have a "forward pointer" to the next string index page so we don't have to scan every page linearly.
Header blocks in the pages give us some scratch space to record additional data or metadata
We can also change our approach to multi-threaded writing. For example, we can hand pages out per-thread which would then allow us to do concurrent writes without any locking since each thread is writing to different pages. Adding new pages could be done with an atomic add operation so no locks would be required anywhere.

cc @michaelwoerister

The text was updated successfully, but these errors were encountered:

Mark-Simulacrum · 2020-08-21T14:37:46Z

rustc-perf will need to be updated when changes are made here, but since it'll presumably be a breaking release anyway, that seems fine. We currently store the three files as a Snappy-compressed tarball on S3, but presumably this new format will compress approximately just as well. (I forget if I actually measured how much Snappy compression gave us on many crates, but IIRC on at least one sample case it was 2x).

michaelwoerister · 2020-08-28T13:20:52Z

I haven't thought about all the implications here in detail but I think that implementing #96 first might simplify this approach because it makes string index data just a special case of regular string data.

Depending on whether with have a few disambiguation bits to spare at the start of each event record, we might even be able to freely mix and match string and event data.

I'll think a bit more about this. I need to read up on the data format again.

michaelwoerister · 2020-08-29T16:35:59Z

OK, I spent a little time thinking about this. Splitting things into pages is a good idea, I had not thought of that. But it actually seems like a generalization of a simpler single-file approach that was proposed a few times in the past, namely keeping everything in memory until the end and then just concatenating the three "files" into a single file on disk. Splitting things into pages is similar -- i.e. it also concatenates the data, just in an interleaved fashion -- with the advantage that there's only ever a limited amount of stuff that needs to reside in memory.

This also makes me think that it's actually a bit redundant to have one kind of page that handles event and string data and another kind of page that handles string index data. We could instead just have three kinds of pages, one for events, one for string data, and one for string indices. I think that would make things relatively straightforward to implement.

An alternative approach to having just a single file would be to have the binary format for string data, string indices, and events all follow the same grammar, so that one can unambiguously parse the file. #96 is a first step towards that and would unify the format for string data and string indices. It looks like we could easily take a few bits from the RawEvent's thread id field and use those as an unambiguous tag for events. In essence, one would always be able to tell from the first byte if a given record is an event, a string data list, or a string index entry.

There are a couple of downsides to that approach though:

In order to build the string index table one has to parse the entire file once. Not sure if that is an actual performance issue, as I suppose that parsing can be done basically as quickly as loading the data from disk, and the paging approach also has to touch most of the file in order to find all string index pages.
In order to iterate over events in reverse order (which many of the tools do), one also has to parse the entire file and build an in-memory table that records the location of each event in the stream.
Events and string data all share the same address space now. Since currently string ids are often actual 30-bit addresses, things could get quite crowded for larger profiles, since 30 bits is only 1 gigabyte. The paging approach, in contrast, would allow us to have a separate address space for each kind of page.

I'm not quite sure which approach I like more. The paging approach might be more efficient but it also adds a whole new concept to the file format. The common grammar approach just extends an existing concept, but requires more up-front work in the post-processing tools.

michaelwoerister · 2020-09-01T07:04:43Z

Well, it turns out that the rustc_middle is too large already for the current encoding scheme, so the common-grammar approach described above (which would restrict the address space even more) is off the table, I guess.

@wesleywiser, do you mind if I take a stab at implementing the paging scheme? Are there any other efforts that might conflict with that?

wesleywiser · 2020-09-03T15:20:33Z

@michaelwoerister Sorry for the late reply!

@wesleywiser, do you mind if I take a stab at implementing the paging scheme? Are there any other efforts that might conflict with that?

Please feel free to! I'm not aware of anything in progress that would touch that part of the code.

michaelwoerister mentioned this issue Sep 7, 2020

[Experimental] Performance test new paging-based measureme data sink. rust-lang/rust#76436

Closed

michaelwoerister mentioned this issue Sep 18, 2020

Implement paged data serialization #132

Merged

6 tasks

wesleywiser closed this as completed in #132 Sep 26, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Single file on disk format #128

Single file on disk format #128

wesleywiser commented Aug 21, 2020

Mark-Simulacrum commented Aug 21, 2020

michaelwoerister commented Aug 28, 2020

michaelwoerister commented Aug 29, 2020

michaelwoerister commented Sep 1, 2020

wesleywiser commented Sep 3, 2020

Single file on disk format #128

Single file on disk format #128

Comments

wesleywiser commented Aug 21, 2020

Mark-Simulacrum commented Aug 21, 2020

michaelwoerister commented Aug 28, 2020

michaelwoerister commented Aug 29, 2020

michaelwoerister commented Sep 1, 2020

wesleywiser commented Sep 3, 2020