Elaboration of the blob storage system in Chrome.
Please see the FileAPI Spec for the full specification for Blobs, or Mozilla's Blob documentation for a description of how Blobs are used in the Web Platform in general. For the purposes of this document, the important aspects of blobs are:
- Blobs are immutable.
- Blob can be made using one or more of: bytes, files, or other blobs.
- Blobs can be 'sliced', which creates a blob that is a subsection of another blob.
- Reading blobs is asynchronous.
- Reading blob metadata (like size) is synchronous.
- Blobs can be passed to other browsing contexts, such as Javascript workers or other tabs.
In Chrome, after blob creation the actual blob 'data' gets transported to and lives in the browser process. The renderer just holds a reference - a mojom BlobPtr (and for now a string UUID) - to the blob, which it can use to read the blob or pass it to other processes.
Blobs are created in a renderer process, where their data is temporarily held for the browser (while Javascript execution can continue). When the browser has enough memory quota for the blob, it requests the data from the renderer. All blob data is transported from the renderer to the browser. Once complete, any pending reads for the blob are allowed to complete. Blobs can be huge (GBs), so quota is necessary.
If the in-memory space for blobs is getting full, or a new blob is too large to be in-memory, then the blob system uses the disk. This can either be paging old blobs to disk, or saving the new too-large blob straight to disk.
Blob reading goes through the mojom Blob interface,
where the renderer or browser calls the ReadAll
or ReadRange
methods to read
the blob through a data pipe. This is implemented in the browser process in the
MojoBlobReader
class.
General Chrome terminology:
- Renderer, Browser, and IPCs: See the Multi-Process Architecture document to learn about these concepts.
- Shared Memory: Memory that both the browser and renderer process can read & write. Created only between 2 processes.
Blob system terminology:
- Blob: This is a blob object, which can consist of bytes or files, as described above.
- BlobDataItem: This is a primitive element that can basically be a File, Bytes, or another Blob. It also stores an offset and size, so this can be a part of a file. (This can also represent a "future" file and "future" bytes, which is used to signify a bytes or file item that has not been transported yet).
- dependent blobs: These are blobs that our blob is dependent on to be constructed. As in, a blob is constructed with a dependency on another blob (maybe it is a slice or just a blob in our constructor), and before the new blob can be constructed it might need to wait for the "dependent" blobs to complete. (This can sound backwards, but it's how it's referenced in the code. So think "I am dependent on these other blobs")
- transport strategy: a method for sending the data in a BlobItem from a renderer to the browser. The system currently implements three strategies: Reply, Data Pipe, and Files.
- blob description: the inital data sychronously sent to the browser that describes the items (content and sizes) of the new blob. This can optimistically include the blob data if the size is less than the maximum mojo message size.
We calculate the storage limits here.
In-Memory Storage Limit
- If the architecture is x64 and NOT Chrome OS or Android:
2GB
- If Chrome OS:
total_physical_memory / 5
- If Android:
total_physical_memory / 100
Disk Storage Limit
- If Chrome OS:
disk_size / 2
- If Android:
6 * disk_size / 100
- Else:
disk_size / 10
Note: Chrome OS's disk is part of the user partition, which is separate from the system partition.
Minimum Disk Availability
We limit our disk limit to accomidate a minimum disk availability. The equation we use is:
min_disk_availability = in_memory_limit * 2
Device | Ram | In-Memory Limit | Disk | Disk Limit | Min Disk Availability |
---|---|---|---|---|---|
Cast | 512 MB | 102 MB | 0 | 0 | 0 |
Android Minimal | 512 MB | 5 MB | 8 GB | 491 MB | 10 MB |
Android Fat | 2 GB | 20 MB | 32 GB | 1.9 GB | 40 MB |
CrOS | 2 GB | 409 MB | 8 GB | 4 GB | 0.8 GB |
Desktop 32 | 3 GB | 614 MB | 500 GB | 50 GB | 1.2 GB |
Desktop 64 | 4 GB | 2 GB | 500 GB | 50 GB | 4 GB |
Creating a lot of blobs, especially if they are very large blobs, can cause the renderer memory to grow too fast and result in an OOM on the renderer side. This is because the renderer temporarily stores the blob data while it waits for the browser to request it. Meanwhile, Javascript can continue executing. Transfering the data can take a lot of time if the blob is large enough to save it directly to a file, as this means we need to wait for disk operations before the renderer can get rid of the data.
If the blob object in Javascript is kept around, then the data will never be cleaned up in the backend. This will unnecessarily use memory, so make sure to dereference blob objects if they are no longer needed.
Similarily if a URL is created for a blob, this will keep the blob data around until the URL is revoked (and the blob object is dereferenced). However, the URL is automatically revoked when the browser context that created it is destroyed.
The primary API to interact with the blob system is through its mojo interface. This is how the renderer process interacts with the blob systems and creates and transports blobs, but also how other subsystems in the browser process interact with the blob system, for example to read blobs they received.
New blobs are created through the BlobRegistry
mojo interface. In blink you can get a reference to this interface via
blink::BlobDataHandle::GetBlobRegistry()
. This interface has two methods to
create a new blob. The Register
method takes a blob description in the form of an array of
DataElement
s, while the RegisterFromStream
method creates a blob by reading
data from a mojo DataPipe
. Furthermore Register
will call its callback as
soon as possible after the request has been received, at which point the uuid is
valid and known to the blob system. It will then asynchronously request the data
and actually create the blob. On the other hand the RegisterFromStream
method
won't call its callback until all the data for the blob has been received and
the blob has been entirely completed.
To read the data for a blob, the Blob
mojom interface provides ReadAll
,
ReadRange
and ReadSideData
methods. These methods will wait until the blob
has finished building before they start reading data, and if for whatever reason
the blob failed to build or reading data failed, will report back an error
through the (optional) BlobReaderClient
.
Within blink creating blobs is done through the BlobData
and BlobDataHandle
classes. The BlobData
class can be seen as a builder for an array of mojom
DataElement
s. While doing so it also tries to consolidate all adjacent
memory blob items into one. This is done since blobs are often constructed with
arrays with single bytes.
The implementation tries to avoid doing any copying or allocating of new memory
buffers. Instead it facilitates the transformation between the 'consolidated'
blob items and the underlying bytes items. This way we don't waste any memory.
After the blob has been 'consolidated' and its data has been assembled in a
BlobData
object, it is passed to the blink::BlobDataHandle
constructor. This
then passes the consolidated data to the mojo BlobRegistry.Register
method.
Any DataElementByte
elements in the blob description will have an associated
BytesProvider
, as implemented by the blink::BlobBytesProvider
class. This
class is owned by the mojo message pipe it is bound to, and is what the browser
uses to request data for the blob when quota for it becomes available. Depending
on the transport strategy chosen by the browser one of the Request*
methods
on this interface will be called (or if the blob goes out of scope before the
data has been requested, the BytesProvider
pipe is simply dropped, destroying
the BlobBytesProvider
instance and the data it owned.
BlobBytesProvider
instances also try to keep the renderer alive while we are
sending blobs, as if the renderer is closed then we would lose any pending blob
data. It does this by calling blink::Platform::SuddenTerminationChanged
.
In blink, in addition to going through the mojo Blob
interface as exposed
through blink::Blob::GetBlobDataHandle
, you can also use FileReaderLoader
as an abstraction around the mojo interface. This class for example can convert
the resulting bytes to a String
or ArrayBuffer
, and generally just wraps the
mojo DataPipe
functionality in an easier to use interface.
Generally even in the browser process it should be preferred to go through the
mojo Blob
interface to interact with blobs. This results in a cleaner
separation between the blob system and the rest of chrome. However in some cases
it might still be needed to directly interact with the guts of the blob system,
so for now it is at least possible to interact with the blob system more
directly.
But keep in mind that everything in this section is really for legacy code only. New code should strongly prefer to use the mojo interfaces described above.
Blob interaction in C++ should go through the BlobStorageContext
. Blobs are
built using a BlobDataBuilder
to populate the data and then calling
BlobStorageContext::AddFinishedBlob
or ::BuildBlob
. This returns a
BlobDataHandle
, which manages reading, lifetime, and metadata access for the
new blob.
If you have known data that is not available yet, you can still create the blob
reference, but see the documentation in BlobDataBuilder::AppendFuture* or ::Populate*
methods on the builder, the callback usage on
BlobStorageContext::BuildBlob
, and
BlobStorageContext::NotifyTransportComplete
to facilitate this construction.
All blob information should come from the BlobDataHandle
returned on
construction. This handle is cheap to copy. Once all instances of handles for
a blob are destructed, the blob is destroyed.
BlobDataHandle::RunOnConstructionComplete
will notify you when the blob is
constructed or broken (construction failed due to not enough space, filesystem
error, etc).
The BlobReader
class is for reading blobs, and is accessible off of the
BlobDataHandle
at any time.
The browser side is a little more complicated than the renderer side. We are thinking about:
- Do we have enough space for this blob?
- Pick transportation strategy for blob's components.
- Is there enough free memory to transport the blob right now? Or does older blob data to be paged to disk first?
- Do I need to wait for files to be created?
- Do I need to wait for dependent blobs?
We follow this general flow for constructing a blob on the browser side:
- Does the blob fit, and what transportation strategy should be used.
- Create our browser-side representation of the blob data, including the data items from dependent blobs. We try to share items as much as possible to save memory, and allow for the dependent blob items to be not populated yet.
- Request memory and/or file quota from the BlobMemoryController, which manages our blob storage limits. Quota is necessary for both transportation and any copies we have to do from dependent blobs.
- If transporation quota is needed and when it is granted:
- Tell the
BlobRegistryImpl
and itsBlobUnderConstruction
instance to start asking for blob data given the earlier decision of strategy. * TheBlobTransportStrategy
populates the browser-side blob data item. - When transportation is done we notify the BlobStorageContext
- When transportation is done, copy quota is granted, and dependent blobs are complete, we finish the blob.
- We perform any pending copies from dependent blobs
- We notify any listeners that the blob has been completed.
Note: The transportation sections (steps 1, 2, 3) of this process are described (without accounting for blob dependencies) with diagrams and details in this presentation.
The BlobUnderConstruction
(inside BlobRegistryImpl
) is in charge of the
actual construction of a blob and manages the transportation of the data from
the renderer to the browser. When the initial description of the blob is
sent to the browser, the BlobUnderConstruction asks the BlobMemoryController which
strategy (IPC, Shared Memory, or File) it should use to transport the file.
Based on this strategy it creates a BlobTransportStrategy
instance. That
instance will then translate the memory items sent from the renderer
into a browser represetation to facilitate the transportation. See this
slide, which illustrates how the browser might segment or split up the
renderer's memory into transportable chunks.
Once the transport host decides its strategy, it will create its own transport
state for the blob, including a BlobDataBuilder
using the transport's data
segment representation. Then it will tell the BlobStorageContext
that it is
ready to build the blob.
When the BlobStorageContext
tells the transport host that it is ready to
transport the blob data, the BlobTransportStrategy
requests all of the data
from the renderer, populates the data in the BlobDataBuilder
, and then signals
the storage context that it is done.
The BlobStorageContext
is the hub of the blob storage system. It is
responsible for creating & managing all the state of constructing blobs, as
well as all blob handle generation and general blob status access.
When a BlobDataBuilder
is given to the context, it will do the following:
- Find all dependent blobs in the new blob (any blob reference in the blob item list), and create a 'slice' of their items for the new blob.
- Create the final blob item list representation, which creates a new blob item list which inserts these 'slice' items into the blob reference spots. This is 'flattening' the blob.
- Ask the
BlobMemoryController
for file or memory quota for the transportation if necessary. - Ask the
BlobMemoryController
for memory quota for any copies necessary for blob slicing. - Adds completion callbacks to any blobs our blob depends on.
When all of the following conditions are met:
- The
BlobRegistry
tells us it has transported all the data (or we don't need to transport data), - The
BlobMemoryManager
approves our memory quota for slice copies (or we don't need slice copies), and - All dependent blobs are completed (or we don't have dependent blobs),
The blob can finish constructing, where any pending blob slice copies are performed, and we set the status of the blob.
The BlobStatus tracks the construction procedure (specifically the transport
process), and the copy memory quota and dependent blob process is encompassed
in PENDING_REFERENCED_BLOBS
.
Once a blob is finished constructing, the status is set to DONE
or any of
the ERR_*
values.
During construction, slices are created for dependent blobs using the given offset and size of the reference. This slice consists of the relevant blob items, and metadata about possible copies from either end. If blob items can entirely be used by the new blob, then we just share the item between the. But if there is a 'slice' of the first or last item, then BlobDataBuilder will create a new bytes item for the new blob, and store necessary copy data for later.
While a blob is build in BlobDataBuilder
a 'flat' representation of the new
blob is created, replacing all blob references with the actual elements those
blobs are made up off, possibly slicing them in the process. It also stores any
copy data from the slices.
The BlobMemoryController
is responsable for:
- Determining storage quota limits for files and memory, including restricting file quota when disk space is low.
- Determining whether a blob can fit and the transportation strategy to use.
- Tracking memory quota.
- Tracking file quota and creating files.
- Accumulating and evicting old blob data to files to disk.