
Fix excessive memory usage when exporting projects with multiple video tasks #7374

Merged 2 commits on Jan 24, 2024

Commits on Jan 19, 2024

  1. Fix excessive memory usage when exporting projects with multiple video tasks
    
    Currently, the following situation causes the export worker to consume more
    memory than necessary:
    
    * A user exports a project with images included, and
    * The project contains multiple tasks with video chunks.
    
    The main reason for this is that the `image_maker_per_task` dictionary in
    `CVATProjectDataExtractor.__init__` indirectly references a distinct
    `FrameProvider` for each task, which, in turn, indirectly references an
    `av.containers.Container` object corresponding to the most recent chunk
    opened in that task.
    
    Initially, none of the chunks are opened, so the memory usage is low. But as
    Datumaro iterates over each frame, it eventually requests at least one image
    from each task of the project. This causes the corresponding `FrameProvider`
    to open a chunk file and keep a handle to it. The only way for a
    `FrameProvider` to close this chunk file is to open a new chunk when a frame
    from that chunk is requested. Therefore, each `FrameProvider` keeps at least
    one chunk open from the moment Datumaro requests the first frame from its
    task and _until the end of the export_.
    
    This manifests as the export worker's memory usage growing steadily as
    Datumaro moves from task to task. An open chunk consumes memory for the
    Python objects that represent it and, more importantly, for any C-level
    buffers allocated by FFmpeg, which can be significant. In my testing, the
    per-chunk memory was on the order of tens of megabytes. An open chunk also
    ties up a file descriptor.
    
    The fix for this is conceptually simple: ensure that only one
    `FrameProvider` object exists at a time. AFAICS, when a project is exported,
    all frames from a given task are grouped together, so we shouldn't need
    multiple tasks' chunks to be open at the same time anyway.
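    A minimal sketch of this idea (the class and method names here are
    hypothetical, not the actual CVAT code): cache at most one provider at a
    time, and close the previous one as soon as a frame from a different task
    is requested.

```python
class SingleProviderCache:
    """Keeps at most one provider alive, keyed by task id.

    Switching to a different task closes the previous provider (and
    therefore its open chunk) before a new one is created.
    """

    def __init__(self, provider_factory):
        self._factory = provider_factory
        self._task_id = None
        self._provider = None

    def get(self, task_id):
        if self._task_id != task_id:
            if self._provider is not None:
                # Release the open chunk immediately, not at GC time.
                self._provider.close()
            self._provider = self._factory(task_id)
            self._task_id = task_id
        return self._provider

    def close(self):
        if self._provider is not None:
            self._provider.close()
            self._provider = None
            self._task_id = None
```

    Because project exports group all frames of a task together, each provider
    is created exactly once under this scheme, so the cache never thrashes.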
    
    I had to restructure the code to make this work, so I took the opportunity
    to fix a few other things, as well:
    
    * The code currently relies on garbage collection of PyAV's `Container`
      objects to free the resources used by them. Even though
      `VideoReader.__iter__` uses a `with` block to close the container, the
      `with` block can only do so if the container is iterated all the way to
      the end. This doesn't happen when `FrameProvider` uses it, since
      `FrameProvider` seeks to the needed frame and then stops iterating.
    
      I don't have evidence that this causes any issues at the moment, but
      Python does not guarantee that objects are GC'd promptly, so this could
      become another source of excessive memory usage.
    
      I added cleanup methods (`close`/`unload`/`__exit__`) at several layers of
      the code to ensure that each chunk is closed as soon as it is no longer
      needed.
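    A toy illustration of the underlying Python behavior (not the actual
    `VideoReader` code): a `with` block inside a generator only runs its
    cleanup when the generator is exhausted, explicitly closed, or garbage
    collected - a consumer that stops iterating early leaves the resource
    open indefinitely.

```python
class Resource:
    """Stand-in for a PyAV container: tracks whether it was closed."""
    def __init__(self):
        self.closed = False
    def __enter__(self):
        return self
    def __exit__(self, *exc_info):
        self.closed = True

def frames(resource):
    # Like VideoReader.__iter__: the `with` block closes the resource,
    # but only when this generator finishes or is closed.
    with resource:
        for i in range(100):
            yield i

res = Resource()
gen = frames(res)
next(gen)                # consumer reads one frame, then stops iterating
assert not res.closed    # the `with` block has NOT run yet

gen.close()              # explicit cleanup, as the added close() methods do
assert res.closed
```

    `gen.close()` raises `GeneratorExit` at the suspended `yield`, which
    triggers the `with` block's `__exit__` deterministically instead of
    waiting for garbage collection.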
    
    * I factored out and merged the code used to generate `dm.Image` objects
      when exporting projects and jobs/tasks. It's likely possible to merge even
      more code, but I don't want to expand the scope of the patch too much.
    
    * I fixed a seemingly useless optimization in the handling for 3D frames.
      Specifically, `CVATProjectDataExtractor` performs queries of the form:
    
          task.data.images.prefetch_related().get(id=i)
    
      The prefetch here looks useless, as only a single object is queried - it
      wouldn't be any less efficient to just let Django fetch the `Image`'s
      `related_files` when needed.
    
      I rewrote this code to prefetch `related_files` for all images in the task
      at once instead.
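    A runnable toy showing why the batched prefetch helps (the "ORM" here is
    faked with a query counter; the real change uses Django's
    `prefetch_related("related_files")` on the task's image queryset):
    fetching per image issues one query per image, while a single batched
    prefetch issues one query for the whole task.

```python
class FakeDB:
    """Stand-in for the database: counts how many queries are issued."""

    def __init__(self, related):
        self.related = related  # image id -> list of related files
        self.queries = 0

    def get_related(self, image_id):
        # One query per call: the old per-image pattern.
        self.queries += 1
        return self.related[image_id]

    def get_related_bulk(self, image_ids):
        # A single query for all images: the new batched pattern.
        self.queries += 1
        return {i: self.related[i] for i in image_ids}

db = FakeDB({1: ["a.bin"], 2: ["b.bin"], 3: []})

# Old pattern: one query per image (N queries for N images).
per_image = {i: db.get_related(i) for i in [1, 2, 3]}
assert db.queries == 3

# New pattern: one query for the whole task.
db.queries = 0
batched = db.get_related_bulk([1, 2, 3])
assert db.queries == 1
assert batched == per_image
```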
    SpecLad committed Jan 19, 2024
    5e1698b

Commits on Jan 24, 2024

  1. 6de6db8