Data export documentation update (cvat-ai#6795)
mdacoca committed Sep 22, 2023
1 parent 26693dd commit 612f0e7
Showing 25 changed files with 682 additions and 385 deletions.
123 changes: 99 additions & 24 deletions site/content/en/docs/manual/advanced/formats/_index.md
@@ -1,28 +1,103 @@
---
title: 'Formats'
linkTitle: 'Formats'
title: 'Export annotations and data from CVAT'
linkTitle: 'Export annotations and data from CVAT'
weight: 20
description: 'List of annotation formats supported by CVAT.'
description: 'List of data export formats supported by CVAT.'
---

#### CVAT supported the following formats:

- [CVAT](format-cvat)
- [Datumaro](format-datumaro)
- [LabelMe](format-labelme)
- [MOT](format-mot)
- [MOTS](format-mots)
- [COCO](format-coco)
- [PASCAL VOC and mask](format-voc)
- [YOLO](format-yolo)
- [TF detection API](format-tfrecord)
- [ImageNet](format-imagenet)
- [CamVid](format-camvid)
- [WIDER Face](format-widerface)
- [VGGFace2](format-vggface2)
- [Market-1501](format-market1501)
- [ICDAR13/15](format-icdar)
- [Open Images](format-openimages)
- [Cityscapes](format-cityscapes)
- [KITTI](format-kitti)
- [LFW](format-lfw)
In CVAT, you have the option to export data in various formats.
The choice of export format depends on the type of annotation as
well as the intended future use of the dataset.

See:

- [Data export formats](#data-export-formats)
- [Exporting dataset in CVAT](#exporting-dataset-in-cvat)
- [Exporting dataset from Task](#exporting-dataset-from-task)
- [Exporting dataset from Job](#exporting-dataset-from-job)
- [Data export video tutorial](#data-export-video-tutorial)

## Data export formats

The table below outlines the available formats for data export in CVAT.

<!--lint disable maximum-line-length-->

| Format | Type | Annotation Type | Models | Shapes | Attributes | Video Tracks |
| ----------------------------------------------------------------------------------------------------------------------------------- | ------------- | ----------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- | -------------------- | ------------- |
| [CamVid 1.0](format-camvid) | .txt <br>.png | Semantic <br>Segmentation | U-Net, SegNet, DeepLab, <br>PSPNet, FCN, Mask R-CNN, <br>ICNet, ERFNet, HRNet, <br>V-Net, and others. | Polygons | Not supported | Not supported |
| [Cityscapes 1.0](format-cityscapes) | .txt<br>.png | Semantic<br>Segmentation | U-Net, SegNet, DeepLab, <br>PSPNet, FCN, ERFNet, <br>ICNet, Mask R-CNN, HRNet, <br>ENet, and others. | Polygons | Specific attributes | Not supported |
| [COCO 1.0](format-coco) | JSON | Detection, Semantic <br>Segmentation | YOLO (You Only Look Once), <br>Faster R-CNN, Mask R-CNN, SSD (Single Shot MultiBox Detector), <br> RetinaNet, EfficientDet, UNet, <br>DeepLabv3+, CenterNet, Cascade R-CNN, and others. | Bounding Boxes, Polygons | Specific attributes | Not supported |
| [COCO Keypoints 1.0](coco-keypoints) | JSON | Keypoints | OpenPose, PoseNet, AlphaPose, <br> SPM (Single Person Model), <br>Mask R-CNN with Keypoint Detection, and others. | Skeletons | Specific attributes | Not supported |
| [CVAT for images 1.1](/docs/manual/advanced/formats/format-cvat/#cvat-for-videos-export) | .xml | Universal format<br> for all types of <br>annotations. | Universal format<br> for all types of <br>models. | Bounding Boxes, Polygons, <br>Polylines, Points, Cuboids, <br>Skeletons, Tags. | All attributes | Not supported |
| [CVAT for video 1.1](/docs/manual/advanced/formats/format-cvat/#cvat-for-videos-export) | .xml | Universal format<br> for all types of <br>annotations. | Universal format<br> for all types of <br>models. | Bounding Boxes, Polygons, <br>Polylines, Points, Cuboids, <br>Skeletons, Tags, Tracks. | All attributes | Supported |
| [Datumaro 1.0](format-datumaro) | JSON | Universal format<br> for all types of <br>annotations. | Universal format<br> for all types of <br>models. | Bounding Boxes, Polygons, <br>Polylines, Points, Cuboids, <br>Skeletons, Tags, Tracks. | All attributes | Supported |
| [ICDAR](format-icdar)<br> Includes ICDAR Recognition 1.0, <br>ICDAR Detection 1.0, <br>and ICDAR Segmentation 1.0 <br>descriptions. | .txt | Text recognition, <br>Text detection, <br>Text segmentation | EAST: Efficient and Accurate <br>Scene Text Detector, CRNN, Mask TextSpotter, TextSnake, <br>and others. | Tag, Bounding Boxes, Polygons | Specific attributes | Not supported |
| [ImageNet 1.0](format-imagenet) | .jpg <br>.txt | Semantic Segmentation, <br>Classification, <br>Detection | VGG (VGG16, VGG19), Inception, YOLO, Faster R-CNN, U-Net, and others. | Tags | No attributes | Not supported |
| [KITTI 1.0](format-kitti) | .txt <br>.png | Semantic Segmentation, Detection, 3D | PointPillars, SECOND, AVOD, YOLO, DeepSORT, PWC-Net, ORB-SLAM, and others. | Bounding Boxes, Polygons | Specific attributes | Not supported |
| [LabelMe 3.0](format-labelme) | .xml | Compatibility, <br>Semantic Segmentation | U-Net, Mask R-CNN, Fast R-CNN,<br> Faster R-CNN, DeepLab, YOLO, <br>and others. | Bounding Boxes, Polygons | Supported (Polygons) | Not supported |
| [LFW 1.0](format-lfw) | .txt | Verification, <br>Face recognition | OpenFace, VGGFace & VGGFace2, <br>FaceNet, ArcFace, <br>and others. | Tags, Skeletons | Specific attributes | Not supported |
| [Market-1501 1.0](format-market1501) | .txt | Re-identification | Triplet Loss Networks, <br>Deep ReID models, and others. | Bounding Boxes | Specific attributes | Not supported |
| [MOT 1.0](format-mot) | .txt | Video Tracking, <br>Detection | SORT, MOT-Net, IOU Tracker, <br>and others. | Bounding Boxes, Tracks | Specific attributes | Supported |
| [MOTS PNG 1.0](format-mots) | .png<br>.txt | Video Tracking, <br>Detection | SORT, MOT-Net, IOU Tracker, <br>and others. | Bounding Boxes, Tracks, Masks | Specific attributes | Supported |
| [Open Images 1.0](format-openimages) | .csv | Detection, <br>Classification, <br>Semantic Segmentation | Faster R-CNN, YOLO, U-Net, <br>CornerNet, and others. | Bounding Boxes, Tags, Polygons | Specific attributes | Not supported |
| [PASCAL VOC 1.0](format-voc) | .xml | Classification, Detection | Faster R-CNN, SSD, YOLO, <br>AlexNet, and others. | Bounding Boxes, Tags, Polygons | Specific attributes | Not supported |
| [Segmentation Mask 1.0](format-smask) | .txt | Semantic Segmentation | Faster R-CNN, SSD, YOLO, <br>AlexNet, and others. | Polygons | No attributes | Not supported |
| [TFRecord 1.0](format-tfrecord) | .pbtxt | Detection, <br>Classification | SSD, Faster R-CNN, YOLO, <br>VGG16, ResNet, Inception, MobileNet, <br>and others. | Bounding Boxes, Polygons | No attributes | Not supported |
| [VGGFace2 1.0](format-vggface2) | .csv | Face recognition | VGGFace, ResNet, Inception, <br> and others. | Bounding Boxes, Points | No attributes | Not supported |
| [WIDER Face 1.0](format-widerface) | .txt | Detection | SSD (Single Shot MultiBox Detector), Faster R-CNN, YOLO, <br>and others. | Bounding Boxes, Tags | Specific attributes | Not supported |
| [YOLO 1.0](format-yolo) | .txt | Detection | YOLOv1, YOLOv2 (YOLO9000), <br>YOLOv3, YOLOv4, and others. | Bounding Boxes | No attributes | Not supported |

<!--lint enable maximum-line-length-->
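
As a rough illustration of how the table can guide format selection, the sketch below transcribes a few rows into a Python dictionary and filters them by capability. The `FORMATS` structure and the `formats_with_tracks` helper are hypothetical names introduced here for illustration and are not part of CVAT; only the format names, file types, and track support values are taken from the table.

```python
# A few rows from the table above, transcribed by hand (not a CVAT API).
FORMATS = {
    "CVAT for images 1.1": {"file_type": ".xml", "tracks": False},
    "CVAT for video 1.1": {"file_type": ".xml", "tracks": True},
    "Datumaro 1.0": {"file_type": "JSON", "tracks": True},
    "MOT 1.0": {"file_type": ".txt", "tracks": True},
    "YOLO 1.0": {"file_type": ".txt", "tracks": False},
}


def formats_with_tracks(formats):
    """Return the names of formats whose Video Tracks column reads 'Supported'."""
    return sorted(name for name, info in formats.items() if info["tracks"])


print(formats_with_tracks(FORMATS))
```

The same pattern could be extended with the Shapes or Attributes columns if a project needs to filter on them.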

## Exporting dataset in CVAT

### Exporting dataset from Task

To export the dataset from the task, follow these steps:

1. Open the task.
2. Go to **Actions** > **Export task dataset**.
3. Choose the desired format from the list of available options.

4. (Optional) Toggle the **Save images** switch if you
wish to include images in the export.

> **Note**: The **Save images** option is a **paid feature**.
![Save images option](/images/export_job_as_dataset_dialog.png)

5. Input a name for the resulting `.zip` archive.

6. Click **OK** to initiate the export.

### Exporting dataset from Job

To export a dataset from a job, follow these steps:

1. Navigate to **Menu** > **Export job dataset**.

![Export dataset](/images/export_job_as_dataset_menu.png)

2. Choose the desired format from the list of available options.

3. (Optional) Toggle the **Save images** switch
if you wish to include images in the export.

> **Note**: The **Save images** option is a **paid feature**.
![Save images option](/images/export_job_as_dataset_dialog.png)

4. Input a name for the resulting `.zip` archive.

5. Click **OK** to initiate the export.

## Data export video tutorial

For more information on the process, see the following tutorial:

<!--lint disable maximum-line-length-->

<iframe width="560" height="315" src="https://www.youtube.com/embed/gzjVpVV9orE?si=2tiBIqts8nk_byTH" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

<!--lint enable maximum-line-length-->
65 changes: 65 additions & 0 deletions site/content/en/docs/manual/advanced/formats/coco-keypoints.md
@@ -0,0 +1,65 @@
---
linkTitle: 'COCO Keypoints'
weight: 5
---

The COCO Keypoints format is designed specifically for human pose estimation tasks, where the objective
is to identify and localize body joints (keypoints) on a human figure within an image.

This specialized format is used with a variety of state-of-the-art models focused on pose estimation.

For more information, see:

- [COCO Keypoint site](https://cocodataset.org/#keypoints-2020)
- [Format specification](https://openvinotoolkit.github.io/datumaro/latest/docs/data-formats/formats/coco.html)
- [Example of the archive](https://openvinotoolkit.github.io/datumaro/latest/docs/data-formats/formats/coco.html#import-coco-dataset)

## COCO Keypoints export

For export of images:

- Supported annotations: Skeletons
- Attributes:
  - `is_crowd`: This can either be a checkbox or an integer
    (with values of 0 or 1). It indicates that the instance
    (or group of objects) should include an RLE-encoded mask in the `segmentation` field.
    All shapes within the group coalesce into a single, overarching mask,
    with the largest shape setting the properties for the entire object group.
  - `score`: This numerical field represents the annotation `score`.
  - Arbitrary attributes: These will be stored within the `attributes`
    section of the annotation.
- Tracks: Not supported.

The downloaded file is a .zip archive with the following structure:

```
archive.zip/
├── images/
│ ├── <image_name1.ext>
│ ├── <image_name2.ext>
│ └── ...
└── <annotations>.json
```
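
A quick way to check an exported archive against this layout is to list its members and separate images from annotation files. The sketch below is a minimal illustration using only the Python standard library; `summarize_export` is a hypothetical helper, and the file names in the demo archive are placeholders rather than names CVAT is guaranteed to produce.

```python
import io
import zipfile


def summarize_export(archive):
    """Split archive member names into image files and annotation files."""
    with zipfile.ZipFile(archive) as zf:
        # Skip explicit directory entries, which end with a slash.
        names = [n for n in zf.namelist() if not n.endswith("/")]
    images = [n for n in names if n.startswith("images/")]
    annotations = [n for n in names if not n.startswith("images/")]
    return images, annotations


# Build a tiny in-memory archive that mimics the layout above (placeholder names).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("images/frame_000001.jpg", b"")
    zf.writestr("images/frame_000002.jpg", b"")
    zf.writestr("annotations.ext", "{}")
buffer.seek(0)

images, annotations = summarize_export(buffer)
print(len(images), "images; annotation files:", annotations)
```

For a real export, pass the path of the downloaded `.zip` file to `summarize_export` instead of the in-memory buffer.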

## COCO import

Uploaded file: a single unpacked `*.json` or a zip archive with the structure described
[here](https://openvinotoolkit.github.io/datumaro/latest/docs/data-formats/formats/coco.html#import-coco-dataset)
(without images).

- Supported annotations: Skeletons (`person_keypoints`)

Support for COCO tasks via Datumaro is described [here](https://openvinotoolkit.github.io/datumaro/latest/docs/data-formats/formats/coco.html#export-to-other-formats).
For example, [support for COCO keypoints over Datumaro](https://github.com/openvinotoolkit/cvat/issues/2910#issuecomment-726077582):

1. Install [Datumaro](https://github.com/openvinotoolkit/datumaro)
`pip install datumaro`
2. Export the task in the `Datumaro` format and unzip the archive
3. Export the Datumaro project in `coco` / `coco_person_keypoints` formats
`datum export -f coco -p path/to/project [-- --save-images]`

This way, one can export CVAT points as single keypoints or
keypoint lists (without the `visibility` COCO flag).
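
For readers post-processing the result, it may help to recall how keypoints are laid out in a COCO annotation: the `keypoints` field is a flat list of `x, y, v` triplets, where `v` is the visibility flag (0 means not labeled), and `num_keypoints` counts the labeled ones. The sketch below assumes that standard COCO layout; `split_keypoints` is a hypothetical helper and the annotation is made up for illustration, not taken from a CVAT or Datumaro export.

```python
import json


def split_keypoints(flat):
    """Group a flat COCO `keypoints` list into (x, y, v) triplets."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# Hypothetical annotation, written inline for illustration only.
annotation = json.loads(
    '{"category_id": 1, "num_keypoints": 2,'
    ' "keypoints": [10, 20, 2, 30, 40, 2, 0, 0, 0]}'
)

points = split_keypoints(annotation["keypoints"])
# In COCO, v == 0 marks a keypoint that is not labeled.
labeled = [p for p in points if p[2] > 0]

print(points)
print(len(labeled) == annotation["num_keypoints"])
```

How the visibility value is filled in by a particular export path is not specified here, so it is worth inspecting a sample file before relying on it.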
30 changes: 23 additions & 7 deletions site/content/en/docs/manual/advanced/formats/format-camvid.md
@@ -3,13 +3,25 @@ linkTitle: 'CamVid'
weight: 10
---

# [CamVid](http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/)
The CamVid (Cambridge-driving Labeled Video Database) format is most commonly used
in the realm of semantic segmentation tasks. It is particularly useful for training
and evaluating models for autonomous driving and other vision-based robotics
applications.

For more information, see:

- [CamVid Specification](http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/)
- [Dataset examples](https://github.com/cvat-ai/datumaro/tree/v0.3/tests/assets/camvid_dataset)

## CamVid export

Downloaded file: a zip archive of the following structure:
For export of images and videos:

- Supported annotations: Bounding Boxes, Polygons.
- Attributes: Not supported.
- Tracks: Not supported.

The downloaded file is a .zip archive with the following structure:

```bash
taskname.zip/
@@ -41,14 +53,18 @@ Bicyclist
Bridge
```

Mask is a `png` image with 1 or 3 channels where each pixel
has own color which corresponds to a label.
`(0, 0, 0)` is used for background by default.
A mask in the CamVid dataset is typically a **.png**
image with either one or three channels.

In this image, each pixel is assigned a specific color
that corresponds to a particular label.

- supported annotations: Rectangles, Polygons
By default, the color `(0, 0, 0)` (black) is used
to represent the background.
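
To make the color-to-label correspondence concrete, the sketch below counts pixels per label in a tiny three-channel mask. It is only an illustration: the mask is written as nested Python lists instead of being loaded from a `.png` file, and the colors in `COLOR_TO_LABEL` are assumed values chosen for the example (a real mapping would come from the dataset's own label definitions). `count_labels` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical color-to-label map; real colors come from the dataset's label definitions.
COLOR_TO_LABEL = {
    (0, 0, 0): "background",
    (0, 128, 192): "Bicyclist",
    (0, 128, 64): "Bridge",
}


def count_labels(mask, color_to_label):
    """Count pixels per label in a mask given as rows of (R, G, B) tuples."""
    counts = Counter()
    for row in mask:
        for pixel in row:
            # Colors missing from the map are grouped under "unknown".
            counts[color_to_label.get(pixel, "unknown")] += 1
    return dict(counts)


# A 2x3 toy mask.
mask = [
    [(0, 0, 0), (0, 128, 192), (0, 128, 192)],
    [(0, 0, 0), (0, 128, 64), (0, 0, 0)],
]

print(count_labels(mask, COLOR_TO_LABEL))
```

With a real mask, the nested lists would be replaced by pixel data read through an image library of your choice.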

## CamVid import

Uploaded file: a zip archive of the structure above
For import of images:

- Uploaded file: a _.zip_ archive with the structure above
- Supported annotations: Polygons