Segment-Any-Video
Introduction

  • The Segment Anything Model (SAM) proposed by Meta AI (Facebook) has had a great influence on computer vision, since segmentation is a fundamental step in many tasks such as edge detection, face recognition, and autonomous driving. However, SAM has some weaknesses: (1) it cannot return semantic information about the regions it segments, (2) in some cases a single instance (e.g. a car) may be split into several parts, and (3) the model cannot process video data.
  • In this repository, we implement a segmentation and tracking method using YOLOv8 and SAM that addresses these weaknesses. We name this method Segment Any Video (SAV).
  • In seg.py, segmentation is implemented by feeding the boxes from the YOLOv8 detector to SAM as prompts, so each returned mask is paired with the detector's class label; masks without semantic information are also returned. This is the biggest difference from SAM. In track.py, we modified the code from ultralytics/tracker/track.py, which supports ByteTrack and BoT-SORT, and then apply instance segmentation to all frames.
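The core idea in seg.py — run a detector, then prompt the segmenter with each detected box so every mask keeps its class label — can be sketched as below. This is a minimal numpy-only mock: fake_detect and fake_segment are hypothetical stand-ins for the real YOLOv8 and SAM predictor calls, not names from this repository.

```python
import numpy as np

def masks_with_labels(image, detect, segment):
    """Detect (box, label) pairs, then prompt the segmenter with each
    box and pair the resulting mask with the detector's label."""
    results = []
    for box, label in detect(image):
        mask = segment(image, box)  # one binary mask per box prompt
        results.append({"label": label, "box": box, "mask": mask})
    return results

# --- toy stand-ins for YOLOv8 and SAM (assumptions for this sketch) ---
def fake_detect(image):
    # pretend the detector found one car covering the left half
    h, w = image.shape[:2]
    return [((0, 0, w // 2, h), "car")]

def fake_segment(image, box):
    # pretend the segmenter returned a mask filling the prompt box
    x0, y0, x1, y1 = box
    mask = np.zeros(image.shape[:2], dtype=bool)
    mask[y0:y1, x0:x1] = True
    return mask

image = np.zeros((4, 8, 3), dtype=np.uint8)
out = masks_with_labels(image, fake_detect, fake_segment)
print(out[0]["label"], int(out[0]["mask"].sum()))  # car 16
```

In the real pipeline the detector and segmenter are the YOLOv8 model and SAM's box-prompted predictor; the structure of the loop is the same.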

Installation

pip install ultralytics
pip install git+https://github.com/facebookresearch/segment-anything.git

Model Checkpoints

Usage

python seg.py --img_path TestImages --save_dir SegOut --sam_checkpoint model/sam_vit_h_4b8939.pth --yolo_checkpoint model/yolov8x.pt

or

python track.py --video_path video.mp4 --save_path video_test.mp4 --sam_checkpoint model/sam_vit_h_4b8939.pth --yolo_checkpoint model/yolov8x.pt --imgsz 1920
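The flow behind the track.py command — per frame, track detections so boxes carry persistent IDs, then segment each tracked box — can be mocked as below. The Tracker logic here is a toy that assigns IDs by detection order; the real script delegates association to ByteTrack/BoT-SORT, and detect/segment are hypothetical stand-ins.

```python
def track_and_segment(frames, detect, segment):
    """Per frame: detect boxes, assign persistent IDs by detection
    order (toy stand-in for ByteTrack), then mask each tracked box."""
    next_id = 0
    id_by_slot = {}  # detection slot -> persistent track id (toy association)
    out = []
    for frame in frames:
        frame_out = []
        for slot, box in enumerate(detect(frame)):
            if slot not in id_by_slot:
                id_by_slot[slot] = next_id
                next_id += 1
            frame_out.append({"id": id_by_slot[slot], "box": box,
                              "mask": segment(frame, box)})
        out.append(frame_out)
    return out

# toy detector/segmenter: one static box; "mask" is just the box echoed back
frames = [0, 1, 2]
detect = lambda frame: [(0, 0, 2, 2)]
segment = lambda frame, box: box
result = track_and_segment(frames, detect, segment)
print([f[0]["id"] for f in result])  # [0, 0, 0]
```

The point of the sketch is the ID persistence across frames: the same object keeps the same track ID, so its mask can be followed through the whole video.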

Image Segment Results

[input / segmentation result images]
We can see from the above results that our method segments the bus, car, and train each as one intact, semantically labeled object, while SAM splits them into different parts.

Video Track Result

[four segment-and-track demo clips]

Demo

  • Our online demo is here.
  • Note: since video segmentation is time-consuming, we did not integrate the tracking method into the online demo. If you are interested, you can clone this repository and run it on your own GPU machine.

TODO

  • Train YOLOv8 models on the Objects365 dataset
    • YOLOv8m.pt extract code: 65ge
    • YOLOv8n.pt
    • YOLOv8s.pt
    • YOLOv8l.pt
    • YOLOv8x.pt

License

Contact
