now you can use the different available detectors and descriptors for main_slam.py too
luigifreda committed Mar 24, 2019
1 parent 64b5d78 commit 08ad990
Showing 20 changed files with 406 additions and 173 deletions.
36 changes: 30 additions & 6 deletions README.md
@@ -5,7 +5,7 @@ Author: [Luigi Freda](https://www.luigifreda.com)
**pySLAM** is a *'toy'* implementation of a monocular *Visual Odometry (VO)* pipeline in Python. I released it for **educational purposes**, for a [computer vision class](https://as-ai.org/visual-perception-and-spatial-computing/) I taught. I started developing it for fun, as a Python programming exercise, during my free time. I took inspiration from some Python repos available on the web.

Main Scripts:
* `main_vo.py` combines the simplest VO ingredients without performing any image point triangulation or windowed bundle adjustment. At each step $k$, `main_vo.py` estimates the current camera pose $C_k$ with respect to the previous one $C_{k-1}$. The inter frame pose estimation returns $[R_{k-1,k},t_{k-1,k}]$ with $||t_{k-1,k}||=1$. With this very basic computation, you need to use a ground truth in order to recover a correct inter-frame scale $s$ and estimate a meaningful trajectory by composing $C_k = C_{k-1} * [R_{k-1,k}, s t_{k-1,k}]$. This script is a first start to understand the basics of inter frame feature tracking and camera pose estimation.
* `main_vo.py` combines the simplest VO ingredients without performing any image point triangulation or windowed bundle adjustment. At each step $k$, `main_vo.py` estimates the current camera pose $C_k$ with respect to the previous one $C_{k-1}$. The inter-frame pose estimation returns $[R_{k-1,k},t_{k-1,k}]$ with $||t_{k-1,k}||=1$. With this very basic approach, you need to use a ground truth in order to recover a correct inter-frame scale $s$ and estimate a valid trajectory by composing $C_k = C_{k-1} * [R_{k-1,k}, s t_{k-1,k}]$. This script is a good starting point for understanding the basics of inter-frame feature tracking and camera pose estimation.

* `main_slam.py` adds feature tracking along multiple frames, point triangulation and bundle adjustment in order to estimate the camera trajectory up to scale and build a local map. It's still a VO pipeline, but it shows some basic blocks which are necessary to develop a real visual SLAM pipeline.

@@ -71,9 +71,9 @@ $ python3 -O main_vo.py
```
This will process a [KITTI](http://www.cvlibs.net/datasets/kitti/eval_odometry.php) video (available in the folder `videos`) by using its corresponding camera calibration file (available in the folder `settings`) and its groundtruth (available in the video folder).

**N.B.**: remind, the simple script `main_vo.py` **strictly requires a ground truth**, since - with the used approach - the relative motion between two adjacent camera frames can be only estimated up to scale with a monocular camera (i.e. the implemented inter frame pose estimation returns $[R_{k-1,k},t_{k-1,k}]$ with $||t_{k-1,k}||=1$).
**N.B.**: as explained above, the script `main_vo.py` **strictly requires a ground truth**.
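To make the composition step above concrete, here is a minimal sketch of the scale recovery and pose update (illustrative names, not the actual `main_vo.py` code):

```
import numpy as np

def absolute_scale(p_gt_prev, p_gt_curr):
    # ground-truth scale s = distance between two consecutive ground-truth positions
    return np.linalg.norm(np.asarray(p_gt_curr) - np.asarray(p_gt_prev))

def compose_pose(C_prev, R, t, s):
    # C_k = C_{k-1} * [R_{k-1,k}, s * t_{k-1,k}] with 4x4 homogeneous matrices
    T = np.eye(4)
    T[:3, :3] = R       # inter-frame rotation
    T[:3, 3] = s * t    # unit-norm translation rescaled by the ground-truth scale
    return C_prev @ T
```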

In order to process a different dataset, you need to set the file `config.ini`:
In order to process a different **dataset**, you need to configure the file `config.ini`:
* select your dataset `type` in the section `[DATASET]` (see the section *Datasets* below for further details)
* set the camera settings file accordingly (see the section *Camera Settings* below)
* set the groundtruth file accordingly (see the section *Camera Settings* below)
@@ -83,6 +83,8 @@ If you want to test the script `main_slam.py`, you can run:
$ python3 -O main_slam.py
```

You can choose any detector/descriptor among *ORB*, *SIFT*, *SURF*, *BRISK*, *AKAZE* (see below for further information).

**WARNING**: the available **KITTI videos** (due to information loss in video compression) make `main_slam` tracking perform worse than with the original KITTI *image sequences*. The available videos are intended to be used for a first quick test. Please download and use the original KITTI image sequences, as explained below. For instance, on the original KITTI sequence 06, `main_slam` successfully completes the round; at present, this does not happen with the compressed video.

---
@@ -140,6 +142,29 @@ In order to calibrate your camera, you can use the scripts in the folder `calibr
1. use the script `grab_chessboard_images.py` to collect a sequence of images where the chessboard can be detected (set the chessboard size there)
2. use the script `calibrate.py` to process the collected images and compute the calibration parameters (set the chessboard size there)
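The following is a condensed sketch of the kind of processing `calibrate.py` performs, using the standard OpenCV chessboard pipeline (the image folder and the 9x6 board size are assumptions; set your own values):

```
import glob
import cv2
import numpy as np

board_size = (9, 6)  # inner corners per row/column; set your chessboard size here
objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2)

obj_points, img_points, img_size = [], [], None
for fname in glob.glob('calib_images/*.png'):  # assumed folder of grabbed images
    gray = cv2.cvtColor(cv2.imread(fname), cv2.COLOR_BGR2GRAY)
    img_size = gray.shape[::-1]
    found, corners = cv2.findChessboardCorners(gray, board_size)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# K is the 3x3 camera matrix, dist the distortion coefficients
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, img_size, None, None)
print('camera matrix:\n', K, '\ndistortion: ', dist.ravel())
```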

---
## Detectors/Descriptors

At present, the following feature **detectors** are supported:
* *FAST*
* *Good features to track* [[ShiTo94]](https://ieeexplore.ieee.org/document/323794)
* *ORB*
* *SIFT*
* *SURF*
* *AKAZE*
* *BRISK*

You can take a look at the file `feature_detector.py`.
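For a quick standalone test, the supported detectors can be instantiated directly with their OpenCV constructors (SIFT and SURF require the `xfeatures2d` contrib module in OpenCV 3.x); a minimal sketch:

```
import cv2

orb   = cv2.ORB_create(nfeatures=2000)
brisk = cv2.BRISK_create()
akaze = cv2.AKAZE_create()
fast  = cv2.FastFeatureDetector_create(threshold=25, nonmaxSuppression=True)
sift  = cv2.xfeatures2d.SIFT_create()   # contrib module required
surf  = cv2.xfeatures2d.SURF_create()   # contrib module required

img = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE)  # any test image
kps = orb.detect(img, None)
print('ORB keypoints: ', len(kps))
```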

The following feature **descriptors** are supported:
* *ORB*
* *SIFT*
* *SURF*
* *AKAZE*
* *BRISK*

In both the scripts `main_vo.py` and `main_slam.py`, you can set which detector/descriptor to use by means of the function *feature_tracker_factory()*. This function can be found in the file `feature_tracker.py`.
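A hypothetical usage sketch follows; the exact signature of *feature_tracker_factory()* may differ, and the parameter names here are assumptions based on the enums introduced in this commit:

```
# hypothetical sketch - check feature_tracker.py for the actual signature
from feature_tracker import feature_tracker_factory
from feature_detector import FeatureDetectorTypes, FeatureDescriptorTypes

tracker = feature_tracker_factory(min_num_features=2000,                        # assumed parameter
                                  detector_type=FeatureDetectorTypes.BRISK,     # any supported detector
                                  descriptor_type=FeatureDescriptorTypes.BRISK) # any supported descriptor
```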

---
## References

@@ -161,10 +186,9 @@ Tons of things are still missing to attain a real SLAM pipeline:

* keyframe generation and management
* tracking w.r.t. previous keyframe
* proper local map generation and management
* proper local map generation and management (covisibility)
* loop closure
* general relocalization
* in main_slam, tracking by using all kind of features (not only ORB)


---
@@ -173,4 +197,4 @@ Tons of things are still missing to attain a real SLAM pipeline:
* [twitchslam](https://github.com/geohot/twitchslam)
* [monoVO](https://github.com/uoip/monoVO-python)
* [pangolin](https://github.com/stevenlovegrove/Pangolin)
* [g2opy](https://github.com/uoip/g2opy)
12 changes: 6 additions & 6 deletions config.ini
@@ -20,11 +20,11 @@ type=VIDEO_DATASET
type=kitti
base_path=/home/luigi/Work/rgbd_datasets/kitti/dataset
;
name=00
cam_settings=settings/KITTI00-02.yaml
#name=00
#cam_settings=settings/KITTI00-02.yaml
;
#name=06
#cam_settings=settings/KITTI04-12.yaml
name=06
cam_settings=settings/KITTI04-12.yaml
;
groundtruth_file=auto

@@ -44,8 +44,8 @@ type=video
base_path=./videos/kitti00
cam_settings=settings/KITTI00-02.yaml
;
#base_path=./videos/kitti06
#cam_settings=settings/KITTI04-12.yaml
;base_path=./videos/kitti06
;cam_settings=settings/KITTI04-12.yaml
;
name=video.mp4
groundtruth_file=groundtruth.txt
5 changes: 5 additions & 0 deletions config.py
@@ -38,6 +38,8 @@ def __init__(self):
self.cam_settings = None
self.dataset_settings = None
self.dataset_type = None
self.current_path = os.getcwd()
#print('current path: ', self.current_path)

self.set_lib_paths()
self.get_dataset_settings()
@@ -56,6 +58,9 @@ def set_lib_paths(self):
def get_dataset_settings(self):
self.dataset_type = self.config_parser['DATASET']['type']
self.dataset_settings = self.config_parser[self.dataset_type]

self.dataset_path = self.dataset_settings['base_path']
self.dataset_settings['base_path'] = os.path.join(__location__, self.dataset_path)
#print('dataset_settings: ', self.dataset_settings)

# get camera settings
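For reference, `__location__` used above is presumably defined with the usual pattern for resolving paths relative to the script file rather than the current working directory; a sketch under that assumption:

```
import os

# assumed definition: the directory containing the script,
# independent of where the interpreter was launched from
__location__ = os.path.realpath(os.path.join(os.getcwd(), os.path.dirname(__file__)))
```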
3 changes: 2 additions & 1 deletion dataset.py
@@ -91,9 +91,10 @@ class VideoDataset(Dataset):
def __init__(self, path, name, associations=None, type=DatasetType.VIDEO):
super().__init__(path, name, associations, type)
self.filename = path + '/' + name
#print('video: ', self.filename)
self.cap = cv2.VideoCapture(self.filename)
if not self.cap.isOpened():
raise IOError('Cannot open movie file')
raise IOError('Cannot open movie file: ' + self.filename)
else:
print('Processing Video Input')
self.num_frames = int(self.cap.get(cv2.CAP_PROP_FRAME_COUNT))
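For context, a minimal sketch of how such a `cv2.VideoCapture` source is typically consumed frame by frame (illustrative, not the actual `dataset.py` code):

```
import cv2

cap = cv2.VideoCapture('videos/kitti00/video.mp4')
if not cap.isOpened():
    raise IOError('Cannot open movie file: videos/kitti00/video.mp4')
while True:
    ret, frame = cap.read()  # ret becomes False once the stream is exhausted
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # VO works on grayscale frames
cap.release()
```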
68 changes: 47 additions & 21 deletions feature_detector.py
@@ -17,10 +17,11 @@
* along with PYSLAM. If not, see <http://www.gnu.org/licenses/>.
"""
import sys
import math
import numpy as np
import cv2
from enum import Enum
from geom_helpers import imgBlocks
from geom_helpers import imgBlocks, unpackSiftOctaveKps

kVerbose = True

@@ -32,6 +33,7 @@
kNumLevels = 4
kNumLevelsInitSigma = 12
kScaleFactor = 1.2
kSigmaLevel0 = 1.

kDrawOriginalExtractedFeatures = False # for debugging

@@ -44,6 +46,7 @@ class FeatureDetectorTypes(Enum):
ORB = 5
BRISK = 6
AKAZE = 7
FREAK = 8 # DOES NOT WORK!


class FeatureDescriptorTypes(Enum):
@@ -53,6 +56,7 @@ class FeatureDescriptorTypes(Enum):
ORB = 3
BRISK = 4
AKAZE = 5
FREAK = 6 # DOES NOT WORK!


def feature_detector_factory(min_num_features=kMinNumFeatureDefault,
@@ -181,7 +185,8 @@ def __init__(self, min_num_features=kMinNumFeatureDefault,
self.descriptor_type = descriptor_type

self.num_levels = num_levels
self.scale_factor = kScaleFactor
self.scale_factor = kScaleFactor # scale factor between two octaves
self.sigma_level0 = kSigmaLevel0 # sigma on first octave
self.initSigmaLevels()

self.min_num_features = min_num_features
@@ -191,20 +196,26 @@ def __init__(self, min_num_features=kMinNumFeatureDefault,
self.use_pyramid_adaptor = False
self.pyramid_adaptor = None

print("using opencv ", cv2.__version__)
# check opencv version in order to use the right modules
if cv2.__version__.split('.')[0] == '3':
from cv2.xfeatures2d import SIFT_create, SURF_create
from cv2.xfeatures2d import SIFT_create, SURF_create, FREAK_create
from cv2 import ORB_create, BRISK_create, AKAZE_create
else:
SIFT_create = cv2.SIFT
SURF_create = cv2.SURF
ORB_create = cv2.ORB
BRISK_create = cv2.BRISK
AKAZE_create = cv2.AKAZE
FREAK_create = cv2.FREAK # TODO: to be checked

self.FAST_create = cv2.FastFeatureDetector_create
self.SIFT_create = SIFT_create
self.SURF_create = SURF_create
self.ORB_create = ORB_create
self.BRISK_create = BRISK_create
self.AKAZE_create = AKAZE_create
self.FREAK_create = FREAK_create # DOES NOT WORK!

self.orb_params = dict(nfeatures=min_num_features,
scaleFactor=self.scale_factor,
Expand All @@ -219,25 +230,33 @@ def __init__(self, min_num_features=kMinNumFeatureDefault,

# init detector
if self.detector_type == FeatureDetectorTypes.SIFT:
self._feature_detector = SIFT_create()
self._feature_detector = self.SIFT_create() # N.B.: The number of octaves is computed automatically from the image resolution
# from https://docs.opencv.org/3.4/d5/d3c/classcv_1_1xfeatures2d_1_1SIFT.html
self.scale_factor = 2 # from https://docs.opencv.org/3.1.0/da/df5/tutorial_py_sift_intro.html
# self.layer_scale_factor = math.sqrt(2) # with SIFT, 3 layers per octave are generated with a intra-layer scale factor = sqrt(2)
self.sigma_level0 = 1.6
self.initSigmaLevels()
self.detector_name = 'SIFT'
elif self.detector_type == FeatureDetectorTypes.SURF:
self._feature_detector = SURF_create()
self._feature_detector = self.SURF_create(nOctaveLayers=self.num_levels)
self.detector_name = 'SURF'
elif self.detector_type == FeatureDetectorTypes.ORB:
self._feature_detector = ORB_create(**self.orb_params)
self._feature_detector = self.ORB_create(**self.orb_params)
self.detector_name = 'ORB'
self.use_bock_adaptor = True
elif self.detector_type == FeatureDetectorTypes.BRISK:
self._feature_detector = BRISK_create(octaves=self.num_levels)
self._feature_detector = self.BRISK_create(octaves=self.num_levels)
self.detector_name = 'BRISK'
self.scale_factor = 1.3 # from the BRISK opencv code, this seems to be the scale factor used between intra-octave frames
self.initSigmaLevels()
elif self.detector_type == FeatureDetectorTypes.AKAZE:
self._feature_detector = AKAZE_create(nOctaves=self.num_levels)
self.detector_name = 'AKAZE'
self._feature_detector = self.AKAZE_create(nOctaves=self.num_levels)
self.detector_name = 'AKAZE'
elif self.detector_type == FeatureDetectorTypes.FREAK:
self._feature_detector = self.FREAK_create(nOctaves=self.num_levels)
self.detector_name = 'FREAK'
elif self.detector_type == FeatureDetectorTypes.FAST:
self._feature_detector = cv2.FastFeatureDetector_create(threshold=25, nonmaxSuppression=True)
self._feature_detector = self.FAST_create(threshold=25, nonmaxSuppression=True)
self.detector_name = 'FAST'
self.use_bock_adaptor = True
self.use_pyramid_adaptor = self.num_levels > 1
@@ -257,20 +276,23 @@ def __init__(self, min_num_features=kMinNumFeatureDefault,

# init descriptor
if self.descriptor_type == FeatureDescriptorTypes.SIFT:
self._feature_descriptor = SIFT_create()
self._feature_descriptor = self.SIFT_create()
self.decriptor_name = 'SIFT'
elif self.descriptor_type == FeatureDescriptorTypes.SURF:
self._feature_descriptor = SURF_create()
self._feature_descriptor = self.SURF_create(nOctaveLayers=self.num_levels)
self.decriptor_name = 'SURF'
elif self.descriptor_type == FeatureDescriptorTypes.ORB:
self._feature_descriptor = ORB_create(**self.orb_params)
self._feature_descriptor = self.ORB_create(**self.orb_params)
self.decriptor_name = 'ORB'
elif self.descriptor_type == FeatureDescriptorTypes.BRISK:
self._feature_descriptor = BRISK_create(octaves=self.num_levels)
self._feature_descriptor = self.BRISK_create(octaves=self.num_levels)
self.decriptor_name = 'BRISK'
elif self.descriptor_type == FeatureDescriptorTypes.AKAZE:
self._feature_descriptor = AKAZE_create(nOctaves=self.num_levels)
self.decriptor_name = 'AKAZE'
self._feature_descriptor = self.AKAZE_create(nOctaves=self.num_levels)
self.decriptor_name = 'AKAZE'
elif self.descriptor_type == FeatureDescriptorTypes.FREAK:
self._feature_descriptor = self.FREAK_create(nOctaves=self.num_levels)
self.decriptor_name = 'FREAK'
elif self.descriptor_type == FeatureDescriptorTypes.NONE:
self._feature_descriptor = None
self.decriptor_name = 'None'
@@ -284,11 +306,13 @@ def initSigmaLevels(self):
self.inv_scale_factors = np.zeros(num_levels)
self.inv_level_sigmas2 = np.zeros(num_levels)

# TODO: in the SIFT case, this sigma management could be refined.
# SIFT method has layers with intra-layer scale factor = math.sqrt(2)
self.scale_factors[0]=1.0
self.level_sigmas2[0]=1.0
self.level_sigmas2[0]=self.sigma_level0*self.sigma_level0
for i in range(1,num_levels):
self.scale_factors[i]=self.scale_factors[i-1]*self.scale_factor
self.level_sigmas2[i]=self.scale_factors[i]*self.scale_factors[i]
self.level_sigmas2[i]=self.scale_factors[i]*self.scale_factors[i]*self.level_sigmas2[0] # sigma_i^2 = (scale_i * sigma_0)^2
#print('self.scale_factors: ', self.scale_factors)
for i in range(num_levels):
self.inv_scale_factors[i]=1.0/self.scale_factors[i]
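With the fix above, the per-level sigmas follow the usual pyramid convention sigma_i^2 = (scale^i * sigma_0)^2; a standalone numeric sketch with the default constants `kScaleFactor=1.2` and `kSigmaLevel0=1`:

```
import numpy as np

num_levels, scale_factor, sigma_level0 = 4, 1.2, 1.0
scale_factors = scale_factor ** np.arange(num_levels)   # s_i = 1.2^i
level_sigmas2 = (scale_factors * sigma_level0) ** 2     # sigma_i^2 = (s_i * sigma_0)^2
inv_scale_factors = 1.0 / scale_factors
inv_level_sigmas2 = 1.0 / level_sigmas2
print(level_sigmas2)  # [1. 1.44 2.0736 2.985984]
```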
@@ -306,7 +330,7 @@ def detect(self, frame, mask=None):
kps = self.block_adaptor.detect(frame, mask)
else:
kps = self._feature_detector.detect(frame, mask)
kps = self.satNumberOfFeatures(kps)
if kDrawOriginalExtractedFeatures: # draw the original features
imgDraw = cv2.drawKeypoints(frame, kps, None, color=(0,255,0), flags=0)
cv2.imshow('detected keypoints',imgDraw)
@@ -321,6 +345,8 @@ def detectAndCompute(self, frame, mask=None):
frame = cv2.cvtColor(frame,cv2.COLOR_RGB2GRAY)
kps = self.detect(frame, mask)
kps, des = self._feature_descriptor.compute(frame, kps)
if self.detector_type == FeatureDetectorTypes.SIFT:
unpackSiftOctaveKps(kps)
if kVerbose:
#print('detector: ', self.detector_name, ', #features: ', len(kps))
print('descriptor: ', self.decriptor_name, ', #features: ', len(kps))
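OpenCV's SIFT packs octave, layer and scale into the `kp.octave` field; `unpackSiftOctaveKps` presumably decodes it in place. A common decoding based on the well-known OpenCV convention (a sketch, not necessarily the exact `geom_helpers` implementation):

```
def unpack_sift_octave(kp):
    # decode the octave/layer packed by OpenCV SIFT into kp.octave
    octave = kp.octave & 255
    layer = (kp.octave >> 8) & 255
    if octave >= 128:
        octave = octave - 256  # negative octaves are stored as unsigned bytes
    scale = 1.0 / (1 << octave) if octave >= 0 else float(1 << -octave)
    return octave, layer, scale
```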
8 changes: 4 additions & 4 deletions feature_matcher.py
@@ -128,7 +128,7 @@ def __init__(self, norm_type=cv2.NORM_HAMMING, cross_check = False, type = Featu

def match(self, des1, des2):
if kVerbose:
print('BfFeatureMatcher')
print('BfFeatureMatcher, norm ', self.norm_type)
matches = self.bf.knnMatch(des1, des2, k=2) #knnMatch(queryDescriptors,trainDescriptors)
return self.goodMatches(matches, des1, des2)

@@ -140,9 +140,9 @@ def __init__(self, norm_type=cv2.NORM_HAMMING, cross_check = False, type = Featu
# FLANN parameters for binary descriptors
FLANN_INDEX_LSH = 6
self.index_params= dict(algorithm = FLANN_INDEX_LSH,
table_number = 12, # 12
key_size = 20, # 20
multi_probe_level = 2) # 2
table_number = 6, # 12
key_size = 12, # 20
multi_probe_level = 1) # 2
if norm_type == cv2.NORM_L2:
# FLANN parameters for float descriptors
FLANN_INDEX_KDTREE = 1
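For context, a self-contained sketch of a FLANN matcher configured with the new LSH parameters for binary descriptors such as ORB (`checks` is an assumed value):

```
import cv2

FLANN_INDEX_LSH = 6
index_params = dict(algorithm=FLANN_INDEX_LSH,
                    table_number=6,      # fewer tables: faster matching, slightly less accurate
                    key_size=12,
                    multi_probe_level=1)
search_params = dict(checks=32)          # assumed value; higher means a more exhaustive search
flann = cv2.FlannBasedMatcher(index_params, search_params)
# des1, des2 would be uint8 binary descriptors, e.g. from cv2.ORB_create():
# matches = flann.knnMatch(des1, des2, k=2)
```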