diff --git a/README.md b/README.md
index b493fe3..944e8d0 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@ Author: [Luigi Freda](https://www.luigifreda.com)
**pySLAM** is a *'toy'* implementation of a monocular *Visual Odometry (VO)* pipeline in Python. I released it for **educational purposes**, for a [computer vision class](https://as-ai.org/visual-perception-and-spatial-computing/) I taught. I started developing it for fun, as a Python programming exercise during my free time. I took inspiration from some Python repos available on the web.
Main Scripts:
-* `main_vo.py` combines the simplest VO ingredients without performing any image point triangulation or windowed bundle adjustment. At each step $k$, `main_vo.py` estimates the current camera pose $C_k$ with respect to the previous one $C_{k-1}$. The inter frame pose estimation returns $[R_{k-1,k},t_{k-1,k}]$ with $||t_{k-1,k}||=1$. With this very basic computation, you need to use a ground truth in order to recover a correct inter-frame scale $s$ and estimate a meaningful trajectory by composing $C_k = C_{k-1} * [R_{k-1,k}, s t_{k-1,k}]$. This script is a first start to understand the basics of inter frame feature tracking and camera pose estimation.
+* `main_vo.py` combines the simplest VO ingredients without performing any image point triangulation or windowed bundle adjustment. At each step $k$, `main_vo.py` estimates the current camera pose $C_k$ with respect to the previous one $C_{k-1}$. The inter-frame pose estimation returns $[R_{k-1,k},t_{k-1,k}]$ with $||t_{k-1,k}||=1$. With this very basic approach, you need to use a ground truth in order to recover a correct inter-frame scale $s$ and estimate a valid trajectory by composing $C_k = C_{k-1} * [R_{k-1,k}, s t_{k-1,k}]$ (see the sketch below). This script is a first start to understand the basics of inter-frame feature tracking and camera pose estimation.
* `main_slam.py` adds feature tracking along multiple frames, point triangulation and bundle adjustment in order to estimate the camera trajectory up to scale and build a local map. It's still a VO pipeline, but it shows some basic blocks which are necessary to develop a real visual SLAM pipeline.
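
A minimal sketch of the scale-corrected pose composition used by `main_vo.py`, assuming `R`, `t` come from the essential matrix decomposition (so `t` has unit norm) and the scale `s` has already been recovered from the ground truth (names here are illustrative, not the repo's API):

```
import numpy as np

def compose_pose(C_prev, R, t, s):
    # C_k = C_{k-1} * [R_{k-1,k}, s * t_{k-1,k}] with 4x4 homogeneous matrices
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = s * np.asarray(t).ravel()  # rescale the unit-norm translation
    return C_prev @ T

# toy example: previous pose at the origin, unit translation along z, scale 0.8
C_k = compose_pose(np.eye(4), np.eye(3), [0.0, 0.0, 1.0], s=0.8)
```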
@@ -71,9 +71,9 @@ $ python3 -O main_vo.py
```
This will process a [KITTI](http://www.cvlibs.net/datasets/kitti/eval_odometry.php) video (available in the folder `videos`) by using its corresponding camera calibration file (available in the folder `settings`), and its groundtruth (available in the video folder).
-**N.B.**: remind, the simple script `main_vo.py` **strictly requires a ground truth**, since - with the used approach - the relative motion between two adjacent camera frames can be only estimated up to scale with a monocular camera (i.e. the implemented inter frame pose estimation returns $[R_{k-1,k},t_{k-1,k}]$ with $||t_{k-1,k}||=1$).
+**N.B.**: as explained above, the script `main_vo.py` **strictly requires a ground truth**.
-In order to process a different dataset, you need to set the file `config.ini`:
+In order to process a different **dataset**, you need to update the file `config.ini`:
* select your dataset `type` in the section `[DATASET]` (see the section *Datasets* below for further details)
* set the camera settings file accordingly (see the section *Camera Settings* below)
* set the groundtruth file accordingly (see the section *Camera Settings* below)
@@ -83,6 +83,8 @@ If you want to test the script `main_slam.py`, you can run:
$ python3 -O main_slam.py
```
+You can choose any detector/descriptor among *ORB*, *SIFT*, *SURF*, *BRISK*, *AKAZE* (see below for further information).
+
**WARNING**: the available **KITTI videos** (due to information loss in video compression) make main_slam tracking perform worse than with the original KITTI *image sequences*. The available videos are intended to be used for a first quick test. Please, download and use the original KITTI image sequences as explained below. For instance, on the original KITTI sequence 06, main_slam successfully completes the round; at present, this does not happen with the compressed video.
---
@@ -140,6 +142,29 @@ In order to calibrate your camera, you can use the scripts in the folder `calibr
1. use the script `grab_chessboard_images.py` to collect a sequence of images where the chessboard can be detected (set the chessboard size there)
2. use the script `calibrate.py` to process the collected images and compute the calibration parameters (set the chessboard size there), as sketched below
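
A condensed sketch of the kind of processing `calibrate.py` performs with OpenCV (the chessboard size and image folder are illustrative placeholders):

```
import glob
import cv2
import numpy as np

pattern_size = (9, 6)  # inner corners per row/column: set your chessboard size here
objp = np.zeros((pattern_size[0]*pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for fname in glob.glob('calib_images/*.png'):
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# K is the 3x3 camera matrix, dist the distortion coefficients
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points,
                                                 gray.shape[::-1], None, None)
print('K:\n', K, '\ndistortion:', dist.ravel())
```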
+---
+## Detectors/Descriptors
+
+At present, the following feature **detectors** are supported:
+* *FAST*
+* *Good features to track* [[ShiTo94]](https://ieeexplore.ieee.org/document/323794)
+* *ORB*
+* *SIFT*
+* *SURF*
+* *AKAZE*
+* *BRISK*
+
+You can take a look at the file `feature_detector.py`.
+
+The following feature **descriptors** are supported:
+* *ORB*
+* *SIFT*
+* *SURF*
+* *AKAZE*
+* *BRISK*
+
+In both the scripts `main_vo.py` and `main_slam.py`, you can set which detector/descriptor to use by means of the function *feature_tracker_factory()*. This function can be found in the file `feature_tracker.py`.
+
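
For instance, mirroring the calls used in `main_vo.py` and `main_slam.py`, a tracker with a *Shi-Tomasi* detector and *ORB* descriptors can be created as follows (a sketch of the factory usage; see `feature_tracker.py` for the actual signature):

```
from feature_tracker import feature_tracker_factory, TrackerTypes
from feature_detector import FeatureDetectorTypes, FeatureDescriptorTypes

feature_tracker = feature_tracker_factory(min_num_features=2000, num_levels=3,
                                          detector_type=FeatureDetectorTypes.SHI_TOMASI,
                                          descriptor_type=FeatureDescriptorTypes.ORB,
                                          tracker_type=TrackerTypes.DES_BF)
```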
---
## References
@@ -161,10 +186,9 @@ Tons of things are still missing to attain a real SLAM pipeline:
* keyframe generation and management
* tracking w.r.t. previous keyframe
-* proper local map generation and management
+* proper local map generation and management (covisibility)
* loop closure
* general relocalization
-* in main_slam, tracking by using all kind of features (not only ORB)
---
@@ -173,4 +197,4 @@ Tons of things are still missing to attain a real SLAM pipeline:
* [twitchslam](https://github.com/geohot/twitchslam)
* [monoVO](https://github.com/uoip/monoVO-python)
* [pangolin](https://github.com/stevenlovegrove/Pangolin)
-* [g2opy](https://github.com/uoip/g2opy)
\ No newline at end of file
+* [g2opy](https://github.com/uoip/g2opy)
diff --git a/config.ini b/config.ini
index 91c9515..0cfbce8 100644
--- a/config.ini
+++ b/config.ini
@@ -20,11 +20,11 @@ type=VIDEO_DATASET
type=kitti
base_path=/home/luigi/Work/rgbd_datasets/kitti/dataset
;
-name=00
-cam_settings=settings/KITTI00-02.yaml
+#name=00
+#cam_settings=settings/KITTI00-02.yaml
;
-#name=06
-#cam_settings=settings/KITTI04-12.yaml
+name=06
+cam_settings=settings/KITTI04-12.yaml
;
groundtruth_file=auto
@@ -44,8 +44,8 @@ type=video
base_path=./videos/kitti00
cam_settings=settings/KITTI00-02.yaml
;
-#base_path=./videos/kitti06
-#cam_settings=settings/KITTI04-12.yaml
+;base_path=./videos/kitti06
+;cam_settings=settings/KITTI04-12.yaml
;
name=video.mp4
groundtruth_file=groundtruth.txt
diff --git a/config.py b/config.py
index 17be95c..5e6816a 100644
--- a/config.py
+++ b/config.py
@@ -38,6 +38,8 @@ def __init__(self):
self.cam_settings = None
self.dataset_settings = None
self.dataset_type = None
+ self.current_path = os.getcwd()
+ #print('current path: ', self.current_path)
self.set_lib_paths()
self.get_dataset_settings()
@@ -56,6 +58,9 @@ def set_lib_paths(self):
def get_dataset_settings(self):
self.dataset_type = self.config_parser['DATASET']['type']
self.dataset_settings = self.config_parser[self.dataset_type]
+
+        self.dataset_path = self.dataset_settings['base_path']
+ self.dataset_settings['base_path'] = os.path.join( __location__, self.dataset_path)
#print('dataset_settings: ', self.dataset_settings)
# get camera settings
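
Note on the `base_path` handling above: `os.path.join` discards all preceding components when a later component is absolute, so (assuming `__location__` points at the config directory, as the name suggests) an absolute `base_path` from `config.ini` passes through unchanged, while a relative one such as `./videos/kitti00` gets anchored at the repo location:

```
import os
print(os.path.join('/repo/pyslam', './videos/kitti00'))
# -> '/repo/pyslam/./videos/kitti00'
print(os.path.join('/repo/pyslam', '/home/luigi/Work/rgbd_datasets/kitti/dataset'))
# -> '/home/luigi/Work/rgbd_datasets/kitti/dataset'
```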
diff --git a/dataset.py b/dataset.py
index a9b2bed..d4bf0b9 100644
--- a/dataset.py
+++ b/dataset.py
@@ -91,9 +91,10 @@ class VideoDataset(Dataset):
def __init__(self, path, name, associations=None, type=DatasetType.VIDEO):
super().__init__(path, name, associations, type)
self.filename = path + '/' + name
+ #print('video: ', self.filename)
self.cap = cv2.VideoCapture(self.filename)
if not self.cap.isOpened():
- raise IOError('Cannot open movie file')
+        raise IOError('Cannot open movie file: ' + self.filename)
else:
print('Processing Video Input')
self.num_frames = int(self.cap.get(cv2.CAP_PROP_FRAME_COUNT))
diff --git a/feature_detector.py b/feature_detector.py
index 14ded1d..f163278 100644
--- a/feature_detector.py
+++ b/feature_detector.py
@@ -17,10 +17,11 @@
* along with PYSLAM. If not, see .
"""
import sys
+import math
import numpy as np
import cv2
from enum import Enum
-from geom_helpers import imgBlocks
+from geom_helpers import imgBlocks, unpackSiftOctaveKps
kVerbose = True
@@ -32,6 +33,7 @@
kNumLevels = 4
kNumLevelsInitSigma = 12
kScaleFactor = 1.2
+kSigmaLevel0 = 1.
kDrawOriginalExtractedFeatures = False # for debugging
@@ -44,6 +46,7 @@ class FeatureDetectorTypes(Enum):
ORB = 5
BRISK = 6
AKAZE = 7
+ FREAK = 8 # DOES NOT WORK!
class FeatureDescriptorTypes(Enum):
@@ -53,6 +56,7 @@ class FeatureDescriptorTypes(Enum):
ORB = 3
BRISK = 4
AKAZE = 5
+ FREAK = 6 # DOES NOT WORK!
def feature_detector_factory(min_num_features=kMinNumFeatureDefault,
@@ -181,7 +185,8 @@ def __init__(self, min_num_features=kMinNumFeatureDefault,
self.descriptor_type = descriptor_type
self.num_levels = num_levels
- self.scale_factor = kScaleFactor
+        self.scale_factor = kScaleFactor # scale factor between two octaves
+ self.sigma_level0 = kSigmaLevel0 # sigma on first octave
self.initSigmaLevels()
self.min_num_features = min_num_features
@@ -191,20 +196,26 @@ def __init__(self, min_num_features=kMinNumFeatureDefault,
self.use_pyramid_adaptor = False
self.pyramid_adaptor = None
+ print("using opencv ", cv2.__version__)
+ # check opencv version in order to use the right modules
if cv2.__version__.split('.')[0] == '3':
- from cv2.xfeatures2d import SIFT_create, SURF_create
+ from cv2.xfeatures2d import SIFT_create, SURF_create, FREAK_create
from cv2 import ORB_create, BRISK_create, AKAZE_create
else:
SIFT_create = cv2.SIFT
SURF_create = cv2.SURF
ORB_create = cv2.ORB
BRISK_create = cv2.BRISK
- AKAZE_create = cv2.AKAZE
+ AKAZE_create = cv2.AKAZE
+ FREAK_create = cv2.FREAK # TODO: to be checked
+
+ self.FAST_create = cv2.FastFeatureDetector_create
self.SIFT_create = SIFT_create
self.SURF_create = SURF_create
self.ORB_create = ORB_create
self.BRISK_create = BRISK_create
- self.AKAZE_create = AKAZE_create
+ self.AKAZE_create = AKAZE_create
+ self.FREAK_create = FREAK_create # DOES NOT WORK!
self.orb_params = dict(nfeatures=min_num_features,
scaleFactor=self.scale_factor,
@@ -219,25 +230,33 @@ def __init__(self, min_num_features=kMinNumFeatureDefault,
# init detector
if self.detector_type == FeatureDetectorTypes.SIFT:
- self._feature_detector = SIFT_create()
+ self._feature_detector = self.SIFT_create() # N.B.: The number of octaves is computed automatically from the image resolution
+ # from https://docs.opencv.org/3.4/d5/d3c/classcv_1_1xfeatures2d_1_1SIFT.html
+ self.scale_factor = 2 # from https://docs.opencv.org/3.1.0/da/df5/tutorial_py_sift_intro.html
+            # self.layer_scale_factor = math.sqrt(2) # with SIFT, 3 layers per octave are generated with an intra-layer scale factor = sqrt(2)
+ self.sigma_level0 = 1.6
+ self.initSigmaLevels()
self.detector_name = 'SIFT'
elif self.detector_type == FeatureDetectorTypes.SURF:
- self._feature_detector = SURF_create()
+ self._feature_detector = self.SURF_create(nOctaveLayers=self.num_levels)
self.detector_name = 'SURF'
elif self.detector_type == FeatureDetectorTypes.ORB:
- self._feature_detector = ORB_create(**self.orb_params)
+ self._feature_detector = self.ORB_create(**self.orb_params)
self.detector_name = 'ORB'
self.use_bock_adaptor = True
elif self.detector_type == FeatureDetectorTypes.BRISK:
- self._feature_detector = BRISK_create(octaves=self.num_levels)
+ self._feature_detector = self.BRISK_create(octaves=self.num_levels)
self.detector_name = 'BRISK'
self.scale_factor = 1.3 # from the BRISK opencv code this seems to be the used scale factor between intra-octave frames
self.initSigmaLevels()
elif self.detector_type == FeatureDetectorTypes.AKAZE:
- self._feature_detector = AKAZE_create(nOctaves=self.num_levels)
- self.detector_name = 'AKAZE'
+ self._feature_detector = self.AKAZE_create(nOctaves=self.num_levels)
+ self.detector_name = 'AKAZE'
+ elif self.detector_type == FeatureDetectorTypes.FREAK:
+ self._feature_detector = self.FREAK_create(nOctaves=self.num_levels)
+ self.detector_name = 'FREAK'
elif self.detector_type == FeatureDetectorTypes.FAST:
- self._feature_detector = cv2.FastFeatureDetector_create(threshold=25, nonmaxSuppression=True)
+ self._feature_detector = self.FAST_create(threshold=25, nonmaxSuppression=True)
self.detector_name = 'FAST'
self.use_bock_adaptor = True
self.use_pyramid_adaptor = self.num_levels > 1
@@ -257,20 +276,23 @@ def __init__(self, min_num_features=kMinNumFeatureDefault,
# init descriptor
if self.descriptor_type == FeatureDescriptorTypes.SIFT:
- self._feature_descriptor = SIFT_create()
+ self._feature_descriptor = self.SIFT_create()
self.decriptor_name = 'SIFT'
elif self.descriptor_type == FeatureDescriptorTypes.SURF:
- self._feature_descriptor = SURF_create()
+ self._feature_descriptor = self.SURF_create(nOctaveLayers=self.num_levels)
self.decriptor_name = 'SURF'
elif self.descriptor_type == FeatureDescriptorTypes.ORB:
- self._feature_descriptor = ORB_create(**self.orb_params)
+ self._feature_descriptor = self.ORB_create(**self.orb_params)
self.decriptor_name = 'ORB'
elif self.descriptor_type == FeatureDescriptorTypes.BRISK:
- self._feature_descriptor = BRISK_create(octaves=self.num_levels)
+ self._feature_descriptor = self.BRISK_create(octaves=self.num_levels)
self.decriptor_name = 'BRISK'
elif self.descriptor_type == FeatureDescriptorTypes.AKAZE:
- self._feature_descriptor = AKAZE_create(nOctaves=self.num_levels)
- self.decriptor_name = 'AKAZE'
+ self._feature_descriptor = self.AKAZE_create(nOctaves=self.num_levels)
+ self.decriptor_name = 'AKAZE'
+ elif self.descriptor_type == FeatureDescriptorTypes.FREAK:
+ self._feature_descriptor = self.FREAK_create(nOctaves=self.num_levels)
+ self.decriptor_name = 'FREAK'
elif self.descriptor_type == FeatureDescriptorTypes.NONE:
self._feature_descriptor = None
self.decriptor_name = 'None'
@@ -284,11 +306,13 @@ def initSigmaLevels(self):
self.inv_scale_factors = np.zeros(num_levels)
self.inv_level_sigmas2 = np.zeros(num_levels)
+ # TODO: in the SIFT case, this sigma management could be refined.
+ # SIFT method has layers with intra-layer scale factor = math.sqrt(2)
self.scale_factors[0]=1.0
- self.level_sigmas2[0]=1.0
+ self.level_sigmas2[0]=self.sigma_level0*self.sigma_level0
for i in range(1,num_levels):
self.scale_factors[i]=self.scale_factors[i-1]*self.scale_factor
- self.level_sigmas2[i]=self.scale_factors[i]*self.scale_factors[i]
+            self.level_sigmas2[i]=self.scale_factors[i]*self.scale_factors[i]*self.level_sigmas2[0]  # sigma_i^2 = (sigma0 * scale_i)^2
#print('self.scale_factors: ', self.scale_factors)
for i in range(num_levels):
self.inv_scale_factors[i]=1.0/self.scale_factors[i]
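
A vectorized sketch of the bookkeeping above, with the default values `num_levels=4`, `scale_factor=1.2`, `sigma_level0=1.0`: scales grow geometrically across octaves and the squared sigmas grow with the squared scales:

```
import numpy as np

num_levels, scale_factor, sigma0 = 4, 1.2, 1.0
scale_factors = scale_factor ** np.arange(num_levels)  # [1.0, 1.2, 1.44, 1.728]
level_sigmas2 = (sigma0 * scale_factors) ** 2          # sigma_i^2 = (sigma0 * s_i)^2
inv_scale_factors = 1.0 / scale_factors
inv_level_sigmas2 = 1.0 / level_sigmas2
```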
@@ -306,7 +330,7 @@ def detect(self, frame, mask=None):
kps = self.block_adaptor.detect(frame, mask)
else:
kps = self._feature_detector.detect(frame, mask)
- kps = self.satNumberOfFeatures(kps)
+ kps = self.satNumberOfFeatures(kps)
if kDrawOriginalExtractedFeatures: # draw the original features
imgDraw = cv2.drawKeypoints(frame, kps, None, color=(0,255,0), flags=0)
cv2.imshow('detected keypoints',imgDraw)
@@ -321,6 +345,8 @@ def detectAndCompute(self, frame, mask=None):
frame = cv2.cvtColor(frame,cv2.COLOR_RGB2GRAY)
kps = self.detect(frame, mask)
kps, des = self._feature_descriptor.compute(frame, kps)
+ if self.detector_type == FeatureDetectorTypes.SIFT:
+ unpackSiftOctaveKps(kps)
if kVerbose:
#print('detector: ', self.detector_name, ', #features: ', len(kps))
print('descriptor: ', self.decriptor_name, ', #features: ', len(kps))
diff --git a/feature_matcher.py b/feature_matcher.py
index 17f3000..bd925c3 100644
--- a/feature_matcher.py
+++ b/feature_matcher.py
@@ -128,7 +128,7 @@ def __init__(self, norm_type=cv2.NORM_HAMMING, cross_check = False, type = Featu
def match(self, des1, des2):
if kVerbose:
- print('BfFeatureMatcher')
+ print('BfFeatureMatcher, norm ', self.norm_type)
matches = self.bf.knnMatch(des1, des2, k=2) #knnMatch(queryDescriptors,trainDescriptors)
return self.goodMatches(matches, des1, des2)
@@ -140,9 +140,9 @@ def __init__(self, norm_type=cv2.NORM_HAMMING, cross_check = False, type = Featu
# FLANN parameters for binary descriptors
FLANN_INDEX_LSH = 6
self.index_params= dict(algorithm = FLANN_INDEX_LSH,
- table_number = 12, # 12
- key_size = 20, # 20
- multi_probe_level = 2) # 2
+ table_number = 6, # 12
+ key_size = 12, # 20
+ multi_probe_level = 1) # 2
if norm_type == cv2.NORM_L2:
# FLANN parameters for float descriptors
FLANN_INDEX_KDTREE = 1
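
The reduced LSH parameters above follow the values suggested in the OpenCV FLANN tutorial for binary descriptors and trade some matching recall for speed. A minimal sketch of how such a matcher is assembled (the `checks` value is illustrative):

```
import cv2

FLANN_INDEX_LSH = 6
index_params = dict(algorithm=FLANN_INDEX_LSH,
                    table_number=6, key_size=12, multi_probe_level=1)
search_params = dict(checks=32)
flann = cv2.FlannBasedMatcher(index_params, search_params)
# matches = flann.knnMatch(des1, des2, k=2)  # des1/des2: uint8 binary descriptors (e.g. ORB)
```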
diff --git a/feature_tracker.py b/feature_tracker.py
index 637325d..ae5fcb6 100644
--- a/feature_tracker.py
+++ b/feature_tracker.py
@@ -23,10 +23,11 @@
from feature_detector import feature_detector_factory, FeatureDetectorTypes, FeatureDescriptorTypes
from feature_matcher import feature_matcher_factory, FeatureMatcherTypes
from helpers import Printer
+from geom_helpers import hamming_distance, l2_distance
class TrackerTypes(Enum):
- LK = 0 # use patch as "descriptor" and match by using Lucas Kanade pyr optic flow
+    LK = 0 # use patch as "descriptor" and match by using Lucas-Kanade pyramidal optic flow
DES_BF = 1 # descriptor-based, brute force matching
DES_FLANN = 2 # descriptor-based, FLANN-based matching (should be faster)
@@ -71,6 +72,7 @@ def __init__(self, min_num_features=kMinNumFeatureDefault,
self.tracker_type = tracker_type
self.detector = None
+        self.descriptor_distance = None # function pointer, set according to the descriptor norm
# out: keypoints and descriptors
def detect(self, frame, mask):
@@ -80,6 +82,7 @@ def detect(self, frame, mask):
def track(self, image_ref, image_cur, kp_ref, des_ref):
return None, None, None
+
# use patch as "descriptor" and track/"match" by using Lucas Kanade pyr optic flow
class LkFeatureTracker(FeatureTracker):
def __init__(self, min_num_features=kMinNumFeatureDefault,
@@ -132,13 +135,20 @@ def __init__(self, min_num_features=kMinNumFeatureDefault,
elif descriptor_type == FeatureDescriptorTypes.BRISK:
self.norm_type = cv2.NORM_HAMMING
elif descriptor_type == FeatureDescriptorTypes.AKAZE:
- self.norm_type = cv2.NORM_HAMMING
+ self.norm_type = cv2.NORM_HAMMING
+ elif descriptor_type == FeatureDescriptorTypes.FREAK:
+ self.norm_type = cv2.NORM_HAMMING
elif descriptor_type == FeatureDescriptorTypes.SURF:
self.norm_type = cv2.NORM_L2
elif descriptor_type == FeatureDescriptorTypes.SIFT:
self.norm_type = cv2.NORM_L2
else:
- raise ValueError("Unmanaged norm type for feature tracker %s" % self.tracker_type)
+ raise ValueError("Unmanaged norm type for feature tracker %s" % self.tracker_type)
+
+ if self.norm_type == cv2.NORM_HAMMING:
+ self.descriptor_distance = hamming_distance
+        elif self.norm_type == cv2.NORM_L2:
+ self.descriptor_distance = l2_distance
if tracker_type == TrackerTypes.DES_FLANN:
self.matching_algo = FeatureMatcherTypes.FLANN
@@ -147,7 +157,7 @@ def __init__(self, min_num_features=kMinNumFeatureDefault,
else:
raise ValueError("Unmanaged matching algo for feature tracker %s" % self.tracker_type)
- self.matcher = feature_matcher_factory(norm_type=self.norm_type , type=self.matching_algo)
+ self.matcher = feature_matcher_factory(norm_type=self.norm_type, type=self.matching_algo)
# out: keypoints and descriptors
def detect(self, frame, mask=None):
diff --git a/frame.py b/frame.py
index 8546073..00bace2 100644
--- a/frame.py
+++ b/frame.py
@@ -17,6 +17,7 @@
* along with PYSLAM. If not, see .
"""
+import sys
import cv2
import numpy as np
from scipy.spatial import cKDTree
@@ -33,20 +34,34 @@
kDrawFeatureRadius = [r*3 for r in range(1,20)]
class Frame(object):
- # shared tracker
- tracker = feature_tracker_factory(min_num_features=2000, num_levels = 1,
- detector_type = FeatureDetectorTypes.FAST,
- descriptor_type = FeatureDescriptorTypes.ORB,
- tracker_type = TrackerTypes.DES_BF)
- # shared detector
- detector = tracker.detector
- matcher = tracker.matcher
+ # shared counter
next_id = 0
-
+ # shared tracker
+ tracker = None
+ detector = None
+ matcher = None
+ descriptor_distance = None
+
@staticmethod
def set_tracker(tracker):
Frame.tracker = tracker
Frame.detector = tracker.detector
+ Frame.matcher = tracker.matcher
+ Frame.descriptor_distance = tracker.descriptor_distance
+ Frame.next_id = 0
+
+ if tracker.descriptor_type == FeatureDescriptorTypes.ORB:
+ parameters.kMaxDescriptorDistanceSearchByReproj = parameters.kMaxOrbDistanceSearchByReproj
+ if tracker.descriptor_type == FeatureDescriptorTypes.BRISK:
+ parameters.kMaxDescriptorDistanceSearchByReproj = parameters.kMaxBriskDistanceSearchByReproj
+ if tracker.descriptor_type == FeatureDescriptorTypes.AKAZE:
+ parameters.kMaxDescriptorDistanceSearchByReproj = parameters.kMaxAkazeDistanceSearchByReproj
+ if tracker.descriptor_type == FeatureDescriptorTypes.SIFT:
+ parameters.kMaxDescriptorDistanceSearchByReproj = parameters.kMaxSiftDistanceSearchByReproj
+ if tracker.descriptor_type == FeatureDescriptorTypes.SURF:
+ parameters.kMaxDescriptorDistanceSearchByReproj = parameters.kMaxSurfDistanceSearchByReproj
+
+ parameters.kMaxDescriptorDistanceSearchEpipolar = parameters.kMaxDescriptorDistanceSearchByReproj
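
As a side note on the block above: the same descriptor-dependent threshold selection could be written as a lookup table, which stays in sync more easily as new descriptors are added (a sketch, not what the code does):

```
kMaxDistSearchByReproj = {
    FeatureDescriptorTypes.ORB:   parameters.kMaxOrbDistanceSearchByReproj,
    FeatureDescriptorTypes.BRISK: parameters.kMaxBriskDistanceSearchByReproj,
    FeatureDescriptorTypes.AKAZE: parameters.kMaxAkazeDistanceSearchByReproj,
    FeatureDescriptorTypes.SIFT:  parameters.kMaxSiftDistanceSearchByReproj,
    FeatureDescriptorTypes.SURF:  parameters.kMaxSurfDistanceSearchByReproj,
}
parameters.kMaxDescriptorDistanceSearchByReproj = kMaxDistSearchByReproj.get(
    tracker.descriptor_type, parameters.kMaxOrbDistanceSearchByReproj)
```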
def __init__(self, mapp, img, K, Kinv, DistCoef, pose=np.eye(4), tid=None, des=None):
self.H, self.W = img.shape[0:2]
@@ -67,7 +82,6 @@ def __init__(self, mapp, img, K, Kinv, DistCoef, pose=np.eye(4), tid=None, des=N
# self.kps keypoints
# self.kpsu [u]ndistorted keypoints
- # self.kpsn [n]ormalized keypoints
# self.octaves keypoint octaves
# self.des keypoint descriptors
@@ -80,9 +94,9 @@ def __init__(self, mapp, img, K, Kinv, DistCoef, pose=np.eye(4), tid=None, des=N
# convert to gray image
if img.ndim>2:
img = cv2.cvtColor(img,cv2.COLOR_RGB2GRAY)
- self.kps, self.des = Frame.tracker.detect(img)
+ self.kps, self.des = Frame.tracker.detect(img)
# convert from a list of keypoints to an array of points and octaves
- kps_data = np.array([ [x.pt[0], x.pt[1], x.octave] for x in self.kps], dtype=np.float32)
+ kps_data = np.array([ [x.pt[0], x.pt[1], x.octave] for x in self.kps], dtype=np.float32)
self.octaves = np.uint32(kps_data[:,2].copy())
#print('octaves: ', self.octaves)
self.kps = kps_data[:,:2].copy()
@@ -169,7 +183,7 @@ def compute_points_median_depth(self, points4d):
return z[ ( len(z)-1)//2 ]
- # KD tree of unnormalized keypoints
+ # KD tree of undistorted keypoints
@property
def kd(self):
if not hasattr(self, '_kd'):
diff --git a/geom_helpers.py b/geom_helpers.py
index f2f8f7b..17ba93c 100644
--- a/geom_helpers.py
+++ b/geom_helpers.py
@@ -22,6 +22,13 @@
import cv2
+def poseRt(R, t):
+ ret = np.eye(4)
+ ret[:3, :3] = R
+ ret[:3, 3] = t
+ return ret
+
+
# turn [[x,y]] -> [[x,y,1]]
def add_ones(x):
if len(x.shape) == 1:
@@ -29,8 +36,15 @@ def add_ones(x):
else:
return np.concatenate([x, np.ones((x.shape[0], 1))], axis=1)
+# turn pixel coords [[u,v],...] into normalized image coords [[x,y],...] by applying Kinv
+def normalize(Kinv, pts):
+ return np.dot(Kinv, add_ones(pts).T).T[:, 0:2]
+
# turn w into its skew-symmetric matrix representation
+# w in R^3 -> [[0,-wz,wy],
+#              [wz,0,-wx],
+#              [-wy,wx,0]]
def skew(w):
wx,wy,wz = w.ravel()
return np.array([[0,-wz,wy],[wz,0,-wx],[-wy,wx,0]])
@@ -38,7 +52,11 @@ def skew(w):
def hamming_distance(a, b):
r = (1 << np.arange(8))[:,None]
- return np.count_nonzero((np.bitwise_xor(a,b) & r) != 0)
+ return np.count_nonzero((np.bitwise_xor(a,b) & r) != 0)
+
+def l2_distance(a, b):
+ return np.linalg.norm(a.ravel()-b.ravel())
+
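
A quick check of the two distances on toy descriptors (`hamming_distance` counts differing bits over uint8 arrays, as used for binary descriptors like ORB/BRISK/AKAZE; `l2_distance` serves float descriptors like SIFT/SURF); this assumes it is run from the repo root:

```
import numpy as np
from geom_helpers import hamming_distance, l2_distance

a = np.array([0b10110100, 0b00001111], dtype=np.uint8)
b = np.array([0b10110001, 0b00001111], dtype=np.uint8)
print(hamming_distance(a, b))  # -> 2 (two differing bits in the first byte)
print(l2_distance(np.array([1.0, 2.0]), np.array([1.0, 0.0])))  # -> 2.0
```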
# DLT with normalized image coordinates (see [HartleyZisserman Sect. 12.2 ])
def triangulate_point(pose1, pose2, pt1, pt2):
@@ -62,25 +80,13 @@ def triangulate_points(pose1, pose2, pts1, pts2, mask = None):
def triangulate_points_with_mask(pose1, pose2, pts1, pts2, mask):
ret = np.zeros((pts1.shape[0], 4))
- #num_triangulated_points = 0
for i, p in enumerate(zip(pts1, pts2)):
if mask[i]:
ret[i] = triangulate_point(pose1, pose2, p[0], p[1])
- #num_triangulated_points += 1
return ret
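
For reference, a minimal sketch of the normalized DLT behind `triangulate_point` (cf. Hartley-Zisserman Sect. 12.2): each view contributes two rows of a homogeneous system $A X = 0$ derived from $x \times (P X) = 0$, solved via SVD. Poses are assumed to be 3x4 (or 4x4) projection matrices in normalized image coordinates:

```
import numpy as np

def triangulate_point_dlt(pose1, pose2, pt1, pt2):
    # build the 4-row homogeneous system A X = 0
    A = np.zeros((4, 4))
    A[0] = pt1[0] * pose1[2] - pose1[0]
    A[1] = pt1[1] * pose1[2] - pose1[1]
    A[2] = pt2[0] * pose2[2] - pose2[0]
    A[3] = pt2[1] * pose2[2] - pose2[1]
    _, _, vt = np.linalg.svd(A)
    return vt[3]  # homogeneous 4-vector (not yet dehomogenized)
```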
-def poseRt(R, t):
- ret = np.eye(4)
- ret[:3, :3] = R
- ret[:3, 3] = t
- return ret
-
-
-def normalize(Kinv, pts):
- return np.dot(Kinv, add_ones(pts).T).T[:, 0:2]
-
-# create a generator over an image to extract its sub-blocks
+# create a generator over an image to extract 'row_divs' x 'col_divs' sub-blocks
def imgBlocks(img, row_divs, col_divs):
rows, cols = img.shape[:2]
#print('img.shape: ', img.shape)
@@ -131,3 +137,38 @@ def draw_lines(img, line_edges, pts=None, radius=5):
pt = pts[i]
img = cv2.circle(img,tuple(pt),radius,color,-1)
return img
+
+
+## SIFT stuff ##
+
+# from https://stackoverflow.com/questions/48385672/opencv-python-unpack-sift-octave
+def unpackSiftOctave(kpt):
+ """unpackSIFTOctave(kpt)->(octave,layer,scale)
+ @brief Unpack Sift Keypoint
+ @param kpt: cv2.KeyPoint (of SIFT)
+ """
+ _octave = kpt.octave
+ octave = _octave&0xFF
+ layer = (_octave>>8)&0xFF
+ if octave>=128:
+ octave |= -128
+ if octave>=0:
+        scale = float(1/(1<<octave))
+    else:
+        scale = float(1<<-octave)
+    return (octave, layer, scale)
+
+def unpackSiftOctaveSimple(kpt):
+    """unpackSiftOctaveSimple(kpt)->octave
+ @brief Unpack Sift Keypoint
+ @param kpt: cv2.KeyPoint (of SIFT)
+ """
+ _octave = kpt.octave
+ octave = _octave&0xFF
+ if octave>=128:
+ octave |= -128
+    return abs(octave) # N.B.: abs() is a HACK that needs to be further verified
+
+def unpackSiftOctaveKps(kps):
+ for kpt in kps:
+ kpt.octave = unpackSiftOctaveSimple(kpt)
\ No newline at end of file
diff --git a/initializer.py b/initializer.py
index 64c6085..88eb8c2 100644
--- a/initializer.py
+++ b/initializer.py
@@ -134,7 +134,7 @@ def initialize(self, f_cur, img_cur):
#pts4d = triangulate(f_cur.pose, f_ref.pose, f_cur.kpsn[idx_cur], f_ref.kpsn[idx_ref])
new_pts_count, mask_points = map.add_points(points4d, None, f_cur, f_ref, idx_cur_inliers, idx_ref_inliers, img_cur, check_parallax=True)
- print("triangulated: %d new points, %d matches" % (new_pts_count, len(idx_cur)))
+ print("triangulated: %d new points from %d matches" % (new_pts_count, len(idx_cur)))
err = map.optimize(verbose=False)
print("pose opt err: %f" % err)
diff --git a/main_slam.py b/main_slam.py
index d4189ba..1fc4781 100644
--- a/main_slam.py
+++ b/main_slam.py
@@ -59,16 +59,21 @@
num_features=parameters.kNumFeatures
"""
select your feature tracker
- N.B.1: here you can use just ORB descriptors! ... at present time
- N.B.2: ORB detector (not descriptor) does not work as expected!
+ N.B.: ORB detector (not descriptor) does not work as expected!
"""
tracker_type = TrackerTypes.DES_BF
#tracker_type = TrackerTypes.DES_FLANN
- feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
- #feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 1, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
+
+ feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 3, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 4, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
#feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 4, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
- #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 4, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.BRISK, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 4, detector_type = FeatureDetectorTypes.AKAZE, descriptor_type = FeatureDescriptorTypes.AKAZE, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 4, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SIFT, descriptor_type = FeatureDescriptorTypes.SIFT, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SURF, descriptor_type = FeatureDescriptorTypes.SURF, tracker_type = tracker_type)
+ # create SLAM object
slam = SLAM(cam, feature_tracker, grountruth)
viewer3D = Viewer3D()
diff --git a/main_vo.py b/main_vo.py
index a75dfa9..fec54f1 100644
--- a/main_vo.py
+++ b/main_vo.py
@@ -59,22 +59,27 @@
num_features=2000 # how many features do you want to detect and track?
+
"""
select your feature tracker
N.B: ORB detector (not descriptor) does not work as expected!
"""
+
+ # LK tracker
feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.NONE, tracker_type = TrackerTypes.LK)
- #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = TrackerTypes.DES_BF)
- #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = TrackerTypes.DES_FLANN)
- #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = TrackerTypes.DES_BF)
- #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = TrackerTypes.DES_FLANN)
- #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = TrackerTypes.DES_BF)
- #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = TrackerTypes.DES_BF)
- #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.BRISK, tracker_type = TrackerTypes.DES_BF)
- #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.AKAZE, descriptor_type = FeatureDescriptorTypes.AKAZE, tracker_type = TrackerTypes.DES_BF)
- #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SIFT, descriptor_type = FeatureDescriptorTypes.SIFT, tracker_type = TrackerTypes.DES_FLANN)
- #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SURF, descriptor_type = FeatureDescriptorTypes.SURF, tracker_type = TrackerTypes.DES_FLANN)
+ #tracker_type = TrackerTypes.DES_BF
+ tracker_type = TrackerTypes.DES_FLANN
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.BRISK, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.AKAZE, descriptor_type = FeatureDescriptorTypes.AKAZE, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SIFT, descriptor_type = FeatureDescriptorTypes.SIFT, tracker_type = tracker_type)
+ #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SURF, descriptor_type = FeatureDescriptorTypes.SURF, tracker_type = tracker_type)
+
+ # create visual odometry object
vo = VisualOdometry(cam, grountruth, feature_tracker)
is_draw_traj_img = True
diff --git a/map.py b/map.py
index 933b676..5c7cc42 100644
--- a/map.py
+++ b/map.py
@@ -22,7 +22,7 @@
import json
import math
-from geom_helpers import poseRt, hamming_distance, add_ones
+from geom_helpers import poseRt, add_ones
import parameters
from frame import Frame
from map_point import MapPoint
@@ -75,6 +75,7 @@ def add_points(self, points4d, mask_pts4d, f1, f2, idx1, idx2, img1, check_paral
out_mask_pts4d = np.full(points4d.shape[0], False, dtype=bool)
if mask_pts4d is None:
mask_pts4d = np.full(points4d.shape[0], True, dtype=bool)
+
for i, p in enumerate(points4d):
if not mask_pts4d[i]:
#print('p[%d] not good' % i)
@@ -188,7 +189,7 @@ def locally_optimize(self, local_window=parameters.kLocalWindow, verbose = False
#print(' points: ', sorted([p.id for p in points]))
#err = optimizer_g2o.optimize(frames, points, None, False, verbose, rounds)
err, ratio_bad_observations = optimizer_g2o.localOptimization(frames, points, frames_ref, False, verbose, rounds)
- Printer.green('local optimization - %% bad observations: %.2f ' % (ratio_bad_observations*100) )
+ Printer.green('local optimization - perc bad observations: %.2f %% ' % (ratio_bad_observations*100) )
#self.cull_map_points(points) # already performed in optimizer_g2o.localOptimization()
return err
diff --git a/map_point.py b/map_point.py
index 2f4db62..18a83af 100644
--- a/map_point.py
+++ b/map_point.py
@@ -17,7 +17,7 @@
* along with PYSLAM. If not, see .
"""
-from geom_helpers import poseRt, hamming_distance, add_ones
+from geom_helpers import poseRt, add_ones
from frame import Frame
import time
import numpy as np
@@ -42,11 +42,13 @@ def __init__(self, mapp, position, color, tid=None):
def homogeneous(self):
return add_ones(self.pt)
- def orb(self):
+ # return array of corresponding descriptors
+ def des(self):
return [f.des[idx] for f, idx in zip(self.frames, self.idxs)]
- def orb_distance(self, des):
- return min([hamming_distance(o, des) for o in self.orb()])
+ # minimum distance between input descriptor and map point corresponding descriptors
+ def min_des_distance(self, descriptor):
+ return min([Frame.descriptor_distance(d, descriptor) for d in self.des()])
def delete(self):
for f, idx in zip(self.frames, self.idxs): # f.points[idx] = this point
diff --git a/parameters.py b/parameters.py
index 02ccf74..2c8dea9 100644
--- a/parameters.py
+++ b/parameters.py
@@ -17,28 +17,54 @@
* along with PYSLAM. If not, see .
"""
-kCosMinParallax=0.99998 # 0.9998
+'''
+List of shared parameters
+'''
-kMaxReprojectionDistance=15 # pixels
-kMaxOrbDistanceMatch=50
-kMaxOrbDistanceSearchByReproj=50
-kMatchRatioTest=0.75
+# num of desired keypoints per frame
kNumFeatures=2000
-kLocalWindow=10
-kEveryNumFramesLargeWindowBA=10 # num of frames between two large window BA
-kLargeWindow=20
-kUseLargeWindowBA = True # True: use BA over a large window, False: do not use large window BA
+# min cos angle for triangulation (min parallax angle)
+kCosMinParallax=0.99998 # 0.9998
-kChi2Mono = 5.991 # chi-square 2 DOFs, used for reprojection error (Hartley Zisserman pg 119)
-kInitializerDesiredMedianDepth = 20 # after initialization: the initial median depth is computed and forced to this value
+# initializer
+kInitializerDesiredMedianDepth = 20 # when initializing, the initial median depth is computed and forced to this value
kMinTraslation=0.0001*kInitializerDesiredMedianDepth
+
+# tracking
kUseMotionModel = True # True: use kinematic motion model for computing a first guess pose (to be optimized by current pose optimization) without fitting an essential mat
# False: pose and keypoint match inliers are estimated by fitting an essential mat (5 points algorithm), WARNING: this approach comes with some limitations (please, read the comments of the method slam.estimate_pose_ess_mat())
-kUseSearchFrameByEpipolarLines=False # Match frames by using epipolar lines; the current available interframe pose estimate is used for computing the fundamental mat F
-kMinDistanceFromEpipole=100 # Used with search by epipolar lines
\ No newline at end of file
+
+# search matches by projection
+kMaxReprojectionDistance=15 # [pixels]
+kMatchRatioTest=0.75
+#
+kMaxOrbDistanceSearchByReproj=50 # ORB
+kMaxBriskDistanceSearchByReproj=50 # BRISK (needs better tuning)
+kMaxAkazeDistanceSearchByReproj=80 # AKAZE (needs better tuning)
+kMaxSiftDistanceSearchByReproj=100 # SIFT (needs better tuning)
+kMaxSurfDistanceSearchByReproj=0.05 # SURF (needs better tuning)
+kMaxDescriptorDistanceSearchByReproj=kMaxOrbDistanceSearchByReproj # main (use ORB setting by default)
+
+
+# search matches for triangulation by using epipolar lines
+kUseSearchFrameByProjection=False # Match frames by using frame map points projection and epipolar lines; here, the current available interframe pose estimate is used for computing the fundamental mat F
+kMinDistanceFromEpipole=100 # [pixels] Used with search by epipolar lines
+#
+kMaxDescriptorDistanceSearchEpipolar=kMaxDescriptorDistanceSearchByReproj
+
+
+# BA
+kLocalWindow=10 # [# frames]
+kEveryNumFramesLargeWindowBA=10 # num of frames between two large window BA
+kLargeWindow=20 # [# frames]
+kUseLargeWindowBA = True # True: perform BA over a large window; False: do not perform large window BA
+
+
+kChi2Mono = 5.991 # chi-square 2 DOFs, used for reprojection error (Hartley Zisserman pg 119)
+
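
Regarding `kUseMotionModel` above: with the motion model disabled, the initial pose guess comes from fitting an essential matrix on normalized keypoint matches (see `slam.estimate_pose_ess_mat()`). A sketch of that standard 5-point + `recoverPose` recipe with OpenCV; the RANSAC threshold value is illustrative:

```
import cv2

def estimate_relative_pose(kpn_cur, kpn_ref):
    # kpn_*: Nx2 arrays of normalized image coordinates -> focal=1, pp=(0,0)
    E, _ = cv2.findEssentialMat(kpn_cur, kpn_ref, focal=1.0, pp=(0., 0.),
                                method=cv2.RANSAC, prob=0.999, threshold=0.0003)
    _, R, t, mask = cv2.recoverPose(E, kpn_cur, kpn_ref, focal=1.0, pp=(0., 0.))
    return R, t, mask  # ||t|| = 1: translation is recovered only up to scale
```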
diff --git a/search_points.py b/search_points.py
index 6e4fbc6..380812a 100644
--- a/search_points.py
+++ b/search_points.py
@@ -23,7 +23,7 @@
import cv2
from frame import Frame
-from geom_helpers import skew, add_ones, hamming_distance, draw_lines, draw_points
+from geom_helpers import skew, add_ones, draw_lines, draw_points
from helpers import Printer, getchar
import parameters
from timer import Timer
@@ -35,8 +35,8 @@ def search_frame_by_projection(f_ref, f_cur):
found_pts_count = 0
idx_ref = []
idx_cur = []
- for i, p in enumerate(f_ref.points):
- if p is None or p.is_bad is True:
+    for i, p in enumerate(f_ref.points): # iterate over the map points of f_ref and try to match them with not-yet-matched keypoints of f_cur
+ if p is None:
continue
# project point on f_cur
proj = f_cur.project_map_point(p)
@@ -45,22 +45,33 @@ def search_frame_by_projection(f_ref, f_cur):
(proj[1] > 0) & (proj[1] < f_cur.H)
if is_visible is False:
continue
- for idx_c in f_cur.kd.query_ball_point(proj, parameters.kMaxReprojectionDistance):
+ best_dist = math.inf
+ #best_dist2 = math.inf
+ best_k_idx = -1
+ best_ref_idx = -1
+ for k_idx in f_cur.kd.query_ball_point(proj, parameters.kMaxReprojectionDistance):
# if no point associated
- if f_cur.points[idx_c] is None:
- orb_dist = p.orb_distance(f_cur.des[idx_c])
- #print('b_dist : ', orb_dist)
- if orb_dist < parameters.kMaxOrbDistanceSearchByReproj:
- p.add_observation(f_cur, idx_c)
- found_pts_count += 1
- idx_ref.append(i)
- idx_cur.append(idx_c)
- break
+ if f_cur.points[k_idx] is None:
+ descriptor_dist = p.min_des_distance(f_cur.des[k_idx])
+ if descriptor_dist < parameters.kMaxDescriptorDistanceSearchByReproj and descriptor_dist < best_dist:
+ #best_dist2 = best_dist
+ best_dist = descriptor_dist
+ best_k_idx = k_idx
+ best_ref_idx = i
+            # N.B.1: given that the keypoint search area is limited, here we do not use the match distance ratio test
+            # N.B.2: if we extract keypoints on different octaves/levels, it is very frequent to observe overlapping keypoints (two distinct octave representatives for the same interesting point)
+            #        => do not use the match distance ratio test unless you want to exclude these points
+ if best_k_idx > -1:
+ #print('b_dist : ', best_dist)
+ p.add_observation(f_cur, best_k_idx)
+ found_pts_count += 1
+ idx_ref.append(best_ref_idx)
+ idx_cur.append(best_k_idx)
return idx_ref, idx_cur, found_pts_count
# search by projection matches between {input map points} and {unmatched keypoints of frame f_cur}
-def search_by_projection(points, f_cur):
+def search_map_by_projection(points, f_cur):
found_pts_count = 0
if len(points) > 0:
@@ -69,7 +80,7 @@ def search_by_projection(points, f_cur):
# check if points are visible
visible_pts = (projs[:, 0] > 0) & (projs[:, 0] < f_cur.W) & \
- (projs[:, 1] > 0) & (projs[:, 1] < f_cur.H)
+ (projs[:, 1] > 0) & (projs[:, 1] < f_cur.H)
for i, p in enumerate(points):
if not visible_pts[i] or p.is_bad is True:
@@ -78,15 +89,28 @@ def search_by_projection(points, f_cur):
if f_cur in p.frames:
# we already matched this map point to this frame
continue
- for m_idx in f_cur.kd.query_ball_point(projs[i], parameters.kMaxReprojectionDistance):
+ best_dist = math.inf
+ best_dist2 = math.inf
+ best_level = -1
+ best_level2 = -1
+ best_k_idx = -1
+ for k_idx in f_cur.kd.query_ball_point(projs[i], parameters.kMaxReprojectionDistance):
# if no point associated
- if f_cur.points[m_idx] is None:
- orb_dist = p.orb_distance(f_cur.des[m_idx])
- #print('b_dist : ', orb_dist)
- if orb_dist < parameters.kMaxOrbDistanceSearchByReproj:
- p.add_observation(f_cur, m_idx)
- found_pts_count += 1
- break
+ if f_cur.points[k_idx] is None:
+ descriptor_dist = p.min_des_distance(f_cur.des[k_idx])
+ if descriptor_dist < parameters.kMaxDescriptorDistanceSearchByReproj and descriptor_dist < best_dist:
+ best_dist2 = best_dist
+ best_level2 = best_level
+ best_dist = descriptor_dist
+ best_level = f_cur.octaves[k_idx]
+ best_k_idx = k_idx
+ if best_k_idx > -1:
+ # apply match distance ratio test only if the best and second are in the same scale level
+ if (best_level2 == best_level) and (best_dist > best_dist2 * parameters.kMatchRatioTest):
+ continue
+ #print('best des distance: ', best_dist, ", max dist: ", parameters.kMaxDescriptorDistanceSearchByReproj)
+ p.add_observation(f_cur, best_k_idx)
+ found_pts_count += 1
return found_pts_count
@@ -102,12 +126,12 @@ def search_local_frames_by_projection(map, f_cur, local_window = parameters.kLoc
points.append(p)
point_id_set.add(p.id)
print('searching %d map points' % len(points))
- return search_by_projection(points, f_cur)
+ return search_map_by_projection(points, f_cur)
# search by projection matches between {all map points} and {unmatched keypoints of f_cur}
-def search_map_by_projection(map, f_cur):
- return search_by_projection(map.points, f_cur)
+def search_all_map_by_projection(map, f_cur):
+ return search_map_by_projection(map.points, f_cur)
# compute the fundamental mat F12 and the infinite homography H21 [Hartley Zisserman pag 339]
def computeF12(f1, f2):
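
For reference, `computeF12` builds on the standard two-view relation: given the relative pose $[R_{12}, t_{12}]$ between the frames, the fundamental matrix is $F_{12} = K_1^{-T} [t_{12}]_\times R_{12} K_2^{-1}$ (Hartley-Zisserman pag. 246). A sketch using the `skew` helper from `geom_helpers` (frame conventions assumed; check the code for the exact direction):

```
import numpy as np
from geom_helpers import skew

def compute_F12_sketch(K1inv, K2inv, R12, t12):
    # F12 = K1^{-T} [t12]_x R12 K2^{-1}
    return K1inv.T @ skew(t12) @ R12 @ K2inv
```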
@@ -137,7 +161,7 @@ def epiline_to_end_points(line, e, xinf, cols):
delta = xinf - e
length = np.linalg.norm(delta.ravel())
- delta /= length
+ delta /= length # normalize delta
step = min(parameters.kMinDistanceFromEpipole, length) # keep a minimum distance from epipole if possible
e = e + delta * step
@@ -165,39 +189,62 @@ def epiline_to_end_points(line, e, xinf, cols):
u1 = xmax[0]
v1 = xmax[1]
- # swap in order to force the search starting from the epipole
+ # swap in order to force the search starting from xinf
if swap is True:
u0, v0, u1, v1 = u1, v1, u0, v0
return [ np.array([u0,v0]), np.array([u1,v1]) ]
-# find a match on f along line for the input descriptor
-# e is the epipole
-def find_matches_along_line(f, e, line, descriptor, radius = 5, overlap_ratio = 0.2):
+# find a keypoint match on 'f' along line 'line' for the input 'descriptor'
+# 'e' is the epipole
+# the found keypoint match must not already have a point match
+# overlap_ratio must be in [0,1]
+def find_matches_along_line(f, e, line, descriptor, radius = parameters.kMaxReprojectionDistance, overlap_ratio = 0.5):
#print('line: ', line[0], ", ", line[1])
step = radius*(1-overlap_ratio)
delta = line[1].ravel() - line[0].ravel()
length = np.linalg.norm(delta)
- n = int(math.ceil(length/step))
+ n = max( int(math.ceil(length/step)), 1)
delta = delta/n
#print('delta: ', delta)
+
+ best_dist = math.inf
+ best_dist2 = math.inf
+ best_k_idx = -1
+
for i in range(n+1):
x = line[0].ravel() + i*delta
for k_idx in f.kd.query_ball_point(x, radius):
# if no point associated
if f.points[k_idx] is None:
- orb_dist = hamming_distance(f.des[k_idx], descriptor)
- if orb_dist < parameters.kMaxOrbDistanceSearchByReproj: # TODO: refine! here, we just take the first match
- return k_idx
- return -1
+ descriptor_dist = Frame.descriptor_distance(f.des[k_idx], descriptor)
+ if descriptor_dist < parameters.kMaxDescriptorDistanceSearchEpipolar:
+ #return k_idx # stop at the first match
+ if descriptor_dist < best_dist:
+ best_dist2 = best_dist
+ best_dist = descriptor_dist
+ best_k_idx = k_idx
+ # N.B.: the search "segment line" can have a large width => it is better to use the match distance ratio test
+ if best_dist < best_dist2 * parameters.kMatchRatioTest:
+ return best_k_idx, best_dist
+ else:
+ return -1, 0
+ #return best_k_idx
+
-# search for triangulations between f1 and f2
-# find matches between unmatched keypoints
-# we have pose estimates for both frames
+# search keypoint matches (for triangulation) between f1 and f2
+# search for matches between unmatched keypoints (without a corresponding map point)
+# in input, we already have pose estimates for both f1 and f2
def search_frame_for_triangulation(f1, f2, img2, img1 = None):
+ idxs2_out = []
+ idxs1_out = []
+ lines_out = []
+ num_found_matches = 0
+ img2_epi = None
+
if __debug__:
timer = Timer()
timer.start()
@@ -217,7 +264,7 @@ def search_frame_for_triangulation(f1, f2, img2, img1 = None):
# if the translation is too small we cannot triangulate
if np.linalg.norm(O1w-O2w) < parameters.kMinTraslation: # we assume the Initializer has been used for building the first map
Printer.red("search for triangulation: impossible with zero translation!")
- return # EXIT
+ return idxs1_out, idxs2_out, num_found_matches, img2_epi # EXIT
# compute the fundamental matrix between the two frames
F12, H21 = computeF12(f1, f2)
@@ -227,6 +274,7 @@ def search_frame_for_triangulation(f1, f2, img2, img1 = None):
if p is None: # we consider just unmatched keypoints
kp = f1.kpsu[i]
scale_factor = Frame.detector.scale_factors[f1.octaves[i]]
+ # discard points which are too close to the epipole
if np.linalg.norm(kp-e1) < parameters.kMinDistanceFromEpipole * scale_factor:
continue
idxs1.append(i)
@@ -246,23 +294,34 @@ def search_frame_for_triangulation(f1, f2, img2, img1 = None):
assert(len(line_edges) == len(idxs1))
timer.start()
- idxs2_out = []
- idxs1_out = []
- lines_out = []
- for i, idx in enumerate(idxs1):
- f2_idx = find_matches_along_line(f2, e2, line_edges[i], f1.des[idx])
- if f2_idx>-1:
- idxs2_out.append(f2_idx)
- idxs1_out.append(idxs1[i])
- assert(f2.points[f2_idx] is None)
- assert(f1.points[idxs1[i]] is None)
- if __debug__:
- lines_out.append(line_edges[i])
+ len_des2 = len(f2.des)
+ flag_match = np.full(len_des2, False, dtype=bool)
+ dist_match = np.zeros(len_des2)
+ index_match = np.full(len_des2, 0, dtype=int)
+    for i, idx in enumerate(idxs1): # N.B.: a point in f1 can be matched to more than one point in f2; we avoid this by caching matches with f2 points
+ f2_idx, dist = find_matches_along_line(f2, e2, line_edges[i], f1.des[idx])
+ if f2_idx > -1:
+ if not flag_match[f2_idx]: # new match
+ flag_match[f2_idx] = True
+ dist_match[f2_idx] = dist
+ idxs2_out.append(f2_idx)
+ idxs1_out.append(idx)
+ index_match[f2_idx] = len(idxs2_out)-1
+ assert(f2.points[f2_idx] is None)
+ assert(f1.points[idx] is None)
+ if __debug__:
+ lines_out.append(line_edges[i])
+ else: # already matched
+ if dist < dist_match[f2_idx]: # update in case of a smaller distance
+ dist_match[f2_idx] = dist
+ index = index_match[f2_idx]
+ idxs1_out[index] = idx
+ if __debug__:
+ lines_out[index] = line_edges[i]
num_found_matches = len(idxs1_out)
assert(len(idxs1_out) == len(idxs2_out))
- img2_epi = None
if __debug__:
#print("num found matches: ", num_found_matches)
if True:
diff --git a/slam.py b/slam.py
index 1788d9b..9452bb5 100755
--- a/slam.py
+++ b/slam.py
@@ -26,7 +26,7 @@
from frame import Frame, match_frames
from geom_helpers import triangulate_points, add_ones, poseRt, skew
-from search_points import search_by_projection, search_frame_by_projection, search_local_frames_by_projection, search_frame_for_triangulation
+from search_points import search_map_by_projection, search_frame_by_projection, search_local_frames_by_projection, search_frame_for_triangulation
from map_point import MapPoint
from map import Map
from pinhole_camera import Camera, PinholeCamera
@@ -46,7 +46,7 @@
kLocalWindow = parameters.kLocalWindow
kUseMotionModel = parameters.kUseMotionModel
kUseLargeWindowBA = parameters.kUseLargeWindowBA
-kUseSearchFrameByEpipolarLines = parameters.kUseSearchFrameByEpipolarLines
+kUseSearchFrameByProjection = parameters.kUseSearchFrameByProjection
class SLAMStage(Enum):
@@ -122,6 +122,8 @@ def estimate_pose_ess_mat(self, kpn_ref, kpn_cur):
def track(self, img, frame_id, pose=None, verts=None):
# check image size is coherent with camera params
+ print("img.shape ", img.shape)
+ print("cam ", self.H," x ", self.W)
assert img.shape[0:2] == (self.H, self.W)
self.timer_main_track.start()
@@ -164,7 +166,7 @@ def track(self, img, frame_id, pose=None, verts=None):
f_cur.pose = predicted_pose.copy()
#f_cur.pose = f_ref.pose.copy() # get the last pose as an initial guess for optimization
- if kUseSearchFrameByEpipolarLines:
+ if kUseSearchFrameByProjection:
# search frame by projection: match map points observed in f_ref with keypoints of f_cur
print('search frame by projection...')
idx_ref, idx_cur, num_found_map_pts = search_frame_by_projection(f_ref, f_cur)
@@ -215,7 +217,7 @@ def track(self, img, frame_id, pose=None, verts=None):
# populate f_cur with map points by propagating map point matches of f_ref:
# we use map points observed in f_ref and keypoint matches between f_ref and f_cur
num_found_map_pts_inter_frame = 0
- if not kUseSearchFrameByEpipolarLines:
+ if not kUseSearchFrameByProjection:
for i, idx in enumerate(idx_ref): # iterate over keypoint matches
if f_ref.points[idx] is not None: # if we have a map point P for i-th matched keypoint in f_ref
f_ref.points[idx].add_observation(f_cur, idx_cur[i]) # then P is automatically matched to the i-th matched keypoint in f_cur
@@ -223,7 +225,7 @@ def track(self, img, frame_id, pose=None, verts=None):
print("# matched map points in prev frame: %d " % num_found_map_pts_inter_frame)
- # f_cur pose optimization 1 (here we use first available information coming from first guess of f_cur pose and map points matched with f_cur keypoints )
+        # f_cur pose optimization 1 (here we use the first available information: the first guess of the f_cur pose and the map points propagated from f_ref matched keypoints)
self.timer_pose_opt.start()
pose_opt_error, pose_is_ok, self.num_vo_map_points = optimizer_g2o.poseOptimization(f_cur, verbose=False)
print("pose opt err1: %f, ok: %d" % (pose_opt_error, int(pose_is_ok)) )
@@ -248,8 +250,8 @@ def track(self, img, frame_id, pose=None, verts=None):
if pose_is_ok is True and not self.map.local_map.is_empty():
self.timer_seach_map.start()
#num_found_map_pts = search_local_frames_by_projection(self.map, f_cur)
- num_found_map_pts = search_by_projection(self.map.local_map.points, f_cur) # use the built local map
- print("# new matched map points: %d " % num_found_map_pts)
+ num_found_map_pts = search_map_by_projection(self.map.local_map.points, f_cur) # use the built local map
+ print("# matched map points in local map: %d " % num_found_map_pts)
self.timer_seach_map.refresh()
# f_cur pose optimization 2 with all the matched map points
@@ -263,14 +265,14 @@ def track(self, img, frame_id, pose=None, verts=None):
self.timer_pose_opt.refresh()
- if kUseSearchFrameByEpipolarLines:
- print("search for triangulation...")
+ if kUseSearchFrameByProjection:
+ print("search for triangulation with epipolar lines...")
idx_ref, idx_cur, self.num_matched_kps, _ = search_frame_for_triangulation(f_ref, f_cur, img)
print("# matched keypoints in prev frame: %d " % self.num_matched_kps)
# if pose is ok, then we try to triangulate the matched keypoints (between f_cur and f_ref) that do not have a corresponding map point
- if pose_is_ok is True:
+ if pose_is_ok is True and len(idx_ref)>0:
self.timer_triangulation.start()
# TODO: this triangulation should be done from keyframes!
diff --git a/test/cv/test_feature_detector.py b/test/cv/test_feature_detector.py
index a21794f..0508f86 100644
--- a/test/cv/test_feature_detector.py
+++ b/test/cv/test_feature_detector.py
@@ -13,9 +13,10 @@
num_features=2000
# N.B.: here you can use just ORB descriptors!
#feature_detector = feature_detector_factory(min_num_features=num_features, num_levels=4, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB)
-feature_detector = feature_detector_factory(min_num_features=num_features, num_levels=2, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB)
+#feature_detector = feature_detector_factory(min_num_features=num_features, num_levels=2, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB)
#feature_detector = feature_detector_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.ORB)
-#feature_detector = feature_detector_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB)
+feature_detector = feature_detector_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB)
+#feature_detector = feature_detector_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.FREAK, descriptor_type = FeatureDescriptorTypes.FREAK)
kps, des = feature_detector.detectAndCompute(img)
diff --git a/test/cv/test_search_for_triangulation.py b/test/cv/test_search_for_triangulation.py
index 46b8227..ce67c39 100644
--- a/test/cv/test_search_for_triangulation.py
+++ b/test/cv/test_search_for_triangulation.py
@@ -1,4 +1,5 @@
import sys
+import os
import numpy as np
import cv2
from matplotlib import pyplot as plt
@@ -11,7 +12,7 @@
from frame import Frame, match_frames
from geom_helpers import triangulate_points, add_ones, poseRt, skew, draw_points2
-from search_points import search_by_projection, search_frame_by_projection, search_local_frames_by_projection, search_frame_for_triangulation
+from search_points import search_map_by_projection, search_frame_by_projection, search_local_frames_by_projection, search_frame_for_triangulation
from map_point import MapPoint
from slam import SLAM
from pinhole_camera import Camera, PinholeCamera
@@ -34,6 +35,9 @@
#-------------
config = Config()
+# forced camera settings to be kept coherent with the input file below
+config.config_parser[config.dataset_type]['cam_settings'] = 'settings/KITTI04-12.yaml'
+config.get_cam_settings()
dataset = dataset_factory(config.dataset_settings)
@@ -54,8 +58,8 @@
"""
tracker_type = TrackerTypes.DES_BF
#tracker_type = TrackerTypes.DES_FLANN
-#feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
-feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 1, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
+feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
+#feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 1, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
#feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 3, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
#feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type)
@@ -66,6 +70,7 @@
#-------------
+# N.B.: keep this coherent with the above forced camera settings
img_ref = cv2.imread('kitti06-12.png',cv2.IMREAD_COLOR)
#img_cur = cv2.imread('kitti06-12-01.png',cv2.IMREAD_COLOR)
img_cur = cv2.imread('kitti06-13.png',cv2.IMREAD_COLOR)
diff --git a/visual_odometry.py b/visual_odometry.py
index 16589a5..b9eac89 100644
--- a/visual_odometry.py
+++ b/visual_odometry.py
@@ -39,6 +39,12 @@ class VoStage(Enum):
kUseGroundTruthScale = True
+# This class is a first start to understand the basics of inter frame feature tracking and camera pose estimation.
+# It combines the simplest VO ingredients without performing any image point triangulation or
+# windowed bundle adjustment. At each step $k$, it estimates the current camera pose $C_k$ with respect to the previous one $C_{k-1}$.
+# The inter frame pose estimation returns $[R_{k-1,k},t_{k-1,k}]$ with $||t_{k-1,k}||=1$.
+# With this very basic approach, you need to use a ground truth in order to recover a correct inter-frame scale $s$ and estimate a
+# valid trajectory by composing $C_k = C_{k-1} * [R_{k-1,k}, s t_{k-1,k}]$.
class VisualOdometry(object):
def __init__(self, cam, grountruth, feature_tracker):
self.stage = VoStage.NO_IMAGES_YET
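
As the new class comment explains, the ground truth is only used to recover the inter-frame scale $s$. A sketch of how such an absolute scale can be derived from two consecutive ground-truth positions (illustrative; the repo wraps this in its ground-truth classes):

```
import numpy as np

def absolute_scale(gt_prev, gt_cur):
    # gt_*: ground-truth positions [x, y, z] of two consecutive frames
    return np.linalg.norm(np.asarray(gt_cur) - np.asarray(gt_prev))

s = absolute_scale([0.0, 0.0, 0.0], [0.3, 0.0, 0.4])  # -> 0.5
```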
@@ -99,7 +105,7 @@ def computeFundamentalMatrix(self, kp_ref, kp_cur):
F = F[0:3, 0:3]
return np.matrix(F), mask
- def removeOutliersFromMask(self, mask):
+ def removeOutliersByMask(self, mask):
if mask is not None:
n = self.kpn_cur.shape[0]
mask_index = [ i for i,v in enumerate(mask) if v > 0]