diff --git a/README.md b/README.md index b493fe3..944e8d0 100644 --- a/README.md +++ b/README.md @@ -5,7 +5,7 @@ Author: [Luigi Freda](https://www.luigifreda.com) **pySLAM** is a *'toy'* implementation of a monocular *Visual Odometry (VO)* pipeline in Python. I released it for **educational purposes**, for a [computer vision class](https://as-ai.org/visual-perception-and-spatial-computing/) I taught. I started developing it for fun as a python programming exercise, during my free time. I took inspiration from some python repos available on the web. Main Scripts: -* `main_vo.py` combines the simplest VO ingredients without performing any image point triangulation or windowed bundle adjustment. At each step $k$, `main_vo.py` estimates the current camera pose $C_k$ with respect to the previous one $C_{k-1}$. The inter frame pose estimation returns $[R_{k-1,k},t_{k-1,k}]$ with $||t_{k-1,k}||=1$. With this very basic computation, you need to use a ground truth in order to recover a correct inter-frame scale $s$ and estimate a meaningful trajectory by composing $C_k = C_{k-1} * [R_{k-1,k}, s t_{k-1,k}]$. This script is a first start to understand the basics of inter frame feature tracking and camera pose estimation. +* `main_vo.py` combines the simplest VO ingredients without performing any image point triangulation or windowed bundle adjustment. At each step $k$, `main_vo.py` estimates the current camera pose $C_k$ with respect to the previous one $C_{k-1}$. The inter frame pose estimation returns $[R_{k-1,k},t_{k-1,k}]$ with $||t_{k-1,k}||=1$. With this very basic approach, you need to use a ground truth in order to recover a correct inter-frame scale $s$ and estimate a valid trajectory by composing $C_k = C_{k-1} * [R_{k-1,k}, s t_{k-1,k}]$. This script is a first start to understand the basics of inter frame feature tracking and camera pose estimation. * `main_slam.py` adds feature tracking along multiple frames, point triangulation and bundle adjustment in order to estimate the camera trajectory up-to-scale and a build a local map. It's still a VO pipeline but it shows some basic blocks which are necessary to develop a real visual SLAM pipeline. @@ -71,9 +71,9 @@ $ python3 -O main_vo.py ``` This will process a [KITTI]((http://www.cvlibs.net/datasets/kitti/eval_odometry.php)) video (available in the folder `videos`) by using its corresponding camera calibration file (available in the folder `settings`), and its groundtruth (available in the video folder). -**N.B.**: remind, the simple script `main_vo.py` **strictly requires a ground truth**, since - with the used approach - the relative motion between two adjacent camera frames can be only estimated up to scale with a monocular camera (i.e. the implemented inter frame pose estimation returns $[R_{k-1,k},t_{k-1,k}]$ with $||t_{k-1,k}||=1$). +**N.B.**: as explained above, the script `main_vo.py` **strictly requires a ground truth**. 
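As a concrete reading of the composition formula above, the following is a minimal Python sketch (not code from this repository); the function name `compose_pose` and its arguments are illustrative only.

```
import numpy as np

def compose_pose(C_prev, R_rel, t_rel, s):
    # C_prev: 4x4 absolute pose C_{k-1}
    # R_rel, t_rel: inter-frame rotation and unit-norm translation [R_{k-1,k}, t_{k-1,k}]
    # s: inter-frame scale recovered from the ground truth
    T_rel = np.eye(4)
    T_rel[:3, :3] = R_rel
    T_rel[:3, 3] = s * np.asarray(t_rel).ravel()
    return C_prev @ T_rel  # C_k = C_{k-1} * [R_{k-1,k}, s t_{k-1,k}]
```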
-In order to process a different dataset, you need to set the file `config.ini`: +In order to process a different **dataset**, you need to set the file `config.ini`: * select your dataset `type` in the section `[DATASET]` (see the section *Datasets* below for further details) * the camera settings file accordingly (see the section *Camera Settings* below) * the groudtruth file accordingly (see the section *Camera Settings* below) @@ -83,6 +83,8 @@ If you want to test the script `main_slam.py`, you can run: ``` $ python3 -O main_slam.py ``` +You can choose any detector/descriptor among *ORB*, *SIFT*, *SURF*, *BRISK*, *AKAZE* (see below for further information). + **WARNING**: the available **KITTI videos** (due to information loss in video compression) make main_slam tracking peform worse than with the original KITTI *image sequences*. The available videos are intended to be used for a first quick test. Please, download and use the original KITTI image sequences as explained below. For instance, on the original KITTI sequence 06, main_slam successfully completes the round; at present time, this does not happen with the compressed video. --- @@ -140,6 +142,29 @@ In order to calibrate your camera, you can use the scripts in the folder `calibr 1. use the script `grab_chessboard_images.py` to collect a sequence of images where the chessboard can be detected (set the chessboard size there) 2. use the script `calibrate.py` to process the collected images and compute the calibration parameters (set the chessboard size there) +--- +## Detectors/Descriptors + +At present, the following feature **detectors** are supported: +* *FAST* +* *Good features to track* [[ShiTo94]](https://ieeexplore.ieee.org/document/323794) +* *ORB* +* *SIFT* +* *SURF* +* *AKAZE* +* *BRISK* + +You can take a look at the file `feature_detector.py`. + +The following feature **descriptors** are supported: +* *ORB* +* *SIFT* +* *SURF* +* *AKAZE* +* *BRISK* + +In both the scripts `main_vo.py` and `main_slam.py`, you can set which detector/descriptor to use by means of the function *feature_tracker_factory()*. This function can be found in the file `feature_tracker.py`. + --- ## References @@ -161,10 +186,9 @@ Tons of things are still missing to attain a real SLAM pipeline: * keyframe generation and management * tracking w.r.t. 
previous keyframe -* proper local map generation and management +* proper local map generation and management (covisibility) * loop closure * general relocalization -* in main_slam, tracking by using all kind of features (not only ORB) --- @@ -173,4 +197,4 @@ Tons of things are still missing to attain a real SLAM pipeline: * [twitchslam](https://github.com/geohot/twitchslam) * [monoVO](https://github.com/uoip/monoVO-python) * [pangolin](https://github.com/stevenlovegrove/Pangolin) -* [g2opy](https://github.com/uoip/g2opy) \ No newline at end of file +* [g2opy](https://github.com/uoip/g2opy) diff --git a/config.ini b/config.ini index 91c9515..0cfbce8 100644 --- a/config.ini +++ b/config.ini @@ -20,11 +20,11 @@ type=VIDEO_DATASET type=kitti base_path=/home/luigi/Work/rgbd_datasets/kitti/dataset ; -name=00 -cam_settings=settings/KITTI00-02.yaml +#name=00 +#cam_settings=settings/KITTI00-02.yaml ; -#name=06 -#cam_settings=settings/KITTI04-12.yaml +name=06 +cam_settings=settings/KITTI04-12.yaml ; groundtruth_file=auto @@ -44,8 +44,8 @@ type=video base_path=./videos/kitti00 cam_settings=settings/KITTI00-02.yaml ; -#base_path=./videos/kitti06 -#cam_settings=settings/KITTI04-12.yaml +;base_path=./videos/kitti06 +;cam_settings=settings/KITTI04-12.yaml ; name=video.mp4 groundtruth_file=groundtruth.txt diff --git a/config.py b/config.py index 17be95c..5e6816a 100644 --- a/config.py +++ b/config.py @@ -38,6 +38,8 @@ def __init__(self): self.cam_settings = None self.dataset_settings = None self.dataset_type = None + self.current_path = os.getcwd() + #print('current path: ', self.current_path) self.set_lib_paths() self.get_dataset_settings() @@ -56,6 +58,9 @@ def set_lib_paths(self): def get_dataset_settings(self): self.dataset_type = self.config_parser['DATASET']['type'] self.dataset_settings = self.config_parser[self.dataset_type] + + self.dataset_path = self.dataset_settings['base_path']; + self.dataset_settings['base_path'] = os.path.join( __location__, self.dataset_path) #print('dataset_settings: ', self.dataset_settings) # get camera settings diff --git a/dataset.py b/dataset.py index a9b2bed..d4bf0b9 100644 --- a/dataset.py +++ b/dataset.py @@ -91,9 +91,10 @@ class VideoDataset(Dataset): def __init__(self, path, name, associations=None, type=DatasetType.VIDEO): super().__init__(path, name, associations, type) self.filename = path + '/' + name + #print('video: ', self.filename) self.cap = cv2.VideoCapture(self.filename) if not self.cap.isOpened(): - raise IOError('Cannot open movie file') + raise IOError('Cannot open movie file: ', self.filename) else: print('Processing Video Input') self.num_frames = int(self.cap.get(cv2.CAP_PROP_FRAME_COUNT)) diff --git a/feature_detector.py b/feature_detector.py index 14ded1d..f163278 100644 --- a/feature_detector.py +++ b/feature_detector.py @@ -17,10 +17,11 @@ * along with PYSLAM. If not, see . """ import sys +import math import numpy as np import cv2 from enum import Enum -from geom_helpers import imgBlocks +from geom_helpers import imgBlocks, unpackSiftOctaveKps kVerbose = True @@ -32,6 +33,7 @@ kNumLevels = 4 kNumLevelsInitSigma = 12 kScaleFactor = 1.2 +kSigmaLevel0 = 1. kDrawOriginalExtractedFeatures = False # for debugging @@ -44,6 +46,7 @@ class FeatureDetectorTypes(Enum): ORB = 5 BRISK = 6 AKAZE = 7 + FREAK = 8 # DOES NOT WORK! class FeatureDescriptorTypes(Enum): @@ -53,6 +56,7 @@ class FeatureDescriptorTypes(Enum): ORB = 3 BRISK = 4 AKAZE = 5 + FREAK = 6 # DOES NOT WORK! 
def feature_detector_factory(min_num_features=kMinNumFeatureDefault, @@ -181,7 +185,8 @@ def __init__(self, min_num_features=kMinNumFeatureDefault, self.descriptor_type = descriptor_type self.num_levels = num_levels - self.scale_factor = kScaleFactor + self.scale_factor = kScaleFactor # scale factor bewteen two octaves + self.sigma_level0 = kSigmaLevel0 # sigma on first octave self.initSigmaLevels() self.min_num_features = min_num_features @@ -191,20 +196,26 @@ def __init__(self, min_num_features=kMinNumFeatureDefault, self.use_pyramid_adaptor = False self.pyramid_adaptor = None + print("using opencv ", cv2.__version__) + # check opencv version in order to use the right modules if cv2.__version__.split('.')[0] == '3': - from cv2.xfeatures2d import SIFT_create, SURF_create + from cv2.xfeatures2d import SIFT_create, SURF_create, FREAK_create from cv2 import ORB_create, BRISK_create, AKAZE_create else: SIFT_create = cv2.SIFT SURF_create = cv2.SURF ORB_create = cv2.ORB BRISK_create = cv2.BRISK - AKAZE_create = cv2.AKAZE + AKAZE_create = cv2.AKAZE + FREAK_create = cv2.FREAK # TODO: to be checked + + self.FAST_create = cv2.FastFeatureDetector_create self.SIFT_create = SIFT_create self.SURF_create = SURF_create self.ORB_create = ORB_create self.BRISK_create = BRISK_create - self.AKAZE_create = AKAZE_create + self.AKAZE_create = AKAZE_create + self.FREAK_create = FREAK_create # DOES NOT WORK! self.orb_params = dict(nfeatures=min_num_features, scaleFactor=self.scale_factor, @@ -219,25 +230,33 @@ def __init__(self, min_num_features=kMinNumFeatureDefault, # init detector if self.detector_type == FeatureDetectorTypes.SIFT: - self._feature_detector = SIFT_create() + self._feature_detector = self.SIFT_create() # N.B.: The number of octaves is computed automatically from the image resolution + # from https://docs.opencv.org/3.4/d5/d3c/classcv_1_1xfeatures2d_1_1SIFT.html + self.scale_factor = 2 # from https://docs.opencv.org/3.1.0/da/df5/tutorial_py_sift_intro.html + # self.layer_scale_factor = math.sqrt(2) # with SIFT, 3 layers per octave are generated with a intra-layer scale factor = sqrt(2) + self.sigma_level0 = 1.6 + self.initSigmaLevels() self.detector_name = 'SIFT' elif self.detector_type == FeatureDetectorTypes.SURF: - self._feature_detector = SURF_create() + self._feature_detector = self.SURF_create(nOctaveLayers=self.num_levels) self.detector_name = 'SURF' elif self.detector_type == FeatureDetectorTypes.ORB: - self._feature_detector = ORB_create(**self.orb_params) + self._feature_detector = self.ORB_create(**self.orb_params) self.detector_name = 'ORB' self.use_bock_adaptor = True elif self.detector_type == FeatureDetectorTypes.BRISK: - self._feature_detector = BRISK_create(octaves=self.num_levels) + self._feature_detector = self.BRISK_create(octaves=self.num_levels) self.detector_name = 'BRISK' self.scale_factor = 1.3 # from the BRISK opencv code this seems to be the used scale factor between intra-octave frames self.initSigmaLevels() elif self.detector_type == FeatureDetectorTypes.AKAZE: - self._feature_detector = AKAZE_create(nOctaves=self.num_levels) - self.detector_name = 'AKAZE' + self._feature_detector = self.AKAZE_create(nOctaves=self.num_levels) + self.detector_name = 'AKAZE' + elif self.detector_type == FeatureDetectorTypes.FREAK: + self._feature_detector = self.FREAK_create(nOctaves=self.num_levels) + self.detector_name = 'FREAK' elif self.detector_type == FeatureDetectorTypes.FAST: - self._feature_detector = cv2.FastFeatureDetector_create(threshold=25, nonmaxSuppression=True) + 
self._feature_detector = self.FAST_create(threshold=25, nonmaxSuppression=True) self.detector_name = 'FAST' self.use_bock_adaptor = True self.use_pyramid_adaptor = self.num_levels > 1 @@ -257,20 +276,23 @@ def __init__(self, min_num_features=kMinNumFeatureDefault, # init descriptor if self.descriptor_type == FeatureDescriptorTypes.SIFT: - self._feature_descriptor = SIFT_create() + self._feature_descriptor = self.SIFT_create() self.decriptor_name = 'SIFT' elif self.descriptor_type == FeatureDescriptorTypes.SURF: - self._feature_descriptor = SURF_create() + self._feature_descriptor = self.SURF_create(nOctaveLayers=self.num_levels) self.decriptor_name = 'SURF' elif self.descriptor_type == FeatureDescriptorTypes.ORB: - self._feature_descriptor = ORB_create(**self.orb_params) + self._feature_descriptor = self.ORB_create(**self.orb_params) self.decriptor_name = 'ORB' elif self.descriptor_type == FeatureDescriptorTypes.BRISK: - self._feature_descriptor = BRISK_create(octaves=self.num_levels) + self._feature_descriptor = self.BRISK_create(octaves=self.num_levels) self.decriptor_name = 'BRISK' elif self.descriptor_type == FeatureDescriptorTypes.AKAZE: - self._feature_descriptor = AKAZE_create(nOctaves=self.num_levels) - self.decriptor_name = 'AKAZE' + self._feature_descriptor = self.AKAZE_create(nOctaves=self.num_levels) + self.decriptor_name = 'AKAZE' + elif self.descriptor_type == FeatureDescriptorTypes.FREAK: + self._feature_descriptor = self.FREAK_create(nOctaves=self.num_levels) + self.decriptor_name = 'FREAK' elif self.descriptor_type == FeatureDescriptorTypes.NONE: self._feature_descriptor = None self.decriptor_name = 'None' @@ -284,11 +306,13 @@ def initSigmaLevels(self): self.inv_scale_factors = np.zeros(num_levels) self.inv_level_sigmas2 = np.zeros(num_levels) + # TODO: in the SIFT case, this sigma management could be refined. 
+ # SIFT method has layers with intra-layer scale factor = math.sqrt(2) self.scale_factors[0]=1.0 - self.level_sigmas2[0]=1.0 + self.level_sigmas2[0]=self.sigma_level0*self.sigma_level0 for i in range(1,num_levels): self.scale_factors[i]=self.scale_factors[i-1]*self.scale_factor - self.level_sigmas2[i]=self.scale_factors[i]*self.scale_factors[i] + self.level_sigmas2[i]=self.scale_factors[i]*self.scale_factors[i]*self.level_sigmas2[i-1] #print('self.scale_factors: ', self.scale_factors) for i in range(num_levels): self.inv_scale_factors[i]=1.0/self.scale_factors[i] @@ -306,7 +330,7 @@ def detect(self, frame, mask=None): kps = self.block_adaptor.detect(frame, mask) else: kps = self._feature_detector.detect(frame, mask) - kps = self.satNumberOfFeatures(kps) + kps = self.satNumberOfFeatures(kps) if kDrawOriginalExtractedFeatures: # draw the original features imgDraw = cv2.drawKeypoints(frame, kps, None, color=(0,255,0), flags=0) cv2.imshow('detected keypoints',imgDraw) @@ -321,6 +345,8 @@ def detectAndCompute(self, frame, mask=None): frame = cv2.cvtColor(frame,cv2.COLOR_RGB2GRAY) kps = self.detect(frame, mask) kps, des = self._feature_descriptor.compute(frame, kps) + if self.detector_type == FeatureDetectorTypes.SIFT: + unpackSiftOctaveKps(kps) if kVerbose: #print('detector: ', self.detector_name, ', #features: ', len(kps)) print('descriptor: ', self.decriptor_name, ', #features: ', len(kps)) diff --git a/feature_matcher.py b/feature_matcher.py index 17f3000..bd925c3 100644 --- a/feature_matcher.py +++ b/feature_matcher.py @@ -128,7 +128,7 @@ def __init__(self, norm_type=cv2.NORM_HAMMING, cross_check = False, type = Featu def match(self, des1, des2): if kVerbose: - print('BfFeatureMatcher') + print('BfFeatureMatcher, norm ', self.norm_type) matches = self.bf.knnMatch(des1, des2, k=2) #knnMatch(queryDescriptors,trainDescriptors) return self.goodMatches(matches, des1, des2) @@ -140,9 +140,9 @@ def __init__(self, norm_type=cv2.NORM_HAMMING, cross_check = False, type = Featu # FLANN parameters for binary descriptors FLANN_INDEX_LSH = 6 self.index_params= dict(algorithm = FLANN_INDEX_LSH, - table_number = 12, # 12 - key_size = 20, # 20 - multi_probe_level = 2) # 2 + table_number = 6, # 12 + key_size = 12, # 20 + multi_probe_level = 1) # 2 if norm_type == cv2.NORM_L2: # FLANN parameters for float descriptors FLANN_INDEX_KDTREE = 1 diff --git a/feature_tracker.py b/feature_tracker.py index 637325d..ae5fcb6 100644 --- a/feature_tracker.py +++ b/feature_tracker.py @@ -23,10 +23,11 @@ from feature_detector import feature_detector_factory, FeatureDetectorTypes, FeatureDescriptorTypes from feature_matcher import feature_matcher_factory, FeatureMatcherTypes from helpers import Printer +from geom_helpers import hamming_distance, l2_distance class TrackerTypes(Enum): - LK = 0 # use patch as "descriptor" and match by using Lucas Kanade pyr optic flow + LK = 0 # use patch as "descriptor" and match by using Lucas Kanade pyramid optic flow DES_BF = 1 # descriptor-based, brute force matching DES_FLANN = 2 # descriptor-based, FLANN-based matching (should be faster) @@ -71,6 +72,7 @@ def __init__(self, min_num_features=kMinNumFeatureDefault, self.tracker_type = tracker_type self.detector = None + self.descriptor_distance = None # pointer function # out: keypoints and descriptors def detect(self, frame, mask): @@ -80,6 +82,7 @@ def detect(self, frame, mask): def track(self, image_ref, image_cur, kp_ref, des_ref): return None, None, None + # use patch as "descriptor" and track/"match" by using Lucas Kanade pyr optic 
flow class LkFeatureTracker(FeatureTracker): def __init__(self, min_num_features=kMinNumFeatureDefault, @@ -132,13 +135,20 @@ def __init__(self, min_num_features=kMinNumFeatureDefault, elif descriptor_type == FeatureDescriptorTypes.BRISK: self.norm_type = cv2.NORM_HAMMING elif descriptor_type == FeatureDescriptorTypes.AKAZE: - self.norm_type = cv2.NORM_HAMMING + self.norm_type = cv2.NORM_HAMMING + elif descriptor_type == FeatureDescriptorTypes.FREAK: + self.norm_type = cv2.NORM_HAMMING elif descriptor_type == FeatureDescriptorTypes.SURF: self.norm_type = cv2.NORM_L2 elif descriptor_type == FeatureDescriptorTypes.SIFT: self.norm_type = cv2.NORM_L2 else: - raise ValueError("Unmanaged norm type for feature tracker %s" % self.tracker_type) + raise ValueError("Unmanaged norm type for feature tracker %s" % self.tracker_type) + + if self.norm_type == cv2.NORM_HAMMING: + self.descriptor_distance = hamming_distance + if self.norm_type == cv2.NORM_L2: + self.descriptor_distance = l2_distance if tracker_type == TrackerTypes.DES_FLANN: self.matching_algo = FeatureMatcherTypes.FLANN @@ -147,7 +157,7 @@ def __init__(self, min_num_features=kMinNumFeatureDefault, else: raise ValueError("Unmanaged matching algo for feature tracker %s" % self.tracker_type) - self.matcher = feature_matcher_factory(norm_type=self.norm_type , type=self.matching_algo) + self.matcher = feature_matcher_factory(norm_type=self.norm_type, type=self.matching_algo) # out: keypoints and descriptors def detect(self, frame, mask=None): diff --git a/frame.py b/frame.py index 8546073..00bace2 100644 --- a/frame.py +++ b/frame.py @@ -17,6 +17,7 @@ * along with PYSLAM. If not, see . """ +import sys import cv2 import numpy as np from scipy.spatial import cKDTree @@ -33,20 +34,34 @@ kDrawFeatureRadius = [r*3 for r in range(1,20)] class Frame(object): - # shared tracker - tracker = feature_tracker_factory(min_num_features=2000, num_levels = 1, - detector_type = FeatureDetectorTypes.FAST, - descriptor_type = FeatureDescriptorTypes.ORB, - tracker_type = TrackerTypes.DES_BF) - # shared detector - detector = tracker.detector - matcher = tracker.matcher + # shared counter next_id = 0 - + # shared tracker + tracker = None + detector = None + matcher = None + descriptor_distance = None + @staticmethod def set_tracker(tracker): Frame.tracker = tracker Frame.detector = tracker.detector + Frame.matcher = tracker.matcher + Frame.descriptor_distance = tracker.descriptor_distance + Frame.next_id = 0 + + if tracker.descriptor_type == FeatureDescriptorTypes.ORB: + parameters.kMaxDescriptorDistanceSearchByReproj = parameters.kMaxOrbDistanceSearchByReproj + if tracker.descriptor_type == FeatureDescriptorTypes.BRISK: + parameters.kMaxDescriptorDistanceSearchByReproj = parameters.kMaxBriskDistanceSearchByReproj + if tracker.descriptor_type == FeatureDescriptorTypes.AKAZE: + parameters.kMaxDescriptorDistanceSearchByReproj = parameters.kMaxAkazeDistanceSearchByReproj + if tracker.descriptor_type == FeatureDescriptorTypes.SIFT: + parameters.kMaxDescriptorDistanceSearchByReproj = parameters.kMaxSiftDistanceSearchByReproj + if tracker.descriptor_type == FeatureDescriptorTypes.SURF: + parameters.kMaxDescriptorDistanceSearchByReproj = parameters.kMaxSurfDistanceSearchByReproj + + parameters.kMaxDescriptorDistanceSearchEpipolar = parameters.kMaxDescriptorDistanceSearchByReproj def __init__(self, mapp, img, K, Kinv, DistCoef, pose=np.eye(4), tid=None, des=None): self.H, self.W = img.shape[0:2] @@ -67,7 +82,6 @@ def __init__(self, mapp, img, K, Kinv, DistCoef, 
pose=np.eye(4), tid=None, des=N # self.kps keypoints # self.kpsu [u]ndistorted keypoints - # self.kpsn [n]ormalized keypoints # self.octaves keypoint octaves # self.des keypoint descriptors @@ -80,9 +94,9 @@ def __init__(self, mapp, img, K, Kinv, DistCoef, pose=np.eye(4), tid=None, des=N # convert to gray image if img.ndim>2: img = cv2.cvtColor(img,cv2.COLOR_RGB2GRAY) - self.kps, self.des = Frame.tracker.detect(img) + self.kps, self.des = Frame.tracker.detect(img) # convert from a list of keypoints to an array of points and octaves - kps_data = np.array([ [x.pt[0], x.pt[1], x.octave] for x in self.kps], dtype=np.float32) + kps_data = np.array([ [x.pt[0], x.pt[1], x.octave] for x in self.kps], dtype=np.float32) self.octaves = np.uint32(kps_data[:,2].copy()) #print('octaves: ', self.octaves) self.kps = kps_data[:,:2].copy() @@ -169,7 +183,7 @@ def compute_points_median_depth(self, points4d): return z[ ( len(z)-1)//2 ] - # KD tree of unnormalized keypoints + # KD tree of undistorted keypoints @property def kd(self): if not hasattr(self, '_kd'): diff --git a/geom_helpers.py b/geom_helpers.py index f2f8f7b..17ba93c 100644 --- a/geom_helpers.py +++ b/geom_helpers.py @@ -22,6 +22,13 @@ import cv2 +def poseRt(R, t): + ret = np.eye(4) + ret[:3, :3] = R + ret[:3, 3] = t + return ret + + # turn [[x,y]] -> [[x,y,1]] def add_ones(x): if len(x.shape) == 1: @@ -29,8 +36,15 @@ def add_ones(x): else: return np.concatenate([x, np.ones((x.shape[0], 1))], axis=1) +# turn [[x,y,w]] -> [[x/w,y/w,1]] +def normalize(Kinv, pts): + return np.dot(Kinv, add_ones(pts).T).T[:, 0:2] + # turn w in its skew symmetrix matrix representation +# w in IR^3 -> [0,-wz,wy], +# [wz,0,-wx], +# [-wy,wx,0]] def skew(w): wx,wy,wz = w.ravel() return np.array([[0,-wz,wy],[wz,0,-wx],[-wy,wx,0]]) @@ -38,7 +52,11 @@ def skew(w): def hamming_distance(a, b): r = (1 << np.arange(8))[:,None] - return np.count_nonzero((np.bitwise_xor(a,b) & r) != 0) + return np.count_nonzero((np.bitwise_xor(a,b) & r) != 0) + +def l2_distance(a, b): + return np.linalg.norm(a.ravel()-b.ravel()) + # DLT with normalized image coordinates (see [HartleyZisserman Sect. 
12.2 ]) def triangulate_point(pose1, pose2, pt1, pt2): @@ -62,25 +80,13 @@ def triangulate_points(pose1, pose2, pts1, pts2, mask = None): def triangulate_points_with_mask(pose1, pose2, pts1, pts2, mask): ret = np.zeros((pts1.shape[0], 4)) - #num_triangulated_points = 0 for i, p in enumerate(zip(pts1, pts2)): if mask[i]: ret[i] = triangulate_point(pose1, pose2, p[0], p[1]) - #num_triangulated_points += 1 return ret -def poseRt(R, t): - ret = np.eye(4) - ret[:3, :3] = R - ret[:3, 3] = t - return ret - - -def normalize(Kinv, pts): - return np.dot(Kinv, add_ones(pts).T).T[:, 0:2] - -# create a generator over an image to extract its sub-blocks +# create a generator over an image to extract 'row_divs' x 'col_divs' sub-blocks def imgBlocks(img, row_divs, col_divs): rows, cols = img.shape[:2] #print('img.shape: ', img.shape) @@ -131,3 +137,38 @@ def draw_lines(img, line_edges, pts=None, radius=5): pt = pts[i] img = cv2.circle(img,tuple(pt),radius,color,-1) return img + + +## SIFT stuff ## + +# from https://stackoverflow.com/questions/48385672/opencv-python-unpack-sift-octave +def unpackSiftOctave(kpt): + """unpackSIFTOctave(kpt)->(octave,layer,scale) + @brief Unpack Sift Keypoint + @param kpt: cv2.KeyPoint (of SIFT) + """ + _octave = kpt.octave + octave = _octave&0xFF + layer = (_octave>>8)&0xFF + if octave>=128: + octave |= -128 + if octave>=0: + scale = float(1/(1<<octave)) + else: + scale = float(1<<-octave) + return (octave, layer, scale) + +def unpackSiftOctaveSimple(kpt): + """unpackSIFTOctave(kpt)->octave + @brief Unpack Sift Keypoint + @param kpt: cv2.KeyPoint (of SIFT) + """ + _octave = kpt.octave + octave = _octave&0xFF + if octave>=128: + octave |= -128 + return abs(octave) # N.B.: abs() is an HACK that needs to be further verified + +def unpackSiftOctaveKps(kps): + for kpt in kps: + kpt.octave = unpackSiftOctaveSimple(kpt) \ No newline at end of file diff --git a/initializer.py b/initializer.py index 64c6085..88eb8c2 100644 --- a/initializer.py +++ b/initializer.py @@ -134,7 +134,7 @@ def initialize(self, f_cur, img_cur): #pts4d = triangulate(f_cur.pose, f_ref.pose, f_cur.kpsn[idx_cur], f_ref.kpsn[idx_ref]) new_pts_count, mask_points = map.add_points(points4d, None, f_cur, f_ref, idx_cur_inliers, idx_ref_inliers, img_cur, check_parallax=True) - print("triangulated: %d new points, %d matches" % (new_pts_count, len(idx_cur))) + print("triangulated: %d new points from %d matches" % (new_pts_count, len(idx_cur))) err = map.optimize(verbose=False) print("pose opt err: %f" % err) diff --git a/main_slam.py b/main_slam.py index d4189ba..1fc4781 100644 --- a/main_slam.py +++ b/main_slam.py @@ -59,16 +59,21 @@ num_features=parameters.kNumFeatures """ select your feature tracker - N.B.1: here you can use just ORB descriptors! ... at present time - N.B.2: ORB detector (not descriptor) does not work as expected! 
""" tracker_type = TrackerTypes.DES_BF #tracker_type = TrackerTypes.DES_FLANN - feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) - #feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 1, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) + + feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 3, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) + #feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 4, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) #feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 4, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) - #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) + #feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 4, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.BRISK, tracker_type = tracker_type) + #feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 4, detector_type = FeatureDetectorTypes.AKAZE, descriptor_type = FeatureDescriptorTypes.AKAZE, tracker_type = tracker_type) + #feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 4, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) + #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SIFT, descriptor_type = FeatureDescriptorTypes.SIFT, tracker_type = tracker_type) + #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SURF, descriptor_type = FeatureDescriptorTypes.SURF, tracker_type = tracker_type) + # create SLAM object slam = SLAM(cam, feature_tracker, grountruth) viewer3D = Viewer3D() diff --git a/main_vo.py b/main_vo.py index a75dfa9..fec54f1 100644 --- a/main_vo.py +++ b/main_vo.py @@ -59,22 +59,27 @@ num_features=2000 # how many features do you want to detect and track? + """ select your feature tracker N.B: ORB detector (not descriptor) does not work as expected! 
""" + + # LK tracker feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.NONE, tracker_type = TrackerTypes.LK) - #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = TrackerTypes.DES_BF) - #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = TrackerTypes.DES_FLANN) - #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = TrackerTypes.DES_BF) - #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = TrackerTypes.DES_FLANN) - #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = TrackerTypes.DES_BF) - #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = TrackerTypes.DES_BF) - #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.BRISK, tracker_type = TrackerTypes.DES_BF) - #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.AKAZE, descriptor_type = FeatureDescriptorTypes.AKAZE, tracker_type = TrackerTypes.DES_BF) - #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SIFT, descriptor_type = FeatureDescriptorTypes.SIFT, tracker_type = TrackerTypes.DES_FLANN) - #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SURF, descriptor_type = FeatureDescriptorTypes.SURF, tracker_type = TrackerTypes.DES_FLANN) + #tracker_type = TrackerTypes.DES_BF + tracker_type = TrackerTypes.DES_FLANN + #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) + #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) + #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) + #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) + #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.BRISK, tracker_type = tracker_type) + #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.AKAZE, descriptor_type = FeatureDescriptorTypes.AKAZE, tracker_type = tracker_type) + #feature_tracker = 
feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SIFT, descriptor_type = FeatureDescriptorTypes.SIFT, tracker_type = tracker_type) + #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SURF, descriptor_type = FeatureDescriptorTypes.SURF, tracker_type = tracker_type) + + # create visual odometry object vo = VisualOdometry(cam, grountruth, feature_tracker) is_draw_traj_img = True diff --git a/map.py b/map.py index 933b676..5c7cc42 100644 --- a/map.py +++ b/map.py @@ -22,7 +22,7 @@ import json import math -from geom_helpers import poseRt, hamming_distance, add_ones +from geom_helpers import poseRt, add_ones import parameters from frame import Frame from map_point import MapPoint @@ -75,6 +75,7 @@ def add_points(self, points4d, mask_pts4d, f1, f2, idx1, idx2, img1, check_paral out_mask_pts4d = np.full(points4d.shape[0], False, dtype=bool) if mask_pts4d is None: mask_pts4d = np.full(points4d.shape[0], True, dtype=bool) + for i, p in enumerate(points4d): if not mask_pts4d[i]: #print('p[%d] not good' % i) @@ -188,7 +189,7 @@ def locally_optimize(self, local_window=parameters.kLocalWindow, verbose = False #print(' points: ', sorted([p.id for p in points])) #err = optimizer_g2o.optimize(frames, points, None, False, verbose, rounds) err, ratio_bad_observations = optimizer_g2o.localOptimization(frames, points, frames_ref, False, verbose, rounds) - Printer.green('local optimization - %% bad observations: %.2f ' % (ratio_bad_observations*100) ) + Printer.green('local optimization - perc bad observations: %.2f %% ' % (ratio_bad_observations*100) ) #self.cull_map_points(points) # already performed in optimizer_g2o.localOptimization() return err diff --git a/map_point.py b/map_point.py index 2f4db62..18a83af 100644 --- a/map_point.py +++ b/map_point.py @@ -17,7 +17,7 @@ * along with PYSLAM. If not, see . """ -from geom_helpers import poseRt, hamming_distance, add_ones +from geom_helpers import poseRt, add_ones from frame import Frame import time import numpy as np @@ -42,11 +42,13 @@ def __init__(self, mapp, position, color, tid=None): def homogeneous(self): return add_ones(self.pt) - def orb(self): + # return array of corresponding descriptors + def des(self): return [f.des[idx] for f, idx in zip(self.frames, self.idxs)] - def orb_distance(self, des): - return min([hamming_distance(o, des) for o in self.orb()]) + # minimum distance between input descriptor and map point corresponding descriptors + def min_des_distance(self, descriptor): + return min([Frame.descriptor_distance(d, descriptor) for d in self.des()]) def delete(self): for f, idx in zip(self.frames, self.idxs): # f.points[idx] = this point diff --git a/parameters.py b/parameters.py index 02ccf74..2c8dea9 100644 --- a/parameters.py +++ b/parameters.py @@ -17,28 +17,54 @@ * along with PYSLAM. If not, see . 
""" -kCosMinParallax=0.99998 # 0.9998 +''' +List of shared parameters +''' -kMaxReprojectionDistance=15 # pixels -kMaxOrbDistanceMatch=50 -kMaxOrbDistanceSearchByReproj=50 -kMatchRatioTest=0.75 +# num of desired keypoints per frame kNumFeatures=2000 -kLocalWindow=10 -kEveryNumFramesLargeWindowBA=10 # num of frames between two large window BA -kLargeWindow=20 -kUseLargeWindowBA = True # True: use BA over a large window, False: do not use large window BA +# min cos angle for triangulation (min parallax angle) +kCosMinParallax=0.99998 # 0.9998 -kChi2Mono = 5.991 # chi-square 2 DOFs, used for reprojection error (Hartley Zisserman pg 119) -kInitializerDesiredMedianDepth = 20 # after initialization: the initial median depth is computed and forced to this value +# initializer +kInitializerDesiredMedianDepth = 20 # when initializing, the initial median depth is computed and forced to this value kMinTraslation=0.0001*kInitializerDesiredMedianDepth + +# tracking kUseMotionModel = True # True: use kinematic motion model for computing a first guess pose (to be optimized by current pose optimization) without fitting an essential mat # False: pose and keypoint match inliers are estimated by fitting an essential mat (5 points algorithm), WARNING: this approach comes with some limitations (please, read the comments of the method slam.estimate_pose_ess_mat()) -kUseSearchFrameByEpipolarLines=False # Match frames by using epipolar lines; the current available interframe pose estimate is used for computing the fundamental mat F -kMinDistanceFromEpipole=100 # Used with search by epipolar lines \ No newline at end of file + +# search matches by projection +kMaxReprojectionDistance=15 # [pixels] +kMatchRatioTest=0.75 +# +kMaxOrbDistanceSearchByReproj=50 # ORB +kMaxBriskDistanceSearchByReproj=50 # BRISK (needs better tuning) +kMaxAkazeDistanceSearchByReproj=80 # AKAZE (needs better tuning) +kMaxSiftDistanceSearchByReproj=100 # SIFT (needs better tuning) +kMaxSurfDistanceSearchByReproj=0.05 # SURF (needs better tuning) +kMaxDescriptorDistanceSearchByReproj=kMaxOrbDistanceSearchByReproj # main (use ORB setting by default) + + +# search matches for triangulation by using epipolar lines +kUseSearchFrameByProjection=False # Match frames by using frame map points projection and epipolar lines; here, the current available interframe pose estimate is used for computing the fundamental mat F +kMinDistanceFromEpipole=100 # [pixels] Used with search by epipolar lines +# +kMaxDescriptorDistanceSearchEpipolar=kMaxDescriptorDistanceSearchByReproj + + +# BA +kLocalWindow=10 # [# frames] +kEveryNumFramesLargeWindowBA=10 # num of frames between two large window BA +kLargeWindow=20 # [# frames] +kUseLargeWindowBA = True # True: perform BA over a large window; False: do not perform large window BA + + +kChi2Mono = 5.991 # chi-square 2 DOFs, used for reprojection error (Hartley Zisserman pg 119) + diff --git a/search_points.py b/search_points.py index 6e4fbc6..380812a 100644 --- a/search_points.py +++ b/search_points.py @@ -23,7 +23,7 @@ import cv2 from frame import Frame -from geom_helpers import skew, add_ones, hamming_distance, draw_lines, draw_points +from geom_helpers import skew, add_ones, draw_lines, draw_points from helpers import Printer, getchar import parameters from timer import Timer @@ -35,8 +35,8 @@ def search_frame_by_projection(f_ref, f_cur): found_pts_count = 0 idx_ref = [] idx_cur = [] - for i, p in enumerate(f_ref.points): - if p is None or p.is_bad is True: + for i, p in enumerate(f_ref.points): # search in 
f_ref map points without a corresponding map point + if p is None: continue # project point on f_cur proj = f_cur.project_map_point(p) @@ -45,22 +45,33 @@ def search_frame_by_projection(f_ref, f_cur): (proj[1] > 0) & (proj[1] < f_cur.H) if is_visible is False: continue - for idx_c in f_cur.kd.query_ball_point(proj, parameters.kMaxReprojectionDistance): + best_dist = math.inf + #best_dist2 = math.inf + best_k_idx = -1 + best_ref_idx = -1 + for k_idx in f_cur.kd.query_ball_point(proj, parameters.kMaxReprojectionDistance): # if no point associated - if f_cur.points[idx_c] is None: - orb_dist = p.orb_distance(f_cur.des[idx_c]) - #print('b_dist : ', orb_dist) - if orb_dist < parameters.kMaxOrbDistanceSearchByReproj: - p.add_observation(f_cur, idx_c) - found_pts_count += 1 - idx_ref.append(i) - idx_cur.append(idx_c) - break + if f_cur.points[k_idx] is None: + descriptor_dist = p.min_des_distance(f_cur.des[k_idx]) + if descriptor_dist < parameters.kMaxDescriptorDistanceSearchByReproj and descriptor_dist < best_dist: + #best_dist2 = best_dist + best_dist = descriptor_dist + best_k_idx = k_idx + best_ref_idx = i + # N.B.1: given the keypoint search area is limited, here we do not use the match distance ratio test + # N.B.2: if we extract keypoints on different octaves/levels, it is very frequent to observe overlapping keypoints (two distinct octave representives for a same interesting point) + # => do not use the match distance ration unless you want to exclude these points + if best_k_idx > -1: + #print('b_dist : ', best_dist) + p.add_observation(f_cur, best_k_idx) + found_pts_count += 1 + idx_ref.append(best_ref_idx) + idx_cur.append(best_k_idx) return idx_ref, idx_cur, found_pts_count # search by projection matches between {input map points} and {unmatched keypoints of frame f_cur} -def search_by_projection(points, f_cur): +def search_map_by_projection(points, f_cur): found_pts_count = 0 if len(points) > 0: @@ -69,7 +80,7 @@ def search_by_projection(points, f_cur): # check if points are visible visible_pts = (projs[:, 0] > 0) & (projs[:, 0] < f_cur.W) & \ - (projs[:, 1] > 0) & (projs[:, 1] < f_cur.H) + (projs[:, 1] > 0) & (projs[:, 1] < f_cur.H) for i, p in enumerate(points): if not visible_pts[i] or p.is_bad is True: @@ -78,15 +89,28 @@ def search_by_projection(points, f_cur): if f_cur in p.frames: # we already matched this map point to this frame continue - for m_idx in f_cur.kd.query_ball_point(projs[i], parameters.kMaxReprojectionDistance): + best_dist = math.inf + best_dist2 = math.inf + best_level = -1 + best_level2 = -1 + best_k_idx = -1 + for k_idx in f_cur.kd.query_ball_point(projs[i], parameters.kMaxReprojectionDistance): # if no point associated - if f_cur.points[m_idx] is None: - orb_dist = p.orb_distance(f_cur.des[m_idx]) - #print('b_dist : ', orb_dist) - if orb_dist < parameters.kMaxOrbDistanceSearchByReproj: - p.add_observation(f_cur, m_idx) - found_pts_count += 1 - break + if f_cur.points[k_idx] is None: + descriptor_dist = p.min_des_distance(f_cur.des[k_idx]) + if descriptor_dist < parameters.kMaxDescriptorDistanceSearchByReproj and descriptor_dist < best_dist: + best_dist2 = best_dist + best_level2 = best_level + best_dist = descriptor_dist + best_level = f_cur.octaves[k_idx] + best_k_idx = k_idx + if best_k_idx > -1: + # apply match distance ratio test only if the best and second are in the same scale level + if (best_level2 == best_level) and (best_dist > best_dist2 * parameters.kMatchRatioTest): + continue + #print('best des distance: ', best_dist, ", max dist: ", 
parameters.kMaxDescriptorDistanceSearchByReproj) + p.add_observation(f_cur, best_k_idx) + found_pts_count += 1 return found_pts_count @@ -102,12 +126,12 @@ def search_local_frames_by_projection(map, f_cur, local_window = parameters.kLoc points.append(p) point_id_set.add(p.id) print('searching %d map points' % len(points)) - return search_by_projection(points, f_cur) + return search_map_by_projection(points, f_cur) # search by projection matches between {all map points} and {unmatched keypoints of f_cur} -def search_map_by_projection(map, f_cur): - return search_by_projection(map.points, f_cur) +def search_all_map_by_projection(map, f_cur): + return search_map_by_projection(map.points, f_cur) # compute the fundamental mat F12 and the infinite homography H21 [Hartley Zisserman pag 339] def computeF12(f1, f2): @@ -137,7 +161,7 @@ def epiline_to_end_points(line, e, xinf, cols): delta = xinf - e length = np.linalg.norm(delta.ravel()) - delta /= length + delta /= length # normalize delta step = min(parameters.kMinDistanceFromEpipole, length) # keep a minimum distance from epipole if possible e = e + delta * step @@ -165,39 +189,62 @@ def epiline_to_end_points(line, e, xinf, cols): u1 = xmax[0] v1 = xmax[1] - # swap in order to force the search starting from the epipole + # swap in order to force the search starting from xinf if swap is True: u0, v0, u1, v1 = u1, v1, u0, v0 return [ np.array([u0,v0]), np.array([u1,v1]) ] -# find a match on f along line for the input descriptor -# e is the epipole -def find_matches_along_line(f, e, line, descriptor, radius = 5, overlap_ratio = 0.2): +# find a keypoint match on 'f' along line 'line' for the input 'descriptor' +# 'e' is the epipole +# the found keypoint match must not already have a point match +# overlap_ratio must be in [0,1] +def find_matches_along_line(f, e, line, descriptor, radius = parameters.kMaxReprojectionDistance, overlap_ratio = 0.5): #print('line: ', line[0], ", ", line[1]) step = radius*(1-overlap_ratio) delta = line[1].ravel() - line[0].ravel() length = np.linalg.norm(delta) - n = int(math.ceil(length/step)) + n = max( int(math.ceil(length/step)), 1) delta = delta/n #print('delta: ', delta) + + best_dist = math.inf + best_dist2 = math.inf + best_k_idx = -1 + for i in range(n+1): x = line[0].ravel() + i*delta for k_idx in f.kd.query_ball_point(x, radius): # if no point associated if f.points[k_idx] is None: - orb_dist = hamming_distance(f.des[k_idx], descriptor) - if orb_dist < parameters.kMaxOrbDistanceSearchByReproj: # TODO: refine! 
here, we just take the first match - return k_idx - return -1 + descriptor_dist = Frame.descriptor_distance(f.des[k_idx], descriptor) + if descriptor_dist < parameters.kMaxDescriptorDistanceSearchEpipolar: + #return k_idx # stop at the first match + if descriptor_dist < best_dist: + best_dist2 = best_dist + best_dist = descriptor_dist + best_k_idx = k_idx + # N.B.: the search "segment line" can have a large width => it is better to use the match distance ratio test + if best_dist < best_dist2 * parameters.kMatchRatioTest: + return best_k_idx, best_dist + else: + return -1, 0 + #return best_k_idx + -# search for triangulations between f1 and f2 -# find matches between unmatched keypoints -# we have pose estimates for both frames +# search keypoint matches (for triangulations) between f1 and f2 +# search for matches between unmatched keypoints (without a corresponding map point) +# in inpput we have already some pose estimates for f1 and f2 def search_frame_for_triangulation(f1, f2, img2, img1 = None): + idxs2_out = [] + idxs1_out = [] + lines_out = [] + num_found_matches = 0 + img2_epi = None + if __debug__: timer = Timer() timer.start() @@ -217,7 +264,7 @@ def search_frame_for_triangulation(f1, f2, img2, img1 = None): # if the translation is too small we cannot triangulate if np.linalg.norm(O1w-O2w) < parameters.kMinTraslation: # we assume the Inializer has been used for building the first map Printer.red("search for triangulation: impossible with zero translation!") - return # EXIT + return idxs1_out, idxs2_out, num_found_matches, img2_epi # EXIT # compute the fundamental matrix between the two frames F12, H21 = computeF12(f1, f2) @@ -227,6 +274,7 @@ def search_frame_for_triangulation(f1, f2, img2, img1 = None): if p is None: # we consider just unmatched keypoints kp = f1.kpsu[i] scale_factor = Frame.detector.scale_factors[f1.octaves[i]] + # discard points which are too close to the epipole if np.linalg.norm(kp-e1) < parameters.kMinDistanceFromEpipole * scale_factor: continue idxs1.append(i) @@ -246,23 +294,34 @@ def search_frame_for_triangulation(f1, f2, img2, img1 = None): assert(len(line_edges) == len(idxs1)) timer.start() - idxs2_out = [] - idxs1_out = [] - lines_out = [] - for i, idx in enumerate(idxs1): - f2_idx = find_matches_along_line(f2, e2, line_edges[i], f1.des[idx]) - if f2_idx>-1: - idxs2_out.append(f2_idx) - idxs1_out.append(idxs1[i]) - assert(f2.points[f2_idx] is None) - assert(f1.points[idxs1[i]] is None) - if __debug__: - lines_out.append(line_edges[i]) + len_des2 = len(f2.des) + flag_match = np.full(len_des2, False, dtype=bool) + dist_match = np.zeros(len_des2) + index_match = np.full(len_des2, 0, dtype=int) + for i, idx in enumerate(idxs1): # N.B.: a point in f1 can be matched to more than one point in f2, we avoid this by caching matches with f2 points + f2_idx, dist = find_matches_along_line(f2, e2, line_edges[i], f1.des[idx]) + if f2_idx > -1: + if not flag_match[f2_idx]: # new match + flag_match[f2_idx] = True + dist_match[f2_idx] = dist + idxs2_out.append(f2_idx) + idxs1_out.append(idx) + index_match[f2_idx] = len(idxs2_out)-1 + assert(f2.points[f2_idx] is None) + assert(f1.points[idx] is None) + if __debug__: + lines_out.append(line_edges[i]) + else: # already matched + if dist < dist_match[f2_idx]: # update in case of a smaller distance + dist_match[f2_idx] = dist + index = index_match[f2_idx] + idxs1_out[index] = idx + if __debug__: + lines_out[index] = line_edges[i] num_found_matches = len(idxs1_out) assert(len(idxs1_out) == len(idxs2_out)) - img2_epi = None 
if __debug__: #print("num found matches: ", num_found_matches) if True: diff --git a/slam.py b/slam.py index 1788d9b..9452bb5 100755 --- a/slam.py +++ b/slam.py @@ -26,7 +26,7 @@ from frame import Frame, match_frames from geom_helpers import triangulate_points, add_ones, poseRt, skew -from search_points import search_by_projection, search_frame_by_projection, search_local_frames_by_projection, search_frame_for_triangulation +from search_points import search_map_by_projection, search_frame_by_projection, search_local_frames_by_projection, search_frame_for_triangulation from map_point import MapPoint from map import Map from pinhole_camera import Camera, PinholeCamera @@ -46,7 +46,7 @@ kLocalWindow = parameters.kLocalWindow kUseMotionModel = parameters.kUseMotionModel kUseLargeWindowBA = parameters.kUseLargeWindowBA -kUseSearchFrameByEpipolarLines = parameters.kUseSearchFrameByEpipolarLines +kUseSearchFrameByProjection = parameters.kUseSearchFrameByProjection class SLAMStage(Enum): @@ -122,6 +122,8 @@ def estimate_pose_ess_mat(self, kpn_ref, kpn_cur): def track(self, img, frame_id, pose=None, verts=None): # check image size is coherent with camera params + print("img.shape ", img.shape) + print("cam ", self.H," x ", self.W) assert img.shape[0:2] == (self.H, self.W) self.timer_main_track.start() @@ -164,7 +166,7 @@ def track(self, img, frame_id, pose=None, verts=None): f_cur.pose = predicted_pose.copy() #f_cur.pose = f_ref.pose.copy() # get the last pose as an initial guess for optimization - if kUseSearchFrameByEpipolarLines: + if kUseSearchFrameByProjection: # search frame by projection: match map points observed in f_ref with keypoints of f_cur print('search frame by projection...') idx_ref, idx_cur, num_found_map_pts = search_frame_by_projection(f_ref, f_cur) @@ -215,7 +217,7 @@ def track(self, img, frame_id, pose=None, verts=None): # populate f_cur with map points by propagating map point matches of f_ref: # we use map points observed in f_ref and keypoint matches between f_ref and f_cur num_found_map_pts_inter_frame = 0 - if not kUseSearchFrameByEpipolarLines: + if not kUseSearchFrameByProjection: for i, idx in enumerate(idx_ref): # iterate over keypoint matches if f_ref.points[idx] is not None: # if we have a map point P for i-th matched keypoint in f_ref f_ref.points[idx].add_observation(f_cur, idx_cur[i]) # then P is automatically matched to the i-th matched keypoint in f_cur @@ -223,7 +225,7 @@ def track(self, img, frame_id, pose=None, verts=None): print("# matched map points in prev frame: %d " % num_found_map_pts_inter_frame) - # f_cur pose optimization 1 (here we use first available information coming from first guess of f_cur pose and map points matched with f_cur keypoints ) + # f_cur pose optimization 1 (here we use first available information coming from first guess of f_cur pose and map points of f_ref matched keypoints ) self.timer_pose_opt.start() pose_opt_error, pose_is_ok, self.num_vo_map_points = optimizer_g2o.poseOptimization(f_cur, verbose=False) print("pose opt err1: %f, ok: %d" % (pose_opt_error, int(pose_is_ok)) ) @@ -248,8 +250,8 @@ def track(self, img, frame_id, pose=None, verts=None): if pose_is_ok is True and not self.map.local_map.is_empty(): self.timer_seach_map.start() #num_found_map_pts = search_local_frames_by_projection(self.map, f_cur) - num_found_map_pts = search_by_projection(self.map.local_map.points, f_cur) # use the built local map - print("# new matched map points: %d " % num_found_map_pts) + num_found_map_pts = 
search_map_by_projection(self.map.local_map.points, f_cur) # use the built local map + print("# matched map points in local map: %d " % num_found_map_pts) self.timer_seach_map.refresh() # f_cur pose optimization 2 with all the matched map points @@ -263,14 +265,14 @@ def track(self, img, frame_id, pose=None, verts=None): self.timer_pose_opt.refresh() - if kUseSearchFrameByEpipolarLines: - print("search for triangulation...") + if kUseSearchFrameByProjection: + print("search for triangulation with epipolar lines...") idx_ref, idx_cur, self.num_matched_kps, _ = search_frame_for_triangulation(f_ref, f_cur, img) print("# matched keypoints in prev frame: %d " % self.num_matched_kps) # if pose is ok, then we try to triangulate the matched keypoints (between f_cur and f_ref) that do not have a corresponding map point - if pose_is_ok is True: + if pose_is_ok is True and len(idx_ref)>0: self.timer_triangulation.start() # TODO: this triangulation should be done from keyframes! diff --git a/test/cv/test_feature_detector.py b/test/cv/test_feature_detector.py index a21794f..0508f86 100644 --- a/test/cv/test_feature_detector.py +++ b/test/cv/test_feature_detector.py @@ -13,9 +13,10 @@ num_features=2000 # N.B.: here you can use just ORB descriptors! #feature_detector = feature_detector_factory(min_num_features=num_features, num_levels=4, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB) -feature_detector = feature_detector_factory(min_num_features=num_features, num_levels=2, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB) +#feature_detector = feature_detector_factory(min_num_features=num_features, num_levels=2, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB) #feature_detector = feature_detector_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.ORB) -#feature_detector = feature_detector_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB) +feature_detector = feature_detector_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB) +#feature_detector = feature_detector_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.FREAK, descriptor_type = FeatureDescriptorTypes.FREAK) kps, des = feature_detector.detectAndCompute(img) diff --git a/test/cv/test_search_for_triangulation.py b/test/cv/test_search_for_triangulation.py index 46b8227..ce67c39 100644 --- a/test/cv/test_search_for_triangulation.py +++ b/test/cv/test_search_for_triangulation.py @@ -1,4 +1,5 @@ import sys +import os import numpy as np import cv2 from matplotlib import pyplot as plt @@ -11,7 +12,7 @@ from frame import Frame, match_frames from geom_helpers import triangulate_points, add_ones, poseRt, skew, draw_points2 -from search_points import search_by_projection, search_frame_by_projection, search_local_frames_by_projection, search_frame_for_triangulation +from search_points import search_map_by_projection, search_frame_by_projection, search_local_frames_by_projection, search_frame_for_triangulation from map_point import MapPoint from slam import SLAM from pinhole_camera import Camera, PinholeCamera @@ -34,6 +35,9 @@ #------------- config = Config() +# forced camera settings to be kept coherent with the input file below +config.config_parser[config.dataset_type]['cam_settings'] = 
'settings/KITTI04-12.yaml' +config.get_cam_settings() dataset = dataset_factory(config.dataset_settings) @@ -54,8 +58,8 @@ """ tracker_type = TrackerTypes.DES_BF #tracker_type = TrackerTypes.DES_FLANN -#feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) -feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 1, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) +feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.SHI_TOMASI, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) +#feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 1, detector_type = FeatureDetectorTypes.FAST, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) #feature_tracker = feature_tracker_factory(min_num_features=num_features, num_levels = 3, detector_type = FeatureDetectorTypes.BRISK, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) #feature_tracker = feature_tracker_factory(min_num_features=num_features, detector_type = FeatureDetectorTypes.ORB, descriptor_type = FeatureDescriptorTypes.ORB, tracker_type = tracker_type) @@ -66,6 +70,7 @@ #------------- +# N.B.: keep this coherent with the above forced camera settings img_ref = cv2.imread('kitti06-12.png',cv2.IMREAD_COLOR) #img_cur = cv2.imread('kitti06-12-01.png',cv2.IMREAD_COLOR) img_cur = cv2.imread('kitti06-13.png',cv2.IMREAD_COLOR) diff --git a/visual_odometry.py b/visual_odometry.py index 16589a5..b9eac89 100644 --- a/visual_odometry.py +++ b/visual_odometry.py @@ -39,6 +39,12 @@ class VoStage(Enum): kUseGroundTruthScale = True +# This class is a first start to understand the basics of inter frame feature tracking and camera pose estimation. +# It combines the simplest VO ingredients without performing any image point triangulation or +# windowed bundle adjustment. At each step $k$, it estimates the current camera pose $C_k$ with respect to the previous one $C_{k-1}$. +# The inter frame pose estimation returns $[R_{k-1,k},t_{k-1,k}]$ with $||t_{k-1,k}||=1$. +# With this very basic approach, you need to use a ground truth in order to recover a correct inter-frame scale $s$ and estimate a +# valid trajectory by composing $C_k = C_{k-1} * [R_{k-1,k}, s t_{k-1,k}]$. class VisualOdometry(object): def __init__(self, cam, grountruth, feature_tracker): self.stage = VoStage.NO_IMAGES_YET @@ -99,7 +105,7 @@ def computeFundamentalMatrix(self, kp_ref, kp_cur): F = F[0:3, 0:3] return np.matrix(F), mask - def removeOutliersFromMask(self, mask): + def removeOutliersByMask(self, mask): if mask is not None: n = self.kpn_cur.shape[0] mask_index = [ i for i,v in enumerate(mask) if v > 0]
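As a usage sketch for the detector/descriptor selection described in the README changes above, the snippet below combines the `feature_tracker_factory()` call shown in `main_slam.py` with the `detect()` call used in `frame.py`; the chosen parameters and the image path are illustrative only.

```
import cv2
from feature_tracker import feature_tracker_factory, TrackerTypes
from feature_detector import FeatureDetectorTypes, FeatureDescriptorTypes

# build a descriptor-based tracker (ORB detector + ORB descriptor, brute-force matching),
# mirroring one of the configurations listed in main_slam.py
feature_tracker = feature_tracker_factory(min_num_features=2000, num_levels=4,
                                          detector_type=FeatureDetectorTypes.ORB,
                                          descriptor_type=FeatureDescriptorTypes.ORB,
                                          tracker_type=TrackerTypes.DES_BF)

img = cv2.imread('kitti06-12.png', cv2.IMREAD_GRAYSCALE)  # example image, as used by the test scripts
kps, des = feature_tracker.detect(img)  # keypoints and descriptors
print('detected %d keypoints' % len(kps))
```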