Using Deep learning Technique for Stereo vision and 3D reconstruction







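1. Directory layout

  • comparisons: comparison experiments
  • doc: documentation and references
  • evision_model: the model proposed in this work (new version)
  • evision_net: the model proposed in this work (old version)
  • utils: utility scripts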

2. Environment

  • Windows 10 or Ubuntu 18.04
  • Python 3.6 or 3.7, Anaconda3
  • CUDA 10.2
  • TensorFlow 1.14.0 (with cuDNN 7.6.5, for DFV)
  • PyTorch 1.5.0


3.1. depth_from_video_in_the_wild

  • Unsupervised learning of depth, ego-motion, object motion, and camera intrinsics
  • Paper, project page

3.2. SfMLearner

3.3. struct2depth

  • Unsupervised learning of scene depth and robot ego-motion
  • Paper, project page

4. References

[1]. Pyramid Stereo Matching Network. PSMNet.
[2]. TILDE: A Temporally Invariant Learned DEtector. TILDE.
[3]. Deep Ordinal Regression Network for Monocular Depth Estimation.
[4]. Occlusion-Aware Unsupervised Learning of Monocular Depth, Optical Flow and Camera Pose with Geometric Constraints. Future Internet 10.10 (2018): 92.
[5]. Liu, Qiang, et al. Using Unsupervised Deep Learning Technique for Monocular Visual Odometry.
[6]. DeepCalib: A Deep Learning Approach for Automatic Intrinsic Calibration of Wide Field-of-View Cameras. [Keywords: camera calibration, deep learning]
[7]. Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras.
[8]. A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation.

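5. Other links

[1]. Middlebury dataset.
[2]. KITTI dataset.
[3]. VIsion-SceneFlowDatasets dataset.
[4]. PSMNet walkthrough.
[5]. 3D reconstruction dataset from the Institute of Automation, Chinese Academy of Sciences.
[6]. SfMLearner (Depth and Ego-Motion) walkthrough.
[7]. OpenMVS.
[8]. OpenMVG.
[9]. CVonline, a collection of image datasets.
[10]. VisualData dataset search.
[11]. 360D-zenodo Dataset.
[12]. RGB-D Panorama Dataset.
[13]. Deep Depth Completion of a Single RGB-D Image walkthrough.
[14]. Unsupervised Learning of Depth and Ego-Motion walkthrough.
[15]. Visual Odometry, Part II: Matching, Robustness, Optimization, and Applications.
[16]. How to obtain high-quality 3D models from photos.
[17]. tqdm.postfix.
[18]. KITTI_odometry_evaluation_tool.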



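  • The seq 09 and seq 10 columns are ego-motion metrics (smaller is better).
  • The remaining columns are monocular depth metrics (for Abs Rel, Sq Rel, rms, and log_rms, smaller is better; for A1, A2, and A3, larger is better).
  • All results come from experiments that use only the KITTI dataset.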
| ATE in seq 09 | ATE in seq 10 | Abs Rel | Sq Rel | rms | log_rms | A1 | A2 | A3 | Notes |
|---|---|---|---|---|---|---|---|---|---|
| 0.0160 ± 0.0090 | 0.0130 ± 0.0090 | 0.183 | 1.595 | 6.700 | 0.270 | 0.734 | 0.902 | 0.959 | SfMLearner GitHub (note 1) |
| 0.0210 ± 0.0170 | 0.0200 ± 0.0150 | 0.208 | 1.768 | 6.856 | 0.283 | 0.678 | 0.885 | 0.957 | SfMLearner paper (note 2) |
| 0.0179 ± 0.0110 | 0.0141 ± 0.0115 | 0.181 | 1.341 | 6.236 | 0.262 | 0.733 | 0.901 | 0.964 | SfMLearner third-party GitHub (note 3) |
| 0.0107 ± 0.0062 | 0.0096 ± 0.0072 | 0.2260 | 2.310 | 6.827 | 0.301 | 0.677 | 0.878 | 0.947 | Ours, SfMLearner-Pytorch (note 4) |
| 0.0312 ± 0.0217 | 0.0237 ± 0.0208 | 0.2330 | 2.4643 | 6.830 | 0.314 | 0.6704 | 0.869 | 0.940 | intri_pred (note 5) |
| - | - | 0.1417 | 1.1385 | 5.5205 | 0.2186 | 0.8203 | 0.9415 | 0.9762 | struct2depth baseline (note 6) |
| 0.0110 ± 0.0060 | 0.0110 ± 0.0100 | 0.1087 | 0.8250 | 4.7503 | 0.1866 | 0.8738 | 0.9577 | 0.9825 | struct2depth M+R (note 7) |
| 0.0090 ± 0.0150 | 0.0080 ± 0.0110 | 0.129 | 0.982 | 5.23 | 0.213 | 0.840 | 0.945 | 0.976 | DFV, given intrinsics (note 8) |
| 0.0120 ± 0.0160 | 0.0100 ± 0.0100 | 0.128 | 0.959 | 5.23 | 0.212 | 0.845 | 0.947 | 0.976 | DFV, learned intrinsics (note 9) |


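  1. Best result given in the README of the GitHub repository accompanying the SfMLearner paper (reference [5]). The author lists the changes as: added data augmentation, removed BN, some fine-tuning, training on KITTI only, and no explainability regularization. Some of these numbers are slightly better than those in the paper.
  2. Best KITTI result reported in the SfMLearner paper (reference [5]).
  3. Best result given on the SfmLearner-Pytorch GitHub page. It differs from the original author's version in two ways: the smooth loss is applied to depth instead of disparity, and the loss is divided by 2.3 instead of 2.
  4. Our SfMLearner-Pytorch run, with -b 4 -m 0.6 -s 0.1 --epoch-size 3000 --sequence-length 3.
  5. No intrinsics are provided; a simple intrinsics-prediction scheme is used instead, with -b 4 -m 0.6 -s 0.1 --epoch-size 3000 --sequence-length 3.
  6. From Table 1 in the struct2depth paper.
  7. From Table 1 and Table 3 in the struct2depth paper.
  8. From Table 1 and Table 6 in the Depth from Videos in the Wild paper.
  9. From Table 1 and Table 6 in the Depth from Videos in the Wild paper.
  10. Besides training datasets such as KITTI, struct2depth and Depth from Videos in the Wild also use an object detection model to generate an "object mask", which bounds the region in which the motion mask is generated.
  11. struct2depth provides pretrained models that can be used for testing; the model download links for Depth from Videos in the Wild have all been removed.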
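
The depth columns in the results table (Abs Rel, Sq Rel, rms, log_rms, A1, A2, A3) are not defined explicitly in this README. The sketch below assumes the standard monocular-depth definitions commonly used on KITTI; the function name `depth_metrics` and its arguments are illustrative and are not taken from this repository. It uses only the standard library, whereas a real evaluation script would more likely vectorize the same formulas with NumPy or PyTorch and apply masking and scale alignment first.

```python
import math


def depth_metrics(gt, pred, threshold=1.25):
    """Return the seven depth metrics for paired ground-truth/predicted depths.

    gt, pred: equal-length sequences of strictly positive depths, already
    restricted to valid pixels (masking and scale alignment are left to the caller).
    """
    if len(gt) != len(pred) or len(gt) == 0:
        raise ValueError("gt and pred must be non-empty and of equal length")
    n = len(gt)
    pairs = list(zip(gt, pred))

    # Error metrics: smaller is better.
    abs_rel = sum(abs(g - p) / g for g, p in pairs) / n
    sq_rel = sum((g - p) ** 2 / g for g, p in pairs) / n
    rms = math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n)
    log_rms = math.sqrt(
        sum((math.log(g) - math.log(p)) ** 2 for g, p in pairs) / n
    )

    # Accuracy metrics: fraction of pixels whose ratio max(g/p, p/g) is
    # below threshold**k for k = 1, 2, 3; larger is better.
    ratios = [max(g / p, p / g) for g, p in pairs]
    a1, a2, a3 = (sum(r < threshold ** k for r in ratios) / n for k in (1, 2, 3))

    return {
        "abs_rel": abs_rel,
        "sq_rel": sq_rel,
        "rms": rms,
        "log_rms": log_rms,
        "a1": a1,
        "a2": a2,
        "a3": a3,
    }
```

For example, `depth_metrics([2.0, 4.0], [1.0, 4.0])` gives an Abs Rel of 0.25, because the first pixel has a relative error of 0.5 and the second has none.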


  1. Depth metrics:

  2. Ego-motion metrics:
    ATE (Absolute Trajectory Error) is reported as its mean and standard deviation over the test set; RE is the rotation error. ATE is computed along with RE. The RE between R1 and R2 is defined as the angle of R1*R2^-1 when converted to axis/angle form, which corresponds to RE = arccos((trace(R1 @ R2^-1) - 1) / 2). While ATE is often said to be enough for trajectory estimation, RE seems important here because sequences are only seq_length frames long.
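
The rotation error formula above can be sketched directly. This is a minimal standard-library illustration, not the evaluation code used in this repository; `rotation_error` and `rot_z` are hypothetical names, and matrices are plain nested lists. It relies on the fact that the inverse of a rotation matrix is its transpose, so the trace of R1 @ R2^-1 reduces to an element-wise product sum.

```python
import math


def rotation_error(R1, R2):
    """Angle in radians of R1 @ R2^-1 for two 3x3 rotation matrices."""
    # For rotation matrices R2^-1 == R2^T, so
    # trace(R1 @ R2^T) = sum over i, k of R1[i][k] * R2[i][k].
    trace = sum(R1[i][k] * R2[i][k] for i in range(3) for k in range(3))
    cos_angle = (trace - 1.0) / 2.0
    # Clamp to [-1, 1] so floating-point noise cannot push acos out of range.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.acos(cos_angle)


def rot_z(theta):
    """Rotation by theta radians about the z axis (helper for the example)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
```

Two rotations about the same axis differ by the difference of their angles, so `rotation_error(rot_z(0.3), rot_z(0.1))` is approximately 0.2 radians.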


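  1. On Windows, Anaconda needs these four directories set as environment variables (PATH entries): Anaconda3, Anaconda3/Library/bin, Anaconda3/Scripts, and Anaconda3/condabin.
  2. DFV describes a "Randomized Layer Normalization". It is hard to reproduce the behavior described in the paper in PyTorch; I wrote an approximate version in evision_model/_Deprecated.py. In fact, if this method really is as effective as the paper describes, the crux must lie somewhere else.
  3. evision_model/_PlayGround.py is used to test functions during development. No other file depends on its code, so it can be modified or even deleted freely.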
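
As a convenience for the Windows Anaconda note in item 1, the following sketch reports which of the four directories are missing from a PATH string. The function name `missing_anaconda_dirs` and the example install root are assumptions for illustration only; the check uses `ntpath` so that Windows path rules (backslashes, case-insensitivity) apply regardless of the platform the script runs on.

```python
import ntpath

# Sub-directories, relative to the Anaconda3 root, listed in item 1.
# The empty string stands for the root directory itself.
REQUIRED_SUBDIRS = ("", "Library/bin", "Scripts", "condabin")


def missing_anaconda_dirs(anaconda_root, path_value, sep=";"):
    """Return the required Anaconda directories that are absent from path_value."""

    def norm(p):
        # Normalise slashes, trailing separators, and case for comparison.
        return ntpath.normcase(ntpath.normpath(p.strip()))

    on_path = {norm(p) for p in path_value.split(sep) if p.strip()}
    required = [
        ntpath.normpath(ntpath.join(anaconda_root, sub)) for sub in REQUIRED_SUBDIRS
    ]
    return [d for d in required if norm(d) not in on_path]
```

On a Windows machine it could be called as `missing_anaconda_dirs(root, os.environ["PATH"])` after `import os`, where `root` is wherever Anaconda3 was installed; an empty list means all four entries are present.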


