OUTCOME SPOTLIGHT

Depth Completion Based on Stereo Disparity: A Comparison of Traditional Methods and Deep Learning

A comparative stereo vision pipeline showing how geometric methods and deep learning complement each other in robotic depth perception.

Attended PBL

Amazon Robotics Project

Computer Vision and Robotics

Build intelligent vision systems for robotics by exploring stereo vision, human motion synthesis, posture detection, and scene understanding.

DEMONSTRATED CAPABILITY

Perception System Design

System-Level Insight

Designed and evaluated a perception system by balancing algorithmic accuracy, interpretability, and operational constraints to support reliable decision-making in complex, real-world environments.

Compared alternative technical approaches by analyzing trade-offs in accuracy, robustness, and resource constraints to inform system-level design decisions.

Evaluated system performance under varying data quality and environmental conditions, identifying failure modes and limitations through structured experimentation.

Integrated geometric reasoning and data-driven methods to validate assumptions and articulate evidence-based improvements applicable across domains.

What This Project Achieves

This project investigates how robots perceive depth by comparing traditional stereo vision algorithms with modern deep learning approaches. The team implemented a full stereo vision pipeline—from camera calibration to 3D point cloud reconstruction—and evaluated StereoSGBM against deep learning–based disparity estimation. The results reveal a key trade-off: geometric methods offer interpretable and accurate measurements, while deep learning provides denser depth in low-texture regions. Together, the project highlights why hybrid perception systems are essential for real-world robotic applications.

How This Was Built — Key Highlights

This project built an end-to-end stereo vision and depth completion pipeline, combining classical computer vision techniques with deep learning models to evaluate their strengths and limitations in robotic perception.

  • Captured stereo images using an iPhone 15 in RAW format to avoid digital post-processing artifacts.

  • Performed camera calibration with a rigid checkerboard target, using images taken from varied angles and distances.

  • Implemented OpenCV’s StereoSGBM algorithm, tuning the blockSize, P1, and P2 parameters to balance fine detail against disparity stability.

  • Integrated a deep learning–based monocular depth model (MiDaS, built on a Vision Transformer) to produce dense depth estimates in low-texture regions where stereo matching fails.

  • Converted disparity maps into depth maps and reconstructed 3D point clouds using geometric refinement techniques.

Challenges

Developing reliable stereo depth estimation exposed several technical and experimental challenges.

  • StereoSGBM performance was highly sensitive to parameter choices, especially blockSize and smoothness penalties.

  • Low-texture surfaces caused missing or noisy disparity in traditional geometric methods.

  • Deep learning approaches demanded more computational resources and offered less interpretability than the geometric baseline.

  • Ensuring accurate calibration demanded high-quality input data and rigid calibration targets.

Insights

The experiments revealed important lessons about combining geometry and learning for robotic perception.

  • High-quality image acquisition is a prerequisite for accurate stereo vision.

  • Geometric methods remain preferable when interpretability and computational efficiency are critical.

  • Deep learning excels at filling missing depth in low-texture regions but benefits from geometric constraints.

  • Hybrid pipelines offer a balanced solution between accuracy, density, and real-world deployability.
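One way to realise the hybrid idea above is to align the monocular prediction to the sparse geometric disparity with a least-squares scale and shift, then use it only where stereo matching failed. This is a sketch under stated assumptions: the "monocular" map here is a synthetic affine transform of the ground truth rather than an actual MiDaS output, and a single global scale/shift is assumed to hold across the whole image.

```python
import numpy as np

def fuse_disparity(stereo_disp, mono_inv_depth):
    """Fill holes in a sparse stereo disparity map with a monocular prediction.

    Monocular models such as MiDaS predict inverse depth only up to an
    unknown affine transform, so a scale s and shift t are fitted by least
    squares on pixels where the geometric disparity is valid (> 0).
    """
    valid = stereo_disp > 0
    A = np.stack([mono_inv_depth[valid], np.ones(valid.sum())], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, stereo_disp[valid], rcond=None)
    fused = stereo_disp.copy()
    fused[~valid] = s * mono_inv_depth[~valid] + t   # geometry wins where valid
    return fused, s, t

# Synthetic check: the monocular map is an affine transform of ground truth
rng = np.random.default_rng(0)
gt = rng.uniform(5.0, 60.0, (120, 160))
mono = 0.5 * gt + 2.0            # unknown scale/shift, as with MiDaS output
stereo = gt.copy()
stereo[40:80, 50:110] = 0.0      # simulated low-texture hole
fused, s, t = fuse_disparity(stereo, mono)
```

Keeping the geometric disparity wherever it is valid preserves metric accuracy and interpretability, while the aligned monocular prediction supplies density in the hole, which is exactly the trade-off the insights above describe.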

Project Gallery

Academic Team Feedback

Feedback from the Project Lead—a Principal Computer Vision Engineer with 15 years of industry experience in robotics and machine vision—highlighted the strong technical depth and practical relevance of this project. Drawing on her background applying computer vision across consumer robotics, healthcare, factory automation, and edge-based embedded systems, she noted that the team demonstrated a clear understanding of stereo vision fundamentals and real-world constraints such as calibration sensitivity, lighting variation, and computational efficiency. Luís was recognized as a key driver of the project, providing technical leadership and shaping a well-structured comparison between geometric methods and deep learning–based depth estimation. The Academic Coordinator additionally emphasized his collaborative leadership, clarity of communication, and ability to translate complex theory into a robust, visually interpretable perception pipeline that elevated the overall quality of the team’s outcome.

Project Contributor(s)

Luís Felipe Rodrigues Dutra

University of Campinas (UNICAMP), Brazil

Project Reflection

This project allowed me to apply computer vision and robotics theory to a real-world problem, bridging geometric algorithms with modern deep learning approaches. By comparing their strengths and limitations, I gained a deeper understanding of how robots perceive depth and why hybrid solutions are key to building robust autonomous systems.
