Real-time Material Recognition System

Computer Vision Course Work with Professor Kenji Shimada
Collaboration with Shiyu Chen, Carla Flores Trávez, Quincy Wang
Completed Dec 2025

The Challenge: Speed vs. Accuracy in Industrial Automation

Conventional AI models often become the bottleneck in high-speed industrial environments such as recycling plants, where conveyor belts run at 2 m/s. The result is a forced compromise between processing speed and classification precision.

The Technical Solution: A Hybrid Vision Pipeline

We engineered a hybrid vision pipeline that optimizes computational efficiency by strategically integrating Classical Computer Vision with Deep Learning (YOLOv8-seg).

  • The Classical Branch: Uses fast deterministic methods (thresholding, morphology, and contour analysis) to isolate objects, and uses optical flow to detect belt slip.

  • The Deep Learning Branch: Runs YOLOv8-seg inference only on the Regions of Interest (ROIs) identified by the classical branch, rather than scanning the entire frame.
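The ROI gating described above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the project's code: a pure-Python flood fill stands in for OpenCV-style thresholding and contour analysis, and the names find_rois and roi_fraction, the threshold of 128, the minimum blob area, and the toy 10x10 frame are all assumptions made for the example.

```python
from collections import deque

def find_rois(gray, thresh=128, min_area=2):
    """Return inclusive bounding boxes (x0, y0, x1, y1) of bright blobs."""
    h, w = len(gray), len(gray[0])
    # Thresholding: keep pixels brighter than the (assumed dark) belt.
    mask = [[gray[y][x] > thresh for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    rois = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Flood-fill one 4-connected component (stand-in for contour analysis).
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            area = 0
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if area >= min_area:  # drop single-pixel speckle noise
                rois.append((x0, y0, x1, y1))
    return rois

def roi_fraction(rois, width, height):
    """Fraction of the frame covered by ROI boxes (assumes boxes do not overlap)."""
    covered = sum((x1 - x0 + 1) * (y1 - y0 + 1) for x0, y0, x1, y1 in rois)
    return covered / (width * height)

# Toy 10x10 frame: dark belt, two bright objects, one noise pixel.
frame = [[0] * 10 for _ in range(10)]
for y in range(1, 3):
    for x in range(1, 4):
        frame[y][x] = 200
for y in range(6, 9):
    for x in range(6, 8):
        frame[y][x] = 220
frame[5][0] = 255  # isolated speckle, filtered out by min_area

rois = find_rois(frame)
fraction = roi_fraction(rois, 10, 10)
```

In this toy frame only the two boxes would be handed to the segmentation model, so the pixels sent to inference are a small fraction of the full frame. The actual savings in a real pipeline depend on how densely the belt is loaded.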

You don’t need deep learning everywhere —
you need it where it matters.

Technical Framework: A Dual-Branch Vision Strategy

The system architecture is a hybrid vision pipeline designed to overcome the speed-accuracy trade-off inherent in fast-paced industrial recycling environments. After initial correction, calibration, and preprocessing, the data stream splits into two specialized processing modules. Branch 1 applies Classical CV techniques (thresholding, morphology, and contour analysis) for rapid foreground segmentation and belt-health diagnostics, including optical-flow-based slip detection. Branch 2 uses YOLOv8-seg for semantic understanding and instance masking. Because the Classical branch identifies the Regions of Interest (ROIs), deep learning inference is restricted to the relevant areas, which reduces the inference workload by more than 80%. The outputs of both branches are fused in the Data Analysis & Visualization phase to generate real-time KPIs, and the system sustains 60+ FPS with over 95% classification accuracy under varying motion conditions.
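One way the optical-flow-based slip check could work is to compare the belt motion observed in the image against the commanded belt speed. The sketch below assumes the per-pixel flow magnitudes have already been computed (for example with a dense optical flow routine) and are given in pixels per frame. The function names, the 600 px/m calibration, the 5% tolerance, and the sample values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import median

def slip_ratio(flow_px_per_frame, fps, px_per_meter, belt_speed_mps):
    """Relative shortfall of observed belt speed against the commanded speed.

    0.0 means the belt moves as commanded; 0.1 means it is 10% slower.
    """
    # The median is robust to outliers such as objects rolling on the belt.
    observed_mps = median(flow_px_per_frame) * fps / px_per_meter
    return 1.0 - observed_mps / belt_speed_mps

def is_slipping(ratio, tolerance=0.05):
    """Flag slip when observed speed deviates by more than the tolerance."""
    return abs(ratio) > tolerance

# Assumed setup: 60 FPS camera, 600 px per meter, belt commanded at 2 m/s,
# so a healthy belt should show about 20 px of motion per frame.
flows = [18.0, 18.2, 17.9, 18.1, 30.0]  # last value is an outlier
ratio = slip_ratio(flows, fps=60, px_per_meter=600, belt_speed_mps=2.0)
slipping = is_slipping(ratio)
```

Here the median flow of 18.1 px/frame corresponds to about 1.81 m/s, roughly 9.5% below the commanded speed, so the check would flag slip under the assumed 5% tolerance.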

The Impact and Results

This fusion reduced the total inference workload by more than 80%. The resulting prototype sustains an industrial-grade 60+ FPS while maintaining a classification accuracy above 95%. It generates real-time KPIs, such as throughput and per-class ratios, demonstrating that combining deterministic CV with deep learning can reach performance levels that neither method achieves alone.
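The two KPIs named above, throughput and per-class ratios, reduce to simple counting over a time window. The following is a minimal sketch under assumptions: compute_kpis is a hypothetical helper, the class labels are placeholders, and the input is taken to be one label per classified object seen in the window.

```python
from collections import Counter

def compute_kpis(labels, window_seconds):
    """Return (objects per second, {class: share of objects}) for one window."""
    counts = Counter(labels)
    total = sum(counts.values())
    throughput = total / window_seconds
    # Avoid division by zero when no objects were seen in the window.
    ratios = {cls: n / total for cls, n in counts.items()} if total else {}
    return throughput, ratios

# Placeholder labels for four objects classified during a 2-second window.
labels = ["PET", "PET", "HDPE", "aluminum"]
throughput, ratios = compute_kpis(labels, window_seconds=2.0)
```

With these inputs the throughput is 2 objects per second, and PET accounts for half of the classified objects.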
