Prerequisite:
This project builds upon the 2D perception stack developed in:
ROS2 Autonomous Perception Stack — 2D Perception
Extend the perception stack beyond monocular vision by integrating LiDAR data and basic sensor fusion techniques.
- Add LiDAR sensor to CARLA vehicle (completed)
- Publish LiDAR point clouds through ROS2 (completed)
- Visualize point cloud data (completed)
- Point cloud processing
CARLA Online RGB Camera + LiDAR Integration
- Extract camera intrinsic parameters (completed)
- Extract LiDAR extrinsic parameters (completed)
- Validate camera–LiDAR synchronization (completed)
- Transform LiDAR points into camera coordinates (completed)
- Project LiDAR points onto image plane (completed)
- Visualize projected LiDAR points on RGB images (completed)
- ✅ Camera intrinsics extracted from ROS2 CameraInfo
- ✅ Camera matrix validated
- ✅ LiDAR extrinsics derived from CARLA sensor configuration
- ✅ ROS2 TimeSynchronizer implemented
- ✅ Exact camera–LiDAR timestamp synchronization verified
- ✅ PointCloud2 parsing implemented
- ✅ LiDAR → camera coordinate transformation implemented
- ✅ Perspective projection implemented
- ✅ Image bounds filtering implemented
- ✅ OpenCV overlay visualization implemented
- ✅ Depth-colored projection visualization implemented
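The projection and image-bounds filtering steps listed above can be sketched as follows. This is a minimal illustration, not the project's actual node: the function name `project_to_image` and the `min_depth` cutoff are hypothetical, and the input points are assumed to be already expressed in the camera optical frame (x right, y down, z forward).

```python
import numpy as np

def project_to_image(points_cam, K, width, height, min_depth=0.1):
    """Project 3D points (camera optical frame) to pixels and drop out-of-image points.

    Returns (pixels, depths): an (N, 2) array of (u, v) and the matching z depths.
    """
    points_cam = np.asarray(points_cam, dtype=float).reshape(-1, 3)
    depth = points_cam[:, 2]

    # Keep only points in front of the camera (min_depth is an assumed cutoff).
    in_front = depth > min_depth
    pts = points_cam[in_front]
    depth = depth[in_front]

    # Pinhole perspective projection: u = fx * x / z + cx, v = fy * y / z + cy.
    u = K[0, 0] * pts[:, 0] / depth + K[0, 2]
    v = K[1, 1] * pts[:, 1] / depth + K[1, 2]

    # Image bounds filtering.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    pixels = np.stack([u[inside], v[inside]], axis=1)
    return pixels, depth[inside]

# Intrinsics as listed in this document (640 x 480 image).
K = np.array([[320.0, 0.0, 320.0],
              [0.0, 320.0, 240.0],
              [0.0, 0.0, 1.0]])
pixels, depths = project_to_image(
    [[0.0, 0.0, 10.0], [-2.0, 1.0, 11.5], [0.0, 0.0, -5.0], [20.0, 0.0, 10.0]],
    K, 640, 480)
```

The returned `depths` array can then drive a color map when drawing each pixel on the RGB image, for example with OpenCV's `cv2.circle`.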
Topic:
/carla/ego_vehicle/rgb_front/image
Resolution:
- 640 × 480
FOV:
- 90°
Position:
- x = -1.5
- y = 0.0
- z = 2.4
Topic:
/carla/ego_vehicle/lidar
Position:
- x = 0.0
- y = 0.0
- z = 2.4
Parameters:
- Range: 50 m
- Channels: 32
- Points/sec: 56000
- Rotation Frequency: 10 Hz
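Given the two mounting positions above, a LiDAR-to-camera extrinsic can be sketched as below. The assumptions are stated loudly because the source does not confirm them: both sensors are taken to have zero rotation relative to the vehicle, incoming points are taken to follow the ROS body convention (x forward, y left, z up), and the target is the camera optical frame (x right, y down, z forward). The function names are hypothetical.

```python
import numpy as np

# Mounting positions in the vehicle frame (metres), from the sensor configuration.
LIDAR_POS = np.array([0.0, 0.0, 2.4])
CAMERA_POS = np.array([-1.5, 0.0, 2.4])

# Assumed axis change: body (x forward, y left, z up) -> optical (x right, y down, z forward).
R_OPTICAL_FROM_BODY = np.array([
    [0.0, -1.0, 0.0],
    [0.0, 0.0, -1.0],
    [1.0, 0.0, 0.0],
])

def lidar_to_camera_extrinsic(lidar_pos=LIDAR_POS, camera_pos=CAMERA_POS):
    """Build a 4x4 transform taking LiDAR-frame points to the camera optical frame."""
    T = np.eye(4)
    T[:3, :3] = R_OPTICAL_FROM_BODY
    # Offset of the LiDAR origin as seen from the camera, rotated into optical axes.
    T[:3, 3] = R_OPTICAL_FROM_BODY @ (np.asarray(lidar_pos) - np.asarray(camera_pos))
    return T

def transform_points(T, points):
    """Apply a 4x4 rigid transform to an (N, 3) array of points."""
    points = np.asarray(points, dtype=float).reshape(-1, 3)
    return points @ T[:3, :3].T + T[:3, 3]

T_cam_lidar = lidar_to_camera_extrinsic()
# A point 10 m ahead of the LiDAR, 2 m to the left and 1 m below it.
example = transform_points(T_cam_lidar, [[10.0, 2.0, -1.0]])
```

Because the camera sits 1.5 m behind the LiDAR at the same height, a point 10 m ahead of the LiDAR ends up 11.5 m along the camera's optical axis under these assumptions.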
fx = 320.0
fy = 320.0
cx = 320.0
cy = 240.0
Camera Matrix:
| | | |
|---|---|---|
| 320 | 0 | 320 |
| 0 | 320 | 240 |
| 0 | 0 | 1 |
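As a cross-check on the values read from `CameraInfo`, the same intrinsics follow from the image size and field of view under an ideal pinhole model. The sketch below assumes the 90° FOV is the horizontal field of view and that pixels are square; `pinhole_intrinsics` is a hypothetical helper name.

```python
import math

def pinhole_intrinsics(width, height, fov_deg):
    """Return (fx, fy, cx, cy) for an ideal pinhole camera with square pixels."""
    # Horizontal FOV spans the full image width: tan(fov / 2) = (width / 2) / fx.
    fx = width / (2.0 * math.tan(math.radians(fov_deg) / 2.0))
    fy = fx                      # square pixels assumed
    cx = width / 2.0             # principal point assumed at the image centre
    cy = height / 2.0
    return fx, fy, cx, cy

fx, fy, cx, cy = pinhole_intrinsics(640, 480, 90.0)
K = [[fx, 0.0, cx],
     [0.0, fy, cy],
     [0.0, 0.0, 1.0]]
```

With a 640 × 480 image and a 90° FOV, `fx` and `fy` come out at 320 (up to floating-point rounding), with `cx = 320` and `cy = 240`, matching the camera matrix above.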
Used:
- ROS2 `message_filters.TimeSynchronizer`
Results:
Image TS : 3045.320203
LiDAR TS : 3045.320203
Delta : 0.000 ms
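A synchronization check of this kind could look roughly like the sketch below. It is not the project's actual node: the node name, queue size, and the `delta_ms` helper are assumptions, and the ROS2 imports are deferred into `main()` so the timestamp helper can be used on its own.

```python
def delta_ms(stamp_a, stamp_b):
    """Absolute difference in milliseconds between two (sec, nanosec) stamps."""
    ns_a = stamp_a[0] * 1_000_000_000 + stamp_a[1]
    ns_b = stamp_b[0] * 1_000_000_000 + stamp_b[1]
    return abs(ns_a - ns_b) / 1e6

def main():
    # Deferred imports: these require a sourced ROS2 environment.
    import rclpy
    import message_filters
    from sensor_msgs.msg import Image, PointCloud2

    rclpy.init()
    node = rclpy.create_node("camera_lidar_sync_check")  # hypothetical node name

    image_sub = message_filters.Subscriber(
        node, Image, "/carla/ego_vehicle/rgb_front/image")
    lidar_sub = message_filters.Subscriber(
        node, PointCloud2, "/carla/ego_vehicle/lidar")

    def on_pair(image_msg, cloud_msg):
        # TimeSynchronizer only delivers pairs whose header stamps match exactly.
        a = (image_msg.header.stamp.sec, image_msg.header.stamp.nanosec)
        b = (cloud_msg.header.stamp.sec, cloud_msg.header.stamp.nanosec)
        node.get_logger().info(f"Delta : {delta_ms(a, b):.3f} ms")

    # Queue size of 10 is an assumed value.
    sync = message_filters.TimeSynchronizer([image_sub, lidar_sub], 10)
    sync.registerCallback(on_pair)

    try:
        rclpy.spin(node)
    finally:
        node.destroy_node()
        rclpy.shutdown()
```

Calling `main()` starts the node. Exact-match synchronization works here because both sensors report identical timestamps; sensors with slightly different stamps would typically need `message_filters.ApproximateTimeSynchronizer` instead.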
CARLA Online RGB Camera + LiDAR Integration
- Run YOLOv8 perception pipeline
- Associate LiDAR points with detected objects
- Filter object-specific point clusters
- Estimate object distance
- Object Class
- Bounding Box
- Estimated Distance
Car: 18.4 m
Truck: 27.1 m
Pedestrian: 12.3 m
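One simple way to produce such a per-object distance is to take the LiDAR points that project inside a detection's bounding box and use their median depth. The sketch below assumes the points are already projected to pixel coordinates with known depths; the median, the `min_points` threshold, and the function name are illustrative choices, not confirmed details of this project.

```python
import numpy as np

def estimate_distance(pixels, depths, bbox, min_points=3):
    """Median depth (metres) of projected LiDAR points inside a bounding box.

    pixels: (N, 2) array of (u, v); depths: (N,) array; bbox: (x1, y1, x2, y2).
    Returns None when too few points fall inside the box.
    """
    x1, y1, x2, y2 = bbox
    pixels = np.asarray(pixels, dtype=float).reshape(-1, 2)
    depths = np.asarray(depths, dtype=float)

    inside = ((pixels[:, 0] >= x1) & (pixels[:, 0] <= x2) &
              (pixels[:, 1] >= y1) & (pixels[:, 1] <= y2))
    if np.count_nonzero(inside) < min_points:
        return None

    # The median is less sensitive than the mean to background points
    # that fall inside the box but belong to something behind the object.
    return float(np.median(depths[inside]))

pixels = [[100, 100], [110, 105], [120, 110], [130, 100], [400, 300]]
depths = [18.2, 18.4, 18.6, 40.0, 5.0]
distance = estimate_distance(pixels, depths, bbox=(90, 90, 140, 120))
```

Note that this returns depth along the camera's optical axis. If Euclidean range is wanted instead, the norm of each 3D point would be used in place of its z value.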
- Fuse camera detections with LiDAR measurements
- Generate distance-aware detections
- Evaluate fusion robustness
- Analyze fusion performance
- Multi-modal perception
- Detection enhancement
- Scene understanding
- Generate Bird's-Eye View representation
- Visualize projected point clouds
- Estimate object positions
- Build spatial awareness pipeline
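A Bird's-Eye View representation can be as simple as a top-down grid counting LiDAR returns per cell. The sketch below is one possible starting point, assuming points in a frame with x forward and y left; the grid extent, the 0.5 m cell size, and the function name are hypothetical, with the 50 m forward extent chosen to match the LiDAR range.

```python
import numpy as np

def bev_occupancy(points, x_range=(0.0, 50.0), y_range=(-25.0, 25.0), cell=0.5):
    """Rasterize (N, 3) points into a top-down grid of per-cell point counts.

    Rows index forward distance (x), columns index lateral offset (y).
    """
    points = np.asarray(points, dtype=float).reshape(-1, 3)
    x, y = points[:, 0], points[:, 1]

    n_rows = int(round((x_range[1] - x_range[0]) / cell))
    n_cols = int(round((y_range[1] - y_range[0]) / cell))

    # Discard points outside the grid extent.
    keep = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]))
    rows = np.floor((x[keep] - x_range[0]) / cell).astype(int)
    cols = np.floor((y[keep] - y_range[0]) / cell).astype(int)

    # Guard against floating-point edge cases at the upper boundary.
    rows = np.clip(rows, 0, n_rows - 1)
    cols = np.clip(cols, 0, n_cols - 1)

    grid = np.zeros((n_rows, n_cols), dtype=np.int32)
    np.add.at(grid, (rows, cols), 1)  # accumulates correctly for repeated cells
    return grid

grid = bev_occupancy([[10.0, 0.0, 0.0], [10.1, 0.1, 1.0],
                      [60.0, 0.0, 0.0], [5.0, -30.0, 0.0]])
```

Height is ignored in this sketch; a fuller BEV would typically also encode per-cell maximum height or intensity as extra channels.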
Transition from 2D perception toward 3D scene understanding.
| Pipeline | Detection | Distance Estimation | Spatial Awareness |
|---|---|---|---|
| Camera Only | ✓ | ✗ | Limited |
| Camera + LiDAR | ✓ | ✓ | Improved |
- Distance estimation accuracy
- Sensor synchronization stability
- Fusion processing latency
- Perception throughput
- ROS2 LiDAR integration
- Camera-LiDAR calibration pipeline
- Distance-aware object detection
- Sensor fusion perception pipeline
- Basic 3D perception framework
- Fusion benchmarking report
- Multi-modal perception demonstration
Expand perception coverage using multiple synchronized cameras.
- Front camera
- Rear camera
- Left camera
- Right camera
- Multi-camera ROS2 integration
- Camera synchronization
- Multi-stream visualization
- Cross-camera object tracking
- Overlapping field-of-view analysis
- Unified perception visualization
- 360° scene awareness
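For the overlapping field-of-view analysis, a first-order check is to compare each camera's horizontal FOV with the angular spacing to its neighbor. The sketch below assumes all cameras share the same horizontal FOV and are described only by a yaw angle; the yaw values (front 0°, right 90°, rear 180°, left 270°) and the helper name are hypothetical.

```python
def adjacent_overlap_deg(yaw_a_deg, yaw_b_deg, fov_deg):
    """Horizontal overlap in degrees between two cameras with the same FOV.

    Positive values mean the views overlap, zero means they just touch,
    and negative values mean there is a blind gap between them.
    """
    # Smallest angular separation between the two optical axes, in [0, 180].
    separation = abs((yaw_a_deg - yaw_b_deg + 180.0) % 360.0 - 180.0)
    return fov_deg - separation

# Hypothetical yaw layout for a four-camera rig.
CAMERA_YAWS = {"front": 0.0, "right": 90.0, "rear": 180.0, "left": 270.0}

def ring_overlaps(yaws, fov_deg):
    """Overlap between each camera and the next one around the ring."""
    names = sorted(yaws, key=yaws.get)
    return {
        (a, b): adjacent_overlap_deg(yaws[a], yaws[b], fov_deg)
        for a, b in zip(names, names[1:] + names[:1])
    }

overlaps_90 = ring_overlaps(CAMERA_YAWS, 90.0)
overlaps_110 = ring_overlaps(CAMERA_YAWS, 110.0)
```

Under these assumptions, four cameras at 90° FOV tile the full 360° with no overlap at all, so cross-camera tracking would have no shared region to hand objects over in; a wider FOV such as 110° leaves 20° of shared view between neighbors.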
- Multi-camera architecture
- Perception scalability
- Sensor synchronization
- Multi-camera ROS2 pipeline
- 360° perception visualization
- Multi-camera benchmarking report
- Multi-camera perception demonstration
Expand perception coverage from a single viewpoint to full-surround awareness.
Combine camera, LiDAR, and multi-camera perception into a unified perception system.
- Multi-camera and LiDAR synchronization
- Multi-modal data association
- Bird's-Eye View generation
- Unified world representation
- Object localization in world coordinates
- Scene-level perception analysis
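Localizing an object in world coordinates amounts to chaining the object's position in the vehicle frame with the vehicle's pose in the world. The sketch below shows the planar case only, assuming a right-handed frame with counterclockwise yaw (the usual ROS convention); the function name and the example pose are hypothetical.

```python
import math

def vehicle_to_world(point_xy, vehicle_xy, vehicle_yaw_rad):
    """Transform a 2D point from the vehicle frame to the world frame.

    point_xy: (x forward, y left) in metres relative to the vehicle.
    vehicle_xy: vehicle position in the world frame.
    vehicle_yaw_rad: vehicle heading, counterclockwise from the world x axis.
    """
    px, py = point_xy
    vx, vy = vehicle_xy
    c, s = math.cos(vehicle_yaw_rad), math.sin(vehicle_yaw_rad)
    # Rotate by the heading, then translate by the vehicle position.
    return (vx + c * px - s * py, vy + s * px + c * py)

# An object 10 m ahead of a vehicle at (100, 50) heading along world +y.
world_x, world_y = vehicle_to_world((10.0, 0.0), (100.0, 50.0), math.pi / 2)
```

CARLA's native frame is left-handed, so poses taken directly from the simulator rather than from ROS2 would need a y-axis sign flip before using this form.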
- Advanced sensor fusion
- 3D scene understanding
- Spatial reasoning
- Autonomous perception architecture
- Multi-camera + LiDAR fusion pipeline
- Bird's-Eye View visualization
- Unified perception framework
- Multi-modal benchmarking report
Build a complete multi-modal perception stack resembling modern autonomous systems.
Deploy the optimized perception stack on embedded edge hardware.
- NVIDIA Jetson Nano
- NVIDIA Jetson Xavier NX
- NVIDIA Jetson Orin Nano
- Containerize perception pipeline
- Prepare deployment scripts
- Package ROS2 perception nodes
- Validate TensorRT deployment workflow
- Optimize memory usage
- Optimize power consumption
- Tune TensorRT inference settings
- Analyze thermal behavior
- Evaluate deployment constraints
| Platform | FPS | Latency | GPU Utilization | Memory Usage |
|---|---|---|---|---|
| Desktop GPU | | | | |
| Jetson Nano | | | | |
| Xavier NX | | | | |
| Orin Nano | | | | |
- Real-time FPS
- End-to-end latency
- Memory footprint
- Power efficiency
- Thermal stability
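The FPS and latency figures could be collected with a small timing harness like the one below. This is a generic sketch, not the project's benchmarking tool: `benchmark` is a hypothetical helper, and the callable passed in stands for one full pipeline step on a single frame.

```python
import math
import statistics
import time

def benchmark(fn, n_iters=100, warmup=10):
    """Time repeated calls to fn and report mean latency, p95 latency, and FPS."""
    # Warm-up runs are excluded so one-time setup costs do not skew the result.
    for _ in range(warmup):
        fn()

    latencies_ms = []
    for _ in range(n_iters):
        start = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.fmean(latencies_ms)
    # Nearest-rank 95th percentile.
    p95_index = max(0, math.ceil(0.95 * len(latencies_ms)) - 1)
    return {
        "mean_latency_ms": mean_ms,
        "p95_latency_ms": sorted(latencies_ms)[p95_index],
        "fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }

# Stand-in workload; replace with a real per-frame inference call.
result = benchmark(lambda: sum(range(10_000)), n_iters=20, warmup=2)
```

This measures wall-clock time on the host only. Memory, power, and thermal metrics need platform tools, and GPU inference typically requires an explicit device synchronization inside the timed callable so that queued work is actually counted.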
- Continuous perception testing
- Long-duration stability testing
- Resource monitoring
- Failure analysis
- Edge deployment workflow
- TensorRT deployment package
- Containerized perception stack
- Edge benchmarking report
- Embedded deployment guide
Deploy a robotics perception system on real edge hardware with validated real-time performance.
- Integrate transformer-based detector
- Accuracy
- FPS
- Latency
- Edge suitability
- CNN vs ViT perception comparison
- GitHub repository
- Demo video
- Benchmark report
- ROS2 modular perception stack
- CNN vs ViT comparison

