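Abstract (translated from the Chinese)
Commercial stereo vision development systems currently on the market include Microsoft's Kinect, ASUS's Xtion, and Intel's RealSense. However, the distance between the two cameras in these systems is fixed and cannot be adjusted to suit the application. They also require a specific operating system and a high-end processor, so they often cannot meet the needs of applications that must process stereo images in mobile environments. This thesis uses an Altera Cyclone V series SoC FPGA hardware platform, on which a self-developed IP core, together with the HPS-to-FPGA Bridge, transfers images to the FPGA for depth image computation. The results can be returned to the HPS for further image processing and output to an LCD or VGA display. Stereo images are captured with two web cameras, and Semi-Global Matching is combined with a Sobel filter, the Census transform, and disparity refinement to perform stereo matching and optimization, forming a real-time 3D depth image development platform. Through hardware/software integration of the HPS and FPGA, algorithms that need acceleration are implemented in FPGA hardware circuits, while software is developed under Embedded Linux with the OpenCV library. This gives users a more flexible real-time stereo image development platform and lets developers choose the camera specifications and camera spacing that suit their own applications when designing and developing prototype stereo imaging products. In the implementation, depth estimation uses only 18,023 LUTs and 580,362 memory bits, so its complexity is low. Finally, the accuracy of the system is verified with the standard Middlebury Tsukuba test images and with live video.
Keywords: stereo vision, image development platform, embedded system, Semi-Global Matching algorithm, hard processor system, field-programmable gate array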
Abstract
Several stereo imaging systems have been commercialized, such as Microsoft's Kinect, ASUS's Xtion, and Intel's RealSense. However, they require a high-end CPU, which is expensive and power-hungry, so they are poorly suited to mobile applications. In addition, the distance between the two cameras and the stereo matching algorithm are constrained by the device's camera specifications and cannot be adjusted. The proposed Embedded Real-time Stereo Image Development System is built on an Altera Cyclone V series SoC FPGA. Depth computation is implemented in hardware through an intellectual property (IP) core designed for this system, while applications that demand more flexibility are implemented in software. Stereo information is calculated mainly through Semi-Global Matching (SGM), a Sobel filter, the Census transform, and a disparity refinement algorithm. Two USB cameras are attached under the Embedded Linux operating system, and video capture software is developed with the V4L2 API. The image data are transferred to the FPGA memory through the AXI bridges, the depth image is computed with the SGM algorithm, and the processed depth information is then composited and displayed on an LCD. Meanwhile, the depth information is sent back to the HPS memory for use by applications in Embedded Linux. Designers can adjust the camera specifications and the stereo matching algorithms. The stereo matching and disparity refinement algorithms consume 18,023 LUTs and 580,362 memory bits of the FPGA's resources, and the system was verified with the Tsukuba image pair from the Middlebury Stereo Datasets.
Keywords: Stereo Vision, Image Development Platform, Embedded System, SGM, HPS, FPGA
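The Census-transform matching cost named in the abstract can be illustrated with a minimal software sketch. This is not the FPGA IP described in this work: the function names (`census_transform`, `hamming`, `best_disparity`), the 3x3 window, and the winner-takes-all selection are assumptions made for illustration, and the SGM path aggregation, Sobel pre-filtering, and disparity refinement steps are omitted.

```python
def census_transform(img, radius=1):
    """Return a census bit string (as an int) for each pixel of a 2D grayscale image.

    A bit is 1 when the neighbour is darker than the window centre.
    Border pixels that lack a full window are left as 0.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            centre = img[y][x]
            bits = 0
            # Scan the window row by row, skipping the centre pixel.
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue
                    darker = 1 if img[y + dy][x + dx] < centre else 0
                    bits = (bits << 1) | darker
            out[y][x] = bits
    return out


def hamming(a, b):
    """Matching cost: number of differing bits between two census codes."""
    return bin(a ^ b).count("1")


def best_disparity(census_left, census_right, y, x, max_disp):
    """Winner-takes-all disparity for left pixel (y, x).

    Compares the left code at x with the right codes at x - d for
    d = 0..max_disp (clipped at the image edge) and returns the d
    with the lowest Hamming cost.
    """
    costs = [
        hamming(census_left[y][x], census_right[y][x - d])
        for d in range(max_disp + 1)
        if x - d >= 0
    ]
    return min(range(len(costs)), key=costs.__getitem__)
```

In a full SGM pipeline the per-pixel Hamming costs would be aggregated along several scan directions with smoothness penalties before the minimum is taken; the sketch stops at the raw cost so that the Census step stays visible.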