A framework for accelerating local feature extraction with OpenCL on multi-core CPUs and co-processors

Konrad Moren; Diana Göhringer

doi:10.1007/s11554-016-0576-0

Publication: Contribution to journal › Research article › Contributed › Peer-reviewed

Contributors

Konrad Moren, Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung (Author)
Diana Göhringer, Ruhr-Universität Bochum (Author)

Abstract

In this paper, we examined heterogeneous architectures for their suitability to run the scale-invariant feature transform (SIFT) algorithm in real time. SIFT is one of the most robust, and also one of the most computationally intensive, algorithms for extracting local features in machine-vision applications. Many studies have presented methods for improving SIFT execution time. However, the described techniques focus only on improving execution time on a single homogeneous device. To address this gap for multi-device heterogeneous platforms, we prepared an OpenCL-SIFT implementation. We described techniques to efficiently parallelize an application that contains many different computing cores. Through careful optimization, we obtained a performance-portable implementation for efficient processing on various multi-device heterogeneous platforms. The experimental results showed that our implementation achieves appropriate accuracy and higher efficiency than recent open-source SIFT implementations. Using the proposed methods, we extracted SIFT features at more than 30 FPS on Full-HD images on different processor architectures. Additionally, to increase performance, we showed efficient multi-device scheduling methods for SIFT feature extraction (average speed-up of 2.69×). Finally, we described guidelines for optimizing GPGPU-OpenCL programs for x86 multi-core CPUs. The discussed methods are generic and may be used in the design of other algorithms.

Details

Original language	English
Pages (from - to)	901-918
Number of pages	18
Journal	Journal of Real-Time Image Processing
Volume	16
Issue number	4
Publication status	Published - 13 Aug 2019
Peer-review status	Yes
Externally published	Yes

External IDs

ORCID	/0000-0003-2571-8441/work/159607536

Keywords

ASJC Scopus subject areas

Information systems

Keywords

GPGPU, Intel Xeon Phi, Multi-core CPU, OpenCL, Platform specific optimizations, SIFT