Models for estimating the time of program loop execution in parallel on a CPU and with the use of OpenCL computation on a GPU
DOI:
https://doi.org/10.24136/atest.2018.501
Keywords:
programming loop, estimating the time of the loop, programming CPU and GPGPU
Abstract
The authors present models for estimating the execution time of program loops that comply with the FAN model and have either no data dependencies or data dependencies only within the loop body. Such loops can be executed either by CPUs or by stream multiprocessors, referred to as GPU cores. The models make it possible to determine whether it is more efficient to run a computation in the existing environment on the CPU (Central Processing Unit) or on a modern graphics card with a high-performance GPU (Graphics Processing Unit) and the very fast memory often fitted to such cards. Validity checks confirming the developed time estimation model for the GPU are presented. The purpose of these models is to provide methods for accelerating applications that perform various tasks, including transport tasks such as accelerated solution searching, path searching in graphs, or image processing algorithms in the vision systems of autonomous and semi-autonomous vehicles. In these applications the models make it possible to build a system that automatically distributes tasks between the CPU and the GPU under varying computing resources.
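The idea of choosing between the CPU and the GPU from estimated loop times can be illustrated with a minimal sketch. It is not the authors' model, which the abstract does not spell out. It assumes a dependency-free loop whose iterations all take the same time, an even split of iterations across workers, and a fixed host-to-device transfer cost for the GPU. All function names, parameters, and numbers below are hypothetical.

```python
import math


def estimate_cpu_time(n_iterations, t_iteration, n_threads, t_overhead=0.0):
    """Assumed model: iterations are split evenly across CPU threads.

    The slowest thread runs ceil(n_iterations / n_threads) iterations.
    """
    per_thread = math.ceil(n_iterations / n_threads)
    return per_thread * t_iteration + t_overhead


def estimate_gpu_time(n_iterations, t_iteration, n_cores, t_transfer, t_launch=0.0):
    """Assumed model: iterations run in 'waves' of n_cores work items.

    t_transfer is a fixed cost for moving data to and from the device.
    """
    waves = math.ceil(n_iterations / n_cores)
    return t_transfer + t_launch + waves * t_iteration


def choose_device(cpu_time, gpu_time):
    """Pick the device with the smaller estimated time (ties go to the CPU)."""
    return "GPU" if gpu_time < cpu_time else "CPU"


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show both outcomes.
    for n in (8, 1000):
        cpu = estimate_cpu_time(n, t_iteration=0.001, n_threads=4)
        gpu = estimate_gpu_time(n, t_iteration=0.002, n_cores=500, t_transfer=0.05)
        print(n, round(cpu, 4), round(gpu, 4), choose_device(cpu, gpu))
```

With these made-up parameters the short loop stays on the CPU because the transfer cost dominates, while the longer loop is cheaper on the GPU. A task distribution system of the kind the abstract describes would need measured parameters in place of these constants.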