Title: Accelerating the RICH Particle Detector Algorithm on Intel Xeon Phi
Abstract: At the LHC, particles are collided in order to understand how the universe was created. These collisions are called events and generate large quantities of data, which have to be pre-filtered before they are stored on hard disks. This paper presents a parallel implementation of the algorithms used for this pre-filtering, specifically designed for the Intel Xeon Phi Knights Landing platform and exploiting its 64 cores and AVX-512 instruction set. It shows that a linear speedup up to approximately 64 threads is attainable when vectorization is used, data is aligned to cache line boundaries, program execution is pinned to MCDRAM, mathematical expressions are transformed into more efficient equivalent formulations, and OpenMP is used for parallelization. With these changes, the code went from being compute bound to being memory bound. Overall, a speedup of 36.47x was reached with an error smaller than the detector resolution.
Publication Year: 2018
Publication Date: 2018-03-01
Language: en
Type: article
Indexed In: ['crossref']