XJoin
Eugenio Marinelli
Portable, parallel hash join implementation across diverse XPU architectures with oneAPI
Project status: Under Development
Intel Technologies
oneAPI, DPC++, Intel Iris Xe MAX, Intel Integrated Graphics, DevCloud
Overview / Usage
Modern server hardware is increasingly heterogeneous, with a diverse mix of XPU architectures deployed across CPUs, GPUs, and FPGAs. To date, however, database developers have had to rely on either proprietary, architecture-specific solutions (such as CUDA) or low-level, cross-architecture solutions that complicate development (such as OpenCL). This lack of portable parallelism, caused by the absence of a common high-level programming framework, is one of the main obstacles to wider adoption of XPUs by database systems. In this project, we take the first steps toward solving this problem using oneAPI, a cross-industry effort to develop an open, standards-based, unified programming model that extends standard C++ to provide portable parallelism across diverse processor architectures.
Methodology / Approach
We port a recently proposed, highly optimized, GPU-based hash join algorithm from CUDA to Data Parallel C++ (DPC++). We then execute the hash join on multicore CPUs, integrated GPUs (Intel Gen9), and discrete GPUs (Intel DG1 and NVIDIA GeForce) without changing a single line of kernel code, showing that DPC++ enables portable parallelism. We compare the performance of the DPC++ kernels against hand-optimized CUDA kernels and model-based theoretical performance bounds to quantify the performance-portability trade-off of using DPC++.
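To make the build/probe structure of a hash join concrete, here is a minimal sketch in standard C++. It is not the XJoin kernel code: every name in it (`Tuple`, `HashTable`, `build_one`, `probe_one`, `hash_join`) is hypothetical, and the open-addressing, linear-probing table is an assumption chosen for brevity rather than the layout the ported algorithm uses. The per-tuple functions are written so that each call is independent of the others, which is the shape a data-parallel kernel body would take. Here they are driven by ordinary loops so the example compiles without the oneAPI toolchain.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// All names below are hypothetical and for illustration only.
constexpr std::uint32_t kEmpty = 0xFFFFFFFFu;  // sentinel for an unused slot; keys must differ from it

struct Tuple {
    std::uint32_t key;
    std::uint32_t payload;
};

// Open-addressing hash table with linear probing (an assumed layout).
struct HashTable {
    std::vector<std::atomic<std::uint32_t>> keys;
    std::vector<std::uint32_t> payloads;

    explicit HashTable(std::size_t capacity) : keys(capacity), payloads(capacity) {
        // Capacity must be a power of two so that masking replaces modulo.
        assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
        for (auto& k : keys) k.store(kEmpty, std::memory_order_relaxed);
    }

    std::size_t mask() const { return keys.size() - 1; }

    // Multiplicative hash; the constant is a common choice, not a requirement.
    std::size_t slot(std::uint32_t key) const {
        return static_cast<std::size_t>(key * 2654435761u) & mask();
    }
};

// Build step for one tuple: claim the first free slot with a compare-and-swap.
// The CAS keeps slot claiming correct if several tuples are inserted concurrently.
void build_one(HashTable& ht, const Tuple& t) {
    std::size_t s = ht.slot(t.key);
    for (;;) {
        std::uint32_t expected = kEmpty;
        if (ht.keys[s].compare_exchange_strong(expected, t.key)) {
            ht.payloads[s] = t.payload;
            return;
        }
        s = (s + 1) & ht.mask();  // slot taken, try the next one
    }
}

// Probe step for one tuple: walk the probe sequence until an empty slot.
// Appending to `out` is NOT thread-safe; a parallel version would need
// something like an atomic output cursor or per-worker buffers instead.
void probe_one(const HashTable& ht, const Tuple& t,
               std::vector<std::pair<std::uint32_t, std::uint32_t>>& out) {
    std::size_t s = ht.slot(t.key);
    for (;;) {
        const std::uint32_t k = ht.keys[s].load();
        if (k == kEmpty) return;  // end of probe sequence, no more matches
        if (k == t.key) out.emplace_back(ht.payloads[s], t.payload);
        s = (s + 1) & ht.mask();
    }
}

// Joins R and S on key; returns (R payload, S payload) pairs in probe order.
std::vector<std::pair<std::uint32_t, std::uint32_t>>
hash_join(const std::vector<Tuple>& R, const std::vector<Tuple>& S) {
    // Size the table to at least twice |R| so free slots always remain,
    // which guarantees that both loops above terminate.
    std::size_t capacity = 2;
    while (capacity < 2 * R.size()) capacity *= 2;
    HashTable ht(capacity);

    for (const Tuple& r : R) build_one(ht, r);  // build phase

    std::vector<std::pair<std::uint32_t, std::uint32_t>> out;
    for (const Tuple& s : S) probe_one(ht, s, out);  // probe phase
    return out;
}
```

Calling `hash_join(R, S)` on two small vectors of `Tuple` returns one pair per matching key. In a data-parallel port, the two loops in `hash_join` are the parts that would be handed to the device as parallel kernels, with result collection reworked to be safe under concurrency.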