
This note collects frequently asked questions (FAQ) from people interested in PG-Strom.

PG-Strom features

Why is a GPU-based sorting plan (GpuSort) not supported?

Due to the characteristics of the sorting workload, we cannot kick off a GPU sorting kernel until all the data has been loaded onto GPU device memory. It is a fully synchronous operation, unlike the other GPU-accelerated plans (Scan/Join/GroupBy), so it would eliminate one significant advantage of PG-Strom: asynchronous execution on the GPU while the CPU is loading data blocks from storage or the buffer. In addition, sorting performance usually matters for large-scale data, not small data, yet GPU device memory can hold only a limited amount of data to be sorted (up to 80GB on NVIDIA A100). This implies GPU sorting cannot handle data sets larger than the device memory, and we doubt such a feature is really valuable to provide. Of course, we could split the input data into multiple chunks that fit into GPU device memory and let the CPU merge the sorted chunks, but this design eventually consumes huge CPU cycles in addition to the data transfer cost between host memory and device memory (see the sketch below). For these reasons, we gave up supporting the GpuSort logic in PG-Strom. If you are a developer, check out some source files under the deadcode/ directory.
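
As a rough illustration of why the chunk-and-merge fallback mentioned above is unattractive, the following sketch (hypothetical code, not taken from PG-Strom; it uses Thrust for brevity) sorts a data set larger than device memory by sorting fixed-size chunks on the GPU and merging the sorted chunks on the CPU. The k-way merge loop and the host/device copies are exactly the CPU-side work and PCI-E traffic described above.

```cuda
/*
 * Hypothetical sketch (not PG-Strom code): sort data larger than GPU device
 * memory by splitting it into chunks, sorting each chunk on the GPU, then
 * merging the sorted chunks on the CPU.
 * Build with e.g.: nvcc -std=c++14 chunked_sort.cu
 */
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/copy.h>
#include <algorithm>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

static std::vector<int>
chunked_gpu_sort(const std::vector<int> &input, size_t chunk_items)
{
    std::vector<std::vector<int>> chunks;

    /* 1. sort each chunk on the GPU; each chunk fits in device memory */
    for (size_t off = 0; off < input.size(); off += chunk_items)
    {
        size_t len = std::min(chunk_items, input.size() - off);
        thrust::device_vector<int> dbuf(input.begin() + off,
                                        input.begin() + off + len);  /* H->D */
        thrust::sort(dbuf.begin(), dbuf.end());          /* GPU sort kernel */

        std::vector<int> hbuf(len);
        thrust::copy(dbuf.begin(), dbuf.end(), hbuf.begin());        /* D->H */
        chunks.push_back(std::move(hbuf));
    }

    /* 2. k-way merge of the sorted chunks, entirely on the CPU */
    typedef std::pair<int, size_t> item_t;      /* (value, chunk index) */
    std::priority_queue<item_t, std::vector<item_t>,
                        std::greater<item_t>> pq;
    std::vector<size_t> pos(chunks.size(), 0);

    for (size_t i = 0; i < chunks.size(); i++)
        if (!chunks[i].empty())
            pq.push(item_t(chunks[i][0], i));

    std::vector<int> result;
    result.reserve(input.size());
    while (!pq.empty())
    {
        item_t top = pq.top();
        pq.pop();
        result.push_back(top.first);
        size_t cid = top.second;
        if (++pos[cid] < chunks[cid].size())
            pq.push(item_t(chunks[cid][pos[cid]], cid));
    }
    return result;
}

int main(void)
{
    std::vector<int> data(1 << 20);
    for (size_t i = 0; i < data.size(); i++)
        data[i] = (int)((data.size() - i) * 2654435761u);   /* pseudo-random */
    std::vector<int> sorted = chunked_gpu_sort(data, 1 << 18);
    return std::is_sorted(sorted.begin(), sorted.end()) ? 0 : 1;
}
```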

Supported devices/platforms

Which GPUs support GPU-Direct SQL?

We implemented the GPU-Direct SQL feature on top of NVIDIA GPUDirect RDMA (https://docs.nvidia.com/cuda/gpudirect-rdma/index.html). It allows third-party Linux kernel modules (such as nvme_strom.ko by HeteroDB) to intermediate direct data transfer from other PCI-E devices to the GPU. However, this functionality is enabled only on Tesla/Quadro, so we cannot support SSD-to-GPU Direct SQL on GeForce RTX/GTX devices. In addition, most GPU devices, except for high-end Tesla, have only 256MB of PCI-E Bar1 memory space, which is used as the window for data transfer in P2P DMA/RDMA. For PG-Strom's usage, 256MB is too small for concurrent and multiplexed data transfer. Right now, only Tesla V100, P100 and P40 offer PCI-E Bar1 memory space larger than their physical device memory (NVIDIA A100 shall have a large PCI-E Bar1 memory space as well). So, we support SSD-to-GPU Direct SQL only on the following devices (a sketch to check the Bar1 size of your own GPU follows the list).

  • NVIDIA A100 (Ampere gen)
  • NVIDIA A40 (Ampere gen)
  • NVIDIA Tesla V100 (Volta gen)
  • NVIDIA Tesla P100 (Pascal gen)
  • NVIDIA Tesla P40 (Pascal gen)
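
If you want to check how much Bar1 space your own GPU exposes, the following sketch (hypothetical code, not part of PG-Strom) queries it through the NVML API; nvidia-smi -q usually reports the same numbers under "BAR1 Memory Usage".

```cuda
/*
 * Hypothetical sketch (not part of PG-Strom): print the PCI-E Bar1 aperture
 * size of each GPU through the NVML API, to see whether the device offers
 * more than the usual 256MB window.
 * Build with e.g.: nvcc bar1.cu -lnvidia-ml
 */
#include <stdio.h>
#include <nvml.h>

int main(void)
{
    unsigned int    i, ngpus = 0;
    nvmlReturn_t    rc;

    rc = nvmlInit_v2();
    if (rc != NVML_SUCCESS)
    {
        fprintf(stderr, "failed on nvmlInit_v2: %s\n", nvmlErrorString(rc));
        return 1;
    }
    nvmlDeviceGetCount_v2(&ngpus);
    for (i = 0; i < ngpus; i++)
    {
        nvmlDevice_t     dev;
        nvmlBAR1Memory_t bar1;
        char             name[NVML_DEVICE_NAME_BUFFER_SIZE];

        nvmlDeviceGetHandleByIndex_v2(i, &dev);
        nvmlDeviceGetName(dev, name, sizeof(name));
        if (nvmlDeviceGetBAR1MemoryInfo(dev, &bar1) == NVML_SUCCESS)
            printf("GPU%u (%s): Bar1 total = %llu MB\n",
                   i, name, bar1.bar1Total >> 20);
    }
    nvmlShutdown();
    return 0;
}
```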

Why are Maxwell/Kepler GPUs not supported?

The Pascal generation newly supports demand paging, which allocates physical page frames at run time, much like a modern operating system does. It lets GPU kernels consume the least amount of device memory, which provides a significant improvement for database workloads. Even though we estimate the number of result rows for SCAN, JOIN and GROUP BY during query optimization, we basically cannot know the exact result size until the query is actually executed. On the other hand, we have to allocate device memory for the result buffer prior to launching the GPU kernel when we run these workloads on the GPU. We could allocate device memory based on the estimation plus some margin, but that is not perfect: the result size may often be larger than the estimation with its margin, and a large margin increases the dead space that other concurrent jobs could otherwise use. Once we utilize the demand paging feature of Pascal or later, the implementation becomes much simpler. Even if we reserve a very large memory address space, it does not consume physical device memory immediately; device memory is assigned as consumption of the result buffer grows (see the sketch below). So, PG-Strom is now designed to rely entirely on the demand paging feature. It is not an easy job to support the older architectures, and we have no plan to support them again.
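
For illustration, the following sketch (hypothetical code, not PG-Strom source) shows the behaviour the paragraph above relies on: on Pascal or later, a large cudaMallocManaged() reservation succeeds without immediately consuming physical device memory, and pages are populated only when the kernel actually touches them. The 8GB reservation and the kernel below are arbitrary examples.

```cuda
/*
 * Hypothetical sketch (not PG-Strom source): demand paging with CUDA managed
 * memory on Pascal or later.  Reserving a large buffer does not immediately
 * consume physical device memory; pages are populated on first touch.
 * Build with e.g.: nvcc demand_paging.cu
 */
#include <stdio.h>
#include <cuda_runtime.h>

__global__ void fill_some(int *buf, size_t nitems)
{
    size_t  idx = blockIdx.x * (size_t)blockDim.x + threadIdx.x;

    /* only the pages written here get physical device pages assigned */
    if (idx < nitems)
        buf[idx] = (int)idx;
}

int main(void)
{
    size_t  reserved   = 8UL << 30;     /* reserve 8GB of address space...   */
    size_t  used_items = 1UL << 20;     /* ...but touch only ~4MB of it      */
    int    *buf = NULL;

    /* succeeds even if 'reserved' exceeds what the query actually produces */
    if (cudaMallocManaged(&buf, reserved) != cudaSuccess)
    {
        fprintf(stderr, "cudaMallocManaged failed\n");
        return 1;
    }
    fill_some<<<(unsigned int)((used_items + 255) / 256), 256>>>(buf, used_items);
    cudaDeviceSynchronize();

    printf("last item touched: %d\n", buf[used_items - 1]);
    cudaFree(buf);
    return 0;
}
```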