Vectorized performance on the petabyte scale

Filed under: Kinetica

Last updated on: July 21, 2021

Length: 4 minute read, 642 words

While vectorization on its own can deliver blazing-fast computational performance, Kinetica does not stop there. The next piece in its high-performance puzzle is its memory-first nature.

Memory first

Being memory-first means that Kinetica prioritizes the use of system memory (RAM), as well as the even faster VRAM, the chip memory co-located with the GPU.

Hard disks are cheap and can store a lot more data, but they are significantly slower. Reading 64 bits from a hard disk drive takes about 10 million nanoseconds (0.01 seconds), while the same read from memory takes only about 100 nanoseconds. That is 100,000 times faster.
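The back-of-the-envelope arithmetic behind that ratio can be checked directly (the latency figures are the illustrative ones quoted above, not measurements of any particular drive):

```python
# Rough latency comparison for a random 64-bit read,
# using the illustrative figures from the text.
hdd_read_ns = 10_000_000   # ~10 ms from a spinning hard disk
ram_read_ns = 100          # ~100 ns from system memory

speedup = hdd_read_ns / ram_read_ns
print(f"RAM is roughly {speedup:,.0f}x faster")  # roughly 100,000x
```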

This is because hard disks are generally block or page addressable, meaning that we can only access data from hard disks in blocks of 4 kilobytes each. And any time we need to read data from the disk, this entire 4 KB page needs to be read by a mechanical moving head on a spinning disk and loaded into memory.

This is much slower than system memory where every byte has its own binary address and can be directly accessed without a mechanical head. So every read or write takes about the same amount of time to complete, making it much faster.

By adopting a memory-first approach, Kinetica escapes the performance bottleneck of loading data from the disk for analysis.

But the flip side to all of this is that RAM is expensive. So how does Kinetica manage the cost of having to store everything in memory?

There are two things that balance this. The first is Kinetica’s native columnar data storage technology.

Columnar data format

Column oriented designs store all of the values from a column in a table together. This contrasts with row oriented designs where values from each row are stored together.
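A toy sketch makes the distinction concrete (the table and its column names are invented for illustration):

```python
# The same three-row table in both layouts.
rows = [
    ("alice", 34, "NYC"),
    ("bob",   29, "SF"),
    ("carol", 41, "LA"),
]

# Row-oriented: the values of each row are stored together.
row_store = list(rows)

# Column-oriented: all values from one column are stored together.
column_store = {
    "name": [r[0] for r in rows],
    "age":  [r[1] for r in rows],
    "city": [r[2] for r in rows],
}

# An aggregate over one column touches only that column's contiguous values:
print(sum(column_store["age"]))  # 104
```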

The columnar format, unlike row-oriented formats, makes it easy to apply data compression techniques like dictionary encoding. Dictionary encoding replaces storage-heavy data types, such as long character strings, with lightweight data types like integers, reducing the amount of space required to store them.

This can save a lot of storage space. For instance, a 256-character string column with 10,000 unique values and 8 million records would take about 2 gigabytes of space without any compression. With compression, Kinetica can store this same column in memory in about 20 megabytes, roughly 99% less space.
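A minimal sketch of dictionary encoding, followed by the space arithmetic from the example above. The encoder here is a generic illustration, not Kinetica's implementation; the estimate assumes 2-byte integer codes, which comfortably cover 10,000 unique values:

```python
def dictionary_encode(values):
    """Replace each string with a small integer code into a shared dictionary."""
    dictionary = {}
    codes = []
    for v in values:
        # setdefault assigns the next unused code to a first-seen value.
        codes.append(dictionary.setdefault(v, len(dictionary)))
    return dictionary, codes

dictionary, codes = dictionary_encode(["red", "green", "red", "blue", "red"])
print(codes)  # [0, 1, 0, 2, 0]

# Space estimate using the article's figures:
records, uniques, str_bytes = 8_000_000, 10_000, 256
uncompressed = records * str_bytes                  # ~2.0 GB of raw strings
encoded = records * 2 + uniques * str_bytes         # 2-byte codes + dictionary
print(f"{uncompressed / 1e9:.1f} GB -> {encoded / 1e6:.1f} MB")
```

The encoded size works out to under 20 MB, which is where the "about 99% less space" figure comes from.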

However, even with the ability to compress columnar data, relying solely on memory to handle all operational data can quickly become prohibitive from a cost perspective, especially when data can run into petabytes.

Tiered storage

This is where Kinetica stands out in comparison with newer databases that are built to use GPUs for analytics. These solutions also lay claim to being vectorized, but because they are in-memory databases, they are constrained by system memory (<10 TB) and are unable to deliver vectorized performance at the massive scales required by modern enterprises.

Kinetica, on the other hand, delivers native vectorization across both the CPU and the GPU, at scale, using its tiered storage and memory-first architecture.

Tiered storage allows enterprises to manage petabyte-scale data by backing memory storage with several tiers of disk-based storage.

As a rule of thumb, as we go down the tiers, Kinetica trades performance for storage.

But it optimizes performance by providing fine-grained controls over how data is prioritized between the different tiers. This allows us to keep more frequently used data in the hot memory tiers and less-used data in the colder tiers below, thereby minimizing data movement across tiers while running analytics.
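The priority-driven placement described above can be modeled with a short sketch. This is a hypothetical illustration of the idea, not Kinetica's actual API; the tier names, capacities, and priorities are invented:

```python
# Hypothetical tier model: when a hot tier is over capacity,
# the lowest-priority table is demoted one tier down.
TIERS = ["VRAM", "RAM", "DISK", "COLD"]              # fastest to slowest
CAPACITY = {"VRAM": 2, "RAM": 3, "DISK": 10, "COLD": float("inf")}

placement = {t: [] for t in TIERS}                   # tier -> [(table, priority)]

def admit(table, priority):
    """Place a table in the hottest tier, demoting low-priority residents."""
    tier_idx = 0
    placement[TIERS[tier_idx]].append((table, priority))
    while (len(placement[TIERS[tier_idx]]) > CAPACITY[TIERS[tier_idx]]
           and tier_idx + 1 < len(TIERS)):
        # Demote the lowest-priority table to the next, colder tier.
        victim = min(placement[TIERS[tier_idx]], key=lambda tp: tp[1])
        placement[TIERS[tier_idx]].remove(victim)
        placement[TIERS[tier_idx + 1]].append(victim)
        tier_idx += 1

admit("orders", priority=9)      # hot, frequently queried
admit("clicks", priority=7)
admit("audit_log", priority=1)   # rarely read: demoted out of VRAM
print([t for t, _ in placement["VRAM"]])  # ['orders', 'clicks']
```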

With tiered storage, Kinetica can handle enterprise-grade workloads with petabytes of data, making it the only solution that delivers vectorized performance on both CPUs and GPUs at scale.