Row-Oriented Data Layout

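A row-oriented data layout stores data row by row: every row has the same fields (same order, same types, and so on), and each entire row is stored contiguously on disk.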

A row-oriented layout suits data whose records consist of several key fields that are frequently used together.

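Because a single data page holds data for different columns, a query that needs only some of the columns (for instance, just the User field in the example) still has to read every column and then discard the unwanted ones, which raises the cost.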

What is row-oriented data layout?

Row-oriented database management systems store data in records or rows. Their layout is quite close to the tabular data representation, where every row has the same set of fields.^[1]
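To make the layout concrete, here is a minimal sketch of rows serialized back to back. The fixed-width record format (a 4-byte id, a 16-byte name, a 12-byte phone) and the names `RECORD`, `pack_rows`, and `read_row` are assumptions made for illustration; they do not come from the book, and real systems use more elaborate page and record formats.

```python
import struct

# Hypothetical fixed-width record: 4-byte id, 16-byte name, 12-byte phone.
# "<" disables alignment padding, so every record is exactly 32 bytes.
RECORD = struct.Struct("<I16s12s")

def pack_rows(rows):
    """Serialize rows back to back, so each row occupies one contiguous byte range."""
    return b"".join(
        RECORD.pack(uid, name.encode(), phone.encode()) for uid, name, phone in rows
    )

def read_row(buf, index):
    """Decode the row at position `index` from a single contiguous slice."""
    uid, name, phone = RECORD.unpack_from(buf, index * RECORD.size)
    return uid, name.rstrip(b"\0").decode(), phone.rstrip(b"\0").decode()

users = [(1, "alice", "555-0100"), (2, "bob", "555-0101")]
data = pack_rows(users)
```

With this layout, `read_row(data, 1)` finds all fields of the second user in one 32-byte slice starting at offset 32.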

Suitable scenarios for row-oriented databases

This approach works well for cases where several fields constitute the record uniquely identified by the key. All fields representing a single user record are often read together. When creating records (for example, when the user fills out a registration form), we write them together as well. At the same time, each field can be modified individually.

Since row-oriented stores are most useful in scenarios when we have to access data by row, storing entire rows together improves spatial locality.^[1]
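The access pattern described here (read a whole record together, write it together, yet modify one field on its own) can be sketched roughly as follows. An in-memory `io.BytesIO` stands in for a file, and the record format, `PHONE_OFFSET`, `fetch_row`, and `update_phone` are hypothetical names chosen for this example only.

```python
import io
import struct

RECORD = struct.Struct("<I16s12s")   # hypothetical layout: id, name, phone
PHONE_OFFSET = 4 + 16                # phone starts right after id and name
PHONE_SIZE = 12

def fetch_row(f, index):
    """One seek and one read return every field of the row."""
    f.seek(index * RECORD.size)
    return RECORD.unpack(f.read(RECORD.size))

def update_phone(f, index, phone):
    """Overwrite only the phone field; the rest of the row is left untouched."""
    f.seek(index * RECORD.size + PHONE_OFFSET)
    # Truncate and pad to the field width so neighbouring rows are never clobbered.
    f.write(phone.encode()[:PHONE_SIZE].ljust(PHONE_SIZE, b"\0"))

store = io.BytesIO()                 # stands in for a file on disk
for uid, name, phone in [(1, "alice", "555-0100"), (2, "bob", "555-0101")]:
    store.write(RECORD.pack(uid, name.encode(), phone.encode()))

update_phone(store, 1, "555-0199")
row = fetch_row(store, 1)
```

Because the row is contiguous, `fetch_row` needs a single read for all fields, while `update_phone` touches only the 12 bytes of the one field it changes.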

Problems with row-oriented databases

Because data on a persistent medium such as a disk is typically accessed block-wise (in other words, a minimal unit of disk access is a block), a single block will contain data for all columns. This is great for cases when we’d like to access an entire user record, but makes queries accessing individual fields of multiple user records (for example, queries fetching only the phone numbers) more expensive, since data for the other fields will be paged in as well.^[1]
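The cost of a single-field query can be estimated with a small simulation. This is only a sketch under stated assumptions: a 4096-byte block, a hypothetical 32-byte record (id, name, phone), and a helper named `scan_phones` invented for this example. It counts how many bytes are paged in versus how many the query actually needs.

```python
import struct

RECORD = struct.Struct("<I16s12s")   # hypothetical layout: id, name, phone (32 bytes)
BLOCK_SIZE = 4096                    # assumed block size; 128 records fit exactly
PHONE_SIZE = 12

def scan_phones(buf):
    """Collect every phone number, reading the buffer one block at a time.

    Returns the phones, the bytes paged in, and the bytes the query needed.
    """
    phones, bytes_read = [], 0
    for start in range(0, len(buf), BLOCK_SIZE):
        block = buf[start:start + BLOCK_SIZE]    # the whole block is paged in
        bytes_read += len(block)
        for _, _, phone in RECORD.iter_unpack(block):
            phones.append(phone.rstrip(b"\0").decode())
    return phones, bytes_read, len(phones) * PHONE_SIZE

rows = b"".join(
    RECORD.pack(i, f"user{i}".encode(), f"555-{i:04d}".encode()) for i in range(1024)
)
phones, bytes_read, bytes_needed = scan_phones(rows)
```

Under these assumptions the scan pages in all 32 bytes of every record to use only the 12-byte phone field, so `bytes_needed / bytes_read` is 12/32; the remaining bytes are the "other fields" that get paged in as well.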

[1] Petrov, Alex. Database Internals. 1st ed., 2019.