Row-Oriented Data Layout
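A row-oriented data layout stores data as rows: every row has the same fields (in the same order and with the same types), and each row is stored contiguously on disk.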
A row-oriented structure suits data that is made up of several key elements that are usually used together.
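Because a single data page holds data from different columns, accessing only some of the columns (for instance, the User field in the example) means reading every column and then discarding the unneeded ones, which raises the cost.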
Row-oriented database management systems store data in records or rows. Their layout is quite close to the tabular data representation, where every row has the same set of fields.[1]
This approach works well for cases where several fields constitute the record uniquely identified by the key. All fields representing a single user record are often read together. When creating records (for example, when the user fills out a registration form), we write them together as well. At the same time, each field can be modified individually.
Since row-oriented stores are most useful in scenarios when we have to access data by row, storing entire rows together improves spatial locality.[1]
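To make "storing entire rows together" concrete, here is a minimal sketch of a row-oriented buffer. The record shape (a 4-byte id, a 16-byte name, and a 12-byte phone number), the `RECORD` format, and the helper names `pack_rows` and `read_row` are all assumptions made for illustration; they do not come from the source, and real systems typically use variable-length records and page headers.

```python
import struct

# Hypothetical fixed-width user record: id (uint32), name (16 bytes), phone (12 bytes).
# "<" disables alignment padding, so every record is exactly 32 bytes.
RECORD = struct.Struct("<I16s12s")

def pack_rows(rows):
    """Serialize rows back to back, so each row occupies one contiguous byte range."""
    return b"".join(
        RECORD.pack(uid, name.encode(), phone.encode())
        for uid, name, phone in rows
    )

def read_row(buf, index):
    """Fetch a whole row with one offset computation and one contiguous read."""
    uid, name, phone = RECORD.unpack_from(buf, index * RECORD.size)
    # Short strings are padded with NUL bytes by struct; strip them on the way out.
    return uid, name.rstrip(b"\0").decode(), phone.rstrip(b"\0").decode()

rows = [(1, "alice", "555-0100"), (2, "bob", "555-0101"), (3, "carol", "555-0102")]
buf = pack_rows(rows)
second = read_row(buf, 1)  # all fields of the second user come from bytes 32..63
```

Because all fields of a row sit next to each other, reading one user touches a single small byte range, which is the spatial locality the paragraph above describes.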
Because data on a persistent medium such as a disk is typically accessed block-wise (in other words, a minimal unit of disk access is a block), a single block will contain data for all columns. This is great for cases when we’d like to access an entire user record, but makes queries accessing individual fields of multiple user records (for example, queries fetching only the phone numbers) more expensive, since data for the other fields will be paged in as well.[1]
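The cost of a single-field query can be estimated with some back-of-the-envelope arithmetic. The sketch below assumes a 4096-byte block, fixed 128-byte rows, and a 12-byte phone field; these sizes and the function names are illustrative assumptions, not figures from the source.

```python
import math

BLOCK_SIZE = 4096  # assumed block size in bytes
ROW_SIZE = 128     # assumed fixed row size in bytes
PHONE_SIZE = 12    # assumed size of the phone field in bytes

def blocks_read_row_layout(num_rows):
    """Blocks paged in when a query needs any field of every row.

    In a row layout each block holds whole rows, so fetching only the
    phone numbers still touches every block of the table.
    """
    rows_per_block = BLOCK_SIZE // ROW_SIZE
    return math.ceil(num_rows / rows_per_block)

def useful_fraction():
    """Share of the bytes read that the phone-number query actually needs."""
    return PHONE_SIZE / ROW_SIZE

num_rows = 1_000_000
blocks = blocks_read_row_layout(num_rows)  # 31250 blocks
bytes_read = blocks * BLOCK_SIZE           # 128,000,000 bytes
bytes_needed = num_rows * PHONE_SIZE       # 12,000,000 bytes
```

Under these assumptions, fetching only the phone numbers reads about 128 MB to use 12 MB, so roughly 9% of the paged-in data is useful. A query that wants entire user records, by contrast, uses everything it reads.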