Buffer Manager
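In the transactional storage manager of a relational database management system, the buffer manager owns an in-memory array of equally sized memory blocks called buffer pool frames. The buffer manager is responsible for moving data between pages on disk and buffer pool frames in memory, where each frame is the same size as a data page.
Why should a DBMS manage page swapping by hand? It could, for example, use the `mmap` system call and let the operating system take care of it.
Leaving this to the operating system causes a number of problems:
- Transaction safety: the OS does not know which pages are safe to flush, so it may write a dirty page back to disk before the DBMS is ready for it to be persisted.
- I/O stalls: a page fault (a hardware exception, not a system call) blocks the faulting thread until the page has been loaded. One mitigation is to use multiple threads so that a fault stalls only one of them (`epoll` is an I/O event-notification interface rather than a threading mechanism, and it cannot observe page faults), but this becomes complicated once several writers are active at the same time.
- Error handling: a memory access that fails because of a hardware error or an alignment problem raises a `SIGBUS` signal, so the DBMS has to register a signal handler to deal with it, which can stall other work.
- Performance: the OS maintains many data structures of its own and also has to handle TLB shootdowns.
As a result, a DBMS usually controls all of this itself, because only the DBMS knows which data is being accessed and which queries are being processed. This means re-implementing part of what the operating system already provides, but it keeps performance and operational control in the DBMS's hands.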
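To make the contrast concrete, here is a minimal sketch of the two access paths, using Python's standard `mmap` module. The page size, the demo file layout, and the helper names (`make_demo_file`, `read_page_mmap`, `read_page_explicit`) are assumptions made for illustration and do not come from any particular DBMS.

```python
import mmap
import os
import tempfile

PAGE_SIZE = 4096  # assumed page size for this sketch


def make_demo_file(num_pages: int) -> str:
    """Create a temporary file whose page i is filled with the byte value i."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_pages):
            f.write(bytes([i]) * PAGE_SIZE)
    return path


def read_page_mmap(path: str, page_id: int) -> bytes:
    """OS-managed path: the kernel decides when the page is loaded or dropped."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            start = page_id * PAGE_SIZE
            # Touching this range may trigger a page fault and stall the thread.
            return mapped[start:start + PAGE_SIZE]


def read_page_explicit(path: str, page_id: int) -> bytes:
    """DBMS-managed path: the caller issues the read and knows when I/O happens."""
    with open(path, "rb") as f:
        f.seek(page_id * PAGE_SIZE)
        return f.read(PAGE_SIZE)
```

Both functions return the same bytes. The difference is who decides when the I/O happens and when the memory is reclaimed, which is exactly the control the DBMS wants to keep.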
In order to provide efficient access to database pages, every DBMS implements a large shared buffer pool in its own memory space.[1]
The database is all on disk, and the data in database files is organized into pages, with the first page being the directory page. To operate on the data, the DBMS needs to bring the data into memory. It does this by having a buffer pool that manages the data movement back and forth between disk and memory.[2]
The buffer pool is organized as an array of frames, where each frame is a region of memory the size of a database disk block.[1]
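One way to picture the frame array is a single contiguous allocation carved into fixed-size slots. The sketch below assumes a 4096-byte page and a pool of 8 frames. Both numbers and the `frame` helper are illustrative choices, not values from the cited sources.

```python
PAGE_SIZE = 4096  # assumed size of one database disk block
NUM_FRAMES = 8    # assumed pool capacity

# One contiguous allocation, logically divided into NUM_FRAMES frames.
pool = bytearray(PAGE_SIZE * NUM_FRAMES)


def frame(frame_id: int) -> memoryview:
    """Return a writable, zero-copy view over the given frame."""
    if not 0 <= frame_id < NUM_FRAMES:
        raise IndexError(f"frame {frame_id} out of range")
    start = frame_id * PAGE_SIZE
    return memoryview(pool)[start:start + PAGE_SIZE]
```

Because every frame has the same size as a page, a page read from disk can be copied into any free frame without further bookkeeping about lengths or offsets.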
Associated with the array of buffer pool frames is a hash table that maps (1) page numbers currently held in memory to their location in the frame table, (2) the location for that page on backing disk storage, and (3) some metadata about the page.[1]
The metadata includes a dirty bit to indicate whether the page has changed since it was read from disk, and any information needed by the page replacement policy to choose pages to evict when the buffer pool is full.[1]
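The page table and its metadata can be sketched as follows. This is a toy model under stated assumptions: the "disk" is a plain dictionary keyed by page number (so the on-disk location is implicit in the key), the replacement policy is least-recently-used, and the names `TinyBufferPool`, `FrameMeta`, `fetch`, and `mark_dirty` are hypothetical.

```python
from collections import OrderedDict
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size; every page on the toy disk has this length


@dataclass
class FrameMeta:
    frame_id: int        # where the page lives in the frame array
    dirty: bool = False  # has the page changed since it was read from disk?


class TinyBufferPool:
    def __init__(self, num_frames: int, disk: dict):
        self.disk = disk  # stand-in for disk storage: page_id -> bytes
        self.frames = [bytearray(PAGE_SIZE) for _ in range(num_frames)]
        self.free = list(range(num_frames))
        # page_id -> FrameMeta; the ordering doubles as LRU information.
        self.page_table = OrderedDict()

    def _evict(self) -> int:
        # Pick the least recently used page as the victim.
        victim_id, meta = self.page_table.popitem(last=False)
        if meta.dirty:
            # Only dirty pages need to be written back before reuse.
            self.disk[victim_id] = bytes(self.frames[meta.frame_id])
        return meta.frame_id

    def fetch(self, page_id: int) -> bytearray:
        meta = self.page_table.get(page_id)
        if meta is None:  # miss: find a frame and load the page from disk
            frame_id = self.free.pop() if self.free else self._evict()
            self.frames[frame_id][:] = self.disk[page_id]
            meta = self.page_table[page_id] = FrameMeta(frame_id)
        self.page_table.move_to_end(page_id)  # mark as most recently used
        return self.frames[meta.frame_id]

    def mark_dirty(self, page_id: int) -> None:
        self.page_table[page_id].dirty = True
```

With a two-frame pool, fetching a third page evicts the least recently used one, and the write-back happens only if that page was marked dirty. A real system would also have to respect pin counts and latch the table, both of which this sketch omits.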
Most systems also include a pin count to signal that the page is not eligible for participation in the page-replacement algorithm. When the pin count is non-zero, the page is “pinned” in memory and will not be forced to disk or stolen. This allows the DBMS’s worker threads to pin pages in the buffer pool by incrementing the pin count before manipulating the page, and then decrementing it thereafter.
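The pin/unpin protocol can be sketched like this. The `PinTable` class, its method names, and the use of a single lock are assumptions for illustration. Real systems typically use finer-grained latches and keep the pin count next to the other frame metadata.

```python
import threading
from contextlib import contextmanager


class PinTable:
    def __init__(self):
        self._lock = threading.Lock()  # protects pin counts across worker threads
        self._pins = {}                # page_id -> pin count

    def pin(self, page_id: int) -> None:
        with self._lock:
            self._pins[page_id] = self._pins.get(page_id, 0) + 1

    def unpin(self, page_id: int) -> None:
        with self._lock:
            if self._pins.get(page_id, 0) == 0:
                raise RuntimeError(f"page {page_id} is not pinned")
            self._pins[page_id] -= 1

    def evictable(self) -> list:
        """Pages the replacement policy may consider: pin count is zero."""
        with self._lock:
            return [pid for pid, count in self._pins.items() if count == 0]

    @contextmanager
    def pinned(self, page_id: int):
        """Pin for the duration of a block, and always unpin afterwards."""
        self.pin(page_id)
        try:
            yield
        finally:
            self.unpin(page_id)
```

A worker would wrap its page access in `with table.pinned(page_id): ...`, so the decrement still happens if the code manipulating the page raises an exception.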
[1] Hellerstein, Joseph M., Michael Stonebraker, and James Hamilton. “Architecture of a Database System.” Foundations and Trends® in Databases 1, no. 2 (2007): 141–259. https://doi.org/10.1561/1900000002.
[2] https://15445.courses.cs.cmu.edu/fall2023/notes/03-storage1.pdf