Search CTRL + K

Data Warehouse

数据仓库 是以较为结构化的 架构 存储了不同系统数据的一个中心化数据系统,支撑 数据分析、数据挖掘、AI 甚至机器学习。数据仓库往往通过 ETL 将不同数据源的数据转化为一个统一的 架构 存储。

数据仓库 根据拉取数据的时机(When)和方式(How)可以分为:

数仓 不是一个单一的数据库,往往由负责 ETL 的系统、OLAP 数据库、metadata 数据库和前端查询组成。

20230803142211.8d9kaq


IBM

A data warehouse, or enterprise data warehouse (EDW), is a system that aggregates data from different sources into a single, central, consistent data store to support data analysis, data mining, artificial intelligence (AI), and machine learning. A data warehouse system enables an organization to run powerful analytics on huge volumes (petabytes and petabytes) of historical data in ways that a standard database cannot.[1]

Database System Concepts

Data warehouses gather data from multiple sources at a single site, under a unified schema (which is usually designed to support efficient analysis, even at the cost of redundant storage).[2]

An Overview of Data Warehousing and OLAP Technology

Data warehousing is a collection of decision support technologies, aimed at enabling the knowledge worker (executive, manager, analyst) to make better and faster decisions.[3]

Building the Data Warehouse

A data warehouse is a “subject-oriented, integrated, timevarying, non-volatile collection of data that is used primarily in organizational decision making.”[4]


  1. https://www.ibm.com/topics/data-warehouse ↩︎

  2. Abraham Silberschatz, Henry F. Korth, and S. Sudarshan, Database System Concepts, Seventh edition (New York, NY: McGraw-Hill, 2020). P39. ↩︎

  3. Chaudhuri, Surajit, and Umeshwar Dayal. “An Overview of Data Warehousing and OLAP Technology.” ACM SIGMOD Record 26, no. 1 (March 1, 1997): 65–74. https://doi.org/10.1145/248603.248616. ↩︎

  4. Inmon, W.H., Building the Data Warehouse. John Wiley, 1992. ↩︎