Data Lake
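A data lake is a highly scalable, centralized repository that can store raw structured, semi-structured, and unstructured data at large scale without preprocessing it first. The trade-off is that data format conversion has to be handled at query time.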
Database System Concepts
The term data lake is used to refer to a repository where data can be stored in multiple formats, including structured records and unstructured file formats. Unlike data warehouses, data lakes do not require up-front effort to preprocess data, but they do require more effort when creating queries.[2]
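To make the "more effort when creating queries" point concrete, here is a minimal sketch of reading raw data in mixed formats at query time. Everything in it is an assumption for illustration: the object paths, the record fields (`user`, `amount`), and the helper names `read_records` and `total_by_user` are hypothetical, and an in-memory dict stands in for real lake storage.

```python
import csv
import io
import json

# Hypothetical raw objects as they might sit in a lake, stored as-is.
# Note the formats differ and `amount` is sometimes a string, sometimes a number.
RAW_OBJECTS = {
    "events/2024-01-01.jsonl": (
        '{"user": "ann", "amount": "12.50"}\n'
        '{"user": "bob", "amount": 7}\n'
    ),
    "events/2024-01-02.csv": "user,amount\nann,3.25\ncarol,10\n",
}


def read_records(path, text):
    """Parse one raw object into dicts, picking a parser from the file extension."""
    if path.endswith(".jsonl"):
        for line in text.splitlines():
            if line.strip():
                yield json.loads(line)
    elif path.endswith(".csv"):
        yield from csv.DictReader(io.StringIO(text))
    else:
        raise ValueError(f"unsupported format: {path}")


def total_by_user(objects):
    """Query: sum `amount` per user, converting formats and types while reading."""
    totals = {}
    for path, text in objects.items():
        for rec in read_records(path, text):
            user = rec["user"]
            totals[user] = totals.get(user, 0.0) + float(rec["amount"])
    return totals


totals = total_by_user(RAW_OBJECTS)
print(totals)
```

Nothing was cleaned or converted when the data was written; the query itself has to know each format and coerce `amount` to a number. A data warehouse would typically pay that cost once, at load time.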
[1] https://aws.amazon.com/cn/big-data/datalakes-and-analytics/what-is-a-data-lake/
[2] Abraham Silberschatz, Henry F. Korth, and S. Sudarshan, Database System Concepts, Seventh Edition (New York, NY: McGraw-Hill, 2020), p. 527.