What is a Data Warehouse?
A data warehouse serves as a centralized repository for information collected from various data sources.
Application Database vs Data Warehouse
Application Database (e.g., MS SQL, BigTable):
Application databases are typically optimized for online transaction processing (OLTP). They prioritize efficient handling of frequent, continuous read and write operations. Therefore, they often employ caching techniques and B-tree indexing for optimization.
Data Warehouse (e.g., BigQuery):
Data warehouses, on the other hand, are specifically optimized for online analytical processing (OLAP). They enhance the efficiency of complex analytical queries, such as table scans, groupings, and aggregations. Consequently, data warehouses excel in these operations compared to application databases.
Why Do We Need a Data Warehouse?
- Separates analytics processing from transactional databases, enhancing the performance of both systems.
- Supports time-variant data, facilitating the storage of historical data.
- Facilitates efficient querying and analysis processes.
Why is Querying BigTable from BigQuery Slow?
Querying external sources like BigTable in BigQuery doesn't involve creating a table within one's BigQuery datasets. Since the table isn't permanently stored within a dataset, it can lead to slower query performance.
Extract Transform Load (ETL)
ETL involves extracting data from multiple sources and loading it into data warehousing systems for analysis.
Bigtable
Cloud Bigtable isn't a relational database, so it doesn't support SQL queries or joins, nor does it support multi-row transactions. Therefore, it's not suitable for datasets smaller than 1TB. Alternatives based on specific needs include:
- Google Cloud SQL for comprehensive SQL support in OLTP systems.
- Google BigQuery for interactive queries in OLAP systems.
- Google Cloud Storage for storing immutable blobs larger than 10 MB, such as large images or movies.
- Cloud Datastore for highly structured objects or when ACID transactions and SQL-like query support are required.
留言
張貼留言