Data Warehousing with BigQuery

Course 1473

  • Duration: 3 days
  • Language: English
  • Level: Intermediate

In this course, you learn about the internals of BigQuery and best practices for designing, optimizing, and administering your data warehouse. Through a combination of lectures, demos, and labs, you learn about BigQuery architecture and how to design optimal storage and schemas for data ingestion and changes. Next, you learn techniques to improve read performance, optimize queries, manage workloads, and use logging and monitoring tools. You also learn about the different pricing models. Finally, you learn various methods to secure data, automate workloads, and build machine learning models with BigQuery ML.

Data Warehousing with BigQuery

  • In-Person

  • Online

  • Upskill your whole team by bringing Private Team Training to your facility.

Data Warehousing with BigQuery

In this course, you will:

  • Describe BigQuery architecture fundamentals.
  • Implement storage and schema design patterns to improve performance.
  • Use DML and schedule data transfers to ingest data.
  • Apply best practices to improve read efficiency and optimize query performance.
  • Manage capacity and automate workloads.
  • Understand patterns versus anti-patterns to optimize queries and improve read performance.
  • Use logging and monitoring tools to understand and optimize usage patterns.
  • Apply security best practices to govern data and resources.
  • Build and deploy several categories of machine learning models with BigQuery ML.

Prerequisites

  • Fundamentals of Big Data and Machine Learning

Data Warehousing with BigQuery Training Outling

Module 1) BigQuery Architecture Fundamentals

  • Explain the benefits of columnar storage.
  • Understand how BigQuery processes data.
  • Explore the basics of BigQuery’s shuffling service to improve query efficiency.

 

Module 2) Storage and Schema Optimization

  • Compare the performance of different schemas (snowflake, denormalized, and nested and repeated fields).
  • Partition and cluster data for better performance improve schema design using nested and repeated fields.
  • Describe additional best practices such as table and partition expiration.

 

Module 3) Ingesting Data

  • Ingest batch and streaming data.
  • Query external data sources.
  • Schedule data transfers.
  • Understand how to use Storage Write API.

 

Module 4) Changing Data

  • Write DML statements.
  • Address common DML performance problems and bottlenecks.
  • Identify slowly changing dimensions (SCD) in your data and make updates.

 

Module 5) Improving Read Performance

  • Explore BigQuery’s cache.
  • Create materialized views.
  • Work with BI Engine to accelerate your SQL queries.
  • Use the Storage Read API for fast access to BigQuery-managed storage.
  • Explain the caveats of using external data sources.

 

Module 6) Optimizing and Troubleshooting Queries

  • Interpret BigQuery execution details and the query plan.
  • Optimize query performance by using suggested methods for SQL statements and clauses.
  • Demonstrate best practices for functions in business use cases.

 

Module 7) Workload Management and Pricing

  • Define a BigQuery slot.
  • Explain pricing models and pricing estimations (BigQuery UI, bq dry_run, jobs API).
  • Understand slot reservations, commitments, and assignments.
  • Identify best practices to control costs.

 

Module 8) Logging and Monitoring

  • Use Cloud Monitoring to view BigQuery metrics.
  • Explore the BigQuery admin panel.
  • Use Cloud Audit logs.
  • Work with INFORMATION_SCHEMA tables to get insights for your BigQuery entities.

 

Module 9) Security in BigQuery

  • Explore data discovery using Data Catalog.
  • Discuss data governance using DLP API and Data Catalog.
  • Create IAM policies (e.g., authorized views) to secure resources.
  • Secure data with classifications (e.g., row-level policies).
  • Understand how BigQuery uses encryption.

 

Module 10) Automating Workloads

  • Schedule queries.
  • Use scripting and stored procedures to build custom transformations.
  • Describe how to integrate BigQuery workloads with other Google Cloud big data products.

 

Module 11) Mastering BigQuery ML Applications

  • Describe some of the different applications of BigQuery ML.
  • Build and deploy several categories of machine learning models with BigQuery ML.
  • Use AutoML Tables to solve high-value business problems.

Need Help Finding The Right Training Solution?

Our training advisors are here for you.

Data Warehousing with BigQuery

  • 7 modules
  • 45 videos
  • 4 labs

By leveraging BigQuery's capabilities, you can help your organization efficiently store and analyze large datasets, make data-driven decisions, optimize performance and costs, and ensure data security and compliance.

While this course itself does not provide certification, it prepares participants for further advanced courses and certifications like the Google Cloud Professional Data Engineer certification.

Chat With Us