PDE domain - 18% of the exam

Maintaining and Automating Data Workloads

Maintaining and Automating Data Workloads is 18% of the Google Cloud Professional Data Engineer (PDE) exam. These are the objectives it covers, each with practice questions and worked explanations.

Objectives in this domain

Sample question from this domain

Free sample | Maintaining and Automating Data Workloads | Medium

A retail analytics group has migrated from BigQuery on-demand pricing to BigQuery Editions and now runs all interactive workloads against a single Enterprise edition reservation with autoscaling enabled. Baseline slots are set to zero, and the group observes that small ad hoc queries from analysts often wait several seconds before any slots are allocated. Which statement best describes how the baseline and maximum slot settings on a reservation affect this behaviour?

  • A. Baseline slots and autoscaler slots are both provisioned on demand, so any query against an Enterprise edition reservation incurs the same scale-up delay regardless of the baseline value.
  • B. Baseline slots are always available without scale-up latency, while autoscaler slots above the baseline are provisioned on demand and can take a short time to spin up before they become billable. (Correct)
  • C. Baseline slots define the maximum the reservation can ever use, and the autoscaler simply rebalances those slots between queries when contention is detected by the scheduler.
  • D. Baseline slots are billed only when they are actively used by a query, while autoscaler slots are billed for the full reservation window once any query triggers scale-up activity.

Objective: Explain how baseline and autoscaler slot settings in a BigQuery Editions reservation affect query start latency and billing.

A BigQuery Editions reservation keeps the baseline number of slots permanently assigned to the reservation, so queries can use them with no scale-up delay. When demand exceeds the baseline, the autoscaler adds slots in increments up to the configured maximum. These autoscaler slots take a short time to provision and are billed per second only while they are active, which is why analysts see a small wait when the baseline is zero.

Why A is wrong: Tempting because reservations feel elastic end to end, but it is wrong because baseline capacity is held continuously and is available without scale-up; only the autoscaler portion is provisioned on demand.

Why B is correct: The baseline is the floor that is reserved continuously, so queries using only baseline capacity start immediately. Slots above the baseline are added by the autoscaler in increments and incur a brief provisioning delay before they become available and start being billed.

Why C is wrong: Tempting because the baseline does set a floor, but it does not cap the reservation. The maximum reservation size is a separate setting, and the autoscaler adds slots above the baseline rather than just rebalancing fixed capacity.

Why D is wrong: Tempting because it sounds like a usage-based model, but it inverts the billing. Baseline slots are billed continuously while reserved, and autoscaler slots are billed per second they are active, not for a full window.
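
The behaviour described in the explanation can be sketched as a small Python model. This is an illustrative simplification, not a BigQuery API: the function names `slots_for_demand` and `slot_seconds_billed` are hypothetical, the 50-slot scaling increment is an assumption, and the model ignores any minimum billing duration or scale-down delay that the real autoscaler may apply.

```python
import math


def slots_for_demand(demand, baseline, max_slots, increment=50):
    """Return (baseline_used, autoscaled) slots for a given slot demand.

    Hypothetical model: baseline slots are always on hand, and the
    autoscaler adds whole increments above the baseline up to max_slots.
    The increment of 50 is an assumption made for illustration.
    """
    if max_slots < baseline:
        raise ValueError("max_slots must be >= baseline")
    baseline_used = min(demand, baseline)
    overflow = max(0, demand - baseline)
    # Round the overflow up to whole increments, capped by the headroom
    # between the baseline and the reservation maximum.
    autoscaled = min(math.ceil(overflow / increment) * increment,
                     max_slots - baseline)
    return baseline_used, autoscaled


def slot_seconds_billed(baseline, autoscaled, active_seconds, window_seconds):
    """Baseline is billed for the whole window; autoscaled slots are
    billed per second only while they are active."""
    return baseline * window_seconds + autoscaled * active_seconds


# Baseline of zero: even a small query must wait for autoscaler slots.
print(slots_for_demand(demand=30, baseline=0, max_slots=200))
# Baseline of 100: the same query runs entirely on baseline capacity.
print(slots_for_demand(demand=30, baseline=100, max_slots=200))
```

In this model a query incurs scale-up latency exactly when the second element of the tuple is greater than zero, which matches the scenario in the question: with a baseline of zero, every query depends on autoscaler slots. Raising the baseline removes that wait at the cost of paying for the baseline continuously.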


Examworthy is not affiliated with or endorsed by Google Cloud. Original, blueprint-aligned practice material only.