Study Google Cloud ACE GKE Operations: key concepts, common traps, and exam decision cues.
This lesson is about reading and adjusting a live GKE environment. Google Cloud expects ACE candidates to understand how node pools, workload inventory, registry access, and autoscaling affect cluster operations.
Node pool: A group of worker nodes inside a GKE cluster that share a machine profile (machine type), scaling behavior, and configuration.
Artifact Registry: Google Cloud service for storing and controlling access to container images and other artifacts.
Autoscaling: Automatic adjustment of pods or nodes based on load and capacity conditions.
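The question often looks like “GKE is broken,” but the real answer is usually one layer below that vague wording. ACE wants you to separate the layers involved: where workloads run, how they scale, how images reach the cluster, and what is actually running.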
| If the question is mainly about… | Strongest first lane |
|---|---|
| one workload needs a different machine profile or scaling boundary | node pool |
| pods should scale up when application load rises | workload or horizontal pod autoscaling |
| the cluster has unschedulable pods because capacity is too small | node-level autoscaling or cluster capacity path |
| images cannot be pulled or deployments fail after image publication | Artifact Registry access and image reference path |
| figuring out what is actually running in the cluster | GKE workload and node inventory first |
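
For the inventory row, a first pass can be sketched as a short list of read-only commands. The sketch below only builds the command lines and does not run them. The cluster name `demo-cluster` and region `us-central1` are hypothetical placeholders, and it assumes `gcloud` and `kubectl` are installed and authenticated.

```python
from typing import List


def inventory_commands(cluster: str, region: str) -> List[List[str]]:
    """Return commands for a first look at what a GKE cluster is running.

    Nothing here changes cluster state; get-credentials only updates the
    local kubeconfig so that kubectl talks to the right cluster.
    """
    return [
        # Which clusters exist in the current project.
        ["gcloud", "container", "clusters", "list"],
        # Which node pools this cluster has.
        ["gcloud", "container", "node-pools", "list",
         f"--cluster={cluster}", f"--region={region}"],
        # Point kubectl at the cluster before asking Kubernetes questions.
        ["gcloud", "container", "clusters", "get-credentials",
         cluster, f"--region={region}"],
        # Nodes first, then workloads across every namespace.
        ["kubectl", "get", "nodes", "-o", "wide"],
        ["kubectl", "get", "deployments", "--all-namespaces"],
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "wide"],
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in inventory_commands("demo-cluster", "us-central1"):
        print(" ".join(cmd))
```

The ordering mirrors the exam cue: confirm the cluster and its node pools, then look at nodes and workloads, before concluding anything about what is broken.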
flowchart LR
A["Artifact Registry image"] --> B["Deployment update"]
B --> C["Pods start on node pool"]
C --> D["Application load rises"]
D --> E["Pods scale"]
E --> F["Nodes scale if capacity is short"]
Reading the flow from left to right is the easiest way to avoid mixing up the layers. Artifact Registry answers whether the cluster can pull the image. Node pools answer where workloads run. Autoscaling answers how the platform reacts once the workload is under pressure.
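
On the image side, one quick check is whether the reference in the Deployment has the Artifact Registry shape `LOCATION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE`. The sketch below is a simplified parser for that shape, not a full validator. The function name and the sample project, repository, and image names are assumptions made for illustration.

```python
import re
from typing import Dict, Optional

# Simplified pattern for Artifact Registry Docker image references.
# It does not cover every legal name (for example domain-scoped projects).
_AR_IMAGE = re.compile(
    r"^(?P<location>[a-z0-9-]+)-docker\.pkg\.dev/"
    r"(?P<project>[^/]+)/"
    r"(?P<repository>[^/]+)/"
    r"(?P<image>[^:@]+)"
    r"(?::(?P<tag>[^@]+))?"
    r"(?:@(?P<digest>sha256:[a-f0-9]{64}))?$"
)


def parse_image_ref(ref: str) -> Optional[Dict[str, Optional[str]]]:
    """Split an Artifact Registry image reference into its parts.

    Returns None when the reference does not look like an Artifact
    Registry path, which is a hint to check the image name first.
    """
    match = _AR_IMAGE.match(ref)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Hypothetical reference, for illustration only.
    print(parse_image_ref("us-central1-docker.pkg.dev/my-project/my-repo/web:v2"))
```

A reference that parses cleanly narrows the problem to access: the repository must exist in that location and project, and the identity the nodes run as must be allowed to read from it.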
| Question | Strongest first answer |
|---|---|
| different CPU or memory profile for a class of workloads | node pool |
| more pod replicas because request volume is increasing | workload autoscaling |
| pods remain pending because there is not enough node capacity | node autoscaling or cluster-capacity path |
Do not answer node pool when the real issue is replica count, and do not answer pod autoscaling when the pods cannot be scheduled because the cluster has no room left.
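
The difference between the two scaling layers can be made concrete with a small sketch. The first function uses the ratio documented for the Kubernetes Horizontal Pod Autoscaler, `ceil(currentReplicas * currentMetric / targetMetric)`, clamped to the configured bounds. It deliberately omits the tolerance band, stabilization windows, and pod-readiness handling that the real controller applies. The second function is a hypothetical helper that encodes the exam cue above.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int,
                     max_replicas: int) -> int:
    """Replica count from the HPA ratio, clamped to the min/max bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


def scaling_lane(pending_unschedulable: int, current_metric: float,
                 target_metric: float) -> str:
    """Pick the layer to look at first (illustrative helper)."""
    if pending_unschedulable > 0:
        # Pods exist but have nowhere to run: adding replicas cannot help.
        return "node autoscaling"
    if current_metric > target_metric:
        # Pods are running but overloaded: more replicas are needed.
        return "workload autoscaling"
    return "no scaling needed"


if __name__ == "__main__":
    # 4 replicas at 90% CPU against a 60% target, bounds 2..10.
    print(desired_replicas(4, 90, 60, 2, 10))
    print(scaling_lane(pending_unschedulable=3, current_metric=90, target_metric=60))
```

With 4 replicas at 90% average CPU against a 60% target, the ratio asks for 6 replicas. If those extra pods then sit pending for lack of room, the question has moved from the workload layer to the node layer.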
| Trap | Better reading |
|---|---|
| “Autoscaling is one thing.” | ACE expects you to distinguish workload-level scaling from node-level capacity changes. |
| “If a deployment fails, it must be a Kubernetes YAML problem.” | Image path or Artifact Registry access is often the first operational check. |
| “All workloads in a cluster should share the same worker profile.” | Node pools exist specifically to separate worker characteristics. |
| “Seeing the cluster exists means you understand what is running.” | Inventory questions are about workloads, nodes, and actual deployment state, not just cluster presence. |
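
The first and third traps are easier to remember when each layer is tied to its own command. The sketch below builds, but does not run, one command per layer: `kubectl autoscale` for the workload, `gcloud container clusters update --enable-autoscaling` for a node pool, and `gcloud container node-pools create` for a separate worker profile. All resource names, the machine type, and the numeric bounds are hypothetical.

```python
from typing import List


def workload_autoscale_cmd(deployment: str, min_pods: int, max_pods: int,
                           cpu_percent: int) -> List[str]:
    """Workload layer: scale pod replicas on CPU utilization."""
    return ["kubectl", "autoscale", "deployment", deployment,
            f"--min={min_pods}", f"--max={max_pods}",
            f"--cpu-percent={cpu_percent}"]


def node_autoscale_cmd(cluster: str, node_pool: str, region: str,
                       min_nodes: int, max_nodes: int) -> List[str]:
    """Node layer: let one node pool grow and shrink.

    For regional clusters these bounds apply per zone.
    """
    return ["gcloud", "container", "clusters", "update", cluster,
            "--enable-autoscaling", f"--node-pool={node_pool}",
            f"--min-nodes={min_nodes}", f"--max-nodes={max_nodes}",
            f"--region={region}"]


def node_pool_create_cmd(cluster: str, node_pool: str, region: str,
                         machine_type: str, num_nodes: int) -> List[str]:
    """Separate worker profile: a new node pool with its own machine type."""
    return ["gcloud", "container", "node-pools", "create", node_pool,
            f"--cluster={cluster}", f"--machine-type={machine_type}",
            f"--num-nodes={num_nodes}", f"--region={region}"]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(" ".join(workload_autoscale_cmd("web", 2, 10, 60)))
    print(" ".join(node_autoscale_cmd("demo-cluster", "default-pool",
                                      "us-central1", 1, 5)))
    print(" ".join(node_pool_create_cmd("demo-cluster", "highmem-pool",
                                        "us-central1", "e2-highmem-4", 1)))
```

Note that the workload command goes through `kubectl` while the two node-level commands go through `gcloud`, which is a useful memory aid for which layer a question is really about.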
A deployment uses a new container image, but the rollout fails because the cluster cannot fetch that image. Which lane is strongest first?
Correct answer: B. This is an image access path problem first, not a node-pool or database problem.
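
The reasoning behind that answer can be sketched as a small triage over pod status. The function below inspects a dictionary shaped like the output of `kubectl get pod NAME -o json` and reports which lane to check first. The waiting reasons `ErrImagePull` and `ImagePullBackOff` and the `Unschedulable` scheduling reason are standard Kubernetes values. The function name, the lane labels, and the sample pods are assumptions for illustration, and init containers are ignored for brevity.

```python
IMAGE_PULL_REASONS = {"ErrImagePull", "ImagePullBackOff"}


def triage_pod(pod: dict) -> str:
    """Return the lane to check first for a pod that is not running."""
    status = pod.get("status", {})

    # A container stuck waiting on an image pull points at the image
    # reference or at Artifact Registry access.
    for container in status.get("containerStatuses", []):
        waiting = container.get("state", {}).get("waiting") or {}
        if waiting.get("reason") in IMAGE_PULL_REASONS:
            return "image access path"

    # A pod that was never scheduled points at node capacity instead.
    for cond in status.get("conditions", []):
        if (cond.get("type") == "PodScheduled"
                and cond.get("status") == "False"
                and cond.get("reason") == "Unschedulable"):
            return "node capacity"

    return "other"


if __name__ == "__main__":
    # Hypothetical pod status, trimmed to the relevant fields.
    failing = {"status": {"phase": "Pending", "containerStatuses": [
        {"name": "web",
         "state": {"waiting": {"reason": "ImagePullBackOff"}}}]}}
    print(triage_pod(failing))
```

When the result is the image access path, typical follow-ups are confirming the image reference and checking that the identity the nodes run as can read the repository, for example through the `roles/artifactregistry.reader` role.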