PMLE - Serving and Scaling Models - Section 4.2

Scale online model serving by managing and serving features with the Feature Store, deploying to public and private endpoints, choosing CPU, GPU, TPU, and edge hardware, scaling the serving backend for throughput, and tuning models for production training and serving.

Serve and manage features at prediction time using the Agent Platform Feature Store, deploy models to public or private endpoints based on network and security requirements, and select CPU, GPU, TPU, or edge hardware to match latency and throughput targets. Configure serving backend scaling to handle variable traffic without over-provisioning.
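To make "handle variable traffic without over-provisioning" concrete, here is a minimal sketch of target-utilization autoscaling arithmetic. It is a generic illustration, not the platform's actual scaling algorithm. The function name `desired_replicas` and all of its parameters are hypothetical, and utilization is expressed as whole-number percentages to keep the arithmetic exact.

```python
def desired_replicas(current_replicas: int,
                     observed_utilization_pct: int,
                     target_utilization_pct: int,
                     min_replicas: int,
                     max_replicas: int) -> int:
    """Illustrative replica count for a target-utilization autoscaler.

    Hypothetical helper: scales the current replica count by the ratio of
    observed to target utilization, rounds up, then clamps to the
    configured floor and ceiling.
    """
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_utilization_pct < 1:
        raise ValueError("target_utilization_pct must be at least 1")
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")

    # Integer ceiling division avoids floating-point rounding surprises.
    load = current_replicas * observed_utilization_pct
    raw = -(-load // target_utilization_pct)

    # The floor protects latency during quiet periods; the ceiling caps cost.
    return max(min_replicas, min(max_replicas, raw))
```

Under these assumptions, 2 replicas running at 90% against a 60% target yield a request for 3 replicas. The minimum keeps baseline capacity warm for latency-sensitive traffic, and the maximum bounds spend during spikes.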

Agent Platform Feature Store, Private endpoints, Serving backend scaling, Edge serving

Examworthy is not affiliated with or endorsed by Google Cloud. Original, blueprint-aligned practice material only.