Amazon Web Services study guide

How to pass AWS Certified Solutions Architect - Professional (SAP-C02)


The AWS Certified Solutions Architect - Professional (SAP-C02) is the capstone above the associate. It stops testing what a single service does and starts testing whether you can assemble a whole estate: many accounts under one organisation, hybrid links back to data centres, migration waves against a lease deadline, and the trade-offs between cost, resilience, operational overhead and agility that decide which design is actually best. Most stems are long, multi-paragraph scenarios with several competing constraints, and the work is separating the requirement that breaks the tie from the noise around it.

It suits experienced architects who already design across multiple AWS accounts and services: people who have run AWS Organizations, connected on-premises networks, sized disaster-recovery tiers to real RTO and RPO targets, and led or scoped a migration. AWS recommends around two years of hands-on experience designing and deploying cloud architectures. The exam draws across four weighted domains: designing for organisational complexity, designing new solutions, continuously improving existing ones, and accelerating migration and modernisation. New-solution design carries the most marks, but organisational complexity is close behind and is where most associate-level intuition runs out.

What makes it pass-or-fail is judgement under competing constraints, not recall. Three of the four options usually work in isolation. The right one is the managed, requirement-fit architecture that satisfies every stated limit at once: the cost ceiling, the resilience scope, the operational overhead the team will accept, the prohibition on application changes, the RTO and RPO, the compliance control that must hold even against an account administrator. Several questions are multiple-response, so a half-right pairing scores nothing. The skill being tested is reading a dense scenario, naming the binding constraint, and choosing the design that meets it with the least to build and run.

SAP-C02 is a pick-the-best-architecture exam at estate scale: long multi-constraint scenarios where three options work and the right one is the managed, requirement-fit design that satisfies every stated limit (cost, resilience scope, operational overhead, no application change, RTO and RPO, compliance) at once.

Difficulty

Advanced

Best for

Experienced AWS architects who design and deploy across multiple accounts and services: people who have worked with AWS Organizations, hybrid connectivity, multi-Region resilience, and migration planning, and now need to prove they can make cross-service trade-offs under competing constraints rather than recall single facts.

Prerequisites

None enforced, but this is a professional exam and it shows. AWS recommends around two years of hands-on experience designing and deploying on AWS. You should already be comfortable with the SAA-C03 associate material. Beyond that, multi-account governance, hybrid networking, disaster-recovery tiers, IAM and federation, and at least one real migration are what carry you through the scenarios.

Questions: 75
Time allowed: 180 min
Pass mark: 750 / 1000
Exam cost (USD): $300
Practice questions: 273

How this exam thinks

One habit decides this exam: read the long scenario for the constraint that breaks the tie, then pick the managed architecture built for it. Almost every stem names several limits at once (a cost ceiling, an RTO or RPO, an operational-overhead bar, a no-application-change rule, a compliance control that must survive an administrator), and three of the four options will work in isolation. The right answer is the one design that satisfies every stated limit together, usually the most managed, least-operated option that still meets the hard requirement.

The default tie-breaker is the managed, requirement-fit, lowest-overhead choice. AWS designs the exam around its own preference for managed and native services, so when two answers both work, the one with less to run and patch wins: AWS Transit Gateway over a self-managed EC2 router mesh, AWS Application Migration Service over manual image export, a CloudFormation change set over rollback triggers, ACM with automatic renewal over imported commercial certificates. Reach for the manual or less managed option only when the scenario names a reason, such as a provider with no managed rotation function, an engine to preserve, or a control that IAM alone cannot enforce. That named reason is the signal that the obvious managed answer is the trap.

The rest is a set of estate-scale discriminations the exam leans on. For connectivity, transitive any-to-any at scale is Transit Gateway, a single private service to many consumers is AWS PrivateLink, and a small fixed number of networks is VPC peering. For governance, a preventative control that binds even an account administrator is a service control policy, not an IAM permission boundary or a detective finding. For resilience, the disaster-recovery tier (backup and restore, pilot light, warm standby, multi-site) is chosen by the named RTO and RPO. For migration, the 7Rs vocabulary and the AWS MGN-versus-DMS split decide the answer. Name the binding constraint, then choose the service built for it.
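The resilience rule above (the tier follows from the named RTO and RPO) can be sketched as a lookup that returns the cheapest tier able to meet both targets. The numeric thresholds, the `Tier` class and the `choose_dr_tier` function are illustrative assumptions, not AWS-published cut-offs; real scenarios state their own figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    rto_minutes: float  # assumed fastest recovery time this tier can achieve
    rpo_minutes: float  # assumed smallest data-loss window this tier can achieve


# Ordered from cheapest to most expensive. All numbers are assumptions
# chosen for illustration only.
TIERS = [
    Tier("backup and restore", 240, 240),
    Tier("pilot light", 30, 30),
    Tier("warm standby", 5, 5),
    Tier("multi-site", 0, 0),
]


def choose_dr_tier(rto_minutes: float, rpo_minutes: float) -> str:
    """Return the cheapest tier whose assumed RTO and RPO both meet the targets."""
    for tier in TIERS:
        if tier.rto_minutes <= rto_minutes and tier.rpo_minutes <= rpo_minutes:
            return tier.name
    return TIERS[-1].name  # nothing cheaper fits, so fall back to the strongest tier


print(choose_dr_tier(rto_minutes=60, rpo_minutes=15))
```

The design point is the ordering: the loop stops at the first tier that satisfies both targets, which mirrors the exam habit of picking the least expensive design that still meets the hard requirement.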

What each domain tests and how to study it

The SAP-C02 blueprint is split across 4 domains. Weights are the official share of the exam; see the official exam guide for the authoritative breakdown.

  1. Design Solutions for Organizational Complexity

    26% of exam

    What you must be able to do. Given a multi-account estate with connectivity, identity, governance, resilience, or cost-visibility requirements, choose the network topology, identity model, organisation guardrails, backup strategy, and cost-allocation approach that scale across many accounts and Regions while keeping central control.

    In one sentence: The estate domain: connecting, governing, securing, and costing many VPCs and accounts under one organisation so control stays central as the estate grows.

    Recall check: answer these from memory first
    • Two hundred VPCs across forty accounts all need to reach a shared services VPC with transitive routing and central route control, and the count grows monthly. What topology scales, and why does a peering mesh or PrivateLink not fit?
    • A control must stop member-account administrators from re-enabling a risky action regardless of their IAM permissions. Which Organizations mechanism enforces it preventatively, and why is a permission boundary or a detective finding not enough?
    • Finance can see total spend but cannot break it down by team or environment because tags are inconsistent. What two things together give per-team Cost Explorer views with low ongoing effort?
    • An edge needs publicly trusted TLS that renews with no engineer touching a key, the key must never leave AWS, and security must be warned before expiry. What certificate strategy delivers all three?

    What it tests. Designing across the whole organisation rather than one VPC. Architecting connectivity for many VPCs and accounts with AWS Transit Gateway, VPC peering and AWS PrivateLink, balancing transitive routing against segmentation and scale; designing hybrid links and name resolution with AWS Direct Connect, AWS Site-to-Site VPN and Amazon Route 53 Resolver; prescribing cross-account and workforce access with AWS IAM Identity Center, IAM roles and third-party federation; centralising security audit, notification and encryption with AWS CloudTrail, AWS Security Hub, AWS KMS and AWS Certificate Manager; matching multi-Region resilience to RTO and RPO; building cross-account and cross-Region backup with AWS Backup; governing a multi-account environment with AWS Organizations and AWS Control Tower, service control policies, organisational units and centralised logging; and producing per-team cost visibility with AWS Cost Explorer, AWS Budgets, purchasing options and a cost-allocation tagging model.

    How to study it. Build the estate-scale decision trees first because this domain is where associate intuition runs out. For connectivity, fix the three-way split by scale and intent: Transit Gateway when many VPCs and accounts need transitive any-to-any routing with central route control, PrivateLink when one private service must reach many consumers individually approved, VPC peering only for a small fixed set of one-to-one links. Make service control policies your reflex for any control that must hold even against a member-account administrator: an SCP is evaluated above IAM, so it is the preventative guardrail, while permission boundaries, Config rules and Security Hub findings cannot bind every principal. Learn the AWS Backup model: policy-driven backup plans applied across the organisation, copied cross-account and cross-Region, with recovery actually tested. For cost visibility, learn that cost-allocation tags only surface once activated in the management account, and that Organizations tag policies plus SCPs enforce the schema so new resources are reportable automatically. For certificates at the edge, ACM with DNS validation renews hands-off and keeps the key inside AWS.
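To make the SCP reflex concrete, here is a minimal sketch of a deny-style service control policy built as a Python dictionary. The two CloudTrail actions are an illustrative choice of "risky action", and the helper name `deny_actions_scp` and the `Sid` value are hypothetical. Note that an SCP statement carries no `Principal` element, because it applies to every principal in the accounts it is attached to.

```python
import json


def deny_actions_scp(actions, sid="DenyProtectedActions"):
    """Build an SCP document that denies the given actions on all resources."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Deny",
                "Action": sorted(actions),
                "Resource": "*",
            }
        ],
    }


# Illustrative guardrail: stop member accounts from switching off audit logging.
scp = deny_actions_scp(["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"])
print(json.dumps(scp, indent=2))
```

Attaching a document like this to an organisational unit is what makes the control preventative: the deny is evaluated before any IAM permission in the member account is considered.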

    Easy to confuse

    • AWS Transit Gateway versus VPC peering. Transit Gateway is a regional routing hub giving transitive any-to-any reachability and central route-table control for many VPCs and accounts; VPC peering is non-transitive and one-to-one, so a full mesh of N VPCs needs N(N-1)/2 links, which grows with the square of N, and traffic cannot route through a peer. Use Transit Gateway at scale or when transitive routing is required, peering only for a small fixed set of direct links.
    • AWS PrivateLink versus AWS Transit Gateway. PrivateLink publishes one service behind a Network Load Balancer and exposes only that service, with each consumer account creating an interface endpoint and the producer approving connections individually and no route propagation; Transit Gateway connects whole networks and propagates routes. When the requirement is private access to a single service for many accounts without peering or route sharing, it is PrivateLink, not Transit Gateway.
    • Service control policy versus IAM permission boundary. An SCP attaches to the organisation root, an organisational unit or an account and is evaluated above IAM, so it caps what every principal in the affected member accounts can do, including the account's root user and administrators; it grants nothing itself and does not restrict the management account. A permission boundary attaches to an individual role or user and only limits that one principal, leaving an administrator free to create unbounded roles. For an organisation-wide preventative guardrail that no member-account admin can override, use an SCP.
    • AWS Control Tower versus AWS Organizations alone. AWS Organizations provides the account structure, consolidated billing and SCP mechanism; AWS Control Tower sits on top and sets up a governed landing zone with prescriptive guardrails, an account factory and centralised logging and audit accounts out of the box. Choose Control Tower when the scenario wants a managed, opinionated multi-account baseline rather than building governance by hand on raw Organizations.
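The scaling argument behind the Transit Gateway versus peering split is simple arithmetic, sketched below. It assumes every VPC must reach every other VPC and that all VPCs sit in one Region behind a single Transit Gateway; the function names are hypothetical.

```python
def full_mesh_peering_links(vpc_count: int) -> int:
    """Peering connections needed so every VPC reaches every other one directly."""
    return vpc_count * (vpc_count - 1) // 2


def transit_gateway_attachments(vpc_count: int) -> int:
    """Attachments needed when every VPC connects once to a shared hub."""
    return vpc_count


for n in (5, 50, 200):
    print(n, full_mesh_peering_links(n), transit_gateway_attachments(n))
```

At 200 VPCs the mesh needs 19,900 peering connections against 200 attachments to the hub, and each new VPC adds one attachment rather than one link per existing VPC. Peering connections per VPC are also capped by a service quota, so it is worth checking the current limit before considering a mesh at all.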

    Worked example from the SAP-C02 bank

    Free sample · Design Solutions for Organizational Complexity · hard

    A multinational runs around 200 VPCs spread across 40 AWS accounts under a single organisation, and the count grows monthly as new product teams onboard. Every VPC must reach a shared services VPC for DNS and patching, and many must also reach each other, with full transitive routing and central control of which routes propagate where. The networking team wants to avoid managing an ever-expanding mesh of point-to-point links. Which design MOST scalably meets these requirements?

    • A. Deploy an AWS Transit Gateway shared through AWS Resource Access Manager, attach every VPC to it, and use Transit Gateway route tables to control which attachments can route to the shared services VPC and to each other. (Correct)
    • B. Create a full mesh of VPC peering connections between every pair of VPCs and add the shared services VPC as another peer, relying on the peering links for any VPC to reach any other VPC directly.
    • C. Expose the shared services through AWS PrivateLink endpoint services and create interface endpoints in every VPC, then add PrivateLink endpoints between product VPCs wherever two teams need to reach each other.
    • D. Designate one central VPC as a transit hub, run software routers on EC2 instances inside it, and peer every other VPC to that hub so traffic is forwarded between VPCs through the EC2 routing layer.
    Select AWS Transit Gateway as the scalable transitive hub for connecting many VPCs and accounts with centrally controlled routing. Transit Gateway acts as a regional routing hub that every VPC attaches to, giving transitive any-to-any routing without a quadratic mesh of links. Sharing it through Resource Access Manager lets accounts across the organisation attach, and Transit Gateway route tables centrally decide which attachments propagate routes to which, something a peering mesh, PrivateLink endpoints or self-managed EC2 routers cannot do at this scale.

    Why A is correct: A Transit Gateway is a hub that provides transitive routing for all attached VPCs and accounts, scales to thousands of attachments, and its route tables centrally govern which VPCs reach the shared services VPC or each other.

    Why B is wrong: A peering mesh seems to give any-to-any reachability, but peering is non-transitive and the number of links grows roughly with the square of the VPC count, which becomes unmanageable well before 200 VPCs.

    Why C is wrong: PrivateLink cleanly publishes the shared services, but it exposes single services rather than whole VPCs, so building any-to-any product connectivity from endpoints does not provide the general transitive routing the estate needs.

    Why D is wrong: EC2 software routers can forward traffic to work around non-transitive peering, but they add instances to patch, scale and make highly available, duplicating a managed capability Transit Gateway already provides.

  2. Design for New Solutions

    29% of exam

    What you must be able to do. Given a greenfield workload with security, availability, performance, scaling, or cost requirements, choose the infrastructure-as-code and deployment approach, the managed and serverless building blocks, the multi-AZ and multi-Region resilience design, the layered access controls, and the pricing model that meet every stated requirement with the least operational overhead.

    In one sentence: The greenfield domain: designing a new workload end to end so it deploys safely, scales elastically, defends in layers, and meets its availability and cost targets with managed services by default.

    Recall check: answer these from memory first
    • A release must show exactly which resources a stack update will modify, replace or delete and get sign-off before anything is applied, after a prior update silently replaced a database. Which CloudFormation feature gives that preview, and why not rollback triggers or drift detection?
    • A Lambda-backed API in a producer account must be consumed privately by dozens of separate accounts, each approved individually, with no peering or route propagation and only that one service exposed. What design fits?
    • Player session traffic in DynamoDB is flat for weeks then jumps many-fold within minutes with no warning, and the team can neither tolerate throttling nor waste spend. Which capacity mode matches, and why not provisioned auto scaling or DAX?
    • Private-subnet instances must call AWS Secrets Manager privately, only these instances may use the path, and the path may fetch only secrets carrying a specific tag. What two controls enforce the network-layer and service-layer restrictions together?

    What it tests. Designing a new solution across deployment, security, resilience, performance, scale and cost. Building repeatable releases and safe rollback with AWS CloudFormation, change sets and CI/CD; choosing managed and serverless services to cut provisioning and patching; designing business continuity with automated data and database replication and Amazon Route 53 routing across Availability Zones and Regions; enforcing least privilege with IAM roles, scoped resource policies, security groups, network ACLs and VPC endpoints; mitigating attacks with AWS WAF, AWS Shield, Amazon GuardDuty and edge protections; building reliability with Multi-AZ and multi-Region patterns, EC2 Auto Scaling and loose coupling through Amazon SQS, Amazon SNS and AWS Step Functions; meeting performance objectives by selecting instance families, storage and purpose-built databases with caching and replicas; designing elastic scaling for varied access patterns; and choosing pricing models, storage tiering and data-transfer paths that cut spend without breaching requirements.

    How to study it. Drill the safe-deployment and managed-scaling reflexes, since this is the heaviest domain. For CloudFormation, learn that a change set previews the exact action and replacement for each resource before anything runs, which is the answer whenever a scenario demands pre-execution visibility and sign-off, ahead of rollback triggers or drift detection that only act during or after. For private service access across accounts, fix that PrivateLink fronts a single service behind a Network Load Balancer with per-consumer approval and no peering. For unpredictable spiky load, prefer the on-demand or serverless mode that absorbs swings without an operator predicting capacity, such as DynamoDB on-demand, over provisioned auto scaling that lags sudden jumps. For layered security, learn to pair a network-layer control (an interface endpoint security group) with a service-layer control (a VPC endpoint policy conditioned on a tag) so multiple-response questions need both halves. For resilience, match decoupling and multi-Region routing to the availability target named.
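The change-set review described above can be sketched as a small gate that scans a change set for deletions and replacements before anyone approves execution. The `Changes`, `ResourceChange`, `Action` and `Replacement` fields follow the shape returned by CloudFormation's DescribeChangeSet call; the `risky_changes` helper and the sample resources are hypothetical, and the sample response is hand-written rather than fetched from AWS.

```python
def risky_changes(describe_change_set_response):
    """Return (logical_id, resource_type, reason) for changes that need sign-off."""
    flagged = []
    for change in describe_change_set_response.get("Changes", []):
        resource = change.get("ResourceChange")
        if not resource:
            continue
        action = resource.get("Action")
        replacement = resource.get("Replacement")
        if action == "Remove":
            reason = "deleted"
        elif action == "Modify" and replacement == "True":
            reason = "replaced"
        elif action == "Modify" and replacement == "Conditional":
            reason = "possibly replaced"
        else:
            continue  # additions and in-place updates pass without a flag
        flagged.append(
            (resource["LogicalResourceId"], resource["ResourceType"], reason)
        )
    return flagged


# Hand-written sample in the DescribeChangeSet response shape.
sample = {
    "Changes": [
        {"Type": "Resource", "ResourceChange": {
            "Action": "Modify", "LogicalResourceId": "OrdersDb",
            "ResourceType": "AWS::RDS::DBInstance", "Replacement": "True"}},
        {"Type": "Resource", "ResourceChange": {
            "Action": "Modify", "LogicalResourceId": "WebSg",
            "ResourceType": "AWS::EC2::SecurityGroup", "Replacement": "False"}},
        {"Type": "Resource", "ResourceChange": {
            "Action": "Remove", "LogicalResourceId": "OldQueue",
            "ResourceType": "AWS::SQS::Queue"}},
    ]
}

for item in risky_changes(sample):
    print(item)
```

In a pipeline, a non-empty result would hold the release for approval. This is the pre-execution visibility that rollback triggers and drift detection cannot give.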

    Easy to confuse

    • CloudFormation change set versus rollback triggers. A change set is generated before execution and lists each resource with its planned action and whether the update forces a replacement, so a destructive change can be caught and gated on approval; rollback triggers and CloudWatch-alarm rollback only act during or after the update, after the replacement has already started. For pre-execution visibility and sign-off, the change set is the answer.
    • DynamoDB on-demand versus provisioned with auto scaling. On-demand capacity serves each request with no pre-set throughput, absorbs sudden spikes without anyone forecasting capacity (AWS documents immediate headroom up to double the table's previous peak), and incurs no throughput charge when idle; provisioned auto scaling tracks utilisation and adds capacity reactively, so it lags unannounced surges and can throttle during the jump. For unforecast spiky traffic, on-demand fits; provisioned auto scaling suits gradual, predictable change.
    • Interface endpoint security group versus VPC endpoint policy. An interface VPC endpoint is backed by an elastic network interface, so its security group is a network-layer control deciding which instances may send requests through the path; a VPC endpoint policy is a service-layer control deciding what API calls and resources are allowed through that endpoint. They answer different halves: who may use the path versus what may be retrieved, and a least-privilege design needs both.
    • PrivateLink interface endpoint versus public API Gateway with a resource policy. A PrivateLink interface endpoint keeps traffic on private AWS paths and exposes only the single service, with the producer approving each consumer; a public API Gateway endpoint guarded by a resource policy still sends traffic over the internet even though it restricts which principals may call. When the requirement says private connectivity with no internet path, PrivateLink wins over the public endpoint.
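The service-layer half of the layered control above can be sketched as a VPC endpoint policy that only allows reading secrets carrying a given tag. The tag key `app`, the tag value `orders`, the `Sid` and the helper name are hypothetical; the condition key shown is the Secrets Manager resource-tag key as I understand it, so verify it against the current documentation. The network-layer half, the security group on the interface endpoint, is configured separately and is not shown.

```python
import json


def secrets_endpoint_policy(tag_key, tag_value):
    """Endpoint policy: allow GetSecretValue only for secrets with the given tag."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadTaggedSecretsOnly",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "secretsmanager:GetSecretValue",
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        f"secretsmanager:ResourceTag/{tag_key}": tag_value
                    }
                },
            }
        ],
    }


policy = secrets_endpoint_policy("app", "orders")
print(json.dumps(policy, indent=2))
```

Unlike an SCP, an endpoint policy does carry a `Principal` element. It still only filters what may pass through the endpoint; callers need their own IAM permission to read the secret.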

    Worked example from the SAP-C02 bank

    Free sample · Design for New Solutions · hard

    A SaaS provider is designing a new internal API that runs as a Lambda function in a producer account and must be consumed privately by applications in dozens of separate consumer accounts across two AWS Organizations. The security team requires that consumer traffic reach the API over private connectivity without VPC peering or route propagation between accounts, that each consumer account be granted access individually, and that the producer expose only this one service rather than a whole network. Which design BEST meets these requirements?

    • A. Attach all producer and consumer VPCs to a central AWS Transit Gateway shared through Resource Access Manager, propagate the producer's routes to every consumer, and use security groups on the API subnet to limit which consumer accounts can send traffic to the Lambda-backed service.
    • B. Establish VPC peering connections between the producer VPC and each consumer VPC, add the necessary routes on both sides of every peering link, and rely on security groups and route-table scoping so that consumers can only reach the subnet hosting the API service.
    • C. Expose the Lambda function through a public Amazon API Gateway REST API protected by a resource policy that allows only the listed consumer account principals, so each consumer is granted individually while AWS WAF blocks any caller outside the approved set of accounts.
    • D. Front the Lambda function with a Network Load Balancer, publish it as an AWS PrivateLink endpoint service, and have each consumer account create an interface VPC endpoint to that service, approving consumer connections individually so each account reaches only this one API privately. (Correct)
    Publish a single service with AWS PrivateLink so many consumer accounts get private, individually approved access without peering or route propagation. AWS PrivateLink lets a producer place a service behind a Network Load Balancer and publish it as an endpoint service. Because a Network Load Balancer cannot register a Lambda function as a target directly, a real build would typically put an Application Load Balancer with a Lambda target group behind the NLB. Each consumer account creates an interface VPC endpoint and the producer approves the connection individually, so access is granted per account and only this one service is exposed, with traffic staying on private paths and no VPC peering or route propagation involved. Transit Gateway and peering connect whole networks, and a public API Gateway endpoint sends traffic over the internet.

    Why A is wrong: A Transit Gateway can connect the accounts privately, but it joins entire VPC networks and propagates routes between them, which exposes far more than the single API and contradicts the no-route-propagation and one-service-only constraints in the requirement.

    Why B is wrong: VPC peering with scoped routes could reach the API, but peering is exactly the mechanism the requirement excludes, it does not scale cleanly to dozens of accounts, and it still connects networks rather than publishing a single private service.

    Why C is wrong: An account-scoped resource policy on API Gateway does grant access per account, but a public REST API sends traffic over the internet rather than private connectivity, so it fails the requirement that consumer traffic reach the API over a private path.

    Why D is correct: PrivateLink exposes a single service behind a Network Load Balancer as an endpoint service, consumers reach it through interface endpoints over private connectivity with no peering or route propagation, and the producer approves each consumer connection individually, so only the one API is shared rather than a whole network.

  3. Continuous Improvement for Existing Solutions

    25% of exam

    What you must be able to do. Given a running workload with an operational, security, performance, reliability, or cost weakness, choose the monitoring and auto-remediation, deployment-hardening, secret-management, patching, performance, reliability, and cost-analysis improvement that fixes the weakness with managed native mechanisms and the least added overhead.

    In one sentence: The improvement domain: hardening a workload already in production so recurring failures self-heal, secrets and patches are managed, performance and reliability gaps close, and waste is removed without rebuilds.

    Recall check: answer these from memory first
    • An instance with a weekly memory leak stops answering health checks and an engineer reboots it overnight. The fix must trigger the moment the instance is unhealthy, add no standing servers, and need no custom code. What CloudWatch capability does it?
    • High-severity Amazon Inspector package vulnerabilities on tagged instances must be patched automatically, reusing existing Systems Manager patching and staying auditable. What event-driven chain delivers this?
    • A fleet's CPU follows strong daily and weekly cycles, and a single static alarm either fires every evening or misses overnight spikes. Which CloudWatch capability adapts without manual retuning, and why not a composite of fixed thresholds?
    • A hardcoded third-party API key with no AWS-managed rotation function must be encrypted, rotated every 45 days against the provider, and read by the app without a redeploy. What two elements meet this with the least standing infrastructure?

    What it tests. Improving a solution that already runs. Lifting operational excellence with Amazon CloudWatch monitoring, logging, alerting and automatic remediation of recurring failures; adopting blue/green and rolling deployments and configuration automation with AWS Systems Manager; hardening secrets with AWS Secrets Manager, auditing least privilege and enforcing compliance with AWS Config; designing patch management, backup and automated vulnerability remediation with AWS Systems Manager, Amazon Inspector and AWS Backup; improving performance with Amazon CloudFront, AWS Global Accelerator and caching against measurable SLAs; improving reliability by removing single points of failure, enabling replication and self-healing and resolving service-quota limits; translating business requirements into metrics and right-sizing with AWS Compute Optimizer and CloudWatch; and finding cost savings in running workloads by analysing the AWS Cost and Usage Report, eliminating unused resources and setting billing alarms and tagging.

    How to study it. Learn the event-driven auto-remediation pattern as the spine of this domain, because the exam repeats it. A finding or metric (from Amazon Inspector, AWS Security Hub, or CloudWatch) is routed through Amazon EventBridge to a Systems Manager Automation runbook that fixes the issue and logs the run, with no human in the loop and no standing servers. Prefer the native action where one exists: a CloudWatch alarm with a built-in EC2 recovery action over a polling Lambda. For seasonal metrics with daily and weekly cycles, reach for CloudWatch anomaly detection rather than static thresholds or week-over-week metric maths that either spam or miss. For secrets, learn that Secrets Manager still automates rotation for a third-party provider through a custom rotation Lambda, and that applications should read the secret at runtime by reference (with the caching client) so a rotated value is picked up without a redeploy. Remember that Parameter Store has no built-in rotation, which is the trap when a scenario demands managed rotation.

    Easy to confuse

    • CloudWatch alarm EC2 recovery action versus a scheduled remediation Lambda. A CloudWatch alarm on a status-check metric can invoke a built-in EC2 reboot or recover action the instant the metric breaches, with nothing to run or maintain; a scheduled Lambda that polls the fleet every few minutes adds custom code, lags the failure by the poll interval and must be maintained. For immediate, code-free self-healing the native alarm action wins.
    • CloudWatch anomaly detection versus a static threshold alarm. Anomaly detection trains a model on a metric's history and learns its daily and weekly seasonality, producing an expected band that moves with the cycle so it catches real spikes during quiet hours and tolerates normal peaks; a static threshold is rigid and either spams during normal peaks or misses off-peak spikes. For a seasonal metric with no manual retuning, use anomaly detection.
    • AWS Secrets Manager versus Systems Manager Parameter Store SecureString. Secrets Manager provides built-in rotation, including a custom rotation Lambda for third-party providers, and runtime retrieval by reference; Parameter Store SecureString encrypts a value with KMS but has no native rotation, so rotating it means bolting on your own EventBridge-plus-Lambda scheme. When the scenario demands managed automatic rotation, Parameter Store is the trap and Secrets Manager is the fit.
    • EventBridge-to-SSM Automation remediation versus an SNS email alert. Routing a finding through EventBridge to a Systems Manager Automation runbook fixes the issue automatically and records the run, satisfying a no-human-in-the-loop requirement; publishing the finding to an SNS topic only emails an engineer who then patches by hand. When the requirement is automatic auditable remediation, choose the EventBridge-to-Automation chain, not the alert.
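To see why a band that moves with the cycle beats a fixed threshold, here is a deliberately simple stand-in: a per-hour mean plus or minus a few standard deviations. This is not CloudWatch's model, only an illustration of the idea; the traffic shape, the 70% static threshold, the band width and every function name are assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def seasonal_band(history, width=3.0):
    """Build an expected (low, high) CPU band for each hour of the day."""
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)
    band = {}
    for hour, values in by_hour.items():
        centre = mean(values)
        spread = max(pstdev(values), 1.0)  # floor so a flat history still has a band
        band[hour] = (centre - width * spread, centre + width * spread)
    return band


def is_anomalous(band, hour, value):
    low, high = band[hour]
    return value < low or value > high


def baseline(hour):
    """Assumed daily cycle: busy evenings, quiet nights, moderate otherwise."""
    if 18 <= hour <= 21:
        return 80.0
    if hour <= 5:
        return 10.0
    return 40.0


# Four days of synthetic history with a small deterministic wobble.
history = [
    (hour, baseline(hour) + jitter)
    for jitter in (-2.0, 0.0, 2.0, 0.0)
    for hour in range(24)
]
band = seasonal_band(history)
STATIC_THRESHOLD = 70.0

# 45% CPU at 03:00 is a real spike that the static alarm misses.
print(is_anomalous(band, 3, 45.0), 45.0 > STATIC_THRESHOLD)
# 82% CPU at 20:00 is a normal evening peak that the static alarm fires on.
print(is_anomalous(band, 20, 82.0), 82.0 > STATIC_THRESHOLD)
```

The first comparison flags the overnight spike while the static threshold stays silent; the second tolerates the evening peak while the static threshold fires. That is the behaviour the exam scenario is describing.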

    Worked example from the SAP-C02 bank

    Free sample · Continuous Improvement for Existing Solutions · medium

    A logistics company already runs Amazon Inspector across its accounts and wants high-severity operating system package vulnerabilities on its tagged Amazon EC2 instances to be patched automatically, without an engineer triaging each finding by hand. The remediation must reuse the existing Systems Manager patching capability and remain auditable. Which design BEST automates remediation of these high-severity findings?

    • A. Configure Amazon Inspector to publish findings to Amazon SNS and subscribe an email distribution list so on-call engineers receive each high-severity finding and apply patches manually during the next change window.
    • B. Route Inspector finding deletions through AWS Config and have a Config remediation action reinstall the operating system on any instance whose finding reappears, rebuilding the host to clear the vulnerable package.
    • C. Send Inspector findings to Amazon EventBridge, match high-severity package vulnerabilities with a rule, and trigger a Systems Manager Automation runbook that runs Patch Manager against the affected instances, logging each run for audit. (Correct)
    • D. Export the Inspector findings nightly to an Amazon S3 bucket, run an AWS Glue job to filter the high-severity package rows, and email the resulting list to the patching team so they can decide which instances to schedule for an out-of-band patch during the next change window.
    Wire Amazon Inspector findings through EventBridge to a Systems Manager Automation runbook so high-severity vulnerabilities are patched automatically and auditably. Amazon Inspector emits findings as events, so an EventBridge rule can filter for high-severity package vulnerabilities and invoke a Systems Manager Automation runbook. That runbook calls Patch Manager against the affected tagged instances, reusing the existing patching capability while recording each execution for audit. SNS email, Config-driven reinstalls and a nightly Glue export all keep a human in the loop or fail to reuse Systems Manager, so none delivers automatic auditable remediation.

    Why A is wrong: Email notification keeps a human in the loop for every finding, which is the manual triage the company wants to remove, and it does not reuse Systems Manager patching automatically.

    Why B is wrong: Config remediation does not consume Inspector findings this way, and reinstalling the operating system on every recurring finding is disproportionate and destructive rather than a targeted patch.

    Why C is correct: EventBridge matches high-severity Inspector findings and invokes an SSM Automation runbook that drives Patch Manager against the affected instances, giving automatic, auditable remediation that reuses existing patching.

    Why D is wrong: A nightly export and Glue filter adds a batch pipeline and still ends in manual scheduling, so it is neither automatic remediation nor a reuse of the existing Systems Manager patching.
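
    The filtering step in the correct option can be sketched as an EventBridge event pattern. This is a minimal illustration, not a deployed rule: the field names follow the Amazon Inspector finding event as commonly documented and should be checked against a real event, including CRITICAL alongside HIGH is an assumption, and the `matches` helper is a hypothetical local stand-in that handles only exact-value lists, not EventBridge's full matching syntax.

```python
import json

# Event pattern an EventBridge rule could use. Treat the field names as an
# assumption and confirm them against a real Inspector finding event.
HIGH_SEVERITY_PACKAGE_PATTERN = {
    "source": ["aws.inspector2"],
    "detail-type": ["Inspector2 Finding"],
    "detail": {
        "severity": ["HIGH", "CRITICAL"],  # assumption: CRITICAL is in scope too
        "type": ["PACKAGE_VULNERABILITY"],
        "status": ["ACTIVE"],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified local matcher: nested objects and exact-value lists only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


def affected_instance_ids(event: dict) -> list:
    """Collect the EC2 instance IDs named in the finding's resource list."""
    resources = event.get("detail", {}).get("resources", [])
    return [r["id"] for r in resources if r.get("type") == "AWS_EC2_INSTANCE"]


# Hypothetical, heavily trimmed finding event used only for illustration.
sample_event = {
    "source": "aws.inspector2",
    "detail-type": "Inspector2 Finding",
    "detail": {
        "severity": "HIGH",
        "type": "PACKAGE_VULNERABILITY",
        "status": "ACTIVE",
        "resources": [{"type": "AWS_EC2_INSTANCE", "id": "i-0123456789abcdef0"}],
    },
}

if matches(HIGH_SEVERITY_PACKAGE_PATTERN, sample_event):
    # In the real chain these IDs would be passed to the Automation runbook.
    print(json.dumps(affected_instance_ids(sample_event)))
```

    In the real architecture the rule's target would be the Systems Manager Automation runbook, which receives the instance IDs and records each execution; the sketch stops at the point where the IDs are extracted.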

  4. Accelerate Workload Migration and Modernization

    20% of exam

    What you must be able to do. Given a portfolio to move under a deadline, classify each workload with the 7Rs, then choose the data-transfer mechanism, the application and database migration tooling, the target compute, container, storage and database platform, and the modernisation path that together move and improve the estate with the least risk and downtime.

    In one sentence. The migration domain: assessing a portfolio, picking the right transfer and migration tooling for each workload, and choosing where to modernise to serverless, containers and purpose-built databases at low risk.

    Recall check: answer these from memory first
    • A per-server licensed commercial ERP suite and a self-hosted SQL Server estate moving to a managed engine each need a 7Rs label before wave planning. Which strategy applies to each, and why not rehost or relocate?
    • A continuously written MySQL 8 database must move to Amazon RDS for MySQL with only minutes of cutover downtime and no schema conversion. Which migration approach achieves near-zero downtime, and why not a one-time full load?
    • Roughly 150 Linux and Windows VMs, including bespoke servers, must be rehosted under a lease deadline with a few-minute cutover and pre-cutover validation while the source keeps running. Which two AWS Application Migration Service actions deliver this?
    • A SaaS payments provider emits events that several teams in different accounts must filter by currency and amount, with new consuming accounts added later without changing the producer. Which managed router fits, and why not SNS, SQS or Kinesis?

    What it tests. Moving and modernising an existing estate. Selecting workloads with a portfolio assessment in AWS Migration Hub, applying the seven common migration strategies and weighing total cost of ownership; choosing data-transfer mechanisms with AWS DataSync, AWS Transfer Family, the AWS Snow Family and Amazon S3 Transfer Acceleration; choosing application and database migration tooling with AWS Application Migration Service, AWS Database Migration Service and the AWS Schema Conversion Tool; designing a target architecture by selecting the right compute, container, storage and database platforms; identifying modernisation opportunities by decoupling and moving to AWS Lambda and to containers on Amazon ECS, Amazon EKS and AWS Fargate; and adopting purpose-built databases such as Amazon DynamoDB and Amazon Aurora Serverless and integration services such as Amazon EventBridge and AWS Step Functions.

    How to study it. Memorise the 7Rs vocabulary cold, because the exam uses it as the shared language for classification questions: retire, retain, rehost, relocate, repurchase, replatform, refactor. Practise mapping a workload to its strategy: licensed packaged software dropped for a SaaS or new product is repurchase, a self-managed database moved to a managed engine such as Amazon RDS with only configuration change is replatform, a lift-and-shift onto EC2 unchanged is rehost. Fix the AWS MGN-versus-DMS split: Application Migration Service does continuous block-level replication of whole servers for low-downtime rehosting with non-disruptive test launches, while Database Migration Service moves database rows, using a full-load-plus-change-data-capture task for near-zero-downtime homogeneous migrations and the Schema Conversion Tool when engines differ. For data transfer, pick by volume and link: DataSync over a network for online file transfer, the Snow Family when the data is too large for the available bandwidth and deadline. For modernisation, learn EventBridge as the cross-account, content-based router for SaaS partner events, chosen in preference to hand-built SNS, SQS or Kinesis glue.
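
    The mapping practice described above can be rehearsed as a small decision function. This is a sketch under stated assumptions: the `Workload` fields are hypothetical flags invented for illustration, the order of the checks is one reasonable reading of the rules in this guide, and relocate is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical traits used only to illustrate the classification."""
    name: str
    still_needed: bool = True
    must_stay_on_premises: bool = False
    replaced_by_saas_or_new_product: bool = False
    application_rewritten: bool = False
    moves_to_managed_equivalent: bool = False  # e.g. self-managed DB to Amazon RDS


def classify(w: Workload) -> str:
    """Return a 7Rs label; relocate is intentionally not modelled here."""
    if not w.still_needed:
        return "retire"
    if w.must_stay_on_premises:
        return "retain"
    if w.replaced_by_saas_or_new_product:
        return "repurchase"
    if w.application_rewritten:
        return "refactor"
    if w.moves_to_managed_equivalent:
        return "replatform"
    return "rehost"  # moved unchanged, typically onto EC2


erp = Workload("licensed ERP suite", replaced_by_saas_or_new_product=True)
sql = Workload("self-hosted SQL Server", moves_to_managed_equivalent=True)
vm = Workload("bespoke app server")

for w in (erp, sql, vm):
    print(f"{w.name}: {classify(w)}")
```

    The point of the exercise is the ordering: a workload replaced by a new product never reaches the rehost branch, and a managed-engine move with only configuration change stops at replatform before it can be mislabelled as rehost.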

    Easy to confuse

    • AWS Application Migration Service versus AWS Database Migration Service. Application Migration Service (MGN) replicates whole servers at block level, including the operating system and disks, for low-downtime lift-and-shift rehosting with test launches; Database Migration Service (DMS) replicates database rows, with full-load-plus-change-data-capture for near-zero-downtime database moves. Use MGN to move machines, DMS to move databases, and never DMS to replicate a whole virtual machine.
    • Replatform versus rehost (7Rs). Rehost is a lift-and-shift that moves a workload unchanged, typically onto EC2, keeping the existing engine and configuration; replatform is lift-and-optimise that swaps one component for a managed equivalent, such as self-managed SQL Server to Amazon RDS, without rewriting the application. When a self-managed database moves to a managed engine with only configuration change, it is replatform, not rehost.
    • DMS full-load-plus-CDC versus a one-time full load. A full-load-plus-change-data-capture task bulk-copies existing rows then streams ongoing writes so the target stays current and cutover takes minutes; a one-time full load copies a static snapshot, so every write after the load began must be reconciled in a long maintenance window. For a continuously written source with minutes of allowed downtime, full-load-plus-CDC is required.
    • AWS DataSync versus the AWS Snow Family. DataSync moves files over an existing network link and suits online transfer when bandwidth and time allow; the Snow Family ships physical devices and suits datasets too large to move over the available link within the deadline or sites with poor connectivity. The deciding constraint is whether the network can carry the volume in the time available.
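
    The DataSync-versus-Snow decision comes down to arithmetic: volume divided by usable bandwidth, compared with the deadline. The sketch below assumes decimal terabytes, a sustained utilisation figure of 80% (an illustrative guess, not a measured value), and ignores Snow device shipping and loading time; the function names are hypothetical.

```python
def transfer_days(dataset_tb: float, link_mbps: float, utilisation: float = 0.8) -> float:
    """Days needed to move dataset_tb terabytes over a link of link_mbps megabits/s."""
    bits = dataset_tb * 1e12 * 8  # decimal terabytes to bits
    usable_bits_per_second = link_mbps * 1e6 * utilisation
    return bits / usable_bits_per_second / 86_400  # seconds per day


def pick_transfer(dataset_tb: float, link_mbps: float, deadline_days: float,
                  utilisation: float = 0.8) -> str:
    """Choose online transfer only if the link can carry the volume in time."""
    days = transfer_days(dataset_tb, link_mbps, utilisation)
    return "DataSync" if days <= deadline_days else "Snow Family"


# 200 TB over a 1 Gbps link with a 14-day deadline.
print(round(transfer_days(200, 1000), 1), pick_transfer(200, 1000, 14))  # 23.1 Snow Family
# 10 TB over the same link and deadline.
print(round(transfer_days(10, 1000), 1), pick_transfer(10, 1000, 14))    # 1.2 DataSync
```

    When a stem gives a dataset size, a link speed and a deadline, running this estimate in your head is usually enough to eliminate one of the two options.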

    Worked example from the SAP-C02 bank

    Free sample. Accelerate Workload Migration and Modernization. Difficulty: hard.

    A logistics company must vacate a leased data centre and rehost roughly 150 Linux and Windows virtual machines, including several bespoke application servers, into a migration account in its AWS Organizations estate. The applications cannot be re-architected or have their operating systems reconfigured before the move, the business mandates a cutover window of only a few minutes per server, and the team wants to prove each migrated server works in AWS before the final switch while the source keeps running. The programme wants the fastest, lowest-risk lift-and-shift with the least manual rebuilding. Which TWO actions using AWS Application Migration Service deliver this rehost? (Select TWO.)

    • A. Use AWS Database Migration Service to replicate each virtual machine, including its operating system and locally attached disks, into Amazon EC2 instances and cut over once the machine-level replication completes.
    • B. Export each source virtual machine to an image file, copy the images to Amazon S3, and import them one at a time as AMIs, accepting a long per-server downtime while each large image uploads and converts.
    • C. Install the AWS Application Migration Service replication agent on each source server so it performs continuous block-level replication of the running machines into a staging subnet, keeping the targets in step until each cutover. (Correct)
    • D. Launch test instances from the replicated volumes to validate each application in AWS while the source servers keep running, then perform the brief cutover only once each migrated server has been verified. (Correct)
    • E. Rebuild each server from scratch on new EC2 instances using configuration management to reinstall the operating system and applications, then migrate only the data, treating a clean rebuild as the lowest-risk rehost.
    Use the Application Migration Service agent for continuous block-level replication and its non-disruptive test launches to rehost servers with a few-minute cutover and pre-cutover validation. Application Migration Service is built for low-downtime rehosting. Its replication agent performs continuous block-level replication of the running source into a staging subnet, so the target is always current and cutover takes only the minutes needed to launch the latest instance, with no manual rebuild of the bespoke servers. The service also launches test instances from the replicated volumes while the source keeps running, letting the team validate each application in AWS before the final switch. The other options each fail a stated requirement: Database Migration Service moves database rows rather than whole machines, a manual image export and import imposes long per-server downtime, and rebuilding from scratch is the slowest and most error-prone option for a lift-and-shift under a lease deadline.

    Why A is wrong: Database Migration Service replicates database table data between engines, not whole operating systems or block volumes of a virtual machine, so it cannot rehost a full server image and is the wrong tool for a lift-and-shift of machines.

    Why B is wrong: A one-off image export and import requires the source to be quiesced and incurs a long upload and conversion per server, breaching the few-minute cutover requirement and offering none of the continuous replication or test-launch capabilities the deadline demands.

    Why C is correct: The Application Migration Service agent streams continuous block-level replication from each running source into a staging area, so the target stays current and the final cutover takes only the few minutes needed to launch the up-to-date instance, with no manual rebuild.

    Why D is correct: Application Migration Service can boot non-disruptive test instances from the replicated volumes without stopping replication or the source, so the team validates each server in AWS and cuts over only after verification, meeting the prove-before-switch requirement.

    Why E is wrong: Rebuilding every server by hand is the slowest, highest-effort path and risks configuration drift from the bespoke originals, directly contradicting the requirement for the fastest lift-and-shift with the least manual rebuilding under a fixed lease deadline.

A study plan that works

  1. Map the four domains and book a date

    Day 1

    Read the official AWS exam guide and the four domains with their weights. Book a provisional date now: a fixed date converts open-ended study into a plan and is the strongest predictor of actually sitting. Note that Design for New Solutions and Design Solutions for Organizational Complexity are the two heaviest domains and together carry over half the marks, so plan the deepest study there. Be honest about the jump from associate: this exam reads long multi-constraint scenarios, not single-service questions.

  2. Build the estate-scale decision trees

    Week 1

    Before drilling any domain, build the cross-service decision trees the whole exam rests on: connectivity (Transit Gateway versus PrivateLink versus peering), governance (SCP versus permission boundary versus Config), safe deployment (CloudFormation change set), event-driven auto-remediation (EventBridge to Systems Manager Automation), migration tooling (MGN versus DMS and the 7Rs), and the disaster-recovery ladder against RTO and RPO. Use the recall prompts in this guide: cover the answer, choose the design from the constraint, then reveal. If you cannot pick from the requirement alone, you do not own it yet.

  3. Go deep on organisational complexity (Domain 1)

    Weeks 1 to 2

    This is where associate intuition runs out, so it gets early, heavy time. Drill multi-account connectivity, the SCP-as-preventative-guardrail reflex, AWS Control Tower landing zones, centralised security and logging, AWS Backup across accounts and Regions, and cost-allocation tagging with Organizations tag policies. Practise on full scenario questions and read the worked explanation on every one, including the ones you got right, naming the constraint that picked the answer and the reason each distractor fails.

  4. Master new-solution design (Domain 2)

    Weeks 2 to 3

    The largest domain by weight, so it earns the most practice. Fix safe deployment with CloudFormation change sets, layered least-privilege access (interface endpoint security group plus VPC endpoint policy), managed and serverless scaling for spiky load, multi-AZ and multi-Region resilience with Route 53 routing, and edge attack mitigation with WAF and Shield alongside threat detection with GuardDuty. Drill the multiple-response questions until you reliably pick both correct halves, because a half-right pairing scores nothing.

  5. Lock continuous improvement and migration (Domains 3 and 4)

    Weeks 3 to 4

    Treat the event-driven auto-remediation pattern (finding to EventBridge to Systems Manager Automation) as the spine of Domain 3, alongside CloudWatch anomaly detection, Secrets Manager rotation including custom rotation Lambdas, and right-sizing with Compute Optimizer. For Domain 4, memorise the 7Rs, the MGN-versus-DMS split, DataSync versus Snow Family by volume, and EventBridge as the cross-account SaaS event router. Do the trickiest discriminations by hand until the constraint alone decides them.

  6. Drill weak domains, then space the review

    Week 5

    Use your per-domain accuracy to attack the domains dragging you down, not to re-read what you already know. Then space it: revisit each domain's recall prompts after a few days and again a week later. Spacing roughly doubles what sticks compared with cramming, and on a professional exam with this much surface area it is the cheapest gain available before the exam.

  7. Sit a timed mock and calibrate

    Weeks 5 to 6

    Take at least one full timed mock under exam conditions to rehearse pacing across long stems and the flag-and-return habit, because reading fatigue is real over a professional-length paper. Treat the score as a per-domain readiness signal, not a single number, and review every missed question, naming the binding constraint you misread, before you confirm your date or sit.

Know when you're ready

Readiness for the AWS Certified Solutions Architect - Professional is a measured score on long, multi-constraint scenario questions you have not seen before, not a feeling that the services are familiar. Those are different things, and on a professional exam the gap between them is wide. Re-reading documentation and watching service walkthroughs builds fluency, and fluency feels like knowledge, so confidence rises while real recall under competing constraints does not. The fix is to test yourself: if you can read a dense fresh scenario, name the binding constraint among several, pick the best architecture, and explain why each other option fails, you know it; if you can only nod along to an explanation, you do not yet.

Be especially wary of associate-level confidence carrying over. Knowing what Transit Gateway, PrivateLink, AWS Organizations and the migration services each do is the easy half; choosing the one design that satisfies cost, resilience scope, operational overhead and a compliance control together, when three options each work in isolation, is the half this exam actually tests. The multiple-response questions raise the bar further, because a half-correct pairing scores nothing. Trust your measured per-domain accuracy over your gut, and set the bar at clearing every domain comfortably on unseen questions across more than one session, not scraping a single pass.

This guide gives you the map at estate scale. The practice bank is where you find out whether you can navigate it, with a worked explanation and a reason every distractor is wrong on every question. Readiness scoring tells you when you are there. Not before.

Exam-day tips

  • Read the long stem for its binding constraint before judging any option. Professional scenarios name several limits at once (cost, RTO and RPO, operational overhead, no application change, a compliance control); the one that breaks the tie is what picks the answer, and the rest is noise.
  • When two designs both work, default to the managed, lowest-overhead one. AWS prefers managed and native services, so Transit Gateway over EC2 routers, change sets over rollback triggers, MGN over manual image export; reach for the manual option only when the scenario names a reason such as an engine to preserve or a provider with no managed rotation.
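  • Treat a service control policy as the answer for any control that must hold against an account administrator. An SCP caps the permissions available in every member account it is attached to, so it binds every principal there, including the account root user, preventatively (the organisation's management account is the one exception). A permission boundary can be removed by an administrator, and a Config rule or Security Hub finding only detects after the fact, so offering one of those is the trap.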
  • On multiple-response questions, both halves must be right or you score nothing. These often pair a network-layer control with a service-layer control, or a remediation action with a preventative guardrail; pick the complementary pair, not two answers that do the same job.
  • Match the migration tooling to what is moving. AWS Application Migration Service replicates whole servers for low-downtime rehosting, AWS Database Migration Service moves database rows with full-load-plus-change-data-capture for near-zero downtime, and the 7Rs label (repurchase, replatform, rehost) classifies the workload before any move.
  • Match the disaster-recovery tier to the stated RTO and RPO. Near-zero targets point to multi-site active-active and targets of a few minutes to warm standby; relaxed targets allow pilot light or backup and restore for less cost, and AWS Backup handles policy-driven cross-account and cross-Region recovery.
  • Reach for the event-driven auto-remediation chain when a fix must run with no human in the loop and no standing servers. A finding from Inspector, Security Hub or CloudWatch routed through EventBridge to a Systems Manager Automation runbook fixes and logs the issue; an SNS email or a polling Lambda is the weaker distractor.
  • Pace for reading fatigue and flag aggressively. The stems are long, so cover every question once and bank the clear marks before you sink time into a hard one; flag-and-return protects the marks you actually know across a professional-length paper.
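
The disaster-recovery tip above can be rehearsed as a ladder keyed on the tighter of RTO and RPO. This is only a sketch: the minute thresholds are illustrative assumptions chosen to separate the four tiers, not AWS-published cut-offs, and a real scenario would also weigh cost and operational overhead.

```python
def dr_tier(rto_minutes: float, rpo_minutes: float) -> str:
    """Map recovery targets to a DR tier using illustrative thresholds."""
    tightest = min(rto_minutes, rpo_minutes)
    if tightest <= 1:       # assumption: "near-zero" means about a minute or less
        return "multi-site active-active"
    if tightest <= 10:      # assumption: a few minutes
        return "warm standby"
    if tightest <= 60:      # assumption: tens of minutes
        return "pilot light"
    return "backup and restore"  # hours are acceptable


for rto, rpo in [(0.5, 0.5), (5, 5), (30, 30), (480, 1440)]:
    print(f"RTO {rto} min, RPO {rpo} min -> {dr_tier(rto, rpo)}")
```

The habit worth building is the direction of travel: each step down the ladder trades a longer recovery for lower standing cost, so pick the cheapest tier that still meets both stated targets.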

Frequently asked questions

Is the AWS Certified Solutions Architect - Professional hard?

Yes, it is one of the harder AWS exams. The difficulty is sustained judgement across long, multi-constraint scenarios rather than recall. Three options usually work in isolation and only one satisfies every stated limit (cost, resilience scope, operational overhead, no application change, a compliance control) at once. Scenario practice with worked explanations matters far more than memorising what each service does.

How is SAP-C02 different from the SAA-C03 associate exam?

The associate tests choosing the right service for a single workload; the professional tests assembling and improving a whole estate. Expect multi-account organisations, hybrid and migration estates, multi-Region resilience, and explicit trade-offs between cost, resilience and agility. The stems are longer, several questions are multiple-response, and the binding constraint is buried in more detail.

How long should I study for the SAP-C02?

Most candidates who already hold the associate and have real multi-account experience are ready in five to eight weeks of steady study. With less hands-on exposure to AWS Organizations, hybrid networking and migration tooling, plan more time on the two heaviest domains, Design for New Solutions and Design Solutions for Organizational Complexity.

Do I need to hold the associate certification first?

No, there is no enforced prerequisite, but the professional assumes you are fluent in the associate material. If multi-account governance, hybrid connectivity, IAM and federation, and the disaster-recovery tiers are not already comfortable, build that foundation first rather than learning it inside professional-level scenarios.

Which domains should I focus on?

Design for New Solutions and Design Solutions for Organizational Complexity are the two heaviest domains and together carry over half the marks, so they deserve the most time. Continuous Improvement and Accelerate Workload Migration and Modernization are smaller but reward clean patterns: the event-driven auto-remediation chain and the 7Rs plus the MGN-versus-DMS split respectively.

How should I handle the multiple-response questions?

Treat them as two decisions, not one. They often pair a network-layer control with a service-layer control, or an automatic remediation with a preventative guardrail, so both selected answers must be correct and complementary. A half-right pairing scores nothing, so confirm each chosen option independently against the requirement before you commit.

What is the difference between AWS Transit Gateway, AWS PrivateLink and VPC peering on this exam?

Transit Gateway is a regional hub giving transitive any-to-any routing and central route control for many VPCs and accounts; PrivateLink exposes one service privately to many consumer accounts, each approved individually, with no route propagation; VPC peering is non-transitive and one-to-one, fitting only a small fixed set of links. Scale and whether you are connecting networks or publishing a single service decide which one wins.
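
The scale argument can be made concrete with a quick count. The sketch below assumes every VPC must reach every other VPC within a single Region and that each VPC needs exactly one Transit Gateway attachment; the function names are hypothetical.

```python
def full_mesh_peerings(vpcs: int) -> int:
    """Peering connections for any-to-any reachability (peering is non-transitive)."""
    return vpcs * (vpcs - 1) // 2


def transit_gateway_attachments(vpcs: int) -> int:
    """One attachment per VPC to a single regional hub."""
    return vpcs


for n in (5, 20, 50):
    print(n, full_mesh_peerings(n), transit_gateway_attachments(n))
# 5 10 5
# 20 190 20
# 50 1225 50
```

Peering grows quadratically while hub attachments grow linearly, which is why a stem that mentions dozens of VPCs or accounts is usually steering towards Transit Gateway.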

How many practice questions should I do before booking?

Enough that every domain clears comfortably on questions you have not seen, and a full timed mock feels comfortable on pacing across long stems. Quality of review beats raw volume: on every question, read the explanation and name the binding constraint that picked the answer, including on the ones you got right and on every multiple-response pairing.

Is the AWS Solutions Architect Professional certification worth it?

SAP-C02 is worth it for experienced AWS architects who want a professional-level credential that goes beyond associate knowledge into multi-account governance, complex migration patterns, and cross-service trade-off reasoning. The exam is genuinely difficult and the preparation process deepens understanding in ways that affect real design decisions, not just exam scores. It is best approached after holding SAA-C03 and accumulating hands-on experience across several AWS domains rather than sitting it as a first certification.

Examworthy is not affiliated with or endorsed by Amazon Web Services. This guide is original study material based on the public exam blueprint. We never reproduce live exam items. SAP-C02 and related marks belong to their respective owners.