An order-processing web tier writes directly to a fleet of EC2 worker instances over HTTP. During flash sales the workers are overwhelmed and requests are dropped, but at night the workers sit idle. The team wants to absorb traffic spikes, let the workers pull work at their own pace, and stop losing orders, with the least operational effort. Which change best meets these requirements?
- A. Place an Application Load Balancer in front of the worker fleet and enable connection draining so that surplus order requests queue at the load balancer until a worker becomes available.
- B. Send each order to an Amazon SQS standard queue and have the worker instances poll the queue, so messages persist until a worker is free to process them. (Correct)
- C. Publish each order to an Amazon SNS topic and subscribe every worker instance so that all workers receive the same order and the fastest worker processes it first.
- D. Route every order through an Amazon EventBridge bus with a rule that invokes the worker fleet directly, relying on EventBridge to retain orders the workers cannot yet accept.
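Why A is wrong: An ALB distributes synchronous requests but does not durably buffer them; when no healthy target can respond, the requests time out or fail, so orders are still lost during a spike. Connection draining only lets in-flight requests finish on a target that is being deregistered; it does not queue new ones.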
Why B is correct: An SQS queue durably buffers messages and lets consumers poll at their own rate, smoothing spikes and preventing dropped orders with minimal operational effort.
Why C is wrong: SNS pushes a copy to every subscriber, so all workers would process the same order, and it does not buffer messages for slow consumers to pull later.
Why D is wrong: EventBridge routes and filters events to targets but is built for push-style delivery, not for letting a worker pool pull buffered work at its own pace.
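
The pattern in option B can be sketched roughly as follows. This is a minimal illustration, not a production design: the function names `enqueue_order` and `poll_once`, the `handle_order` callback, and the JSON order shape are all hypothetical. The `sqs` argument is assumed to be a boto3 SQS client (for example `boto3.client("sqs")`), and only its `send_message`, `receive_message`, and `delete_message` calls are used.

```python
import json


def enqueue_order(sqs, queue_url, order):
    """Web tier: hand the order to the queue instead of calling a worker over HTTP."""
    resp = sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps(order))
    return resp["MessageId"]


def poll_once(sqs, queue_url, handle_order, max_messages=10, wait_seconds=20):
    """Worker: pull one batch, process each order, delete it only after success."""
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS returns at most 10 messages per call
        WaitTimeSeconds=wait_seconds,      # long polling; 20 seconds is the maximum
    )
    processed = 0
    # The "Messages" key is absent when the queue has nothing to deliver.
    for msg in resp.get("Messages", []):
        handle_order(json.loads(msg["Body"]))
        # If handle_order raises, the message is not deleted and becomes
        # visible again after the queue's visibility timeout.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
        processed += 1
    return processed
```

A worker would typically call `poll_once` in a loop, so it pulls the next batch only when it has finished the previous one. Because a standard queue delivers at least once, `handle_order` should tolerate seeing the same order twice.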