Step Function State Machine has timed out

This monitor is in place to inform you in case your Step Function experiences a timeout.

Dashbird continuously monitors and analyses your serverless applications to ensure reliability, cost and performance optimisation and alignment with the Well Architected Framework.


Severity: CRITICAL
Interval: 30 minutes
Time slot: 60 minutes
Threshold: 1 or more

Metrics:
METRICS.STEPFUNCTIONS.TIMEOUTS
State: States.Timeout

Why do I see this?

One of your state machines timed out because one of its tasks ran longer than the TimeoutSeconds value or failed to send a heartbeat for a period longer than the HeartbeatSeconds value.

What does this mean?

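The TimeoutSeconds and HeartbeatSeconds configuration parameters let your state machine detect a task that is taking too long, or that has failed and will never return. TimeoutSeconds limits the total running time of a task, while HeartbeatSeconds limits the time allowed between two heartbeats sent by the task.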
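As an illustration, the sketch below builds a minimal state machine definition with both parameters set on a single Task state. The state name, the activity ARN and the numeric values are placeholders chosen for this example, not recommendations.

```python
import json

# Hypothetical single-task state machine. The state name "ProcessOrder",
# the activity ARN and the numbers below are placeholders.
definition = {
    "StartAt": "ProcessOrder",
    "States": {
        "ProcessOrder": {
            "Type": "Task",
            "Resource": "arn:aws:states:us-east-1:123456789012:activity:ExampleActivity",
            # Fail with States.Timeout if the task runs longer than 5 minutes.
            "TimeoutSeconds": 300,
            # Fail with States.Timeout if no heartbeat arrives for 60 seconds.
            "HeartbeatSeconds": 60,
            "End": True,
        }
    },
}

# Serialise to the JSON form that Step Functions expects.
print(json.dumps(definition, indent=2))
```

With these values, a worker that stops sending heartbeats is detected after about one minute instead of only after the full five-minute timeout.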

How do I fix "Step Function State Machine has timed out"?

Check that you have set sensible values for TimeoutSeconds and HeartbeatSeconds. If you have, investigate why your tasks still exceed these timeouts.
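One way to start that check is to scan the state machine definition for suspicious settings. The sketch below is a hypothetical helper, not part of Dashbird or the AWS SDK. It assumes the definition is already loaded as a Python dict, and it flags Task states that have no TimeoutSeconds or whose HeartbeatSeconds is not smaller than TimeoutSeconds (which, as far as we understand the Amazon States Language, it is required to be).

```python
from typing import Any, Dict, List


def find_timeout_issues(definition: Dict[str, Any], path: str = "") -> List[str]:
    """Return human-readable warnings about Task timeout settings.

    Hypothetical helper for illustration; it only inspects static values.
    """
    issues: List[str] = []
    for name, state in definition.get("States", {}).items():
        full_name = f"{path}{name}"
        if state.get("Type") == "Task":
            timeout = state.get("TimeoutSeconds")
            heartbeat = state.get("HeartbeatSeconds")
            # A timeout supplied dynamically via TimeoutSecondsPath is not flagged.
            if timeout is None and "TimeoutSecondsPath" not in state:
                issues.append(f"{full_name}: no TimeoutSeconds set")
            if timeout is not None and heartbeat is not None and heartbeat >= timeout:
                issues.append(
                    f"{full_name}: HeartbeatSeconds ({heartbeat}) is not "
                    f"smaller than TimeoutSeconds ({timeout})"
                )
        # Recurse into the branches of Parallel states.
        for index, branch in enumerate(state.get("Branches", [])):
            issues.extend(find_timeout_issues(branch, f"{full_name}/branch{index}/"))
        # Recurse into the nested workflow of Map states.
        for key in ("Iterator", "ItemProcessor"):
            if key in state:
                issues.extend(find_timeout_issues(state[key], f"{full_name}/{key}/"))
    return issues


# Example definition with placeholder names and ARNs.
example = {
    "StartAt": "A",
    "States": {
        "A": {"Type": "Task", "Resource": "arn:aws:lambda:placeholder", "Next": "B"},
        "B": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:placeholder",
            "TimeoutSeconds": 30,
            "HeartbeatSeconds": 60,
            "End": True,
        },
    },
}

for issue in find_timeout_issues(example):
    print(issue)
```

A clean result only means the configuration is plausible. If the values look right and the timeouts persist, the next place to look is the task itself, for example its logs and duration metrics.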


This rule resolution is part of the Dashbird Serverless Well Architected Reports tool for AWS. Dashbird features a collection of rules and checks continuously applied to your infrastructure, surfacing ways to improve it.

Catch errors and detect anomalies in AWS Step Functions, and learn the best-practice rules for Step Functions.
