This monitor alerts you when one of your Step Functions state machines experiences a timeout.
Dashbird continuously monitors and analyses your serverless applications to ensure reliability, cost and performance optimisation, and alignment with the AWS Well-Architected Framework.
Severity: CRITICAL
Interval: 30 minutes
Time slot: 60 minutes
Threshold: 1 or more
Metrics: METRICS.STEPFUNCTIONS.TIMEOUTS
State: States.Timeout
One of your state machines timed out because one of its tasks ran longer than the TimeoutSeconds value or failed to send a heartbeat for a period longer than the HeartbeatSeconds value.
The TimeoutSeconds and HeartbeatSeconds configuration parameters let your state machine detect a task that is taking too long, or one that has failed silently and will never report back.
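As an illustration of where these two parameters live, the following minimal Python sketch builds an Amazon States Language definition with one Task state that sets both values and catches the States.Timeout error. The helper name build_task_state, the activity ARN, and the state names are hypothetical and chosen only for this example; the timeout values are placeholders, not recommendations.

```python
import json


def build_task_state(resource_arn, timeout_seconds, heartbeat_seconds,
                     next_state, fallback_state):
    """Return a Task state (as a dict) with explicit timeout settings.

    Hypothetical helper for illustration; not part of any AWS SDK.
    """
    # The States Language requires the heartbeat interval to be
    # smaller than the overall task timeout.
    if heartbeat_seconds >= timeout_seconds:
        raise ValueError("HeartbeatSeconds must be smaller than TimeoutSeconds")
    return {
        "Type": "Task",
        "Resource": resource_arn,
        "TimeoutSeconds": timeout_seconds,
        "HeartbeatSeconds": heartbeat_seconds,
        # Route timeouts to a dedicated state instead of failing silently.
        "Catch": [{"ErrorEquals": ["States.Timeout"], "Next": fallback_state}],
        "Next": next_state,
    }


# Assumed example values: a made-up activity ARN and state names.
definition = {
    "StartAt": "ProcessOrder",
    "States": {
        "ProcessOrder": build_task_state(
            "arn:aws:states:us-east-1:123456789012:activity:process-order",
            timeout_seconds=300,
            heartbeat_seconds=60,
            next_state="Done",
            fallback_state="HandleTimeout",
        ),
        "Done": {"Type": "Succeed"},
        "HandleTimeout": {
            "Type": "Fail",
            "Error": "TaskTimedOut",
            "Cause": "ProcessOrder exceeded its timeout or missed a heartbeat",
        },
    },
}

# Serialise to the JSON document that Step Functions expects.
definition_json = json.dumps(definition, indent=2)
print(definition_json)
```

Heartbeats only apply to tasks that report progress themselves, such as activities or callback tasks, which is why the sketch uses an activity ARN rather than a Lambda function.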
Check that TimeoutSeconds and HeartbeatSeconds are set to sensible values for the work each task performs. If they are, investigate why your tasks still exceed these timeouts.
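A first pass at this check can be automated. The sketch below is one possible approach, not a Dashbird feature: it scans the top-level states of a state machine definition and reports Task states that have no explicit timeout or whose heartbeat interval is not smaller than the timeout. The function name find_timeout_issues and the sample definition are hypothetical.

```python
def find_timeout_issues(definition):
    """Return (state_name, problem) tuples for top-level Task states.

    Hypothetical helper: it does not descend into Parallel or Map
    states, and it cannot judge whether a value suits the workload.
    """
    issues = []
    for name, state in definition.get("States", {}).items():
        if state.get("Type") != "Task":
            continue
        timeout = state.get("TimeoutSeconds")
        heartbeat = state.get("HeartbeatSeconds")
        # A timeout may also be supplied dynamically via TimeoutSecondsPath.
        if timeout is None and "TimeoutSecondsPath" not in state:
            issues.append((name, "no explicit TimeoutSeconds"))
        # The heartbeat interval must be smaller than the task timeout.
        if timeout is not None and heartbeat is not None and heartbeat >= timeout:
            issues.append((name, "HeartbeatSeconds is not smaller than TimeoutSeconds"))
    return issues


# Made-up definition with two deliberately misconfigured states.
sample_definition = {
    "StartAt": "Fetch",
    "States": {
        "Fetch": {"Type": "Task", "Resource": "arn:example:fetch", "Next": "Transform"},
        "Transform": {
            "Type": "Task",
            "Resource": "arn:example:transform",
            "TimeoutSeconds": 30,
            "HeartbeatSeconds": 45,
            "Next": "Store",
        },
        "Store": {
            "Type": "Task",
            "Resource": "arn:example:store",
            "TimeoutSeconds": 120,
            "End": True,
        },
    },
}

for state_name, problem in find_timeout_issues(sample_definition):
    print(f"{state_name}: {problem}")
```

If the configuration passes a check like this and timeouts still occur, the cause is more likely in the task itself, for example a slow downstream dependency or a worker that stops sending heartbeats.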
This rule resolution is part of the Dashbird Serverless Well Architected Reports tool for AWS. Dashbird features a collection of rules and checks continuously applied to your infrastructure, surfacing ways to improve it.
Catch errors and detect anomalies for AWS Step Functions and learn the best practice rules for Step Functions.
Dashbird is a monitoring, debugging and intelligence platform designed to help serverless developers build, operate, improve, and scale their modern cloud applications on AWS securely and with ease.