A simple introduction to Step Functions

Let’s cut to the chase and not waste your scarce time.

Quick (I promise) intro

Step Functions is a managed service by AWS that implements the Finite State Machine (FSM) model.

You coordinate multiple AWS services into serverless workflows so you can build and update apps quickly. Using Step Functions, you can design and run workflows that stitch together services such as AWS Lambda and Amazon ECS into feature-rich applications.

An FSM, according to Wikipedia: “A finite-state machine (FSM) or finite-state automaton (FSA, plural: automata), finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number of states at any given time.”

By example, we learn

Have you noticed that intersection traffic lights never turn green simultaneously for crossing directions? Not even once in a million times?

It never bugs out! How come?

Thank the FSM gods next time you drive through one safely.

The FSM model ensures that, before a traffic light goes green for one direction, all the others have already turned red. :traffic_light:

It does that by managing states and transitions. Mark those two words. They’re pivotal.

Right, let’s clarify. :bulb:

For one light to transition into the green state, all other lights must have transitioned to the red state before. Simple, right?

FSM is a robust model for that type of scenario: multiple states with transitional rules.
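
To make states and transitions concrete, here is a minimal Python sketch of the intersection. It is not tied to any library: the `Phase` names, the two-direction layout and the `run` helper are all assumptions made for illustration. The trick is that each state describes the whole intersection, and no "both green" state exists.

```python
from enum import Enum


class Phase(Enum):
    # Value = (north-south light, east-west light). Hypothetical layout:
    # note that there is simply no state where both directions are green.
    NS_GREEN = ("green", "red")
    NS_YELLOW = ("yellow", "red")
    EW_GREEN = ("red", "green")
    EW_YELLOW = ("red", "yellow")


# Transition rules: each state has exactly one allowed successor.
TRANSITIONS = {
    Phase.NS_GREEN: Phase.NS_YELLOW,
    Phase.NS_YELLOW: Phase.EW_GREEN,
    Phase.EW_GREEN: Phase.EW_YELLOW,
    Phase.EW_YELLOW: Phase.NS_GREEN,
}


def next_phase(current: Phase) -> Phase:
    """Return the only state the machine may move to from `current`."""
    return TRANSITIONS[current]


def run(start: Phase, steps: int) -> list[Phase]:
    """Return the start phase followed by `steps` transitions."""
    phases = [start]
    for _ in range(steps):
        phases.append(next_phase(phases[-1]))
    return phases


if __name__ == "__main__":
    for phase in run(Phase.NS_GREEN, 4):
        ns, ew = phase.value
        print(f"{phase.name}: north-south={ns}, east-west={ew}")
```

Safety here comes from the design rather than from runtime checks: a crossing green cannot happen because the machine has no such state to be in.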

It’s kind of a design pattern. Not quite, but it’s not too much of a stretch.

Wanna see more?

Ever wondered how vending machines work? Yup, FSM is there too!

It guarantees you won’t get a snack until you feed it at least those 3 bucks. Ouch, expensive!

FSM will also manage the transition to delivering the snack after payment, and also secure your change if any is owed.
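
Here is a rough Python sketch of that vending machine. The class, the state names and the price in cents are hypothetical; real machines have more states (sold out, refund, and so on).

```python
PRICE_CENTS = 300  # the "3 bucks" from the text


class VendingMachine:
    """Two-state sketch: WAITING_FOR_PAYMENT -> PAID -> WAITING_FOR_PAYMENT."""

    def __init__(self, price_cents: int = PRICE_CENTS) -> None:
        self.price_cents = price_cents
        self.state = "WAITING_FOR_PAYMENT"
        self.balance_cents = 0

    def insert(self, cents: int) -> None:
        """Add money; move to PAID once the price is covered."""
        self.balance_cents += cents
        if self.balance_cents >= self.price_cents:
            self.state = "PAID"

    def dispense(self) -> int:
        """Deliver the snack and return the change owed, in cents."""
        if self.state != "PAID":
            # The transition to "snack delivered" only exists from PAID.
            raise RuntimeError("cannot dispense before full payment")
        change = self.balance_cents - self.price_cents
        self.balance_cents = 0
        self.state = "WAITING_FOR_PAYMENT"
        return change


if __name__ == "__main__":
    machine = VendingMachine()
    machine.insert(200)
    machine.insert(200)
    print("change owed:", machine.dispense())
```

Calling `dispense()` after inserting only 2 bucks raises an error, which is the "no snack until you pay" rule expressed as a missing transition.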

Yes, sometimes vending machines fail, but it’s not due to the FSM model.

FSM is a mature and proven model that can be trusted. Implementing it correctly isn’t hard. When done right, rest assured it is going to fulfill its promises.

FSM implementations

Because FSM is a mature, battle-tested model, there are implementations in many programming languages, such as Python, JavaScript, Java, etc.

I’m not recommending any particular library, just giving examples. Do your research.

Or… even better… wait for it:

Wouldn’t there be a way to use FSM without having to implement anything programming-wise?

YES! Enter AWS Step Functions.

Step Functions: a managed FSM service

Following FSM concepts, AWS Step Functions also has states and transitions.

Tasks are also part of the package, but more on that later.

Let’s skip straight to the point.

We’ll start with an example and go from there.

Consider a huge number of entries in a database. They all need to move to another storage location.

There are too many entries to move in a single shot, so we’ll need to loop.

[Figure: Step Functions implementation, example diagram, adapted from AWS Docs]

This outlines how Step Functions would handle that data migration process for us.

The initial state seeds the data from the source database. The machine then goes through a loop, reading the next entry and sending it to the other location, until no entries are left and it finally succeeds.

It won’t magically guess what we need. An FSM is coded in Step Functions using the Amazon States Language. It’s just a JSON representation of your States, Transitions and Tasks.

Here’s the JSON snippet corresponding to the diagram above.
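
As a rough idea of what such a definition can look like, the Python sketch below builds a small Amazon States Language document and prints it as JSON. This is not the exact AWS sample: the state names, the Lambda ARNs and the `$.remaining` counter are assumptions made up for illustration.

```python
import json

# Hypothetical Lambda ARNs; replace with your own functions.
SEED_ARN = "arn:aws:lambda:us-east-1:123456789012:function:SeedEntries"
READ_ARN = "arn:aws:lambda:us-east-1:123456789012:function:ReadNextEntry"
SEND_ARN = "arn:aws:lambda:us-east-1:123456789012:function:SendEntry"

definition = {
    "Comment": "Sketch: move entries one at a time until none are left",
    "StartAt": "SeedEntries",
    "States": {
        "SeedEntries": {"Type": "Task", "Resource": SEED_ARN, "Next": "AnyEntriesLeft"},
        # Choice state = the loop condition. Assumes each Task returns
        # a JSON document with a numeric "remaining" field.
        "AnyEntriesLeft": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.remaining", "NumericGreaterThan": 0, "Next": "ReadNextEntry"}
            ],
            "Default": "Done",
        },
        "ReadNextEntry": {"Type": "Task", "Resource": READ_ARN, "Next": "SendEntry"},
        "SendEntry": {"Type": "Task", "Resource": SEND_ARN, "Next": "AnyEntriesLeft"},
        "Done": {"Type": "Succeed"},
    },
}


def transition_targets(state: dict) -> list[str]:
    """Collect every state name this state can transition to."""
    targets = [state[key] for key in ("Next", "Default") if key in state]
    targets += [choice["Next"] for choice in state.get("Choices", [])]
    return targets


if __name__ == "__main__":
    print(json.dumps(definition, indent=2))
```

The `States` keys are the states, and `Next`, `Choices` and `Default` are the transitions. The `transition_targets` helper is just a convenience for checking that every transition points at a state that exists.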

Now, Tasks. A Task is what does the actual work when the machine enters a state.

In our example, when a DB entry comes out of the loop, it has to go to the new storage location. A Task would involve, in simple terms:

  1. Getting the original data
  2. Formatting it for the new location standards
  3. Establishing a connection with the target database
  4. Inserting the data
  5. Waiting for confirmation
  6. Returning a 200 | Success response from the transfer request

In the AWS example, this Task is accomplished by a Lambda function.
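
The six steps above could be sketched as a Python Lambda handler like the one below. Only the `lambda_handler(event, context)` signature is standard; the event shape, the `format_for_target` helper and the in-memory `TARGET_STORE` standing in for the target database are assumptions for illustration.

```python
# Hypothetical stand-in for the target database, so the sketch runs anywhere.
TARGET_STORE: dict[str, dict] = {}


def format_for_target(entry: dict) -> dict:
    """Step 2: reshape the entry to the (assumed) target schema."""
    return {"id": str(entry["id"]), "payload": entry.get("data")}


def lambda_handler(event, context):
    # 1. Get the original data (assumed to arrive under the "entry" key).
    entry = event["entry"]

    # 2. Format it for the new location's standards.
    record = format_for_target(entry)

    # 3. Establish a connection with the target database.
    #    A real handler would open a client here; we reuse the dict.
    store = TARGET_STORE

    # 4. Insert the data.
    store[record["id"]] = record

    # 5. Wait for confirmation (here: read the record back).
    if store.get(record["id"]) != record:
        # Raising makes the Task fail, so the state machine can react.
        raise RuntimeError(f"insert not confirmed for entry {record['id']}")

    # 6. Return a 200 | Success response.
    return {"statusCode": 200, "status": "Success", "id": record["id"]}


if __name__ == "__main__":
    print(lambda_handler({"entry": {"id": 7, "data": "hello"}}, None))
```

Whatever the handler returns becomes the Task's output, which the next state receives as its input.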

Advantages of Step Functions

Auto-retry failures

If the migration of a given entry fails for any reason, Step Functions can automatically retry it for us. We could even have it notify us in case of errors.

By tracking each entry’s state and whether a retry is needed, it helps make sure an entry is not duplicated in the target storage.

Once successfully migrated, an entry cannot possibly go back to “pending migration” state. That’s guaranteed by the FSM model.
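
Retries are declared on the Task itself, using the `Retry` and `Catch` fields of the Amazon States Language. The sketch below shows one possible configuration as a Python dict. The ARN and the `AnyEntriesLeft` and `NotifyFailure` state names are hypothetical, and the `retry_delays` helper only illustrates how the wait times grow; it is not part of any AWS SDK.

```python
send_entry_state = {
    "Type": "Task",
    # Hypothetical Lambda ARN.
    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:SendEntry",
    "Retry": [
        {
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 2,   # wait before the first retry
            "MaxAttempts": 3,       # number of retries
            "BackoffRate": 2.0,     # multiplier applied after each retry
        }
    ],
    # If retries are exhausted, route to a (hypothetical) notification state.
    "Catch": [
        {"ErrorEquals": ["States.ALL"], "ResultPath": "$.error", "Next": "NotifyFailure"}
    ],
    "Next": "AnyEntriesLeft",
}


def retry_delays(retrier: dict) -> list[float]:
    """Seconds waited before each retry, using the spec's default values."""
    interval = retrier.get("IntervalSeconds", 1)
    rate = retrier.get("BackoffRate", 2.0)
    attempts = retrier.get("MaxAttempts", 3)
    return [interval * rate ** n for n in range(attempts)]


if __name__ == "__main__":
    print(retry_delays(send_entry_state["Retry"][0]))
```

With these numbers the waits are 2, 4 and 8 seconds. Only after the third retry fails does the `Catch` rule take over and move the machine to the notification state.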

FSM Benefits

Everything beautiful about the FSM model comes bundled by default:

  • Maturity and trustworthiness: resilient and fault-tolerant
  • Flexibility
  • Quickly move from an abstract, conceptual process to code and execution
  • Little processing overhead

Adapted from: elprocus.com


As we discussed previously, we can avoid the hassle of implementing an FSM library in our own environment.

AWS manages everything. It makes sure servers are provisioned to run our state machines and scales the infrastructure when needed. Up and down. Horizontally. In the fourth dimension. Kidding. :smile: :blush:

No, but really, scalability is difficult to get right, so this is a big plus.


Step Functions will seamlessly integrate with various other AWS services.

If AWS is your cloud partner, you may be able to immediately start coordinating your current resources in an FSM fashion. Just like that. :zap:

Stay tuned

I won’t even ask. You’re smart enough to see the value.

So stay tuned. We’re preparing a hands-on, techy article showing how to actually put all of these concepts into practice, for real-world use cases.