In this article, we’ll take you through each stage of migrating to serverless – from preparation through migration to post-transition – and what to bear in mind along the way.
The cloud spectrum
To understand the wider context of migrating legacy systems to serverless, we should first understand the cloud spectrum. This spectrum ranges from on-premises workloads to virtual machines, containers, and cloud functions. Serverless typically falls within the cloud functions area as function as a service (FaaS), but it has since become an umbrella term that is growing to include backend as a service (BaaS), such as fully managed databases.
The first thing to do when looking at legacy transitions is to understand where you are on the cloud spectrum.
Despite serverless being only four or five years old, we are entering a cycle where even serverless apps are becoming legacy systems. Anybody who writes Node.js knows that if you make no updates for two years, you’re going to have dependencies breaking all over the place.
Your serverless experience
The next question to ask is: does your team already have serverless experience?
There are two different routes here:
- Yes – We already have serverless and cloud experience. In this case, identify the key team members who can drive and evangelize the serverless migration, including training, pattern development, and so on. The engineering hours involved in the transition can all be streamlined by having the patterns, documentation, and training in place.
- No – If you don’t have that serverless or cloud experience internally, you would benefit from finding a consulting partner that specializes in serverless migrations or adoption (such as Serverless Guru or Theodo) to retrain and retool existing employees and help them grow.
The serverless acceleration team
Following this, you would need to develop the serverless acceleration team: a working group that helps accelerate the rest of your organization and focuses on:
- building reusable infrastructure as code
- establishing practices for building serverless apps
- defining processes around the development workflow.
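To make “reusable infrastructure as code” concrete, here is a minimal sketch of what such a shared pattern could look like: a helper that stamps out Serverless Framework-style function definitions with organization-wide defaults. The naming and tagging conventions below are hypothetical examples, not a prescribed standard.

```javascript
// Sketch of a reusable infrastructure-as-code helper. Defaults, names,
// and tags are invented examples of team-wide conventions.
function standardFunction(service, name, overrides = {}) {
  return {
    // One predictable name per service/function pair
    name: `${service}-${name}`,
    handler: `src/${name}.handler`,
    // Organization-wide defaults every team inherits
    memorySize: 256,
    timeout: 10,
    tags: { service, managedBy: "serverless-acceleration-team" },
    // Teams can still override defaults where a service needs it
    ...overrides,
  };
}

// Example: two functions that share the same conventions
const createUser = standardFunction("users", "createUser");
const listFeeds = standardFunction("feeds", "listFeeds", { timeout: 30 });
```

A helper like this is what turns the acceleration team’s decisions into something other teams can consume without re-learning them.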
Drawing service boundaries
Next, you need to identify a common service or use case. For that, you can ask the following question: what service represents 80% to 90% of how other services are built in the legacy system?
Let’s take a monolithic API as an example. If we have 100 endpoints, and 10 of them relate to accounts, 10 to users, and 10 to feeds, we can draw three service boundaries there. We may well find that any one of those 100 endpoints looks much the same as the rest. With these service boundaries, we can start chunking the migration up into pieces. That makes it easier to migrate!
If we develop a pattern for one service boundary composed of 10 endpoints, and there are nine more services that each have 10 endpoints, we know that once we build one, we can reuse it across the rest.
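The boundary-drawing exercise above can be started mechanically: group the monolith’s routes by their first path segment and count what falls into each bucket. The endpoint list below is an invented example.

```javascript
// Rough sketch of auditing a monolithic API for service boundaries:
// group endpoints by their first path segment.
function groupByBoundary(endpoints) {
  const boundaries = {};
  for (const path of endpoints) {
    const segment = path.split("/").filter(Boolean)[0];
    (boundaries[segment] = boundaries[segment] || []).push(path);
  }
  return boundaries;
}

// Invented sample of a monolith's route table
const endpoints = [
  "/accounts/create", "/accounts/close",
  "/users/create", "/users/profile", "/users/delete",
  "/feeds/latest",
];
const boundaries = groupByBoundary(endpoints);
// boundaries now maps accounts, users, and feeds to their endpoints –
// three candidate service boundaries to migrate one at a time
```

In practice you would refine this with domain knowledge, but even a crude grouping like this gives the acceleration team a migration checklist.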
Documenting the migration process
To ensure that the migration is not done in isolation by the group responsible for the transition, document the entire service migration process and the nuances of how the teams build services. Use that knowledge to improve training material, then use the service migration to demonstrate the benefits and win further leadership buy-in.
Training and retooling
Once 80% to 90% of the services get migrated, we need to look at training and retooling. The rest of the developers on the team will need to go through training to be effective in this new environment. The training can start simply with tool training: infrastructure-as-code frameworks, CI/CD pipelines, and other serverless tools, Dashbird being one of them. Beyond tooling, we familiarize developers with common commands and demonstrate the service development workflow, testing, and monitoring.
For pattern training, developers can start by learning from high-level, ‘hello world’ type applications – for example, building an AppSync API, a GraphQL API, or a REST API with AWS serverless.
But that doesn’t go deep enough. When we’re migrating a legacy system to serverless, we must go as deep as necessary to actually make the application work. As it’s not a completely greenfield application, we have to map existing behavior over. That’s where a line-by-line review comes in: it presents how the pattern is built with the infrastructure as code plus the other components, and explains why things were built in a specific way.
At this point, it’s important to establish a standard for the team to follow in terms of naming conventions, project organization, and mono- versus multi-repository setups, so that developers can replicate the pattern themselves.
Creating a new VPC environment from the ground up, and perhaps even the automation around it, might take 40 to 50 hours the first time you do it. But afterward, if you want to modify it or spin up new VPCs, it only takes as long as the CI/CD pipeline takes to run. With those templates in place, you save nearly the entire 40 to 50 hours each time.
Templating is a very important piece of serverless.
Serverless Guru has created VPC templates that save the roughly 50 hours of initial work required to set up the environments yourself.
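To illustrate what templating buys you, here is a heavily simplified sketch of a VPC template expressed as a CloudFormation-style object built in code. Real templates (like the ones mentioned above) also cover routing tables, NAT gateways, and endpoints; the resource names and CIDR ranges here are invented examples.

```javascript
// Simplified sketch of a reusable VPC template as a CloudFormation-style
// object. Resource names and CIDR blocks are invented examples.
function vpcTemplate(cidr = "10.0.0.0/16") {
  return {
    AWSTemplateFormatVersion: "2010-09-09",
    Resources: {
      Vpc: {
        Type: "AWS::EC2::VPC",
        Properties: { CidrBlock: cidr, EnableDnsSupport: true },
      },
      PrivateSubnetA: {
        Type: "AWS::EC2::Subnet",
        Properties: { VpcId: { Ref: "Vpc" }, CidrBlock: "10.0.1.0/24" },
      },
      PrivateSubnetB: {
        Type: "AWS::EC2::Subnet",
        Properties: { VpcId: { Ref: "Vpc" }, CidrBlock: "10.0.2.0/24" },
      },
    },
  };
}

// Spinning up another VPC is now a function call plus a CI/CD run,
// not 40 to 50 hours of manual setup
const template = vpcTemplate();
```

This is the payoff of the initial investment: the hard-won decisions are frozen into the template, and every subsequent environment inherits them for free.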
Creating self-sufficient teams
The migration approach we laid out is fairly slow. Why propose this slower approach, focusing on assessments, documentation, and training? So that, once these are established, we can move quickly: the development teams will be able to pick up services in parallel and migrate them.
Although the developers got the basic pattern training, the details now start to surface, such as, ‘How do I connect this queue to this Lambda to this DynamoDB stream, make sure it’s efficient and performant, and understand all the different layers of that?’ That part cannot be covered by the basic training. We do our best to create general training that maps out 80% to 90% of a service, but the remaining 10% can be quite a large list. During this process, the development team will hit barriers and roadblocks, so the serverless acceleration team will have to provide ongoing support and identify areas for improvement.
For edge case pattern development and training: as the developers work through migrations, they identify opportunities such as, ‘When we use this SQS FIFO queue, it’s not processing fast enough, because with how we set it up it only processes one message at a time.’ The developers can then decide to create a pattern around how to use SQS FIFO properly, so that production issues causing delays don’t happen later on.
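The SQS FIFO example above comes down to message grouping: messages sharing a `MessageGroupId` are delivered strictly in order, one at a time, so putting everything in a single group serializes the whole queue. A sketch of the resulting pattern, keying the group by a logical entity so ordering holds per entity while different entities process in parallel (the `orderId` field here is a hypothetical example):

```javascript
// Sketch of an SQS FIFO pattern: messages in the same MessageGroupId
// are processed serially, so keying the group by a logical entity
// (a hypothetical orderId) preserves per-order ordering while letting
// different orders be processed in parallel.
function toFifoEntry(message) {
  return {
    Id: message.id,
    MessageBody: JSON.stringify(message),
    // One group per order: ordered within an order, parallel across orders
    MessageGroupId: `order-${message.orderId}`,
    // Dedup by message id (or enable content-based deduplication on the queue)
    MessageDeduplicationId: message.id,
  };
}

const entries = [
  { id: "m1", orderId: "42", step: "created" },
  { id: "m2", orderId: "42", step: "paid" },
  { id: "m3", orderId: "99", step: "created" },
].map(toFifoEntry);
// m1 and m2 stay ordered (same group); m3 can be processed in parallel
```

Capturing a fix like this as a shared pattern is exactly the kind of edge-case knowledge the acceleration team should harvest from the migrating developers.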
These patterns can be shared with the entire development team, ensuring optimal knowledge reuse, and can even grow into an internal serverless best practices playbook. Once the developers start writing these practices down and documenting them, the playbook can be used for onboarding new developers and as an auditing tool.
Upon successfully completing the migration from legacy to serverless, we have a working service. However, we don’t consider the project done until all the developers know what they’re doing. Even for a consulting company such as Serverless Guru, the transition process should always be about bringing the entire team along. Ultimately, we want the developers at the company to build new services independently, without having to rely on the consulting company at all.
To fully internalize serverless, the team should adopt knowledge sharing and keep finding new ways to optimize. Looking at how this affects your company culture, it’s not just about the serverless migration. It’s about creating a culture where the people working at the company talk to each other, feel confident sharing ideas, talk to leadership, get feedback, and help improve the process from the inside.
What comes next? Well, we don’t stop. There’s no finish line. You may have to tell leadership there’s a finish line – a start, a middle, and an end – to get them to buy in. But ultimately, we’re trying to create a cyclical system that feeds into itself.
Developers will feed information back, the team will keep skilling up as they go, and we’re going to get faster, build better products for customers, and keep iterating on that ad infinitum. It’s never going to stop.
New companies are popping up every day, working on new tools for CI/CD or for other aspects of the development workflow, like local testing or emulating AWS services. The serverless acceleration team will be able to find those improvements, create a pattern for each, and test and experiment with it in isolation. Once we have a concrete example of something that works, we present it back to the team, run training on it, record videos about it, and make sure the team starts moving in that direction.
Importance of specialized tooling
Rather than reinventing the wheel every time, specialized tools can help streamline your processes. One example is centralizing monitoring data and making it available: giving engineers the ability to interrogate monitoring data at scale really easily, reducing the time it takes to understand different aspects and the health of the system, and getting a feel for what’s actually going on. Another example is reducing the time to discovery, which can also be abstracted and automated away. Maintaining posture and enforcing best practices should likewise be an automatic, abstracted process rather than a manual one.
Tooling can help you learn and understand your system and the way it changes over time, so you can make informed decisions. The approach we have at Dashbird is based on three core pillars that make up our platform and enable our customers to be successful:
- We provide a single pane of glass and a central store for all of your monitoring data, so there’s always one place you can go to look at anything about your system. You can build complex queries and view dashboards of microservices – from the account level through the microservice level down to a single transaction.
- We automate and abstract all of the failure, threat and risk detection across the stack. So that means analyzing logs, metrics, and then figuring out what you should be paying attention to.
- We look at everything through the Well-Architected lens: we continuously check your setup against the five pillars of the AWS Well-Architected Framework (cost optimization, performance efficiency, operational excellence, security, and reliability) and report on the current state of your application infrastructure.
At what stage should we consider operational excellence?
In our experience, the sooner the better. The challenge is that if you’ve already built everything, it takes a lot more to straighten the ship. The quicker you get feedback on your issues, learn what the best practices are, and see how you should be building and where the inefficiencies lie, the sooner you can start implementing those changes.
Ryan is the founder and CEO/CTO of Serverless Guru, cohost of the Talking Serverless podcast and Serverless Economics podcast, and author of the online course “Serverless Zero to Paid Professional.” He started his serverless journey at Nike in the innovation engineering department in late 2018, and from there he’s fully adopted the serverless lifestyle.