From Hoverfly to Lyft Envoy, there are a number of tools available for troubleshooting and debugging microservices. But even with tools in hand, debugging microservices can be a challenge. With so many layers of potential abstraction and complexity, developers have to dig deep into their logs, dependencies, and reporting. Microservices architecture gets only more complicated as it scales, and many administrators and developers may find themselves struggling to manage and maintain a system that has outgrown them.
Still, debugging microservices is essential. How can developers make their jobs easier? What are the best practices and easiest methods of testing microservices—and ensuring that the issues are found?
How to test microservices
With any element of architectural complexity, it’s important that testing is conducted in a structured, refined, and optimized way. Manually testing microservices happens in a step-by-step process:
- Validating each code branch
- Digging into the latest code available
- Making sure dependencies have been updated
- Verifying database validity
- Restarting the service
Automated testing can also be used. Unit testing, contract testing, integration testing, end-to-end testing, and UI functional testing are all parts of microservice testing; which one applies depends on which area of the microservice is being tested and how the microservice is malfunctioning.
- Unit testing. This ensures that the microservice itself is operating correctly: given the correct parameters, the microservice provides the right information. It is usually done before deploying the microservice and tells the developer whether the microservice is performing its own work consistently. If an issue is traced directly back to the microservice, the microservice will need to undergo unit testing again until the issue can be replicated and resolved. Docker, Kubernetes, and other container tooling can help by reducing the scale of the environment needed to run these tests.
- Contract testing. Contract testing tests the communication layer between the microservice and anything that the microservice is communicating with. The goal is to validate that data isn’t being corrupted or altered as it is being transmitted. This is generally done when the microservice is being added to the organization’s existing infrastructure, but may need to be repeated if the microservice has been updated, or the systems that it’s integrated with have been updated.
- Integration testing. Microservices are often tied into other, third-party solutions, and it becomes necessary to test to make sure that these other third-party solutions are operating the right way along with the microservice. Integration testing should be performed in full not only when the microservice is introduced but also whenever the solutions are patched or updated.
- End-to-end testing. This tests the entire system, from the microservice to the end of the chain. It can involve many different services communicating with each other, but it’s the best way to ensure that data maintains its fidelity throughout and that the correct actions are taken. Because it is so involved, it takes more time, but it is often what developers need to resort to if they don’t have accurate logs. Without specific logs, most tests become end-to-end tests because there’s no way of knowing where the issue occurred.
Testing microservices that are already experiencing issues usually starts with tracking the issue down, which is where logging and tracing come in. While microservices are tested before they go into operation, they also need to be tested afterward, often within a live environment. Much of this can be done manually or through a debugger, depending on DevOps processes.
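As a rough illustration of the contract testing described in the list above, the sketch below checks that a provider's response still carries the fields and types a consumer relies on. The `ORDER_CONTRACT` schema, its field names, and the `check_contract` helper are hypothetical and chosen only for illustration; real projects often use a dedicated contract testing tool instead.

```python
# Hypothetical contract: the fields (and types) a consumer expects in an
# "order" response. These names are assumptions made for this example.
ORDER_CONTRACT = {
    "order_id": str,
    "amount_cents": int,
    "currency": str,
}


def check_contract(payload: dict, contract: dict) -> list:
    """Return a list of contract violations; an empty list means the payload conforms."""
    violations = []
    for field, expected_type in contract.items():
        if field not in payload:
            violations.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            actual = type(payload[field]).__name__
            violations.append(
                f"wrong type for {field}: expected {expected_type.__name__}, got {actual}"
            )
    return violations


# A conforming response, and one where the provider changed a field's type
# and dropped another field.
good = {"order_id": "A-1", "amount_cents": 1250, "currency": "USD"}
bad = {"order_id": "A-1", "amount_cents": "12.50"}

print(check_contract(good, ORDER_CONTRACT))
print(check_contract(bad, ORDER_CONTRACT))
```

Running a check like this whenever either side is updated matches the point above that contract tests may need to be repeated after the microservice or its integrations change.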
Best practices for debugging microservices
In development, best practices are the perfect way to start. While they don’t always have to be followed, they provide the easiest and most expedient methods for product debugging. Here are some of the best practices for debugging microservices in any environment, whether using open source or commercial tools.
1. Make Sure Your Logs are Searchable
Debugging microservices requires high-fidelity logs. Make sure that the logs are searchable and that you can change your logging level at will. Being able to change the logging level means that you can drill down to issues much faster, rather than having to search through the entire system. Because the system is not a monolith, a request can be difficult to trace across services without this.
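One way to get searchable logs with an adjustable level is to emit one JSON object per line and change the logger's level at runtime. The sketch below uses only Python's standard `logging` module; the `JsonFormatter` class, the `orders` service name, and the log messages are assumptions made for illustration.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each record as one JSON object per line so logs can be searched by field."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "service": record.name,
            "message": record.getMessage(),
        })


# Write to an in-memory buffer here; a real service would use a file or stdout.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("orders")  # hypothetical service name
logger.addHandler(handler)
logger.propagate = False
logger.setLevel(logging.INFO)

logger.debug("cache lookup for order A-1")  # dropped: below INFO
logger.info("order A-1 accepted")

# Raise verbosity at runtime while investigating an issue.
logger.setLevel(logging.DEBUG)
logger.debug("cache lookup for order A-2")  # now recorded

# Because every line is JSON, searching is a matter of filtering on fields.
entries = [json.loads(line) for line in buffer.getvalue().splitlines()]
debug_entries = [e for e in entries if e["level"] == "DEBUG"]
print(debug_entries)
```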
2. Return Transactional References Back to the Client
Returning references to the client makes it much easier to track and debug issues as they occur. One of the major problems with microservices architecture is that it’s extremely challenging to track issues at a higher level. By returning references regularly, you leave a trail of breadcrumbs. Distributed systems are notoriously difficult to track; this makes it easier.
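A minimal sketch of this idea follows: the handler reuses a reference supplied by the caller, or creates one, and includes it in every response, including error responses. The `handle_request` function, the `X-Request-ID` header name (a common convention, not a requirement), and the response shape are assumptions for illustration.

```python
import uuid


def handle_request(headers: dict, body: dict) -> dict:
    """Hypothetical handler: reuse the caller's reference if present, else create one,
    and always return it so the client can quote it when reporting a problem."""
    reference = headers.get("X-Request-ID") or str(uuid.uuid4())
    try:
        if "item" not in body:
            raise ValueError("item is required")
        result = {"status": 200, "data": {"item": body["item"]}}
    except ValueError as exc:
        result = {"status": 400, "error": str(exc)}
    # In a real service the same reference would also be written to every
    # log line produced while handling this request.
    result["reference"] = reference
    return result


ok = handle_request({"X-Request-ID": "req-123"}, {"item": "book"})
failed = handle_request({}, {})

print(ok)
print(failed["status"], failed["reference"])
```

When a client reports the reference from a failed call, searching the logs for that value leads to the relevant entries across services.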
3. Invest in Setting Up a Logging Framework
Since logging is the most important component of debugging microservices, you may want to invest in a logging framework. A logging framework gives your logs the structure they need, so that the logs are easy to read and logging stays fast. Consider Node-Loggly, NLog, or Log4j if you need to improve your logging and stack trace tooling.
4. Consider Monitoring Tools
Monitoring tools can automate the process of debugging and make it easier for you to identify potential issues, especially performance issues that might otherwise cause problems without outright failing. Monitoring tools can cut down on a significant amount of debugging time by at least drilling down to the service that is responsible. In software development, more automation is always better.
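To show the kind of check a monitoring tool automates, the sketch below flags services whose average latency exceeds a threshold even though none of them has failed outright. The sample data, the service names, and the `slow_services` helper are invented for illustration and do not reflect any particular monitoring product.

```python
import statistics

# Hypothetical latency samples in milliseconds, keyed by service name.
samples = {
    "auth": [12, 15, 11, 14, 13],
    "inventory": [40, 42, 390, 41, 405],
    "billing": [25, 27, 26, 24, 28],
}


def slow_services(latencies: dict, threshold_ms: float) -> list:
    """Return names of services whose mean latency exceeds the threshold, slowest first."""
    means = {name: statistics.mean(values) for name, values in latencies.items()}
    flagged = [name for name, mean in means.items() if mean > threshold_ms]
    return sorted(flagged, key=lambda name: means[name], reverse=True)


print(slow_services(samples, threshold_ms=100))
```

A mean is used here only to keep the example short; percentile-based thresholds are a common alternative when occasional spikes matter more than the average.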
At its core, microservices debugging is about logs and reporting. Whether you upload your logs into a database or use a log framework, you should start keeping more detailed, comprehensive logs. These aren’t the only best practices you should follow, but they are the key ones. The goal is to ensure that you have the information you need to trace issues back to the source, and that getting that data isn’t a processing or performance burden that could ultimately tax the system.
Challenges & changes with microservices
Foremost, the challenge with microservices involves sheer volume. With an ever-increasing number of microservices available, it can be very difficult for anyone to manage and monitor their systems. This is where tools and automation come in: Without tools and automation, it becomes very difficult for any developer or development team to track down issues.
But there are other issues, too:
- Dependencies. Systems often depend on each other, and these increasing integrations can cause services to operate in unpredictable ways. Everything has to be working correctly in a single ecosystem, and when an integration isn’t working properly, it can be difficult to determine which service is failing. Any time there is an update, things can break, and they can break in potentially unpredictable ways.
- Logging distribution. When logging is distributed, it can be hard for developers to even know where to begin debugging a certain problem. With inconsistent logging or logging that’s in multiple locations, developers have to hunt bugs down throughout the entire environment. Greater consistency in logging is essential to systems that are constantly growing in complexity. Heightened observability, real-time monitoring, and tracing standards such as OpenTracing will help.
- Lack of familiarity. Many developers are only now working with microservices on a large scale, and this unfamiliarity with the environment can make it much harder for them to troubleshoot their systems. As developers become more familiar with microservices, this will become less of an issue. Developers may want to go to seminars and learn more about microservices if they don’t feel confident with them.
Much of this simply has to do with the increased complexity of the system. Preparation is always key. The more preparation developers and development teams do to ensure that they are able to properly manage their microservices, the better—and the more they do to improve their log data, the more they’ll thank themselves later.
Logging and crash reporting
It cannot be emphasized enough that logging and crash reporting are the most important parts of troubleshooting and debugging microservices. Obfuscation and complexity are the key enemies when debugging a microservice architecture, so the best way to defeat them is to ensure that logs are robust, high performing, and able to be filtered and searched for information.
Many solutions provide better logging and crash reporting. Some save logs to databases so that developers can easily pull up the information. Others monitor services for erratic behavior and report back to the developer, sometimes even before a crash or other issue has occurred. Either way, developers need to understand their tools and work well with them.
A frequent mistake with logging is to take very robust logs but not have a way to filter or search through them. In this way, the logs themselves become a barrier to understanding what’s going on. On the other end of the spectrum, logs may be very easy to search through, but may not be useful because they aren’t logging the most important events.
A log framework gives a developer a starting point with their logging, so they don’t have to try to develop their log structure on their own. However, developers will often need to tailor their logs at some point to give them the most relevant data, as every architecture is going to be unique, and the data that’s needed will vary between systems and environments.
When troubleshooting and debugging microservices, developers should consider whether they’re getting the information they need from their existing log files, or whether they feel that their log files need to be enhanced. GitHub has a number of solutions, depending on whether someone is looking for a Java, Visual Studio, or other environment loggers. The debugging process may vary depending on programming language and service.
Distributed tracing makes it possible to trace requests through services, making it easier to identify the locations of issues as they occur. Tracing has a fairly narrow band: not only does the service have to be identified, but also the time at which the incident took place. Much like space and time coordinates are necessary to capture a moving body, multiple coordinates are needed to capture an issue, especially if that issue is intermittent by nature. Distributed tracing allows for end-to-end visibility and ultimately makes it far easier for developers to trace issues in their system.
Tracing microservices is naturally complex because the issues in the system aren’t always noticed where they originate. Because errors can be passed through a multitude of systems, they can also be divorced from where they first started. And because of that, tracing microservices can take an exceptional amount of time, especially if they pass through services that aren’t owned by the organization. But distributed tracing is a scalable and powerful method of tracing throughout a system, to yield better, faster, and more consistent results.
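The sketch below illustrates why a trace helps when an error surfaces far from where it began. Each span records which service did the work, when it started, and which span called it; walking those links finds the failing span that has no failing children. The `Span` fields, the `error_origin` helper, and the sample trace are simplified assumptions for illustration, not the data model of any specific tracing system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    service: str
    start_ms: int
    error: bool = False


def error_origin(spans: list, trace_id: str) -> Optional[Span]:
    """Return the failing span with no failing children: where the error most likely began."""
    in_trace = [s for s in spans if s.trace_id == trace_id]
    failing = [s for s in in_trace if s.error]
    parents_of_failures = {s.parent_id for s in failing}
    leaves = [s for s in failing if s.span_id not in parents_of_failures]
    # If several branches failed, report the earliest one.
    return min(leaves, key=lambda s: s.start_ms, default=None)


# Hypothetical trace: the gateway reports the failure, but it began in "inventory".
spans = [
    Span("t1", "a", None, "gateway", 0, error=True),
    Span("t1", "b", "a", "orders", 5, error=True),
    Span("t1", "c", "b", "inventory", 9, error=True),
    Span("t1", "d", "b", "billing", 12),
    Span("t2", "e", None, "gateway", 20),
]

origin = error_origin(spans, "t1")
print(origin.service, origin.start_ms)
```

The result carries both coordinates mentioned earlier: the service involved and the time at which the incident began.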
Debugging microservices is always likely to be a challenge for many developers, simply because of the inherent complexity in the task. As companies grow, they are going to use more microservices than ever before, and the number of microservices an entity uses is likely to only increase and rarely decrease. But by following best practices, developers can continue to manage these microservices effectively—even as they scale.
The right processes, procedures, and technology are, in fact, essential for organizations that are going to be scaling their architecture and shifting their needs. For developers who are already struggling with debugging their microservices architecture, investing in additional tools, seminars, and training can help.