Michael Haberman

This talk was presented as part of the API The Docs Virtual 2020 event series on 10 June. We are glad to present the video recording, slide deck, and talk summary below. Enjoy!


Michael's slides

Microservices are small: each one has only a few responsibilities. The more microservices you use, the more complex the system becomes.

The microservice journey starts simple

A few services communicate with each other and their databases
Running locally is still simple
The whole architecture is transparent
Onboarding new developers is straightforward
The communication is usually synchronous HTTP

As you extend your architecture with more features, you end up with more microservices, and the communication among them gets complicated.

Problems and solutions

Hard to see the whole picture → create a descriptive architecture document (but it easily becomes outdated)
Hard to know the payload/data structure between services → you can use OpenAPI Specification (OAS) documentation to describe the API usage, but when it is written by hand, you have to ask how accurate it is
Running it locally is a challenge → using cloud services works quite well to a certain extent
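
One way to keep hand-written API documentation honest is to compare real payloads against it. The sketch below assumes a hypothetical `ORDER_SCHEMA` fragment shaped like an OpenAPI 3 schema object and a hypothetical helper `schema_violations`; it checks only required fields and primitive types, so treat it as an illustration rather than a full validator.

```python
from typing import Any, Dict, List

# Hypothetical schema fragment, shaped like an OpenAPI 3 schema object.
ORDER_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "required": ["order_id", "amount"],
    "properties": {
        "order_id": {"type": "string"},
        "amount": {"type": "number"},
        "note": {"type": "string"},
    },
}

# Maps the primitive type names used above to Python types.
_PY_TYPES = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


def schema_violations(payload: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return readable differences between a payload and the documented schema."""
    problems: List[str] = []
    for name in schema.get("required", []):
        if name not in payload:
            problems.append(f"missing required field: {name}")
    for name, value in payload.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            problems.append(f"undocumented field: {name}")
            continue
        expected = spec["type"]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_stray_bool = isinstance(value, bool) and expected != "boolean"
        if is_stray_bool or not isinstance(value, _PY_TYPES[expected]):
            problems.append(f"wrong type for {name}: expected {expected}")
    return problems
```

Running a check like this against payloads captured in a test environment shows where the documentation and the actual traffic have drifted apart.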

HTTP vs. async communication (Kafka)

HTTP works well while the entire environment is up and running, but you can lose data when some services become unavailable.

HTTP alone is not enough, because it is synchronous and does not scale well enough
You need a solution that makes your application as operational as possible without losing any data when one of the components of your architecture isn’t working
Solution: moving to async communication (e.g., Pub/Sub, Distributed Queue, Kafka)
There are many tools available, so you have to pick the one that fits your needs
With async communication tools, services no longer communicate with each other directly; they communicate through the tool instead (e.g., Kafka)
Service A sends data to Kafka → Kafka persists the data and ensures it won’t get lost
Service B will pull the data from Kafka when it’s available
This makes the architecture somewhat more complicated, but in exchange the data is persisted
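
The flow above can be sketched without a running broker. The `InMemoryLog` class below is a hypothetical toy stand-in for a tool such as Kafka, not its client API: it keeps an append-only list per topic and an offset per consumer, which is enough to show why a message published while service B is down is not lost.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


class InMemoryLog:
    """Toy stand-in for a broker: one append-only log per topic.

    Illustration only. A real broker persists to disk and replicates,
    which this class does not.
    """

    def __init__(self) -> None:
        self._topics: Dict[str, List[Any]] = defaultdict(list)
        # (consumer, topic) -> index of the next message to deliver
        self._offsets: Dict[Tuple[str, str], int] = defaultdict(int)

    def publish(self, topic: str, message: Any) -> None:
        # The producer returns as soon as the message is appended;
        # it does not need any consumer to be running.
        self._topics[topic].append(message)

    def poll(self, consumer: str, topic: str) -> List[Any]:
        # Each consumer resumes from its own offset, so messages published
        # while it was unavailable are still delivered.
        start = self._offsets[(consumer, topic)]
        messages = self._topics[topic][start:]
        self._offsets[(consumer, topic)] = start + len(messages)
        return messages


broker = InMemoryLog()
# Service A publishes while service B is unavailable.
broker.publish("orders", {"order_id": "a1"})
broker.publish("orders", {"order_id": "a2"})
# Service B comes back later and pulls everything it missed.
received = broker.poll("service-b", "orders")
```

With synchronous HTTP, the two calls from service A would have failed while service B was down. Here they succeed, and service B catches up when it polls.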

Downsides of async

Swagger/OAS can’t be used because it describes HTTP communication
You need to find other solutions to visualize communication → there are no mature, community-adopted solutions yet
Running services locally is complicated and requires a creative solution (e.g., running Kafka locally)
Breaking API changes: handling them depends on the working environment → Docker Compose and other contained environments that you can control
How to debug a service which gets data from another source? (can’t send an API request via Postman)
The whole picture got way more complicated as you start using more microservices and async communication
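
For the debugging question above, one possible approach is to keep the message handler free of broker code, so that you can call it directly with a hand-built message. The names `handle_order_created` and `SHIPMENTS` are hypothetical, and the sketch assumes JSON-encoded messages.

```python
import json
from typing import Dict, List

# Hypothetical in-memory store standing in for the service's database.
SHIPMENTS: List[Dict[str, str]] = []


def handle_order_created(raw_message: bytes) -> Dict[str, str]:
    """Business logic of the consumer, with no broker-specific code."""
    event = json.loads(raw_message)
    shipment = {"order_id": event["order_id"], "status": "scheduled"}
    SHIPMENTS.append(shipment)
    return shipment


# In production, a consumer loop would pass each broker message to the handler.
# While debugging, the same handler can be fed a hand-built message instead,
# which plays the role a Postman request plays for an HTTP endpoint.
sample = json.dumps({"order_id": "a1"}).encode("utf-8")
result = handle_order_created(sample)
```

This does not replace testing against a real broker, but it gives you a quick way to step through the service with a debugger.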

Tools to handle complexity

Distributed tracing – the ability to trace how microservices communicate with one another.

OpenTelemetry

An open-source tool that lets you instrument distributed traces in microservices
From within the code, it records what data comes into a service and what data goes out of it
Everything is being sent to a central location
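
To show the idea behind this kind of instrumentation, here is a hand-rolled sketch that uses only the Python standard library. It is not the OpenTelemetry API. The `start_span` helper and the `COLLECTOR` list are hypothetical, and the header layout is modelled on the W3C Trace Context `traceparent` format (version, trace ID, span ID, flags).

```python
import secrets
from typing import Dict, List, Optional

# Stand-in for the central location that receives every span.
COLLECTOR: List[Dict[str, Optional[str]]] = []


def start_span(service: str, operation: str, headers: Dict[str, str]) -> Dict[str, str]:
    """Record a span and return headers to attach to outgoing calls or messages."""
    incoming = headers.get("traceparent")
    if incoming:
        # Continue the caller's trace: keep its trace ID, use its span as parent.
        _version, trace_id, parent_id, _flags = incoming.split("-")
    else:
        # No context came in, so this service starts a new trace.
        trace_id, parent_id = secrets.token_hex(16), None
    span_id = secrets.token_hex(8)
    COLLECTOR.append({
        "trace_id": trace_id,
        "span_id": span_id,
        "parent_id": parent_id,
        "service": service,
        "operation": operation,
    })
    return {"traceparent": f"00-{trace_id}-{span_id}-01"}


# Service A handles an HTTP request and publishes a message carrying the context.
outgoing = start_span("service-a", "POST /orders", {})
# Service B reads the message headers and continues the same trace.
start_span("service-b", "consume orders", outgoing)
```

Because both spans share one trace ID and the second names the first as its parent, a tracing UI can draw the path from service A to service B, even when the hop between them is a message rather than an HTTP call.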

Distributed tracing tools: Jaeger UI and Zipkin

For a developer, the ability to inspect and understand data and visualize the flow is very important.

Distributed tracing tools can help

Visualizing the flow
Debugging (trace ID)
Understanding the big picture
Understanding the outcome for a particular endpoint
Understanding what components are involved
Inspecting the relation between the components
Monitoring what actions were done during the API call
Writing logs

If you don’t know how a line of code was executed or which endpoint triggered it, you can grab the trace ID, paste it into Jaeger UI or Zipkin, and search for it. The entire flow across microservices then becomes visible → seeing the bigger picture has a direct impact on reducing the number of bugs in production.
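
Grabbing the trace ID is easiest when every log line already carries it. The following sketch uses Python's standard `logging` and `contextvars` modules; the logger name `orders`, the `TraceIdFilter` class, and the sample trace ID are assumptions for illustration, and the output is captured in memory only to keep the example self-contained.

```python
import contextvars
import io
import logging

# Holds the trace ID of the request or message currently being processed.
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copy the current trace ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = current_trace_id.get()
        return True


log_output = io.StringIO()  # captured in memory here; normally a file or stdout
handler = logging.StreamHandler(log_output)
handler.setFormatter(
    logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
)
handler.addFilter(TraceIdFilter())

logger = logging.getLogger("orders")  # hypothetical service logger
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

# Set once per incoming request or message, from the propagated trace context.
current_trace_id.set("4bf92f3577b34da6a3ce929d0e0e4736")
logger.info("order a1 stored")
```

Each log line then contains a `trace_id=...` field whose value you can copy into the search box of the tracing tool.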

Distributed tracing tools won’t help you with:

Local development (which is a problem in this field)
Deciding which test you need to write (but it's worth the investment)
Reproducing an issue (it provides the flow but not the actual data to reproduce a specific flow)

Takeaways

Microservices are complicated by definition, and they get more complicated the more of them you use
The role of developers is to anticipate the complexity, be ready for it, and bring the right tools to overcome the issues
Be aware that the available solutions are mostly specific to a programming language
Use HTTP where you have to, but use asynchronous communication wherever you can (better and safer, but it makes development a bit more complicated)


Michael Haberman - Hundreds of Microservices without breaking your APIs

Michael's presentation