Microservices are a hot topic in software engineering, with many startups building their entire business around this architectural pattern. I recently wrote about how to build high-performing microservices using gRPC and .NET 6.0. You can read it here!
I wrote this article because I was recently asked to write a blog post on the topic.
When I started writing up my ideas, I knew that to be of any use the post would need as much detail as possible. After finishing the first draft, I realised that some specifics were missing.
There is more information about building high-performance microservices at https://orzare.com/.
Specifically, I wanted to talk about how to handle logging in gRPC. If you have never used gRPC before, here is a quick overview of what it is and why we would want to use it.
Here are the points discussed:
1. How should errors be handled in gRPC?
There are many different ways you can choose to handle errors in a gRPC service.
You can log all of your errors on the server. Relying on this alone is generally not recommended, as it does not scale well and makes debugging much harder: if you log every error, you will end up with hundreds or thousands of log lines for your service. It would be far better to send just enough information to the user's console for them to see what went wrong.
You can allow your users to set up webhooks, HTTP callbacks that let them perform actions whenever something goes wrong. You can also run a timer that checks at regular intervals which errors have occurred.
You can send a JSON payload to the client containing the errors and messages returned by the service. Along with it, you can send back any extra information such as exception details, metadata or a log of the request.
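As a rough illustration of the last option, here is a minimal sketch in Python (standard library only) of turning an exception into a small JSON error body. The numeric values in `StatusCode` are the standard gRPC status codes, but the exception mapping, the `error_payload` helper and the `request_id` field are hypothetical names chosen for this example, not part of any gRPC library.

```python
import json
from enum import IntEnum


class StatusCode(IntEnum):
    """A subset of the standard gRPC status codes and their numeric values."""
    OK = 0
    INVALID_ARGUMENT = 3
    NOT_FOUND = 5
    INTERNAL = 13


# Hypothetical mapping from application exceptions to status codes.
EXCEPTION_TO_STATUS = {
    ValueError: StatusCode.INVALID_ARGUMENT,
    KeyError: StatusCode.NOT_FOUND,
}


def error_payload(exc: Exception, request_id: str) -> str:
    """Build a JSON error body with just enough detail for the caller."""
    code = EXCEPTION_TO_STATUS.get(type(exc), StatusCode.INTERNAL)
    # Only expose the message for errors the caller can act on;
    # details of unexpected failures stay in the server log.
    message = str(exc) if code is not StatusCode.INTERNAL else "internal error"
    return json.dumps({
        "code": code.name,
        "message": message,
        "metadata": {"request_id": request_id},
    })
```

The idea is that the client sees a short, actionable message, while anything unexpected is reduced to a generic error and can be looked up in the server logs by its request ID.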
2. How should you implement Hijacking?
gRPC passes requests and responses between client and server over the network, so the client does not have to talk directly to the machine where the service is deployed. When a request comes in, you could redirect it to a different machine and the client would never know the difference. This is called hijacking.
If you implement your service this way, your server code is never tied to a specific IP address or hostname.
You can upgrade and restart your servers as often as you like with little or no disruption to your users, as long as requests are redirected to a machine that is still running. If you take this approach, make sure to use secure connections (TLS) between each of the machines along the way.
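To make the redirection idea concrete, here is a small Python sketch of a routing table that maps a logical service name to whichever backend is currently available. The `Router` class, its method names and the addresses used with it are all hypothetical. A real deployment would normally rely on a proxy or load balancer for this rather than hand-written code.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Maps a logical service name to a backend that is currently up."""
    backends: dict[str, list[str]] = field(default_factory=dict)
    down: set[str] = field(default_factory=set)

    def register(self, service: str, address: str) -> None:
        self.backends.setdefault(service, []).append(address)

    def mark_down(self, address: str) -> None:
        # Called before a machine is upgraded or restarted.
        self.down.add(address)

    def mark_up(self, address: str) -> None:
        self.down.discard(address)

    def resolve(self, service: str) -> str:
        """Return the first registered backend that is not marked down."""
        for address in self.backends.get(service, []):
            if address not in self.down:
                return address
        raise LookupError(f"no healthy backend for {service!r}")
```

Because clients only ever ask for the logical name (for example `"orders"`), a machine can be marked down, restarted and marked up again while requests are resolved to another address in the meantime.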
3. How can we monitor our gRPC services?
Once you deploy your service, you will need a way to monitor it. Some examples of monitoring tools include Prometheus, Apache Metrics, Vibrato and Datadog. These tools collect metrics from the service over the network and store them in a database or search engine.
You can then set up alerts that fire when a metric gets too high. For example, if your CPU stays at 90% for 10 seconds, it may be an indication that your machine is being flooded with requests and that you need to either scale up or scale out.
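The "90% for 10 seconds" rule can be sketched in a few lines of Python. This is only an illustration of the logic, and the `SustainedThresholdAlert` class and its field names are made up for this example. In practice you would express the same rule in your monitoring tool's own alerting configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SustainedThresholdAlert:
    threshold: float = 90.0   # percent CPU
    duration: float = 10.0    # seconds the value must stay at or above threshold
    _breach_started: Optional[float] = None

    def observe(self, timestamp: float, value: float) -> bool:
        """Record one sample; return True when the alert should fire."""
        if value < self.threshold:
            # The value dropped back below the threshold, so reset the clock.
            self._breach_started = None
            return False
        if self._breach_started is None:
            self._breach_started = timestamp
        return timestamp - self._breach_started >= self.duration
```

Requiring the value to stay high for a period, rather than alerting on a single sample, avoids firing on short spikes that resolve by themselves.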
To feed these metrics into an existing system such as Prometheus, you can use exporters. Two of the most commonly used exporters are the one from Microsoft and the one from gRPC. You can even get an exporter for the .NET collector if you don’t want to use Prometheus.
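For context, an exporter's job is essentially to publish metrics as plain text that Prometheus can scrape over HTTP. The Python sketch below renders one metric in the Prometheus text exposition format. The `render_metric` helper and the metric name `demo_grpc_requests_total` are hypothetical, and label-value escaping is left out to keep the example short, so treat this as an illustration of the format rather than a replacement for a real exporter.

```python
def render_metric(name: str, help_text: str, metric_type: str,
                  samples: list[tuple[dict[str, str], float]]) -> str:
    """Render one metric family in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        # Sort labels so the output is stable from one scrape to the next.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        suffix = f"{{{label_str}}}" if label_str else ""
        lines.append(f"{name}{suffix} {value}")
    return "\n".join(lines) + "\n"
```

Each sample becomes one line consisting of the metric name, an optional set of labels in braces and the current value.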
4. How should your gRPC service scale?
Many people think that because their microservices are stateless and all of their state is stored in a database, they can have many machines all serving the same content without any issues.
However, this is not true. If you have ever worked with a distributed system, you will know that running multiple copies of your service on multiple machines causes problems of its own.
If your application is stateless, you can load balance traffic across multiple machines. You will then need a way to tell the machines apart and to work out which one has the latest information.
Some ways to solve this problem are using a GUID, using MS clustering on Windows with file-level locking, or using Memcached or Redis. You can also do IP-based checks to make sure that you have the right IP address for each of your services.
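Here is a small Python sketch that combines two of these ideas: each replica gets a GUID so responses can be traced back to the machine that served them, and the state lives in a shared store with a version number so every replica sees the latest value. `SharedStore` is an in-memory stand-in for something like Redis or Memcached, and all class and function names here are hypothetical.

```python
import itertools
import uuid


class SharedStore:
    """In-memory stand-in for an external store such as Redis or Memcached."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[int, str]] = {}

    def write(self, key: str, value: str) -> int:
        # Bump the version on every write so readers can tell what is newest.
        version = self._data.get(key, (0, ""))[0] + 1
        self._data[key] = (version, value)
        return version

    def read(self, key: str) -> tuple[int, str]:
        return self._data[key]


class Instance:
    """One stateless replica of the service."""

    def __init__(self, store: SharedStore) -> None:
        self.instance_id = str(uuid.uuid4())  # GUID that tells replicas apart
        self.store = store

    def handle(self, key: str) -> dict:
        version, value = self.store.read(key)
        return {"served_by": self.instance_id, "version": version, "value": value}


def round_robin(instances):
    """Cycle through the replicas in order, one request at a time."""
    return itertools.cycle(instances)
```

Since no replica keeps its own copy of the data, it does not matter which one the load balancer picks: every request reads the current version from the shared store.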