Hyperledger Fabric network monitoring with Prometheus and Grafana
Monitoring is an important aspect of enterprise applications. It is important for organizations to embrace new technologies and adopt new build and deployment methodologies, but proactively monitoring the resulting systems is equally important to sustain that success. Healthy systems are key to keeping a business running. Any inadvertent fault could disrupt the system, resulting in a negative user experience and jeopardizing the organization’s reputation.
Hyperledger Fabric, being an enterprise solution, also gives network operators options to monitor the deployed solution. The peer and the orderer are the key components of a Fabric blockchain network. In addition to their core services, both have a built-in HTTP server that exposes a RESTful operations API. The objective is to give operators as much information as possible to help them check health and track operational metrics. At a high level, the following capabilities are exposed:
· Log level management
o The peer and orderer expose a REST resource, /logspec, that can be used to manage the active logging specification.
· Health Check
o A REST resource, /healthz, is exposed that reports the liveness and health of the peer and orderer.
· Metrics
o The peer and orderer expose metrics, that is, a large set of vital parameters that help determine how the system is working. For instance, the peer reports ‘chaincode_execute_timeouts’, the number of times an Init or Invoke chaincode execution has timed out.
o Similarly, for the orderer, one of the metrics is ‘broadcast_processed_count’, which gives the number of transactions that have been processed.
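The resources listed above can be exercised with a few lines of Python. The following is only a minimal sketch: the base URL is a hypothetical local address (9446 is the orderer port used later in this post), plain HTTP without TLS is assumed, and the helper names are made up for illustration. The /healthz body is expected to be JSON with a "status" field, and /logspec accepts a JSON body with a "spec" field.

```python
import json
import urllib.request

# Assumption: the operations service is reachable over plain HTTP here.
# The host is hypothetical; 9446 is the orderer port used later in this post.
OPERATIONS_URL = "http://localhost:9446"


def endpoint(base_url, resource):
    """Join the operations base URL and a resource such as 'healthz'."""
    return base_url.rstrip("/") + "/" + resource.lstrip("/")


def is_healthy(healthz_body):
    """Return True when a /healthz JSON body reports status OK."""
    return json.loads(healthz_body).get("status") == "OK"


def get(base_url, resource, timeout=5):
    """GET one operations resource and return the response body as text.

    urllib raises HTTPError for non-2xx answers, so an unhealthy node
    surfaces as an exception rather than as a return value.
    """
    with urllib.request.urlopen(endpoint(base_url, resource), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def set_logspec(base_url, spec, timeout=5):
    """PUT a new logging specification (for example 'info' or 'debug')."""
    body = json.dumps({"spec": spec}).encode("utf-8")
    request = urllib.request.Request(
        endpoint(base_url, "logspec"),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status
```

With a running orderer, is_healthy(get(OPERATIONS_URL, "healthz")) would report its health, and get(OPERATIONS_URL, "logspec") would return the active logging specification.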
We shall look at Prometheus and Grafana, which can be used to monitor Hyperledger Fabric effectively. Prometheus is a widely adopted monitoring and alerting toolkit that uses a pull-based HTTP model. It scrapes and stores numeric time series data very efficiently. Grafana, on the other hand, is a visualization tool that can draw on data sources from various tools, Prometheus among them. Data collected from these sources can be queried, and various metrics can be visualized through Grafana dashboards.
In a nutshell, Prometheus scrapes the data that a system exposes, in this case a Hyperledger Fabric peer or orderer, and Grafana then visualizes those metrics by using Prometheus as a data source.
In this blog, I shall cover metrics and visualization from the orderer perspective. For details about the peer, you can refer to my book Hyperledger Fabric In-Depth. The code for this post is located at https://github.com/ashwanihlf/sample_monitoring
Below is a snapshot of the orderer service. The only noticeable thing from a monitoring perspective is the additional environment variables ORDERER_OPERATIONS_LISTENADDRESS and ORDERER_METRICS_PROVIDER.
With this configuration, the operations service starts listening on the given address and can respond to incoming requests for information. You can see that we have enabled port 9446 for the orderer. If we hit the health resource endpoint on that port, we get the information shown in the image below.
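The same port also serves the metrics that Prometheus will scrape, in the Prometheus text format. To get a feel for that format, here is a small sketch of a parser. The sample text and its label values are illustrative, not captured from a real orderer, and the parser is a simplification that covers only plain "name{labels} value" lines.

```python
import re

# One sample per line: metric name, optional {labels}, then the numeric value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

# Illustrative metrics text; channel and type values are hypothetical.
SAMPLE = """\
# HELP broadcast_processed_count The number of transactions processed.
# TYPE broadcast_processed_count counter
broadcast_processed_count{channel="mychannel",status="SUCCESS",type="ENDORSER_TRANSACTION"} 3
broadcast_processed_count{channel="mychannel",status="SUCCESS",type="CONFIG_UPDATE"} 1
"""


def parse_metrics(text):
    """Parse metrics text into a list of (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def total(samples, metric_name):
    """Sum one metric across all of its label combinations."""
    return sum(value for name, _, value in samples if name == metric_name)
```

Each label combination is a separate time series, which is why a single metric such as broadcast_processed_count can appear on several lines.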
In the same docker-compose file that holds the service definitions for all of our containers, we now add service definitions for Prometheus and Grafana. The configuration given here is the bare minimum needed to download and run both containers.
We now have operational metrics enabled on the orderer, and Prometheus and Grafana are available, but Prometheus still needs some configuration before it can start fetching the operational metrics. That is done via a YAML file. If you look closely at the Prometheus definition, the command section passes a config file flag along with the path of the prometheus.yml file. This file defines the configuration that Prometheus is supposed to use.
If you look at this snippet, you will notice that we have provided two scrape configuration jobs: one for Prometheus itself and a second for the Fabric network. We enabled port 9446 on the orderer, so that is given as the target, and the scrape interval is set to 10 seconds. Again, this is a very simplistic view, and readers are encouraged to explore further.
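For readers who cannot see the snippet, the shape of such a file can be sketched by generating it from Python. This is not the author's exact file: the job names and the orderer host name are hypothetical, and placing the 10 second interval on the Fabric job (rather than globally) is an assumption.

```python
def render_prometheus_config(fabric_targets, scrape_interval="10s"):
    """Return prometheus.yml text with one job for Prometheus and one for Fabric."""
    targets = ", ".join(f"'{target}'" for target in fabric_targets)
    lines = [
        "scrape_configs:",
        "  - job_name: prometheus",  # Prometheus scraping itself
        "    static_configs:",
        "      - targets: ['localhost:9090']",
        "  - job_name: fabric",  # hypothetical job name for the Fabric network
        f"    scrape_interval: {scrape_interval}",
        "    static_configs:",
        f"      - targets: [{targets}]",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical orderer host; 9446 is the operations port enabled earlier.
CONFIG_TEXT = render_prometheus_config(["orderer.example.com:9446"])
```

Writing CONFIG_TEXT to the prometheus.yml path that the container mounts would give Prometheus the two jobs described above. More targets, such as peers, can be added to the list.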
Now that Prometheus is scraping the orderer operational metrics, let us look at how they can be viewed in the Prometheus dashboard. The Prometheus server runs on port 9090, and if we hit http://<host>:9090 we get the following dashboard. If we expand the dropdown right next to the Execute button, it lists all the operational metrics that we looked at earlier.
This is a great source of information for determining how the orderer is behaving.
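What the Execute button does in the dashboard can also be done programmatically through the Prometheus HTTP API. The sketch below builds an instant query URL and reads the values out of the JSON response. The helper names are made up for illustration, and the code assumes the query returns an instant vector.

```python
import json
import urllib.parse
import urllib.request


def query_url(prometheus_url, promql):
    """Build the instant query URL for a PromQL expression."""
    params = urllib.parse.urlencode({"query": promql})
    return prometheus_url.rstrip("/") + "/api/v1/query?" + params


def extract_values(response_body):
    """Return (labels, value) pairs from an instant vector query response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("query failed: " + str(payload.get("error")))
    # Each result carries its labels and a [timestamp, "value"] pair.
    return [
        (item["metric"], float(item["value"][1]))
        for item in payload["data"]["result"]
    ]


def instant_query(prometheus_url, promql, timeout=5):
    """Run a PromQL instant query against a live Prometheus server."""
    with urllib.request.urlopen(query_url(prometheus_url, promql), timeout=timeout) as resp:
        return extract_values(resp.read().decode("utf-8"))
```

For example, instant_query("http://<host>:9090", "broadcast_processed_count") would return one entry per label combination reported by the orderer.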
We can now monitor the operational metrics using Prometheus, so why would we need Grafana? It is a popular analytics and visualization tool that brings data together in an efficient and organized fashion. Grafana works on the concept of data sources, which can be Prometheus, AWS CloudWatch, MySQL and many others. So, from an enterprise perspective, it provides a single window to monitor data coming from various data sources. It helps users better understand their metrics through queries, informative visualizations and alerts.
Since Grafana is already defined in our docker-compose file, it is already up and running. To access it, simply open http://<host>:3000
Once you hit this URL, you get the following login screen. Enter admin/admin as the credentials and log in.
Since it is a fresh installation and we have not configured any data source or dashboard, we simply get the screen below and are expected to add a data source.
Once we click on Add data source, it shows a list of data sources that we can use. We simply select Prometheus, provide basic configuration such as the host and port where Prometheus is running, and save the configuration.
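The same data source can also be registered without the UI, through Grafana's HTTP API. This is an alternative sketch, not what the screenshots show: the Prometheus URL assumes a docker-compose service named prometheus, the admin/admin credentials are the defaults mentioned above, and the helper names are hypothetical.

```python
import base64
import json
import urllib.request


def datasource_payload(prometheus_url, name="Prometheus"):
    """Build the JSON body that describes a Prometheus data source."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend talks to Prometheus
        "isDefault": True,
    }


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


def add_datasource(grafana_url, payload, user="admin", password="admin", timeout=5):
    """POST the data source definition to a running Grafana instance."""
    request = urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": basic_auth_header(user, password),
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Assumption: Prometheus is reachable from Grafana under this service name.
PAYLOAD = datasource_payload("http://prometheus:9090")
```

Calling add_datasource("http://<host>:3000", PAYLOAD) would then have the same effect as saving the form in the UI.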
Now, if we go to the Explore option in the left-side nav bar and click on the Metrics dropdown, we see all the operational metrics, but now everything is presented in a dashboard. Here we have selected chaincode_shim_requests_received, and it shows a count of 1.
We can now perform query, invoke and other functions on our chaincode and see how any operation done by the orderer is reflected here.
I hope that by now you have a fair idea of the monitoring aspect of Hyperledger Fabric and how Prometheus and Grafana can improve the experience.