COSBench, Intel’s Cloud Object Storage Benchmarking tool, and how to visualize its data with matplotlib

[image gallery: object storage products]

I am probably missing a few images here, but you get the point: Object Storage is here to stay. It’s becoming more popular as workloads move to cloud-based application architectures, where HTTP dominates at scale more than ever. With that comes the need to run performance tests against our (pick your favorite) open-source object-storage implementation… or not, if you’re not into that sort of thing.

The tool I want to talk about is Intel’s COSBench, or “Cloud Object Storage Bench” if you will, a Java-based performance benchmarking tool for object storage systems. I’ll also touch on a neat way to visualize the data COSBench outputs. The project describes itself, in part, like this:

COSBench is a benchmarking tool to measure the performance of Cloud Object Storage services. Object storage is an emerging technology that is different from traditional file systems (e.g., NFS) or block device systems (e.g., iSCSI). Amazon S3 and Openstack* swift are well-known object storage solutions. (https://github.com/intel-cloud/cosbench)

It allows you to run tests at scale using “Driver Nodes” and “Controller Nodes”. A Driver Node does the heavy lifting and generates the load that the test produces. A Controller Node collects metrics, orchestrates the jobs, and keeps track of which tests are running on which drivers; essentially the M&O/dashboard piece. Read more about the specifics in the user guide on GitHub (UserGuide). I won’t go into detail about installation in this post; I will just say the guide is pretty straightforward. I ran a multi-driver installation on top of OpenStack Icehouse to test Ceph (the S3 and Swift interfaces on the RADOS Gateway) and Amazon S3 directly.

What I will go into a bit is how to define a job. Below is an example of how to set up a test against a Ceph RADOS Gateway Swift endpoint. As you can see, I used a token obtained from OpenStack with the keystone client, and an endpoint ending in /swift/v1, in the “storage” directive of the COSBench XML file. This small test runs a 100% READ workload against 240 objects spread across 12 Swift containers; that is what the “containers=(#,#)” and “objects=(#,#)” settings denote. The objects fall into size ranges of 25MB, meaning 25MB, 75MB, 175MB, etc.

[screenshot: the workload XML for this test]
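For anyone who cannot make out the screenshot, the general shape of such a workload file is roughly the following. This is only a hypothetical sketch, not my actual configuration: the token, URL, worker and runtime numbers, and selector expressions are placeholders, and the exact selector syntax is covered in the user guide.

<?xml version="1.0" encoding="UTF-8" ?>
<workload name="swift-read-example" description="hypothetical 100% read test">
  <!-- placeholder token and storage_url; the real values come from keystone
       and the RADOS Gateway endpoint ending in /swift/v1 -->
  <storage type="swift"
           config="token=AUTH_xxxxxxxx;storage_url=http://radosgw.example.com/swift/v1" />
  <workflow>
    <workstage name="main">
      <work name="main" workers="8" runtime="300">
        <!-- 100% read; 20 objects in each of 12 containers, 240 in total,
             matching the test described above (values are illustrative) -->
        <operation type="read" ratio="100"
                   config="containers=u(1,12);objects=u(1,20)" />
      </work>
    </workstage>
  </workflow>
</workload>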

After submitting the test you will see output in the Controller dashboard that looks like the image below. To get to this data yourself, click “view details” next to the finished job, then click “view details” next to the stage ID w<#>-s<#>-main with the name “main”. You can then click “view timeline status” underneath the General Report to get the timeline data shown below.

[screenshot: controller dashboard timeline view]

This gives you a breakdown (at what I believe are five-second intervals) of the performance metrics collected. How the metrics are gathered and computed is explained in the user guide referenced above. If you click “export CSV file” you can download a CSV version of the output for analysis, which should look something like the Excel sheet below:

[screenshot: the exported CSV opened in Excel]
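If you just want a quick look at what the export contains before graphing anything, a few lines of Python will do. This is only a generic sketch, not part of the tooling discussed below, and the column names it prints vary by COSBench version:

import sys
import csv

# generic peek at a COSBench CSV export: print the header and the first few rows
with open(sys.argv[1]) as f:
    reader = csv.reader(f)
    print(", ".join(next(reader)))   # column names, as exported by your COSBench version
    for row in list(reader)[:5]:     # first few timeline samples
        print(row)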

Now to the fun part: with this data we can do some fun and interesting things using a Python graphing library called matplotlib. With it we can extract the data we want, such as bandwidth, latency, or throughput, and draw graphs to better visualize it. I have a few scripts that do exactly this, written specifically to take input from a COSBench CSV file; just start a script and pass it the CSV file (more info on the GitHub page): https://github.com/wallnerryan/matplotlib-utils-cosbench
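For the curious, the general approach is small enough to sketch here. This is not the actual script from the repo, just a minimal, hypothetical example of plotting bandwidth over time with a best-fit line; the “Bandwidth” column name, the five-second sample interval, and the bytes-per-second units are assumptions you may need to adjust to match your export.

# minimal sketch (not the repo script): bandwidth over time with a linear best fit
import sys
import csv
import numpy as np
import matplotlib
matplotlib.use("Agg")            # render straight to a PNG, no display needed
import matplotlib.pyplot as plt

times, bandwidth = [], []
with open(sys.argv[1]) as f:
    for i, row in enumerate(csv.DictReader(f)):
        times.append(i * 5)                              # assuming ~5s sample interval
        bandwidth.append(float(row["Bandwidth"]) / 1e6)  # assumed column name; bytes/s -> MB/s

plt.plot(times, bandwidth, "bo", label="Bandwidth (MB/s)")

# simple linear best fit through the samples
coeffs = np.polyfit(times, bandwidth, 1)
plt.plot(times, np.polyval(coeffs, times), "r-", label="best fit")

plt.xlabel("Time (s)")
plt.ylabel("Bandwidth (MB/s)")
plt.legend(loc="best")
plt.savefig("bandwidth.png")

The real scripts handle the CSV quirks and labeling for you, so in practice you only need the commands below.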

Run something like:

#cd matplotlib-utils-cosbench/
#python graph_data_bandwidth_bf.py <csv.file>

The output from the script is a PNG graph image; the command above produces a graph of bandwidth over time in MB/s, with a best-fit line drawn through the data. Go ahead and try it if you would like. The output will look something like the image below:

*(depending on your performance numbers, hardware, and environment, the graph may look very different)

[screenshot: example bandwidth-over-time graph with best-fit line]

As a tip: in this case my Y-axis is capped at 250 because I know my data points did not go above 250MB/s. If yours do, look at the lines around line 68 in the source code; there is a message there about how to change this. In this example it looks like the image below (I had changed this one to 250 based on the code snippet below).

[screenshot: the code snippet that sets the Y-axis cap]
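In generic matplotlib terms, that cap is just an axis limit. The actual script may express it differently, but it amounts to something like this (250 being the value I used):

import matplotlib.pyplot as plt

# cap the Y-axis at 250 (MB/s in my case); raise this if your numbers go higher
plt.ylim(0, 250)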

A note on the script: if you have a lot of data points the graphs can get cluttered, with points crowding too close together. There is an option to only plot every Nth data point (e.g., reading every 5th one); just pass an additional integer argument to the script at the end:

#python graph_data_bandwidth_bf.py <csv.file> <number>
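Under the hood that kind of thinning is really just list slicing; the script may implement it differently, but conceptually it is something like:

# conceptual sketch: keep only every Nth timeline sample
step = 5                        # the optional integer argument
samples = list(range(100))      # stand-in for the rows parsed from the CSV
thinned = samples[::step]       # every 5th data point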

Well, I hope this was interesting for some of you. If you have any questions or comments, please feel free to comment here or on my GitHub, or send me an email. Until next time, cheers.
