HDS USP Analytics App for Splunk
I'm finally making this Splunk app freely available to the general public (see the attachments to this post). I built it almost a year ago, and it is far from perfect. I put it together over a couple of days in my spare time to work in our environment, where we have only a few USPVMs, so I can't speak to how well it scales or how well the searches perform over very large data sets. Frankly, it works for me, and I haven't had enough spare time to invest in further development or optimization. That is why I am releasing it here first and not on Splunkbase; sadly, in my mind it's nowhere near ready for Splunkbase.
First, let's look at how it works. The USP and USPVM models allow remote retrieval of performance data using what Hitachi calls the "Export Tool". The tool is freely available from Hitachi and can be obtained through your local Hitachi tech or sales rep. There is a catch: Hitachi produces a new Export Tool for each release of the microcode for the USP or USPVM. So unless all of your USPs are running the same microcode version, you will need to obtain an Export Tool version that matches each microcode version you are running.
I've included a copy of the Export Tool with this post (see attachments) that works with microcode version 60-07-54/00. I've also included a PDF that outlines the use of the Export Tool. Go to the section labeled "Using the Export Tool" (section 7-1, page 167); it describes the exported files and how to configure the tool using the command.txt file.
I've included a sample of the command.txt file I use. Be careful when adjusting the shortrange and longrange values; changing them can REALLY cause the Export Tool to take a LONG time to execute. Right now my command.txt file retrieves values for the previous 10-minute period.
OK, so to actually get the data out and into a usable form, I've written a simple script that executes the Export Tool. When the Export Tool runs, it generates over 500 CSV files containing all of the performance data. My script then runs a simple sed command that removes the first 6 lines of each file, because the Export Tool adds a header to each file that does not need to be indexed. Once sed is done, the script moves the exported CSVs to a specific directory that Splunk monitors, so the files and their values get indexed.
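For anyone who would rather not use sed, here is a minimal Python sketch of the same two steps (strip the first 6 lines, then move the file to the monitored directory). This is not the script I actually run. The function names, the "*.csv" pattern, and the assumption that all exported files sit directly in one directory are mine for illustration only.

```python
import shutil
from pathlib import Path

# The Export Tool header described above is 6 lines long.
HEADER_LINES = 6


def strip_header(csv_path: Path, header_lines: int = HEADER_LINES) -> None:
    """Rewrite csv_path in place without its first header_lines lines."""
    # Work on bytes so the file's encoding never matters.
    lines = csv_path.read_bytes().splitlines(keepends=True)
    csv_path.write_bytes(b"".join(lines[header_lines:]))


def stage_exports(export_dir: Path, monitor_dir: Path) -> int:
    """Strip every exported CSV and move it into the monitored directory.

    Assumption: the exported files end in .csv and sit directly in
    export_dir (no subdirectories). Returns the number of files moved.
    """
    monitor_dir.mkdir(parents=True, exist_ok=True)
    moved = 0
    # sorted() materializes the listing before any file is moved.
    for csv_path in sorted(export_dir.glob("*.csv")):
        strip_header(csv_path)
        shutil.move(str(csv_path), str(monitor_dir / csv_path.name))
        moved += 1
    return moved
```

You would call stage_exports() with the Export Tool's output directory and the directory Splunk is monitoring, right after the Export Tool finishes.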
Now, the problem is that each of these files is unique: the header is generated from the configuration of the USP and from the specific data set it's polling (you'll see what I mean when you look at the data). If the configuration of the USP changes, the headers in these files change too. This is why I have Splunk re-index the files each time and do the field extraction/association based on the header of each CSV instead of using a simple transform stanza for each file. It makes things easier in the long run: if someone adds disks, re-associates LUNs, etc., they do not have to go in and reconfigure Splunk.
You will also need to raise the limit on the number of open files for the user running Splunk so that it can index that many CSVs at once. Hopefully this process can one day be simplified or improved upon.
Then simply set up a cron job to execute the script every ten minutes and voilà, you have your performance data.
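To show why I key the fields off each file's own header rather than a fixed per-file definition, here is a small Python illustration. It is only an analogy for what Splunk does at index time, not part of the app. The column names, timestamps, and values are made up, and it assumes the column header row is the first line left in each file.

```python
import csv
import io


def rows_from_export(text: str) -> list[dict[str, str]]:
    """Pair each value with the column name from the file's own header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Two made-up exports of the same metric, before and after a port is added.
# Column names and values are hypothetical, not real Export Tool output.
before = "time,CL1-A,CL1-B\n2010/01/01 00:10,120,95\n"
after = "time,CL1-A,CL1-B,CL1-C\n2010/01/01 00:20,118,97,40\n"

for sample in (before, after):
    for row in rows_from_export(sample):
        print(row)
```

The second export gains a CL1-C field without any change to the parsing code, which is the behavior I wanted from Splunk when the array configuration changes.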
OK, so that's how the data is extracted and indexed. Once the data is indexed, the searches are very simple, but because of the ever-changing headers they are VERY vanilla, and this is where optimization went out the window. Take a peek at the savedsearches file and you will see what I mean. The Port Cluster searches, "CL1 IOPs per Port" for example, will need to be tweaked for your environment's configuration.
Hopefully this clears everything up and someone out there finds this application as useful as we have. I really haven't had as much time as I would like to work on this app and make it cleaner.
Some ideas I had wanted to build were summary indexing for legacy data, and views where people could build custom charts based on values selected from various drop-downs. For example, choosing "Port 1A" would expand all the LUNs mapped to that port; choosing "VMWare Cluster" would then show charts covering the various data (KBPS, IOPS, response times, etc.) for that specific cluster. This would make the app much more usable for an audience who may not know the LUN, LDEV, port, etc. configurations.
In any case, Enjoy!
Attachment: Performance Manager_rd61713.pdf (1.55 MB)