The Capacity Reports Management Pack contains a set of reports that give you a powerful forecasting tool for your Operations Manager environment. This blog focuses on the “Forecast Performance Percentage Value Report” reports.
Besides providing forecasted capacity values, the Forecast Performance Percentage Value Reports let you set thresholds so that you can respond to a forecasted capacity issue and resolve it before it causes downtime.
Consider the graph above. It shows a steady decline in free disk space on a particular storage device over time. The X-axis shows the date and the Y-axis shows free disk space as a percentage. The blue line represents the actual free disk space data in the SCOM data warehouse. The red dotted line represents the forecast line (or trend line) fitted to those data points. The forecast line trends downwards towards 0%, which means that at point T3 in the graph the disk would be completely out of free space. As we know, running out of free disk space only means bad things and potentially angry customers or users. Ideally we would like to know well in advance that we are going to run out of free disk space, so that we can proactively take action such as cleaning up or extending the disk.
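As a rough illustration of how such a forecast line can be derived, the sketch below fits a straight line to a handful of free-space samples using ordinary least squares and computes how many days remain until that line crosses 0% (point T3). It is not the management pack's actual implementation: the function names and the sample data are hypothetical, and a linear fit is assumed only because the report's dataset (mentioned in the error message later in this post) is named DataSet_LinearRegression.

```python
def fit_trend(days, free_pct):
    """Least-squares fit of free_pct = slope * day + intercept.

    `days` are day offsets from the first sample; `free_pct` are the
    measured free-space percentages. Both names are hypothetical.
    """
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(free_pct) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, free_pct))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def days_until_empty(slope, intercept, today):
    """Days from `today` until the trend line reaches 0% free space (T3).

    Returns None when the trend is flat or rising, because the line
    then never reaches 0%.
    """
    if slope >= 0:
        return None
    zero_day = -intercept / slope  # day offset at which the line hits 0%
    return zero_day - today


# Made-up samples: free space drops 6 percentage points every 30 days.
days = [0, 30, 60, 90, 120]
free_pct = [80.0, 74.0, 68.0, 62.0, 56.0]

slope, intercept = fit_trend(days, free_pct)
remaining = days_until_empty(slope, intercept, today=120)
print(f"slope={slope:.3f} %/day, days until 0% free: {remaining:.0f}")
```

With these made-up samples the line loses about 0.2 percentage points per day, so roughly 280 days remain from the last sample until the forecast reaches 0%.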
The “Forecast Performance Percentage Value Report” reports are specifically designed to help you avoid a situation where you would unexpectedly end up at point T3 in the graph, where you have 0% free disk space left. By letting you set thresholds, the report allows you to receive warnings on approaching capacity issues.
Among the configurable parameters in the “Forecast Performance Percentage Value Report – displays all capacity objects” report, two can be set in order to receive a warning before the disk is at capacity (T3):
- Number of days for Warning Level
- Number of days for Critical Level
The “Number of days for Warning Level” corresponds with point T1 in the Free Disk Space graph. When we set the “Number of days for Warning Level” threshold, it will cause the report to show a warning when this threshold is breached as shown in the image below. In this example, the threshold is set to 150 days. This is the distance in days between T1 and T3 on the graph.
The “Number of days for Critical Level” corresponds with point T2 in the Free Disk Space graph. When we set the “Number of days for Critical Level” threshold, it will cause the report to show a critical state when this threshold is breached, as shown in the image below. In this example the threshold is set to 100 days. This is the distance in days between T2 and T3 in the graph.
You might also notice that the “Today” point is still to the left of T1 and outside of the area between T1 and T3. If today is 1-Jan-2016 and T3 is 10-June-2016, the number of days until capacity (T3) is reached is 161. This is more than both the warning threshold of 150 days (T1) and the critical threshold of 100 days (T2), so the report displays a healthy state, as shown below.
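The relationship between the two thresholds and the resulting state can be sketched in a few lines. This is a simplified model of the behavior described above, not the report's own logic: the function name is hypothetical, and whether a value exactly equal to a threshold counts as breached is an assumption.

```python
def capacity_state(days_until_capacity, warning_days, critical_days):
    """Map the forecasted days until capacity (T3) to a report state.

    Hypothetical helper. Assumes a threshold is breached when the
    remaining days are less than or equal to it.
    """
    if critical_days > warning_days:
        raise ValueError("critical threshold should not exceed warning threshold")
    if days_until_capacity <= critical_days:
        return "Critical"  # between T2 and T3
    if days_until_capacity <= warning_days:
        return "Warning"   # between T1 and T2
    return "Healthy"       # still to the left of T1


# Thresholds from the example: warning at 150 days, critical at 100 days.
# The remaining-days values are illustrative only.
states = {d: capacity_state(d, warning_days=150, critical_days=100)
          for d in (200, 120, 60)}
print(states)
```

Any object whose forecast leaves more than 150 days until capacity, like the “Today” point in the example above, stays healthy.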
In some cases after importing the Capacity Reporting Management Pack, you might receive the error message “An error has occurred during report processing. (rsProcessingAborted) Query execution failed for dataset ‘DataSet_LinearRegression’. (rsErrorExecutingCommand) For more information about this error navigate to the report server on the local server machine, or enable remote errors” when trying to run one of the reports.
This error is usually caused by a setting on the SQL Server instance that hosts the SCOM data warehouse database. The setting prohibits the reports from executing the assembly needed for the forecasting portion of the reports. For additional information please see the following Microsoft article: https://msdn.microsoft.com/en-us/library/ms254506(v=vs.80).aspx
To allow the reports to make use of (and execute) the assembly, the query below should be run on the SQL server which hosts the SCOM data warehouse database. This query can only be run by an account with sysadmin privileges.
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
“The OpsLogix Capacity Intelligent Management Pack (IMP) provides administrators with a forecasting tool in the System Center Operations Manager™ console that enables accurate strategic prediction of the enterprise IT capacity.”
In 2012, Microsoft updated the SCOM 2007 R2 platform to SCOM 2012, which added new features for High Availability (HA), application performance monitoring, network device monitoring, Java application server monitoring and dashboards. Parallel to the release of SCOM 2012, OpsLogix began working on the capacity MP for System Center Operations Manager, basing it on the dashboard technology that was added in the new 2012 version of Operations Manager.
OpsLogix has always maintained the core principle of making an MP as “natively true” as possible. That is to say that OpsLogix Management Packs should be as simple as possible to set up and configure, and should integrate fully into the existing System Center framework. Management Packs should directly contribute to easier systems manageability without compromising security or increasing the size of the system’s footprint by adding components.
The first iterations of the capacity MP looked very promising and worked very well in small environments. Unfortunately, scalability and sizing quickly became an issue when the capacity MP was imported into larger environments, where its behavior became erratic. In the best case, retrieving and displaying data was extremely sluggish; in the worst case, it caused the SCOM console to freeze or even crash.
After a series of tests in our own lab and in the QA environments of some of our larger beta testing partners, we concluded that the primary cause was the dashboard technology of Operations Manager itself. The problem is noticeable even without any additional management packs: when a large number of objects are added to the out-of-the-box SCOM dashboard views, the Object Picker runs into stability issues.
Microsoft has since tweaked the dashboard technology in SCOM 2012 R2. After some code rewriting of its own, OpsLogix went into a new round of testing. Unfortunately, the SCOM performance (coding) improvements were nowhere near what was needed to support consistent, stable use of the capacity MP. Since the OpsLogix capacity dashboards relied heavily on the SCOM dashboarding technology, they performed far below expectations.
We had hoped that dashboarding performance would improve dramatically with the release of SCOM 2012 R2, but this is not yet the case. Because of all the setbacks, OpsLogix has decided, for the time being, to abandon the approach of building a capacity MP on the Operations Manager dashboarding technology.
The multiple rounds of testing have not been in vain. OpsLogix has listened to the community and has gone back to the drawing board to completely rework the capacity MP. The main point of feedback we received was that capacity dashboards that can be played around with are fun and nice to have, but they are not in any way essential to the daily activities of a system administrator or operator.
Capacity management is something that is needed and looked at by system architects, analysts, consultants and, of course, IT managers and CTOs. In all of these cases periodic reports are considered mandatory, and as a result a start has been made to leverage and expand the SCOM reporting technology.
As can be seen in the above screenshot, we have an alpha version of a capacity reporting MP but this still needs some work before we can release it for beta testing with customers.