SCOM Reporting
Bob Cornelissen MVP
and the TopQore team
Introduction:
In our many SCOM engagements with customers and from experiences in the SCOM
community and forums, we have seen that the SCOM Reporting feature has been
under-utilized. Some of the reasons are insufficient awareness of its capabilities, other
priorities, disappointment from empty reports and many more. However, managers
and other stakeholders often ask for reports for purposes such as Capacity
Management, Incident Management, SLA/SLO uptime of machines or websites.
In this booklet we will try to get you started with SCOM Reporting and explain how to
access the vast wealth of data sitting in your Data Warehouse. We will talk about the
reporting feature and its intended audience, running reports, saving and exporting
reports for future use, and we will show a number of very useful reports for different
stakeholders and also for SCOM Admins.
Table of Contents
− What is SCOM Reporting
− Who are the Stakeholders
− Discover SCOM Reporting
− Running the first report
− Exporting Reports
− Saving Reports for later runs
− Scheduling reports
− Why is my report empty?
− Reporting from a State View
− Useful reports - Performance Detail
− Useful reports - Data Volume By Management Pack
− Useful reports - OS performance reports
− Useful reports - SCOM Health Check Reports
− Useful reports - Availability and SLA
− Additional Reporting Options
− Epilogue
What is SCOM Reporting?
SCOM is a monitoring product and it
stores a lot of data in its databases. Short
term data is stored in the operational
database for a few days and can be
viewed directly from the SCOM console
views (for example). For the longer term
those data are also kept in the SCOM
Data Warehouse database and aggregated
into Hourly and Daily data sets
with additional statistical data, such as
highest and lowest values and standard
deviations. The data retention is usually
months to a year, to provide enough
retention to run reports over longer
periods of time. The data included has to
do with Performance Counters, Alerts,
Availability (health states of objects),
Events and several others.
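To make the idea of aggregated data sets a bit more concrete, here is a
minimal sketch in Python of how raw samples could be rolled up into one
row per hour with an average, lowest and highest value and a standard
deviation. This is only an illustration and not how the Data Warehouse
itself is implemented. The function name, the field names and the sample
values are made up for the example, and the use of the population
standard deviation is an assumption.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def aggregate_hourly(samples):
    """Group (timestamp, value) samples per hour and summarise each hour."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Truncate the timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {
        hour: {
            "count": len(values),
            "average": mean(values),
            "minimum": min(values),
            "maximum": max(values),
            # Assumption: population standard deviation.
            "std_dev": pstdev(values),
        }
        for hour, values in sorted(buckets.items())
    }


# Made-up CPU samples, five minutes apart.
raw = [
    (datetime(2024, 3, 1, 9, 0), 10.0),
    (datetime(2024, 3, 1, 9, 5), 20.0),
    (datetime(2024, 3, 1, 9, 10), 30.0),
    (datetime(2024, 3, 1, 10, 0), 50.0),
]
hourly = aggregate_hourly(raw)
for hour, row in hourly.items():
    print(hour, row)
```

The same idea applies to the Daily data sets, only with one row per day
instead of one row per hour.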
In order to get data from the data
warehouse in native SCOM we have the
SCOM Reporting feature, which is based
on SQL Server Reporting Services with an
extra sauce on top. Other ways of
displaying longer term data are used by
paid third party SCOM dashboarding
products, such as the Martello Live Maps
portal and the SquaredUp portal. These
are however mainly for viewing purposes,
to see trends such as performance trends.
SCOM Reporting is meant to get relevant
data from SCOM and display it in a
report. This report can be saved,
exported, printed and scheduled, as we
will show later in this booklet. For
example we can get a performance report
for a few servers sent to us by e-mail
every month in PDF format. Some regular
asks in companies are SLA and KPI reports
on base performance counters of the
operating system (CPU, memory, disk) and
availability reports of machines or
websites. In the other chapters we will
go deeper into the different aspects you
may encounter when reporting on SCOM and
also show you a few common reports.
Who are the Stakeholders
In this chapter we want to discuss the
stakeholders. In short: who are we doing
it for? As with other aspects of
monitoring in general, in SCOM and SCOM
Reporting we should also have a closer
look at the stakeholders.
Why?
Because with monitoring tools, often only
third-line System Admins are considered
to be using them and looking at them.
However, monitoring and reporting is
important for several more people in the
company, or it could be, or it should be.
An example is that in larger companies
there are audits. The auditors request
reports from management, and management
wants to have some regular reports
as well about the performance and
availability of their systems. For
reporting we would consider these to be
some of the stakeholders who can request
reports or benefit from them:
- IT Managers
- SCOM Admins
- Company Managers (outside IT)
- Auditors
- Capacity Management
- Incident Management
When looking at monitoring and
dashboarding in general there would be a
few more, such as an IT helpdesk.
If you have good dashboards to provide to
them about the current status of the
environment, that is usually great and
good enough.
Looking at these stakeholders you will
recognize that they likely need different
types of reports. Luckily SCOM provides
a number of report types; many are built
into management packs or can be created
from the generic reports.
There are also a few community reports
which are very useful. As you can see
from the stakeholders list, a lot of them
are managers or management related, and
you can imagine they often do not look at
the technical aspects of the backend
systems and causes, but need to have an
overview of performance, availability
and KPIs. The System Admins, on the other
hand, usually focus on very specific
technical details, or can use reports for
SCOM specific things in the environment.
We will show a bit of everything in this
book.
Discover SCOM Reporting
Where can you find SCOM Reporting?
If you want to start running reports, you
first need to gain access to the SCOM
Reporting pane as shown below in the
picture. We will show you other ways
later, but this is starting from the top.
After opening the SCOM Console, take a
look at the bottom-left to see if you have
access to the Reporting Pane. There
should be a green button there, saying
Reporting. If the green button is not there,
you can talk to your SCOM Admin to
arrange access to it for you or a group you
are a member of.
If you click Reporting, it will take you to
the Reporting Pane as shown in the
picture.
To the left you will see a list of report
folders and if you click for example the
Microsoft Generic Report Library you will
see a list of reports in the middle pane of
the screen. If the list is empty, this could
indicate either that you do not have rights,
or there is something wrong with SCOM
Reporting.
Some of these folders are default and
belong to SCOM, like the folder selected in
the picture. Others are coming from the
software specific management packs
imported into SCOM, such as SQL and
Windows Operating System.
The reports in the Microsoft Generic
Report Library are meant to be the
starting point for many other reports,
either provided or made by yourself, such
as generic alert, performance,
availability and event based reports. For
example you could open up a Performance
Detail report from here and get a choice
of targets to report on.
You might select a group of servers which
are running some application and a
timeframe like Last Month and run the
report. You can save that selection for
later and have your “Application XYZ
monthly performance report” ready to run
next time, based on the default provided
base reports.
One more thing to note in this view is the
Report Details section at the bottom-
middle of the screen when you select any
report with your mouse. The information
in those details is sometimes extensive
and sometimes limited. However,
whenever you want to run reports you are
not familiar with, you should check these
details. Often the details provide
information about what the report is
meant for and very importantly what kind
of objects you select in the report to get
any data shown! We will come back to that
later in the section talking about why a
report could turn up empty.
So now you know where the default
Reporting Pane is in SCOM and that you
can find reports by scrolling the folders
list and the reports. We will talk a
little bit more about running reports in
the next section.
Running the first report
In this chapter we will use a basic
Performance report as an example. We
can use this example in the sections about
Exporting, Saving and Scheduling reports.
For this example we created a management
pack where we will store configuration
data related to monitoring of
“Application XYZ”. We created a group
called “Application XYZ Computers” and
saved the group in this management pack.
We made a few Windows Computers in
our demo environment a member of this
group. Now because at first we are looking
at base reports, we do not have to create
custom monitors or anything like that.
Now we want to have a performance
report of these servers and we are
beginning first from the base report
mentioned in the previous section. Later
we will show you another useful example
and link it at the bottom of the page.
The report we are going to use as an
example is not the easiest one, but it is a
very flexible one. There are some choices
to be made. If you select reports which are
ready made for a specific purpose, there
will be fewer choices to be made and the
report will be less difficult to run. Do not
worry however. Feel free to play along
with the example.
Let’s start making something…
Open the SCOM Console and go to the
Reporting Pane. Select Microsoft Generic
Report Library in the menu to the left and
select the Performance Detail report from
the list in the middle of the screen.
Note at the bottom-middle of the screen
you see the Report Details. In this case
there is an explanation telling us what it
is used for, that there are additional lines
and areas in the report, what they mean
and some example usage entries below
that. Since this is a very generic report,
you can place any object in there with a
performance counter to show.
In other reports there is often a statement
telling you what types of objects to put in
the report (a computer, a database, a
website, a disk, etc.).
Now double click the report in the list to
open it or click Open. You will see the
figure below. We need two basic things
here: We need to know what timeframe
we want the data for, and what the
objects and counters will be.
Let’s first take care of the time, which is on
the left hand side of the report wizard.
Data aggregation is going to be Hourly or
Daily, depending on what you are looking
for. If you are running reports over the
course of a day or week or up to a month
you can take the Hourly aggregation to get
enough data points. If you are running
reports across an extended period of time
like a quarter or year, you might want to
select Daily aggregation.
Next are the From and To fields. You can
do nice things with these, such as selecting
specific dates, but also relative dates like
Last Week or Last Month. This is often used
in monthly reports, where you select
Previous Month First Day in the From field
and Previous Month Last Day in the To
field. Relative dates are useful because
they mean we can later schedule this report
to run on the first day of every month and
it will pull in data from the month before
it. When we run a report for the first time
we always change the From field to Yesterday
and leave the rest of the fields default, so
this will be a report on the last 24 hours.
We will select the times of day as well, so
we will set it from 12:10 AM to 11:50 PM
to cover the whole of the first and last
days of the month. Time Zone could be
applicable if you are running reports for a
day and if you have to deal with counters
in a central database for servers all over
the world.
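To show why relative dates work so well for scheduled reports, here is a
small Python sketch of what a relative range like Previous Month First
Day to Previous Month Last Day works out to on the day the report runs.
It is only an illustration of the idea. The function name is made up for
the example and it is not what the report wizard uses internally.

```python
from datetime import date, timedelta


def previous_month_range(run_date):
    """Return the first and last day of the month before run_date."""
    first_of_this_month = run_date.replace(day=1)
    # The day before the first of this month is the last day of last month.
    last_of_previous = first_of_this_month - timedelta(days=1)
    first_of_previous = last_of_previous.replace(day=1)
    return first_of_previous, last_of_previous


# A report scheduled for the first day of the month.
start, end = previous_month_range(date(2024, 3, 1))
print(start, end)
```

Because the range is calculated from the run date, the same saved report
covers February when it runs in March and covers March when it runs in
April, without anybody touching the date fields.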
The Use Business Hours button is often
useful for KPI indicators and availability
reports, whereby we might only be
interested in the values during business
hours. During the night there might be
maintenance, backups, scan jobs,
automated jobs, and nobody online.
You may have this requirement to put in
only office hours on business days.
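The effect of a business hours filter can be sketched in a few lines of
Python. The hours (Monday to Friday, 08:00 to 17:00) and the sample
values below are assumptions made up for this example. In SCOM the
business hours are configured in the report wizard itself.

```python
from datetime import datetime, time
from statistics import mean

# Hypothetical business hours: Monday to Friday, 08:00 up to 17:00.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(17, 0)
BUSINESS_DAYS = {0, 1, 2, 3, 4}  # Monday is 0 in datetime.weekday()


def in_business_hours(timestamp):
    """True when the timestamp falls on a business day within office hours."""
    return (timestamp.weekday() in BUSINESS_DAYS
            and BUSINESS_START <= timestamp.time() < BUSINESS_END)


# Made-up CPU samples.
samples = [
    (datetime(2024, 3, 4, 2, 0), 95.0),   # Monday night, backup window
    (datetime(2024, 3, 4, 9, 0), 20.0),   # Monday morning
    (datetime(2024, 3, 4, 14, 0), 30.0),  # Monday afternoon
    (datetime(2024, 3, 2, 10, 0), 80.0),  # Saturday
]

all_hours_average = mean(value for _, value in samples)
business_average = mean(
    value for timestamp, value in samples if in_business_hours(timestamp)
)
print(all_hours_average, business_average)
```

In this made-up data the nightly backup and the weekend job pull the
overall average up, while the business hours average reflects what the
users of the application actually experienced.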
Next we need to pick which counters from
which objects we are interested in. We do
it by moving to the right hand side of the
report wizard and clicking the Change
button above the Objects section.
This can be a confusing area of the wizard.
What we are asked to do is create one or
more performance graphs (series), and to
select for each graph which objects and
which counter to put in there. These are
the three areas in the wizard here. Let’s
start.
Click the New Series button at the top.
Click the Add Group… button which
becomes available at the bottom.
The Add Group wizard pops up. In the
search area at the top we type
“Application XYZ Computers” and click
Search. You could also try the group
called “All Windows Computers”, but
keep in mind that in large environments
a lot of servers are in there, which might
take some time.
After we click the Search button, the group
shows up in the middle of the wizard below
“Available Items”. Click the Add
button below it to add it to the bottom
field of the Add Group wizard and click
OK. This brings us back to the earlier
wizard, where we now focus on the
bottom-right part where it says Rule and
click Browse…
As you can see from the screenshot above
we went looking for “processor time” in
order to find the rule which collects the
CPU usage on the servers. So we typed
“Processor Time” and clicked the Search
button to see if we could find the counter.
As you can see it gives us 5 results and we
have to find the one we need from there.
In this case we want to have the counter
for the Windows 2016 Operating System.
Make good use of the columns there to
find the one you need. You can see the
name of the counter and rule, but the
management pack it is coming from
usually gives a good hint as well. What is
also good to keep in mind is the Rule
Target, because this target must be in
whatever group or object you chose in the
step before this. Windows Server Operating
System is a class below the Windows
Computer class, and Windows Computers are
what is in our group, so this will be
alright. For your case and environment,
look for an applicable counter which will
suit you. It might be different from this
screenshot depending on the management
pack version you have loaded in SCOM.
Click the OK button to confirm the
selection of the rule.
Now that we have all 3 fields entered in
the series we can move on. Of course you can
add multiple series here and select other
groups or counters. For now one series is
fine with us and we click OK.
Back in the Report Wizard itself there is a
Green arrow saying Run. Click that to
actually run the report with the selected
options. Give it a minute to run.
So that gave us the report visible to
the right. And yes we know, we
forgot to select Previous Month Last
day there and just left Today in the
To field in the date selector.
What we can see is a section where
it states when the report was run
with which time options.
We see the Rule used for this series
and it says 4 objects in this case. If
you click the + plus symbol next to it
you will see the 4 machines which
happen to be in the group we
created.
After that you see the graph with a
black line and yellow and blue
areas.
Below that is a green graph and
detail table showing you how many
data points this graph is based on
and how many data points there
were over the time period of the
report.
Now this is a graph for 4 machines at the same
time, so that average line hovering around the
20 to 40 percent CPU is the average across all
four machines for the same time frame. We
always have this kind of higher level report
first to check if there is anything out of the
ordinary.
The Performance Detail report will do
that for us because it gives more than just
that average line. From the Blue and
Yellow areas we get statistical
information. Especially if you see the
blue area there, you can see at least one of
the machines at nearly every point in
time was near the 5% in CPU or less. Also
we see an area where at least one of the
CPUs was in the 60 to 80% range. It might
be interesting to investigate further for
this set of servers which ones are
under-utilized and which ones have a harder
time, so we can act accordingly by
perhaps adjusting the virtual hardware
specs of these machines.
For a look at how this report can be used
for reporting performance across large
groups of servers in often-changing
datacenter environments, we had a blog post
years ago on https://blog.topqore.com,
named: “SCOM Reports on performance
counters for large groups of servers”. The
problem in those types of environments
is often that auditors or managers want to
see performance graphs every month in a
report, and within a month’s time several
servers have been removed or added, so you
need to adjust the report every month to
add or remove those servers. Of
course this is manual work and gets
forgotten as well, so the blog post
explains a way to handle such a situation.
Not completely automated either, but it
sure saved us some time before.
Concluding
We took a report from the Generic
reports library and selected the time
period the report should cover. We selected
a group of targets and a performance
collection rule containing what we are
interested in and ran the report.
After this you would generally either
Export (Save) the report, or save it for
later runs, or schedule it to run
automatically. All these options will be
handled in the other chapters of this book.
They all start from this point: you selected
the report, selected time and objects, and
you ran the report and it shows data, with
the report open on your screen. If
you want to follow any of the guides in
those other sections you can do that from
here or run any other report to the same
point, as long as you are looking at a
report with data in it.
Exporting Reports
How to export a SCOM report? In the
following chapters we will write about how
to Save or Schedule the report.
The assumption here is that you have
opened a report, filled in the required
fields like time-range and objects and
successfully ran the report. You now have
a report in front of you with data.
From the menu near the top of the screen
there is a little button as indicated in the
figure above with the red rectangle. This
is the export button.
From there you can select which format
you want to export it in. There are options
such as Excel, PDF, TIFF, MHTML there.
If you select one, for example Excel, you
will get a popup while the report is being
generated like this:
And after a while the File Save wizard
comes up where you select your file path
to store the generated report.
From there you can open it up. For
example if we open up that report in
Excel we see this:
As you can see it has the fields, the graph
and the table of data points included. In
the SCOM report view those tables are
usually collapsed and while exporting
reports to other formats you might not
see the contents of collapsed tables. In
Excel as you see it is opened by default, so
the report does become longer, but most
focus on the graph near the top.
There is one more thing to note here:
We found years ago that the Export
to PDF function can be problematic if you
plan on printing the PDF on paper. Many
managers and auditors print the reports
on paper when they arrive
by e-mail, for example. The
reason is that this PDF
export option uses the Letter paper
format. This is the default in the USA, but
not in many other countries (such as ours,
where it is A4 by default). Some printers
might give errors or refuse to print if they
don’t understand the format.
We created a blog post years ago
that tries to solve this by adding 2 options
to the drop-down menu for PDF in A4
format (Landscape and Portrait
orientation). This is done by adjusting a
configuration file belonging to SQL
Reporting Services, and the method is shown
for SQL 2005, 2008 and higher. Because this
is a SQL Reporting Services thing it will
also work for other reports that you
export from SSRS in other products.
The blog post, named “SQL Reporting
Services Render PDF in A4 paper size
instead of Letter”, can be found here:
https://blog.topqore.com/sql-reporting-services-render-pdf-in-a4-1/
Saving Reports for later runs
How to save a SCOM report for later
runs? It is possible to run the same report
again next time, without having to
reconfigure all settings of the report, and
get a similar result without much effort.
There are also ways to have somebody
else run the report without having to
change all these fields.
The assumption here is that you have
opened a report, filled in the required
fields like time-range and objects and
successfully ran the report. You now
have a report in front of you with data.
What we often do is first run a report
with a time range between Yesterday and
Today to see if there is data. Next we
change the date range to for example
from Previous Month First Day to
Previous Month Last Day range and run
the report again. You can also take other
relative time frames.
You have a report ready now, and
preferably you have used relative ranges
in the date range. At the top of the report
click the File menu option and in the
drop down you will see a few
possibilities. The ones in the red square
are the interesting ones for this moment.
Let’s review these 3 options
Save to Favorites... This will show up in
the Reporting pane in the Favorite
Reports folder, which can only be seen
by you. We would advise to only do this
for reports which are already authored
and in the other folders or which you
Saved to a management pack.
Publish... This will show up in the
Reporting pane in the Authored Reports
folder and can be run from there later.
Save to management pack... This actually
makes an entry in a management pack of your
choosing and displays a folder in the
reporting pane with the name of the
management pack and a report inside of
it with the name you specify. These
reports can be found by anybody in the
SCOM reporting pane and can be run and
scheduled as well.
Let’s use the last option as an example.
Click Save to Management Pack...
We have to try to give it an appropriate
name if possible. In this case it is showing
Performance data from last month
for the group of servers for Application
XYZ, so this is what the name of the
report reflects. Next you can give it a
description. And we have to select a
management pack to save it in. We still
had our Application XYZ management
pack there to save it in. Next click the
Finish button. After a while the wizard
says it successfully saved the report in
the management pack.
After a few minutes you go to the top of
the report folders list in the left hand
menu, right-click on that yellow folder at
the top of the list called Reporting and
click Refresh. You should now see a
folder appear in the list on the left with
the name of that management pack we
saved the report in and to the right the
report inside it as you can see in the
figure here. If you double click this
report it will just run immediately with
the settings you defined before.
Look also at the extra items at the bottom
of the folders list to the left. There are the
Favorite Reports, Authored Reports and
Scheduled Reports.
In this chapter you learned how to
consistently run a report by saving it in
one of 3 ways so the report can be
accessed and run at a later time, without
needing so much configuration. The next
step might be to schedule a report to run
automatically and deliver it in PDF to an
e-mail account each month for example.
Scheduling reports
In this chapter you will learn how to
schedule a report, so you can get a report
delivered to you or your manager every
month in an e-mail, for example.
The assumption here is that you have
opened a report, filled in the required
fields like time-range and objects and
successfully ran the report. You now have
a report in front of you with data. In
general you would first save this report or
publish it. From there you can run it again
and Schedule it.
In this case we take that report which we
saved to a management pack from our
previous section about Saving reports for
later runs. We run it and wait for the data
to appear.
In the top menu click File and select
Schedule...
You can also right-click a report from the
reports list and select Schedule from
there.
Next you will end up at a wizard where
you need to select the delivery method.
This could be a File share or e-mail. Let’s
select e-mail and continue filling in some
fields (see figure):
In this case we started with a description
so it’s clear what this is.
Next the delivery method = e-mail. This
opens up more options in the wizard.
We specified the manager’s e-mail address
we want this report delivered to and then
filled in the reply-to address. If that
field is always the same you will know
where it comes from. Some e-mail
bridgeheads also require it for e-mails
appearing to come from inside the network.
Choose to include the report and select
from the drop-down list PDF this time.
The subject line below it is auto-generated
and you can change it.
We de-selected the Include Link option
there, because this manager does not have
SCOM console access and will just get this
report in the e-mail as the PDF
attachment.
Click Next. This brings us to the Schedule
tab (see next figure)
Again there are several options here for
scheduling. We selected Monthly and
from there we got the months selected
and we get a choice to select the calendar
days and selected the 1st of every month.
You can see multiple options exist also
for weeks and to send only on weekdays
and so on. We can set a starting date and
an end date if applicable. Click Next.
You will see the Parameters tab, which is
filled by the last run. If the content is as
desired you can just move on from here.
Otherwise adjust as needed. It is just
easier to do it first in the report and next
schedule it so you can ignore this screen.
Click Finish.
After a while the report will be scheduled.
In the SCOM Reporting pane you can
scroll down the left hand menu to
Scheduled Reports and find your
scheduled report in the middle pane.
From there you can either run it by
clicking Open, Edit the schedule or Cancel
the schedule.
In this chapter we showed you how to get
from a report you ran to a scheduled
report, which can run automatically on a
time schedule and deliver a report by file
share or e-mail in one of the supported
rendering formats.
Keep in mind that for successful sending
of e-mail the SQL Reporting Services
configuration must be edited to configure
which e-mail server it should use and any
authentication methods. An e-mail
bridgehead may also need to have this
server added to its allow list to accept
e-mail coming from this server. For File
Shares you might also need to configure a
file share to write to and the
authentication for it.
Why is my report empty?
In this chapter we will discuss some
common reasons why you might find your
report to be empty. It is a very common
thing when you start out playing with
reports.
There are a few common reasons why
reports you run through the SCOM console
reporting pane can be empty. You opened
up a report and you specified a number of
things like time range, objects, counters
and that kind of thing. You run the
report and it runs successfully, but it is
empty.
These are some of the most common
reasons why the SCOM report is empty:
1. The data is not collected.
The performance collection rule which is
meant to collect this counter is turned off.
You can check this by going to the
monitoring pane in the SCOM console and
finding the Windows Computers state
view. It’s near the top of the list. Right-click
any computer you feel should have this
counter and select Open Performance
View. Have a look in the list at the bottom
to see if your counter is there, and if it is
there click the checkbox next to it to
confirm there is data in the graph. Of
course for Linux or Network related
counters you check in those related views.
For Windows Computers and most things
running on those machines you can select
the Windows Computer class, because it is
a parent (in the end) of most other classes
where your rule might be targeted.
Two things to keep in mind here are the
rule name with counter name and the
target. It is easier to find the counters
containing data in the graph by using this
method first, before going to the reporting
pane and trying to figure it out by guessing.
2. There might be a problem with the Data
Warehouse. Sometimes there can be
problems with the data flow and handling
to the Data Warehouse database or inside
of it. In this case you will have to
troubleshoot what is wrong and check for
alerts in SCOM and event log entries on the
SCOM management servers and the SQL
server hosting the SCOM databases. If
something is wrong you will see
notifications about it.
3. The wrong target is selected.
This might be the most common reason
for an empty report. SCOM is all about
targeting the right class or class-object.
All the rules and monitors are targeted at
a class. If you select to see Database Free
Space from a Website you are likely not
going to get an answer in the report. The
same if you want to know a Windows
logical disk free space from a Linux
server. They are different classes and the
rules are targeted elsewhere. If you target
correctly there is a higher chance of you
seeing the data. For things running on
Windows you either have to select the
correct target class which the rule
collecting the performance counter is
targeted at, or you select a higher parent
class (for example Windows Computer).
In the case you go for a parent class to try
to be sure you have your counter, you
would add the machines in the report not
as an Object, but as a Group.
It sounds strange, but see the Windows
Computer as a bag for a group of things
sitting on the machine (operating system,
IIS, file shares, etc.). If you add it as object
you might only see counters for rules
targeted at the specific Windows
Computer class, but if you add the same
thing as a group, it will show you all child
classes and rules targeted at those as well.
Check back to number one in the list
above.
4. Try and select a different time frame.
For example try and run the report from
Yesterday to Today. Or Last week Monday
to Last week Friday. You can do the same
with step one as long as you have enough
data in your OpsDB database (usually
seven days).
For reporting it happens sometimes that a
different time frame suddenly gives data.
Sometimes it is because some data has not
been aggregated yet. Or you might find out
your Data Warehouse has had a problem for
two days, because it does show older data,
but nothing for today.
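Reason number 3 above (the wrong target) is the one that confuses people
the most, so here is a toy model of it in Python. It is a simplified
sketch of the “bag” idea and not the real SCOM class model: the object
names and the two functions are hypothetical, and the real report wizard
has more rules than this.

```python
from dataclasses import dataclass, field


@dataclass
class MonitoredObject:
    name: str
    class_name: str
    children: list = field(default_factory=list)


def descendants(obj):
    """Yield the object itself and everything contained in it."""
    yield obj
    for child in obj.children:
        yield from descendants(child)


def objects_with_data(selected, rule_target, as_group):
    """Names of the objects a rule targeted at rule_target would match.

    Added as an object, only the selected object itself is considered.
    Added as a group, everything contained in it is considered as well.
    """
    candidates = descendants(selected) if as_group else [selected]
    return [o.name for o in candidates if o.class_name == rule_target]


# A hypothetical server with an operating system and a disk "in the bag".
disk = MonitoredObject("srv01 C:", "Logical Disk")
operating_system = MonitoredObject(
    "srv01 OS", "Windows Operating System", [disk]
)
computer = MonitoredObject("srv01", "Windows Computer", [operating_system])

added_as_object = objects_with_data(
    computer, "Windows Operating System", as_group=False
)
added_as_group = objects_with_data(
    computer, "Windows Operating System", as_group=True
)
print(added_as_object, added_as_group)
```

In this model a rule targeted at the operating system class finds nothing
when the computer is added as an object, and finds the operating system
of that computer when the same computer is added as a group.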
We do spend time in the SCOM trainings
we provide to our customers on the
Class Model and Health Model of SCOM
and therefore also on targeting, because
it is the most important thing in SCOM to
understand. Feel free to visit our
website to see SCOM training for both SCOM
Administrator and SCOM Operator.
Reporting from a State View
In general, SCOM reports can be found
from the SCOM Reporting pane. However
there are other ways to create a report.
A very easy one is to run a report from a
SCOM state view.
If you go to any state view in SCOM and
check out the right-hand side Tasks pane,
you will likely see a number of
interesting reports, which would be
targeted at what you are looking at in the
state view. In part this list consists of
targeted reports, based on the class or
class-object you have selected in the state
view.
If you go to another state view (for
example Windows Computer, or Website
or SQL Database Engine), you might see a
different set of reports listed in the Tasks
pane.
Some of these reports there are very
generic (Health, Performance History),
and others are more specific to a certain
type of data.
If you use this method you will most
likely get more useful data, because the
report would be targeted at and be
applicable to the selected object. From
there you can build out the report the
way you want and run, export, publish or
schedule it.
Useful reports
Performance Detail
In a previous chapter we showed you an
example of the Performance Detail
report, where we selected a time range
and made a series (graph) for a group of
servers called Application XYZ Computers,
using the Processor Time % counter.
Let us show you the picture again:
There are a few reasons why we think
this report is very useful.
If you know exactly which counters from
which servers you need, so a very
specific set, you can create performance
reports per server or per counter per
object.
However, sometimes you need to create
graphs of several objects (servers,
websites, databases, disks) at the same
time, for example for an audit or capacity
review across all servers, or, as in our
example, for the performance of a group of
servers for Application XYZ.
What makes the Performance Detail
report useful in this is the statistical
information around it. It is not simply a
line (the black line in the graph in the
figure above), but it also contains
additional information. These are the
minimum and maximum measured values
and the Standard Deviation. When we
look at a report of these four machines in
one graph in the example above, look at
the average line first. 30% processor
usage is good enough. However, in the
graph you see immediately if the Blue and
Yellow areas are very close to the average
line or if they are further out.
From the blue area you can see that after
the first week it extends further on the
high CPU side. It shows minimum values
below 5% and maximum values near 70-80%.
That suggests at least one machine might
not have much to do and at least one
machine may have more to do in CPU. The
blue area shows the lowest and highest
measured values across all machines for
the time frame (which is an hourly data
aggregation, based on performance
measurements, about five minutes apart in
this case). The yellow area just shows you
where the majority of the data points for
that time period are located. If it is
close to the average line, it means the
blue entries far outside the line might be
one-offs.
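To make the relationship between the average line and the two areas concrete, here is a minimal sketch in Python. It is not how SCOM itself aggregates data; it only assumes that one hour holds twelve samples taken five minutes apart, and that the inner band is the average plus and minus one standard deviation, clipped to the measured range. The function name and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def aggregate_hour(samples):
    """Summarise one hour of raw counter samples into report-style statistics."""
    avg = mean(samples)
    # Population standard deviation is an assumption made for this sketch.
    sd = pstdev(samples)
    return {
        "min": min(samples),      # lower edge of the outer (min/max) area
        "max": max(samples),      # upper edge of the outer (min/max) area
        "avg": avg,               # the average line
        "stddev": sd,
        # Inner band: average +/- one standard deviation, clipped to the range
        "band_low": max(min(samples), avg - sd),
        "band_high": min(max(samples), avg + sd),
    }

# Twelve hypothetical 5-minute CPU samples (%), with one short spike
hour = [10, 12, 11, 9, 10, 80, 11, 10, 12, 9, 10, 11]
stats = aggregate_hour(hour)
print(stats)
```

In this example the maximum is pulled far above the inner band by a single spike, which is the pattern described above: an outer area far from the average line does not have to mean the machine is busy all the time.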
Looking at this graph as somebody
interested in Application XYZ and its
performance and capacity planning: in the
first week of the graph CPU is between
0 and 40% with the average at 20%. You do
not have to check further. Looking at the 4
weeks after that, you see the average
creeping up a bit, but the highest values
are consistently higher.
Now you want to know which of the machines
is causing this. In other words, from this
generic graph bundling several machines
together you can quickly see whether you
need to know more, or whether this is
enough information for you.
If you want to know more, zoom into each
of the four machines in the group for
Application XYZ. There is a simple trick
for this, which is slightly hidden.
Look right above the graph. There is a
word “Actions” with a + plus sign next to
it. Click the + sign.
Now you opened it up with the Actions +
sign and you get a sub-menu. Click the
Performance details for every object
option and it renders you the same graph
for the same time period, but for each
machine separately.
Now there is a child report with four
graphs and you can see what is going on.
We saw this:
1. A machine sitting around 10% CPU
being quite constant, so this one may
account for the very low CPU entries. Is
this machine scaled too high in resources?
Might save some money here.
2. A machine sitting at 10% CPU or so, but
with regular (near exact) spikes to 80%.
Looks like once per day. Could be a
nightly job running there, maybe a
backup or antivirus scan? Could be
interesting to look at.
3. Another machine sitting between 0 and
10% CPU. Is this machine scaled too high
in resources? Might save some money
here.
4. The last machine looks like the reason
the graph changed from the first week.
We sat at 40% CPU and went to 60% CPU
average and later went down a bit again
near the end of the time period. This one
too has spikes once per day going up to
90% or more. This machine has the
highest load overall. Could be a candidate
to throw more resources at. Also a
candidate to investigate the daily spikes.
Just by looking at the main report there
was a reason to zoom in. We zoomed in
and found a few machines with different
behavior. That is what we would expect of
Application XYZ, which might be built with
a front-end and back-end structure. We
found grounds for conclusions relating to
capacity management, and reasons to ask a
sysadmin to investigate what is happening
on the remaining machines.
Here is a screenshot of that machine 4 in the list
above.
We want to draw your attention also to
the red number 1 in the figure. While
quickly scrolling through the graphs in
this ‘child’ report you might get the wrong
idea. This is because the scale on the
Y-axis of the graph can change. In this
case because it had values up to 100% it
adjusted the scale to run from 0 to 120.
But the other machine which had nothing
to do, has its scale running from 0 to 40
because there is nothing more to show.
But if that machine is running at 20% CPU
in a graph that scales from 0 to 40% it
looks like it is sitting at half the CPU.
Always look at the Y-axis in performance
reports, because they will auto-scale to
what is needed to render the picture. And
in this case it is not always a pure 0 to
100% as you would expect.
When you are done zooming in for this
‘child’ report you can click the blue back
button in the menu above it indicated
with the red 2 we put in the figure.
The report “Performance Detail” is a very
useful one for reporting across a larger
number of instances and determining if it
makes sense for you to zoom in and find
an object which clearly stands out from
the group and potentially needs your
attention. We can tell you it is very
difficult otherwise to look through many
graphs (each with different Y axis!) and
look for an odd one out which might not
be there. Now you know what to look for
(or not), you could zoom into the ‘child’
graphs and find the one you need to take
action on.
Useful reports
Data Volume By Management Pack
We want to discuss a very useful report
for SCOM Admins especially. For other
stakeholders this will be less interesting.
It is purely SCOM-specific, but gives a LOT
of information. The report we are talking
about is the “Data Volume By Management
Pack” report. We highly recommend that
each SCOM Admin runs this report regularly
(weekly, for example). Let us have a look
at this report and some of the things it
shows us.
In the figure shown you can find the
location of the report in the folder System
Center Core Monitoring Reports. There
are other very useful reports in this
folder, but we will focus on the Data
Volume by Management Pack in this case.
Double-click the report and change the
time frame, for instance Yesterday to
Today. In our case we took today minus
seven days until today to get a week's data
(small demo environment).
Next in the middle of the report wizard
you see Data Types and Show Top. We
usually start with not filtering the Data
Types for a first look at the report. After
that we usually make choices in that list
to focus on Performance or Events or the
other choices. Reasoning here is that
usually the amount of performance
counter entries is so much higher than all
other data that the top-x in the list only
reflects the amount of data collected in
performance counters. If you de-select
that data type you will see another top-x
listing to work on. We will see this later.
The Show Top field we usually set to 40 or
50 entries, so we get a feeling of what is
going on. Let’s run it.
This is a screenshot from one of our
demo/test environments. The first question
you should always ask yourself here is: why?
The different columns give different
reasons to look at them. In short:
Performance and Events have to do with
the amount of data in the database,
causing large databases and a lot of data
flow. You can ask yourself if you need all
this information, and whether you actually
view and report on these counters and
events. Some performance collection rules
might be turned off if they are not needed,
or you can change collection intervals.
Discovery Data, Alert Count and State
Changes have more to do with what we call
Config Churn. Even though the numbers in
these columns are lower than for
Performance Counters, they are very
important for the performance of
SCOM, the management servers and the
SCOM console or other dashboards.
Of course Alerts will be very visible to
SCOM Operators, even more so if they are
linked to incident management. A few
core management packs also show up in this
report for this reason.
This report is very valuable for tuning
management packs, but also for finding
issues or potential issues affecting
capacity, management server performance,
or visible and less visible slowdowns of
SCOM and their impact on the users of
SCOM.
As you can see in the figure above you can
click on the numbers in the table. Let us
click in the performance column on the
Windows Operating System pack entry
(second in the list of packs in the figure).
This shows a list, in this case of the Rules
collecting data, with their names. It shows
the percentage of data volume within this
pack for each rule and the number of data
points.
From here you can think about whether you
need some of these counters to be
collected, or if you feel that the amount of
data is too much you could change the
collection interval on some counters.
Changing the collection interval from 5
minutes to 10 minutes effectively halves
the number of data points. Also, now you
can find the correct rule names, which are
not always obvious from the naming
convention.
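The effect of the collection interval on data volume is simple arithmetic, sketched below in Python. The function names and the figure of 200 counter instances are hypothetical and only serve as an illustration.

```python
def samples_per_day(interval_minutes):
    """Data points one counter instance produces per day at a fixed interval."""
    return (24 * 60) // interval_minutes

def daily_volume(interval_minutes, instances):
    """Data points per day for a rule that runs against many instances."""
    return samples_per_day(interval_minutes) * instances

# Hypothetical rule collecting one counter on 200 instances
before = daily_volume(5, instances=200)
after = daily_volume(10, instances=200)
print(before, after)
```

With these assumed numbers the rule drops from 57,600 to 28,800 data points per day, which is the halving mentioned above.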
Going back to the main report and
zooming into the events column you can
also ask yourself if you are actively using
this data. If you never use the event
data to look back (using SCOM) or
report on it, maybe you can turn off
the collection of those events. The
Operating System packs especially can
collect a lot of events, sometimes millions
per day if there is something wrong on some
of the machines on the network. You can also
keep those turned on, but we suggest in
that case to look at those on a daily basis.
You can zoom into that from the main
report and schedule the child report.
A high event count often indicates machines
having problems with services or drivers
crashing, sometimes every second, thus
rendering the monitored server useless and
sometimes impossible to even log in to. You
will want to fix that, both to fix the
application server and to reduce the
amount of collected data it causes in the
SCOM databases. Act on it or turn it off.
If we turn off the Data Types for
Performance and Events, we see this in
the same report (see figure below):
These 3 columns have to do with config
churn. Normally there are not that many
items in Discovery Data. This is because
you are not adding many servers and
websites each day, thus not many new
objects are discovered. If you see a high
number in there, it could be a management
pack with a wrong configuration. In
the past we have seen discoveries with a
counter used as a property, which changes
every time the object goes through
re-discovery.
Alert Count is something simple and you
know this looking at the SCOM console.
However, there might also be alerts you
do not see. For example if they happen
during the night and they close again
before you get back into the office, or if
these alerts are not forwarded to ticketing
or e-mail, or if they open and close within
a minute. It might still be useful to have a
look if these numbers are what you
expect. And of course zooming in you find
what the most common alerts are and so on.
The State changes have to do with
Monitors in SCOM because they hold
state. A state change in for example an
object in SQL (first line in the figure
above) will result in a state change on that
monitor. However it can and will also
result in changes of state moving up the
health tree, from database file to database
to DB engine to server and several rollups
in between. Those rollups and so on result
in the more generic packs showing up in
this list for state changes. You do not have
to zoom into those much because they will
simply reflect rollups and such. But you
can imagine that SCOM has to calculate
through all those state changes and thus
many more rollups and parent class
objects and perhaps dashboards. This
causes a lot of work for the SCOM
infrastructure.
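The way one state change at the bottom of the health tree turns into several state changes can be sketched as follows. This is a deliberately simplified model in Python, not SCOM's actual health engine: it assumes a plain "worst state of any child" rollup and a four-level tree taken from the example in the text. The class and function names are hypothetical.

```python
SEVERITY = {"Healthy": 0, "Warning": 1, "Critical": 2}

class Entity:
    """One object in a simplified health tree."""

    def __init__(self, name, parent=None):
        self.name = name
        self.own_state = "Healthy"
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def state(self):
        # Worst-of rollup (an assumption): as bad as the worst child
        states = [self.own_state] + [child.state() for child in self.children]
        return max(states, key=SEVERITY.get)

def apply_change(entity, new_state, tree):
    """Set one entity's own state; return names of all entities whose state changed."""
    before = {e.name: e.state() for e in tree}
    entity.own_state = new_state
    return [e.name for e in tree if e.state() != before[e.name]]

server = Entity("Server")
engine = Entity("DB Engine", server)
database = Entity("Database", engine)
db_file = Entity("Database File", database)
tree = [server, engine, database, db_file]

changed = apply_change(db_file, "Critical", tree)
print(changed)
```

In this model a single monitor going critical on the database file changes the state of four objects, and the same happens again when it goes back to healthy. A flapping monitor therefore multiplies its state changes up the tree.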
Check which state changes are
causing the chain of state changes (SQL
pack, OS pack, IIS pack, Defender pack in
our example case). If you solve those, the
resulting reduction in config churn makes
a big difference. And if they cause alerts
as well, or state changes on an important
business dashboard, the improvement is
even more visible.
As you can see, there are many reasons
why this report is important to a SCOM
admin, and as the SCOM admin works on
it, the whole business benefits as well,
both from the problems found and from a
better tuned and better performing SCOM.
Useful reports
OS performance reports
In this chapter we will have a look at the
Operating System reports.
There are a few to look at. First of all there
are folders relating to each version of
Windows, but the reports in them are
basically the same (see first figure).
Some of these reports can also be
accessed from the state views in the
monitoring pane as shown in an earlier
chapter.
As you can see there are a number of
reports having to do with performance of
CPU, Memory, Disk and such.
Have a look through those and determine
what could be useful for you. There are
report details descriptions at the bottom
of the screen.
Another example is this Operating
System Configuration report, which gives
a look at the discovered inventory and
properties of the class objects found for
the class Windows Server Operating
System.
There is also another folder in this
reports list “Windows Server Operating
System Reports” (see figure below). This
has only two reports in it, but you will
find them useful. Let’s have a look at both
of them.
The first one is Performance By System:
This is a cut-out of this report targeted at
one server with a selection of seven days
of data. As you can see the report has
space for 7 days of data in it. If you select
only one day of data you cannot see those
bar charts next to each other, so it
makes sense to select a few days.
You can see there are multiple graphs in
the report, all prepared for you.
You cannot change much in the layout,
but we find it very readable: it gives a
trend across the week and shows some
numbers at the sides for processor, memory,
disk and network. A very useful report to
show some base metrics for a server. You
can simply schedule this one.
Next we want to show you the “Performance By Utilization” report:
We left the parameters selection screen
open on purpose here, so you can see it.
We selected a week of data and chose the
Windows Server Computer Group to get
all of our servers (not that many in the
demo environment obviously). We selected
the Utilization – Most option to show the
highest values of each counter, and the
number of systems we wanted to see in
the top-x tables.
If you run the report, you will see below
the parameters a number of tables, each
for a different counter relating to
Processor, Memory, Disk or Network.
As you can see, each table simply
shows the top-x servers with the highest
utilization of that counter.
It is very useful to be able to determine
which machines have high
utilization of these counters, because
you can then do something about it:
find out if a process is doing more
than it needs to do, find resource
hogs, or conclude that capacity
management needs to add resources to
some of the busy machines.
Likewise you can also select the
Utilization – Least option in the
parameters to find the machines using
the least of these resources. That might
mean they have nothing to do, or they
might have too many system resources
assigned to them - you could claim some
resources back in capacity management.
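The idea behind the Most and Least options can be sketched in a few lines of Python. This is only an illustration of the ranking, not the report's own logic; the server names and average CPU values are made up.

```python
def top_x(averages, count, most=True):
    """Return the `count` servers with the highest (or lowest) average value."""
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=most)
    return ranked[:count]

# Hypothetical average CPU utilization (%) per server over one week
cpu = {"SRV01": 12.5, "SRV02": 71.0, "SRV03": 4.2, "SRV04": 55.3}

busiest = top_x(cpu, 2)              # comparable to Utilization - Most
idlest = top_x(cpu, 2, most=False)   # comparable to Utilization - Least
print(busiest)
print(idlest)
```

The first list points at candidates for more resources or further investigation, the second at candidates for reclaiming resources.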
In all, the reports that come with the
Windows Operating System
management packs, and also those with the
Linux Operating System management
packs, are very useful to have a look at
and use.
Useful reports
SCOM Health Check Reports
In this section we will have a look at the
“SCOM Health Check Reports”.
This is a set of reports created in the
SCOM community, by Pete Zerger and
Oskar Landman mainly.
What you do is download the package.*
You install the SCOM management packs.
Next you do not touch anything until you
read the Management Pack guide!
Why? Because nobody does. However, in
this case it is needed, because you need to
do something to make it work.
SCOM Reporting uses the Data Warehouse
database for everything. However,
some reports in the SCOM Health Check
Reports have to be able to read from the
Operational database of SCOM as well.
Therefore you need to create an
additional Data Source in SQL Reporting
Services for that. It is pretty simple to do,
but needed to be able to run at least half
the reports in there.
You will see this list of reports now. As
you can see it has reports having to do
with Alerts, Events, Performance,
Monitors, Infrastructure, Agents, etc.
These reports are especially useful for
SCOM Admins. Have a look at how this
can help you optimize your SCOM
environment!
* The current version can still be found on TechNet Gallery, but it will be moved soon.
We will update the link to it when that happens.
https://gallery.technet.microsoft.com/SCOM-Health-Check-Reports-c32e8f93
Useful reports
Availability and SLA
In this last chapter about SCOM
reporting you will learn about
Availability and/or SLA/SLO reporting.
Basically, stakeholders for the monitoring
often want to know if a server or
application was UP during the last
month/week/day. In SCOM this is defined by
two items: the agent itself being available
(Agent heartbeats through the watcher),
and the Health state of whatever it is you
are looking at. We must make a choice
here on what we call down.
Most often a red state is considered
critical and a down state. This is not
always the case in the real world of
course. But we have to make a choice on
what we define as down.
A server itself has an agent and we could
say the server is down when the agent is
down. Of course the server itself could be
running fine while the SCOM agent has a
problem. But we need to make a choice
on what we can go on. It is why we have
alerts and dashboards telling us when a
server is unavailable or when an object
monitored by SCOM goes into a red state,
so you can react to it.
Availability Reporting
In the Microsoft Generic Reports
Library you will find the Availability
Report template (see figure below).
If we open up the report we can make the
choices of time range again. In this case
we just selected the Previous Month
entries.
Next we define what objects to report on.
We left that popup screen open in the
picture above. And we used the Add
Group method, because it covers also
underlying objects.
We took the SCOM server itself as an
example here, which is not what we are
normally interested in (if SCOM is down,
the data will not get into the report very
well). But as an example we selected the
Health Service Watcher class object of
this server; imagine this is any
normal server. This is basically the thing
which tells you if the SCOM infrastructure
has received heartbeats from that agent.
For SCOM this is an indication of whether
the server is up or not.
You can also select other objects living on
these servers. For example a database or
a website. The whole server being
available is not that interesting because
you are interested in what the machine is
actually doing!
To the right of the report wizard in the
back is also a list of health states you can
consider to be Down Time. Critical is
selected by default, and all the other
choices can be added, even Warning state.
Be careful again, because you may report
things as down while they were still
effectively running. Always understand your
choices, because you WILL have to
explain them to the stakeholders looking
at these reports!
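How much the Down Time choice influences the reported number can be shown with a small calculation. The sketch below is plain Python and not the report's own implementation. It assumes the time in each health state is already known in hours, and the numbers for the month are invented.

```python
def availability(intervals, down_states=("Critical",)):
    """Percentage of time not spent in any state that counts as down."""
    total = sum(hours for _, hours in intervals)
    down = sum(hours for state, hours in intervals if state in down_states)
    return 100.0 * (total - down) / total

# Hypothetical 30-day month (720 hours) split by health state
month = [("Healthy", 700.0), ("Warning", 14.0), ("Critical", 6.0)]

default_pct = availability(month)
strict_pct = availability(month, down_states=("Critical", "Warning"))
print(round(default_pct, 2), round(strict_pct, 2))
```

With only Critical counted as down, this object is available a little over 99% of the month. Adding Warning to the down states drops the same month to just over 97%, although nothing about the object itself changed.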
SLA/SLO Reporting
Now there are cases where you have
implemented more monitoring, such as a
Distributed Application. This could be an
application with a front-end and a
back-end, and maybe also made highly
available across several servers. In some
cases these can be as simple as adding a
single Website object, but they can also be
much more complex. You can run similar
reports on these, but you can also run
SLA / SLO reports! This is where you
define a threshold for the amount of
availability for a Distributed Application
or a Synthetic Check like a website.
Before you can run an SLO report you
must define an SLO first. By default there
are none defined in any default
management pack.
Go to the Authoring pane of the SCOM
console and go to Management Pack
Objects – Service Level Tracking.
Here you can create a new Service Level
Tracking SLO. You can select a
Distributed App or a Website check for
example as a target, and you need to
specify a percentage where you feel the
SLO will be broken. This could be at 90%
or 95% or whatever your needs are. Save
this in a management pack and wait for it
to gather some data.
Once you have defined the SLO target,
you can create either a dashboard for
displaying the SLA values or you can use
reporting to show you the SLA values.
In the SCOM Reporting pane you can find
the Microsoft Service Level Report
Library and in it the report Service Level
Tracking Summary Report (see figure
below).
For this report, you can specify the time
range (Previous Month or Previous
Quarter is the standard range for this
type of report, but for testing purposes, it
is recommended to use Yesterday to
Today) and the SLO target you are
looking for. The last thing to define is
which time periods you want to report on
in comparison to the initially specified
report duration. You could select
Previous Week, Previous Month, or
Previous Quarter and show them side by
side for each of the SLO targets you
specified. If you run this report it will
show you a few columns with the SLO
numbers for each selected time period
and for each object you selected to run
the report against.
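The side-by-side comparison against the SLO threshold boils down to the check sketched below in Python. The measured percentages and the 95% target are hypothetical values, and the function is only an illustration, not part of SCOM.

```python
def slo_report(measured, target_pct):
    """Pair each period's availability with whether it meets the SLO target."""
    return {period: (pct, pct >= target_pct) for period, pct in measured.items()}

# Hypothetical availability (%) of one SLO target for three periods
measured = {
    "Previous Week": 99.2,
    "Previous Month": 94.1,
    "Previous Quarter": 96.5,
}

report = slo_report(measured, target_pct=95.0)
for period, (pct, met) in report.items():
    print(f"{period}: {pct}% {'met' if met else 'BREACHED'}")
```

Showing several periods next to each other makes it visible that a target can be met over a week or a quarter and still be breached for the month in between.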
Additional SLA Reporting
We discussed this a few years ago, but if
you happen to run Martello Live Maps
(used to be Savision Live Maps), you will
have Service definitions. A Service is the
same as a distributed application in this
case. If you create a Service from within
Live Maps this will automatically create
Service Level targets for each of the
Service sub structures (User,
Application, Infrastructure) and assign a
default threshold to it. It automatically
starts monitoring and displaying it in the
dashboarding and you can change the
threshold settings etc. from within. You
can also turn on alerting for an SLA
breach. If you happen
to have this product this could make
defining the service levels easier.
However even if you do not have this
product, there is still a very nice report
they have created.
The SCOM SLA Reporting Management
Pack is a free pack which can be
downloaded at:
https://martellotech.com/downloads/free-scom-management-packs/
This management pack can run against
any SLA/SLO target you have, so whether it
is SCOM or Live Maps related you can
target it and run the report. It will also
give you a drill-down possibility pointing
to the objects within that SLO with the
most problems, so you can find the cause
more easily.
Feel free to go get it and add it to your
arsenal of SCOM Reports.
Additional Reporting Options
Using the SCOM Reporting feature is the
most logical way to report on all kinds of
data being gathered and calculated by
SCOM. However, since the beginning
other methods have been used as well.
SCOM Console
First of all there is the SCOM Console.
You can create a performance view in
there, showing a graph for the last 7 days
for a certain counter or a few of them. It is
possible to export that as a picture.
Dashboarding Solutions
Another method of pulling data from
SCOM is by using the various SCOM
related dashboarding tools. For example
Martello Live Maps, SquaredUp and
OpsLogix. Often it is easy in these tools to
create the view you want to see using the
SCOM Data and exporting a picture of it
to use in a report. Some even have the
possibility to export the underlying data
points to a CSV for example.
Azure Monitor
It is also possible to use a hybrid solution
and have Azure Monitor pull some
data to the cloud. For reporting often
Performance Data is used, which gets
collected from the agent and sent to the
cloud to your private workspace. There
you can report on the data without
aggregation for any period of time
(depending on your retention time in
Azure Monitor). From here you can show
the data in graphs or use queries to show
sets of data to further analyze.
Power BI
It is possible to connect Power BI
Desktop to your SCOM Data Warehouse
and create views from there. In the
beginning the connection needs to be
created and the right tables selected.
After that you can start creating views in
Power BI. An example of this was shown
by Cameron Fuller in a blog post that can
be found here:
https://www.catapultsystems.com/blogs/using-power-bi-for-disk-space-dashboards-and-reports-in-operations-manager/
Lately there have also been a few Power
BI dashboards to show the health and
performance of a SCOM Infrastructure
as an example of what you can do with it.
An example of this is one from Silect,
available at:
https://www.silect.com/dashboards-for-scom/
Using a Power BI Enterprise Gateway
and a Power BI Pro account (paid) it is
also possible to connect to your SCOM
data from the cloud based version of
Power BI Sites and show data there. And
if the connection is in the cloud, you
could access this data through mobile
apps as well. Tao Yang has published an
article about Power BI Sites with SCOM,
available at:
https://blog.tyang.org/2015/12/14/extending-your-opsmgr-power-bi-dashboards-to-power-bi-sites/
Epilogue
We hope that after reading this booklet
about SCOM Reporting you will feel more
comfortable with the feature and will try
out some of the suggestions included.
This way you can use the wealth of data
in the Data Warehouse to your advantage
and to inform the other stakeholders of
what is going on in the company
environment, such as performance,
capacity, alerts (incidents), health and
uptime of applications and machines.
These reports can be saved and run again
or scheduled to run regularly, so you do
not have to do the same work every week
to get these reports to the people who
need them.
Also, for SCOM Admins there are a
number of reports which are very useful
for keeping the SCOM Infrastructure
clean and healthy, so we highly suggest
having a look at those.
There are more possibilities using the
SCOM Reporting feature, such as using a
report builder or Visual Studio to create
custom reports with adjusted
visualizations and personalization, but
we did not go into those advanced topics
in this booklet.
During our SCOM Administrator
trainings, Migration handovers, Health
Checks and Maintenance as a Service we
look at how the reporting feature is used
and advise on how to make better use of
it. We understand that in a small booklet
we cannot discuss all cases and details
about reporting and that you might want
further assistance.
If you visit topqore.com you will see
the services we provide and feel
free to contact us through
We will be happy to provide you with the
consultancy or products to improve your
SCOM environment in whatever way is
needed.
The TopQore Team