Post on 16-Oct-2021


Page 1: SCOM Reporting - TopQore

SCOM Reporting

Bob Cornelissen MVP

and the TopQore team


Introduction:

In our many SCOM engagements with customers and from experiences in the SCOM

community and forums, we have seen that the SCOM Reporting feature has been

under-utilized. Some of the reasons are insufficient awareness of its capabilities, other

priorities, disappointment from empty reports and many more. However, managers

and other stakeholders often ask for reports for purposes such as Capacity

Management, Incident Management, SLA/SLO uptime of machines or websites.

In this booklet we will try to get you started with SCOM Reporting and explain how to

access the vast wealth of data sitting in your Data Warehouse. We will talk about the

reporting feature and its intended audience, running reports, saving and exporting

reports for future use, and we will show a number of very useful reports for different

stakeholders and also for SCOM Admins.


Table of Contents

− What is SCOM Reporting

− Who are the Stakeholders

− Discover SCOM Reporting

− Running the first report

− Exporting Reports

− Saving Reports for later runs

− Scheduling reports

− Why is my report empty?

− Reporting from a State View

− Useful reports - Performance Detail

− Useful reports - Data Volume By Management Pack

− Useful reports - OS performance reports

− Useful reports - SCOM Health Check Reports

− Useful reports - Availability and SLA

− Additional Reporting Options

− Epilogue


What is SCOM Reporting?

SCOM is a monitoring product and it stores a lot of data in its databases. Short-term data is stored in the operational database for a few days and can be viewed directly from, for example, the SCOM console views. For the longer term, the data is also kept in the SCOM Data Warehouse database and aggregated into Hourly and Daily data sets with additional statistical data, such as highest and lowest values and standard deviations. Data retention is usually several months to a year, which is enough to run reports over longer periods of time. The data covers Performance Counters, Alerts, Availability (health states of objects), Events and several other types.
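
To make the Hourly aggregation more concrete, here is a minimal Python sketch of how raw samples could be rolled up into one row per hour with an average, lowest and highest value and a standard deviation. It is an illustration only and not how the Data Warehouse works internally: the function name `aggregate_hourly`, the sample values and the choice of a population standard deviation are all assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def aggregate_hourly(samples):
    """Group (timestamp, value) samples per hour and summarise each hour."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Truncate the timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {
        hour: {
            "samples": len(values),
            "average": mean(values),
            "min": min(values),
            "max": max(values),
            "stddev": pstdev(values),  # assumption: population std deviation
        }
        for hour, values in sorted(buckets.items())
    }


# Made-up CPU samples: three in the 09:00 hour, one in the 10:00 hour.
raw = [
    (datetime(2021, 10, 1, 9, 5), 10.0),
    (datetime(2021, 10, 1, 9, 20), 30.0),
    (datetime(2021, 10, 1, 9, 50), 20.0),
    (datetime(2021, 10, 1, 10, 5), 50.0),
]
hourly = aggregate_hourly(raw)
for hour, stats in hourly.items():
    print(hour, stats)
```

A Daily data set would follow the same idea with the timestamp truncated to the day instead of the hour.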

To get data from the data warehouse in native SCOM we have the SCOM Reporting feature, which is based on SQL Server Reporting Services with an extra sauce on top. Longer-term data can also be displayed by paid third-party SCOM dashboarding products, such as the Martello Live Maps portal and the SquaredUp portal. These are, however, mainly meant for viewing trends, such as performance trends.

SCOM Reporting is meant to get relevant data from SCOM and display it in a report. This report can be saved, exported, printed and scheduled, as we will show later in this booklet. For example, we can have a performance report for a few servers sent to us by e-mail every month in PDF format. Some regular requests in companies are SLA and KPI reports on the base performance counters of the operating system (CPU, memory, disk) and availability reports of machines or websites. In the other chapters we will go deeper into the different aspects you may encounter when reporting on SCOM and also show you a few common reports.


Who are the Stakeholders

In this chapter we want to discuss the stakeholders. In short: who are we doing it for? As with monitoring in general, with SCOM and SCOM Reporting we should take a closer look at the stakeholders.

Why?

Because with monitoring tools, often only third-line System Admins are considered to be using them and looking at them.

However, monitoring and reporting are important for several more people in the company, or could be, or should be. For example, larger companies have audits. The auditors request reports from management, and management also wants regular reports about the performance and availability of their systems. For reporting we would consider these to be some of the stakeholders who can request reports or benefit from them:

- IT Managers

- SCOM Admins

- Company Managers (outside IT)

- Auditors

- Capacity Management

- Incident Management

When looking at monitoring and dashboarding in general there would be a few more, such as an IT helpdesk. If you can provide them with good dashboards showing the current status of the environment, that is usually good enough.

Looking at these stakeholders you will recognize that they likely need different types of reports. Luckily SCOM provides a number of report types; many are built into management packs or can be created from the generic reports. There are also a few community reports which are very useful. As you can see from the stakeholders list, a lot of them are managers or management related. You can imagine they often do not look at the technical aspects of the backend systems and causes, but need an overview of performance, availability and KPIs. System Admins, on the other hand, usually focus on very specific technical details, or can use reports for SCOM-specific things in the environment. We will show a bit of everything in this book.


Discover SCOM Reporting

Where can you find SCOM Reporting?

If you want to start running reports, you

first need to gain access to the SCOM

Reporting pane as shown below in the

picture. We will show you other ways

later, but this is starting from the top.

After opening the SCOM Console, take a

look at the bottom-left to see if you have

access to the Reporting Pane. There

should be a green button there, saying

Reporting. If the green button is not there,

you can talk to your SCOM Admin to

arrange access to it for you or a group you

are a member of.

If you click Reporting, it will take you to

the Reporting Pane as shown in the

picture.

To the left you will see a list of report

folders and if you click for example the

Microsoft Generic Report Library you will

see a list of reports in the middle pane of

the screen. If the list is empty, this could

indicate either that you do not have rights,

or there is something wrong with SCOM

Reporting.

Some of these folders are default and

belong to SCOM, like the folder selected in

the picture. Others are coming from the

software specific management packs

imported into SCOM, such as SQL and

Windows Operating System.


The reports in the Microsoft Generic Report Library are meant to be the starting point for many other reports, whether provided or made by yourself: generic alert, performance, availability and event based reports. For example, you could open a Performance Detail report from here and get a choice of targets to report on.

You might select a group of servers which are running some application and a timeframe like Last Month and run the report. You can save that selection for later and have your “Application XYZ monthly performance report” ready to run next time, based on the default base reports.

One more thing to note in this view is the Report Details section at the bottom-middle of the screen, which appears when you select any report with your mouse. The information in those details is sometimes extensive and sometimes limited. However, whenever you want to run a report you are not familiar with, you should check these details. They often explain what the report is meant for and, very importantly, what kind of objects you need to select in the report to get any data shown! We will come back to that later in the section about why a report could turn up empty.

So now you know where the default Reporting Pane is in SCOM and how to find reports by scrolling through the folders list and the reports. We will talk a bit more about running reports in the next section.

Running the first report

In this chapter we will use a basic Performance report as an example. We will reuse this example in the sections about exporting, saving and scheduling reports.

For this example we created a management pack where we will store configuration data related to monitoring of “Application XYZ”. We created a group called “Application XYZ Computers” and saved the group in this management pack. We made a few Windows Computers in our demo environment members of this group. Because we are looking at base reports first, we do not have to create custom monitors or anything like that.

Now we want to have a performance

report of these servers and we are

beginning first from the base report

mentioned in the previous section. Later

we will show you another useful example

and link it at the bottom of the page.

The report we are going to use as an example is not the easiest one, but it is a very flexible one, so there are some choices to be made. If you select reports which are ready-made for a specific purpose, there will be fewer choices to make and the report will be easier to run. Do not worry, however. Feel free to play along with the example.

Let’s start making something…

Open the SCOM Console and go to the

Reporting Pane. Select Microsoft Generic

Report Library in the menu to the left and

select the Performance Detail report from

the list in the middle of the screen.


Note at the bottom-middle of the screen

you see the Report Details. In this case

there is an explanation telling us what it

is used for, that there are additional lines

and areas in the report, what they mean

and some example usage entries below

that. Since this is a very generic report,

you can place any object in there with a

performance counter to show.

In other reports there is often a statement telling you what types of objects to put in the report (a computer, a database, a website, a disk, etc.).

Now double click the report in the list to

open it or click Open. You will see the

figure below. We need two basic things

here: We need to know what timeframe

we want the data for, and what the

objects and counters will be.


Let’s first take care of the time, which is on

the left hand side of the report wizard.

Data aggregation is going to be Hourly or

Daily, depending on what you are looking

for. If you are running reports over the

course of a day or week or up to a month

you can take the Hourly aggregation to get

enough data points. If you are running

reports across an extended period of time

like a quarter or year, you might want to

select Daily aggregation.
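
The rule of thumb above can be written down as a small Python sketch. The function names and the 31-day cut-off are assumptions made for illustration; SCOM itself leaves the choice entirely to you.

```python
from datetime import date


def suggest_aggregation(start: date, end: date, hourly_limit_days: int = 31) -> str:
    """Suggest Hourly for ranges up to about a month, Daily for longer ones."""
    days = (end - start).days + 1  # count both the first and the last day
    if days < 1:
        raise ValueError("end must not be before start")
    return "Hourly" if days <= hourly_limit_days else "Daily"


def data_points(start: date, end: date, aggregation: str) -> int:
    """Maximum number of points per object and counter in the graph."""
    days = (end - start).days + 1
    return days * 24 if aggregation == "Hourly" else days


month = (date(2021, 9, 1), date(2021, 9, 30))
year = (date(2021, 1, 1), date(2021, 12, 31))
print(suggest_aggregation(*month), data_points(*month, "Hourly"))
print(suggest_aggregation(*year), data_points(*year, "Daily"))
```

A month of Hourly data gives at most 720 points per counter, while a full year of Hourly data would give 8760, which is why Daily aggregation is the more readable choice for long ranges.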

Next are the From and To fields. You can do nice things with these, such as selecting specific dates, but also relative dates like Last Week or Last Month. This is often used in monthly reports, where you select Previous Month First Day in the From field and Previous Month Last Day in the To field. Relative dates are useful because we can later schedule this report to run on the first day of every month and it will pull in data from the month before. When we run a report for the first time, we always change the From field to Yesterday and leave the rest of the fields at their defaults, so the report covers the last 24 hours.

We will select the times of day as well, so we set them to 12:10 AM and 11:50 PM to cover the whole of the first and last days of the month. Time Zone could be applicable if you are running reports for a single day and have to deal with counters in a central database for servers all over the world.
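
As an illustration of what the relative dates resolve to, the following Python sketch computes the Previous Month First Day and Previous Month Last Day for a given run date and applies the 12:10 AM and 11:50 PM times mentioned above. The function name `previous_month_range` is our own; the report wizard does this calculation for you.

```python
from datetime import date, datetime, time, timedelta


def previous_month_range(run_date: date):
    """Return (from, to) covering the whole month before run_date."""
    first_of_this_month = run_date.replace(day=1)
    # One day before the first of this month is the last day of last month.
    last_day_prev = first_of_this_month - timedelta(days=1)
    first_day_prev = last_day_prev.replace(day=1)
    start = datetime.combine(first_day_prev, time(0, 10))   # 12:10 AM
    end = datetime.combine(last_day_prev, time(23, 50))     # 11:50 PM
    return start, end


# A report scheduled for 1 November covers all of October.
start, end = previous_month_range(date(2021, 11, 1))
print(start, "to", end)
```

Because the range is calculated from the run date, the same saved report keeps working every month, including months of different lengths and the change of year.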

The Use Business Hours button is often useful for KPIs and availability reports, where we might only be interested in the values during business hours. During the night there might be maintenance, backups, scan jobs and other automated jobs, and nobody online. You may have a requirement to include only office hours on business days.
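
The effect of limiting a report to business hours can be sketched in a few lines of Python. The hours (08:00 to 17:00), the Monday to Friday working week and the sample values are assumptions for this example; in SCOM you configure the business hours in the report itself.

```python
from datetime import datetime, time
from statistics import mean


def in_business_hours(ts, start=time(8, 0), end=time(17, 0),
                      workdays=(0, 1, 2, 3, 4)):
    """True if ts falls on a working day (Monday=0) within office hours."""
    return ts.weekday() in workdays and start <= ts.time() < end


# Made-up CPU samples: a nightly backup peak, two office-hour values
# and a quiet Saturday.
samples = [
    (datetime(2021, 10, 4, 2, 0), 95.0),   # Monday night, backup running
    (datetime(2021, 10, 4, 10, 0), 40.0),  # Monday morning
    (datetime(2021, 10, 4, 14, 0), 50.0),  # Monday afternoon
    (datetime(2021, 10, 2, 10, 0), 5.0),   # Saturday
]

all_avg = mean(value for _, value in samples)
business_avg = mean(value for ts, value in samples if in_business_hours(ts))
print("all hours:", all_avg, "business hours only:", business_avg)
```

The nightly backup and the idle weekend both distort the overall average, which is why a KPI based on business hours can tell a different story than the raw numbers.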

Next we need to pick which counters from

which objects we are interested in. We do

it by moving to the right hand side of the

report wizard and clicking the Change

button above the Objects section.

This can be a confusing area of the wizard. What we are asked to do is create one or more performance graphs and select, for each graph, which objects to put in there and which counter. These are the three areas in the wizard here. Let’s start.

Click the New Series button at the top.

Click the Add Group… button which

becomes available at the bottom.


The Add Group wizard pops up. In the

search area at the top we type

“Application XYZ Computers” and click

Search. You could also try the group

called “All Windows Computers”, but

keep in mind that in large environments

a lot of servers are in there, which might

take some time.

After clicking the Search button, the group appears in the middle of the wizard below “Available Items”. Click the Add button below it to add it to the bottom field of the Add Group wizard and click OK. This brings us back to the earlier wizard, where we now focus on the bottom-right part where it says Rule and click Browse…

As you can see from the screenshot above, we searched for “processor time” in order to find the rule which collects the CPU usage on our servers. We typed “Processor Time” and clicked the Search button to see if we could find the counter. It gives us 5 results and we have to find the one we need among them. In this case we want the counter for the Windows 2016 Operating System. Make good use of the columns there to find the one you need. You can see the name of the counter and rule, and the management pack it comes from usually gives a good hint as well. Also keep the Rule Target in mind, because this target must be contained in whatever group or object you chose in the step before this. Windows Server Operating System is a class below the Windows Computer class, and Windows Computers are what is in our group, so this will be alright. For your case and environment, look for an applicable counter which suits you. It might be different from this screenshot depending on the management pack version you have loaded in SCOM.

Click the OK button to confirm the

selection of the rule.

Now that we have all 3 fields entered in the series we can move on. Of course you can add multiple series here and select other groups or counters. For now one series is fine with us and we click OK.

Back in the Report Wizard itself there is a

Green arrow saying Run. Click that to

actually run the report with the selected

options. Give it a minute to run.


So that gave us the report visible to

the right. And yes we know, we

forgot to select Previous Month Last

day there and just left Today in the

To field in the date selector.

What we can see is a section where

it states when the report was run

with which time options.

We see the Rule used for this series

and it says 4 objects in this case. If

you click the + plus symbol next to it

you will see the 4 machines which

happen to be in the group we

created.

After that you see the graph with a

black line and yellow and blue

areas.

Below that is a green graph and

detail table showing you how many

data points this graph is based on

and how many data points there

were over the time period of the

report.

Now this is a graph for 4 machines at the same time, so that average line hovering around 20 to 40 percent CPU is the average across all four machines for the same time frame. We always start with this kind of higher-level report to check if there is anything out of the ordinary.

The Performance Detail report will do that for us because it gives more than just that average line. The blue and yellow areas give us statistical information. From the blue area you can see that at nearly every point in time at least one of the machines was at around 5% CPU or less. We also see an area where at least one of the CPUs was in the 60 to 80% range. It might be interesting to investigate further which servers in this set are under-utilized and which ones have a harder time, so we can act accordingly, perhaps by adjusting the virtual hardware specs of these machines.
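
To show why a single average line can hide both idle and busy machines, here is a small Python sketch with made-up CPU values for four servers. The server names, the numbers and the function names are all hypothetical, and the sketch makes no claim about how the report calculates its coloured areas; it only shows the group average next to the lowest and highest value at each point in time.

```python
from statistics import mean

# Made-up hourly CPU percentages for four machines in one group.
cpu_by_machine = {
    "srv01": [4.0, 5.0, 3.0],
    "srv02": [20.0, 25.0, 30.0],
    "srv03": [35.0, 40.0, 30.0],
    "srv04": [61.0, 70.0, 77.0],
}


def summarise_group(series_by_machine):
    """Per point in time: average, lowest and highest value in the group."""
    rows = []
    for values in zip(*series_by_machine.values()):
        rows.append({"average": mean(values),
                     "min": min(values),
                     "max": max(values)})
    return rows


def per_machine_average(series_by_machine):
    """Average per machine, to find the idle and the busy ones."""
    return {name: mean(values) for name, values in series_by_machine.items()}


summary = summarise_group(cpu_by_machine)
averages = per_machine_average(cpu_by_machine)
for row in summary:
    print(row)
print("least busy:", min(averages, key=averages.get))
print("most busy:", max(averages, key=averages.get))
```

The group average stays between 30 and 35 percent, yet one machine is nearly idle and another runs at 60 to 80 percent. A follow-up report per machine, or one series per machine, would show which is which.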

For a look at how this report can be used for reporting performance across large groups of servers in often-changing datacenter environments, see the blog post we wrote years ago on https://blog.topqore.com, named “SCOM Reports on performance counters for large groups of servers”. The problem in those types of environments is often that auditors or managers want to see performance graphs in a report every month, while within a month’s time several servers have been removed or added, so you need to adjust the report every month to add or remove those servers. Of course this is manual work and gets forgotten as well, so the blog post explains a way to handle such a situation. It is not completely automated either, but it sure saved us some time.

Concluding

We took a report from the Generic Report Library and selected the time period the report should cover. We then selected a group of targets and a performance collection rule containing what we are interested in, and ran the report.

After this you would generally either export the report, save it for later runs, or schedule it to run automatically. All these options are handled in the other chapters of this book. They all start from this point: you selected the report, selected time and objects, ran the report, and it is now open on your screen showing data. If you want to follow any of the guides in those other chapters you can do that from here, or run any other report to the same point, as long as you are looking at a report with data in it.

Exporting Reports

How to export a SCOM report? In the following chapters we will write about how to save or schedule the report.

The assumption here is that you have

opened a report, filled in the required

fields like time-range and objects and

successfully ran the report. You now have

a report in front of you with data.

From the menu near the top of the screen

there is a little button as indicated in the

figure above with the red rectangle. This

is the export button.

From there you can select which format

you want to export it in. There are options

such as Excel, PDF, TIFF, MHTML there.


If you select one, for example Excel, you

will get a popup while the report is being

generated like this:

And after a while the File Save wizard

comes up where you select your file path

to store the generated report.

From there you can open it up. For

example if we open up that report in

Excel we see this:

As you can see it has the fields, the graph

and the table of data points included. In

the SCOM report view those tables are

usually collapsed and while exporting

reports to other formats you might not

see the contents of collapsed tables. In

Excel as you see it is opened by default, so

the report does become longer, but most

focus on the graph near the top.

There is one more thing to note here:

We found years ago that the Export to PDF function can be problematic if you plan on printing the PDF on paper. Many managers and auditors print the reports when they arrive by e-mail, for example. The problem is that this PDF export option uses the Letter paper format. This is the default in the USA, but not in many other countries (such as ours, where A4 is the default). Some printers might give errors or refuse to print if they don’t understand the format.
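
The mismatch between the two paper formats is small but real, as this short Python sketch shows. The dimensions are the standard ones (Letter is 8.5 by 11 inches, A4 is 210 by 297 mm); the function name `overflow_mm` is made up for this example.

```python
# Standard paper sizes in millimetres (width, height), portrait orientation.
PAPER_MM = {
    "Letter": (215.9, 279.4),  # 8.5 x 11 inches
    "A4": (210.0, 297.0),
}


def overflow_mm(page: str, paper: str):
    """How far a page layout sticks out of the paper (0.0 if it fits)."""
    page_w, page_h = PAPER_MM[page]
    paper_w, paper_h = PAPER_MM[paper]
    return max(0.0, page_w - paper_w), max(0.0, page_h - paper_h)


width_over, height_over = overflow_mm("Letter", "A4")
print(f"Letter page on A4 paper: {width_over:.1f} mm too wide, "
      f"{height_over:.1f} mm too tall")
```

A Letter page is almost 6 mm wider than A4 paper, so a printer has to scale, crop or reject the job unless the PDF is rendered for A4 in the first place.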

We created a blog post years ago that tries to solve this by adding 2 options to the drop-down menu for PDF in A4 format (Landscape and Portrait orientation). This is done by adjusting a configuration file belonging to SQL Reporting Services, and the method is shown for SQL 2005, 2008 and higher. Because this is a SQL Reporting Services thing, it will also work for other reports that you export from SSRS in other products.

The blog post, named “SQL Reporting Services Render PDF in A4 paper size instead of Letter”, can be found here: https://blog.topqore.com/sql-reporting-services-render-pdf-in-a4-1/


Saving Reports for later runs

How to save a SCOM report for later runs? It is possible to run the same report again next time without having to reconfigure all settings of the report, and get a similar result without much effort. There are also ways to have somebody else run the report without having to change all these fields.

The assumption here is that you have

opened a report, filled in the required

fields like time-range and objects and

successfully ran the report. You now

have a report in front of you with data.

What we often do is first run a report with a time range between Yesterday and Today to see if there is data. Next we change the date range to, for example, Previous Month First Day to Previous Month Last Day and run the report again. You can also use other relative time frames.

You have a report ready now, and

preferably you have used relative ranges

in the date range. At the top of the report

click the File menu option and in the

drop down you will see a few

possibilities. The ones in the red square

are the interesting ones for this moment.

Let’s review these 3 options

Save to Favorites... This will show up in

the Reporting pane in the Favorite

Reports folder, which can only be seen

by you. We would advise to only do this

for reports which are already authored

and in the other folders or which you

Saved to a management pack.

Publish... This will show up in the

Reporting pane in the Authored Reports

folder and can be run from there later.

Save to management pack... This actually makes an entry in a management pack of your choosing and displays a folder in the Reporting pane with the name of the management pack and a report inside it with the name you specify. These reports can be found by anybody in the SCOM Reporting pane and can be run and scheduled as well.

Let’s use the last option as an example.

Click Save to Management Pack...


We should give it an appropriate name. In this case it shows performance data from last month for the group of servers for Application XYZ, so this is what the name of the report reflects. Next you can give it a description. We also have to select a management pack to save it in; we still had our Application XYZ management pack for that. Next click the Finish button. After a while the wizard says it successfully saved the report in the management pack.

After a few minutes you go to the top of

the report folders list in the left hand

menu, right-click on that yellow folder at

the top of the list called Reporting and

click Refresh. You should now see a

folder appear in the list on the left with

the name of that management pack we

saved the report in and to the right the

report inside it as you can see in the

figure here. If you double click this

report it will just run immediately with

the settings you defined before.

Look also at the extra items at the bottom

of the folders list to the left. There are the

Favorite Reports, Authored Reports and

Scheduled Reports.

In this chapter you learned how to

consistently run a report by saving it in

one of 3 ways so the report can be

accessed and run at a later time, without

needing so much configuration. The next

step might be to schedule a report to run

automatically and deliver it in PDF to an

e-mail account each month for example.


Scheduling reports

In this chapter you will learn how to schedule a report, so you can get a report delivered to you or your manager by e-mail every month, for example.

The assumption here is that you have

opened a report, filled in the required

fields like time-range and objects and

successfully ran the report. You now have

a report in front of you with data. In general you would first save this report or publish it. From there you can run it again and schedule it.

In this case we take that report which we

saved to a management pack from our

previous section about Saving reports for

later runs. We run it and wait for the data

to appear.

In the top menu click File and select

Schedule...

You can also right-click a report from the

reports list and select Schedule from

there.

Next you will end up at a wizard where you need to select the delivery method. This could be a file share or e-mail. Let’s select e-mail and continue filling in some fields (see figure):

In this case we started with a description

so it’s clear what this is.

Next the delivery method = e-mail. This

opens up more options in the wizard.

We specified the manager’s e-mail address the report should be delivered to, and then the reply-to address is filled in. If that field is always the same, recipients will know where the e-mail comes from. Some e-mail bridgeheads also require it for e-mails appearing to come from inside the network.

Choose to include the report and select

from the drop-down list PDF this time.

The subject line below it is auto-generated

and you can change it.

We de-selected the Include Link option

there, because this manager does not have

SCOM console access and will just get this

report in the e-mail as the PDF

attachment.

Click Next. This brings us to the Schedule

tab (see next figure)

Again there are several options here for

scheduling. We selected Monthly and

from there we got the months selected

and we get a choice to select the calendar

days and selected the 1st of every month.


You can see multiple options exist also

for weeks and to send only on weekdays

and so on. We can set a starting date and

an end date if applicable. Click Next.
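
The monthly schedule with a starting and an end date can be pictured with this Python sketch, which lists the dates on which a report set to the 1st of every month would run. The function name `monthly_runs` and the example dates are assumptions; the actual scheduling is done by SQL Reporting Services, not by code like this.

```python
from datetime import date


def monthly_runs(start: date, end: date, day: int = 1):
    """List the run dates on a given calendar day between start and end."""
    if not 1 <= day <= 28:
        # Days 29 to 31 do not exist in every month; kept out of this sketch.
        raise ValueError("this sketch only supports days 1 to 28")
    runs = []
    year, month = start.year, start.month
    while date(year, month, day) <= end:
        run = date(year, month, day)
        if run >= start:
            runs.append(run)
        # Move on to the next month, rolling over the year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return runs


# Schedule created mid-November, ending at the end of February.
runs = monthly_runs(date(2021, 11, 15), date(2022, 2, 28))
for run in runs:
    print(run)
```

Combined with relative dates such as Previous Month First Day and Previous Month Last Day, each of these runs reports on the month that has just ended.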

You will see the Parameters tab, which is

filled by the last run. If the content is as

desired you can just move on from here.

Otherwise adjust as needed. It is just

easier to do it first in the report and next

schedule it so you can ignore this screen.

Click Finish.

After a while the report will be scheduled.

In the SCOM Reporting pane you can

scroll down the left hand menu to

Scheduled Reports and find your

scheduled report in the middle pane.

From there you can either run it by click-

ing Open, or Edit the schedule or Cancel

the schedule.

In this chapter we showed you how to get from a report you ran to a scheduled report, which can run automatically on a time schedule and deliver a report by file share or e-mail in one of the supported rendering formats.

Keep in mind that for successful sending of e-mail, the SQL Reporting Services configuration must be edited to specify which e-mail server it should use and any authentication methods. An e-mail bridgehead may also need to have this server added to its allow list to accept e-mail coming from this server. For file share delivery you might also need to configure a file share to write to and the authentication for it.


Why is my report empty?

In this chapter we will discuss some

common reasons why you might find your

report to be empty. It is a very common

thing when you start out playing with

reports.

There are a few common reasons why reports you run through the SCOM console Reporting pane can be empty. You opened a report and specified a number of things like time range, objects and counters. You run the report and it runs successfully, but it is empty.

These are some of the most common

reasons why the SCOM report is empty:

1. The data is not collected.

The performance collection rule which is meant to collect this counter is turned off. You can check this by going to the Monitoring pane in the SCOM console and finding the Windows Computers state view. It’s near the top of the list. Right-click any computer you feel should have this counter and select Open Performance View. Have a look in the list at the bottom to see if your counter is there, and if it is, click the checkbox next to it to confirm there is data in the graph. Of course for Linux or network related counters you check in those related views. For Windows Computers and most things running on those machines you can select the Windows Computer class, because it is ultimately a parent of most other classes where your rule might be targeted.

Two things to keep in mind here are the

rule name with counter name and the

target. It is easier to find the counters

containing data in the graph by using this

method first, before going to the reporting

pane and trying to figure it out by guessing.

2. There might be a problem with the Data Warehouse.

Sometimes there can be problems with the data flow to the Data Warehouse database or with the processing inside it. In this case you will have to troubleshoot what is wrong and check for alerts in SCOM and event log entries on the SCOM management servers and the SQL server hosting the SCOM databases. If something is wrong you will see notifications about it.


3. The wrong target is selected.

This might be the most common reason for an empty report. SCOM is all about targeting the right class or class-object. All the rules and monitors are targeted at a class. If you try to see Database Free Space from a Website, you are not likely to get an answer in the report. The same goes if you want to know Windows logical disk free space from a Linux server. They are different classes and the rules are targeted elsewhere. If you target correctly there is a higher chance of you seeing the data. For things running on Windows you either have to select the exact class that the rule collecting the performance counter is targeted at, or you select a higher parent class (for example Windows Computer). If you go for a parent class to be sure you include your counter, you would add the machines in the report not as an Object, but as a Group.

It sounds strange, but see the Windows Computer as a bag for a group of things sitting on the machine (operating system, IIS, file shares, etc.). If you add it as an object you might only see counters for rules targeted specifically at the Windows Computer class, but if you add the same thing as a group, it will show you all child classes and the rules targeted at those as well. Check back to number one in the list above.

4. Try and select a different time frame.

For example try and run the report from Yesterday to Today, or from Last Week Monday to Last Week Friday. You can do the same with step one, as long as you have enough data in your OpsDB database (usually seven days).

For reporting it sometimes happens that a different time frame suddenly gives data. Sometimes this is because some data has not been aggregated yet. Or you might find out your Data Warehouse has had a problem for two days, because it does show older data, but nothing for today.

In the SCOM trainings we provide to our

customers we spend time on the Class

Model and Health Model of SCOM, and

therefore also on targeting, because it is

the most important thing in SCOM to

understand. Feel free to visit our website

to see our SCOM training for both SCOM

Administrators and SCOM Operators.

Reporting from a State View

In general, SCOM reports can be found

in the SCOM Reporting pane. However,

there are other ways to create a report.

A very easy one is to run a report from a

SCOM state view.

If you go to any state view in SCOM and

check out the right-hand side Tasks pane,

you will likely see a number of

interesting reports, which would be

targeted at what you are looking at in the

state view. In part this list consists of

targeted reports, based on the class or

class-object you have selected in the state

view.

If you go to another state view (for

example Windows Computer, or Website

or SQL Database Engine), you might see a

different set of reports listed in the Tasks

pane.

Some of the reports there are very

generic (Health, Performance History),

and others are more specific to a certain

type of data.

If you use this method you will most

likely get more useful data, because the

report would be targeted at and be

applicable to the selected object. From

there you can build out the report the

way you want and run, export, publish or

schedule it.

Useful reports

Performance Detail

In a previous chapter we showed you an

example of the Performance Detail

report, where we went through selecting

a time range, and made a series (graph)

for a group of servers called Application

XYZ Computers, and we added the Processor

Time % counter as the one relevant for the

report.

Let us show you the picture again.

There are a few reasons why we think

this report is very useful.

If you know exactly which counters from

which servers you need, so a very

specific set, you can create performance

reports per server or per counter per

object.

However, sometimes it is necessary to create

graphs of several objects (servers, websites,

databases, disks) at the same time, such as

for an audit or a capacity review across all

servers, or, as in the example we provided,

the performance of a group of servers for

Application XYZ.

What makes the Performance Detail

report useful in this is the statistical

information around it. It is not simply a

line (the black line in the graph in the

figure above), but it also contains

additional information. These are the

minimum and maximum measured values

and the Standard Deviation. When we

look at a report of these four machines in

one graph in the example above, look at

the average line first. 30% processor

usage is good enough. However, in the

graph you see immediately if the Blue and

Yellow areas are very close to the average

line or if they are further out.

From the blue area you can see that after

the first week it becomes bigger on

the higher CPU side. The blue area

shows minimum values below 5% and

maximum values up to 70-80%. That looks

like at least one machine might not have

much to do and at least one machine may

have more to do in terms of CPU. The blue

area shows the lowest and highest measured

values across all machines for each time

slot (an hourly data aggregation, based on

performance measurements about five

minutes apart in this case). The yellow

area shows you where the majority of

the data points for that time period are

located. If it is close to the average line, it

means the blue entries far outside the

line might be one-offs.
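
To illustrate what such an hourly aggregation contains, here is a minimal Python sketch that groups raw samples per hour and computes the minimum, maximum, average and standard deviation. The sample values are made up, and the use of the population standard deviation is our assumption for the example, not a statement about how the Data Warehouse calculates it.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev

def aggregate_hourly(samples):
    """Group (timestamp, value) samples per hour and compute the statistics per hour."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {
        hour: {
            "min": min(values),
            "max": max(values),
            "avg": mean(values),
            "stddev": pstdev(values),  # assumption: population standard deviation
        }
        for hour, values in sorted(buckets.items())
    }

# Made-up CPU samples, five minutes apart, from two machines within the same hour.
samples = [
    (datetime(2021, 10, 1, 9, 0), 10.0),
    (datetime(2021, 10, 1, 9, 5), 10.0),
    (datetime(2021, 10, 1, 9, 0), 30.0),
    (datetime(2021, 10, 1, 9, 5), 70.0),
]
hourly = aggregate_hourly(samples)
for hour, stats in hourly.items():
    print(hour, stats)
```

In this example the average is 30, while the minimum of 10 and the maximum of 70 form the wide outer band. A single high sample is enough to stretch that band, which is why the narrower band around the average tells you whether such values are one-offs.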

Looking at this graph as somebody

interested in Application XYZ and its

performance and capacity planning: in the

first week of the graph, CPU is between

0 and 40% with the average at 20%, so you

would not have to check further. Looking at

the four weeks after that, you see the

average creeping up a bit, but the highest

values are consistently higher.

You want to know which of the machines

is causing this to happen. That means you

can quickly see from this generic graph

bundling several machines together if you

are interested to know more, or if this is

enough info for you.

If you want to know more, zoom into each

of the four machines in the group for

Application XYZ. There is a simple but

somewhat hidden trick for this.

Look right above the graph. There is a

word “Actions” with a + plus sign next to

it. Click the + sign.

Now that you have opened it up with the Actions +

sign, you get a sub-menu. Click the

Performance details for every object

option and it renders the same graph

for the same time period, but for each

machine separately.

Now there is a child report with four

graphs and you can see what is going on.

We saw this:

1. A machine sitting around 10% CPU

being quite constant, so this one may

account for the very low CPU entries. Is

this machine scaled too high in resources?

Might save some money here.

2. A machine sitting at 10% CPU or so, but

with regular (near exact) spikes to 80%.

Looks like once per day. Could be a

nightly job running there, maybe a

backup or antivirus scan? Could be

interesting to look at.

3. Another machine sitting between 0 and

10% CPU. Is this machine scaled too high

in resources? Might save some money

here.

4. The last machine looks like the reason

the graph changed from the first week.

We sat at 40% CPU and went to 60% CPU

average and later went down a bit again

near the end of the time period. This one

too has spikes once per day going up to

90% or more. This machine has the

highest load overall. Could be a candidate

to throw more resources at. Also a

candidate to investigate the daily spikes.

Just by looking at the main report there

was a reason to zoom in. We zoomed in

and found a few machines with different

behavior. That is what we would expect with

Application XYZ, which might be built with a

front-end and back-end structure. We

found grounds for conclusions

relating to capacity management and

reasons to ask a sysadmin to investigate

what is happening on the remaining

machines.

Here is a screenshot of that machine 4 in the list

above.

We want to draw your attention also to

the red number 1 in the figure. While

quickly scrolling through the graphs in

this ‘child’ report you might get the wrong

idea. This is because the scale on the

Y-axis of the graph can change. In this

case because it had values up to 100% it

adjusted the scale to run from 0 to 120.

But the other machine which had nothing

to do, has its scale running from 0 to 40

because there is nothing more to show.

But if that machine is running at 20% CPU

in a graph that scales from 0 to 40% it

looks like it is sitting at half the CPU.

Always look at the Y-axis in performance

reports, because they will auto-scale to

what is needed to render the picture. And

in this case it is not always a pure 0 to

100% as you would expect.
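
The effect of the auto-scaled Y-axis is simple arithmetic. The following Python sketch (our own illustration, with a made-up helper name) shows how much of the plot height the same value occupies on two different axes.

```python
def apparent_fill(value, axis_max):
    """Fraction of the plot height a value occupies on an axis running from 0 to axis_max."""
    return value / axis_max

# The same 20% CPU value on an auto-scaled 0-40 axis and on a fixed 0-100 axis.
print(apparent_fill(20, 40))   # 0.5: the line sits halfway up the graph
print(apparent_fill(20, 100))  # 0.2: the actual share of total CPU
```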

When you are done zooming in for this

‘child’ report you can click the blue back

button in the menu above it indicated

with the red 2 we put in the figure.

The report “Performance Detail” is a very

useful one for reporting across a larger

number of instances and determining if it

makes sense for you to zoom in and find

an object which clearly stands out from

the group and potentially needs your

attention. We can tell you it is very

difficult otherwise to look through many

graphs (each with different Y axis!) and

look for an odd one out which might not

be there. Now that you know what to look for

(or not), you could zoom into the ‘child’

graphs and find the one you need to take

action on.

Useful reports

Data Volume By Management Pack

We want to discuss a very useful report

for SCOM Admins especially. For other

stakeholders this will be less interesting.

It is purely SCOM-specific, but gives a LOT

of information. The report we are talking

about is the “Data Volume By Management

Pack”. We highly recommend each

SCOM Admin to run this report regularly

(Weekly for example). Let us have a look

at this report and some of the things it

shows us.

In the figure shown you can find the

location of the report in the folder System

Center Core Monitoring Reports. There

are other very useful reports in this

folder, but we will focus on the Data

Volume by Management Pack in this case.

Double click the report and change the

time frame. For instance Yesterday to

Today. In our case we took today minus

seven days until today to get a week's data

(small demo environment).

Next in the middle of the report wizard

you see Data Types and Show Top. We

usually start with not filtering the Data

Types for a first look at the report. After

that we usually make choices in that list

to focus on Performance or Events or the

other choices. The reasoning here is that

the number of performance

counter entries is usually so much higher than all

other data that the top-x in the list only

reflects the amount of data collected in

performance counters. If you de-select

that data type you will see another top-x

listing to work on. We will see this later.

The Show Top field we usually set to 40 or

50 entries, so we get a feeling of what is

going on. Let’s run it.

This is a screenshot from one of our demo/test

environments. The first question you

should always ask yourself here is: why?

The different columns each give other

reasons to look at them. In short:

Performance and Events have to do with

the amount of data in the database,

causing large databases and a lot of data

flow. You can ask yourself if you need all

this information, and if you actually view and

report on these counters and events. Some

performance collection rules might be turned off

if they are not needed, or you can change

collection intervals.

Discovery Data, Alert Count and State

Changes. This has more to do with what

we call Config Churn. Even though the

numbers in these columns are lower than

Performance Counters, they are very

important regarding the performance of

SCOM, the management servers and the

SCOM console or other dashboards.

Of course alerts will be very visible to

SCOM Operators, and also in incident

management if they are linked to it. Also a few

core management packs are in this report

for this reason.

This report is very valuable for tuning

management packs, but also for finding

issues or potential issues affecting either

capacity, management server performance,

or visible and less visible issues or

slowdowns of SCOM and impact on the

users of SCOM.

As you can see in the figure above you can

click on the numbers in the table. Let us click

in the performance column on the

Windows Operating System pack entry

(second in the list of packs in the figure).

This shows a list of, in this case, rules, with the

names of the rules collecting data. It shows

the percentage of data volume within this

pack for each rule and number of data

points.

From here you can think about whether you

need some of these counters to be

collected, or if you feel that the amount of

data is too much you could change the

collection interval on some counters.

Changing the collection interval from 5

minutes to 10 minutes effectively halves

the number of data points. Also, you can now

find the correct rule names, which

are not always obvious from the naming

convention.
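
The effect of the collection interval on data volume can be worked out with a few lines of Python. This is a back-of-the-envelope sketch; the function name and the figure of 1000 instances are made up for the illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def data_points_per_day(interval_seconds, instances=1):
    """Number of samples one rule collects per day for the given number of instances."""
    return (SECONDS_PER_DAY // interval_seconds) * instances

print(data_points_per_day(300))        # 288 samples per day at 5 minutes
print(data_points_per_day(600))        # 144 samples per day at 10 minutes
print(data_points_per_day(300, 1000))  # 288000 per day across 1000 instances
```

One rule looks harmless on a single object, but multiplied by every disk, database or website it targets, the interval quickly decides how large the column in this report becomes.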

Going back to the main report and

zooming into the events column you can

also ask yourself if you are actively using

this data. If you are never using the event

data to look back (using SCOM) or

reporting on it, maybe you can turn off

the collection of those. Especially the

Operating System packs can collect a lot of

events, sometimes in the millions per day

if there is something wrong on some of the

machines on the network. You can also

keep those turned on, but we suggest in

that case to look at those on a daily basis.

You can zoom into that from the main

report and schedule the child report.

It often indicates machines having

problems with services or drivers

crashing, sometimes every second,

thus rendering the monitored server

useless and sometimes impossible to even

log in to. You will want to fix that both

to repair the application server

and to reduce the amount of collected data

it results in for the SCOM databases. Act

on it or turn it off.

If we turn off the Data Types for

Performance and Events, we see this in

the same report (see figure below):

These 3 columns have to do with config

churn. Normally there are not that many

items in Discovery Data. This is because

you are not adding many servers and

websites each day, thus not many new

objects are discovered. If you see a high

number in there, it could be a management

pack with a wrong configuration. In

the past we have seen discoveries with a

counter used as a property. Such a value

changes every time the object goes through

re-discovery.
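
Why a counter used as a property causes churn can be shown with a small comparison of two discovery runs. This is a simplified Python sketch; the property names and values are hypothetical.

```python
def changed_properties(previous, current):
    """Return the names of properties whose value differs between two discovery runs."""
    return sorted(key for key in current if previous.get(key) != current[key])

# Two consecutive discovery runs of the same (hypothetical) database object.
run_1 = {"Name": "DB01", "Edition": "Standard", "FreeSpaceMB": 51200}
run_2 = {"Name": "DB01", "Edition": "Standard", "FreeSpaceMB": 50944}

print(changed_properties(run_1, run_2))  # ['FreeSpaceMB']
```

Name and Edition are stable, so on their own they would produce no new discovery data. The free space value differs on nearly every run, so every re-discovery counts as a change to the object.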

Alert Count is something simple and you

know this looking at the SCOM console.

However, there might also be alerts you

do not see. For example if they happen

during the night and they close again

before you get back into the office, or if

these alerts are not forwarded to ticketing

or e-mail, or if they open and close within

a minute. It might still be useful to have a

look if these numbers are what you

expect. And of course by zooming in you find

what the most common alerts are and so on.
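
As an illustration of the alerts you never get to see, the following Python sketch filters a made-up alert history for alerts that were raised and resolved within a minute. The tuple layout and the alert names are assumptions for the example, not a SCOM data format.

```python
from datetime import datetime, timedelta

def short_lived(alerts, limit=timedelta(minutes=1)):
    """Return the names of alerts that were raised and resolved within the limit."""
    return [name for name, raised, resolved in alerts if resolved - raised <= limit]

# Made-up alert history: (name, raised, resolved).
night = datetime(2021, 9, 1, 2, 0)
alerts = [
    ("Service stopped", night, night + timedelta(seconds=40)),
    ("Disk space low", night, night + timedelta(hours=6)),
]
print(short_lived(alerts))  # ['Service stopped']
```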

The State changes have to do with

Monitors in SCOM because they hold

state. A state change in for example an

object in SQL (first line in the figure

above) will result in a state change on that

monitor. However it can and will also

result in changes of state moving up the

health tree, from database file to database

to DB engine to server and several rollups

in between. Those rollups and so on result

in the more generic packs showing up in

this list for state changes. You do not have

to zoom into those much because they will

simply reflect rollups and such. But you

can imagine that SCOM has to calculate

through all those state changes and thus

many more rollups and parent class

objects and perhaps dashboards. This

causes a lot of work for the SCOM

infrastructure.
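
How one state change turns into several can be sketched with a simplified health tree in Python. This is a toy model under our own assumptions: it uses a worst-state rollup on every level, whereas real rollup monitors can be configured differently, and the object names are only examples.

```python
SEVERITY = {"Healthy": 0, "Warning": 1, "Critical": 2}

class Node:
    """One object in a simplified health tree (hypothetical model, worst-state rollup)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.own_state = "Healthy"
        self.state = "Healthy"
        if parent is not None:
            parent.children.append(self)

    def rolled_up_state(self):
        states = [self.own_state] + [child.state for child in self.children]
        return max(states, key=SEVERITY.get)

def set_state(node, new_state):
    """Change a node's own state and return every object whose state changed."""
    node.own_state = new_state
    changed = []
    while node is not None:
        rolled = node.rolled_up_state()
        if rolled == node.state:
            break  # nothing changes further up the tree
        node.state = rolled
        changed.append(node.name)
        node = node.parent
    return changed

server = Node("Server")
engine = Node("DB Engine", server)
database = Node("Database", engine)
db_file = Node("Database File", database)

print(set_state(db_file, "Critical"))  # four objects change state
print(set_state(db_file, "Healthy"))   # and four more when it recovers
```

A monitor that flips between Critical and Healthy therefore costs eight state changes per flip in this small tree, which is the kind of multiplication the generic packs in the report reflect.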

Check which are the state changes

causing the chain of state changes (SQL

Pack, OS pack, IIS pack, Defender pack in

our example case). If you solve those, the

resulting reduction in config churn makes a big

difference. And if it causes alerts as well,

or state changes on an important business

dashboard, fixing it is even more visible.

As you can see, there are many reasons

why this report is important to a SCOM

admin, and as the SCOM admin works on

it, it will also give better results for

the whole business, by finding problems

and by tuning SCOM and thus its

performance.

Useful reports

OS performance reports

In this chapter you will have a look at the

Operating System reports.

There are a few to look at. First of all there

are folders that each relate to a version

of Windows, but the reports in them are

basically the same (see first figure).

Some of these reports can also be

accessed from the state views in the

monitoring pane as shown in an earlier

chapter.

As you can see there are a number of

reports having to do with performance of

CPU, Memory, Disk and such.

Have a look through those and determine

what could be useful for you. There are

report detail descriptions at the bottom

of the screen.

Another example is this Operating

System Configuration report, which gives

a look at the discovered inventory and

properties of the class objects found for

the class Windows Server Operating

System.

There is also another folder in this

reports list “Windows Server Operating

System Reports” (see figure below). This

has only two reports in it, but you will

find them useful. Let’s have a look at both

of them.

The first one is Performance By System:

This is a cut-out of this report targeted at

one server with a selection of seven days

of data. As you can see the report has

space for 7 days of data in it. If you select

only one day of data you cannot see those

bar charts next to each other, so it would

make sense to select a few days.

You can see there are multiple graphs in

the report, all pre-prepared for you.

You cannot change much in the layout,

but we find it very readable; it gives a

trend across the week and shows some

numbers at the sides: processor, memory,

disk, network. A very useful report to

show some base metrics for a server. You

can simply schedule this one.

Next we want to show you the “Performance By Utilization” report:

We left the parameters selection screen

open on purpose here, so you can see it.

We selected a week of data and went for the

Windows Server Computer Group to get

all of our servers (not that many in the

demo environment obviously). We selected

the Utilization – Most option to show the

highest values of each counter, and set the

number of systems we wanted to see in

the top-x tables.

Now, if you run the report, below it you

will see a number of tables, each for a

different counter relating to

Processor, Memory, Disk or

Network. As you can see, it simply

shows the top-x servers with the highest

utilization of each counter separately.

It is very useful to be able to determine

which machines have high

utilization of these counters, because

you can then do something about it:

by finding out if a process is doing more

than it needs to do, by finding resource

hogs, or by finding out that capacity

management needs to

add resources to some of the busy

machines.

Likewise you can also select the

Utilization – Least option in the

parameters to find the machines using

the least of these resources. That might

mean they have nothing to do, or they

might have too many system resources

assigned to them - you could claim some

resources back in capacity management.
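
The idea behind the Utilization – Most and Utilization – Least options can be sketched as a simple ranking. The Python below is only an illustration of a top-x selection on made-up data; ranking by the average value is our assumption, and the server names are hypothetical.

```python
from statistics import mean

def top_systems(samples_by_system, count, most=True):
    """Rank systems by average counter value; most=False gives the least utilized."""
    averages = {name: mean(values) for name, values in samples_by_system.items()}
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=most)
    return ranked[:count]

# Made-up CPU samples per server for one week.
cpu = {
    "SRV01": [12, 15, 9],
    "SRV02": [75, 80, 85],
    "SRV03": [40, 35, 45],
}
print(top_systems(cpu, 2))              # the two busiest servers
print(top_systems(cpu, 1, most=False))  # the least busy server
```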

In all, the reports coming with the

Windows Operating System

management packs, but also with the

Linux Operating System management

packs are very useful to have a look at

and use.

Useful reports

SCOM Health Check Reports

In this section we will have a look at the

“SCOM Health Check Reports”.

This is a set of reports created in the

SCOM community, mainly by Pete Zerger and

Oskar Landman.

What you do is download the package.*

You install the SCOM management packs.

Next you do not touch anything until you

read the Management Pack guide!

Why? Because nobody does. However, in

this case it is needed, because you need to

do something to make it work.

SCOM Reporting uses the Data Warehouse

database for everything. However

some reports in the SCOM Health Check

Reports have to be able to read from the

Operational database of SCOM as well.

Therefore you need to create an

additional Data Source in SQL Reporting

Services for that. It is pretty simple to do,

but needed to be able to run at least half

the reports in there.

You will see this list of reports now. As

you can see it has reports having to do

with Alerts, Events, Performance,

Monitors, Infrastructure, Agents, etc.

These reports are especially useful for

SCOM Admins. Have a look at how this

can help you optimize your SCOM

environment!

* The current version can still be found on TechNet Gallery, but it will be moved soon.

We will update the link to it when that happens.

https://gallery.technet.microsoft.com/SCOM-Health-Check-Reports-c32e8f93

Useful reports

Availability and SLA

In this last chapter about SCOM

reporting you will learn about

Availability and/or SLA/SLO reporting.

Basically stakeholders for the monitoring

often want to know if a server or application

was UP during the last month/week/day.

In SCOM this is defined by two

items, the agent itself being available

(Agent heartbeats through the watcher),

and the Health state of whatever it is you

are looking at. We must make a choice

here on what we call down.

Most often a red state is considered

critical and a down state. This is not

always the case in the real world of

course. But we have to make a choice on

what we define as down.

A server itself has an agent and we could

say the server is down when the agent is

down. Of course the server itself could be

running fine and the SCOM agent has a

problem. But we need to make a choice

on what we can go on. It is why we have

alerts and dashboards telling us when a

server is unavailable, or when an object

monitored by SCOM goes into a red state,

so you can react to it.

Availability Reporting

In the Microsoft Generic Reports

Library you will find the Availability

Report template (see figure below).

If we open up the report we can make the

choices of time range again. In this case

we just selected the Previous Month

entries.

Next we define what objects to report on.

We left that popup screen open in the

picture above. And we used the Add

Group method, because it covers also

underlying objects.

We took the SCOM server itself as an

example here, which is not what we are

normally interested in (if SCOM is down,

the data will not get into the report very

well). But as an example we selected the

Health Service Watcher class object of

this server; imagine this is any

normal server. This is basically the thing

which tells you if the SCOM infra has

gotten heartbeats from that agent. For

SCOM this is an indication if the server is

up or not.

You can also select other objects living on

these servers. For example a database or

a website. The whole server being

available is not that interesting because

you are interested in what the machine is

actually doing!

To the right of the report wizard in the

back there is also a list of health states you can

consider to be Down Time. Critical is

selected by default, and all the other choices can

be added, even Warning state. Be careful

again, because you may report things as

down, while they were still effectively

running. Always understand your

choices, because you WILL have to

explain them to the stakeholders looking

at these reports!
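
How much the choice of down states changes the outcome can be shown with a small calculation. The Python sketch below is our own illustration on a made-up health history; it is not the calculation the Availability report performs internally.

```python
from datetime import datetime, timedelta

def availability_percent(intervals, down_states=("Critical",)):
    """Uptime percentage from (start, end, state) intervals; down_states count as downtime."""
    total = timedelta()
    down = timedelta()
    for start, end, state in intervals:
        duration = end - start
        total += duration
        if state in down_states:
            down += duration
    return 100.0 * ((total - down) / total)

day = datetime(2021, 9, 1)
# Made-up health history for one object over 24 hours.
history = [
    (day, day + timedelta(hours=18), "Healthy"),
    (day + timedelta(hours=18), day + timedelta(hours=21), "Warning"),
    (day + timedelta(hours=21), day + timedelta(hours=24), "Critical"),
]

print(availability_percent(history))                           # 87.5
print(availability_percent(history, ("Critical", "Warning")))  # 75.0
```

The same day reads as 87.5% or 75% available depending on whether Warning counts as down, which is exactly the choice you will have to explain.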

SLA/SLO Reporting

Now there are cases where you have

implemented more monitoring, such as a

Distributed Application. This could be an

application with a front-end and a back-end,

maybe also made highly available

across several servers. In some cases

these can be as simple as adding a single

Website object, but can also be much

more complex. You can run similar

reports on these, but you can also run

SLA / SLO reports! This is where you

define a threshold for the amount of

availability for a Distributed Application

or a Synthetic Check like a website.

Before you can run an SLO report you

must define an SLO first. By default there

are none defined in any default management

pack.

Go to the Authoring pane of the SCOM

console and go to Management Pack

Objects – Service Level Tracking.

Here you can create a new Service Level

Tracking SLO. You can select a

Distributed App or a Website check for

example as a target, and you need to

specify a percentage where you feel the

SLO will be broken. This could be at 90%

or 95% or whatever your needs are. Save

this in a management pack and wait for it

to gather some data.
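
What such a percentage means in practice can be worked out quickly. The Python sketch below uses made-up helper names and numbers; it only illustrates the arithmetic behind a target and is not part of SCOM.

```python
def slo_met(measured_percent, target_percent):
    """True when the measured availability is at or above the SLO target."""
    return measured_percent >= target_percent

def allowed_downtime_hours(target_percent, period_hours):
    """Downtime budget in hours that still meets the target over the period."""
    return period_hours * (100.0 - target_percent) / 100.0

print(slo_met(96.2, 95.0))                # True
print(allowed_downtime_hours(95.0, 720))  # 36.0 hours in a 30-day month
```

A target of 95% sounds strict, but over a 30-day month it still allows a day and a half of downtime, so it is worth doing this sum before agreeing on a number with the stakeholders.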

Once you have defined the SLO target,

you can create either a dashboard for

displaying the SLA values or you can use

reporting to show you the SLA values.

In the SCOM Reporting pane you can find

the Microsoft Service Level Report

Library and in it the report Service Level

Tracking Summary Report (see figure

below).

For this report, you can specify the time

range (Previous Month or Previous

Quarter is the standard range for this

type of report, but for testing purposes, it

is recommended to use Yesterday to

Today) and the SLO target you are

looking for. The last thing to define is

which time periods you want to report on

in comparison to the initially specified

report duration. You could select

Previous Week, Previous Month, or

Previous Quarter and show them side by

side for each of the SLO targets you

specified. If you run this report it will

show you a few columns with the SLO

numbers for each selected time period

and for each object you selected to run

the report against.

Additional SLA Reporting

We discussed this a few years ago, but if

you happen to run Martello Live Maps

(used to be Savision Live Maps), you will

have Service definitions. A Service is the

same as a distributed application in this

case. If you create a Service from within

Live Maps this will automatically create

Service Level targets for each of the

Service sub structures (User,

Application, Infrastructure) and assign a

default threshold to it. It automatically

starts monitoring and displaying it in the

dashboarding and you can change the

threshold settings etc. from within. You

can also configure when you want to be

alerted of an SLA breach. If you happen

to have this product this could make

defining the service levels easier.

However even if you do not have this

product, there is still a very nice report

they have created.

The SCOM SLA Reporting Management

Pack is a free pack which can be downloaded at:

https://martellotech.com/downloads/free-scom-management-packs/

This management pack can run against

any SLA/SLO target you have. So if it is

SCOM or Live Maps related you can

target it and run the report. It will also

give you a drill-down possibility pointing

to the objects within that SLO with the

most problems, so you can find the cause

more easily.

Feel free to go get it and add it to your

arsenal of SCOM Reports.

Additional Reporting Options

Using the SCOM Reporting feature is the

most logical way to report on all kinds of

data being gathered and calculated by

SCOM. However, since the beginning

other methods have been used as well.

SCOM Console

First of all there is the SCOM Console.

You can create a performance view in

there, showing a graph for the last 7 days

for a certain counter or a few of them. It is

possible to export that as a picture.

Dashboarding Solutions

Another method of pulling data from

SCOM is by using the various SCOM

related dashboarding tools. For example

Martello Live Maps, SquaredUp and

OpsLogix. Often it is easy in these tools to

create the view you want to see using the

SCOM Data and exporting a picture of it

to use in a report. Some even have the

possibility to export the underlying data

points to a CSV for example.

Azure Monitor

It is also possible to use a hybrid solution

and have Azure Monitor pull some

data to the cloud. For reporting often

Performance Data is used, which gets

collected from the agent and sent to the

cloud to your private workspace. There

you can report on the data without

aggregation for any period of time

(depending on your retention time in

Azure Monitor). From here you can show

the data in graphs or use queries to show

sets of data to further analyze.

Power BI

It is possible to connect Power BI

Desktop to your SCOM Data Warehouse

and create views from there. In the

beginning the connection needs to be

created and the right tables selected.

After that you can start creating views in

Power BI. An example of this was shown

by Cameron Fuller in a blog post that can

be found here:

https://www.catapultsystems.com/blogs/using-power-bi-for-disk-space-dashboards-and-reports-in-operations-manager/

Lately there have also been a few Power

BI dashboards to show the health and

performance of a SCOM Infrastructure

as an example of what you can do with it.

An example of this is one from Silect,

available at:

https://www.silect.com/dashboards-for-scom/

Using a Power BI Enterprise Gateway

and a Power BI Pro account (paid) it is

also possible to connect to your SCOM

data from the cloud based version of

Power BI Sites and show data there. And

if the connection is in the cloud, you

could access this data through mobile

apps as well. Tao Yang has published an

article about Power BI Sites with SCOM,

available at:

https://blog.tyang.org/2015/12/14/extending-your-opsmgr-power-bi-dashboards-to-power-bi-sites/

Epilogue

We hope that after reading this booklet

about SCOM Reporting you will feel more

comfortable with the feature and will try

out some of the suggestions included.

This way you can use the wealth of data

in the Data Warehouse to your advantage

and to inform the other stakeholders of

what is going on in the company

environment, such as performance,

capacity, alerts (incidents), health and

uptime of applications and machines.

These reports can be saved and run again

or scheduled to run regularly, so you do

not have to do the same work every week

to get these reports to the people who

need them.

Also, for SCOM Admins there are a

number of reports which are very useful

for keeping the SCOM Infrastructure

clean and healthy, so we highly suggest

having a look at those.

There are more possibilities using the

SCOM Reporting feature, such as using a

report builder or Visual Studio to create

custom reports with adjusted

visualizations and personalization, but

we did not go into those advanced topics

in this booklet.

During our SCOM Administrator

trainings, Migration handovers, Health

Checks and Maintenance as a Service we

look at how the reporting feature is used

and advise on how to make better use of

it. We understand that in a small booklet

we cannot discuss all cases and details

about reporting and that you might want

further assistance.

If you visit topqore.com you will see

the services we provide and feel

free to contact us through

[email protected]

We will be happy to provide you with the

consultancy or products to improve your

SCOM environment in whatever way is

needed.

The TopQore Team

www.TopQore.com

[email protected]

Together, we do more!