HOW ANALYTICS CAN HELP BACKUP ADMINISTRATORS
Balaji Panchanathan
EMC - Avamar - Engineer
2014 EMC Proven Professional Knowledge Sharing 2
Table of Contents
Introduction
Data collection
  Data Protection Advisor
  Backup and Recovery Manager
  Avamar
    Enterprise Manager
    MCGUI
Data Analysis
  Range
  Variation
  Coefficient of Variation
  Time series Analysis
  Regression Analysis
    Storage Usage trend
    Input data
    Regression output
    CPU usage Disk I/O
Visualization
Conclusion
Reference
Disclaimer: The views, processes, or methodologies published in this article are those of the
author. They do not necessarily reflect EMC Corporation’s views, processes, or methodologies.
Introduction
This article will focus on how analytics can solve problems and make a backup
administrator’s job easier and more fruitful.
Backup administrators typically face a couple of problems.
1. Backup failures
2. Ever increasing datasets and need for increasing backup window
Their jobs will become more fruitful if they:
1. Improve backup efficiency, robustness
2. Improve the reliability
3. Periodically report to the management which types of systems/database are backed
up. (This will help management determine percentage usage of each department
and whether the appropriate things are backed up)
A backup administrator’s life will be made easier by doing an analytics project, which usually
has three stages.
1. Data Collection
2. Data Analysis
3. Data Reporting
Data collection
Customers using EMC Avamar® backup products can collect backup data from:
1. Data Protection Advisor (DPA)
2. Backup and Recovery Manager (BRM)
3. Enterprise Manager (EM) in Avamar
4. MCGUI – Avamar Administrator GUI
Data Protection Advisor
Data Protection Advisor monitors, analyzes, and reports on the backup environment. It can
manage multiple backup products and list the details in the data set.
DPA features which will help in improving the backup administrator’s life include:
1. Backup reports – How many clients are backed up, backup failures across clients, etc.
2. Capacity planning – capacity reports
3. Utilization – CPU/Memory. This report will help find bottlenecks in case of problems. If
there are backup failures or the backup speed is low during a particular time, the
CPU/memory utilization at that time can be checked using Data Protection Advisor.
Backup and Recovery Manager
Backup and Recovery Manager (BRM) can be considered a miniature version of Data
Protection Advisor. DPA can monitor backup environments, storage devices, and also backups
from different vendors, whereas BRM can monitor EMC backup devices: Avamar®, NetWorker®,
and Data Domain®.
The BRM tool can be used to forecast capacity usage. The forecast is found in BRM under the
Reports tab, in the System Summary report.
The reports section has options to run the backup summary report, on which analysis can be
done (explained in the data analysis section of this article). The backup report has details about
the time zone, duration, dataset, and domain, which can be used to perform further analysis
related to max, range, variance, etc., as explained in the data analysis section of the article.
Below is a snapshot of the backup summary report exported in Excel.
systemType | client | system | group | status | startTime | endTime | duration | dataChanged | plugin | dataset | totalSize | dedupRatio
Avamar | vcenteribis.brsvblr.com | HMSP1.BRSVBLR.COM | vcenteribis.brsvblr.com | completed | 2014-01-01T18:12:48.631-08:00 | 2014-01-01T18:15:58.833-08:00 | 3 | 2268893982 | 3001 | /Client On-Demand Data | 2869830379 | 0.20939788
Avamar | rhel64dtlt | GVSP1-117.BRSVBLR.COM | rhel64dtlt | failed | 2014-01-06T04:28:36.909-08:00 | 2014-01-06T04:33:08.778-08:00 | 4 | 0 | 1001 | //?/MOD-1389011316855 | 0 | 0
Avamar | rhel64dtlt | GVSP1-117.BRSVBLR.COM | rhel64dtlt | failed | 2014-01-06T03:55:33.714-08:00 | 2014-01-06T04:00:34.879-08:00 | 5 | 0 | 1001 | /lindata | 11409420117 | 1
Avamar | rhel64dtlt | GVSP1-117.BRSVBLR.COM | rhel64dtlt | failed | 2014-01-06T04:35:23.012-08:00 | 2014-01-06T04:39:54.777-08:00 | 4 | 0 | 1001 | //?/MOD-1389011722947 | 0 | 0
Avamar | rhel64dtlt | GVSP1-117.BRSVBLR.COM | rhel64dtlt | failed | 2014-01-06T03:26:52.193-08:00 | 2014-01-06T03:55:28.206-08:00 | 28 | 11409420117 | 1001 | /lindata | 11409420117 | 0
Avamar | rhel64dtlt | GVSP1-117.BRSVBLR.COM | rhel64dtlt | failed | 2014-01-05T23:07:21.901-08:00 | 2014-01-05T23:11:54.952-08:00 | 4 | 0 | 1001 | /lindata | 0 | 0
Avamar | rhel64dtlt | GVSP1-117.BRSVBLR.COM | rhel64dtlt | failed | 2014-01-06T01:04:34.170-08:00 | 2014-01-06T01:09:06.036-08:00 | 4 | 0 | 1001 | /lindata | 0 | 0
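As a rough illustration of working with the exported report outside Excel, the sketch below recomputes the duration of two rows of the backup summary report above from their startTime and endTime values. It assumes the timestamps are ISO 8601 strings as they appear in the report; the variable and function names are illustrative only. Comparing the result with the duration column suggests that the report stores the duration in whole minutes.

```python
from datetime import datetime

# (startTime, endTime, reported duration) copied from two rows of the report.
rows = [
    ("2014-01-01T18:12:48.631-08:00", "2014-01-01T18:15:58.833-08:00", 3),
    ("2014-01-06T03:26:52.193-08:00", "2014-01-06T03:55:28.206-08:00", 28),
]


def elapsed_seconds(start, end):
    """Return the time between two ISO 8601 timestamps, in seconds."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds()


# Truncate to whole minutes so the values can be compared with the report.
whole_minutes = [int(elapsed_seconds(start, end) // 60) for start, end, _ in rows]

for (start, _, reported), minutes in zip(rows, whole_minutes):
    print(f"{start}: computed {minutes} min, reported {reported}")
```

Once durations are available as numbers, they can feed the range, variance, and regression calculations described in the data analysis section.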
A snapshot of where the capacity forecast can be done is shown below.
Avamar
Enterprise Manager
Enterprise Manager can be used to manage multiple Avamar servers and provide capacity
forecast reports. Below is a snapshot of one of the UI windows where reports can be exported.
MCGUI
MCGUI is an administrative tool provided by Avamar for managing the backup environment. In
MCGUI, capacity reports can be run to check when the capacity limit will be reached. Below is
the snapshot from where you can run the capacity report.
Similarly, in Data Domain, if the AutoSupport feature is enabled, a report can be sent to a central
server where regression analysis will be done to predict when capacity will be reached.
Data Analysis
In this section we will see how the data collected in the previous section can be put to use. We
will start with simple analytics functions and gradually move on to complex analytical tools and
how they can be used to solve problems faced by backup administrators.
Range
Suppose that we measure the backup throughput of different backups taken and check the
maximum and minimum throughput. If the range is very limited, i.e. the minimum is 100Mb/hr and
the maximum is 105Mb/hr, the scope of analysis will be limited. In other words, the benefit of
doing the analysis will be small. If the range is very wide, i.e. the minimum is 10Mb/hr and the
maximum is 1Gb/hr, it makes sense to analyze further. The next part of the analysis will start
with measuring variation.
Variation
The range could be very high if just one of the backups took a very long time or a very short
time to complete. Hence, calculating the variance will give more information about the
variability of the backup throughput at various times. If the variance is high, further
analysis needs to be done to find which factors cause it to be high. If we export the
report to Excel, the variance can be calculated easily using built-in functions such as VAR.
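The point that a single backup can inflate the range can be checked with a small sketch. The throughput figures below are hypothetical (in Mb/hr, following the units used above), and `statistics.variance` is used because it computes the sample variance, the same estimator as Excel's VAR.

```python
from statistics import variance

# Hypothetical throughput samples in Mb/hr; the last backup is one slow outlier.
throughput = [100, 102, 101, 105, 103, 10]

value_range = max(throughput) - min(throughput)
sample_variance = variance(throughput)

# Dropping the single outlier shows how much it alone contributes.
without_outlier = throughput[:-1]
range_without = max(without_outlier) - min(without_outlier)
variance_without = variance(without_outlier)

print("all backups:     range", value_range, "variance", round(sample_variance, 2))
print("without outlier: range", range_without, "variance", round(variance_without, 2))
```

In this made-up data set one slow backup accounts for almost all of the range, which is why the range alone is not enough to decide whether further analysis is worthwhile.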
Coefficient of Variation
Looking at the variance or standard deviation alone might be misleading. A better measure is
the coefficient of variation, which is standard deviation/mean. The variance or standard
deviation might be misleading because their values depend on the unit of measurement and the
range of values. For example, if the backup speed is measured in Kbps and the values are in a
range of 1000Kbps-1500Kbps, the standard deviation can be in the hundreds. If the same speeds
are expressed in Mbps, i.e. 1Mbps-1.5Mbps, the standard deviation will be a fraction of 1.
Clearly, we cannot come to a conclusion directly from the value. However, a conclusion can
be easily reached from the value of the coefficient of variation. The snapshot below of an Excel
sheet with both standard deviation and coefficient of variation makes it clear why the
coefficient of variation is a better measure.
                          Column 1   Column 2
                          1000       4
                          1015       5
                          1020       7
                          1030       3
                          1000       6
                          1000       4
Variation                 164.17     2.17
Standard Deviation        12.81      1.47
Coefficient of variation  0.01       0.30
As shown, even though the values in column 1 vary little relative to their size, the variance
and standard deviation of column 1 are high compared to column 2, due to its higher values.
However, the coefficient of variation reflects the variation properly. Thus, with the coefficient
of variation we can correctly conclude that the variation is greater in column 2 than in column 1.
To measure the variation of a range of values, the best approach is to measure the coefficient
of variation. The Excel commands that can be used to calculate the variance, the standard
deviation, and the coefficient of variation are shown below.
=VAR(D4:D9)
=STDEV(D4:D9)
=STDEV(D4:D9)/AVERAGE(D4:D9)
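For readers who prefer a script to a spreadsheet, the same three quantities can be computed with Python's standard `statistics` module. This is a minimal sketch using the two columns from the example above; `variance` and `stdev` are the sample estimators, matching Excel's VAR and STDEV, and the helper name `summarize` is illustrative.

```python
from statistics import mean, stdev, variance

# The two columns from the example sheet.
column_1 = [1000, 1015, 1020, 1030, 1000, 1000]
column_2 = [4, 5, 7, 3, 6, 4]


def summarize(values):
    """Return (variance, standard deviation, coefficient of variation)."""
    sd = stdev(values)
    return variance(values), sd, sd / mean(values)


summary_1 = summarize(column_1)
summary_2 = summarize(column_2)

for name, summary in (("column 1", summary_1), ("column 2", summary_2)):
    print(name, [round(value, 2) for value in summary])
```

Rounded to two decimals, the results agree with the figures in the example sheet, and the coefficient of variation is the only one of the three that ranks column 2 as the more variable column.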
First, filter the backup speed of the various backups by different factors, i.e. client, time,
geography, etc., then calculate the variance under each category to get more clues.
Step 1 Calculate the average of the backup speed by categories such as client, time period,
geography, etc.
Step 2 Compare the averages among the clients for the backup speed and find the coefficient
of variation of those averages. If the coefficient of variation is high, look for the outliers, i.e. for
which client the speed is low. In a similar fashion, take the average backup speed for each time
period (different time periods such as 9AM – 10 AM, 10AM – 11AM, etc.) and find the coefficient
of variation among these averages. If the coefficient of variation is high, conduct further
analysis. This type of variation calculation will be done for different categories; client, time
period, geography, etc.
Step 3 For each category where the coefficient of variation is high, look at the groups within
that category where the average backup speed is very low and frame rules so that the average
backup speed increases. For instance, first look at a time period (9AM – 10AM) to determine if
the backup speed is slow during that particular time.
Step 4 The next step is a repeat of step 2. That is, in that time period, take all the backups and
see the coefficient of variation. If it is very low, backups from this time period for all clients will
be moved to another time period where the backup speeds are high. This will be one rule.
The second rule will be to perform the steps below if the coefficient of variation is high.
- For each client, check the backup speed
- For clients with lower backup speed, check whether the backup speed is better
  for the same client in another time period
- If it is better, frame a rule such that the backups for that client are triggered only
  during the second time period where it was found that the backup speeds are
  better
Step 5 After framing a set of rules from Step 4, the Avamar server backup scheduler will
schedule the backups using those rules and monitor the backup speeds for a period of time
(configurable).
Step 6 After monitoring it will again go to Step 1 and continue. The ideal is to have very low
coefficient of variation across all categories.
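Steps 1 to 4 can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the record layout, the client names, the speeds, and the 0.25 threshold for a "high" coefficient of variation. The sketch only covers the time-period category and produces the rule as a text string; it does not talk to the Avamar scheduler.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical backup records: (client, start hour, speed in MB/s).
backups = [
    ("client-a", 9, 20.0), ("client-b", 9, 22.0), ("client-c", 9, 21.0),
    ("client-a", 22, 95.0), ("client-b", 22, 100.0), ("client-c", 22, 98.0),
]
CV_HIGH = 0.25  # assumed threshold for a "high" coefficient of variation


def coefficient_of_variation(values):
    return stdev(values) / mean(values)


def average_speed_by(records, field):
    """Step 1: average backup speed per category (field 0 = client, 1 = hour)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record[2])
    return {key: mean(speeds) for key, speeds in groups.items()}


# Step 2: coefficient of variation of the per-hour averages.
hourly_average = average_speed_by(backups, 1)
hourly_cv = coefficient_of_variation(list(hourly_average.values()))

# Step 3: find the slowest and fastest time periods.
slow_hour = min(hourly_average, key=hourly_average.get)
fast_hour = max(hourly_average, key=hourly_average.get)

# Step 4: the variation inside the slow period decides which rule to frame.
slow_speeds = [speed for _, hour, speed in backups if hour == slow_hour]
if hourly_cv <= CV_HIGH:
    rule = "no change needed for time periods"
elif coefficient_of_variation(slow_speeds) <= CV_HIGH:
    rule = f"move all backups from {slow_hour}:00 to {fast_hour}:00"
else:
    rule = "compare each slow client against other time periods"
print(rule)
```

The same helper can be called with field 0 to repeat the calculation per client, which is the starting point for the second rule in Step 4.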
Next, we will look at some of the simple analytic methods that can be used to analyze
backup errors.
- Sort the backup errors by error code. Then look at the error codes which contribute
  most to the errors and start analyzing those backup failures.
- After the first step, determine whether the majority of backup errors occur for a particular
  client, time zone, geography, etc. This type of analysis by time zone, etc. can be done
  easily if we export the data to an Excel spreadsheet.
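The two steps above can also be sketched with Python's `collections.Counter`. The failure records and error codes below are invented for illustration and are not real Avamar error codes.

```python
from collections import Counter

# Hypothetical failed-backup records: (error code, client, hour of day).
failures = [
    ("E-TIMEOUT", "client-a", 2), ("E-TIMEOUT", "client-a", 3),
    ("E-TIMEOUT", "client-b", 2), ("E-DISKFULL", "client-c", 14),
    ("E-TIMEOUT", "client-a", 2), ("E-AUTH", "client-b", 9),
]

# First step: rank the error codes by how many failures they account for.
by_code = Counter(code for code, _, _ in failures)
top_code, top_count = by_code.most_common(1)[0]

# Second step: for the most frequent code, see where the failures cluster.
by_client = Counter(client for code, client, _ in failures if code == top_code)
by_hour = Counter(hour for code, _, hour in failures if code == top_code)

print(top_code, top_count)
print("by client:", by_client.most_common())
print("by hour:", by_hour.most_common())
```

In this made-up data most failures share one error code and cluster on one client and one hour, which is the kind of pattern that suggests a hypothesis worth testing.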
Flow chart
Time series Analysis
This type of analysis will help predict backup speeds and dataset growth going forward,
which will guide backup administrators in their planning. Time series analysis will also
help predict when storage capacity will be exhausted.
Time series analysis can be done using Excel. First, we will focus on capacity management.
To predict backup speed over a period of time, find the trend of the average backup speed over
that period. The trend could be decreasing linearly or exponentially. If it is going
down, further analysis can be done to determine whether the backup speed has gone down for all
clients/time periods or only for a particular set of time periods/clients. Based on that,
appropriate action can be taken.
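A linear trend of this kind can be estimated with an ordinary least-squares line. The sketch below is one simple way to do it, assuming weekly averages are already available; the speeds are hypothetical, and a straight line is only appropriate when the decline is roughly linear rather than exponential.

```python
# Hypothetical weekly average backup speeds (MB/s), oldest first.
weekly_speed = [100, 97, 95, 91, 88, 86]
weeks = list(range(len(weekly_speed)))

mean_x = sum(weeks) / len(weeks)
mean_y = sum(weekly_speed) / len(weekly_speed)

# Ordinary least-squares slope and intercept of speed against week number.
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, weekly_speed))
    / sum((x - mean_x) ** 2 for x in weeks)
)
intercept = mean_y - slope * mean_x


def forecast(week):
    """Extrapolate the fitted line to a given week number."""
    return intercept + slope * week


trend = "decreasing" if slope < 0 else "flat or increasing"
print(f"slope {slope:.2f} MB/s per week ({trend}); week 8 forecast {forecast(8):.1f} MB/s")
```

If the slope is negative, the same fit can be repeated per client or per time period to see whether the slowdown is general or confined to a subset.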
Flow chart of the analysis steps:
1. Analyse the data using statistical functions, i.e. range/standard deviation
2. Pick the outliers (where standard deviation is greater)
3. Derive hypotheses from the outliers based on time/client/domain, etc.
4. Test the hypotheses
Regression Analysis
Regression analysis is used to find the factors on which a result depends. There will be
several independent variables and one dependent variable. In our case, the dependent variable
is the result and the factors are the independent variables. The results of interest to the
backup administrator could be:
- Backup speed for a client/domain, etc.
- Backup failures in a time period
- Storage usage trend
- CPU usage or average disk I/O
In this section, we will look at the storage usage trend and CPU usage/disk I/O.
Storage Usage trend
Storage usage depends on:
- Number of clients
- Retention policy
- Time for which the system is up
Leaving the retention policy aside, an equation can be framed like the one below.
Y (storage usage) = a + b * time for which system is up (in days) + c * no of clients
Excel contains functions to perform this analysis.
A sample analysis is shown below:
Input data
Time in days   No of clients   Capacity in GB
1              100             10
2              123             10.5
3              126             10.6
4              129             10.9
5              135             11
6              140             11.5
7              146             11.6
8              151             11.7
9              141             11.8
10             152             12
Regression output
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.991765894
R Square            0.983599588
Adjusted R Square   0.978913756
Standard Error      0.095638612
Observations        10

ANOVA
             df   SS         MS         F          Significance F
Regression   2    3.839973   1.919986   209.9093   5.65E-07
Residual     7    0.064027   0.009147
Total        9    3.904

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept      8.378404036    0.530661         15.78863   9.91E-07   7.123591    9.633217    7.123591      9.633217
X Variable 1   0.143690179    0.025118         5.72059    0.00072    0.084295    0.203085    0.084295      0.203085
X Variable 2   0.014827252    0.004855         3.053914   0.018481   0.003347    0.026308    0.003347      0.026308
In the above output, first look at the R Square value. As a rule of thumb, the regression model
can be considered a good fit only if this value is greater than 0.8; in other words, the
prediction error is low. Here R Square is about 0.98.
With X Variable 1 as the number of days and X Variable 2 as the number of clients, and the
coefficients rounded down, the equation would be
capacity required = 8.3 + 0.14 * no of days + 0.014 * no of clients.
Now you can predict the storage required if you expect the number of clients to be 200 by
the end of 50 days.
The storage required according to the above equation would be 8.3 + 0.14 * 50 + 0.014 * 200 =
18.1GB. Thus, the backup administrator would be able to determine when the capacity might be
exhausted and plan accordingly.
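The regression above can be reproduced outside Excel. The sketch below fits the same two-variable model to the input data by solving the least-squares normal equations directly; it is one possible implementation, not the method Excel uses internally, and the function name is illustrative. With the unrounded coefficients the forecast for 50 days and 200 clients comes out at about 18.5GB, slightly above the 18.1GB obtained from the rounded equation.

```python
from statistics import mean

# Input data from the sample analysis.
days = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
clients = [100, 123, 126, 129, 135, 140, 146, 151, 141, 152]
capacity_gb = [10, 10.5, 10.6, 10.9, 11, 11.5, 11.6, 11.7, 11.8, 12]


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = a + b*x1 + c*x2; returns (a, b, c)."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(p * p for p in d1)
    s22 = sum(q * q for q in d2)
    s12 = sum(p * q for p, q in zip(d1, d2))
    s1y = sum(p * r for p, r in zip(d1, dy))
    s2y = sum(q * r for q, r in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    b = (s1y * s22 - s12 * s2y) / det
    c = (s11 * s2y - s12 * s1y) / det
    return my - b * m1 - c * m2, b, c


a, b, c = fit_two_predictors(days, clients, capacity_gb)

# R Square = 1 - (residual sum of squares / total sum of squares).
predicted = [a + b * d + c * n for d, n in zip(days, clients)]
ss_res = sum((y - p) ** 2 for y, p in zip(capacity_gb, predicted))
ss_tot = sum((y - mean(capacity_gb)) ** 2 for y in capacity_gb)
r_square = 1 - ss_res / ss_tot

forecast_gb = a + b * 50 + c * 200  # 200 clients at the end of 50 days
print(f"capacity = {a:.4f} + {b:.4f} * days + {c:.4f} * clients")
print(f"R Square = {r_square:.4f}, forecast = {forecast_gb:.2f} GB")
```

A forecast this far outside the observed range (10 days, at most 152 clients) is an extrapolation, so it is worth refitting as more data is collected.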
CPU usage Disk I/O
CPU usage and disk I/O might depend on the following factors:
1. Backed up data per day
2. Number of clients
The equation would be CPU usage = a + b*backed up data per day (in GB) + c*number of
clients.
If we follow the steps in the section above under storage usage trend, the administrator will be
able to predict the CPU usage or disk I/O usage over a period of time.
This data will be helpful in the scenarios below:
1. Backup speed is decreasing over a period of time
2. Backup failures are increasing
If we see either of these symptoms and disk or CPU usage is very high or has increased
dramatically, that usage may well be the cause. Corrective steps can then be taken, e.g. adding
capacity (adding more disks will lower the disk I/O and most likely increase the backup speed).
Visualization
There are a number of tools which can be used to visualize the data we have. One such popular
tool is Tableau. Using Tableau software, one can connect to different databases, import data
from Excel spreadsheets, and then visualize it. The screenshots below show some of what
can be achieved using Tableau.
Tableau enables graphs to be produced seamlessly from an Excel spreadsheet or any database.
Other features of the Tableau software include options to forecast and to calculate the
variation and standard deviation.
Conclusion
With the set of data analysis techniques shown above, backup administrators can perform the
following activities in a better way.
1. Discover why backups are failing and take corrective actions
2. Forecast capacity and budget
3. Report the backup data used by department or by domain
Findings and benefits accrued because of these activities can be presented to management in
a visual format with tools such as Tableau.
Reference
http://www.wikihow.com/Run-Regression-Analysis-in-Microsoft-Excel
http://users.wfu.edu/cottrell/ecn215/regress.pdf
http://www.cengage.com/resource_uploads/downloads/113318765X_342117.pdf
http://www.spiderfinancial.com/products/numxl
https://www.usenix.org/legacy/events/lisa11/tech/full_papers/Chamness.pdf
http://searchdatabackup.techtarget.com/news/1322981/The-true-role-of-a-backup-administrator
http://www.tableausoftware.com/
http://www.emc.com/collateral/white-papers/h11363-data-protection-advisor-6-overview-wp.pdf
http://www.emc.com/collateral/analyst-reports/esg-data-protection-advisor-6-raise-dp-visibility-ar.pdf
http://www.emc.com/collateral/hardware/white-papers/h9569-vmware-brs-wp.pdf
http://www.emc.com/collateral/software/white-papers/h6112-enabling-cost-control-operational-efficiency-data-protect-advisor-wp.pdf
http://www.emc.com/collateral/software/white-papers/h6108-avamar-dpa-wp.pdf
http://www.emc.com/collateral/customer-profiles/h8692-cp-emc-it-dpa.pdf
EMC believes the information in this publication is accurate as of its publication date. The
information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION
MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO
THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an
applicable software license.