Presenter: Charles Krajewski, Colby-Sawyer College, 6/18/2014


Introduction:

Scope of talk – meant for users across the board, both IT and operational

Keeping the ‘tech talk’ to a minimum for both types of audience, IT and user

Definitions – there are many definitions for this topic (data mart, data warehouse, ODS, etc.) and some conflict. It’s a matter of particular approach, but the end product, whatever you call it, should meet the data reporting and analytical needs of the user departments that will rely on it

This presentation’s definitions – ‘Data Repository’ -> the entire database designed to span all the reporting departments; ‘Data Warehouse’ -> the portion of the Repository that addresses a particular ‘business process’ (e.g. student academics, Alumni relations, etc.), not necessarily a given department within the institution


Apologies to David Letterman – the top 5 reasons why users need a Data Repository

#5 – IT won’t have to do all those nasty application integrations

a. Yes you will; those integrations between PC and PF, and between PC, PF, and GP, will all still be there

b. Application releases will still be part of your life

c. Well, with a data repository there will be yet one more database to maintain and integrate. However, this one will be of your own design (if done according to YOUR school’s needs); the access rights will be focused on the data here and not necessarily on the application databases; and the repository will be (mostly) free of the foibles of the integrations provided by the applications (e.g. ‘The Bridge’)

#4 – It will ‘empower’ the users to build their own reports/charts

a. A data repository, at base, is still a database with all the technical challenges of an application; except the DR will absorb a lot of the technical detail (e.g. filtering criteria, access rights, etc.) that an application database has.

b. Remember: an application database is built to collect, edit, and store data; the DR is built for ease in reporting

c. Besides, is it the best use of the time of people whose technical skills lie elsewhere to train them in the intricacies of database management? Does your CFO really need to know about left outer joins?

d. From an IT point of view the reporting function will still exist, but it will be easier.


#3 – FINALLY we’ll have a single information source to answer all our questions!

a. The issue here is not so much the answer but the question. A DR provides the data needed TOWARD answering a question, but we will still be faced with formulating the question.

b. A real benefit from building the DR will come in the meetings to agree on the terms we use – the “Language of the Institution”

#2 – We need to be competitive with ‘Jones’ University’!

a. The DR will NOT fix your retention rates or your slumping annual fund giving, or get that one big endowed gift.

b. The DR only provides data that can be pulled together in a meaningful way and analyzed to present the possibilities – the opportunities and pitfalls of past actions. It cannot act on its own.

c. From the data you can see trends, but from there you will need to adjust your organizational way of working to take advantage of the insights.

#1 – It’s so WAY COOL!!

a. Everybody LOVES that chart and graph that lays out exactly how we’re doing.

b. Drooling over a piece of that pie chart; bellying up to that bar graph; wrapping your arms around those ample pivot charts

c. The DR is the data provider; your charting/graphing choices are only the benefit of having a centralized data store

d. Beware: what’s worse than ‘Garbage in, Garbage out’? Look out for ‘Garbage in, Gospel out’ – the flash and dazzle could blind you to inadequate and/or inaccurate data

e. BUT with a DR correctly built and vetted for accuracy and precision, a well-designed and well-placed chart/graph/spreadsheet can be a thing of beauty


The President’s Initiative

The Concerns #1 – senior management

a. The Plan will require a new way of looking at the data to determine ‘How are we doing?’

b. Not sure what measures will be needed

The Concerns #2 – operationally

a. The complexity of reporting was only getting worse

b. Interestingly, the access to data was creating a situation where technical users were creating their own reports that should have been produced elsewhere, resulting in inaccurate reporting

c. Report sharing – using reports for purposes that they were NOT designed for

d. External reporting changes and new reporting requirements were pointing out a need for a more efficient reporting method


Membership and Roles:

a. Structure of PCUG at Colby-Sawyer – a member of each operational department with ‘one foot in IT and the other in the user area’; meets bi-weekly to discuss the use of PC – new options, upgrades, reporting requirements, etc.

b. PCUG formed the core of the committee

c. Added the IT DBA, the Director of IR, and other users of the reporting who are not necessarily PC users

Statement of Understanding – the ‘Project Charter’:

a. Describes the scope of the project, project/task process, reporting

b. A written document in the ‘Language of the School’ – use the terms and definitions that would be understood by your colleagues; stay away from technical jargon. It doesn’t help if the charter is written in such a way that only a single individual/department can understand it.

c. Part of the charter will describe how the DR will be built but ALSO how it will be maintained! Without constant attention to what the DR holds and HOW it holds the data, it will soon fall into disuse and obsolescence.

Reporting:

a. This is a project that will influence campus-wide reporting and analysis into the future; it will start to influence how you even talk-the-talk of the institution. As such, the charter needs to be a public document.

b. Not only the charter but the progress of the project should be public – it does no good if you make the ‘grand announcement’ and then go into some dark development corner and emerge with some product that people are supposed to trust and use.

c. Reporting will keep the project in front of the users; no mysteries here

Institutional Support:

a. Time, Talent, Treasure:

b. Time: schedule regular meetings to keep the project committee wheels turning; attendance must be made a priority. BTW: this is not going to be a quick and easy project; most documents indicate that getting a DR off the ground is a 3-5 year commitment.

c. Talent: this is a time to garner the most experienced and skilled people you have to ensure that the knowledge that builds the DR will be the best

d. Treasure: this will cost not only dollars for the technical infrastructure but also the time your committee members spend away from other duties

e. Senior Management support – hold the committee members’ feet to the fire on attending and participating in the DR project; cut them the slack needed to contribute, ask for reports back from the meetings, and stay engaged in their work.


Identify the building blocks for the Data Repository:

a. This is a gathering/organizing phase and NOT a design phase

b. It’s helpful to track and name the different business processes that each department executes. For example: in Admissions look at the candidate funnel process – what distinct steps do you go through to get someone in the door? In Advancement – name the areas that address Alumni relations, Major gift stewarding, etc.

c. Identify the areas not by which setflows are used but by how you collect data and how you report data

d. Each one of these identified processes is a candidate for becoming a distinct subsection of the DR – these will be your datamarts

Collect the data ‘end points’

a. An ‘end point’ is simply a place where your collected data will show up in the decision-making and reporting process anywhere in the institution

b. Look at:

your internal (i.e. departmental) reporting – the reporting that keeps your department running

interdepartmental reporting – those reports that you’re constantly sending to other departments to keep your departments coordinated and allow the others to do THEIR job

Senior Management – what reports are you sending up the ‘food chain’ so they can make their decisions

External reporting – these are legion: IPEDS, VSE, NSLC, CASE, NEASC, FastFacts, etc. Reports that your IR people are creating to report out the status of the institution

c. Look for those manual reports, spreadsheets, and post-it notes that are being kept in ‘shadow systems’ – those extras that are used to make decisions

Selecting the software platform:

a. The database structure, development language product, and reporting tool

b. Often overlooked; when this choice is ignored, ‘tools of convenience’ slip in and create a DR reporting/development coordination nightmare.

c. This is where your DBA / IT members will be most helpful!


Select from the datamarts a sample mart that will be used to prototype your development efforts.

Keep it:

Small

Well defined

Contained

Public

Remember, this first one will be a model for the development process for what follows.

Once decided upon, determine which data ‘end points’ are utilized and decide on a sample data set that will be used in the development process.

The ‘data set’ should include a typical reporting group of data (e.g. an academic year’s worth of students, a fiscal year’s worth of giving, etc.); nothing too large or you’ll get bogged down in seas of data when you want to verify the accuracy of your datamart development procedures. Part of the data set should also include a sampling of your ‘problem children’ – any data that you know you will come across but that is very unusual in content (e.g. double majors, Alumni with more than one class year, gifts through a DAF, etc.). These will help stress test your developing system.


The Bill of Materials

Collect the datamart’s data elements:

a. When labeling/describing your data elements do not use application-specific terms; use the ‘language of the school’

b. Application talk will come later

c. Don’t try to merge terms across end-points yet; do it when you’ve identified the complete set

Collate the collected elements:

a. NOW combine elements. Look for

Elements with different names that describe the same data

Elements with the same name that describe different data

b. When labeling the elements for the DR, work across departments so that your naming will work no matter what department is using the term; this can be a challenge – we all love our terms!
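The two collation checks above can be sketched in a few lines of Python. This is only an illustration: the departments, element names, and definitions are made up, and the exact-text match on definitions is a simplifying assumption; in practice the committee decides whether two definitions really mean the same thing.

```python
from collections import defaultdict

# Hypothetical element inventory gathered from the end points:
# (department, element name, plain-language definition).
elements = [
    ("Registrar", "Class Year", "year the student is expected to graduate"),
    ("Advancement", "Class Year", "year the alum prefers to be affiliated with"),
    ("Admissions", "Home State", "state of the permanent address"),
    ("Advancement", "Preferred State", "state of the permanent address"),
]


def collate(rows):
    """Return (same name / different data, different names / same data)."""
    by_name = defaultdict(set)
    by_definition = defaultdict(set)
    for _dept, name, definition in rows:
        by_name[name.lower()].add(definition.lower())
        by_definition[definition.lower()].add(name.lower())
    # One name carrying more than one definition is a conflict to resolve.
    conflicts = {n: sorted(d) for n, d in by_name.items() if len(d) > 1}
    # One definition carrying more than one name is a set of synonyms to merge.
    synonyms = {d: sorted(n) for d, n in by_definition.items() if len(n) > 1}
    return conflicts, synonyms


conflicts, synonyms = collate(elements)
```

Here `conflicts` flags "class year" as one name with two meanings, and `synonyms` flags the two state elements as candidates for a single agreed-upon name.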

Construct the Data Dictionary:

a. Select a method for collecting and storing your data elements – Excel, SharePoint, home-built database (we use an Access database for this purpose), etc.

b. Whatever it is, make it THE place to go for your definitions – the arbiter

c. Make a member of the project committee the ‘keeper’ of the dictionary – funnel the definitions, etc. through this person for recording

d. Make sure it’s available to each member; later it should be available to anyone who will be using the DR.


Getting the DBA very involved

The translation of all the logical work into the needed technology

This is where the IT membership on the steering committee will be most useful – translating and talking things through

Security Issues

This is where the access rights will be granted; the overall understanding of who should appropriately have access to which tables could (and will) shape the actual structure of the tables within the database

Structuring the Schema

Collect the data elements based on relevancy – work through by

describing the differing entities (i.e. a Student, a Donor, etc.)

What will be structured as a table; what will be structured as a data

view (define a table vs. a view)

Define and build the Tables and Views

Decide on the Logical collections (joins) of tables

When building Views, look to the data end-points to help identify and collect what the common views might be (example: Registrar’s reporting – 17 reports turned out to need only 3 different data extracts (views) once we looked at them)
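To make the table vs. view distinction concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for the real database platform. The table, column, and view names are invented for illustration and are not taken from any actual datamart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Tables store the data, one per entity (hypothetical names).
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrollment (student_id INTEGER, term TEXT, credits INTEGER);

    -- A view stores no data; it is a saved query (here, a join of the
    -- two tables) that several reports can share.
    CREATE VIEW v_term_credits AS
    SELECT s.student_id, s.name, e.term, SUM(e.credits) AS total_credits
    FROM student AS s
    JOIN enrollment AS e ON e.student_id = s.student_id
    GROUP BY s.student_id, s.name, e.term;

    INSERT INTO student VALUES (1, 'Ada'), (2, 'Ben');
    INSERT INTO enrollment VALUES
        (1, '2014FA', 3), (1, '2014FA', 4), (2, '2014FA', 3);
    """
)


def term_credits(term):
    """One of possibly many reports that read the same shared view."""
    rows = conn.execute(
        "SELECT name, total_credits FROM v_term_credits "
        "WHERE term = ? ORDER BY name",
        (term,),
    )
    return rows.fetchall()


report = term_credits("2014FA")
```

Any number of reports can select from `v_term_credits` with different filters, which is the sense in which many reports can collapse onto a handful of views.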


Example: Why, you might ask, would I counsel you to keep close track of where you do your transformations? Remember 3-4 years ago when, I think it was Ellucian, changed the way phone numbers were being stored in the database? I was in the middle of shaking down our Students datamart at the time; this change required that I add 3 fields to the datamart (to capture the number), add one query (an extract), and modify one query to capture the data. All the queries and views that needed the phone number needed no changes at all. The datamart took care of that.
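The idea behind that anecdote, keeping the transformation in the extract step so that reports never see the source layout, might look like the sketch below. The two source layouts and the three datamart field names are assumptions made up for illustration; the actual vendor change and the actual fields are not described here.

```python
def extract_phone_old(source_row):
    """Old hypothetical source layout: one free-text phone column."""
    digits = "".join(ch for ch in source_row["phone"] if ch.isdigit())
    return {
        "area_code": digits[:3],
        "local_number": digits[3:10],
        "extension": digits[10:],
    }


def extract_phone_new(source_row):
    """New hypothetical source layout: phone already split into parts."""
    return {
        "area_code": source_row["area"],
        "local_number": source_row["number"],
        "extension": source_row.get("ext", ""),
    }


def phone_for_report(mart_row):
    """Report-side code reads only datamart fields, never the source."""
    local = mart_row["local_number"]
    text = f"({mart_row['area_code']}) {local[:3]}-{local[3:]}"
    if mart_row["extension"]:
        text += f" x{mart_row['extension']}"
    return text


old = extract_phone_old({"phone": "(603) 555-0100"})
new = extract_phone_new({"area": "603", "number": "5550100"})
```

Both extracts produce the same datamart row, so `phone_for_report` needs no change when the source layout changes; only the extract is swapped.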


Define the queries/scripts

This is where the technical decisions made by the committee come into play

The table/view structure and the data dictionary will have a bit of a ‘shotgun wedding’ – the elements defined in the dictionary will now need to be translated from the source application (e.g. PowerCAMPUS) to the datamart – the data will be extracted -> transformed -> loaded – the ETL process, as it’s called in BI parlance
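A minimal sketch of the extract -> transform -> load sequence follows, again using in-memory SQLite databases as stand-ins. The source table `app_people` and the datamart table `student` are hypothetical names, not the real application schema.

```python
import sqlite3


def extract(source):
    """Extract: pull raw rows from the (hypothetical) application table."""
    return source.execute(
        "SELECT people_id, last_name, first_name, class_yr FROM app_people"
    ).fetchall()


def transform(rows):
    """Transform: reshape raw columns into the data dictionary's terms."""
    return [
        (pid, f"{first.strip()} {last.strip()}", int(yr))
        for pid, last, first, yr in rows
    ]


def load(mart, rows):
    """Load: replace the datamart table's contents with the new rows."""
    mart.execute("DELETE FROM student")
    mart.executemany(
        "INSERT INTO student (student_id, full_name, class_year) "
        "VALUES (?, ?, ?)",
        rows,
    )
    mart.commit()


# Stand-in source application database.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE app_people "
    "(people_id TEXT, last_name TEXT, first_name TEXT, class_yr TEXT)"
)
source.execute("INSERT INTO app_people VALUES ('P001', ' Smith', 'Jo ', '2016')")

# Stand-in datamart database.
mart = sqlite3.connect(":memory:")
mart.execute(
    "CREATE TABLE student (student_id TEXT, full_name TEXT, class_year INTEGER)"
)

load(mart, transform(extract(source)))
loaded = mart.execute("SELECT * FROM student").fetchall()
```

Keeping the three steps as separate functions makes it easier to see, and to document in the data dictionary, exactly where each element is renamed or reshaped.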

Build the refresh process

To ‘refresh’ the datamart (repository) is to update the data repository, either in total or partially, to reflect the state of the data as of a certain point in time.

Discuss real-time vs. real ENOUGH time – what is current enough to make for meaningful reporting
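The difference between a total and a partial refresh can be sketched as below. This is an illustration under stated assumptions: the source rows carry a hypothetical `modified` timestamp, the datamart is just a dictionary, and deletions in the source are not handled by the partial path.

```python
from datetime import datetime


def refresh(mart, source_rows, as_of, since=None):
    """Bring `mart` up to date and record the point in time it reflects.

    since=None       -> total refresh: rebuild every row.
    since=<datetime> -> partial refresh: only rows modified after `since`.
    """
    if since is None:
        mart["rows"] = {}
    for row in source_rows:
        if since is None or row["modified"] > since:
            mart["rows"][row["id"]] = row["value"]
    mart["as_of"] = as_of
    return mart


source = [
    {"id": 1, "value": "A", "modified": datetime(2014, 6, 1)},
    {"id": 2, "value": "B", "modified": datetime(2014, 6, 10)},
]
mart = {"rows": {}, "as_of": None}

# Nightly-style total refresh.
refresh(mart, source, as_of=datetime(2014, 6, 11))

# A source row changes; a partial refresh picks up only that change.
source[1] = {"id": 2, "value": "B2", "modified": datetime(2014, 6, 15)}
refresh(mart, source, as_of=datetime(2014, 6, 18), since=mart["as_of"])
```

The recorded `as_of` value is what lets a report state how current its data is, which is the heart of the real-time vs. real ENOUGH time discussion.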

Considerations

Getting the datamart up and running

Initial testing

Vetting the test data set

Going live (well almost)


Remember buy-in

Probably one of the toughest parts of the whole process

WILL require running parallel reports to validate the mart

Watch out for the changes made -> that innocent change to a report structure

or how a data element is defined ‘with the new system’ could be a stumbling

block to acceptance

Major credibility issues at this point -> especially for the prototype

Approval and sign off -> celebration time, really! Party down!! It’s hard work,

it shows, and it’s usable!

A word of caution here – your users will be so impressed with what you’ve accomplished that they’ll holler for more! Beware of the “You did that, can you do this…” syndrome. Take the suggestions, acknowledge them, look at them as evidence that your users ARE interested and excited, and log them into your project… as items that will form the core of, let’s call it, Version 2.0. You can get very hung up on unplanned iterations.
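The parallel-report validation mentioned above can be partly automated. The sketch below compares the rows of a legacy report with the rows of the same report produced from the mart; the row layout and the `id` key are hypothetical.

```python
def compare_reports(legacy_rows, mart_rows, key):
    """Compare two reports row by row, matching rows on `key`."""
    legacy = {row[key]: row for row in legacy_rows}
    mart = {row[key]: row for row in mart_rows}
    shared = legacy.keys() & mart.keys()
    return {
        "missing_from_mart": sorted(legacy.keys() - mart.keys()),
        "extra_in_mart": sorted(mart.keys() - legacy.keys()),
        "mismatched": sorted(k for k in shared if legacy[k] != mart[k]),
    }


legacy_report = [
    {"id": 1, "credits": 15},
    {"id": 2, "credits": 12},
    {"id": 3, "credits": 9},
]
mart_report = [
    {"id": 1, "credits": 15},
    {"id": 2, "credits": 16},
    {"id": 4, "credits": 3},
]

diff = compare_reports(legacy_report, mart_report, key="id")
```

Each difference is then either a datamart bug or a definition that changed ‘with the new system’; both need to be explained to the users before sign off.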


Myth: http://en.wikipedia.org/wiki/Sisyphus


Doing some demo:

Description of our platform

MS-SQL database structure (I am, admittedly, NOT a systems type)

PowerCAMPUS w/ all setflows minus Academic Plan, PFaids, Moodle (with Remote Learner), Symplicity Residence

Development – MS-Access 2010

Reporting – through Access, moving to Evisions Argos reporting (2 years now)

Short stint with Crystal and MSRS

Very little VistaView/VistaReports – mostly through the Billing setflow

Scheduled refresh tasks – MSIS, MARS


Traps:

Time, talent, treasure – not spending the time, not utilizing your best talent, not investing in the infrastructure required

Buy in – non-involvement and even ‘sabotage’, not only by management but also by the users. Remember Machiavelli!

Shadow Systems – there will be those systems that lie hidden and work to undermine the effort to make the DR relevant

Data “Ownership” – different departments will view the data-work they do as creating ‘their’ data and will make the effort to keep access to it under their control (this could be the largest hurdle you’ll encounter)

Gate Keeping – goes along with the above ‘data ownership’ issue – user departments will insist that they have the authority to limit who sees what

Immediacy – when that last-minute, gotta-have-it-now report looms there will be the temptation to pull the data “just this once” using the old ways of doing things; this undermines the understanding that the DR will be the source of reporting. Senior managers need to get into this one and insist that the DR be used even if the immediacy is slowed a bit

But, there are promises as well! Click

Promises

Time, talent, treasure – once established (and even during the build process) time will be saved not only in building those reports and charts but also in verifying the data; that is time saved to use in moving the school forward and coming up with new ways of pulling and analyzing. As the DR grows, the work of the committee will build the talent of your staff across departmental divides (which disappear through the cooperation of the committee); and that is a treasure for the school – well-trained users accessing a uniformly designed and agreed-upon data store.

Consistent Reporting – your reporting will benefit from a clear understanding of what reports contain, how they were developed, and what each data element on the report represents

Higher data integrity – across campus, consistent and agreed-upon definitions and methods for creation will lead to high trust in what the users see and use

Informed reporting – no more “What does this report do?” from users; it will aid in using the proper report for the proper reason.

Buy in – the more involvement in the DR development process you have, the more the users will come to trust it, and with that they will come to accept, use, and move to further refine your DR

Data Ownership becomes Stewardship – each department will understand their part in the creation and maintenance (the care and feeding) of the DR. Increased trust in the use of and ‘professional’ access to the data will allow possessive users to loosen their hold on the gatekeeping function.

PLUS– It’s WAY COOL!!
