7/25/2019 Data Rule Questions
1/23
Report Types pane
Use this pane to create a report. Choose from the list of predefined report types. A report type defines the organization and the types of details that appear in a report.
Name
Shows the report category and the available report types.
Description
Shows a description of the report.
Tasks pane
Select a report type from the object list and then click New Report to customize and create the report.
Creating a report
You can create a report by selecting the type of report that you want to create. The details and organization of the report vary across report types.
To create a report:
1. On the Home navigator menu, select Reports.
2. In the Reports workspace, click Report Types.
3. Expand a category and select a report type.
4. In the Task list, click New Report.
5. In the New Report pane, complete the required fields, and click Next.
6. Optional: In the Select Project Association step, select a project and select Associate report with project.
7. Click Next.
8. In the Specify Name and Output step, specify the name of the report and the output format.
9. Select a folder to save the report in. By default, reports are saved in the root Reports folder.
10. Optional: Select Save Report to save your configurations. You can then run the report again by using the settings that you assigned to this report.
11. Click Finish to run the report.
After the report runs, you can view the report.
Reports
You can create reports to summarize your analysis results and display details about your project. For example, reports can help you learn about the structure of your data so that you can make informed business decisions.
The types of reports that are available to you vary according to the suite component that you have installed. Each report corresponds to a task. There are also reports that display general information about the project that you are working in.
Reports can display information in a standard template form or in a graphical form. You can also choose the output format for the report, such as XML or HTML.
Running a saved report
After you create, configure, and save a report, you can run that report at a later time to display changes or updates in your information.
To run a saved report:
1. On the Home navigator menu, select Reports.
2. In the Reports workspace, click Saved Reports.
3. Select the report that you want to run.
4. In the Tasks pane, click Run.
After the report runs, you can view the updated report.
Marking a saved report as a favorite
For quick access to a report in the Favorite Reports content pane in the My Home workspace, you can mark a saved report as a favorite.
To mark a saved report as a favorite:
1. On the Home navigator menu, select Reports.
2. In the Reports workspace, select Saved Reports.
3. Select the report that you want to mark as a favorite.
4. In the Tasks pane, click Add To Favorites.
After you mark a report as a favorite, it appears in the Favorite Reports content pane in the My Home workspace.
Creating a new report from a saved report
To create a new report that is based on the configuration details of a saved report, you can create a copy of a saved report.
To create a new report from a saved report:
1. On the Home navigator menu, select Reports.
2. In the Reports workspace, click Saved Reports.
3. Select the report that you want to copy.
4. In the Tasks pane, click Create Copy.
5. In the Steps to Complete wizard, modify the parameters and settings as needed.
6. Click Next to progress through the steps.
7. Click Finish.
You can now view the report results.
Analyzing data by using data rules
The topics in this section describe how to define and execute data rules, which evaluate or validate specific conditions associated with your data sources. Data rules can be used to extend your data profiling analysis, to test and evaluate data quality, or to improve your understanding of data integration requirements.
To work with data rules, always start by going to the Develop navigator menu in the console and selecting Data Quality. This is the starting point for creating and working with data rules.
From the Data Quality workspace you can:
- Create data rule definitions, rule set definitions, data rules, rule sets, and metrics
- Build data rule definition, rule set definition, and metric logic
- Create data rule definition and rule set definition associations
- Associate a data rule definition, rule set definition, metric, data rule, or rule set with folders
- Associate a data rule definition, rule set definition, metric, data rule, or rule set with terms, policies, and contacts
- Build data rule definitions or rule set definitions by using the rule builder
- Add a data rule definition with the free form editor
Characteristics of data rule functionality
You can use data rules to evaluate and analyze conditions found during data profiling, to conduct a data quality assessment, to provide more information to a data integration effort, or to establish a framework for validating and measuring data quality over time.
You can construct data rules in a generic fashion through the use of rule definitions. These definitions describe the rule evaluation or condition. By associating physical data sources to the definition, a data rule can be run to return analysis statistics and detail results. The process of creating a data rule definition and generating the subsequent data rule is shown in the following figure.
Figure 1. Process of creating and running a data rule definition
IBM InfoSphere Information Analyzer data rules include the following characteristics:
Reusable
The definitions are not explicit to one data source, but can be used and applied to many data sources.
Quickly evaluated
They can be tested interactively, as they are being created, to ensure that they deliver expected information.
Produce flexible output
Data rules can produce a variety of statistics and results, at both the summary and detail levels. You can evaluate data either positively (how many records meet your condition) or negatively (how many records violate your condition), and control the specifics of what you need to view to understand specific issues.
Historical
They capture and retain execution results over time, allowing you to view, monitor, and annotate trends.
Managed
Each data rule has a defined state, such as draft or accepted, so you can identify the status of each rule.
Categorical
You can organize data rules within relevant categories and folders.
Deployable
You can transfer data rules to another environment. For example, you can export and transfer a data rule to a production environment.
Audit functionality
You can identify specific events associated with a rule, such as who modified a rule and the date it was last modified.
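The relationship between a reusable rule definition and the data rules generated from it can be sketched outside the product. The following Python is a minimal illustration only, not Information Analyzer syntax or API: the names RuleDefinition, DataRule, bind, and run_rule, and the sample records, are all hypothetical. It shows one definition written against a logical variable, bound to two different physical columns, and evaluated both positively (records that meet the condition) and negatively (records that violate it).

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List


@dataclass(frozen=True)
class RuleDefinition:
    """Hypothetical stand-in for a rule definition: logic on a logical variable."""
    name: str
    variable: str                      # logical variable name, e.g. "age"
    condition: Callable[[Any], bool]   # the rule evaluation or condition

    def bind(self, column: str) -> "DataRule":
        # Associating a physical column with the definition yields a runnable rule.
        return DataRule(self, column)


@dataclass(frozen=True)
class DataRule:
    definition: RuleDefinition
    column: str


def run_rule(rule: DataRule, records: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Return summary counts plus the detail rows that violate the rule."""
    met = 0
    violations: List[Dict[str, Any]] = []
    for record in records:
        if rule.definition.condition(record[rule.column]):
            met += 1
        else:
            violations.append(record)
    return {"met": met, "violated": len(violations), "violations": violations}


# One definition, reused against two differently named columns (made-up data).
age_is_valid = RuleDefinition(
    "age_is_valid", "age", lambda v: v is not None and 0 <= v <= 120
)

voters = [{"voter_age": 34}, {"voter_age": 150}, {"voter_age": 18}]
patients = [{"patient_age": None}, {"patient_age": 7}]

voter_result = run_rule(age_is_valid.bind("voter_age"), voters)
patient_result = run_rule(age_is_valid.bind("patient_age"), patients)
print(voter_result["met"], voter_result["violated"])
print(patient_result["met"], patient_result["violated"])
```

Keeping the condition separate from the column binding is what makes the definition reusable: the same logic runs unchanged against any source that supplies a suitable column.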
Getting started with rule analysis
When working with InfoSphere Information Analyzer rules, you always start by going to the Develop tab in the main menu and then selecting Data Quality from the menu.
Data rule definitions
Rule definitions are the foundation for rule analysis, allowing you to extend and explore beyond basic data profiling. They provide a flexible method to define specific tests, validations, or constraints associated with your data. The resulting data rules provide you with the ability to ensure compliance to expected conditions or identify and review exceptions.
Data rules
After you create your rule definition logic, you generate data rules to apply the rule definition logic to the physical data in your project.
Rule set definition overview
Rule set definitions are a collection of data rule definitions.
Metrics
You use metrics to consolidate the detailed statistical results from one or more data rules, rule sets, or other metric results into a meaningful measurement. This could be a specific weighted average, a cost factor, or a risk calculation.
Global logical variables
A global logical variable is a value that you can set to represent a fact or a specific piece of data. After you create a global logical variable, it is a shared construct that can be used in all of your data rule definitions and data rules.
Accessing output results for data rules, metrics, and rule sets
You can access output results from the InfoSphere Information Analyzer My Home workspace, from the project dashboard, or in the Data Quality workspace.
Viewing data quality activity history
You can view the activity history of data rule definitions, data rules, metrics, rule set definitions, and rule sets by using the audit trail feature. The audit trail feature tracks the history of activity for each data quality object, such as when it was created, changed, executed, and so on.
Associating a data rule definition with folders
Folders provide the ability to organize and view data rules, rule sets, and metrics according to one or multiple business dimensions, or alternately by data source or system.
Associating a rule definition with terms, policies, and contacts
You can create an association between a rule definition and a term (which you can create by using InfoSphere Business Glossary) to provide additional meaning to the rule definition. You can also associate a policy or a contact available in the repository with a rule definition. These provide additional means to annotate or give you a link to more information about the rule definition.
Debugging data rules and rule sets
You can use the debugging log option to isolate errors in your rule logic that are preventing successful rule evaluation.
Deploying quality components
In a basic deployment scenario, InfoSphere Information Analyzer quality components are moved from a development or test environment to one or more production environments. You can import and export quality components, including data rule definitions, rule set definitions, data rules, rule sets, global variables, and metrics.
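The Metrics entry above mentions consolidating rule results into a weighted average. As a rough illustration in Python (this is not Information Analyzer metric syntax, and the rule names, pass rates, and weights are invented for the example), such a metric over several data rule results could be computed like this:

```python
from typing import Dict


def weighted_average(pass_rates: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-rule pass rates (0.0 to 1.0) into one weighted score."""
    total_weight = sum(weights[name] for name in pass_rates)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(pass_rates[name] * weights[name] for name in pass_rates)
    return weighted_sum / total_weight


# Hypothetical results from three data rules and the weight given to each.
pass_rates = {"age_is_valid": 0.5, "id_is_unique": 1.0, "email_has_format": 0.75}
weights = {"age_is_valid": 2.0, "id_is_unique": 1.0, "email_has_format": 1.0}

quality_score = weighted_average(pass_rates, weights)
print(f"quality score: {quality_score:.4f}")
```

A cost factor or risk calculation would follow the same shape, with the weights replaced by a cost or risk value per violated record.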
Selecting a report template to work with
You can select a report template to manage access control, create reports, and view reports that are created from the template.
Prerequisite: Your user name, group, or role must be on the access control list for the template.
To select a report template to work with:
1. In the Navigation pane of the Reporting tab, select Contents.
2. Expand Report Templates to view the folders that contain templates.
3. Expand a template folder and click View Report Templates to view a list of templates that the folder contains.
4. Select a template from the list.
You can click the name of any task in the task list to perform the task.
Creating reports from templates
You can use a report template to create reports.
Selecting a report to work with
You can select a report to control access to the report, to run, schedule, or delete the report, and to view results that are created from the report.
Viewing and modifying reports
You can view reports and change parameter values and other settings.
Deleting reports
You can delete reports that were created from a report template. This deletes from the repository the reports and all the report results that were created from the reports.
Report parameters
You can specify values for report parameters when you create or modify a report, and when you run a report manually.
Logging categories
If you select certain reports, such as Log messages by data range, you are asked which logging categories you want to include in the report. The following table includes a list of all available categories and their descriptions.
Output options for report results
You can specify additional output options for report results that you create in HTML, PDF, text, XLS, and XML format. The RTF and DHTML formats do not have additional options.
Creating reports from templates
You can use a report template to create reports.
Prerequisites: You must have read permission in the access control list for the template.
To create a report from a template:
1. Select a report template.
2. Click New Report.
3. In the New Report pane, specify parameters and other settings:
   - Optional: Change the default name and description of the report. The name and description are inherited from the report template that the report is created from.
   - Optional: Select a folder to save the report in. By default, reports are saved in the root Reports folder.
   - Specify values for parameters or leave one or more parameter values unspecified, depending on whether you plan to run the report manually or by using a schedule.
   - Specify the output format of the report results that are created from the report. Different templates can offer different output formats.
   - Specify the time period after which each report result expires, or specify that report results do not expire. Report results are deleted at midnight on the expiration date.
   - Specify whether to overwrite each report result with the newest version, or to save a specified number of report results for the report.
4. Save the report.
You can run the report to create report results.
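The two retention settings in step 3 (keep only the newest result or a fixed number of results, and expire results after a time period) can be modelled in a few lines. The Python below is a hypothetical sketch, not the reporting service's implementation: the class name ReportResultHistory, its parameters, and the assumption that a result expires exactly expire_after_days days after its creation date are all illustrative.

```python
from collections import deque
from datetime import date, timedelta
from typing import Deque, List, Optional, Tuple


class ReportResultHistory:
    """Hypothetical model of the overwrite / keep-N and expiration settings."""

    def __init__(self, keep: int = 1, expire_after_days: Optional[int] = None) -> None:
        # keep=1 behaves like "overwrite each result with the newest version".
        self._results: Deque[Tuple[str, date]] = deque(maxlen=keep)
        self._expire_after_days = expire_after_days

    def add(self, name: str, created: date) -> None:
        # When the deque is full, appending drops the oldest result.
        self._results.append((name, created))

    def purge_expired(self, today: date) -> None:
        if self._expire_after_days is None:
            return  # results are set to never expire
        limit = timedelta(days=self._expire_after_days)
        kept = [r for r in self._results if today - r[1] < limit]
        self._results = deque(kept, maxlen=self._results.maxlen)

    def names(self) -> List[str]:
        return [name for name, _ in self._results]


history = ReportResultHistory(keep=2, expire_after_days=7)
history.add("run1", date(2019, 7, 1))
history.add("run2", date(2019, 7, 5))
history.add("run3", date(2019, 7, 6))   # run1 is dropped: only two are kept
names_before_purge = history.names()

history.purge_expired(today=date(2019, 7, 12))  # run2 is now 7 days old
names_after_purge = history.names()
print(names_before_purge, names_after_purge)
```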
Templates, reports, and results
You can use report templates to create reports. You can run reports to produce report results.
On the Reporting tab, the word report refers to the object that you run to produce report results. A report result is a file that is created in a specified, viewable format, such as HTML, PDF, or text format.
when you run the report. The parameter values determine what information is searched for and returned when you run the report.
Reports
Reports are executable objects that are created from report templates. You run reports to produce report results. You can schedule reports or run them manually.
Users who have Owner permission for a template can add other members to the access control list.
Delete
Delete the report result.
The user who created the report result is the only member who has administration permission by default. The user who created the report result also has read and delete permission for it. This user can add other members to the access control list and grant administration and other permissions.
Members of the access control list for the report also appear on the access control list for the report result. These members inherit the permissions that they had for the report, except that administration permission is not inherited and update and run permissions do not apply to report results.
For example, a user, group, or role that has read, run, update, and delete permission for a report inherits read and delete permissions for all of the report results.
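The inheritance rule described above amounts to a small set computation. The following Python is a sketch of that rule only, not product code: the function names, the permission strings, and the sample members ("analysts", "dsadmin", "alice") are assumptions made for the illustration.

```python
from typing import Dict, FrozenSet, Set

# Per the text: administration is not inherited, and update and run
# do not apply to report results.
NOT_INHERITED: FrozenSet[str] = frozenset({"administration"})
NOT_APPLICABLE_TO_RESULTS: FrozenSet[str] = frozenset({"update", "run"})

# Per the text: the creator of a result gets these by default.
CREATOR_DEFAULTS: FrozenSet[str] = frozenset({"administration", "read", "delete"})


def inherited_result_permissions(report_permissions: Set[str]) -> Set[str]:
    """Permissions a report ACL member carries over to a report result."""
    return set(report_permissions) - NOT_INHERITED - NOT_APPLICABLE_TO_RESULTS


def result_acl(report_acl: Dict[str, Set[str]], creator: str) -> Dict[str, Set[str]]:
    """Build the access control list for a new report result."""
    acl = {
        member: inherited_result_permissions(perms)
        for member, perms in report_acl.items()
    }
    acl[creator] = acl.get(creator, set()) | CREATOR_DEFAULTS
    return acl


# Hypothetical report ACL; "alice" runs the report and so creates the result.
report_acl = {
    "analysts": {"read", "run", "update", "delete"},
    "dsadmin": {"administration", "read"},
}
acl = result_acl(report_acl, creator="alice")
for member in sorted(acl):
    print(member, sorted(acl[member]))
```

The "analysts" entry reproduces the example in the text: read, run, update, and delete on the report become read and delete on the result.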
1. How do we debug data rules and rule sets when we get errors in our rule logic?
2. How do we import and export data quality components such as data rule definitions, rule set definitions, global variables, and metrics?
QUESTIONS FROM SHRMISTHA
1. What is a system-generated exception table?
2. What is Benchmark in IA?
3. What is the difference between Column Property Values and Column Completeness?
4. Can we implement complex SQL functionality using IA Data Rules?
5. Can IA report files be modified to create new ones?
6. Why do we not need to run Column Analysis before running Cross-Domain Analysis?
Data profiling is essentially data mining, but for a different purpose. You mine data to understand it, to gain better knowledge about the data. While the more common use of data mining is to gain data insights for business purposes (e.g. customer buying characteristics), data profiling serves a technical purpose. To be more precise, you do data profiling to gather and analyze the technical metadata characteristics of the data.
Information Analyzer, data profiling software from IBM, helps you gain insight into such technical metadata characteristics as, for example, column data type and size (length).
This article is based on a case where a database table grew unexpectedly and strained its initial disk space allocation. Looking at the growth pattern of the table, such as the number of new records (which was not that huge), didn't give us any clue about the cause of the problem. In this article, Djoni Darmawikarta will step through the Information Analyzer process, run one of its functions called Column Analysis on a simple table (a scaled-down version of the real table), and show how the profiling output helps solve the problem.
Information Analyzer is client-server software. A data profiling user (metadata analyst) works on its GUI client, so to make it easier to show you how I solve the problem I'll use a lot of screenshots.
Our example data is an Oracle table that has two columns and three rows (In real life, they can typically be more than and then, its log-in window.
When your log-in is successful, the console main window will show up.
Assuming the Oracle table that we'd like to profile is new, we must identify it to the Analyzer, which technically means importing its metadata.
Make sure you have connected the Oracle database to the Information Analyzer server before you import the metadata of its tables.
Expand Metadata Management from the HOME drop-down menu.
Then, click Import Metadata.
Our example Oracle data (table) is in the CLROPER database hosted in DD7)=5, so select CLROPER and then click Identify Next Level.
It might take a while, particularly for a database that has many tables and many columns, so just wait.
On the completion message screen, click OK to close the screen.
All tables in the CLROPER database will be identified (listed), including our example table named SPACE1. We'll next identify the columns of our SPACE1 table, so select SPACE1 and then click Identify Next Level.
The result shows that Analyzer has correctly identified the two columns of the table.
Now, import metadata of all columns of the table by selecting the table and then clicking Import.
Click OK to continue.
Wait for completion.
Click OK on the successful completion screen.
We're now done with the metadata of the data; we're ready to start our profiling task.
In Information Analyzer (as in most other software these days) we group our profiling work into projects. Here, I just use an existing project (DJONI_TEST), so select Open Project from the drop-down arrow on the right of NO PROJECT SELECTED.
You'll be shown the list of existing projects. Select your project, and click Open.
Our previous (existing) profiling works are shown.
Next, click Project Properties from the OVERVIEW drop-down menu.
Go to the Data Sources tab. Our SPACE1 table is not in the list yet, as we haven't identified it specifically in our project (we did so in the previous steps at the server-wide level), so we need to add it to our project. Click Add.
Expand the SPACE1 table to see its columns. Select all of the columns, as we want to profile all of them, and then click OK.
When completed, click Save All, and then close the Project Properties window.
Now, we're ready to profile our SPACE1 data, to analyze its columns. On the main toolbar select Investigate > Column analysis.
Select all columns of the SPACE1 table to analyze, and click Run Column Analysis.
Click Submit.
Check status by clicking Details.
When the job status shows Schedule Complete, click Close to close the Activity Status (job status) window.
Close the Column Analysis window as well.
To check the result, click Open Column Analysis.
Our profiling output shows the metadata characteristics of the two columns. Our focus is on their sizes, so if necessary scroll to the right to see the Length columns.
The Length has three columns: Defined, Inferred, and Selected. The Defined length of the first column (INTEGER1) is as defined in the metadata of the table we imported, which is AB. The Inferred length, which is A, is produced by Information Analyzer by statistically computing the data lengths of all rows, based on the actual data values of the column; it then suggests (Selected) that A should be the length of this column.
Similarly, Information Analyzer produced the Inferred and Selected lengths for the other column, LARGECHAR1.
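The comparison between the Defined and Inferred lengths can be approximated with a few lines of Python. This is a simplified sketch, not how Information Analyzer computes its results: it assumes the inferred length is simply the longest value actually stored in the column, and the defined length and sample values are made up.

```python
from typing import Optional, Sequence


def inferred_length(values: Sequence[Optional[object]]) -> int:
    """Longest textual length among the non-null values (0 if there are none)."""
    return max((len(str(v)) for v in values if v is not None), default=0)


def reclaimable_length(defined: int, values: Sequence[Optional[object]]) -> int:
    """By how many characters the defined length exceeds what the data needs."""
    return max(defined - inferred_length(values), 0)


# Hypothetical column declared 100 characters wide, holding three short values.
defined_length = 100
column_values = ["abc", "hello", None, "xy"]

longest = inferred_length(column_values)
spare = reclaimable_length(defined_length, column_values)
print(f"defined={defined_length} inferred={longest} reclaimable={spare}")
```

A large gap between the two numbers is the signal the article is looking for: the column is declared far wider than its data requires.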
Based on this output produced by Information Analyzer, we can decide how much we'd like to reduce the length of the columns, which will certainly reduce the disk space needed for the data.
Summary
Using a data profiling tool, such as the IBM Information Analyzer, we can analyze data and gain knowledge, particularly about large amounts of data, that otherwise would not be apparent.
Information Analyzer has many more functions; this article discussed only the basics of one of them (Column Analysis).
Data rule
This data rule definition can be written in any terms you want to use. For example, you can type "Age," "voter_age," or you can create a ///