data rule questions

Upload: bhaskar-reddy

Post on 28-Feb-2018


  • 7/25/2019 Data Rule Questions


    Report Types pane

    Use this pane to create a report. Choose from the list of predefined report types. A report type defines the organization and the types of details that appear in a report.

    Name

    Shows the report category and the available report types.

    Description

    Shows a description of the report.

    Tasks pane

    Select a report type from the object list and then click New Report to customize and create the report.

    Creating a report

    You can create a report by selecting the type of report that you want to create. The details and organization of the report vary across report types.

    To create a report:
    1. On the Home navigator menu, select Reports.
    2. In the Reports workspace, click Report Types.
    3. Expand a category and select a report type.
    4. In the Task list, click New Report.
    5. In the New Report pane, complete the required fields, and click Next.
    6. Optional: In the Select Project Association step, select a project and select Associate report with project.
    7. Click Next.
    8. In the Specify Name and Output step, specify the name of the report and the output format.
    9. Select a folder to save the report in. By default, reports are saved in the root Reports folder.
    10. Optional: Select Save Report to save your configurations. You can then run the report again by using the settings that you assigned to this report.
    11. Click Finish to run the report.

    After the report runs, you can view the report.

    Reports

    You can create reports to summarize your analysis results and display details about your project. For example, reports can help you learn about the structure of your data so that you can make informed business decisions.

    The types of reports that are available to you vary according to the suite component that you have installed. Each report corresponds to a task. There are also reports that display general information about the project that you are working in.

    Reports can display information in a standard template form or in a graphical form. You can also choose the output format for the report, such as XML or HTML.

    Running a saved report

    After you create, configure, and save a report, you can run that report at a later time to display changes or updates in your information.

    To run a saved report:
    1. On the Home navigator menu, select Reports.
    2. In the Reports workspace, click Saved Reports.
    3. Select the report that you want to run.
    4. In the Tasks pane, click Run.

    After the report runs, you can view the updated report.

    Marking a saved report as a favorite

    For quick access to a report in the Favorite Reports content pane in the My Home workspace, you can mark a saved report as a favorite.

    To mark a saved report as a favorite:
    1. On the Home navigator menu, select Reports.
    2. In the Reports workspace, select Saved Reports.
    3. Select the report that you want to mark as a favorite.
    4. In the Tasks pane, click Add To Favorites.

    After you mark a report as a favorite, it appears in the Favorite Reports content pane in the My Home workspace.

    Creating a new report from a saved report

    To create a new report that is based on the configuration details of a saved report, you can create a copy of a saved report.

    To create a new report from a saved report:
    1. On the Home navigator menu, select Reports.
    2. In the Reports workspace, click Saved Reports.
    3. Select the report that you want to copy.
    4. In the Tasks pane, click Create Copy.
    5. In the Steps to Complete wizard, modify the parameters and settings as needed.
    6. Click Next to progress through the steps.
    7. Click Finish.

    You can now view the report results.

    Analyzing data by using data rules

    The topics in this section describe how to define and execute data rules, which evaluate or validate specific conditions associated with your data sources. Data rules can be used to extend your data profiling analysis, to test and evaluate data quality, or to improve your understanding of data integration requirements.

    To work with data rules, always start from the Develop navigator menu in the console and select Data Quality. This is the starting point for creating and working with data rules.

    From the Data Quality workspace you can:

    • Create data rule definitions, rule set definitions, data rules, rule sets, and metrics
    • Build data rule definition, rule set definition, and metric logic
    • Create data rule definition and rule set definition associations
    • Associate a data rule definition, rule set definition, metric, data rule, or rule set with folders
    • Associate a data rule definition, rule set definition, metric, data rule, or rule set with terms, policies, and contacts
    • Build data rule definitions or rule set definitions by using the rule builder
    • Add a data rule definition with the free form editor

    Characteristics of data rule functionality

    You can use data rules to evaluate and analyze conditions found during data profiling, to conduct a data quality assessment, to provide more information to a data integration effort, or to establish a framework for validating and measuring data quality over time.

    You can construct data rules in a generic fashion through the use of rule definitions. These definitions describe the rule evaluation or condition. By associating physical data sources to the definition, a data rule can be run to return analysis statistics and detail results. The process of creating a data rule definition and generating the subsequent data rule is shown in the following figure.

    Figure 1. Process of creating and running a data rule definition

    IBM InfoSphere Information Analyzer data rules include the following characteristics:

    Reusable
        The definitions are not explicit to one data source, but can be used and applied to many data sources.

    Quickly evaluated
        They can be tested interactively, as they are being created, to ensure that they deliver expected information.

    Produce flexible output
        Data rules can produce a variety of statistics and results, at both the summary and detail levels. You can evaluate data either positively (how many records meet your condition) or negatively (how many records violate your condition), and control the specifics of what you need to view to understand specific issues.

    Historical
        They capture and retain execution results over time, allowing you to view, monitor, and annotate trends.

    Managed
        Each data rule has a defined state, such as draft or accepted, so you can identify the status of each rule.

    Categorical
        You can organize data rules within relevant categories and folders.

    Deployable
        You can transfer data rules to another environment. For example, you can export and transfer a data rule to a production environment.

    Audit functionality
        You can identify specific events associated with a rule, such as who modified a rule and the date it was last modified.
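
    The relationship between a reusable rule definition, its binding to a physical data source, and the positive or negative counts it returns can be sketched in a few lines of Python. This is only an illustration of the concept, not Information Analyzer's rule syntax or API: the names RuleDefinition and run_data_rule, the logical variable age, and the sample tables are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Mapping


@dataclass(frozen=True)
class RuleDefinition:
    """A generic rule: a name plus a condition over logical variables (hypothetical model)."""
    name: str
    condition: Callable[[Mapping[str, object]], bool]


def run_data_rule(definition: RuleDefinition,
                  rows: List[Dict[str, object]],
                  binding: Dict[str, str]) -> Dict[str, object]:
    """Bind logical variables to physical columns, then evaluate every row."""
    met = 0
    exceptions = []
    for row in rows:
        # Translate physical column names into the rule's logical variables.
        logical = {var: row[col] for var, col in binding.items()}
        if definition.condition(logical):
            met += 1
        else:
            exceptions.append(row)
    return {
        "total": len(rows),
        "met": met,                      # positive view: records that meet the condition
        "violated": len(exceptions),     # negative view: records that violate it
        "exceptions": exceptions,        # detail-level output
    }


# One definition, written once against the logical variable "age".
adult = RuleDefinition(
    "age_is_adult",
    lambda v: v["age"] is not None and v["age"] >= 18,
)

# Two made-up data sources with different physical column names.
voters = [{"voter_age": 34}, {"voter_age": 17}, {"voter_age": None}]
customers = [{"cust_age": 52}, {"cust_age": 18}]

voter_result = run_data_rule(adult, voters, {"age": "voter_age"})
customer_result = run_data_rule(adult, customers, {"age": "cust_age"})
print(voter_result["met"], voter_result["violated"])
print(customer_result["met"], customer_result["violated"])
```

    The same definition runs unchanged against both tables; only the binding differs, which is the sense in which rule definitions are reusable.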

    Getting started with rule analysis

    When working with InfoSphere Information Analyzer rules, you will always start by going to the Develop tab in the main menu. You will then select Data Quality from the menu.

    Data rule definitions

    Rule definitions are the foundation for rule analysis, allowing you to extend and explore beyond basic data profiling. They provide a flexible method to define specific tests, validations, or constraints associated with your data. The resulting data rules provide you with the ability to ensure compliance to expected conditions or identify and review exceptions.

    Data rules

    After you create your rule definition logic, you generate data rules to apply the rule definition logic to the physical data in your project.

    Rule set definition overview

    Rule set definitions are a collection of data rule definitions.

    Metrics

    You use metrics to consolidate the detailed statistical results from one or more data rules, rule sets, or other metric results into a meaningful measurement. This could be a specific weighted average, a cost factor, or a risk calculation.

    Global logical variables

    A global logical variable is a value that you can set to represent a fact or a specific piece of data. After you create a global logical variable, it is a shared construct that can be used in all of your data rule definitions and data rules.

    Accessing output results for data rules, metrics, and rule sets

    You can access output results from the InfoSphere Information Analyzer My Home workspace, from the project dashboard, or in the Data Quality workspace.

    Viewing data quality activity history

    You can view the activity history of data rule definitions, data rules, metrics, rule set definitions, and rule sets by using the audit trail feature. The audit trail feature tracks the history of activity for each data quality object, such as when it was created, changed, executed, and so on.

    Associating a data rule definition with folders

    Folders provide the ability to organize and view data rules, rule sets, and metrics according to one or multiple business dimensions, or alternately by data source or system.
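
    As an illustration of the Metrics topic above, a weighted average over rule results might be computed as in the following Python sketch. The function name weighted_metric, the rule names, the pass rates, and the weights are all made up for the example; Information Analyzer has its own metric expression builder, which this does not reproduce.

```python
from typing import Dict


def weighted_metric(pass_rates: Dict[str, float], weights: Dict[str, float]) -> float:
    """Consolidate per-rule pass rates into one weighted-average score."""
    total_weight = sum(weights[name] for name in pass_rates)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(pass_rates[name] * weights[name] for name in pass_rates)
    return weighted_sum / total_weight


# Hypothetical results: fraction of records that met each rule.
pass_rates = {"age_is_adult": 0.75, "email_present": 0.5}
# Hypothetical business weights: the age rule matters three times as much.
weights = {"age_is_adult": 3, "email_present": 1}

score = weighted_metric(pass_rates, weights)
print(score)  # (0.75 * 3 + 0.5 * 1) / 4 = 0.6875
```

    A cost factor or risk calculation would follow the same shape: take the statistics that the rules produce and reduce them to a single number that can be tracked over time.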


    Associating a rule definition with terms, policies, and contacts

    You can create an association between a rule definition and a term (which you can create by using InfoSphere Business Glossary) to provide additional meaning to the rule definition. You can also associate a policy or a contact available in the repository with a rule definition. These provide additional means to annotate or give you a link to more information about the rule definition.

    Debugging data rules and rule sets

    You can use the debugging log option to isolate errors in your rule logic that are preventing successful rule evaluation.

    Deploying quality components

    In a basic deployment scenario, InfoSphere Information Analyzer quality components are moved from a development or test environment to one or more production environments. You can import and export quality components, including data rule definitions, rule set definitions, data rules, rule sets, global variables, and metrics.

    Selecting a report template to work with

    You can select a report template to manage access control, create reports, and view reports that are created from the template.

    Prerequisite: Your user name, group, or role must be on the access control list for the template.

    To select a report template to work with:

    1. In the Navigation pane of the Reporting tab, select Contents.
    2. Expand Report Templates to view the folders that contain templates.
    3. Expand a template folder and click View Report Templates to view a list of templates that the folder contains.
    4. Select a template from the list.

    You can click the name of any task in the task list to perform the task.

    Creating reports from templates

    You can use a report template to create reports.

    Selecting a report to work with

    You can select a report to control access to the report, to run, schedule, or delete the report, and to view results that are created from the report.

    Viewing and modifying reports

    You can view reports and change parameter values and other settings.

    Deleting reports

    You can delete reports that were created from a report template. This deletes from the repository the reports and all the report results that were created from the reports.

    Report parameters

    You can specify values for report parameters when you create or modify a report, and when you run a report manually.


    Logging categories

    If you select certain reports, such as Log messages by data range, you are asked which logging categories you want to include in the report. The following table includes a list of all available categories and their descriptions.

    Output options for report results

    You can specify additional output options for report results that you create in HTML, PDF, text, XLS, and XML format. The RTF and DHTML formats do not have additional options.

    Creating reports from templates

    You can use a report template to create reports.

    Prerequisites: You must have read permission in the access control list for the template.

    To create a report from a template:

    1. Select a report template.
    2. Click New Report.
    3. In the New Report pane, specify parameters and other settings:
       o Optional: Change the default name and description of the report. The name and description are inherited from the report template that the report is created from.
       o Optional: Select a folder to save the report in. By default, reports are saved in the root Reports folder.
       o Specify values for parameters or leave one or more parameter values unspecified, depending on whether you plan to run the report manually or by using a schedule.
       o Specify the output format of the report results that are created from the report. Different templates can offer different output formats.
       o Specify the time period after which each report result expires, or specify that report results do not expire. Report results are deleted at midnight on the expiration date.
       o Specify whether to overwrite each report result with the newest version, or to save a specified number of report results for the report.
    4. Save the report.

    You can run the report to create report results.
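
    The expiration and retention settings in step 3 interact: a result can disappear because it expired or because newer results pushed it out. The Python sketch below models that interaction under stated assumptions. The ReportResult class and surviving_results function are hypothetical, and the reading that a result is gone once its expiration date has been reached is an assumption about what "deleted at midnight on the expiration date" means, not documented product behavior.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ReportResult:
    name: str
    created: date
    expires_on: Optional[date]  # None means the result never expires


def surviving_results(results: List[ReportResult],
                      today: date,
                      keep_latest: int) -> List[ReportResult]:
    """Apply the expiry rule, then keep only the newest `keep_latest` results."""
    # Assumption: a result is treated as deleted once its expiration date is reached.
    live = [r for r in results if r.expires_on is None or today < r.expires_on]
    # Newest first, so slicing keeps the most recent results.
    live.sort(key=lambda r: r.created, reverse=True)
    return live[:keep_latest]


results = [
    ReportResult("r1", date(2019, 1, 1), date(2019, 1, 10)),
    ReportResult("r2", date(2019, 1, 5), None),
    ReportResult("r3", date(2019, 1, 8), date(2019, 2, 1)),
]

# Keep the two newest non-expired results as of 15 January.
kept = [r.name for r in surviving_results(results, date(2019, 1, 15), keep_latest=2)]
# keep_latest=1 models "overwrite each report result with the newest version".
newest_only = [r.name for r in surviving_results(results, date(2019, 1, 15), keep_latest=1)]
print(kept, newest_only)
```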

    Templates, reports, and results

    You can use report templates to create reports. You can run reports to produce report results.

    On the Reporting tab, the word report refers to the object that you run to produce report results. A report result is a file that is created in a specified, viewable format, such as HTML, PDF, or text format.

    when you run the report. The parameter values determine what information is searched for and returned when you run the report.

    Reports

    Reports are executable objects that are created from report templates. You run reports to produce report results. You can schedule reports or run them manually.

    Users who have Owner permission for a template can add other members to the access control list.

    Delete

    Delete the report result.

    The user who created the report result is the only member who has administration permission by default. The user who created the report result also has read and delete permission for it. This user can add other members to the access control list and grant administration and other permissions.

    Members of the access control list for the report also appear on the access control list for the report result. These members inherit the permissions that they had for the report, except that administration permission is not inherited and update and run permissions do not apply to report results.

    For example, a user, group, or role that has read, run, update, and delete permission for a report inherits read and delete permissions for all of the report results.
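
    The inheritance rule above amounts to a set difference, which the following Python sketch makes explicit. The permission names are taken from the text; the function and constant names are hypothetical and only illustrate the rule, not how the product stores access control lists.

```python
from typing import Iterable, Set

# Administration permission is never inherited from the report.
NOT_INHERITED = {"administration"}
# Update and run permissions have no meaning for a report result.
NOT_APPLICABLE = {"update", "run"}


def inherited_result_permissions(report_permissions: Iterable[str]) -> Set[str]:
    """Permissions a report ACL member receives on that report's results."""
    return set(report_permissions) - NOT_INHERITED - NOT_APPLICABLE


# The example from the text: read, run, update, delete on the report.
member = inherited_result_permissions({"read", "run", "update", "delete"})
# A report administrator does not become an administrator of the results.
admin = inherited_result_permissions({"administration", "read"})
print(sorted(member), sorted(admin))
```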

    1. How to debug data rules and rule sets when we get errors in our rule logic
    2. How to import and export data quality components such as data rule definitions, rule set definitions, global variables, and metrics

    QUESTIONS FROM SHRMISTHAE

    1. What is the system-generated exception table?
    2. What is a benchmark in IA?
    3. What is the difference between Column Property Values and Column Completeness?
    4. Can we implement complex %&' functionality using IA data rules?
    5. Can IA report files be modified to create new ones?
    6. Why do we not need to run Column Analysis before running Cross-Domain Analysis?

    Data profiling is essentially data mining, but for a different purpose. You mine data to understand it, to gain better knowledge about the data. While the more common use of data mining is to gain insights for business purposes (e.g. customer buying characteristics), data profiling serves a technical purpose. To be more precise, you do data profiling to gather and analyze the technical metadata characteristics of the data.

    Information Analyzer, data profiling software from IBM, helps you gain insight into such technical metadata characteristics as, for example, column data type and size (length).

    This article is based on a case where a database table grew in size unexpectedly and its initial disk space allocation got strained. Looking at the growth pattern of the table, such as the number of new records (which was not that huge), didn't give us any clue about the cause of the problem. In this article, Djoni Darmawikarta will step through the Information Analyzer process, run one of its functions called Column Analysis on a simple table (a scaled-down version of the real table), and show how the profiling output helps solve the problem.

    Information Analyzer is client-server software. A data profiling user (metadata analyst) works on its GUI client, so to make it easier to show you how I solve the problem I'll use a lot of screenshots.

    Our example data is an Oracle table that has two columns and three rows (in real life, they can typically be more than and then, its log-in window.

    When your log-in is successful, the console main window will show up.

    Assuming the Oracle table that we'd like to profile is new, we must identify it to the Analyzer, which technically means importing its metadata.

    Make sure you have connected the Oracle database to the Information Analyzer server before you import the metadata of its tables.

    Expand Metadata Management from the HOME drop-down menu.

    Then, click Import Metadata.

    Our example (Oracle data table) is in the CLROPER database (hosted in DDOM=5), so select CLROPER and then click Identify Next Level.

    It might take a while, particularly for a database that has many tables and many columns, so just wait.

    On the completion message screen, click OK to close the screen.

    All tables in the CLROPER database will be identified (listed), including our example table named SPACE1. We'll next identify the columns of our SPACE1 table, so select SPACE1 and then click Identify Next Level.

    The result shows that Analyzer has correctly identified the two columns of the table.

    Now, import metadata of all columns of the table by selecting the table and then clicking Import.

    Click OK to continue.

    Wait for completion.

    Click OK on the successful completion screen.

    We're now done with the metadata of the data; we're now ready to start our profiling task.

    In Information Analyzer (as in most other software these days) we group our profiling work into projects. Here, I just use an existing project (DJONI_TEST), so select Open Project from the drop-down arrow on the right of NO PROJECT SELECTED.

    You'll be shown the list of existing projects. Select your project, and click Open.

    Our previous (existing) profiling work is shown.

    Next, click Project Properties from the OVERVIEW drop-down menu.

    Go to the Data Sources tab. Our SPACE1 table is not in the list yet, as we haven't identified it specifically in our project (we did in the previous steps at the server-wide level), so we need to add it to our project: click Add.

    Expand the SPACE1 table to see its columns. Select all of the columns as we want to profile all of them, and then click OK.

    When completed, click Save All, and then close the Project Properties window.

    Now, we're ready to profile our SPACE1 data, to analyze its columns. On the main toolbar select Investigate > Column Analysis.

    Select all columns of the SPACE1 table to analyze, and click Run Column Analysis.

    Click Submit.

    Check status by clicking Details.

    When the job status shows Schedule Complete, click Close to close the Activity Status (job status) window.

    Close the Column Analysis window as well.

    To check the result, click Open Column Analysis.

    Our profiling output shows the metadata characteristics of the two columns. Our focus is on their sizes, so if necessary scroll to the right to see the Length columns.

    The Length has three columns: Defined, Inferred, and Selected. The Defined length of the first column (INTEGER1) is as defined in the metadata of the table we imported, which is AB. The Inferred length, which is A, is produced by Information Analyzer by computing statistically the data lengths of all rows, based on the actual data values of the column; it then suggests (Selected) that A should be the length of this column.

    Similarly, Information Analyzer produced the Inferred and Selected lengths for the other column, LARGECHAR1.
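
    The idea behind the Defined, Inferred, and Selected lengths can be approximated outside the tool. The Python sketch below is a simplification: it takes the inferred length to be the longest value actually observed, which is an assumption, since the article only says the tool computes it statistically. The function name profile_length, the sample values, and the defined length of 50 are all hypothetical.

```python
from typing import Dict, Iterable, Optional


def profile_length(values: Iterable[Optional[object]], defined_length: int) -> Dict[str, int]:
    """Compare a column's declared length with the lengths of its actual values."""
    # Nulls carry no length information, so they are skipped.
    lengths = [len(str(v)) for v in values if v is not None]
    inferred = max(lengths, default=0)
    return {
        "defined": defined_length,           # from the table metadata
        "inferred": inferred,                # from the data itself
        "selected": inferred,                # suggested length (simplified: same as inferred)
        "unused": defined_length - inferred, # declared capacity the data never uses
    }


# Hypothetical column: declared with length 50, but values are short.
column_values = ["1", "22", None, "333"]
profile = profile_length(column_values, defined_length=50)
print(profile)
```

    A large gap between the defined and inferred lengths is the signal the article relies on: the column is declared far wider than the data requires.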

    Based on this output produced by Information Analyzer, we can decide how much we'd like to reduce the length of the columns, which will certainly reduce the disk space needed for the data.

    Summary

    Using a data profiling tool, such as the IBM Information Analyzer, we can analyze data and gain knowledge, particularly about large amounts of data, that otherwise would not be apparent.

    The Information Analyzer has many more functions; this article discussed only the basics of one of them (Column Analysis).

    Data rule

    This data rule definition can be written in any terms you want to use. For example, you can type "Age," "voter_age," or you can create a ///