canonical modeling: niem and beyond - datypic · 2019-11-12 · canonical modeling: niem and beyond...
Canonical Modeling: NIEM and Beyond
19 September 2012
Priscilla Walmsley
Datypic, Inc.
www.xmlsummerschool.com
Licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License

Contents
1. About canonical models
2. NIEM as an example
3. Challenges
4. Other scenarios and tools

About Canonical Models

Definitions
Forrester Research:
• A canonical information model is a model of the semantics and structure of information that adheres to a set of rules agreed upon within a defined context for communicating among a set of applications or parties.
digitalML:
• A canonical model is an enterprise design pattern which provides a common set of definitions and values for all data in motion. Canonical models are abstracted models not related to any applications. They tend to be based on simple but extensible XML Schema and provide a single view of core business entities.
Wikipedia:
• A Canonical Model is any model that is canonical in nature, i.e. a model which is in the simplest form possible based on a standard, common view within a given context.

Features of Canonical Models
• Data definitions that are intended to be reused
• Centralized source/repository for discovery
• Typically describing data in motion rather than data at rest
• Accessible to business analysts, not just techies
• Ability to subset the model for specific contexts
  • It is too big to be used as is to describe an exchange.
• Ability to extend the model for specific contexts
  • It is not intended to cover every possible data element.

Why Canonical Models?
• Better data definitions
  • More organized
  • Better documented
  • More standardized names, structures
• Better communication between business and IT
• Increased interoperability
• Reduced development time

Competing Goals
Universal Vocabulary
- Reusable Components
- Broad Applicability
- Consistency Across Implementations
Strictly Specified Exchanges
- Better Validation
- Easier for Implementers
- Better Performance

NIEM as an Example

What is NIEM?
National Information Exchange Model (niem.gov)
A U.S. National Standard that facilitates information sharing:
• Across organizational and jurisdictional boundaries
• At all levels of government
A Data Model providing:
• Agreed-upon terms, definitions, and formats for various business concepts
• Agreed-upon rules for how those concepts fit together
• Independence from how information is stored in individual agency systems
A Structured Approach for:
• Development tools, processes, and methodologies
Source: NIEM Practical Implementer's Course. Available under the Creative Commons License at http://www.niem.gov/training.php

NIEM at 50,000 Feet
[Diagram: NIEM Core and Domains]
NIEM Core:
• People: Person, Organization
• Places: Location
• Items: Substance, Vehicle, Equipment
• Events: Activity
Domains:
• Infrastructure Protection
• Immigration
• Intelligence
• International Trade
• Screening
• Criminal Justice
• Emergency Management
Source: NIEM Practical Implementer's Course. Available under the Creative Commons License at http://www.niem.gov/training.php

The NIEM Model
• Based on XML Schema (plus annotations)
• NIEM has a "meta model" on top of XML Schema
  • defines things like objects, properties, associations, roles
• This meta model is used by the NIEM model itself, and must be used by any extensions

NIEM Objects and Properties
• The NIEM model consists most fundamentally of objects and properties
  • Example: "Person" is an object, "PersonHairColorCode" is a property.
• Objects are represented in XML as complex elements (elements with children)
• Properties are generally represented as children of the objects.

  <nc:Person s:id="Per1">
    <nc:PersonHairColorCode>PUR</nc:PersonHairColorCode>
  </nc:Person>

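A minimal sketch of how a consumer might read the object/property structure of the Person example above, using only Python's standard library. The namespace URIs are placeholders chosen for illustration (substitute the URIs of the NIEM release in use), and `local_name` and `properties_of` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions); use the URIs of your NIEM release.
NS = {"nc": "http://example.org/niem-core", "s": "http://example.org/structures"}

DOC = """
<nc:Person xmlns:nc="http://example.org/niem-core"
           xmlns:s="http://example.org/structures" s:id="Per1">
  <nc:PersonHairColorCode>PUR</nc:PersonHairColorCode>
</nc:Person>
"""

def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]

def properties_of(obj):
    """Return {property name: text} for the child elements of an object."""
    return {local_name(child.tag): (child.text or "").strip() for child in obj}

person = ET.fromstring(DOC)
# The s:id attribute is what references elsewhere in the document point at.
object_id = person.get("{%s}id" % NS["s"])
print(local_name(person.tag), object_id, properties_of(person))
```

The object is the element with children; each child element is one property of it.
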
Object Inheritance
• NIEM objects can extend other objects
• The base object has the type ComplexObjectType, from which all other objects are (directly or indirectly) specialized
• XML Schema complex type extension is used to represent this
[Diagram: type hierarchy, each type extending the one above it]
ComplexObjectType
  ItemType
    TangibleItemType
      ConveyanceType
        VehicleType

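As an illustration only, the hierarchy from the Object Inheritance slide can be walked with a small lookup table. The parent links below are copied by hand from the slide's diagram; a real tool would presumably read them from the `xs:extension` declarations in the schemas. `BASE_TYPE`, `ancestry`, and `is_specialization_of` are hypothetical names.

```python
# Parent links taken from the slide's diagram (hand-copied, not read from XSD).
BASE_TYPE = {
    "VehicleType": "ConveyanceType",
    "ConveyanceType": "TangibleItemType",
    "TangibleItemType": "ItemType",
    "ItemType": "ComplexObjectType",
    "ComplexObjectType": None,  # the base of all objects
}

def ancestry(type_name):
    """Return the chain from a type up to ComplexObjectType, inclusive."""
    chain = []
    current = type_name
    while current is not None:
        chain.append(current)
        current = BASE_TYPE[current]
    return chain

def is_specialization_of(type_name, ancestor):
    """True if type_name is the ancestor itself or extends it, directly or not."""
    return ancestor in ancestry(type_name)

print(ancestry("VehicleType"))
```
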
Substitutions
• NIEM properties that are semantically the same but have different physical representations can substitute for each other
• XML Schema substitution groups are used to represent this
[Diagram: substitution group]
PersonCitizenship (abstract)
  PersonCitizenshipText
  PersonCitizenshipFIPS10-4Code
  PersonCitizenshipISO3166Alpha2Code

Associations and References
• Two objects can be related using an association element
• An association contains references to the related objects, and possibly other information

  <nc:ResidenceAssociation>
    <nc:AssociationBeginDate>
      <nc:Date>2000-01-01</nc:Date>
    </nc:AssociationBeginDate>
    <nc:AssociationEndDate>
      <nc:Date>2007-01-01</nc:Date>
    </nc:AssociationEndDate>
    <nc:PersonReference s:ref="Per1"/>
    <nc:LocationReference s:ref="Loc1"/>
    <nc:ResidenceDescriptionText>duplex</nc:ResidenceDescriptionText>
  </nc:ResidenceAssociation>

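A sketch of how the `s:ref` references in an association like the one above could be resolved back to the objects carrying the matching `s:id`. The namespace URIs are placeholders, the `Exchange` wrapper element is invented for the example, and `resolve` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NC = "http://example.org/niem-core"   # placeholder URI (assumption)
S = "http://example.org/structures"   # placeholder URI (assumption)

# "Exchange" is an invented wrapper element so the sample is one document.
DOC = f"""
<Exchange xmlns:nc="{NC}" xmlns:s="{S}">
  <nc:Person s:id="Per1"/>
  <nc:Location s:id="Loc1"/>
  <nc:ResidenceAssociation>
    <nc:PersonReference s:ref="Per1"/>
    <nc:LocationReference s:ref="Loc1"/>
    <nc:ResidenceDescriptionText>duplex</nc:ResidenceDescriptionText>
  </nc:ResidenceAssociation>
</Exchange>
"""

root = ET.fromstring(DOC)

# Index every element that carries an s:id so references can be resolved.
by_id = {el.get(f"{{{S}}}id"): el for el in root.iter() if el.get(f"{{{S}}}id")}

def resolve(association):
    """Return the objects that an association's s:ref children point to."""
    targets = []
    for child in association:
        ref = child.get(f"{{{S}}}ref")
        if ref is not None:
            targets.append(by_id[ref])
    return targets

assoc = root.find(f"{{{NC}}}ResidenceAssociation")
related = [el.tag.rsplit("}", 1)[-1] for el in resolve(assoc)]
print(related)
```

The same id/ref lookup would apply to role references (`nc:RoleOfPersonReference`) and, with a different attribute name, to `s:metadata`.
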
Roles
• Roles can be used to indicate the role an object plays in another type or in an exchange
• Avoids creating conflicting specializations of the same object
  • for example, having a VictimType and a WitnessType, when a single person could play both roles

  <nc:Person s:id="Per1">
    <nc:PersonName>....</nc:PersonName>
  </nc:Person>
  ...elsewhere...
  <j:Witness>
    <nc:RoleOfPersonReference s:ref="Per1"/>
    <j:WitnessWillTestifyIndicator>true</j:WitnessWillTestifyIndicator>
  </j:Witness>

Augmentations
• Reusable bundles of properties in particular contexts
  • For example, the Justice domain has a set of person-related properties that it bundles together for exchanges to reuse
  • An exchange might define its own augmentations

  <lexsdigest:Person>
    <nc:PersonBirthDate>...</nc:PersonBirthDate>
    <nc:PersonName>...</nc:PersonName>
    <j:PersonAugmentation>
      <j:PersonFBIIdentification>...</j:PersonFBIIdentification>
      <j:PersonEarShape>...</j:PersonEarShape>
    </j:PersonAugmentation>
    <lexsdigest:PersonAugmentation>
      <lexsdigest:PersonRegisterNumber>...</lexsdigest:PersonRegisterNumber>
    </lexsdigest:PersonAugmentation>
  </lexsdigest:Person>

Metadata
• Information about the data
  • Source, quality, language, reliability, etc.
• Can be shared by multiple objects

  <nc:Metadata s:id="M1">
    <nc:CommentText>Reported by suspect</nc:CommentText>
    <nc:DistributionText>SBU</nc:DistributionText>
    <nc:LastVerifiedDate>
      <nc:Date>2004-01-01</nc:Date>
    </nc:LastVerifiedDate>
  </nc:Metadata>
  ...elsewhere...
  <nc:Person>
    <nc:PersonBirthDate s:metadata="M1">...</nc:PersonBirthDate>
  </nc:Person>

Strict Naming and Design Rules (NDR)
• Naming and documentation rules (ISO 11179)
• Specialized XML Schema annotations
  • Target type of references, metadata, augmentations
• Schema design pattern
  • Garden of Eden (global element declarations, named types)
• Namespace strategy
• Disallowed features of XML Schema
• https://www.niem.gov/documentsdb/Documents/Technical/NIEM-NDR-1-3.pdf

Typical NIEM Exchange Development Process
1. Exchange Content Modeling
• Develop an exchange model diagram, depicting objects, properties, associations, etc.
• Use UML, spreadsheets, etc. -- NIEM does not care
• Helpful to think of things in terms of the NIEM meta-model (roles, augmentations, etc.)

Typical NIEM Exchange Development Process
2. Mapping
• Map the model to NIEM components to find the overlap
  • typically done with a spreadsheet called a Component Mapping Template

Typical NIEM Exchange Development Process
3. Subsetting NIEM
NIEM contains:
• ~6000 elements
• very loose cardinalities (everything is optional and repeating)
• multiple representations of the same semantics
The subset contains:
• only elements and types relevant to your exchange
• stricter cardinalities
• more constrained types (e.g. code lists)
• a wantlist (manifest)
[Diagram: NIEM goes through the Subsetting Process, yielding a Subset and a Wantlist]

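The subsetting idea can be sketched in a few lines: pick the elements an exchange needs and tighten their cardinalities, while checking that nothing in the subset is looser than the reference model. This is a toy model, not the actual wantlist file format or the behavior of any NIEM tool; `REFERENCE_MODEL`, `WANTLIST`, `within`, and `make_subset` are hypothetical names, and the element list is a tiny made-up sample.

```python
UNBOUNDED = None  # stands for maxOccurs="unbounded"

# Toy stand-in for the reference model: everything optional and repeating.
REFERENCE_MODEL = {
    "PersonName": (0, UNBOUNDED),
    "PersonBirthDate": (0, UNBOUNDED),
    "PersonHairColorCode": (0, UNBOUNDED),
    "PersonHairColorText": (0, UNBOUNDED),
}

# Hypothetical wantlist: the elements this exchange needs, with the
# stricter (min, max) cardinalities it wants to enforce.
WANTLIST = {
    "PersonName": (1, 1),
    "PersonHairColorCode": (0, 1),
}

def within(tight, loose):
    """True if cardinality `tight` allows nothing that `loose` forbids."""
    (tmin, tmax), (lmin, lmax) = tight, loose
    if tmin < lmin:
        return False
    if lmax is UNBOUNDED:
        return True
    return tmax is not UNBOUNDED and tmax <= lmax

def make_subset(model, wantlist):
    """Keep only wanted elements, rejecting any that loosen the model."""
    subset = {}
    for name, cardinality in wantlist.items():
        if name not in model:
            raise KeyError(f"{name} is not in the reference model")
        if not within(cardinality, model[name]):
            raise ValueError(f"{name}: {cardinality} is looser than the model")
        subset[name] = cardinality
    return subset

subset = make_subset(REFERENCE_MODEL, WANTLIST)
print(sorted(subset))
```

Because every subset cardinality stays within the reference model's, an instance that is valid against the subset remains valid against the full model.
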
Typical NIEM Exchange Development Process
4. Define an extension schema for anything that was missing from NIEM
5. Define an exchange schema for the root element(s)
6. Assemble the schemas, along with other artifacts, into an IEPD
[Diagram: an IEPD containing the Subset, Wantlist, Extension Schema, Exchange Schema (Root), and Other Artifacts (samples, documentation, XSLTs, catalog, etc.)]

Challenges

Challenge #1: Looseness via Reuse
• Reuse/extension can result in overly loose models
• For example:
  • Org Unit required for Law Enforcement
  • Tax ID prohibited for Gang
• Solutions:
  • Constraint schemas with XML Schema restrictions:
    • brittle
    • require significant refactoring
  • Business rules (Schematron, new techniques)
[Diagram: Organization, extended by Law Enforcement and Gang]
Organization
• Organization Name [1..*]
• Organization Tax ID [0..1]
• Organization Unit [0..1]
Law Enforcement
• Field Office [0..1]
Gang
• Turf [0..1]
• Tattoo [0..1]
• Hand Signal [0..1]

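To make the "business rules" option concrete, here is a small sketch of the two example constraints expressed as code. It is a plain-Python stand-in for what a rule language such as Schematron would express declaratively, not Schematron itself; the record layout (a dict of property names) and the `check_organization` function are assumptions made for the example.

```python
def check_organization(kind, org):
    """Return a list of rule violations for an organization record.

    `kind` is "LawEnforcement" or "Gang"; `org` maps property names to values.
    """
    errors = []
    # Inherited from Organization: name is [1..*], so at least one is needed.
    if not org.get("OrganizationName"):
        errors.append("Organization Name is required")
    # Context-specific rules that the shared base type cannot express.
    if kind == "LawEnforcement" and "OrganizationUnit" not in org:
        errors.append("Organization Unit is required for Law Enforcement")
    if kind == "Gang" and "OrganizationTaxID" in org:
        errors.append("Organization Tax ID is prohibited for Gang")
    return errors

print(check_organization("Gang", {"OrganizationName": ["X"], "OrganizationTaxID": "12"}))
```

Keeping such rules outside the schema leaves the reused types untouched, which is the trade-off against constraint schemas noted on the slide.
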
Challenge #2: Interoperability
• Two NIEM subsets are not necessarily compatible
  • Different representations of the same semantics (code vs. text)
  • Different levels of semantic specificity
  • Different levels of structure
  • Entirely different properties chosen
  • Different customizations/substitutions
IEPD 1 Subset: Person
• Full Name
• Hair Color Text
• Identification
• Eye Color Text
• IEPD 1-Specific Attribute
IEPD 2 Subset: Person
• First Name
• Last Name
• Hair Color Code (FBI)
• Drivers License Number
• Passport Number

LEXS: One Approach to the Interoperability Issue
• Digest is a common area that is exactly the same for every IEPD
• Extensions for individual IEPDs are separated into a payload with references back to the digest
• Useful if interoperability is important (e.g. federated search scenario)
(More at http://lexsdev.org)
[Diagram: two messages with the same digest structure and different payloads]
Message 1
  Digest
    Person id="P1": Name, DOB
  Payload
    Parolee ref="P1": Parole Date, Parole Status
Message 2
  Digest
    Person id="P1": Name, DOB
  Payload
    Witness ref="P1": Will Testify, Witness Status

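The digest/payload split can be illustrated with a toy federated search: the search code only ever reads the digest, so it works across messages from different IEPDs, and the payload is consulted afterwards by whoever understands it. The dictionaries below are a simplified stand-in for LEXS messages, not the LEXS schema, and all sample values and function names are invented for the example.

```python
# Two toy messages from different exchanges; the digest shape is identical.
MESSAGES = [
    {
        "digest": {"P1": {"kind": "Person", "Name": "Pat Example", "DOB": "1980-01-01"}},
        "payload": [{"kind": "Parolee", "ref": "P1", "ParoleStatus": "active"}],
    },
    {
        "digest": {"P1": {"kind": "Person", "Name": "Pat Example", "DOB": "1980-01-01"}},
        "payload": [{"kind": "Witness", "ref": "P1", "WillTestify": True}],
    },
]

def search_digest(messages, name):
    """Find people by name using only the digest, which every IEPD shares."""
    hits = []
    for index, message in enumerate(messages):
        for object_id, obj in message["digest"].items():
            if obj["kind"] == "Person" and obj["Name"] == name:
                hits.append((index, object_id))
    return hits

def payload_for(message, object_id):
    """Return the IEPD-specific entries that point back at a digest object."""
    return [entry for entry in message["payload"] if entry["ref"] == object_id]

hits = search_digest(MESSAGES, "Pat Example")
kinds = [payload_for(MESSAGES[i], oid)[0]["kind"] for i, oid in hits]
print(hits, kinds)
```
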
Challenge #3: Versioning
• Reliance on a common model complicates versioning
  • If the canonical model changes, it can have a ripple effect on the exchanges that use it
• Approaches to ease versioning:
  • Defining a clear versioning policy
    • Differentiation between minor and major releases based on backward compatibility
    • Deprecation policy
  • Repository that tracks versions and can show diffs
  • No forced upgrades for exchanges that don't need it

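One way to picture a major/minor policy based on backward compatibility is the check below. It encodes an assumed convention (same major number means compatible, a new major number means breaking changes are allowed); this is an illustration of the idea, not NIEM's documented versioning rules, and the function names are hypothetical.

```python
def parse_version(text):
    """Split a 'major.minor' string into a pair of integers."""
    major, minor = text.split(".")
    return int(major), int(minor)

def is_backward_compatible(old, new):
    """Assumed policy: compatible if the major number is unchanged
    and the minor number did not go backwards."""
    old_major, old_minor = parse_version(old)
    new_major, new_minor = parse_version(new)
    return new_major == old_major and new_minor >= old_minor

print(is_backward_compatible("2.0", "2.1"))
```

Under such a policy an exchange built against 2.0 could ignore a 2.1 release until it needs something new, which fits the "no forced upgrades" point.
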
Challenge #4: Governance and Harmonization
• Who governs the model?
  • NIEM "domain" concept allows governance to be somewhat decentralized
  • IEPDs are even more decentralized
• Harmonizing the model within and across domains is a process that:
  • Takes a lot of time
  • Has to balance competing interests
  • Requires volunteers or generous sponsors

Harmonization Tools (e.g. OpenII)

Challenge #5: XSD as Modeling Language
• XSD is not an ideal modeling language
• Some positives:
  • easily parseable for use in tools
  • directly tied to XML structure: no need to keep in sync
  • some features of XSD do help you understand semantics
    • Type extensions
    • Substitutions
  • NIEM-specific annotations also help where XSD is limited
• Some negatives:
  • Cannot express all data constraints
  • Perception that it is only for XML (JSON, RDF)
• Alternatives... UML? Spreadsheets?

Is UML the Right Representation?

Challenge #6: Level of Effort
• Myth: "NIEM (because of its tools, documentation, etc.) should make it easy to implement XML."
• Reality: Creating a one-off exchange is easier. Sharing is hard.
• Solutions:
  • Continuously improving tools
    • Simplified presentation of the model
    • Straightforward UML to XSD mapping/conversion
  • Better documentation of best practices

Despite the Challenges, Strong Advantages...
• Shared semantics
  • Despite the challenges, it really does help
• Forces some rigor in modeling
• Tools
• Community
  • NIEM gets different groups of people in a room together and gets them talking about information sharing

Other Scenarios and Tools

Other Canonical Model Scenarios
• Other industry standards that offer different subsets/"views", e.g.
  • FpML
    • Four different views (subsets) are automatically generated from a master schema set
    • Extension is possible via substitution groups or type substitution
• Intra-enterprise canonical models
  • Typically large organizations implementing SOA
    • banking, insurance
  • Like an "Enterprise Data Model" but:
    • for data in motion rather than data at rest
    • directly used rather than being for documentation purposes

Other Tools
• Home-grown tools written in-house
• CAM
• Commercial tools
  • igniteXML
  • Altova Schema Agent
  • Progress DataXtend

CAM
• CAM = Content Assembly Mechanism
• OASIS WG/Standard
  • https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=cam
• CAM Toolkit
  • Open source reference implementation
  • http://sourceforge.net/apps/mediawiki/camprocessor
• Design time and runtime components
• Define an exchange from a set of source schemas via "templates"

CAM Main View

igniteXML Approach
[Diagram labels: Organizers (Model Management); Physical Models; Logical Models; Canonical Model; Metadata Overlay; Data Model Vocabulary; Taxonomy, e.g. UDEF; Structured, e.g. RDF/OWL; Consumers; Business Analyst; Integration Developer; Consumer Server; Weblogic; Websphere; JBoss]
(More at http://ignitexml.com)

Questions? Comments?
Thank you for your attention.
Priscilla Walmsley
http://[email protected]