Credit: IBM
Using policies and rules
Before you start
Learn how to apply pre-built policy and rule content for InfoSphere
Information Governance Catalog (IGC) to get an information governance
initiative under way.
With the substantial growth in data volume, velocity, and variety comes a
corresponding need to govern and manage the risk, quality, and cost of
that data and provide higher confidence for its use. This is the domain of
information governance, a focal area for InfoSphere Information Server.
This is a broad topic, and further details on information governance
practices and solutions can be found in the Resources.
IGC provides a meaningful directory of governed information. It supports
this through a metadata repository that can include governed business
vocabulary (a business glossary); semantic policies and governance rules;
stewardship assignment for information domains; a catalog of information
assets; relationships and linkage across the vocabulary and assets; and a
range of tools to understand those relationships, including impact
analysis, business and data lineage, and queries and reports.
By taking advantage of these capabilities, organizations can:
- Enable information governance
  - Semantic policies and rules enable precise communication of
    governance requirements.
  - Common language streamlines information development for business
    requirements.
  - Stewardship at all levels of the information supply chain.
  - End-to-end data lineage and impact analysis.
- Support accountability and responsibility
  - Assign stewards as single point of contact.
  - Link between business metadata and technical metadata to ensure
    compliance.
- Improve information accessibility
  - Administrators can tailor the tool to the needs of business users.
  - Access enterprise information you need when you need it.
  - Use and reuse information assets based on a common semantic hub.
- Enable collaboration
  - Capture and share annotations between team members.
  - Greater understanding of the context of information.
  - More prevalent use and reuse of trusted information.
One challenge organizations find with new information governance
initiatives is establishing a foundation and structure to get under way,
particularly with sufficient components to understand how the various
parts of the catalog fit together and can be utilized to support their
governance initiatives.
Objectives
In this tutorial, you learn how to install the pre-built package of IGC
terms, policies, and governance rules to jump-start your information
governance efforts. We will show what the available content includes, how
to install the content for immediate use, and how to delve deeper into
usage of the content.
Overall the goal of this package is to provide insight on how to leverage
business information in an information governance context, specifically
to:
- Provide base content to facilitate getting started with:
  - Working models with relevant assets.
  - Working models that span components of information governance.
  - A framework for constructing and expanding information governance.
  - Examples for educational purposes among members of your teams.
- Recommend approaches for policy and rule entry and creation.
- Use the policy tree and referenced rules, including browse and search.
- Incorporate naming standards plus required or desired attributes.
Note that the content package is not intended as a full end-to-end solution
and does not reflect all possible requirements for information
governance.
Prerequisites
This tutorial is written for users of InfoSphere Information Server V11.3
who are learning or are familiar with InfoSphere Information Governance
Catalog and its use.
System requirements
To utilize the pre-built content, you need an InfoSphere Information Server
V11.3 platform with Information Governance Catalog installed. The
following patch release for IGC available on IBM Fix Central should also
be installed before importing the pre-built content:
is113_IGC_ru5_server_client_multi.
Overview
IGC is an interactive, browser-based tool that enables users to create,
manage, and share an enterprise vocabulary and classification system. It
also provides a framework for understanding and managing information
governance policies, requirements, and data stewards; a repository of
metadata assets (such as the database tables that hold key business data);
and a query capability to report on relationships within the catalog.
A business glossary is designed to help users understand business language
and the business meaning of information assets like databases, jobs,
database tables and columns, and business intelligence reports. In
addition to categories and terms, the catalog also contains information
about other assets such as database tables, jobs, and reports in the
metadata repository.
A catalog of information governance policies and rules provides an
interactive environment to communicate precise intent for how information
must be managed throughout its life cycle. Such policies may represent
government regulations, corporate standards, or line-of-business processes
at a broad level. The governance rules provide detail to describe the
specific requirements, the terms or assets that are governed, and the
assets that implement the requirements.
Underlying the information governance capabilities are metadata assets.
These include the noted glossary terms, governance policies and rules, the
representation of databases with their tables and columns, logical and
physical data models, applications, business intelligence reports, data
integration (ETL) jobs, and many other assets. Impact analysis and
end-to-end data lineage across data sources and assets allow users of the
catalog to understand the relationships between the business language and
the technical implementations, a critical part of the information
governance story.
The material in this package includes:
- A sample glossary to facilitate understanding information governance
  concepts.
- An information governance policy structure with associated policies
  and governance rules.
- A set of metadata queries to review contents and their relationships,
  and allow you to start defining approaches for information governance
  policy development and administration.
The glossary provided for the IGC base content package is derived from
larger industry-specific models available from IBM, but is focused on a
limited subset for the specific subject areas of person, location, and
customer information. Subsequent versions and releases of this content may
expand on these dimensions.
This tutorial describes the steps to import the information governance
content. Potential usage of the content is detailed further in the
included PDF document (see Downloadable resources).
IGC package content
The IGC Base Content package consists of a compressed (.zip) file named
IGC_OOTB_v1.zip containing a series of associated assets, including:
- POLICIES: A policy framework for organizing information governance
  policies plus example governance policies and rules across three focus
  domains: Master Data Management subjects, data privacy, and information
  quality.
- SUBJECT CONTENT: A set of categories and terms that include person-,
  location-, and customer-related terms; and information
  governance-related terms.
- RELATIONSHIPS: Relationships that span the above artifacts, including
  policies to rules and rules to terms and assets.
- QUERIES: A set of queries that allow you to see some of the
  relationships and connections between the artifacts.
The contents are stored as XML files, which someone with IGC administrator
privileges imports into the IGC through the Administration tab of the UI.
The XML files are:
- IGC-governance-base-xml-export-terms-2014-09-23.xml
- IGC-governance-base-xml-export-rulesassets-2014-09-23.xml
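Before starting the import, it can help to confirm that the downloaded archive actually contains both XML files. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and the internal folder layout of the archive is not described in this tutorial, so the check compares file names only and ignores any folders inside the archive.

```python
import zipfile
from pathlib import PurePosixPath

# File names as listed in this tutorial.
EXPECTED_FILES = [
    "IGC-governance-base-xml-export-terms-2014-09-23.xml",
    "IGC-governance-base-xml-export-rulesassets-2014-09-23.xml",
]


def missing_from_archive(zip_path, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in the archive.

    zip_path may be a path (for example "IGC_OOTB_v1.zip") or a file-like
    object. Only the final path component of each archive entry is compared,
    because the folder structure inside the archive is an unknown here.
    """
    with zipfile.ZipFile(zip_path) as archive:
        present = {PurePosixPath(name).name for name in archive.namelist()}
    return [name for name in expected if name not in present]
```

Calling `missing_from_archive("IGC_OOTB_v1.zip")` should return an empty list when both files are present. Any names it returns are the files to look for before you continue.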
Importing IGC content
IGC content is imported with its import function. To import the business
glossary content, you must have IGC administrator privileges. For more
examples of how to import content into the catalog, see Importing and exporting glossary content of the catalog.
The subsequent import steps assume that you download, extract, and save the
content XML files to wherever the IGC browser can access them.
To perform the import of the IGC terms content:
- Open the IGC and select the Administration tab.
- Select Tools > Import.
- Choose XML as the type of file to import and click Next, as shown in
  Figure 1.
- Choose the Merge option (if other glossary content exists, it is
  recommended that you choose the Ignore option to avoid overwriting
  someone else’s work) and click Next.
- Browse to the directory location of the
  IGC-governance-base-xml-export-terms-2014-09-23.xml file and click
  Import.
- Review the import summary as shown in Figure 2. There should be 37
  categories and 195 terms.
- Click Close.
Figure 1. XML file selection
Figure 2. Summary of terms import
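If you want a rough idea of what an export file contains before importing it, you can tally its XML elements by name. The sketch below assumes nothing about the export schema, which this tutorial does not document: it counts every element it encounters and strips any namespace prefix. Whether a particular element name corresponds one-to-one to the categories and terms reported in the import summary is an assumption you would need to verify against your own file.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tally_elements(xml_source):
    """Count elements by local tag name, ignoring any XML namespace.

    xml_source may be a file path or a binary file-like object.
    """
    counts = Counter()
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        # Namespaced tags look like "{namespace-uri}name"; keep only "name".
        local_name = elem.tag.rsplit("}", 1)[-1]
        counts[local_name] += 1
        elem.clear()  # release child elements already counted
    return counts
```

For example, `tally_elements("IGC-governance-base-xml-export-terms-2014-09-23.xml").most_common(10)` lists the ten most frequent element names, which you can compare with the import summary.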
To perform the import of the IGC policy and rules content:
- Repeat the process for the
  IGC-governance-base-xml-export-rulesassets-2014-09-23.xml file and
  click Import.
- Review the import summary as shown in Figure 3. There should be 72
  policies, 110 governance rules, and updates to 62 terms.
- Click Close.
Figure 3. Summary of policies import
Review the imported IGC content
After import, you can browse the glossary and review the sample content.
Review the categories and terms
From the Catalog tab, select the Glossary
tab, then choose Browse Category Hierarchy. Depending on
your environment, your glossary may contain other content, but you should
find two categories: one called Business Information and one called
Information Governance.
- Business Information: This category centers on terms for person,
  location, and general transactions for a person acting as a customer,
  and provides example insight into the relationships available for terms
  within the glossary.
- Information Governance: This category focuses on categories and terms
  relevant for information governance and provides insight into useful
  information governance concepts, particularly the classification of
  information important in discerning key governance focal areas.
The Business Information category, for instance, contains example content
for Calendar, Customer, Location, Organization, Payment Card, Person, and
Transaction.
Figure 4. Business information category
Expand the view for a category such as Location Information, then select
one of the subcategories such as Physical Address. This highlights the
category overview and lists the associated terms. You can then select one
of the associated terms, such as Street Address, and review the
description and general information provided, as shown below.
Figure 5. Business information term
The business information terms provide examples of the types of
relationships available in the glossary. Relationships expand on the
understanding of a given term such as whether it has or encompasses other
terms; is a specific type of term; or simply falls within a common
category of terms. For instance, the term Street Address
illustrates a number of these relationships:
- Street Address is a term in the category Physical Address, which in
  turn is part of the category Location Information. Categories are a
  natural organization of related terms.
- Street Address is governed by two governance rules (e.g., address must
  be validated and verified against a postal reference source). It is a
  bi-directional relationship, so if it is set in one location (whether
  in the term or the governance rule), it is visible in both.
- Street Address is a type of Address. This is the converse of the
  has types relationship and is set bi-directionally. The term Address
  has two types in this content set: Street Address and Box Address.
  Address is the broad term, while Street Address and Box Address are
  more specific, mutually distinct terms.
- Street Address also uses the has a relationship. Has a describes
  components included in a larger term. In this instance, Street Address
  has a City, State, Postal Code, and Country (as well as several other
  components). The converse of this relationship is the is of
  relationship.
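The bi-directional behavior described above can be pictured with a small in-memory model. This is an illustrative sketch only: the class and method names are hypothetical and have nothing to do with how IGC stores relationships internally. It shows that recording a relationship once makes its converse visible from the other term.

```python
from collections import defaultdict

# Each relationship name paired with its converse, as described in the text.
CONVERSE = {
    "has types": "is a type of",
    "is a type of": "has types",
    "has a": "is of",
    "is of": "has a",
}


class TermGraph:
    """Toy model of term relationships; not the IGC repository or its API."""

    def __init__(self):
        self._links = defaultdict(lambda: defaultdict(set))

    def relate(self, source, relation, target):
        """Record a relationship and its converse in a single step."""
        self._links[source][relation].add(target)
        self._links[target][CONVERSE[relation]].add(source)

    def related(self, term, relation):
        """Return the terms linked to `term` by `relation`, sorted by name."""
        return sorted(self._links[term][relation])


graph = TermGraph()
graph.relate("Address", "has types", "Street Address")
graph.relate("Address", "has types", "Box Address")
for part in ("City", "State", "Postal Code", "Country"):
    graph.relate("Street Address", "has a", part)

print(graph.related("Street Address", "is a type of"))  # ['Address']
print(graph.related("City", "is of"))                   # ['Street Address']
```

Only the has types and has a relationships were set explicitly. The is a type of and is of results come from the converse entries, which mirrors what you see when you open either term in the glossary.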
You can continue to review other terms. Initially, the terms are not
linked to any associated assets, but as such content becomes available in
the metadata repository, it is possible to connect or assign the terms to
assets to get a broader understanding of what data is associated with key
business concepts.
Review the governance policies and rules
From the Catalog tab, select the Glossary
tab, then choose Browse Policy Hierarchy. Depending on
your environment, your glossary may contain other content, but you should
find five high-level policies, as shown below.
Figure 6. Governance policy hierarchy
These top-level policies in the policy tree align with the main groups of
policies outlined for an information governance program:
- Information governance approaches
  - Standards adopted by the information governance program to increase
    consistency, reduce discrepancies, and remove unnecessary processing.
    These could include, for example, the practices and processes used to
    manage the Information Governance Catalog or to monitor assignment of
    data stewards to terms or assets.
- Information governance delegations
  - The set of core policies delegated to another governance domain (for
    example, Audit or Risk Management governance domains). For example,
    the validation of fraud reporting might be considered part of
    information governance, but in your organization may be part of the
    fraud and risk management department, so these policies are
    considered delegated to that area.
- Information governance domain policies
  - The set of core policies that cover the basic information domains of
    the business (for example, Customer, Employee, Product). These
    information governance policies may fall primarily within a specific
    line of business, but because they span multiple points in the
    business, they are considered core information domains that must be
    included in the information governance program.
- Information governance obligations
  - The set of core policies delegated to the Information Governance
    organization from other governance focus areas and domains,
    including:
    - Corporate requirements: Obligations defined between one or more
      groups within an organization (for example, sales and IT or
      security enforcement over data stores).
    - Government regulations: Mandated laws and requirements for an
      organization from a national entity or its departments and
      agencies.
    - Industry standards: Formalized standards, often from a standards
      body, that provide best-practice, but not mandated, guidance for a
      specific topic.
    - Service-level agreements: Obligations to meet a specified service
      level (for example, delivery of data by 10 p.m.).
    - Third-party contracts: Contracted obligations between an
      organization and other third parties.
- Information governance principles
  - Principles define the high-level goals and approaches of the
    information governance program. Such principles are the overarching
    goals and directives of your information governance efforts and
    should be understood by everyone in your organization. Some of these
    principles may incorporate specific policies and rules; others will
    not.
You can review the policy tree and policies provided in more detail. Use
the governance policy to summarize a specific organizational obligation,
whether external or internal, with its objective. A policy should include
a short identifiable name; a short description of its intent; a long
description and a data steward before publication; and a custom attribute,
such as the included “Link to more information,” for URL links to the
actual policy for reference (many policies are simply too long to include
all detail in the catalog). You may find it useful to add a label to
associate related or transient links (for example, project or issuing
agency).
Many of the policies will contain one or more governance rules. These
governance rules may be declarations of how a policy’s goals are to be
achieved or discrete specifications of how some data will be processed,
evaluated, monitored, or remediated to comply with a policy’s goals.
Governance rules provide linkage between the policies and associated terms
and data assets. The governance rules include two relationships to support
this linkage: Governs and Implemented by. The former relationship
describes those terms and assets that fall under, or are governed by, the
rule. The latter relationship describes assets used to actually implement
the governance rule (as the governance rule is descriptive in nature, it
cannot by itself be used to process, validate, monitor, or otherwise
affect data). Use the governance rules to delineate specific requirements
of the policy rather than putting those details in the policy. Generally,
you should avoid embedding rules or requirements at the policy level as
these cannot be linked to other catalog assets. You can usually recognize
such rules by the use of action-oriented verbs: must be masked, must be
validated, must be monitored, etc.
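As a rough illustration of the two points above, the sketch below models a governance rule with its Governs and Implemented by relationships, and scans policy text for "must be ..." phrases that probably belong in a governance rule instead. The class, the function, the regular expression, and the asset name are all hypothetical. The pattern catches only the simple "must be <verb>" wording used as an example in this tutorial, so treat it as a starting point and not as a complete check.

```python
import re
from dataclasses import dataclass, field


@dataclass
class GovernanceRule:
    """Toy stand-in for a governance rule; not an IGC data structure."""
    name: str
    governs: set = field(default_factory=set)          # terms/assets under the rule
    implemented_by: set = field(default_factory=set)   # assets that enforce the rule


# Matches phrases such as "must be masked" or "must be validated".
REQUIREMENT_PATTERN = re.compile(r"\bmust be \w+", re.IGNORECASE)


def embedded_requirements(policy_text):
    """Return requirement-like phrases found in a policy description."""
    return REQUIREMENT_PATTERN.findall(policy_text)


rule = GovernanceRule(
    name="Address must be validated and verified against a postal reference source",
    governs={"Street Address"},
)
# implemented_by stays empty until an asset that enforces the rule is linked.
rule.implemented_by.add("address_validation_job")  # hypothetical asset name

text = "Customer addresses must be validated and card numbers must be masked."
print(embedded_requirements(text))  # ['must be validated', 'must be masked']
```

Each phrase the function returns is a candidate for its own governance rule, where it can be linked to the terms it governs and the assets that implement it.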
From the Catalog tab, select the Glossary
tab, then choose Browse Policies. Scroll down (or to the
next page) until you find the Know Your Customer (KYC)
policy, and click the policy name to open it.
Figure 7. Information Governance policy: Know Your Customer
In this policy example, you can see a number of features of the policy and
review associated governance rules:
- Know Your Customer is a policy specific to the domain of customer. The
  parent policy describes where it exists in the policy hierarchy (it
  can only have one parent, so you do have to determine the most logical
  location to place it).
- The policy includes a name and short and long descriptions. There is a
  link to an external reference, in this instance a Wikipedia reference.
  A link could be made to an accessible site internal to your
  organization instead.
- The policy references 25 specific governance rules. These are the
  details or requirements of the policy. For instance, the first rule
  listed is that address must be validated and verified against a postal
  reference source. If you click this governance rule, you will find
  specifics of that rule such as its name and description. It may also
  contain references to implementations and terms that it governs (for
  example, the term Street Address that you reviewed with regard to term
  content).
You can continue to review other governance policies and rules and their
relationships to get a broader understanding of how the components of the
IGC connect with key business concepts. This set of information governance
content, both terminology and policies, is a foundation allowing you to
begin focusing on key roles, processes, or information areas while
continuing to expand understanding of information governance through your
organization.
Refer to the PDF included in the downloads for further discussion on the
creation, development, and usage of the IGC content package.
Import the IGC queries
IGC provides the ability to query or report on all the content and
relationships in its repository, including policies, governance rules,
terms, and assets. These queries are powerful tools to help with policy
administration, implementation, monitoring, and enforcement.
The subsequent import steps assume that you download, extract, and save the
query content file IGC-governance-base-GovQueries-2014-09-24.wbq to
wherever the IGC browser can access it. To import the catalog queries
content, you must have Information Governance Catalog Glossary
Administrator or Information Asset Administrator privileges.
To perform the import of the IGC query content:
- Open the IGC and select the Catalog tab, then the Queries tab.
- Click Import.
- Browse to the directory location of the
  IGC-governance-base-GovQueries-2014-09-24.wbq file and click Import.
- Review the imported query list as shown below. There should be at
  least 10 queries present, although there may be others depending on
  your environment.
Figure 8. Imported queries
The queries provide a means to search and present information pertinent to
your information governance initiative. The functionality can provide
details as simple as terms within categories (the Glossary Categories and
Terms query) or more complex output with specific filters, such as finding
policies that do not yet have associated rules (the Governing Policies without
associated Rules query). The queries can become an active part of how you
implement your information governance program and your tools to monitor
the environment.
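To make the intent of a filter query such as Governing Policies without associated Rules concrete, here is a minimal sketch of the same logic over a plain Python dictionary. It does not use or reproduce the IGC query format. The function name and the second policy are hypothetical and stand in for whatever your catalog contains.

```python
def policies_without_rules(policy_rules):
    """Return the names of policies that reference no governance rules.

    policy_rules maps each policy name to a list of its rule names.
    Results are sorted so the output is stable.
    """
    return sorted(name for name, rules in policy_rules.items() if not rules)


catalog = {
    "Know Your Customer": [
        "Address must be validated and verified against a postal reference source",
    ],
    "Example policy with no rules yet": [],  # hypothetical entry
}

print(policies_without_rules(catalog))  # ['Example policy with no rules yet']
```

In the catalog, the imported query does this filtering for you. The sketch only shows which policies such a query is meant to surface: those that still need governance rules defined.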
Conclusion
In this tutorial, you have learned how to import and review content for the
InfoSphere Information Governance Catalog that can help you jump-start an
information governance initiative. You can now apply this knowledge to
develop and use relevant governance terms, policies, and rules based on
your needs. For additional usage of the IGC content, please review the
document IBMInfoSphereInformationServer_IGC_OOTB_Usage_v1.pdf included
(see Downloadable resources).
Downloadable resources
Related topics