
Analyzing the contextual content of hierarchical data by using IBM Watson Explorer

January 11, 2019

Dimensional models are conventionally based on parameters such as data reality,
grain, dimensions, and facts or measurements. But there are business scenarios where
valuable unstructured information is stored around the grains or at different levels
of hierarchy within the data model.

Using the linguistic algorithms of natural language processing (NLP), IBM Watson
Explorer can understand all of the words in a question, not just the keywords. In fact, it can analyze grammar, context, jargon, idioms, and even mode-of-thought. It can
then draw statistical inferences and make recommendations or provide advice within a
given confidence level.

This tutorial describes how to analyze a hierarchical data structure by using IBM Watson
Explorer within the context of content analytics while retaining the hierarchical
structure or grains of the underlying data model. This strategy gives you
the flexibility and capability of analyzing aspects of unstructured content and
structured dimensions together. I demonstrate step-by-step how to
overcome the issue of multiplicity while performing content analysis with
IBM Watson Explorer and IBM Watson Content Analytics Miner.

In preparing this tutorial, I used the following software components:

  • IBM Watson Explorer Content Analytics Version 11.0.2
  • IBM Content Analytics Studio Version 11
  • Eclipse Java EE IDE for Web Developers
  • IBM dashDB
  • JDBC Driver for DB2
  • Windows Server 2012 R2

Problem description

I have constructed this tutorial using a simple example of an incident management
system. Although it is a very simple example, the concepts it illustrates can be
easily extended to more complex use cases, whether in incident management or in
other systems such as quality management, case management, workflow management, or
content management.

Figure 1 through Figure 3 show the entity relation diagram, the fact table, and the
hierarchical representation of the example. Consider the data structure used for
incident reporting shown in Figure 1.

Figure 1. Entity relation diagram: Incident management system

The system illustrated here is an example of an Operational Source System. (For more
information about Operational Source Systems, see The Data Warehouse Toolkit.)
This system consists of a
transaction header row that is associated with multiple transaction lines. With
header/line schemas (also known as parent-child schemas), all of the header-level
dimension foreign keys and “degenerate dimensions” should be included on the line-level fact table
with differing granularity at each level. For this tutorial,
I used INCIDENT_CODE, INCIDENT_LINE_ITEM, INVESTIGATION_CODE, RESOLUTION_CODE, and
CLOSURE_CODE to create the fact table in Figure 2, which was
generated from the data model shown in Figure 1. I added a few rows of example
data to the fact table to build and demonstrate the solution.

Figure 2. Fact table structure and example data
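
The exact DDL used to create this table in dashDB is not shown in this tutorial. As a rough
illustration, the following sketch lists a subset of the Figure 2 columns grouped by hierarchy
level; the table name INCIDENT_FACT, the column types, and the lengths are assumptions.

// Hypothetical sketch of the derived fact table (subset of columns; names come from
// Figure 2, while the table name and types are assumptions, not from the tutorial).
public final class FactTableDdl {
    public static final String CREATE_FACT_TABLE =
        "CREATE TABLE INCIDENT_FACT ("
      + "  ID INTEGER NOT NULL,"                  // leaf-level row identifier
      + "  INCIDENT_CODE VARCHAR(32),"            // incident header level, repeated on every line
      + "  INCIDENT_LINE_ITEM VARCHAR(32),"       // incident detail level
      + "  INCIDENT_DESCRIPTION VARCHAR(1024),"   // detail-level unstructured text, repeated on every descendant row
      + "  INVESTIGATION_CODE VARCHAR(32),"       // investigation level
      + "  DIAGNOSIS_DESCRIPTION VARCHAR(1024),"
      + "  RESOLUTION_CODE VARCHAR(32),"          // resolution level
      + "  RESOLUTION_DESC VARCHAR(1024),"
      + "  CLOSURE_CODE VARCHAR(32),"             // closure (leaf) level
      + "  CLOSING_COMMENT VARCHAR(1024))";

    private FactTableDdl() { }
}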

Because of the denormalized structure of the derived fact table, only the leaf level
is unique.

As shown in Figure 3, a symmetrical hierarchical data structure with 5 levels and
3 children at each level produces 81 rows that all repeat the root-level content.
The repetition of each field’s unstructured or structured content therefore grows
from 3 at the level just above the leaves to 81 at the root. (Note: Figure 3
represents a fragment of the hierarchical structure representation. For more
details, see the sample data file in GitHub.)
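
As a quick check of that arithmetic, the following sketch computes the repetition count at
each level of a symmetrical hierarchy; the level count and fan-out are parameters rather
than anything read from the sample data.

// Computes how often a field at each level is repeated in the flattened fact table
// for a symmetrical hierarchy. With 5 levels and 3 children per level, the root-level
// content appears in 3^4 = 81 rows, while the leaf level stays unique.
public class RepetitionCount {
    public static void main(String[] args) {
        int levels = 5;
        int childrenPerLevel = 3;
        for (int level = 1; level <= levels; level++) {
            long repetition = (long) Math.pow(childrenPerLevel, levels - level);
            System.out.printf("Level %d content is repeated %d time(s)%n", level, repetition);
        }
    }
}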

Figure 3. Hierarchical view of example data


Everything that IBM Watson Explorer crawls is considered a document, or a single
entity (that is, each row of the database table is treated as a separate document),
and that document is the only grain available to the analysis. Unfortunately, these
multiple occurrences of unstructured content produce misleading statistics when
analyzed with IBM Watson Explorer. For example, a given text in the
INCIDENT_DESCRIPTION field at level 2, as shown in Figure 1 through Figure 3, is
counted nine times (a multiple of the number of children at the lower levels) for
one INVESTIGATION_CODE field.

This tutorial gives you a strategy to overcome this problem of
multiplicity. I demonstrate how to create a custom database crawler
plug-in to crawl the derived fact table in a specific way so that the dimensions of
all the levels of hierarchy are contained, but multiplicity is eliminated.

What you’ll need to build your application

To build your own application, you’ll need the following skills and tools:

  • Working knowledge of IBM Watson Explorer (Advanced Edition)
  • Working knowledge of Content Analytics Studio and an understanding of Apache UIMA
    content analytics concepts
  • IBM Watson Explorer Content Analytics V.11.0.2
  • IBM Watson Explorer Content Analytics Studio V.11
  • IBM Cloud Account (optional)
  • IBM dashDB (or any standard DBMS)
  • Understanding of DBMS concepts
  • Java development skills

Set up the content analytics collection

To create the content analytics collection:

  1. Note the following settings and options for use by the collection.
    General options:
    • Collection name: ATutorialCollection
    • Collection type: Content analytics collection
    • Solution package: Do not apply a solution package
    • Document cache: Enable the document cache
    • Thumbnail generation (required for showing thumbnail images in the search results): Do not enable thumbnail generation

    Advanced options:
    • Description: Collection for demonstrating the concept of Contextual Content Analytics of Hierarchical Data – 02/04/2018
    • Collection security: Do not enable security for the collection
    • Overlay index: Do not enable overlay index
    • N-gram segmentation: Do not enable N-gram segmentation

    Note: For more details, see “Administering Watson Explorer Content Analytics” in the IBM
    Knowledge Center.

  2. Create the fact table.

    A sample fact table, shown in Figure 2, is created in IBM dashDB and loaded
    with the sample data. Figure 4 shows the sample data in the dashDB console.
    Detailed steps for creating the table are beyond the scope of this tutorial,
    and you can use any standard relational database.

    Figure 4. Sample data in dashDB

  3. Create the JDBC crawler.

    Follow the standard process for creating a JDBC
    crawler, as described under “Crawler administration” in the IBM Knowledge Center.

    The following table lists the JDBC connection details for reference.

    • JDBC driver name: com.ibm.db2.jcc.DB2Driver
    • JDBC driver class path: F:\DevWTutorial\jars
    • Database URL: jdbc:db2://awh-yp-small02.services.dal.bluemix.net:50000/BLUDB

    The JDBC driver used to connect to dashDB is the same as the DB2 driver and can
    be downloaded from the IBM Support site. (A minimal connection sketch using
    these settings appears after this procedure.)

  4. Populate the index field and facet mapping.

    While creating the crawler, the
    index field mapping and the facet tree are populated according to the
    following table. Index field names are important, as these fields will be
    used within the custom crawler plug-in for the custom crawling process of
    de-duplication.

    Field Name Return-able Faceted search Free text search In summary Fielded search Parametric search Analyzable Unique identifier Facet mapping Group
    ID Yes
    INCIDENT_CODE Yes Yes Yes Yes Yes No No Incident Code Incident Header
    INCIDENT_TYPE_CODE Yes Yes Yes Yes Yes No No Incident Type Code Incident Header
    REQUESTED_BY Yes Yes Yes Yes Yes No No Raised By Incident Header
    AFFECTED_ITEM Yes Yes Yes Yes Yes No No Affected Product Incident Header
    HELP_DESK_ID Yes Yes Yes Yes Yes No No Agent Code Incident Header
    INCIDENT_LINE_ITEM Yes Yes Yes Yes Yes No No Incident Line Item Incident Details
    ASSIGNED_DEPT Yes Yes Yes Yes Yes No No Assigned to Dept. Incident Details
    INCIDENT_PART_NO Yes Yes Yes Yes Yes No No Affected Parts(s) Incident Details
    INCIDENT_DESCRIPTION Yes No Yes Yes No No Yes
    INVESTIGATION_CODE Yes Yes Yes Yes Yes No No Investigation Code Investigation Details
    DIAGNOSIS_CODE Yes Yes Yes Yes Yes No No Diagnosis Code Investigation Details
    DEFECTIVE_PARTS Yes Yes Yes Yes Yes No No Defective Part(s) Investigation Details
    DIAGNOSIS_DESCRIPTION Yes No Yes Yes No No Yes
    RESOLUTION_CODE Yes Yes Yes Yes Yes No No Resolution Code Resolution Details
    R_DEPT_CODE Yes Yes Yes Yes Yes No No Resolution Dept. Code Resolution Details
    R_WORK_CENTER Yes Yes Yes Yes Yes No No Resolution Work Center Resolution Details
    RESOLUTION_DESC Yes No Yes Yes No No Yes
    CLOSURE_CODE Yes Yes Yes Yes Yes No No Closure Code Final Disposition Logs
    CLOSING_COMMENT Yes No Yes Yes No No Yes

    Figure 5 shows the facet tree hierarchy that is displayed in IBM Watson
    Explorer Content Analytics Miner. After the facet tree is created, the table
    is initially crawled with the above configuration.

    Figure 5. Facet tree structure

  5. Create a defect dictionary and deploy it in the content analytics server.

    The
    following table lists the keywords of the defect dictionary along with their
    parts of speech and inflections.

    Defect (keyword) Part of speech Inflection(s)
    damage Noun damages
    damage Verb damaged damages damaging
    foul Verb fouled fouling fouls
    foul Noun fouls
    crack Noun cracks
    crack Verb cracked cracking cracks
    clip Verb clipped clipping clips
    clip Noun clips
    break Noun breaks
    break Verb breaking breaks broke broken
    rip Verb ripped ripping rips
    rip Noun rips
    abrasion Noun abrasions
    mark Noun marks
    mark Verb marked marking marks
    seal Verb sealed sealing seals
    seal Noun seals
    scrap Noun scraps
    scrap Verb scrapped scrapping scraps

    The defect dictionary is generated using these keywords and is used to analyze
    the unstructured text content in the fact table. For a more detailed
    understanding, see the sample data file in GitHub.

    Figure 6 shows how the dictionary is organized within the IBM Watson Content
    Analytics Studio. For more information, see “Content Analytics Studio for advance text analytics” in the IBM
    Knowledge Center.

    Figure 6. Defect dictionary in Content Analytics Studio

    After the defect dictionary is created, the configured UIMA pipeline is exported to the Content Analytics Server.

    (Detailed steps for creating the dictionary and exporting the UIMA pipeline
    are beyond the scope of this tutorial. This section is for reference only to
    maintain the continuity of the tutorial.)
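
Before moving on, it can help to see the multiplicity directly in the database. The sketch
below uses the JDBC driver and database URL listed in step 3 to count how many rows share
each INCIDENT_LINE_ITEM; the table name INCIDENT_FACT and the credential placeholders are
assumptions, since the tutorial does not show them.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Counts the rows per INCIDENT_LINE_ITEM in the derived fact table to show the
// multiplicity that the custom crawler plug-in removes later in this tutorial.
// Requires the DB2 JDBC driver (com.ibm.db2.jcc.DB2Driver) on the class path;
// the table name INCIDENT_FACT and the <user>/<password> placeholders are assumptions.
public class MultiplicityCheck {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:db2://awh-yp-small02.services.dal.bluemix.net:50000/BLUDB";
        try (Connection con = DriverManager.getConnection(url, "<user>", "<password>");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(
                     "SELECT INCIDENT_LINE_ITEM, COUNT(*) AS ROW_COUNT "
                   + "FROM INCIDENT_FACT GROUP BY INCIDENT_LINE_ITEM")) {
            while (rs.next()) {
                System.out.printf("%s appears in %d rows%n",
                        rs.getString("INCIDENT_LINE_ITEM"), rs.getInt("ROW_COUNT"));
            }
        }
    }
}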

Generate initial output without the custom crawler plug-in

  1. Get the initial statistics in IBM Watson Explorer Content Analytics Miner.

    After the collection has been crawled, indexed, and made ready for analysis,
    use IBM Watson Explorer Content Analytics Miner to verify the initial
    statistics. Note that in Figure 7, 57 occurrences of the “fouling” keyword and
    50 occurrences of the “scrap” keyword are found in the table, followed by the
    other keywords.

    Figure 7. Initial statistics in IBM Watson Explorer Content Analytics Miner
  2. Now query the documents (each a single row of the table) in IBM Watson Explorer
    Content Analytics Miner by selecting the keyword fouling in the following query:

    [Query Text : (*:*) AND (keyword::/"Explore"/"Defect Type(s)"/"fouling")]

    If you look at the sample fact table, you’ll see that the INCIDENT_DESCRIPTION
    field contains the same value repeated 20 times because it is joined with the
    INVESTIGATION_DETAILS table. This is misleading and is an example of the
    multiplicity problem that is inherent in relational joins. In the next section,
    I show you how to create a custom plug-in to overcome this issue.

    INCIDENT_HDR INCIDENT_DETAILS
    ID INCIDENT_CODE INCIDENT_LINE_ITEM INCIDENT_DESCRIPTION
    1 D1 D1.1 PART-01 is fouling and cracked or clipped
    2 D1 D1.1 PART-01 is fouling and cracked or clipped
    3 D1 D1.1 PART-01 is fouling and cracked or clipped
    4 D1 D1.1 PART-01 is fouling and cracked or clipped
    5 D1 D1.1 PART-01 is fouling and cracked or clipped
    6 D1 D1.1 PART-01 is fouling and cracked or clipped
    7 D1 D1.1 PART-01 is fouling and cracked or clipped
    8 D1 D1.1 PART-01 is fouling and cracked or clipped
    9 D1 D1.1 PART-01 is fouling and cracked or clipped
    10 D1 D1.1 PART-01 is fouling and cracked or clipped
    11 D1 D1.1 PART-01 is fouling and cracked or clipped
    12 D1 D1.1 PART-01 is fouling and cracked or clipped
    13 D1 D1.1 PART-01 is fouling and cracked or clipped
    14 D1 D1.1 PART-01 is fouling and cracked or clipped
    15 D1 D1.1 PART-01 is fouling and cracked or clipped
    16 D1 D1.1 PART-01 is fouling and cracked or clipped
    17 D1 D1.1 PART-01 is fouling and cracked or clipped
    18 D1 D1.1 PART-01 is fouling and cracked or clipped
    19 D1 D1.1 PART-01 is fouling and cracked or clipped
    20 D1 D1.1 PART-01 is fouling and cracked or clipped
    21 D1 D1.2 PART-02 is broken or damaged

Create a custom Java crawler plug-in

This section describes the custom crawler plug-in and shows how to use it with IBM
Watson Explorer Content Analytics Miner to obtain statistically correct insights.

Crawler plug-ins for IBM Watson Explorer Content Analytics are divided into two
categories based on data sources:

Type A data sources:
  • Content Integrator
  • Content Manager
  • DB2
  • JDBC database
  • Notes
  • Quickr for Domino
  • SharePoint crawler
  • Seed list

Type B data sources:
  • Agent for Windows file systems crawler
  • File system

In this procedure, I use a JDBC database, which is a Type A data source. For Type A
data sources, I can create a Java class that programmatically updates the values of
security tokens, metadata, and document content.

The following code is for the Java plug-in. It is followed by a description of the
deployment process and an analysis of the code’s logic. PluginLogger logs
the issues related to the crawler.

Custom Java crawler plug-in

package com.ibm.es.crawler.dbplugins;

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import com.ibm.es.crawler.plugin.AbstractCrawlerPlugin;
import com.ibm.es.crawler.plugin.CrawledData;
import com.ibm.es.crawler.plugin.CrawlerPluginException;
import com.ibm.es.crawler.plugin.FieldMetadata;
import com.ibm.es.crawler.plugin.logging.PluginLogger;

public class TutorialDBCrawlerPlugin extends AbstractCrawlerPlugin {

    /** Logger */
    private static final PluginLogger logger;
    static {
        PluginLogger.init(PluginLogger.LOGTYPE_OSS, PluginLogger.LOGLEVEL_INFO);
        logger = PluginLogger.getInstance();
    }

    /** Index field names used by the de-duplication logic. */
    private static final String INCIDENT_DESCRIPTION = "INCIDENT_DESCRIPTION";
    private static final String DIAGNOSIS_DESCRIPTION = "DIAGNOSIS_DESCRIPTION";
    private static final String RESOLUTION_DESC = "RESOLUTION_DESC";
    private static final String INVESTIGATION_CODE = "INVESTIGATION_CODE";
    private static final String RESOLUTION_CODE = "RESOLUTION_CODE";
    private static final String CLOSURE_CODE = "CLOSURE_CODE";

    public TutorialDBCrawlerPlugin() {
        super();
    }

    public void init() throws CrawlerPluginException {
    }

    public boolean isMetadataUsed() {
        return true;
    }

    public void term() throws CrawlerPluginException {
        return;
    }

    public CrawledData updateDocument(CrawledData crawledData)
            throws CrawlerPluginException {
        List<FieldMetadata> metadataList = crawledData.getMetadataList();
        if (metadataList == null) {
            logger.error("metadata list is empty");
            return crawledData;
        }
        // Replace the metadata list of the crawled row with the de-duplicated version.
        crawledData.setMetadataList(setIndexContext(metadataList));
        return crawledData;
    }

    public boolean isContentUsed() {
        return true;
    }

    public List<FieldMetadata> setIndexContext(List<FieldMetadata> metaList) {
        // Map each field name to its value for easy lookup.
        Map<String, String> oneRow = metaList.stream().collect(
                Collectors.toMap(FieldMetadata::getFieldName,
                        FieldMetadata::getValue));

        String closureCode = getLeafValue(oneRow.get(CLOSURE_CODE));
        String resolutionCode = getLeafValue(oneRow.get(RESOLUTION_CODE));
        String investigationCode = getLeafValue(oneRow.get(INVESTIGATION_CODE));
        logger.warn("Modified Map Details :" + oneRow.toString());

        List<FieldMetadata> returnMetaList = metaList;
        for (int i = 0; i < returnMetaList.size(); i++) {
            String fieldName = returnMetaList.get(i).getFieldName();
            // Keep INCIDENT_DESCRIPTION only on the single row whose investigation,
            // resolution, and closure codes all end in ".1"; blank it everywhere else.
            if (fieldName.contentEquals(INCIDENT_DESCRIPTION)
                    && !(closureCode.equals("1") && resolutionCode.equals("1")
                            && investigationCode.equals("1"))) {
                returnMetaList.get(i).setValue("");
            }
            // Keep DIAGNOSIS_DESCRIPTION only where the resolution and closure codes end in ".1".
            if (fieldName.contentEquals(DIAGNOSIS_DESCRIPTION)
                    && !(closureCode.equals("1") && resolutionCode.equals("1"))) {
                returnMetaList.get(i).setValue("");
            }
            // Keep RESOLUTION_DESC only where the closure code ends in ".1".
            if (fieldName.contentEquals(RESOLUTION_DESC)
                    && !closureCode.equals("1")) {
                returnMetaList.get(i).setValue("");
            }
        }
        return returnMetaList;
    }

    public String getLeafValue(String keyField) {
        // Return the last segment of a dot-separated hierarchy code, for example "D1.1.1" -> "1".
        String[] hierarchy = keyField.split("\\.");
        return hierarchy[hierarchy.length - 1];
    }
}

Custom crawler plug-in methods description

To create a Java class for use as a crawler plug-in with content-related functions
for Type A data sources, extend com.ibm.es.crawler.plugin.AbstractCrawlerPlugin and
implement the following methods:

  • init()
  • isMetadataUsed()
  • isContentUsed()
  • activate()
  • deactivate()
  • term()
  • updateDocument()

The AbstractCrawlerPlugin class is an abstract class. The
init, activate, deactivate, and
term methods are implemented to do nothing. The
isMetadataUsed method and isContentUsed method are
implemented to return false by default. The updateDocument method is an
abstract method, so you must implement it.

For name resolution, use the ES_INSTALL_ROOT/lib/dscrawler.jar file.

When the crawler session starts, the plug-in process is forked. An
AbstractCrawlerPlugin object is instantiated with the default
constructor and the init, isMetadataUsed, and
isContentUsed methods are called one time. During the crawler
session, the activate method is called when the crawler starts its
crawling and the deactivate method is called when the crawler finishes
its crawling. When the crawler session ends, the term method is called
and the object is destroyed. If the crawler scheduler is enabled, the
activate method is called when the crawling is scheduled to start
and the deactivate method is called when the crawling is scheduled to
end. Because a single crawler session runs continuously when the crawler scheduler
is enabled, the term method is not called to destroy the object.

Custom crawler plug-in compilation and deployment

  1. Compile the implemented code and make a JAR file for it. Add the
    ES_INSTALL_ROOT/lib/dscrawler.jar file to the class path when you compile.
  2. In the administration console, follow these steps:
    1. Edit the appropriate collection.
    2. Select the Crawl page and edit the crawler properties for the crawler
      that will use the custom Java class.
    3. Specify the following items:
      • The fully qualified class name of the implemented Java class:
        for example,
        com.ibm.es.crawler.dbplugins.TutorialDBCrawlerPlugin.
      • Include the name of the JAR file in your path declaration: for example,
        F:\DevWTutorial\Plugins\DBCrawlerPlugin\TutorialDBCrawlerPlugin.jar.
    4. On the Crawl page, click Monitor. Then click
      Stop and Start to restart the
      session for the crawler that you edited.
    5. Click Details and start a full crawl.

Other than the methods extended from the AbstractCrawlerPlugin abstract
class, two key methods have been implemented: getLeafValue() and
setIndexContext(). These methods eliminate the duplicates from the
collection index.

  • The getLeafValue() method is called from
    setIndexContext() with a key field value and returns the value of
    the leaf, assuming “.” is the separator. For example, if D1.1.1.1.1
    is passed as an argument, 1 is returned. (A short demonstration follows this list.)
  • The setIndexContext() method is called from the updateDocument()
    method with an argument of a list of FieldMetadata
    (com.ibm.es.crawler.plugin.FieldMetadata). This method works on the
    FieldMetadata list based on the given logic and returns the
    de-duplicated FieldMetadata, which is eventually set into the
    crawled data and used by the crawler process.
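
The following standalone snippet mirrors that logic so you can see the de-duplication rule in
isolation. It is a sketch for illustration only and is not part of the deployed plug-in.

public class LeafValueDemo {
    // Mirrors TutorialDBCrawlerPlugin.getLeafValue(): returns the last segment of a
    // dot-separated hierarchy code.
    static String getLeafValue(String keyField) {
        String[] hierarchy = keyField.split("\\.");
        return hierarchy[hierarchy.length - 1];
    }

    public static void main(String[] args) {
        System.out.println(getLeafValue("D1.1.1.1.1")); // prints "1"
        System.out.println(getLeafValue("D1.2.3"));     // prints "3"
        // In setIndexContext(), INCIDENT_DESCRIPTION is kept only when the leaf values of
        // INVESTIGATION_CODE, RESOLUTION_CODE, and CLOSURE_CODE are all "1", so each piece
        // of higher-level text is indexed exactly once.
    }
}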

After the code is ready, it can be deployed as explained in the procedure of this
section (substeps 1 through 3). For more information, see “Creating a crawler plug-in for Type A data sources” in the IBM Knowledge
Center.

Contextual views

A unique feature of IBM Watson Explorer is the contextual view. You can explore the
data’s contents in a specific context by grouping analyzable fields. For example, if
contextual views are configured for “Incident Description” and “Resolution
Description”, either grouped together or as individual views, users can limit the
results to the parts of documents that match those contexts.
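
For example, after a contextual view is created (as described in the next section), a query
can be scoped to it with the IN operator that appears later in this tutorial. The internal
view name incident_details shown here is an assumption based on the naming pattern of the
resolution_final_disposition example:

    ((*:*) AND (keyword::/"Explore"/"Defect Type(s)"/"fouling")) IN incident_details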

Contextual view creation

  1. You create a contextual view in the administrative console by first selecting
    Configure parse and index setting.

    Figure 8. Configure parse and index setting, Contextual view option
  2. Select Contextual views and then create the contextual views for
    textual content based on the business requirements. The contextual views created
    for this example are listed in the following table.

    Context fields Contextual view name
    INCIDENT_DESCRIPTION Incident Details
    DIAGNOSIS_DESCRIPTION Investigation
    INCIDENT_DESCRIPTION & DIAGNOSIS_DESCRIPTION Incident & Investigation
    RESOLUTION_DESC Resolution
    CLOSING_COMMENT Final Disposition
    RESOLUTION_DESC & CLOSING_COMMENT Resolution & Final Disposition

    Figure 9 shows the list of contextual views created for this tutorial as
    given in the previous table.

    Figure 9. Contextual view details


Business users can now derive insights based on the context of the hierarchical data
and the facet hierarchy, as described in the following section.

Contextual analysis using IBM Watson Explorer Content Analytics Miner

The following outputs in IBM Watson Explorer Content Analytics Miner are indicative
samples of occurrences of defect types, based on the context selected by the user;
they show the exact frequency of the occurrences.

Incident Details

Figure 10 shows the IBM Content Analytics Miner output when the Incident Details
contextual view is selected. The figure shows two occurrences of the keyword
fouling from the unstructured field INCIDENT_DESCRIPTION, and similar counts for
the other keywords. The output can also be viewed for the other configured
contextual views.

Figure 10. Incident details contextual view in IBM Content Analytics Miner

Resolution & Final Disposition

In Figure 11, you see the fouling keyword in 24 fields, which include the fields
RESOLUTION_DESC and CLOSING_COMMENT. Because CLOSING_COMMENT is at the leaf level of
the hierarchy, it is unique, and the occurrence count is high.

Figure 11. Resolution & Final Disposition contextual view in IBM Content Analytics Miner

Use case analysis

Now let’s consider a use case scenario where business users want to analyze the
occurrence of a keyword and find its relationship with the facets derived from the
dimensions at different levels of the hierarchy.

Assume that you see that the fouling keyword has unusually many
occurrences compared with other defect types in the context of the Resolution &
Final Disposition contextual view. You want to investigate the issue in more
detail and analyze the relationship of the “reworking” and “fouling” components
referred to in those incidents. Here are the steps you would follow.

  1. Add the fouling keyword to the query, which refines the analysis to
    24 results out of 150 within the context of Resolution & Final
    Disposition:

    ((*:*) AND (keyword::/"Explore"/"Defect Type(s)"/"fouling")) IN resolution_final_disposition
  2. Out of all the incidents, incidents D2 and D1 have the most occurrences of the
    fouling keyword in Resolution and Closing Comment. The user
    adds these two incident codes to the query, which narrows the results further to
    16.

    Figure 12. Frequency of incident based on the criteria

    (((*:*) AND (keyword::/"Explore"/"Defect Type(s)"/"fouling")) IN resolution_final_disposition) AND (keyword::/"Facet Hierarchy"/"Incident Header"/"Incident Code"/"D2" OR keyword::/"Facet Hierarchy"/"Incident Header"/"Incident Code"/"D1")
  3. Now the free-form text “rework and use” is added to the query, which further
    refines the scope and retrieves five specific records. (Instead of free-form
    text, you could use phrase annotation to capture different writing
    patterns: for example, “rework and reinspect”, “rwrk &
    use.”, and “rework & re-use”.) Here is the final query
    syntax:

    ((("rework and use") AND (keyword::/"Explore"/"Defect Type(s)"/"fouling")) IN resolution_final_disposition) AND (keyword::/"Facet Hierarchy"/"Incident Header"/"Incident Code"/"D2" OR keyword::/"Facet Hierarchy"/"Incident Header"/"Incident Code"/"D1")

Now you can find the trend by choosing different facets from the facet tree. Figure
13 shows the trend charts in IBM Watson Explorer Content Analytics Miner’s dashboard
for Affected Product vs. Affected Part(s) based on the text analytics query derived
above.

Figure 13. Final output in IBM Watson Content Analytics Miner dashboard

You can see that out of six different parts affected in incidents D1 and D2, part-04,
part-13, and part-01 are affected in this use case scenario.

Conclusion

This tutorial has described a strategy for analyzing a hierarchical data structure in
IBM Watson Explorer within the context of content analytics while retaining the
hierarchical structure or grains of the dimensional model. The dataset used in this
tutorial is representative of real-world scenarios, but it is not very large or
complex. In an actual business scenario, this kind of analysis would give you
the combined power of both content analytics and dimensional analysis, and
would provide you with machine-assisted decision making, automation, and
business optimization.

Such scenarios are common in the following domains:

  • Incident management
  • Case management
  • Content management
  • Quality management system
  • Enterprise information management

The concepts described here can be implemented for analyzing data inconsistency in
a source system, automated data profiling, big data analytics, data governance, and
data ontology creation.

Using these concepts, you can overcome the problem of data multiplicity and analyze
the unstructured text content at each hierarchical level (separately or together),
thereby deriving insights that can be made accessible to business users.

