Intelligent Data Quality Management

Preconfigured templates, Auto Assignment of Class & Characteristics, ISO 8000 and UNSPSC compliant

About Data Quality Management

Data Quality Management automates the standardization, cleansing & management of unstructured/free-text data by utilizing Auto Structured Algorithms (ASA) built on PiLog's taxonomy and the catalog repositories of master data records.

Data Quality Management capabilities include, but are not limited to:

  • Analyze the source data content for completeness, consistency, redundancy, standardization, richness, etc.
  • Auto assignment of class & characteristics from PiLog's Taxonomy to each record
  • Extract the characteristic values & UOMs from the source descriptions for each record
  • Extract reference data, such as Part#/Model#/Drawing#/Manufacturer/Vendor, from the source descriptions for each record
  • Bulk review of materials (QC Tools & DQ Assessment)
  • Auto mapping of source data with PiLog repositories & other reliable sources
  • Assign the data sets to relevant user groups based on various criteria
  • Capture additional information and/or validate processed/structured data
  • Provision to collect field data & update it (physical walk-down)
  • Auto-generation of Short & PO text based on user-configured rules
  • Identification of redundant records
  • Export the data to be migrated to target system(s)
  • Integrate in real time with other systems
  • Data quality assessment & progress reports
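The extraction of characteristic values and UOMs from free-text descriptions can be sketched as a simple pattern-matching pass. The unit map and regex below are illustrative assumptions, not PiLog's actual ASA logic:

```python
import re

# Hypothetical unit map: raw tokens -> standardized UOM (an assumption, not PiLog's actual list)
UOM_MAP = {"mm": "MM", "millimeter": "MM", "v": "V", "volt": "V", "kg": "KG"}

# Pattern: a number immediately or loosely followed by a known unit token, e.g. "25MM", "220 V"
VALUE_UOM = re.compile(r"(\d+(?:\.\d+)?)\s*(mm|millimeter|v|volt|kg)\b", re.IGNORECASE)

def extract_values(description: str):
    """Return (value, standardized UOM) pairs found in a free-text description."""
    return [(float(num), UOM_MAP[unit.lower()]) for num, unit in VALUE_UOM.findall(description)]

print(extract_values("BOLT, HEX: M12, L 25MM, SS316"))  # → [(25.0, 'MM')]
```

A production algorithm would of course need a far larger unit dictionary and disambiguation rules; the sketch only shows the general shape of the step.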

Spares Data Acquisition

Criteria     | MM                      | VM                                              | CM                                              | SM                      | PM                      | HR
Count        | 2,182 Classes Completed | -                                               | -                                               | -                       | 2K+ Classes Completed   | -
Validation   | ERP Fields validation   | General, Taxation, Banking Details Validation   | General, Taxation, Banking Details Validation   | ERP Fields validation   | ERP Fields validation   | Address Fields Validation
Verification | ERP Fields verification | General, Taxation, Banking Details Verification | General, Taxation, Banking Details Verification | ERP Fields verification | ERP Fields verification | Address Fields Verification
Scope of Work [PiLog : Customer = 30:70]

S.No | SOW Description | PiLog | Customer
1  | Data Extraction from SAP systems (Master Data, Base Data, Configuration Data, KDS, etc.), templates, values & business validations | C & I | R & A
2  | Analyzing the client's taxonomy against PiLog's Global templates | R & A | C & I
3  | Analyze the material master records and determine the Key Data Structure (KDS) fields in accordance with SAP and other 3rd-party systems | R & A | C & I
4  | Configuring the algorithms specific to customer data, preprocessing the data and uploading it into the Management tool in the same format as the provided model template | R & A | C & I
5  | Running data analysis algorithms and providing analysis details on source descriptions and on SAP fields (KDS) | R & A | C & I
6  | Initiating the cleansing process by grouping the records based on commodity, category and UNSPSC | R & A | C & I
7  | Allocating material records to cataloguers either group-wise or category-wise (done by the Manager) | C & I | R & A
8  | Cataloguers review allocated groups in the Data Automation Process and correct/validate/update records | R & A | C & I
9  | Process records through Automation to segregate them into categories such as A, B, C & D | R & A | C & I
10 | Process the records with reference details against the PiLog repository (PPR) and fetch the characteristics & additional details based on Part/Model number | R & A | C & I
11 | Review the records segregated into the categories mentioned in Point 9 above, assess the data quality (correctness, completeness, relevance, mandatory requirements, etc.) and transfer the data to the next steps | R & A | C & I
12 | Run QC tools to establish consistency and standardization across the batch or set of records | R & A | C & I
13 | Consolidate approved records and move them to the next step, or assign them for enrichment in case of insufficient/inadequate information | C & I | R & A
14 | Run the duplicate check and validate the results once all the records are approved & finalized per commodity, plant or batch | R & A | C & I
15 | Merge the confirmed duplicates or common items and consolidate the records within the ERP system with appropriate cross-references | R & A | C & I
16 | Upload the data into the Governance solution or ERP system with updated associated data | R & A | C & I
17 | Prepare close-out reports, extract all the data & archive | R & A | C & I
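The duplicate check described in step 14 can be illustrated with a naive normalized-key grouping. The normalization rules here are assumptions for illustration, not PiLog's matching algorithm; consistent with PiLog's cataloguing rules, collisions are only flagged, never deleted:

```python
from collections import defaultdict

def normalize(desc: str) -> str:
    """Crude normalization: uppercase, strip punctuation, sort tokens so word order is ignored."""
    tokens = "".join(c if c.isalnum() else " " for c in desc.upper()).split()
    return " ".join(sorted(tokens))

def flag_potential_duplicates(records):
    """Group record ids whose normalized descriptions collide; return only groups of 2+."""
    groups = defaultdict(list)
    for rec_id, desc in records:
        groups[normalize(desc)].append(rec_id)
    return [ids for ids in groups.values() if len(ids) > 1]

records = [
    ("M-001", "BOLT, HEX: M12 X 50MM, SS316"),
    ("M-002", "HEX BOLT M12 X 50MM SS316"),
    ("M-003", "NUT, HEX: M12, SS316"),
]
print(flag_potential_duplicates(records))  # → [['M-001', 'M-002']]
```

Flagged groups would then go to the client for confirmation before any record is actually merged or removed.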
  • PiLog's objective is to have a global repository so that no item requires manual cataloguing.
  • PiLog has developed the PPR (PiLog Preferred Records), a central repository in which data is stored and managed. It has embedded industry-standard content and ISO-compliant processes (data exchange compliant with ISO 29002).
  • PiLog has spent over twenty years researching, developing, and refining the PPR.
  • For those wanting to start with structured descriptions now, without waiting for historical data clean-up, we offer the Structured Text Generator to build class-driven descriptions that improve search capability and eliminate free-text spending.
  • The PPR is not limited to materials; it also covers the service master domain, with 0.5 million readily structured service master records available.
  • The PPR has 100% accuracy in its descriptions.
  • The golden records are re-usable, as they are independent of language, region and industry sector.

Data Quality Standards

The International Organization for Standardization (ISO) approved a set of standards for data quality as it relates to the exchange of master data between organizations and systems. These are primarily defined in the ISO 8000-110, -120, -130 and -140 standards and the ISO 22745-10, -30 and -40 standards. Although these standards were originally inspired by the business of replacement-parts cataloguing, they potentially have a much broader application. The ISO 8000 standards are high-level requirements that do not prescribe any specific syntax or semantics. The ISO 22745 standards, on the other hand, define a specific implementation of the ISO 8000 standards in extensible markup language (XML) and are aimed primarily at parts cataloguing and industrial suppliers.

PiLog's Data Harmonization processes & methodologies comply with the ISO 8000 & ISO 22745 standards.

PiLog utilizes the PiLog Preferred Ontology (PPO) when structuring and cleansing Material, Asset/Equipment & Services Master records, ensuring data deliverables comply with the ISO 8000 methodology, processes & standards for syntax, semantics, accuracy, provenance and completeness.


Throughout its 25 years of experience in master data solutions across different industries, PiLog has developed the PiLog Preferred Ontology (PPO), a Technical Dictionary that complies with the ISO 8000 standard. The PPO is a well-defined, industry-specific dictionary covering all industry verticals, such as Petrochemical, Iron & Steel, Oil & Gas, Cement, Transport, Utilities, Retail, etc.

PiLog's Taxonomy consists of pre-defined templates. Each template consists of a list of classes (object-qualifier or noun-modifier combination) with a set of predefined characteristics (properties/attributes) per class. PiLog will make the PPO (class/characteristics/abbreviations) available for general reference via the Data Harmonization Solution (DHS) and Master Data Ontology Manager (MDOM) tools.

PiLog creates client preferred ontology (CPO) by copying general templates common to most companies/industries, as well as known, expected templates for the specific client. The Client team will confirm the CPO templates by approving:

  • The class and characteristics combination for a particular class, including linkage to respective United Nations Standard Products and Services Code version 21.
  • The characteristics listed on the template as complete, and properly identified as mandatory and optional.
  • The proposed abbreviations.
  • Ordering of properties for description generation.
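Description generation from an ordered template with approved abbreviations, as confirmed in the CPO approval steps above, can be sketched as follows. The class, characteristic order, abbreviations and 40-character limit are hypothetical illustrations, not actual PPO content:

```python
# Hypothetical template: ordered characteristics with approved abbreviations (not actual PPO data)
TEMPLATE = {
    "class": "BOLT, HEX",
    "order": ["THREAD SIZE", "LENGTH", "MATERIAL"],
    "abbrev": {"THREAD SIZE": "THD", "LENGTH": "LG", "MATERIAL": "MATL"},
}

def short_description(values: dict, max_len: int = 40) -> str:
    """Generate a class-driven short text: class name, then values in template order."""
    parts = [TEMPLATE["class"]]
    for char in TEMPLATE["order"]:
        if char in values:
            parts.append(f"{TEMPLATE['abbrev'][char]} {values[char]}")
    return "; ".join(parts)[:max_len]

print(short_description({"THREAD SIZE": "M12", "LENGTH": "50MM", "MATERIAL": "SS316"}))
# → BOLT, HEX; THD M12; LG 50MM; MATL SS316
```

Because the property order and abbreviations are fixed per class, every item of that class yields a consistently structured short text regardless of who catalogues it.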

PiLog reserves the right to make the following changes to the dictionary:

  • Change existing classes and/or characteristics of the CPO where necessary.
  • Register new classes and/or characteristics in the OTD and add them to the CPO.
  • If changes are made to the CPO dictionary, PiLog will only update the changes/additions in the MDPM software tool; no additional approval is required from Client to incorporate the changes, as PiLog manages the CPO dictionary according to industry standards.
  • The CPO dictionary is the intellectual property of PiLog. In no way may it be edited, copied, compared, mapped, transmitted, imported/exported into other software/systems, or printed/published without prior written permission of PiLog. CPO includes concepts, classes, terms, definitions, languages, abbreviations, data requirements, equivalences, images, data types, translations, and any data structures or relationships of the content stored within the CPO.

Data Cleansing

Cleansing and structuring of a material master, performed with the PiLog Master Data Project Management tool, is a highly specialized field requiring the use of international standards such as eOTD, USC, EAN, ISO 8000, etc.

Effective cleansing and structuring of a material master, with consistent and correct application of these standards across large volumes of data, requires specialized processes, methodologies and software tools.

The material master forms the basis for a myriad of business objectives. PiLog understands the complex task of translating selected business objectives into master data requirements and subsequently designing a project that is focused on delivering optimal results in a cost effective way.

For a large number of line items, effective cleansing of the material master does require the cleansing and standardization of the manufacturers and/or suppliers. It therefore follows that a vendor/supplier clean-up and standardization is a logical consequence of the process.

PiLog has its own specialized data refinery, PiLog Data. PiLog has developed superior technology and methodologies that are aimed at delivering the best possible quality, consistently and cost effectively.

In answer to the market need for cataloguing services in a consistent and repeatable manner, PiLog developed the world's first internationally proven standard for services cataloguing, the USC. Although the USC has now been accepted as part of the eOTD, the specific methodologies required to implement it successfully remain with PiLog.

The material master, as well as other master data tables, requires standardized base tables for, amongst others: unit of measure, unit of purchase, material types and material groups. This is also a specialty of PiLog.
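Base-table standardization can be pictured as a lookup from legacy codes to a governed standard list. The mapping below is a hypothetical example for units of measure, not PiLog's actual base table:

```python
# Hypothetical legacy-to-standard unit-of-measure base table (illustrative values only)
UOM_BASE = {"EA": "EA", "EACH": "EA", "PC": "EA", "PCS": "EA", "MTR": "M", "MTS": "M"}

def standardize_uom(code: str) -> str:
    """Map a legacy unit-of-measure code to its standard code; flag unknowns for manual review."""
    return UOM_BASE.get(code.strip().upper(), f"REVIEW:{code}")

print(standardize_uom("pcs"))  # → EA
print(standardize_uom("BTL"))  # → REVIEW:BTL
```

Flagging unknown codes rather than guessing keeps the standardized base table authoritative.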

Data Cataloging

Data cataloging is classified into the methods below:

Basic Structuring (reference data extraction & allocation of a PPO-OTD class)

  • Reference values, such as manufacturer name and part number; drawing number, supplier references, and any other reference values of the item, are identified and captured. These are used to describe the item during purchasing.
  • Each item is assigned to a PPO class. This allows all items to be grouped, as defined by the dictionary, and to have corresponding templates assigned to items.
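The class-allocation step above can be sketched as keyword matching between the source description and candidate noun-modifier classes. The classes and trigger keywords below are hypothetical, not actual PPO-OTD content; returning no class at all mirrors the cataloguing rule that unclear items are queried rather than guessed:

```python
# Hypothetical noun-modifier classes keyed by trigger keywords (not actual PPO-OTD content)
CLASS_KEYWORDS = {
    "BOLT, HEX": {"BOLT", "HEX"},
    "VALVE, GATE": {"VALVE", "GATE"},
    "BEARING, BALL": {"BEARING", "BALL"},
}

def assign_class(description: str):
    """Pick the class whose keyword set best overlaps the description tokens; None if no hit."""
    tokens = set("".join(c if c.isalnum() else " " for c in description.upper()).split())
    best, best_score = None, 0
    for cls, keywords in CLASS_KEYWORDS.items():
        score = len(keywords & tokens)
        if score > best_score:
            best, best_score = cls, score
    return best

print(assign_class("GATE VALVE 6IN CL150 RF FLGD"))  # → VALVE, GATE
print(assign_class("UNKNOWN WIDGET"))                # → None (item held for a client query)
```

Once a class is assigned, the corresponding characteristic template can be attached to the item for value extraction.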

Advanced Structuring (Value extraction)

  • Allocation of PPO-OTD properties (template), extraction & population of attribute/property values and UOMs, and cleansing & structuring of free text, if any.
  • Data enrichment is performed using genuine reference data extracted from the source data, supplemented by external research in technical sources such as manufacturers' catalogs (PDF library) or the Internet (manufacturer websites, catalog cuts, etc.), together with cross-verification and cross-validation of the data captured during Advanced Structuring. Further cleansing occurs through validation, cross-referencing, and harmonization.
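The cross-verification of extracted data against a reference source can be sketched as a field-by-field comparison. The attribute names and values below are hypothetical examples, not drawn from any actual catalog:

```python
def cross_validate(extracted: dict, reference: dict):
    """Compare extracted attributes with a reference record; report conflicts and gaps."""
    conflicts = {k: (extracted[k], reference[k])
                 for k in extracted.keys() & reference.keys()
                 if extracted[k] != reference[k]}
    missing = {k: reference[k] for k in reference.keys() - extracted.keys()}
    return conflicts, missing

# Hypothetical example: data extracted from a source description vs. a manufacturer's catalog entry
extracted = {"VOLTAGE": "220V", "POWER": "5KW"}
reference = {"VOLTAGE": "230V", "POWER": "5KW", "FREQUENCY": "50HZ"}
conflicts, missing = cross_validate(extracted, reference)
print(conflicts)  # → {'VOLTAGE': ('220V', '230V')}
print(missing)    # → {'FREQUENCY': '50HZ'}
```

Conflicts would be resolved against the authoritative source, and the missing attributes are candidates for enrichment.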

Cataloging Rules

PiLog follows these cataloging rules:

  • Do not remove or delete any data provided by the client unless the data is duplicated. "Duplicated" in this context refers to the scenario where a word, concept, value, attribute, etc. is duplicated within a single description or text provided for an item.
  • Records are never deleted by PiLog; they are flagged as potential duplicates. It is the client's responsibility to verify and confirm whether items flagged as potential duplicates are indeed duplicates before removing them from the item master set.
  • Do not add extra values to client data unless researched from a source with integrity and authority. If PiLog adds values to a client's master data item, PiLog provides the source and authenticity of the added data.
  • If descriptions are incomplete, incorrect, or contain conflicting information, query the client before assigning a class or values. PiLog does not assign a class if the source description or information provided by the client is unclear; PiLog seeks additional information or a decision from the client, and records with pending queries are kept on hold until the query is resolved.
  • Electronic Data Verification (EDV) is the process whereby the source data received from the client is processed into the cataloguing system via the eOTD dictionary, where the correct item name and approved attribute template are linked and the data for the material item is populated into the template. Descriptions are then generated according to certain rules. There are different levels of cataloguing.

Data Quality Levels with samples:

  • Smart Consolidation
  • RPA (Robotic Process Automation) BOTs
  • Data Profiling and Analysis
  • iSPIR