Back to ‘ABCD’ of Data: Master, Golden, Reference and Metadata
Data management is straightforward once you have clarity about the kinds of data you are dealing with.
In this blog, we discuss how data is classified and describe its main categories: reporting, transactional, master, golden, reference, metadata, unstructured, and big data.
Categories of Data
This is a classification of data that is commonly used in the data management field.
In the past, we have seen this classification drawn as a stack, a pyramid, or even a diamond, as represented here; regardless of the shape, the list of items remains the same. Now, let’s have a look at these data categories.
Various Data Categories
Transactional Data
Transactional data records business events. It is typically the largest volume of data managed in the enterprise. Examples of business events include:
- selling products to customers,
- buying products from suppliers,
- shipping items to customer sites,
- hiring employees, managing their vacations, or changing their positions during promotion.
Transactional data is created every day and is handled in operational applications such as CRM, ERP, SCM, and HR systems.
Master Data
Master data is the information that supports transactions: it describes the customers, products, parts, employees, materials, suppliers, sites, and so on that are involved in them.
Master data is usually created and used in the normal course of operations by existing business processes. However, these operational processes are built around an “application-specific” use of the master data, so on their own they typically fail to deliver the high quality standards and common governance that the enterprise needs across applications.
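To make the relationship between the two categories concrete, here is a minimal sketch in Python. The class names, field names, and sample values are all hypothetical and chosen only for illustration; the point is that a transaction records an event and refers to master data by key rather than copying it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Customer:
    """Master data: a business entity that many transactions refer to."""
    customer_id: str
    name: str


@dataclass(frozen=True)
class Product:
    """Master data: another entity involved in transactions."""
    product_id: str
    description: str


@dataclass(frozen=True)
class SalesOrder:
    """Transactional data: one business event, pointing at master data by key."""
    order_id: str
    order_date: date
    customer_id: str
    product_id: str
    quantity: int


# Hypothetical master data, keyed by identifier.
customers = {"C001": Customer("C001", "Acme Ltd")}
products = {"P100": Product("P100", "Widget")}


def describe(order: SalesOrder) -> str:
    """Resolve the master data behind a transaction into a readable summary."""
    customer = customers[order.customer_id]
    product = products[order.product_id]
    return (
        f"{order.order_date.isoformat()}: {customer.name} "
        f"bought {order.quantity} x {product.description}"
    )


order = SalesOrder("SO-1", date(2024, 1, 15), "C001", "P100", 3)
summary = describe(order)
print(summary)
```

If each application kept its own copy of the customer inside its order records, the copies could drift apart, which is the “application-specific” problem described above.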
Reference Data
Reference data is data used to categorize master data or to relate it to information outside the business, such as customer segments, business processes, countries, and zip codes. Reference data is a non-volatile, slow-moving subset of master data.
Some reference data is universal and/or standardized (for example, country codes in ISO 3166-1). Other reference data may be “agreed on” within the enterprise or within a given business domain.
Because reference data is often considered a subset of master data, this data category is sometimes referred to as Master Reference Data.
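As a rough illustration of how reference data is used to categorize and validate master data, consider the sketch below. The country table is a tiny hand-picked subset of ISO 3166-1 alpha-2 codes, and the segment list and the `validate_customer` function are hypothetical stand-ins for values a real enterprise would agree on.

```python
# Standardized reference data: a small subset of ISO 3166-1 alpha-2 codes.
COUNTRY_CODES = {
    "DE": "Germany",
    "FR": "France",
    "IN": "India",
    "US": "United States of America",
}

# "Agreed on" reference data: hypothetical segments defined inside the enterprise.
CUSTOMER_SEGMENTS = {"RETAIL", "WHOLESALE"}


def validate_customer(record: dict) -> list:
    """Return a list of problems found when checking a master record
    against the reference data. An empty list means the record passed."""
    errors = []
    if record.get("country") not in COUNTRY_CODES:
        errors.append(f"unknown country code: {record.get('country')!r}")
    if record.get("segment") not in CUSTOMER_SEGMENTS:
        errors.append(f"unknown customer segment: {record.get('segment')!r}")
    return errors


good = validate_customer({"name": "Acme Ltd", "country": "DE", "segment": "RETAIL"})
bad = validate_customer({"name": "Globex", "country": "XX", "segment": "RETAIL"})
print(good)
print(bad)
```

Because these lookup tables change rarely, they can be shared by every application that handles customer records.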
Reporting Data
Reporting data is data organized for the purpose of reporting and business intelligence. Data for both operational reporting and business intelligence is derived from transactional data, master data, and master reference data.
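A simple way to picture this is an aggregation that joins transactions to master data and groups the result by a reference attribute. The sketch below uses made-up orders, customers, and a hypothetical `revenue_by_country` function; a real reporting pipeline would normally run in a data warehouse or BI tool rather than in plain Python.

```python
from collections import defaultdict

# Hypothetical master data; the country value is a reference data code.
customers = {
    "C001": {"name": "Acme Ltd", "country": "DE"},
    "C002": {"name": "Globex", "country": "US"},
}

# Hypothetical transactional data.
orders = [
    {"order_id": "SO-1", "customer_id": "C001", "amount": 120.0},
    {"order_id": "SO-2", "customer_id": "C002", "amount": 80.0},
    {"order_id": "SO-3", "customer_id": "C001", "amount": 50.0},
]


def revenue_by_country(orders, customers):
    """Aggregate order amounts per customer country."""
    totals = defaultdict(float)
    for order in orders:
        country = customers[order["customer_id"]]["country"]
        totals[country] += order["amount"]
    return dict(totals)


report = revenue_by_country(orders, customers)
print(report)
```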
Metadata
Metadata is data that describes other data; it is the underlying definition or description of data, such as the descriptions found in databases, configuration files, and log files. Examples of metadata include the properties of media files, software applications, documents, spreadsheets, and web pages: their size, type, resolution, author, and creation date. Master data, reference data, and log data all have related metadata.
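For a file, some of this metadata can be read straight from the file system. The sketch below uses Python's standard `pathlib` module; the `file_metadata` function and the sample file name are illustrative assumptions, and properties such as author or resolution are not covered because they live inside specific file formats.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def file_metadata(path):
    """Collect basic file-system metadata that describes a file."""
    p = Path(path)
    info = p.stat()
    return {
        "name": p.name,
        "type": p.suffix.lstrip(".") or "unknown",
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
    }


# Create a throwaway file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "notes.txt"
    sample.write_text("hello", encoding="utf-8")
    meta = file_metadata(sample)

print(meta)
```

Note that the content of the file ("hello") is the data, while its name, type, size, and modification time are the metadata.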
Big Data
Big data has many different definitions and is a bit of a buzzword. What makes data ‘big’ is usually described by the 3Vs: volume, variety, and/or velocity. Deriving insights from it often calls for machine learning and AI. Big data cannot be managed effectively with traditional technology, and it typically combines several other types of data: log data, transactional data, reference data, and master data.
Unstructured Data
Unstructured data is data that does not have a predefined structure. The term refers mainly to text data, such as social media posts, emails, white papers, and help chats. Such data is difficult to categorize and often ends up as part of big data.
Golden Data
Golden data is a cleansed, de-duplicated, consolidated, and validated version of the original master data. Some people call it the “Single Version of the Truth” or the “360° View” of products, employees, and sites. Key points of golden data:
- All data are relevant,
- Only valid information - No incorrect addresses or bouncing emails,
- No duplicates.
Golden data is used extensively by BI, operational, and other applications.
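The cleansing, validation, and de-duplication steps can be sketched in a few lines of Python. Everything here is a simplifying assumption: the records are invented, the email pattern only checks the basic shape of an address (it cannot tell whether an email would actually bounce), and matching duplicates on the normalized email is just one possible rule. Real master data management tools use far richer matching and survivorship logic.

```python
import re

# Deliberately simple shape check: something@something.something
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalize(record):
    """Cleanse a raw record: collapse whitespace, fix casing."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
    }


def build_golden(records):
    """Return one validated record per distinct email address."""
    golden = {}
    for raw in records:
        rec = normalize(raw)
        if not EMAIL_RE.match(rec["email"]):
            continue  # validation: drop records with a malformed email
        golden.setdefault(rec["email"], rec)  # de-duplication: first one wins
    return list(golden.values())


# Hypothetical raw master data collected from different applications.
raw_customers = [
    {"name": "jane  doe", "email": "Jane.Doe@Example.com "},
    {"name": "Jane Doe", "email": "jane.doe@example.com"},
    {"name": "John Roe", "email": "not-an-email"},
]

golden_customers = build_golden(raw_customers)
print(golden_customers)
```

Here the two Jane Doe entries collapse into one record and the entry with the malformed email is dropped, leaving a single golden record.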
Confusion between Master Data and Reference Data: You are not Alone
In this blog, we discuss a major misconception that arises when dealing with data: the confusion between Master Data and Reference Data. Most will tell you that reference data is a subset of master data, and it is, sort of. But...
Why Every Retailer needs a PIM Strategy?
When companies are running an omnichannel business, they need to create a cohesive business that combines offline and online channels into a unified brand identity. And to manage the data, responsibilities, and multiple channels, you need a PIM strategy.