Back to ‘ABCD’ of Data: Master, Golden, Reference and Metadata

Data management becomes straightforward once you are clear about the kinds of data you are dealing with.

In this blog, we will discuss the classification of data and describe its various categories (reporting, transactional, master, golden, reference, metadata, unstructured and big data).

Categories of Data

Categories of Data in Data Management Field

This classification of data is commonly used in the data management field.

In the past, we have seen this classification drawn as a stack, a pyramid, or even a diamond, as represented here; regardless of the shape, the list of items remains the same. Now, let’s have a look at these data categories.

Various Data Categories

Transactional Data

Transactional data records business events. It typically makes up the largest volume of data managed in the enterprise. Examples of business events are:

  • selling products to customers,
  • buying products from suppliers,
  • shipping items to customer sites,
  • hiring employees, managing their vacations, or changing their positions when they are promoted.

We manage transactional data every day, and it is handled in operational applications such as CRM, ERP, SCM, and HR.

Master Data

Master data is the information that supports the transactions: it describes the customers, products, parts, employees, materials, suppliers, sites, etc. involved in them.

Master data is usually built and used in the normal course of operations by existing business processes. But these operational processes are designed around an “application-specific” use of the master data, so they fail to deliver the high quality standards and common governance that the enterprise needs across applications.

Reference Data

Reference data is data used to categorize master data or to relate it to information outside the business, such as customer segments, business processes, countries, and zip codes. It is a non-volatile, slow-moving subset of master data.

Some reference data is universal and/or standardized (for example, country codes in ISO 3166-1). Other reference data may be “agreed on” within the enterprise or within a given business domain.

Reference data is often considered a subset of master data, and the combined category is then referred to as Master Reference Data.
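To make the relationship between these three categories concrete, here is a minimal sketch in Python. The class names, field names, and sample values are made up for illustration; they are assumptions, not part of any particular system. A transaction points to a master record, which in turn points to a reference code.

```python
from dataclasses import dataclass
from datetime import date

# Reference data: a small, slow-moving lookup (ISO 3166-1 alpha-2 codes).
COUNTRIES = {"DE": "Germany", "FR": "France", "JP": "Japan"}


@dataclass(frozen=True)
class Customer:
    """Master data: a business entity (hypothetical fields)."""
    customer_id: str
    name: str
    country_code: str  # points into the reference data


@dataclass(frozen=True)
class SalesOrder:
    """Transactional data: one business event (hypothetical fields)."""
    order_id: str
    customer_id: str  # points into the master data
    order_date: date
    amount: float


# Made-up sample records, for illustration only.
customers = {"C001": Customer("C001", "Acme GmbH", "DE")}
order = SalesOrder("SO-1001", "C001", date(2024, 1, 15), 250.0)


def describe(order, customers, countries):
    """Resolve a transaction through master data down to reference data."""
    customer = customers[order.customer_id]
    country = countries[customer.country_code]
    return f"{order.order_id}: {customer.name} ({country}), {order.amount:.2f}"


print(describe(order, customers, COUNTRIES))
```

Notice that the order itself stores only identifiers. The customer name and the country name live in one place each, which is why the quality of master and reference data affects every transaction that uses them.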

Reporting Data

Reporting data is data organized for the purpose of reporting and business intelligence. Data for both operational reporting and business intelligence is created from transactional data, master data, and master reference data.
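As a rough illustration of how reporting data is derived, the sketch below aggregates a few transactions by customer segment. The identifiers, segments, and amounts are invented sample values, and the function name is hypothetical.

```python
from collections import defaultdict

# Made-up sample data, for illustration only.
# Master data enriched with reference data: customer -> segment.
customer_segment = {"C001": "Enterprise", "C002": "SMB"}

# Transactional data: (customer_id, order amount).
orders = [("C001", 250.0), ("C002", 40.0), ("C001", 100.0)]


def revenue_by_segment(orders, customer_segment):
    """Summarize transactions by a master/reference attribute."""
    totals = defaultdict(float)
    for customer_id, amount in orders:
        totals[customer_segment[customer_id]] += amount
    return dict(totals)


report = revenue_by_segment(orders, customer_segment)
print(report)
```

The report contains no new facts. It is a reorganized view of the transactional data, grouped by attributes that come from master and reference data.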

Metadata

Metadata is data that describes other data; it is the underlying definition or description of data, like the descriptions found in databases, configuration files, and log files. Examples of metadata include the properties of media files, software applications, documents, spreadsheets, and web pages: size, type, resolution, author, and creation date. Master data, reference data, and log data all have related metadata.
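File properties are an easy way to see metadata in practice. The following sketch uses only the Python standard library to read a few properties of a file; the `file_metadata` helper and the sample file are assumptions made for this example.

```python
import mimetypes
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def file_metadata(path: Path) -> dict:
    """Collect a few properties that describe the file, not its content."""
    stat = path.stat()
    mime_type, _ = mimetypes.guess_type(path.name)  # guessed from the extension
    return {
        "name": path.name,
        "size_bytes": stat.st_size,
        "type": mime_type,
        "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
    }


# Create a throwaway sample file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "notes.txt"
    sample.write_text("hello", encoding="utf-8")
    meta = file_metadata(sample)

print(meta)
```

The content of the file is the word "hello"; everything in `meta` is data about that data.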

Big Data

Big data has many different definitions and is a bit of a buzzword. What makes data ‘big’ is the 3Vs: volume, variety, and/or velocity. Deriving insights from it often relies on machine learning and AI. Big data cannot be effectively maintained with traditional technology, and it usually combines several other types of data: log data, transactional data, reference data, and master data.

Unstructured Data

Unstructured data is data that does not have a predefined structure. It refers mainly to text data, such as social media posts, emails, white papers, or help chats, which is difficult to categorize and often ends up as part of big data.

Golden Data

Golden data is the cleansed, de-duplicated, consolidated, and validated version of the original master data. Some people call it the “Single Version of the Truth” or the “360° View” of products, employees, or sites. Key points of golden data:

  • All data are relevant,
  • Only valid information - No incorrect addresses or bouncing emails,
  • No duplicates.

Golden data is used extensively by BI, operational, and other applications.
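The steps behind golden data (cleanse, validate, de-duplicate, consolidate) can be sketched in a few lines of Python. This is a simplified illustration, not a real master data management implementation: the records are made up, the email check is a deliberately loose pattern, and matching duplicates on the normalized email address is a simplifying assumption.

```python
import re

# Made-up raw master records from two hypothetical source systems.
raw_customers = [
    {"source": "CRM", "name": "Acme GmbH ", "email": "Info@Acme.example", "phone": None},
    {"source": "ERP", "name": "ACME GmbH", "email": "info@acme.example", "phone": "+49 30 1234567"},
    {"source": "CRM", "name": "Globex", "email": "not-an-email", "phone": None},
]

# Loose sanity check only; real email validation is more involved.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def cleanse(record):
    """Normalize whitespace in the name and the case of the email."""
    return {
        **record,
        "name": " ".join(record["name"].split()),
        "email": record["email"].strip().lower(),
    }


def build_golden(records):
    """Validate, de-duplicate, and consolidate cleansed records."""
    golden = {}
    for record in map(cleanse, records):
        if not EMAIL_RE.match(record["email"]):
            continue  # validation: skip records with an invalid email
        key = record["email"]  # de-duplication key (simplifying assumption)
        merged = golden.setdefault(
            key,
            {"name": record["name"], "email": key, "phone": None, "sources": []},
        )
        # Consolidation: keep the first non-empty value seen for each field.
        merged["phone"] = merged["phone"] or record["phone"]
        merged["sources"].append(record["source"])
    return list(golden.values())


golden_customers = build_golden(raw_customers)
print(golden_customers)
```

In this sketch the two Acme records collapse into one golden record that takes its phone number from the ERP system, and the record with the invalid email is left out. In practice, invalid records are more often routed to a data steward for correction than silently dropped.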
