Posted by Karthikeyan Sankaran
August 23rd, 2011

In my experience as a BI practitioner, analytical systems are often built without a clear conceptual framework for the informational capability they are meant to enable. In my humble view, an orientation towards ‘thinking by business datasets’, together with an artifact like the Business Dataset Bus Matrix (BDBM) illustrated in this blog, will help in connecting the dots between business data and the knowledge gleaned from that data. The focus, as always, is on helping organizations make better business decisions.

What is a Business Dataset?
A business dataset is a self-contained collection of data for a particular business process or entity. Examples include Point of Sales Data, Purchase Orders, Customers, Products and Chart of Accounts. As these examples show, business datasets can be transaction oriented (business processes) or master data (entities). A business dataset is processed from raw transactional data (held in the tables of a relational data store) and can also include external or syndicated data.

Why is a Business Dataset important from an analytical standpoint?
A collection of business datasets defines the analytical DNA of an organization. Each dataset represents a particular process or entity and is typically pre-processed from raw transactional data. The datasets are combined in logically relevant scenarios, and each scenario provides insights into a particular aspect of the business. The processing and governance of these datasets are therefore very important, and BI systems should be architected with this in mind. In my experience, the number of datasets in an organization can range from 25 (small firms) to 80 (large firms).

It is important to realize that business datasets are not intended to replace traditional data warehouses or data marts. They are in fact derived from data warehouses or other data repositories, with the sole purpose of being more amenable to analytics.

How do we organize the Business Datasets in an organization?
I propose the creation of an artifact called the ‘Business Dataset Bus Matrix’ (BDBM), similar to the Dimensional Bus Matrix. The BDBM maps each functional area to the informational capability it requires, which is in turn mapped to specific business questions. Each row of the BDBM contains one business question, while the master and transactional datasets are shown in the columns. A tick-mark is placed at the intersection of a row (business question) and a column (business dataset) wherever that dataset is required to enable the informational capability.
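
To make the layout described above concrete, here is a minimal Python sketch of how a BDBM could be represented. This is only an illustration under stated assumptions, not a prescribed implementation. The dataset names come from the examples earlier in this post; the class `BdbmRow`, the helper functions, and the two sample rows (functional areas, capabilities and business questions) are hypothetical and invented purely for demonstration.

```python
from dataclasses import dataclass

# Column headers of the matrix: dataset names from the examples in this post.
MASTER_DATASETS = ["Customers", "Products", "Chart of Accounts"]
TRANSACTION_DATASETS = ["Point of Sales Data", "Purchase Orders"]
ALL_DATASETS = MASTER_DATASETS + TRANSACTION_DATASETS


@dataclass(frozen=True)
class BdbmRow:
    """One BDBM row: a single business question and the datasets it needs."""
    functional_area: str
    capability: str
    business_question: str
    required_datasets: frozenset


# Hypothetical rows, invented for illustration only.
ROWS = [
    BdbmRow("Sales", "Sales performance analysis",
            "Which products sell best per customer segment?",
            frozenset({"Point of Sales Data", "Customers", "Products"})),
    BdbmRow("Procurement", "Spend analysis",
            "How much do we spend on each product?",
            frozenset({"Purchase Orders", "Products", "Chart of Accounts"})),
]


def tick_matrix(rows, datasets):
    """Return one list of booleans per row; True corresponds to a tick-mark."""
    return [[d in row.required_datasets for d in datasets] for row in rows]


def questions_using(rows, dataset):
    """List the business questions that depend on a given dataset."""
    return [r.business_question for r in rows if dataset in r.required_datasets]


def render(rows, datasets):
    """Render each row as text, listing the ticked datasets at the end."""
    lines = []
    for row, ticks in zip(rows, tick_matrix(rows, datasets)):
        marks = ", ".join(d for d, t in zip(datasets, ticks) if t)
        lines.append(f"{row.functional_area} | {row.capability} | "
                     f"{row.business_question} | {marks}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(ROWS, ALL_DATASETS))
```

Reading the matrix column-wise, as `questions_using` does, shows which business questions depend on a given dataset, which may help when prioritizing the processing and governance of the most widely used datasets.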

Thanks for reading. Please do let me know your thoughts.
