Archive for April, 2017

Identifying Parties (“Who”)

April 26, 2017

This is the first of a series about how to identify entities in data sources that can be readily classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Events, Locations and Motivators. I will start with Parties.

The Parties BOC identifies Who produces or consumes objects and concepts. The table below[iii] lists examples of data element[i] and data element collection[ii] names, found in diverse types of data stores, that can be categorized under the Parties BOC; the list is not exhaustive. These names commonly identify entities (e.g. tables in an RDBMS and documents in a DDBMS) and attributes (e.g. columns in tables and fields in documents). The list gives you a hint of what kind of names to look for when putting together a 6BI Analytic Schema that enables your data to answer business questions.

Telling providers from consumers is important because it tells you in which direction the product flows and in which direction the payment flows. Knowing this provides the basis for assessing profitability and resource use efficiency. Producing party names include Producer, Provider, Seller, Supplier, Broker and Vendor. Consuming party names include Consumer, Payer, Receiver, Buyer, Client and Customer. A Parties BOC name can also identify third parties such as Obligee, Specialist and Agency. Parties can be animate (Citizen, Patient) or inanimate (Bureau, Department). They can be collective (Organization) or singular (Person).

These are some of the names you should look out for when analyzing source data systems. They can be names of data element collections (e.g. tables), data elements (e.g. columns), or both. Generally, they hold data about the object types within the Parties BOC, which can then be used to answer the Who interrogative component of any query.

Parties often come in pairs: the parties that produce or provide, and the parties that consume or pay. Thus, when you discover one or more Parties BOC entities in your source data, it is important to ask the question, “Is this a producer or a consumer?” Of course, a single entity can be both a producer and a consumer, depending primarily on the type of thing being measured and the type of activity that causes it to need to be measured. If a “first party” (i.e. the party from whose perspective the measurements are taken) pays money in the transaction, it is the “consumer” in the scope of the transaction being analyzed. Otherwise the party is the “producer” in some manner, and its performance will be measured by some assessment of relative value between product outflow and money inflow. Quite often both providers and consumers will be assessed in some manner, usually by an Assessor, as a result of the measurements.
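As a rough illustration of the producer-or-consumer question, the sketch below tags a table or column name with a party role. The role names come from the lists above; the function name `party_role`, the underscore-based tokenizing and the naive plural handling are assumptions made for this example, not part of 6BI.

```python
from typing import Optional

# Role names taken from the post; the matching rule is an assumption.
PRODUCER_NAMES = {"producer", "provider", "seller", "supplier", "broker", "vendor"}
CONSUMER_NAMES = {"consumer", "payer", "receiver", "buyer", "client", "customer"}
THIRD_PARTY_NAMES = {"obligee", "specialist", "agency"}


def party_role(name: str) -> Optional[str]:
    """Return 'producer', 'consumer' or 'third_party' for a table or
    column name, or None when no Parties BOC name is recognized."""
    tokens = name.lower().replace("-", "_").split("_")
    for token in tokens:
        # Naive plural handling: also try the token without a trailing "s".
        singular = token[:-1] if token.endswith("s") else token
        for candidate in (token, singular):
            if candidate in PRODUCER_NAMES:
                return "producer"
            if candidate in CONSUMER_NAMES:
                return "consumer"
            if candidate in THIRD_PARTY_NAMES:
                return "third_party"
    return None


# Hypothetical source-system names, used only to show the call.
for example in ("supplier_id", "Customers", "agency_code", "order_total"):
    print(example, "->", party_role(example))
```

A name match is only a hint. As noted above, the same entity can be a producer in one transaction and a consumer in another, so the result still needs an analyst's review.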

With the advent of the Internet of Things (IoT), the role that a device plays in producing and consuming data means that the Parties BOC now includes non-human parties as well. As a matter of fact, entities that previously were considered only part of the Things BOC can now be parties. The concept of a party has gone beyond the traditional definitions of people and organizations, but at the end of the day a party remains a producer or consumer of data.

[i] Wikipedia defines any unit of data defined for processing as a data element. Since 6BI deals with the meaning of data elements it is generally non-productive to consider any unit smaller than an attribute or field.

[ii] A data element collection is a set of related attributes or fields. Depending on the degree of normalization they are usually co-located in a container called a table or a document.

[iii] I would like to thank Barry Williams and his excellent Database Answers website for providing many of the table name examples.

The 6BI Analytic Schema

April 10, 2017

6BI (Six Basic Interrogatives) was originally an adaptation of the Zachman Framework[i] to the design of data warehouse data stores. The idea was based on the assumption that a large number of the business questions that always seem to need answering, regardless of industry, are based on Zachman’s primitive interrogatives. One combination or another of these interrogatives (Who, What, Where, When, How and Why) always seems to be needed in order to get the answers we are after in business intelligence. These combinations consist of at least three of the interrogatives, two of which are always Who and What. At least one and, as often as not, all four of the other interrogatives are needed to complete what we call the Analytic Schema. An Analytic Schema is not designed to answer specific queries but instead is structured to represent all the basic aspects of an enterprise so that queries do not need to be known in any great detail in advance.

Each interrogative is associated with an aspect of the enterprise.  In 6BI, aspects are called Business Object Categories (BOC) and they include Parties, Things, Locations, Events, Activities and Motivators.  Many books and articles have been written explaining and expanding the fundamental concept of enterprise aspects, including those by David Hay[ii], Dan Tasker[iii], as well as Zachman himself and others.  Also, see earlier posts and pages on this blog[iv].

The most effective, and recognizable, form of Analytic Schema is a star schema. Each dimension is classified by one and only one BOC; in other words, the aspects determine the type of each dimension. This classification of dimensions into BOCs already begins to tell us the role each aspect will play in queries built from the schema. Each BOC is represented by at least one dimension. The number of actual dimensions classified under each BOC does not matter, but it is important to know the type of each dimension. All the dimensions of all types link to the same set of facts. Figure 1 shows what a 6BI Analytic Schema looks like.
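The two rules stated above (one BOC per dimension, at least one dimension per BOC) can be sketched as a small check. This is a minimal sketch, not a definitive implementation: the names `BOC`, `Dimension` and `missing_bocs` are hypothetical, and the pairing of each BOC with an interrogative is inferred from the posts rather than quoted from them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class BOC(Enum):
    # Interrogative pairing inferred from the posts (an assumption).
    PARTIES = "Who"
    THINGS = "What"
    LOCATIONS = "Where"
    EVENTS = "When"
    ACTIVITIES = "How"
    MOTIVATORS = "Why"


@dataclass(frozen=True)
class Dimension:
    name: str
    boc: BOC  # a single field, so each dimension has exactly one BOC


def missing_bocs(dimensions: Iterable[Dimension]) -> List[BOC]:
    """Return the BOCs that no dimension in the star schema represents."""
    covered = {dimension.boc for dimension in dimensions}
    return [boc for boc in BOC if boc not in covered]


# Hypothetical dimensions of a retail star schema.
dims = [
    Dimension("dim_customer", BOC.PARTIES),
    Dimension("dim_supplier", BOC.PARTIES),
    Dimension("dim_product", BOC.THINGS),
    Dimension("dim_store", BOC.LOCATIONS),
    Dimension("dim_date", BOC.EVENTS),
    Dimension("dim_channel", BOC.ACTIVITIES),
]
print(missing_bocs(dims))  # this example has no Motivators dimension
```

Modeling the BOC as a single field on the dimension makes the one-and-only-one rule structural, while coverage of all six BOCs is left as an explicit check.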

Figure 1. The 6BI Analytic Schema.

The Analytic Schema is a conceptual schema, and real-world star schemas quite often do not explicitly represent the conceptual nature of 6BI but are physical schemas designed to support a set of queries addressing specific business requirements. However, it is not generally too difficult to parse and sort the attributes of star schema dimensions into the six categories needed for 6BI. Difficulties occur with dimension attributes that tend to defy categorization. For example, is a certain attribute a component of the When or the How interrogative? This is especially true when we consider the real-world information trade-offs that are often needed between events and activities for effective decision making. Does the event drive the activity or does the activity drive the event? Something more than just correlation is often needed. Also, what about the motivators?

In 6BI, Parties and Things have no direct dependency on each other and no non-intermediated association with each other. From a data modeling perspective, this means that a Party does not reference (i.e. has no foreign key to) a Thing, and a Thing does not reference (i.e. has no foreign key to) a Party. This is very important, mainly because of the implicit role of the Who interrogative. It is “the Who”[v] of the Analytic Schema for which answers need to be found. Parties are responsible for the results using Things, but those results only have value if placed in a framework of Locations, Events, Activities and Motivators. In the semantics of queries built on the Analytic Schema, the subject is most often one or more Parties and the object (either direct or indirect) is always one or more Things. Predicates are composed of the other BOCs and the Facts. 6BI Analytic Schemas can be a framework for creating queries that are a type of logical proposition called a Quadruple, because they are always composed of four parts: subject, direct object, indirect object and predicate.
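The no-foreign-key rule between Parties and Things lends itself to a mechanical check. The sketch below assumes you already have a mapping of tables to BOCs and a list of foreign-key pairs; the function name `direct_party_thing_links` and all table names are hypothetical.

```python
from typing import Dict, List, Tuple

# The only direct references the rule forbids, in either direction.
FORBIDDEN = {("Parties", "Things"), ("Things", "Parties")}


def direct_party_thing_links(
    table_boc: Dict[str, str],
    foreign_keys: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return the (referencing, referenced) table pairs in which a Party
    table points directly at a Thing table, or the other way round."""
    return [
        (source, target)
        for source, target in foreign_keys
        if (table_boc.get(source), table_boc.get(target)) in FORBIDDEN
    ]


# Hypothetical schema: the fact table intermediates every association.
table_boc = {
    "customer": "Parties",
    "product": "Things",
    "store": "Locations",
    "sales_fact": "Facts",
}
foreign_keys = [
    ("sales_fact", "customer"),
    ("sales_fact", "product"),
    ("sales_fact", "store"),
    ("product", "customer"),  # direct Thing-to-Party reference
]
print(direct_party_thing_links(table_boc, foreign_keys))
```

In this example only the `product` to `customer` reference is reported; the links that pass through `sales_fact` are intermediated and therefore allowed.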

6BI was originally used to do “logical reverse engineering” of existing operational OLTP databases to sort tables into a set of buckets with each bucket representing one of the BOCs.  The thinking was that underneath all the differences that make each organization and its data unique there is a common logical structure upon which an analyst can start the analysis of the problem space.  This can be a great benefit if one needs to “hit the ground running” so to speak.  My experience (25 years and counting) has shown this to be the case.  This was useful when my goal was to identify candidate tables in data sources to frame out the Analytic Schema.  Knowing which tables belonged to which BOC was always helpful because it allowed me to see what role the records in this type of table might play in query development.  For example, I would know that tables classified as BOC Location tables were the most likely to have information about where customers and products were located, and where transactions occurred.  It also identifies the “where” dimension of the facts to use for doing such things as calculating the relationship between location and time in, for example, analyzing the optimum combination of store location and store hours.
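The bucket sorting described above could start from something as simple as the sketch below. The keyword hints are purely illustrative assumptions (only a few of the Parties names come from this blog), and `sort_into_buckets` is a hypothetical helper, so treat the output as a first pass for an analyst to review rather than a classification.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Illustrative keyword hints per BOC; every list here is an assumption.
BOC_HINTS = {
    "Parties": ("customer", "supplier", "vendor"),
    "Things": ("product", "item"),
    "Locations": ("store", "address", "region"),
    "Events": ("date", "calendar"),
    "Activities": ("order", "shipment", "payment"),
    "Motivators": ("promotion", "goal"),
}


def sort_into_buckets(table_names: Iterable[str]) -> Dict[str, List[str]]:
    """Place each table name in the first BOC bucket whose hint appears in
    the name, or in 'Unclassified' when no hint matches."""
    buckets = defaultdict(list)
    for table in table_names:
        lowered = table.lower()
        boc = next(
            (
                name
                for name, hints in BOC_HINTS.items()
                if any(hint in lowered for hint in hints)
            ),
            "Unclassified",
        )
        buckets[boc].append(table)
    return dict(buckets)


# Hypothetical OLTP table names.
print(sort_into_buckets(["Customer", "Product", "Store_Address", "Audit_Log"]))
```

The first matching BOC wins, so the order of `BOC_HINTS` matters. Tables that land in `Unclassified`, or that match hints from more than one BOC, are the ones worth examining by hand.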

This “logical reverse engineering” can be applied not only to tables in a relational database (RDBMS), but also to documents in a document database (DDBMS), key-value pairs in a key-value store, and two-level maps in a column-family database.

Next, we will look at how to align arbitrary data structures (tables and documents) with the BOCs of the 6BI Analytic Schema.





[v] Not the British rock band.