Posts Tagged ‘6BI’

Full Stack Data Science. Next wave in incorporating AI into the corporation.

October 22, 2019

I like the concept of “Full Stack Data Science”, especially the way the author depicts it in the included graphic.

One thing I would like to point out is the recognition that the process is really a circle (as depicted) and not a spiral, or line.  What I mean by that, is the path does not close between what can be perceived as the beginning “Business goal” and the end “Use, monitor and optimize”.

The results of applying Data Science to business problems not only help solve these problems, but actually change the motivators that drive the seeking of solutions in the first place.  Business goals are usually held up as the ends with the lowest dependency gradient of any component of any complex enterprise architecture.  While this may be true at any point in time, the dependency is not zero.  Business goals themselves change over time and not just in response to changing economic, societal or environmental factors.  The technology used to meet these goals does itself drive changes to the business goals.

A party, whether person or organization, tends to do what it is capable of doing.  Technology gives it more activities to undertake and things to produce and consume, which then feed back to the goals that motivate it.

I think this article is one of the best I’ve seen in making that point.


Automation and the End of Human Wealth

January 15, 2019

Time is money. Well not really, but they do equate very nicely. A person’s wealth can be measured not only by how much money he or she controls, but by how much of their time can be used for activities not necessary just for survival. This time, freed up from mere survival activities, has always been used to create increasing wealth for humans. The increase in wealth creation accrues to both producers and consumers. Producers get wealthier by getting more money, and consumers get wealthier by getting more time.

Previously the march toward automation has created ever increasing wealth because some party has invented the latest automation and sold it to others, and another party has bought the automation and used it to free up more of their time. In the 6BI sense, “money” and “time” are the product and payment exchanged at arm’s length in the transaction.

The question we should ask now is, will we ever reach the point when there are simply no new wealth creating activities that humans can invent? A time when every activity that could have created new wealth for humans will already be performed by some form of automation. Could it be possible that at some point in time any invention, instead of being valuable to some human, will have no value and thus not be able to be exchanged for money?

If we ever do reach the point where additional automation can no longer drive the creation of wealth for humans because everything that humans could do for themselves will have already been automated, then there will be no advantage, or value, to the next invention. It simply will not be an innovation.

At that point in time, I believe the earth’s human population will crash or go into a period of slow negative growth. There will be no motivation to either invent or procreate. Human population will decrease as a product of reduced opportunities and consequently the influence of humans on the planet will decrease.

On the other hand, the robots and artificial intelligence that provide automation to humans, since they do not need to either invent or procreate, will increase in number and influence: in number because they will wear out more slowly than flesh and blood humans, and in influence because they will no longer be dependent on humans to improve their programming.

Because of the decrease in the number of unmet human needs, fewer software developers will be needed, for example. This decrease in unmet needs doesn’t necessarily mean humans will be more satisfied, just that there will be fewer and fewer value and wealth creating activities that they can perform for themselves.

If this happens, and there are substantially fewer humans, will there really be a lesser need for automation? What will happen when there is no longer any new human need or activity to be automated? Will robots and artificial intelligence continue to operate with humans eventually becoming less and less relevant to them? Will humans become even less aware of the means of automation? Are humans ultimately essential for the operation of automation, so that as human numbers drop, computing entities, the means of automation, will drop as well? Will automation itself be automated and operate without human intervention at all because any knowledge of how it works will eventually be lost to humans?

Will there be an ever increasing demand for resources such as electricity to keep a kind of “closed loop” automation going and going even though it has reached the point where automation’s added value to humans is at or near zero? Even more interesting, from a human perspective, what will happen when new wealth can no longer be created?

Identifying Motivators (“Why”)

February 22, 2018

This is the sixth and final post in this series about how to identify entities in data sources that can readily be classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Locations, Events and Motivators. The fifth post in the series (about Events, the “When” aspect) can be found at .

The Motivators BOC is probably the most nuanced and least understood BOC. I have earlier devoted an entire article to the metadata structure of motivators entitled “The Data Architecture of Business Plans”[i] which can be found at .

The Motivators BOC identifies Why things get produced and consumed by parties.  Concepts and objects in this BOC capture data about the ends, means, influencers and assessments that provide the reasons why parties exchanged things (products and money) at a particular time and place.  Ends and means are in general too abstract to be found in object names, but you will find names such as Strength, Weakness, Opportunity, Threat, and Key Performance Indicator (KPI) all of which are assessment elements.

Data element and data element collection names you may encounter that belong to the Motivators BOC include, but are not limited to, names in the following table[ii]. The list gives you a hint of what kind of names to look for in putting together a 6BI Analytic Schema for enabling your data to answer business questions.

In terms of identifying motivator data elements (i.e. attributes and columns) and motivator data element collections (i.e. entity types and tables) the most likely candidates are documents, or at least those objects that have the word Document in their name.  You need to consider documents because you will quite often find the means (missions and courses of action) of an enterprise described in document form, especially if the document name contains words such as Strategy/Strategic, Tactic, Enablement/Enabled, Directive, Policy or Rule.  The ends of an enterprise (visions and desired results) can also be described in a document, quite often having a name like Goal or Objective.

As mentioned in the post about the Things BOC[iii], a document can also be considered a type of thing, such as a definition, as when “the definition” is being assessed for accuracy.  However, if its purpose is to contain text that describes means or ends it also belongs to the Motivators BOC.  An event can also be a motivator, such as an Appeal or a Campaign.  But as was mentioned in the post on the Events BOC, events are primarily differentiated from other concepts and objects by their inclusion of a time data element, either a point in time or a duration.

Another source of motivators is reference data.  Reference data can describe business functions (see the post on the Activities BOC) and often determines choices that users make on user interfaces which then determine logic paths that an application will take when processing data and thus explain why certain results are derived.  Example data element and data element collection names that often become the basis of reference data management (RDM) include: Code, Type, Tag, Status and Class/Classification.  Often you may find these names in plural form as well.

So, if you are analyzing a legacy database and you come across a table with any of these words in its name, you need to study the content of the table and understand how the rows and columns of the table affect, or are designed to affect, the motivation for actions taken by the parties in the organization.
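As a minimal sketch of that first pass, the snippet below flags table names that contain one of the reference-data words listed above. The function name, the way names are split into words, and the sample table names are all illustrative assumptions, not part of any 6BI tooling. A match only tells you which tables to open and study; it does not decide whether the table is actually a motivator.

```python
import re

# Words the post lists as common in reference data names.
REFERENCE_WORDS = ("code", "type", "tag", "status", "class", "classification")

# Accept singular and simple plural forms (Code, Codes, Status, Statuses).
ACCEPTED = {word + suffix for word in REFERENCE_WORDS for suffix in ("", "s", "es")}

# Split names such as Order_Status, CustomerTypeCodes or PRODUCT_CLASS into words.
WORD_PATTERN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def reference_data_candidates(table_names):
    """Return the table names that contain a reference-data word."""
    return [
        name
        for name in table_names
        if any(word.lower() in ACCEPTED for word in WORD_PATTERN.findall(name))
    ]


# Hypothetical table names from a legacy schema.
tables = ["Order_Status", "CustomerTypeCodes", "Invoice", "PRODUCT_CLASS", "Typewriter"]
print(reference_data_candidates(tables))
```

Matching on whole words rather than substrings is a deliberate choice here, so that a name like Typewriter is not flagged just because it starts with Type.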

The Motivators BOC is especially relevant to the type of NoSQL database known as a document database, MongoDB being a prime example.  It is one thing to structure and access the data in a document store in an effective and efficient manner but, in terms of answering business questions, it is even more important to know what role the content of the document plays in the operation of the enterprise.  In other words, how does, or how should, the document provide the answer to “why” a business transaction took place between parties?

Another category of motivators deals with security and privacy, and sometimes is included in policies and procedures.  Names here include Authorization, Enforcement and Permission, among others.  The intersection between business motivation and security is ripe for further exploration.

This is the last post in this series.  I hope you will find them worthwhile and useful. To find each one just click the link in the first paragraph of each to take you to the previous one. The first in the series about the Parties BOC can be found at .

Thanks for reading them and best of luck in developing your 6BI Analytic Schemas.


[i] The title “The Data Architecture of Business Plans” is derived from the fact that Business Plans are the deliverable of the Motivation aspect (the “Why” interrogative) at the Business Management, or Conceptual perspective of the Zachman Framework for Enterprise Architecture.

[ii] As previously, I would like to thank Barry Williams and his excellent Database Answers website for providing many of the table name examples.


Identifying Events (“When”)

January 30, 2018

This is the fifth in a series of posts about how to identify entities in data sources that can readily be classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Locations, Events and Motivators.  Entity types in the Events BOC identify When production and consumption of things by parties occurs. The fourth post in the series (on Locations, the “Where” aspect) can be found at .

Concepts and objects in this BOC capture data about a point in time or the duration of time over which products or payments flow from one party to another, or when an enterprise carries out its work. Data element and data element collection names you may encounter that belong to the Events BOC include, but are not limited to, names in the following table[i]. The list gives you a hint of what kind of names to look for in putting together a 6BI Analytic Schema for enabling your data to answer business questions.

Events break down into two major sub-types: (1) Occurrence types, which include EventAlert, Notification, and Incident from the list above; and (2) Duration types, which include Year, Month, Week, Day, Hour, Minute, Second, Date and Time from the list.  Duration type entities, as no doubt is obvious, are units of time and can be used to aggregate facts in a star schema across a temporal hierarchy.  Occurrence types are more like things.  Instead of being produced and consumed, they occur.  That is, they are something that can be referred back to and that, in addition to any other properties they may have, always have an aspect of time or “when” about them.  This aspect is important for data analysis.
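A minimal sketch of that roll-up, assuming occurrence-type events are stored with a timestamp. The sample rows, the level names, and the count_by function are hypothetical, chosen only to show the same occurrences being counted at different levels of the time hierarchy.

```python
from collections import Counter
from datetime import datetime

# Hypothetical occurrence-type rows: (event name, when it occurred).
occurrences = [
    ("Incident", datetime(2018, 1, 30, 9, 15)),
    ("Incident", datetime(2018, 1, 30, 9, 45)),
    ("Notification", datetime(2018, 1, 30, 14, 5)),
    ("Incident", datetime(2018, 2, 2, 8, 0)),
]

# Each duration level is a strftime format that truncates a timestamp to that level.
LEVELS = {
    "year": "%Y",
    "month": "%Y-%m",
    "day": "%Y-%m-%d",
    "hour": "%Y-%m-%d %H",
}


def count_by(level, rows):
    """Count occurrences per unit of time at the requested level."""
    fmt = LEVELS[level]
    return Counter(when.strftime(fmt) for _, when in rows)


print(count_by("month", occurrences))
print(count_by("hour", occurrences))
```

The occurrences are the facts; the duration levels only decide how coarsely they are grouped.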

Unlike the other BOCs, the Events BOC has both dimensional and fact characteristics.  On the one hand, time is already defined into a hierarchy and is standard for everyone.  An hour is always an hour, sixty minutes; a minute is always a minute, sixty seconds; and so on.  On the other hand, event occurrences are things that happen and can be measured and compared.  They are data, not metadata as the hierarchy of time is.  Events happen and then they are over, but there can be much to learn from their having occurred. This BOC is conceived to capture important data about the perspectives of when something happens in your data.  These perspectives relate to when, not where, not who, not how, not why, not even what has happened, but when it happened, or will happen.

This BOC captures the characteristics of time that most influence results.  It is also important to understand how events differ from either locations or activities, two other previously covered BOCs, with which events are often confused.

A location is concrete.  It is a point in space, a place, even if that space is virtual. You can go away and come back to a location, and if most (not necessarily all) other factors are the same, or within tolerances, the location is still there.  Not so with an event.  An event, though all relevant data may be captured about it, once it occurs, is done and goes away forever.  Another instance of a particular class of events can subsequently occur, but each event is unique and has a time when it occurred.

Events and activities are closely related and co-dependent but are not the same.  Activities are event-driven.  They receive and react to events and create new events which are sent to other activities.  Each activity is an independent entity and can execute in parallel with other activities.  Coordination and synchronization is by means of events communicated between the activities.  Activities react to input events by changing state and creating output events[ii].

The important thing, from a 6BI perspective is that an event provides a temporal association for a result.  If the persons, places, products, locations, and motivators are known (or estimated) you still need to know when these aspects came together to create something of significance.

Another instance of the importance of the “When” aspect is in Big Data solutions.  Since system owners often cannot control when data becomes available to the solution, it is important to be able to record when each event occurs.  There could be literally millions of events in a short unit of time, and the time of occurrence is what allows their results to be uniquely identified and aggregated.

[i] I would like to thank Barry Williams and his excellent Database Answers website for providing many of the table name examples.

[ii] David Luckham, various writings.

6BI and Marketing Attribution

December 26, 2017

The Six Basic Interrogatives (6BI) can be used to analyze marketing attribution. In marketing, attribution is the assigning of credit to the interactions in the sequence of interactions which have led up to what is called a conversion[i].  A conversion is an action, or event which results in an action, that has value for the means of interaction, the campaign, which is seen to be the motivator of the visitor’s interactions and eventual conversion. The interactions take place through channels which, when associated with a campaign, are called touchpoints.

To pursue the most effective marketing strategy it is important to know which touchpoints, and in what sequences they occur, are the most likely to result in conversions.  A typical scoring system to assess these sequences of actions consists of assigning credit to the touchpoints in a sequence according to some attribution rule or rules.  There are several popular attribution rules in use across the field of marketing analytics.  These rules fall into three broad categories.[ii]

  • Single Source Attribution (Single Touch Interaction) models assign all the credit to one event, such as the last click, the first click or the last channel to show an ad. Simple or last-click attribution is widely considered less accurate than alternative forms of attribution as it fails to account for all contributing factors that led to a desired outcome.
  • Fractional Attribution (Multi-Touch Interaction) includes equal weights, customer credit, and multi-touch / curve models. Equal weight models give the same amount of credit to all events, while customer credit uses past experience and sometimes simply guesswork to allocate credit. Multi-touch assigns varying credit across all the touchpoints in set amounts.
  • Algorithmic Attribution uses statistical modeling and machine learning techniques to derive probability of conversion across all marketing touchpoints which can then be used to weight the value of each touchpoint preceding the conversion. Algorithmic attribution analyzes both converting and non-converting paths across all channels to determine probability of conversion. With a probability assigned to each touchpoint, the touchpoint weights can be aggregated by a dimension of that touchpoint (channel, campaign, interaction placement, visitor type, content type, etc.) to determine a total weight for that dimension.

Examples of each category of attribution model include the following:

Single Source Attribution[iii]

  • The Last Interaction model attributes 100% of the conversion value to the last channel with which the customer (or visitor) interacted before buying or converting.
  • The Last Non-Direct Click model ignores direct traffic and attributes 100% of the conversion value to the last channel that the customer clicked through from before buying or converting. Google Analytics uses this model by default when attributing conversion value in non-Multi-Channel Funnels reports.
  • The Last AdWords Click model attributes 100% of the conversion value to the most recent AdWords ad that the customer clicked before buying or converting.
  • The First Interaction model attributes 100% of the conversion value to the first channel with which the customer interacted.

Fractional Attribution[iii]

  • The Linear model gives equal credit to each channel interaction on the way to conversion.
  • The Time Decay model may be appropriate if the conversion cycle involves only a short consideration phase. This model is based on the concept of exponential decay and most heavily credits the touchpoints that occurred nearest to the time of conversion. The Time Decay model could have a half-life of 7 days, meaning that a touchpoint occurring 7 days prior to a conversion will receive 1/2 the credit of a touchpoint that occurs on the day of conversion. Similarly, a touchpoint occurring 14 days prior will receive 1/4 the credit of a day-of-conversion touchpoint.
  • The Position Based model allows you to create a hybrid of the Last Interaction and First Interaction models. Instead of giving all the credit to either the first or last interaction, you can split the credit between them. One common scenario is to assign 40% credit each to the first interaction and last interaction, and assign 20% credit to the interactions in the middle.
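The single source and fractional rules above are simple enough to sketch as weight functions. This is a minimal illustration, not how Google Analytics or any attribution provider implements them: the function names are hypothetical, each function returns one weight per touchpoint in path order (oldest first), and the weights are normalized to sum to 1. The handling of paths with only one or two touchpoints in position_based is my own assumption, since the description above does not cover it. The Last Non-Direct Click and Last AdWords Click rules are left out because they need channel information, not just position.

```python
def last_interaction(n):
    """All credit to the final touchpoint in a path of n touchpoints."""
    return [0.0] * (n - 1) + [1.0]


def first_interaction(n):
    """All credit to the first touchpoint."""
    return [1.0] + [0.0] * (n - 1)


def linear(n):
    """Equal credit to every touchpoint."""
    return [1.0 / n] * n


def time_decay(days_before_conversion, half_life_days=7.0):
    """Exponential decay: credit halves for every half-life before conversion."""
    raw = [0.5 ** (days / half_life_days) for days in days_before_conversion]
    total = sum(raw)
    return [weight / total for weight in raw]


def position_based(n, first=0.4, last=0.4):
    """Fixed shares for first and last; the remainder is split across the middle."""
    if n == 1:
        return [1.0]  # assumption: a lone touchpoint gets everything
    if n == 2:
        # assumption: with no middle, first and last share credit proportionally
        return [first / (first + last), last / (first + last)]
    middle = (1.0 - first - last) / (n - 2)
    return [first] + [middle] * (n - 2) + [last]


# Touchpoints 14, 7 and 0 days before conversion, 7-day half-life.
weights = time_decay([14, 7, 0])
print(weights)
print(position_based(4))
```

With a 7-day half-life the raw weights for 14, 7 and 0 days are 1/4, 1/2 and 1, matching the ratios in the Time Decay description; normalizing only rescales them so the credits for one conversion add up to the whole.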

Algorithmic Attribution[iv]

Algorithmic attribution is a more advanced way to model attribution data in order to most accurately represent the visitor interaction event flow.  Algorithms tend to be proprietary so what factors are considered in the algorithm and what weight each factor gets can vary by attribution provider.  However, the most accurate algorithmic attribution models use machine learning to intake vast amounts of data, all of the touchpoints, both historical and going forward, that went into closed-won deals, closed-lost deals, deals that fell apart at or before the opportunity stage, etc. to create enterprise specific models.

The algorithm then creates custom weights for each of your stages to represent how your visitors go through the funnel. It’s important to note that it should also use new data as you continue to engage prospects and close deals to refine and improve the model, which is the machine learning aspect.

The 6BI Analytics Schema in Figure 1 lays out the fundamental base entities that support marketing attribution.  This diagram also enumerates the process by which business value is extracted from that schema. Keep in mind this is a high level logical data model (LDM) and certainly not intended to be sufficient for generating database tables without far more domain specific modeling.

Figure 1.


From a 6BI perspective the Visitor is a type of Party because it represents “who” initiates the sequence of events.  Interaction and its sub-type Conversion are types of Events; they identify “when” an action takes place.  Credit, a type of Thing, more specifically a Thing of Value to the campaign, is “what” the action produces.  Attribution, a type of Activity, is “how” a credit is produced.  The Channel, a type of Location, is “where” the events occurred. The assumption as to “why” the visitor interacts and converts is that it is due to the influence of a Campaign, which is a type of Motivator.

The assigning of Campaign Credits to Campaign Channels is identified in Figure 1 by a series of five (5) steps.  This process begins with a Visitor performing a type of Interaction, through a Channel, which causes it, the Interaction, to become a Conversion.  The Conversion generates Attributions which, based on the application of an Attribution Rule produce Credits which are assigned to a Campaign. The use of a Channel by a Campaign identifies the Touchpoints which ultimately get evaluated based on how much Credit they produce for the Campaign.
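The flow from Conversion to Credit can be sketched in a few lines. This is a rough illustration under my own assumptions, not a rendering of the schema itself: a touchpoint is represented as a (campaign, channel) pair, a conversion as a path of touchpoints plus a conversion value, and the attribution rule as a function returning one weight per touchpoint. The names and sample values are hypothetical, and a simple equal-weight rule stands in for whatever rule you actually use.

```python
from collections import defaultdict


def equal_weight_rule(n):
    """Stand-in attribution rule: every touchpoint gets the same share."""
    return [1.0 / n] * n


def assign_credits(conversions, rule):
    """Apply an attribution rule to each conversion and total credit per touchpoint.

    conversions: iterable of (path, value), where path is a list of
    (campaign, channel) touchpoints in the order the visitor met them.
    """
    credits = defaultdict(float)
    for path, value in conversions:
        weights = rule(len(path))
        for touchpoint, weight in zip(path, weights):
            credits[touchpoint] += value * weight
    return dict(credits)


# Hypothetical conversions for one campaign across two channels.
conversions = [
    ([("Spring Sale", "Email"), ("Spring Sale", "Paid Search")], 100.0),
    ([("Spring Sale", "Paid Search")], 50.0),
]
print(assign_credits(conversions, equal_weight_rule))
```

Keying the totals by touchpoint keeps both the Campaign (the motivator) and the Channel (the location) available for later aggregation.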

To get the net benefit of attribution you need to capture the cost side as well. You need to know and use, in your assessments, not only the costs of applying the attribution rules, but the costs of channels, touchpoints, impressions and campaigns as well.  Not only do you need to determine how much influence, for example, your Paid Search feed had in generating conversions when it was the second touchpoint, but also the cost of the Paid Search feed service to your enterprise as a whole.[v]

The goal of attribution is to determine which touchpoints are producing a positive result, and, by using the cost of each touchpoint, an attribution system can then show which touchpoints are profitable. This allows optimization of marketing expenditures.[vi]
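As a minimal sketch of that last step, net benefit per touchpoint is simply attributed credit minus cost. The function name and all the figures below are hypothetical, and the sketch assumes a single cost figure per touchpoint, which glosses over the point in note [v] that cost may vary by position in the conversion flow.

```python
def net_benefit(credit_by_touchpoint, cost_by_touchpoint):
    """Attributed credit minus cost for every touchpoint seen on either side."""
    touchpoints = set(credit_by_touchpoint) | set(cost_by_touchpoint)
    return {
        tp: credit_by_touchpoint.get(tp, 0.0) - cost_by_touchpoint.get(tp, 0.0)
        for tp in touchpoints
    }


# Hypothetical attributed credit and cost, in the same currency.
credit = {"Paid Search": 400.0, "Email": 150.0, "Display": 50.0}
cost = {"Paid Search": 250.0, "Email": 20.0, "Display": 120.0}

result = net_benefit(credit, cost)
profitable = sorted(tp for tp, net in result.items() if net > 0)
print(result)
print(profitable)
```

A touchpoint with cost but no credit still appears in the result with a negative value, which is usually what you want to see when deciding where to cut spending.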


[i] Conversion is a generalized term for the desired result of a marketing effort. This can include other actions besides sales such as sign-ups, survey completions, favorable ratings, etc.




[v] The cost of a touchpoint might vary depending on whether it is first, last or some intermediate (assisting) interaction in the conversion event flow.


Identifying Locations (“Where”)

August 23, 2017

My apologies for the long delay since the last post in this series, but the real world got in the way, in the form of a job opportunity I just could not say “no” to.

This is the fourth in a series of posts about how to identify entities in data sources that can readily be classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Locations, Events and Motivators.  The third post in the series (on Activities, the “How” aspect) can be found at  Please note also that I changed the order of Locations and Events because I want to discuss Locations next and save until last what I consider to be the two most complex BOCs, Events and Motivators.

The Locations BOC identifies Where things get produced and consumed by parties.  Concepts and objects in this BOC capture data about, not only physical locations, but virtual locations as well.  Data element and data element collection names you may encounter that belong to the Locations BOC include, but are not limited to, names in the following table[i].  The list gives you a hint of what kind of names to look for in putting together a 6BI Analytic Schema for enabling your data to answer business questions. 

Usage often determines whether Location or Thing is the appropriate BOC for any given object or concept.  For example, Webpage, Portal, Database, and Dashboard are all objects that depend on context and can either be a thing or describe where a thing is located.  Also, Document, Source and Destination can designate places as well as being things.  The key to the Locations BOC is to include only those objects and/or concepts that actually refer to a Place or a Site, and not to just the description of the place or the site.  Sites and Addresses are synonymous and can be either physical or virtual.

Locations exist whether we use them or not.  However, it is usually, at least for business purposes, only when a member of one of the other BOCs, usually parties or things, is placed at a specific site when an activity or event occurs that locations become relevant. This can best be determined if you ask yourself how important it is to know where some activity took place, where an event occurred, or where some party or thing is located.  Does being located at one site as opposed to any other site make a difference?  Is the measurement of performance impacted by the site of one or more of the objects or concepts under consideration?  If the answer is “yes” within the context being considered then location is a factor to consider in our data analysis.

Locations are often nested hierarchies as in Country, Region, State, County, City, Street, Postal Code, Zone, Building, etc.  This hierarchy impacts the level of aggregation of the data.  The larger the scope of the location, or the further apart the sites are within a given level of the hierarchy, generally the more parties and things are included.  The more of these objects and concepts (parties and things) are included in the measurement the less likely the details of any one specific party or thing will influence the data at that level.  This is an application of the law of large numbers. This fact is one reason why it is important to be able to disaggregate your data when it is needed.  Enterprise performance is measured by buying and selling lots of things, but those large numbers are generated one item at a time as often as not.
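A small sketch of rolling the same facts up to different levels of a location hierarchy. The three-level hierarchy, the sample sales rows, and the roll_up function are hypothetical and much shorter than a real location dimension; the point is only that keeping the detail rows lets you aggregate to any level and drill back down when needed.

```python
from collections import defaultdict

# A shortened, hypothetical hierarchy, ordered from widest to narrowest scope.
HIERARCHY = ("country", "state", "city")

# Hypothetical detail rows: (location, sale amount).
sales = [
    ({"country": "US", "state": "NY", "city": "Albany"}, 120.0),
    ({"country": "US", "state": "NY", "city": "Buffalo"}, 80.0),
    ({"country": "US", "state": "CA", "city": "Fresno"}, 200.0),
]


def roll_up(rows, level):
    """Total the amounts at the given level, keyed by the full path down to it."""
    depth = HIERARCHY.index(level) + 1
    totals = defaultdict(float)
    for location, amount in rows:
        key = tuple(location[name] for name in HIERARCHY[:depth])
        totals[key] += amount
    return dict(totals)


print(roll_up(sales, "country"))
print(roll_up(sales, "state"))
print(roll_up(sales, "city"))
```

Using the full path as the key, rather than the level's name alone, avoids merging two different cities that happen to share a name.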

[i] I would like to thank Barry Williams and his excellent Database Answers website for providing many of the table name examples.



Identifying Activities (“How”)

May 24, 2017

This is the third in a series of posts about how to identify entities in data sources that can readily be classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Events, Locations and Motivators.  The second post in the series (on Things, the “What” aspect) can be found at .

The Activities BOC identifies How things get produced and consumed by parties. Concepts and objects in this BOC capture data about the means by which products or payments flow from one party to another, or how an enterprise carries out its work[i].  Data element and data element collection names you may encounter that belong to the Activities BOC include, but are not limited to, names in the following table[ii].  The list gives you a hint of what kind of names to look for in putting together a 6BI Analytic Schema for enabling your data to answer business questions.

An Activity is the most general super-type in this BOC, encompassing Function and Process[iii].  Functions are intended to describe how an organization’s mission is planned to be carried out, and Processes describe how the mission is made real.  In the design of the Analytic Schema, data that identify and describe functions are almost always used in the source system as a type of reference data and will typically be brought into the Analytic Schema as text data in dimension members.  The maintenance of this data should be under the control of Data Governance.  The governance of data is itself both a function and a process and as such its performance can also be measured.  If one were to design a schema for measuring the performance of the Data Governance function a hierarchical collection of its sub-functions would be identified.  As we will see, functions also play a significant role in the Motivators BOC, but that will come later.

In data modeling, I have observed that we will more often model processes than functions.  A process can be either a Business_Process or a System_Process, but in either case the “process” is how something gets done.  This is accomplished by transforming either concepts or objects (or both) into different states.  It is the contribution of this transformation toward some goal that we need to measure.  Keep in mind it is “how” something gets done that we are measuring here, not “what” gets done.  This is vitally important to analytics and business intelligence because there is a lot of potential gain in improving how something is done, even if what is produced (or consumed) remains unchanged.  For example, decreasing processing time, reducing waste and realigning responses to demand are all readily actionable.  For marketing purposes, how a product is produced or provided[iv] disappears into the product itself, and so is quite often overlooked as a separate factor in measurement.  In business systems quite often the names we look for to identify activities contain the word Transaction in some way.

Another feature of a process is that it transforms things, and these transformations usually take place via some Mechanism.  Mechanisms include Sales, Purchases, Receiving and Shipments.  A process can also be represented by a document such as a Request, an Order, an Invoice, a Manifest, or a Receipt.  It is the data about the transformation, perhaps recorded in a document, or perhaps not, that we want to measure.  We measure the impact on the parties and things participating in the transformation and not the parties and things themselves.  This is a subtle but important difference.  An activity’s quantities, values and description are the record of “How” the process produced a result.

An activity is often the source of one or more events, and an event is often the source of one or more activities, but activities and events are not exactly the same thing, and are not interchangeable. We will visit the Events BOC in a future post.

[i] David C. Hay, Data Model Patterns, A Metadata Map, 2006.

[ii] I would like to thank Barry Williams and his excellent Database Answers website for providing many of the table name examples.

[iii] David C. Hay, Data Model Patterns, A Metadata Map, 2006.

[iv]  The distinction between “produced” and “provided” is made to distinguish between, for example, manufacturing and retailing.

Identifying Things (“What”)

May 4, 2017

This is the second post in a series of posts about how to identify entities in data sources that can readily be classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Events, Locations and Motivators.  The first post in the series (on Parties, the “Who” aspect) can be found at .

The Things BOC identifies What the concepts and objects are that are produced and consumed by parties.  Data element and data element collection names you may encounter that belong to the Things BOC include but are not limited to names in the following table[i].  The list gives you a hint of what kind of names to look for in putting together a 6BI Analytic Schema for enabling your data to answer business questions.

For the purposes of 6BI the first decomposition of Things is between Product and Payment.  Products are also further decomposed into Good and Service.  Goods are tangible material products for which consumers make payments, and services are products provided primarily by human or human-like labor.  Products are also quite often hierarchical and the names used for each level are Things BOC names in their own right.  These names can be logical, such as Class, Category, and Type, or physical, such as Assembly, Component, and Container.  Look for these words, or ones like them, in conjunction with other words that more clearly designate them as classification levels of a product, such as Asset_Type or Vehicle_Assembly.  Products and Payments represent what is exchanged in a transaction and are differentiated by the direction in which they flow.  Products flow from provider to consumer, and Payments flow from consumer to provider.  Quite often the difference between a Product and a Payment is obvious, but sometimes it’s not.  This is especially true when transactions are “in kind” and it is not obvious which, if either thing, represents the “money”.  One rule of thumb is to always remember who the “first party”[ii] is in your analysis and which side of their ledger you are analyzing.  The first party is for “whom” the analysis is done or “who” the analysis is intended to benefit.  If you are analyzing their receivables side then the inflow is always a payment and the outflow a product.  If you are analyzing their payables side then the opposite is true: inflows are product types and outflows are payment types.
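The rule of thumb at the end of that paragraph fits in a small lookup. This is only a sketch of the rule as stated, with a hypothetical function name and hypothetical string labels for ledger side and flow direction; direction is always taken relative to the first party.

```python
def classify_thing(ledger_side, direction):
    """Classify an exchanged thing as 'product' or 'payment'.

    ledger_side: 'receivables' or 'payables', from the first party's books.
    direction:   'inflow' or 'outflow', relative to the first party.
    """
    rules = {
        ("receivables", "inflow"): "payment",
        ("receivables", "outflow"): "product",
        ("payables", "inflow"): "product",
        ("payables", "outflow"): "payment",
    }
    try:
        return rules[(ledger_side, direction)]
    except KeyError:
        raise ValueError(f"unknown combination: {ledger_side!r}, {direction!r}")


# The first party sells goods: what it ships out is the product,
# and what comes back in is the payment.
print(classify_thing("receivables", "outflow"))
print(classify_thing("receivables", "inflow"))
```

For “in kind” transactions the lookup still works, because it depends only on whose ledger you are reading and which way the thing flows, not on whether the thing looks like money.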

Potentially the Things BOC can be identified by more data store names than any other BOC because we as humans often designate all phenomena as things.  In information systems however it is always more useful to refer to the instances of the Things BOC as products or payments.  We use the term “Things” for this category of business objects so that we remember to look at both sides of “what” is being exchanged in a transaction, and not be content to only consider the product alone.  There are simply so many things in the real world but we must concentrate on “how” (see the Activities BOC post) they flow if we need to measure their value to a party and assess a party’s contribution to that value.

A Definition can also be a product. This is true when used to represent the meaning of that which parties (individuals and organizations) produce or consume.  It doesn’t matter what is defined.  It can be a party, a location, an activity, an event, a motivator or anything.  If the definition itself is manipulated (i.e. produced or consumed by a party) then it is a product, and thus a thing.  We can speak about the “Definition of the customer” for example.  Customer is clearly a member of the Parties BOC when it comes to analyzing data content for understanding performance for example.  But the definition itself (i.e. What a customer is) is a product of a metadata system.  If you need to analyze, normalize and rationalize the consistency of various definitions of customer you need to treat these definitions as things and not as parties.  That is, they are products of the system associated with a provider and a consumer.

The customer can have multiple definitions, but each separate definition must be associated with the customer through some unique combination of location, event, activity and/or motivator.  Those for whom the consistency checking and improving is performed are the parties.  However, and this is critical, the definition of what a customer is, maintained so that the term consistently means the same role played by a party for a given unique combination of activity, location, event, product, and motivator, is itself a product.  As a product, its quality can be controlled and monitored, its accuracy and integrity assessed and its use measured.

[i]  I would like to thank Barry Williams and his excellent Database Answers website for providing many of the table name examples.

[ii] The party from whose perspective the measurements are taken.


Identifying Parties (“Who”)

April 26, 2017

This is the first of a series about how to identify entities in data sources that can be readily classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Events, Locations and Motivators. I will start with Parties.

The Parties BOC (Business Object Category) identifies Who produces or consumes objects and concepts.  Examples of commonly used data element[i] and data element collection[ii] names you may encounter in diverse types of data stores, that can be categorized under the Parties BOC, include but are not limited to those in the table below.[iii] They contain names commonly used to identify entities (e.g. tables in an RDBMS and documents in a DDBMS) and attributes (e.g. columns in tables and fields in documents). The list gives you a hint of what kind of names to look for in putting together a 6BI Analytic Schema for enabling your data to answer business questions.

Telling providers from consumers is important because it tells you in which direction the product flows and in which direction the payment flows.  Knowing this provides the basis for measuring profitability and resource use efficiencies. Producing party names include Producer, Provider, Seller, Supplier, Broker and Vendor, for example. Examples of consuming party names include Consumer, Payer, Receiver, Buyer, Client and Customer.  A Parties BOC name can also identify third parties such as Obligee, Specialist and Agency.  Parties can be animate (Citizen, Patient) or inanimate (Bureau, Department). They can be collective (Organization) or singular (Person). These are some of the names you should look out for in analyzing source data systems. They can be names of data element collections (e.g. tables) and/or names of data elements (e.g. columns). Generally, they hold data about the object types within the Parties BOC which can then be used in answering the Who interrogative component of any query.  Parties often come in pairs: the parties that produce or provide, and the parties that consume or pay. Thus, it is important that when you discover one or more Parties BOC entities in your source data you ask the question, “Is this a producer or a consumer?”  Of course, a single entity can be both a producer and a consumer depending primarily on the type of thing being measured and the type of activity that causes it to need to be measured. If a “first-party” (i.e. the party from whose perspective the measurements are taken) pays money in the transaction it will be the “consumer” in the scope of the transaction being analyzed. Otherwise the party is the “producer” in some manner and its performance will be measured by some assessment of relative value between product outflow and money inflow. Quite often both providers and consumers will be assessed in some manner, usually by an Assessor, as a result of the measurements.
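As a rough illustration of asking “Is this a producer or a consumer?”, the name lists above can be turned into a simple lookup.  The function party_role and the underscore-splitting rule are assumptions made for this sketch; a name match is only a hint, and the real role still depends on the thing and activity being measured.

```python
# Name lists taken from the paragraph above, lower-cased for matching.
PRODUCER_NAMES = {"producer", "provider", "seller", "supplier", "broker", "vendor"}
CONSUMER_NAMES = {"consumer", "payer", "receiver", "buyer", "client", "customer"}
THIRD_PARTY_NAMES = {"obligee", "specialist", "agency"}


def party_role(name: str) -> str:
    """Guess the likely role of a Parties BOC table or column name (hypothetical helper)."""
    # Split names such as "Supplier_ID" into lower-case tokens.
    tokens = set(name.lower().replace("-", "_").split("_"))
    if tokens & PRODUCER_NAMES:
        return "producer"
    if tokens & CONSUMER_NAMES:
        return "consumer"
    if tokens & THIRD_PARTY_NAMES:
        return "third party"
    return "unknown"
```

For example, party_role("Supplier_ID") suggests a producer and party_role("Customer") a consumer, while a name such as "Invoice_Line" comes back as unknown and needs a human decision.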

With the advent of the Internet of Things (IoT) the Role that a device plays in producing and consuming data now means that the Parties BOC includes parties that are not traditionally human as well. As a matter of fact, entities that previously were only considered as part of the Things BOC can now be parties. The concept of a party has gone beyond the traditional definitions of people and organizations but it still, at the end of the day, remains the producer and consumer of data.

[i] Wikipedia defines any unit of data defined for processing as a data element. Since 6BI deals with the meaning of data elements it is generally non-productive to consider any unit smaller than an attribute or field.

[ii] A data element collection is a set of related attributes or fields. Depending on the degree of normalization they are usually co-located in a container called a table or a document.

[iii] I would like to thank Barry Williams and his excellent Database Answers website for providing many of the table name examples.

The 6BI Analytic Schema

April 10, 2017

6BI (Six Basic Interrogatives) was originally an adaptation of the Zachman Framework[i] to the design of data warehouse data stores.  The idea was based on the assumption that a large number of the business questions that always seem to need answering, regardless of industry, are based on Zachman’s primitive interrogatives. One combination or another of these interrogatives (Who, What, Where, When, How and Why) always seems to be needed in order to get the answers we are after in business intelligence.  These combinations consist of at least three of the interrogatives, two of which are always Who and What.  At least one and, as often as not, all four of the other interrogatives are needed to complete what we call the Analytic Schema.  An Analytic Schema is not designed to answer specific queries but instead is structured to represent all the basic aspects of an enterprise so that queries do not need to be known in any great detail in advance.
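The combination rule just described (at least three interrogatives, always including Who and What) can be expressed as a small check.  The function name is_valid_combination is an assumption made for this sketch.

```python
INTERROGATIVES = {"Who", "What", "Where", "When", "How", "Why"}
REQUIRED = {"Who", "What"}


def is_valid_combination(used) -> bool:
    """True if the interrogatives form a combination allowed by the rule above."""
    used = set(used)
    return (
        used <= INTERROGATIVES   # only the six basic interrogatives
        and REQUIRED <= used     # Who and What are always present
        and len(used) >= 3       # plus at least one of the other four
    )
```

Under this rule there are 15 possible combinations: Who and What together with any non-empty subset of Where, When, How and Why.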

Each interrogative is associated with an aspect of the enterprise.  In 6BI, aspects are called Business Object Categories (BOC) and they include Parties, Things, Locations, Events, Activities and Motivators.  Many books and articles have been written explaining and expanding the fundamental concept of enterprise aspects, including those by David Hay[ii], Dan Tasker[iii], as well as Zachman himself and others.  Also, see earlier posts and pages on this blog[iv].

The most effective, and recognizable, form of Analytic Schema is a star schema. Each dimension is classified by one and only one BOC; in other words, the aspects determine the type of each dimension.  This classification of dimensions into BOCs already begins to tell us the role each aspect will play in queries built from the schema.  Each BOC is represented by at least one dimension.  The number of actual dimensions classified under each BOC does not matter but it is important to know the type of each dimension.  All the dimensions of all types link to the same set of facts.  Figure 1 shows what a 6BI Analytic Schema looks like.

Figure 1. The 6BI Analytic Schema.
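A minimal sketch of the two classification rules (each dimension belongs to exactly one BOC, and every BOC has at least one dimension) is shown here.  The dimension names are hypothetical and are not taken from Figure 1.

```python
BOCS = ("Parties", "Things", "Locations", "Events", "Activities", "Motivators")

# Hypothetical dimensions. A dict key can hold only one value,
# so each dimension is classified by one and only one BOC.
dimension_boc = {
    "dim_customer": "Parties",
    "dim_supplier": "Parties",
    "dim_product": "Things",
    "dim_store": "Locations",
    "dim_date": "Events",
    "dim_sales_channel": "Activities",
    "dim_promotion": "Motivators",
}


def missing_bocs(mapping):
    """Return the BOCs that have no dimension classified under them."""
    covered = set(mapping.values())
    return [boc for boc in BOCS if boc not in covered]
```

An empty result from missing_bocs(dimension_boc) means every BOC is represented; the number of dimensions under each BOC is not checked because, as noted above, it does not matter.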

The Analytic Schema is a conceptual schema and real world star schemas quite often do not explicitly represent the conceptual nature of 6BI but are physical schemas designed to support a set of queries addressing specific business requirements.  However, it is not generally too difficult to parse and sort the attributes of star schema dimensions into the six categories needed for 6BI.  Difficulties occur with dimension attributes that tend to defy categorization.  For example, is a certain attribute a component of the When or the How interrogative?  This is especially true when we consider the real world information trade-offs that are often needed between events and activities for effective decision making.  Does the event drive the activity or does the activity drive the event?  Something more than just correlation is often needed.  Also what about the motivators?

In 6BI Parties and Things have no direct dependency on each other and have no non-intermediated association with each other.  From a data modeling perspective, this means that a Party does not reference (i.e. has no foreign key to) a Thing, and a Thing does not reference (i.e. has no foreign key to) a Party.  This is very important, mainly because of the implicit role of the Who interrogative.  It is “the Who”[v] of the Analytic Schema for which answers need to be found.  Parties are responsible for the results using Things, but those results only have value if placed in a framework of Locations, Events, Activities and Motivators.  In the semantics of queries built on the Analytic Schema the subject is most often one or more Parties and the object (either direct or indirect) is always one or more Things.  Predicates are composed of the other BOCs and the Facts.  6BI Analytic Schemas can be a framework for creating queries that are a type of logical proposition called a Quadruple because they are always composed of four parts: subject, direct object, indirect object and predicate.
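To make the “no foreign key between Party and Thing” rule concrete, here is a small sketch using Python's built-in sqlite3 module.  The table names, column names and sample rows are all hypothetical; the point is only that Party and Thing meet through the fact table and nowhere else.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Party and Thing carry no foreign keys to each other.
CREATE TABLE party    (party_id    INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE thing    (thing_id    INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE location (location_id INTEGER PRIMARY KEY, name TEXT);

-- The fact table is the only place where a Party is associated with a Thing.
CREATE TABLE fact_sale (
    party_id    INTEGER REFERENCES party(party_id),
    thing_id    INTEGER REFERENCES thing(thing_id),
    location_id INTEGER REFERENCES location(location_id),
    amount      REAL
);

INSERT INTO party    VALUES (1, 'Acme Retail');
INSERT INTO thing    VALUES (1, 'Widget');
INSERT INTO location VALUES (1, 'Store 12');
INSERT INTO fact_sale VALUES (1, 1, 1, 25.0), (1, 1, 1, 15.0);
""")

# Who (party) and What (thing), framed by Where (location), measured by the facts.
rows = conn.execute("""
    SELECT p.name, t.name, l.name, SUM(f.amount)
    FROM fact_sale AS f
    JOIN party    AS p ON p.party_id    = f.party_id
    JOIN thing    AS t ON t.thing_id    = f.thing_id
    JOIN location AS l ON l.location_id = f.location_id
    GROUP BY p.name, t.name, l.name
""").fetchall()
```

Only one of the four framing BOCs (Locations) is included to keep the sketch short; Events, Activities and Motivators would join to the facts in the same way.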

6BI was originally used to do “logical reverse engineering” of existing operational OLTP databases to sort tables into a set of buckets with each bucket representing one of the BOCs.  The thinking was that underneath all the differences that make each organization and its data unique there is a common logical structure upon which an analyst can start the analysis of the problem space.  This can be a great benefit if one needs to “hit the ground running” so to speak.  My experience (25 years and counting) has shown this to be the case.  This was useful when my goal was to identify candidate tables in data sources to frame out the Analytic Schema.  Knowing which tables belonged to which BOC was always helpful because it allowed me to see what role the records in this type of table might play in query development.  For example, I would know that tables classified as BOC Location tables were the most likely to have information about where customers and products were located, and where transactions occurred.  It also identifies the “where” dimension of the facts to use for doing such things as calculating the relationship between location and time in, for example, analyzing the optimum combination of store location and store hours.

This “logical reverse engineering” can not only be applied to tables in a relational database (RDBMS), but to documents in a document database (DDBMS), key-value pairs in a key-value store, and two-level maps in a column-family database as well.
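The bucket-sorting step of “logical reverse engineering” might look something like the following sketch.  The keyword hints are illustrative assumptions only; a real hint list would be built from the names found in your own sources, and ambiguous names are deliberately left for a person to review.

```python
from collections import defaultdict

# Hypothetical keyword hints per BOC, for illustration only.
BOC_HINTS = {
    "Parties": ("customer", "supplier", "vendor", "employee"),
    "Things": ("product", "payment", "asset"),
    "Locations": ("store", "address", "region"),
    "Events": ("date", "calendar", "shift"),
    "Activities": ("order", "shipment", "delivery"),
    "Motivators": ("promotion", "goal", "policy"),
}


def bucket_tables(table_names):
    """Sort table (or document) names into BOC buckets by keyword hints."""
    buckets = defaultdict(list)
    for table in table_names:
        lowered = table.lower()
        matches = [
            boc for boc, hints in BOC_HINTS.items()
            if any(hint in lowered for hint in hints)
        ]
        # Names matching zero or several BOCs are set aside for human review.
        key = matches[0] if len(matches) == 1 else "Unclassified"
        buckets[key].append(table)
    return dict(buckets)
```

A name such as Customer_Address matches both Parties and Locations, so it lands in the Unclassified bucket rather than being forced into one category.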

Next, we will look at how to align random data structures (tables and documents) with the BOCs of the 6BI Analytic Schema.





[v] Not the British rock band.