Cross-mapping Business Data Elements with Physical Data Structures

September 12, 2018

Cross-mapping[i] business data elements (BDEs) from requirements or business glossaries all the way through to the physical data structures that store these data elements can be tricky.  This process is made easier if a logical data model (LDM) is implemented between the business side and the technology model upon which the physical implementation is based.

In this article I will show two styles of cross-mapping between the BDEs of the business glossary and the data storage structures defined in the physical data model (PDM).  The two styles are called “Column Style” mapping and “Cell Style” mapping.

Column Style can apply to the cross-mapping of BDEs with both base tables[ii] and reference tables.  Cell Style mapping takes Column Style a step further for reference data and maps the individual values of a reference BDE to individual cells, or row-column coordinates, in the database.  If each reference value needs to be implemented in the technology separately, a Cell Style mapping is needed.

Also some BDEs are atomic in that there is no inherent structure in the business data element.  I call these Simple BDEs.  Some BDEs have an inherent structure which depends on the scope of their business definition.  These structures are often in the form of a container and its contents.  I call these Compound BDEs.
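To make the distinction concrete, here is a minimal Python sketch of how a glossary entry and its Column Style mappings might be recorded.  The class names, the table and column names, and the "Accounting Standard" BDE are all assumptions made for illustration; they do not come from any particular glossary tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ColumnRef:
    """A physical column, identified by table and column name."""
    table: str
    column: str


@dataclass
class BusinessDataElement:
    """A glossary entry: one part for a Simple BDE, several for a Compound BDE."""
    number: int
    parts: tuple
    columns: list = field(default_factory=list)  # Column Style mappings

    @property
    def name(self) -> str:
        return ".".join(self.parts)

    @property
    def is_compound(self) -> bool:
        return len(self.parts) > 1


def columns_to_bdes(bdes):
    """Invert the mapping so it can be read from technology back to business."""
    index = {}
    for bde in bdes:
        for col in bde.columns:
            index.setdefault(col, []).append(bde.name)
    return index


# Hypothetical glossary entries and column names, for illustration only.
cash = BusinessDataElement(
    number=1,
    parts=("Current Assets", "Cash"),
    columns=[ColumnRef("balance_sheet", "cash_amount"),
             ColumnRef("balance_sheet", "cash_currency")],
)
standard = BusinessDataElement(
    number=600,
    parts=("Accounting Standard",),
    columns=[ColumnRef("accounting_standards", "name")],
)

reverse = columns_to_bdes([cash, standard])
print(cash.name, cash.is_compound)
print(reverse[ColumnRef("accounting_standards", "name")])
```

Keeping both the forward list on each BDE and the inverted index is one way to support mapping in both directions, from business to technology and back.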

Some examples of each style follow.  Figure 1 shows an example of a Column Style mapping of a Simple BDE[iii] to a single Column of a single base Table in the database.  Using the terms “Table” and “Column” gives the impression that these mapping styles apply only to relational databases.  This is not true.  The concepts here can apply to “NoSQL” databases as well, such as Key-Value Stores, Column Family Databases and Document Databases.  These types of databases have been referred to as Aggregate-Oriented Databases.[iv]

Figure 1.

Figure 2 shows a Column Style mapping of a Simple BDE to a single Column of a single reference Table in the database.  Note that because this is a reference data mapping, the valid values of the reference BDE are not specifically mapped to any structure or data in the database.  In this case the valid reference values may be defined and maintained in the business glossary in an appropriate manner, but there is no explicit mapping link between them and anything in the data model or database.  Also note that I am not saying that the reference values are not in the database.  They almost certainly are, since the applications would presumably not function properly if they were not.  But the values themselves (“Unknown”, “IFRS”, “US GAAP”, etc.) are not considered BDEs.

Figure 2.

Figure 3 is a Column Style mapping of a Compound BDE to multiple Columns of a base Table in the database.  Note here that the BDE maps to two Columns in the same Table.  It could, of course, map to only one Column, or to many more Columns in many more Tables.  The point of this example is not how many times the BDE maps to various Columns but that it is a Compound BDE.  “Current Assets.Cash”[v] really identifies two business concepts: “Current Assets”, which is a section of a financial statement, and “Cash”, which is the most liquid kind of asset.  In terms of a financial statement, cash is a type of current asset.  For whatever reason, probably because cash itself can occur in multiple contexts, the business found it necessary to identify the financial statement use of cash in its own BDE.  Cash could also, for example, be a type of transaction, which would be another BDE.

Figure 3.

In Figure 4 we have an example of Cell Style mapping.  Here, the same Compound BDE that was used in Figure 3, “Current Assets.Cash”, is broken apart into its two separate parts, and each is mapped with a different Column in a different Table in the database.  Both of these are reference Tables.  One contains the names of the types of financial statement templates used by the system (financial_statement_templates), to which “Current Assets” is mapped.  The other contains the names of the types of lines on those financial statements (financial_statement_line_templates), to which “Cash” is mapped.  It is unlikely that “Current Assets” would be mapped to multiple database Columns, but because of the multiple meanings of “Cash” there could be multiple mappings between it and multiple database Columns.

Figure 4.
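The Cell Style mapping described for Figure 4 could be recorded roughly as follows.  This is only a sketch: the two table names come from the example, but the column name `name`, the row keys, and the helper `map_compound_bde` are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellRef:
    """A single cell: one column of one specific row in a reference table."""
    table: str
    column: str
    row_key: int  # primary key of the row holding the value (assumed)


# Table names come from the example; column names and row keys are assumed.
cell_mappings = {
    "Current Assets": [CellRef("financial_statement_templates", "name", 3)],
    "Cash": [CellRef("financial_statement_line_templates", "name", 17)],
}


def map_compound_bde(name, mappings, separator="."):
    """Split a Compound BDE into its parts and look up the cells for each part."""
    return {part: mappings.get(part, []) for part in name.split(separator)}


result = map_compound_bde("Current Assets.Cash", cell_mappings)
for part, cells in result.items():
    for cell in cells:
        print(f"{part} -> {cell.table}.{cell.column} (row {cell.row_key})")
```

Each part maps to a list of cells rather than a single cell, so a word with several meanings, such as “Cash”, can carry more than one mapping.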

Note that Cell Style mapping requires mapping to the data content of the database and not just its structure, or metadata.  This needs to be taken on with great caution and requires a consensus for how the words (e.g. “Cash”) used in the glossary are conveyed to and used in the Transaction Activity Data[vi] of the database to determine the actions of any application accessing the data.  Cash is simple, but any time you rely on reading a string to determine a code path you create an external dependency in the code.  However, if the business changes the name of a Cell Style mapped BDE there should be a documented path to where and how that value is used by any application that accesses it in the database.  This fact alone makes any system more maintainable and sustainable.
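One way to picture the "documented path" is a small register that records where each Cell Style mapped value is stored and which applications branch on it.  The sketch below is hypothetical: the register layout, the application names, and the function `rename_impact` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellUsage:
    """Documents one place where a Cell Style mapped value is stored and read."""
    bde_value: str  # the glossary word, e.g. "Cash"
    table: str
    column: str
    used_by: str    # application or module that branches on the value


# Hypothetical usage register; application names are invented for illustration.
usages = [
    CellUsage("Cash", "financial_statement_line_templates", "name", "statement_renderer"),
    CellUsage("Cash", "financial_statement_line_templates", "name", "liquidity_report"),
    CellUsage("Current Assets", "financial_statement_templates", "name", "statement_renderer"),
]


def rename_impact(register, old_value):
    """Return every documented storage location and consumer of a glossary value."""
    hits = [u for u in register if u.bde_value == old_value]
    return {
        "cells": sorted({(u.table, u.column) for u in hits}),
        "applications": sorted({u.used_by for u in hits}),
    }


impact = rename_impact(usages, "Cash")
print(impact["cells"])
print(impact["applications"])
```

If the business renames “Cash”, a lookup like this lists the tables to update and the applications to retest.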

 

 

[i] The word “cross-mapping” is used to imply that the mapping goes both ways, from business to technology, and from technology to business.  This allows the documentation of the system to more accurately describe the entire enterprise architecture by creating valid and traceable links back and forth between the business architecture and the technical architecture.  By creating an interface between the two aspects of the enterprise architecture, business people can more confidently make decisions based on the data they see, knowing it is reliably stored and processed, and technical people can more confidently know that the data they collect, manipulate and provide does in fact represent what the business side intends it to be.

[ii] I use the term “base table” here to refer to the realization of the bottom four of the six layers of data described by Malcolm Chisholm in his article “What is Master Data” originally published February 6, 2008.  http://www.b-eye-network.com/view/6758 .  These include Enterprise Structure Data, Transaction Structure Data, Transaction Activity Data and Transaction Audit Data.

[iii] The numbers (128, 600 and 1) that follow the name of each BDE in the examples are purely arbitrary and are there only to indicate that BDEs should be numbered in the catalog for indexing purposes.

[iv] See Pramod J. Sadalage and Martin Fowler, NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence, Addison-Wesley, 2013.

[v] Note the “.” between Current Assets and Cash.  This is just to make it easier to tell that the BDE is compound.  Don’t count on always having a separator between the parts of a Compound BDE.  The glossary tool may simply not support it, or the in-depth analysis to identify the two parts may not have been made.  As a data modeler you may have to do this analysis yourself.

[vi] Please refer to note ii above.

 

The Verge: Google may add Windows 10 dual-boot option to Chromebooks

August 31, 2018

The Verge: Google may add Windows 10 dual-boot option to Chromebooks.
https://www.theverge.com/2018/8/13/17682902/google-windows-10-dual-boot-chromebooks-support-campfire

 

Android Police: Google posts new Duplex demo to show how Assistant will identify itself on the phone

June 28, 2018

Android Police: Google posts new Duplex demo to show how Assistant will identify itself on the phone.
https://www.androidpolice.com/2018/06/27/google-posts-new-duplex-demoq-show-assistant-will-identify-phone/

Boing Boing: Garbage In, Garbage Out: machine learning has not repealed the iron law of computer science

May 30, 2018

Boing Boing: Garbage In, Garbage Out: machine learning has not repealed the iron law of computer science.
https://boingboing.net/2018/05/29/gigo-gigo-gigo.html


Everything you knew about Chromebooks is wrong | Computerworld

May 29, 2018

https://www.computerworld.com/article/3276329/chrome-os/everything-you-knew-about-chromebooks-is-wrong.html

Google’s Chrome OS gets new app muscle with built-in Linux

May 9, 2018

https://www.cnet.com/news/googles-chrome-os-and-chromebooks-get-new-app-muscle-with-built-in-linux/

 

‘Atlas’ 4K Chromebook may be one of the first Chrome OS ‘detachables’

April 7, 2018

‘Atlas’ 4K Chromebook may be one of the first Chrome OS ‘detachables’ http://google.com/newsstand/s/CBIwjuqUjzg

Google Chromebooks fight malware, get security experts’ approval – CNET

March 29, 2018

https://www.cnet.com/news/how-google-chromebooks-became-the-go-to-laptop-for-security-experts/

Identifying Motivators (“Why”)

February 22, 2018

This is the sixth and final post in this series about how to identify entities in data sources that can readily be classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Locations, Events and Motivators. The fifth post in the series (about Events, the “When” aspect) can be found at https://birkdalecomputing.com/2018/01/30/identifying-events-when/ .

The Motivators BOC is probably the most nuanced and least understood BOC. I have earlier devoted an entire article to the metadata structure of motivators, entitled “The Data Architecture of Business Plans”[i], which can be found at https://birkdalecomputing.com/6bi-home/the-data-architecture-of-business-plans/ .

The Motivators BOC identifies Why things get produced and consumed by parties.  Concepts and objects in this BOC capture data about the ends, means, influencers and assessments that provide the reasons why parties exchanged things (products and money) at a particular time and place.  Ends and means are in general too abstract to be found in object names, but you will find names such as Strength, Weakness, Opportunity, Threat, and Key Performance Indicator (KPI) all of which are assessment elements.

Data element and data element collection names you may encounter that belong to the Motivators BOC include, but are not limited to, names in the following table[ii]. The list gives you a hint of what kind of names to look for in putting together a 6BI Analytic Schema for enabling your data to answer business questions.

In terms of identifying motivator data elements (i.e. attributes and columns) and motivator data element collections (i.e. entity types and tables), the most likely candidates are documents, or at least those objects that have the word Document in their name.  You need to consider documents because you will quite often find the means (missions and courses of action) of an enterprise described in document form, especially if the document name contains words such as Strategy/Strategic, Tactic, Enablement/Enabled, Directive, Policy or Rule.  The ends of an enterprise (visions and desired results) can also be described in a document, quite often having a name like Goal or Objective.

As mentioned in the post about the Things BOC[iii], a document can also be considered a type of thing, such as a definition, as when “the definition” is being assessed for accuracy.  However, if its purpose is to contain text that describes means or ends, it also belongs to the Motivators BOC.  An event can also be a motivator, such as Appeal and Campaign.  But as was mentioned in the post on the Events BOC, events are primarily differentiated from other concepts and objects by their inclusion of a time data element, either a point in time or a duration.

Another source of motivators is reference data.  Reference data can describe business functions (see the post on the Activities BOC) and often determines choices that users make on user interfaces, which then determine logic paths that an application will take when processing data and thus explain why certain results are derived.  Example data element and data element collection names that often become the basis of reference data management (RDM) include: Code, Type, Tag, Status and Class/Classification.  Often you may find these names in plural form as well.

So, if you are analyzing a legacy database and you come across a table with any of these words in its name, you need to study the content of the table and understand how the rows and columns of the table affect, or are designed to affect, the motivation for actions taken by the parties in the organization.
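A first pass over a legacy schema can be automated with a simple keyword scan.  The following Python sketch uses the keywords mentioned above; the table names are hypothetical, and the plural handling is deliberately naive.  A hit only flags a table for closer study; it does not prove the table holds motivator data.

```python
import re

# Keywords taken from the text above; the table names below are hypothetical.
MOTIVATOR_KEYWORDS = [
    "Document", "Strategy", "Strategic", "Tactic", "Enablement", "Enabled",
    "Directive", "Policy", "Rule", "Goal", "Objective",
    "Code", "Type", "Tag", "Status", "Class", "Classification",
]


def split_name(name):
    """Break snake_case or CamelCase names into lowercase words."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name).replace("_", " ")
    return spaced.lower().split()


def motivator_hits(table_name):
    """Return the keywords found in a table name, allowing simple plural forms."""
    words = set(split_name(table_name))
    hits = []
    for keyword in MOTIVATOR_KEYWORDS:
        k = keyword.lower()
        plurals = {k, k + "s", k + "es", k[:-1] + "ies" if k.endswith("y") else k}
        if words & plurals:
            hits.append(keyword)
    return hits


tables = ["customer_status", "PricingPolicies", "order_line", "document_types"]
candidates = {t: motivator_hits(t) for t in tables if motivator_hits(t)}
print(candidates)
```

Matching on whole words rather than substrings avoids false hits such as "Tag" inside "Stage".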

The Motivators BOC is especially relevant to the type of NoSQL database known as a document database, MongoDB being a prime example.  It is one thing to structure and access the data in a document store in an effective and efficient manner but, in terms of answering business questions, it is even more important to know what role the content of the document plays in the operation of the enterprise.  In other words, how does, or how should, the document explain “why” a business transaction took place between parties?

Another category of motivators deals with security and privacy, and sometimes is included in policies and procedures.  Names here include Authorization, Enforcement and Permission, among others.  The intersection between business motivation and security is ripe for further exploration.

This is the last post in this series.  I hope you have found the posts worthwhile and useful. To find each one, just click the link in the first paragraph of a post to go to the previous one. The first in the series, about the Parties BOC, can be found at https://birkdalecomputing.com/2017/04/26/identifying-parties/ .

Thanks for reading them and best of luck in developing your 6BI Analytic Schemas.

 

[i] The title “The Data Architecture of Business Plans” is derived from the fact that Business Plans are the deliverable of the Motivation aspect (the “Why” interrogative) at the Business Management, or Conceptual perspective of the Zachman Framework for Enterprise Architecture.

[ii] As previously, I would like to thank Barry Williams and his excellent Database Answers website http://www.databaseanswers.org/data_models/ for providing many of the table name examples.

[iii] https://birkdalecomputing.com/2017/05/04/identifying-things/

Identifying Events (“When”)

January 30, 2018

This is the fifth in a series of posts about how to identify entities in data sources that can readily be classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Locations, Events and Motivators.  Entity types in the Events BOC identify When production and consumption of things by parties occurs. The fourth post in the series (on Locations, the “Where” aspect) can be found at https://birkdalecomputing.com/2017/08/23/identifying-locations/ .

Concepts and objects in this BOC capture data about a point in time or the duration of time over which products or payments flow from one party to another, or when an enterprise carries out its work. Data element and data element collection names you may encounter that belong to the Events BOC include, but are not limited to, names in the following table[i]. The list gives you a hint of what kind of names to look for in putting together a 6BI Analytic Schema for enabling your data to answer business questions.

Events break down into two major sub-types: (1) Occurrence types, which include EventAlert, Notification, and Incident from the list above; and (2) Duration types, which include Year, Month, Week, Day, Hour, Minute, Second, Date and Time from the list.  Duration type entities, as no doubt is obvious, are units of time and can be used to aggregate facts in a star schema across a temporal hierarchy.  Occurrence types are more like things.  Instead of being produced and consumed, they occur.  That is, they are something that can be referred back to and that, in addition to any other properties they may have, always have an aspect of time or “when” about them.  This aspect is important for data analysis.
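As a rough illustration of how Duration types let you aggregate Occurrence types across a temporal hierarchy, the Python sketch below counts a handful of made-up events by hour, day and month.  The event data and the `LEVELS` table are assumptions for illustration, not part of any 6BI schema.

```python
from collections import Counter
from datetime import datetime

# Hypothetical occurrence-type events: (event type, timestamp).
events = [
    ("Alert", datetime(2018, 1, 30, 9, 15)),
    ("Alert", datetime(2018, 1, 30, 9, 45)),
    ("Incident", datetime(2018, 1, 30, 14, 5)),
    ("Notification", datetime(2018, 1, 31, 8, 0)),
]

# Each level of the temporal hierarchy truncates the timestamp a little further.
LEVELS = {
    "hour": lambda ts: ts.strftime("%Y-%m-%d %H:00"),
    "day": lambda ts: ts.strftime("%Y-%m-%d"),
    "month": lambda ts: ts.strftime("%Y-%m"),
    "year": lambda ts: ts.strftime("%Y"),
}


def count_events(rows, level):
    """Count events per period at the requested level of the hierarchy."""
    bucket = LEVELS[level]
    return Counter(bucket(ts) for _, ts in rows)


by_hour = count_events(events, "hour")
by_day = count_events(events, "day")
by_month = count_events(events, "month")
print(dict(by_day))
```

The same four events roll up to three hourly buckets, two daily buckets and one monthly bucket, which is the kind of drill-up a time dimension provides in a star schema.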

Unlike the other BOCs, the Events BOC has both dimensional and fact characteristics.  On the one hand, time is already defined into a hierarchy and is standard for everyone.  An hour is always sixty minutes, a minute is always sixty seconds, and so on.  On the other hand, event occurrences are things that happen and can be measured and compared.  They are data, not metadata as the hierarchy of time is.  Events happen and then they are over, but there can be much to learn from their having occurred. This BOC is conceived to capture important data about the perspectives of when something happens in your data.  These perspectives relate to when, not where, not who, not how, not why, not even what has happened, but when it happened, or will happen.

This BOC captures the characteristics of time that most influence results.  It is also important to understand how events differ from either locations or activities, two other previously covered BOCs, with which events are often confused.

A location is concrete.  It is a point in space, a place, even if that space is virtual. You can go away and come back to a location, and if most (not necessarily all) other factors are the same, or within tolerances, the location is still there.  Not so with an event.  An event, though all relevant data may be captured about it, once it occurs, is done and goes away forever.  Another instance of a particular class of events can subsequently occur, but each event is unique and has a time when it occurred.

Events and activities are closely related and co-dependent but are not the same.  Activities are event-driven.  They receive and react to events and create new events which are sent to other activities.  Each activity is an independent entity and can execute in parallel with other activities.  Coordination and synchronization are by means of events communicated between the activities.  Activities react to input events by changing state and creating output events[ii].
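The relationship between activities and events can be sketched in a few lines of Python.  This is a simplified, single-threaded illustration (real activities may run in parallel), and the activity names, event names and one-minute delay are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    name: str
    occurred_at: datetime  # every event carries its "when"


class Activity:
    """Reacts to one kind of input event by changing state and emitting an output event."""

    def __init__(self, name, consumes, produces=None):
        self.name = name
        self.consumes = consumes
        self.produces = produces
        self.state = "idle"

    def react(self, event):
        if event.name != self.consumes:
            return None
        self.state = f"handled {event.name}"
        if self.produces is None:
            return None
        # The output event is stamped one minute later purely for illustration.
        return Event(self.produces, event.occurred_at + timedelta(minutes=1))


def run(activities, first_event):
    """Deliver events to activities until no new events are produced."""
    queue = deque([first_event])
    log = []
    while queue:
        event = queue.popleft()
        log.append(event)
        for activity in activities:
            output = activity.react(event)
            if output is not None:
                queue.append(output)
    return log


# Hypothetical activities and event names.
activities = [
    Activity("Billing", consumes="OrderPlaced", produces="InvoiceIssued"),
    Activity("Shipping", consumes="InvoiceIssued", produces="OrderShipped"),
    Activity("Notifier", consumes="OrderShipped"),
]
log = run(activities, Event("OrderPlaced", datetime(2018, 1, 30, 9, 0)))
print([e.name for e in log])
```

The activities persist and change state, while each event in the log is a one-time occurrence with its own timestamp, which is the distinction drawn above.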

The important thing, from a 6BI perspective, is that an event provides a temporal association for a result.  If the persons, places, products, locations, and motivators are known (or estimated), you still need to know when these aspects came together to create something of significance.

Another instance of the importance of the “When” aspect is in Big Data solutions.  Since system owners often cannot control when data is available to the solution, it is important to be able to record when each event occurs.  There could be literally millions of events in a short unit of time, and the recorded time of each is what allows the results they produce to be uniquely aggregated.

[i] I would like to thank Barry Williams and his excellent Database Answers website http://www.databaseanswers.org/data_models/ for providing many of the table name examples.

[ii] David Luckham, various writings.