Archive for the ‘Business Intelligence’ Category

Crypto is reshaping the world economy

August 14, 2021

Crypto is reshaping the world economy, 50 years after Nixon ended the dollar’s peg to gold. Here’s how some are playing it https://a.msn.com/r/2/AANjEqk?m=en-us&a=0

The Data Processing Equation

April 29, 2021

The equation P(D) = R means the Processing of Data produces Results, where P is the Processing, D is the Data, and R is the Results.  Processing is a function that acts upon the Data, producing the Results.  This can be expressed as P of D yields R.

Algebraically we can solve for any of the variables (P, D or R).  We can solve for any one of the variables (designating it as X), which then becomes the dependent variable, as long as we know the value of the other two.  The other two are the independent variables.  One independent variable is the experimental variable and the other is the control variable, or constant.

Solving for the Results (R) we have equation 1: P(D) = X.  This means that if we know the rules and procedures of the Processing (P) and we have the Data (D) we can calculate the Results (R).  This is the classic Business Intelligence (BI) paradigm.  In a classic star schema, think of the fact and dimension tables as containing the Data, and the various analyses and reports as the Processing that produces the Results, which are then used as a predictive model going forward.  This can be called a “Results Driven Predictive Model” (RDPM) because the predictive power of the model is derived from the Results, the R factor of our equation.  You use the Results (which you do not know ahead of time), derived from the interaction of Data and Processing, to inform your predictions.
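
A minimal sketch, in Python, of equation 1 as just described; the fact table, the aggregation rule and all names are hypothetical, chosen only to illustrate known Processing applied to known Data yielding Results.

```python
# Equation 1, P(D) = X: known Processing applied to known Data yields the Results.
# Hypothetical example: a fixed aggregation rule applied to a tiny fact table.

data = [  # D: the Data (a toy fact table)
    {"region": "East", "sales": 120.0},
    {"region": "West", "sales": 80.0},
    {"region": "East", "sales": 45.0},
]

def processing(rows):  # P: the known rules and procedures
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["sales"]
    return totals

results = processing(data)  # R: the Results, unknown until P is applied to D
print(results)              # {'East': 165.0, 'West': 80.0}
```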

Solving for the Processing (P) we have equation 2: X(D) = R.  This means that if we have the Results (R) and have the Data (D) we can discover the rules and procedures of the Processing (P) that was applied to the Data (D) to produce those Results.  This is the classic Machine Learning paradigm.  Here, by progressively measuring how close each iteration of processing gets you to the Results (which you already know), given the Data, you can produce a predictive model going forward.  This is called a “Processing Driven Predictive Model” (PDPM). You use the rules and procedures of processing (which you do not know ahead of time) that produced the Results given the Data, to inform your predictions.
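
A minimal sketch of equation 2 under the same caveat: the data, the hidden rule and the use of a simple gradient-descent loop are all assumptions standing in for whatever training procedure an actual ML tool would use.

```python
# Equation 2, X(D) = R: Data and Results are known; the Processing is learned.
# Hypothetical example: recover the rule "result = 3 * d + 1" from examples
# using a tiny gradient-descent loop (standing in for any ML training procedure).

D = [1.0, 2.0, 3.0, 4.0]          # the Data
R = [4.0, 7.0, 10.0, 13.0]        # the Results we already know

w, b = 0.0, 0.0                   # parameters of the unknown Processing
lr = 0.01
for _ in range(5000):
    grad_w = sum((w * d + b - r) * d for d, r in zip(D, R)) / len(D)
    grad_b = sum((w * d + b - r) for d, r in zip(D, R)) / len(D)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))   # approximately 3.0 and 1.0: the discovered Processing
```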

Solving for the Data (D), which is far less common than the previous two solutions, we have equation 3: P(X) = R.  This means that if we have the Results and know the rules and procedures of the Processing we can deduce the Data (D) that had to be used.  As far as I know, this equation has no classic application to what is typically thought of as business, but it does have applications to scientific and historical endeavors.  It can be called the Historical paradigm.  In other words, what Data had to be processed according to the rules and procedures of the Processing to yield the observed Results?  This is called a “Data Driven Predictive Model” (DDPM).  You use the Data (which you do not know ahead of time), upon which the Processing was used to produce the Results, to inform your predictions.
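
A minimal sketch of equation 3, assuming a Processing rule simple enough to invert directly; the temperature-conversion rule and the values are purely illustrative.

```python
# Equation 3, P(X) = R: the Processing and the Results are known; the Data is deduced.
# Hypothetical example: the known rule converts Celsius readings to Fahrenheit;
# given the observed Fahrenheit Results, we recover the Celsius Data that had to exist.

def processing(celsius):                 # P: the known rule
    return [c * 9.0 / 5.0 + 32.0 for c in celsius]

observed_results = [32.0, 50.0, 98.6]    # R: what we observed

def deduce_data(fahrenheit):             # inverse of P, usable because P is invertible
    return [(f - 32.0) * 5.0 / 9.0 for f in fahrenheit]

data = deduce_data(observed_results)     # D: the Data that must have been processed
print([round(c, 1) for c in data])       # [0.0, 10.0, 37.0]
```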

We manipulate the experimental independent variable while holding the control independent variable constant.  This is done in order to observe and measure how changes in the experimental variable (the one being changed) affect the dependent variable.  For example, in equation 1 we can change the Processing (P) while leaving the Data constant and observe how the dependent Results change.  This of course is very common.  A constant set of data will almost always produce different results if processed according to a different set of rules and procedures.

We can also change the Data (the D factor) in equation 1 to observe how that changes the Results while the Processing stays constant. This opens up many predictive possibilities, such as comparing the Results produced when different Data sets are processed by the same, constant Processing.
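
A short sketch of both experiments, with hypothetical data and two stand-in processing rules (mean and median): first the Processing is varied while the Data is held constant, then the Data is varied while the Processing is held constant.

```python
# Experimental design on equation 1: one independent variable is varied while the
# other is held constant, and the change in the dependent Results is observed.

data_a = [3.0, 5.0, 7.0, 9.0]
data_b = [3.0, 5.0, 7.0, 40.0]            # same as data_a except for one outlier

def mean(values):                          # Processing rule 1
    return sum(values) / len(values)

def median(values):                        # Processing rule 2
    ordered = sorted(values)
    mid = len(ordered) // 2
    return ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2

# Vary the Processing, hold the Data constant:
print(mean(data_a), median(data_a))        # 6.0 6.0

# Vary the Data, hold the Processing constant:
print(mean(data_a), mean(data_b))          # 6.0 13.75
```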

The same experimental design structure can be applied to equations 2 and 3 as well.  This becomes interesting when the Results are held constant; that is, we know what we want to see in the Results. The Data may be out of our control (that is, it may be supplied by others) and we want to know how we can Process that Data to give us the Results we want.  This scenario is, in fact, the basis of fraud.

This examination is, of course, an oversimplification, but I believe it captures the essential interdependency between Processing, Data and Results.  This interdependency follows the classic experimental model, in which we have two independent variables (one experimental and one control) and one dependent variable that is subject to the manipulation of either of the other two.

What Is a Data Model?

August 10, 2020

A data model is a great way to understand the structure of a system. It requires the acceptance of the concept of an “entity”.  A data model depicts the relationships between the entities that make up a “real world” system.  A data model differs from a mathematical model in that it neither requires mathematical provability nor must it be expressed in numbers and symbols as a mathematical model does.  A data model also differs from a process model in that it does not represent the dynamic changes in a system over time; it depicts the structure of a system, the way the parts of a system fit together.

A data model can be either a representation of the physical reality of a system or a non-physical representation. The latter is usually called either a “conceptual model” or a “logical model”.  Though both of these phrases mean something non-physical, they are not completely interchangeable with one another.  A conceptual model is a model of ideas, while a logical model is a model of the semantic relationships between entities and requires a shared and agreed-upon vocabulary for it to be useful.

Full Stack Data Science. Next wave in incorporating AI into the corporation.

October 22, 2019

https://www.forbes.com/sites/cognitiveworld/2019/09/11/the-full-stack-data-scientist-myth-unicorn-or-new-normal/#1eb0d4f32c60

I like the concept of “Full Stack Data Science”, especially the way the author depicts it in the included graphic.

One thing I would like to point out is the recognition that the process is really a circle (as depicted) and not a spiral or a line.  What I mean by that is that the path does not terminate between what can be perceived as the end, “Use, monitor and optimize”, and the beginning, “Business goal”; it closes back on itself.

The results of applying Data Science to business problems not only help solve those problems, but actually change the motivators that drive the seeking of solutions in the first place.  Business goals are usually held up as the ends with the lowest dependency gradient of any component of any complex enterprise architecture.  While this may be true at any point in time, the dependency is not zero.  Business goals themselves change over time, and not just in response to changing economic, societal or environmental factors.  The technology used to meet these goals itself drives changes to the business goals.

A party, whether a person or an organization, tends to do what it is capable of doing.  Technology gives it more activities to undertake and more things to produce and consume, which then feed back into the goals that motivate it.

I think this article is one of the best I’ve seen in making that point.

 

Machine Learning and Database Reverse Engineering

October 13, 2019

Artificial intelligence (AI) is based on the assumption that programming a computer using a feedback loop can improve the accuracy of its results.  Changing, in the right way, the values of the variables (called “parameters”) used in the execution of the code can influence future executions of the code.  These future executions are then expected to produce results that are closer to a desired result than previous executions.  If this happens the AI is said to have “learned”.

Machine learning (ML) is a subset of AI.  An ML execution is called an “activation”.  Activations are what “train” the code to get more accurate.  An ML activation is distinctly a two-step process.  In the first step, input data is conceptualized into what are called “features”.  These features are labeled and assigned weights based on assumptions about their relative influence on the output.  The data is then processed by selected algorithms to produce the output.  The output of this first step is then compared to an expected output and a difference is calculated.  This closes out the first step which is often called “forward propagation”.

The second step, called “back propagation”, takes the difference between the output of the first step, called “y_hat”, and the expected output, called “y”, and, using a different but related set of algorithms, determines how the weights of the features should be modified to reduce the difference between y and y_hat.  The activations are repeated until either the user is satisfied with the output or changing the weights makes no more difference.  The trained and tested model can then be used to make predictions on similar data sets and, hopefully, create value for the owning party (either a person or an organization).
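
A minimal numerical sketch of the two-step activation described above, using a single feature, a single weight and a squared-error difference; the data, learning rate and iteration count are arbitrary choices for illustration, not a prescription.

```python
# A sketch of repeated activations in the two-step terms used above:
# forward propagation computes y_hat from the weighted feature, back propagation
# uses the difference between y_hat and y to adjust the weight.

features = [1.0, 2.0, 3.0, 4.0]   # input data conceptualized into one feature
y = [2.0, 4.0, 6.0, 8.0]          # expected output
weight = 0.1                      # initial assumption about the feature's influence
learning_rate = 0.05

for activation in range(200):
    # Step 1: forward propagation
    y_hat = [weight * x for x in features]
    loss = sum((yh - yt) ** 2 for yh, yt in zip(y_hat, y)) / len(y)

    # Step 2: back propagation (gradient of the loss with respect to the weight)
    grad = sum(2 * (yh - yt) * x for yh, yt, x in zip(y_hat, y, features)) / len(y)
    weight -= learning_rate * grad

print(round(weight, 3))           # approaches 2.0, the weight that minimizes the difference
```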

In a sense, ML is a bit like database reverse engineering (DRE).  In DRE we have the data, which is the result of some set of processing rules, which we don’t know[i], having been applied to that data.  We also have our assumptions about what a data model would have to look like to produce such data, and what it would need to look like to increase the value of the data.  We iteratively apply various techniques, mostly based on data profiling, to try to decipher the data modeling rules. With each iteration we try to get closer to what we believe the original data model looked like.  As with ML activations, we eventually stop, either because we are satisfied or because of resource limitations.

At that point we accept that we have produced a “good enough model” of the existing data.  We then move on to what we are going to do with the data, feeling confident that we have an adequate abstraction of the data model as it exists, how it was arrived at, and what we need to do to improve it.  This is true even if there was never any “formal” modeling process originally.

Let’s look at third normal form (3NF) as an example of a possible rule that might have been applied to the data.  3NF is a rule that every non-key column of a table must be dependent on the key, or identifier, of the table, and nothing else.  If the data shows patterns of single-key dependencies we can assume that 3NF was applied in its construction.  The application of the 3NF rule will create certain dependencies between the metadata and the data that represent business rules.
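
A sketch of one data-profiling check a DRE effort might run, assuming a toy table with hypothetical column names: it tests whether one column is functionally dependent on another, the kind of single-key dependency pattern a 3NF design would leave behind.

```python
# Data profiling for functional dependencies: if a determinant value ever maps
# to more than one value of the dependent column, the dependency does not hold.

rows = [
    {"order_id": 1, "customer_id": "C1", "customer_city": "Leeds"},
    {"order_id": 2, "customer_id": "C2", "customer_city": "York"},
    {"order_id": 3, "customer_id": "C1", "customer_city": "Leeds"},
]

def depends_on(rows, determinant, dependent):
    seen = {}
    for row in rows:
        key = row[determinant]
        if key in seen and seen[key] != row[dependent]:
            return False
        seen[key] = row[dependent]
    return True

# customer_city depends on customer_id rather than only on order_id, a pattern
# suggesting the original model kept customers in their own table.
print(depends_on(rows, "customer_id", "customer_city"))  # True
print(depends_on(rows, "order_id", "customer_id"))       # True (order_id is unique here)
```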

These dependencies are critical to what we need to do to change the data model to more closely fit, and thus be more valuable for, changing organizational expectations.  It is also these dependencies that are discovered through both ML and DRE that enable, respectively, both artificial intelligence and business intelligence (BI).

It has been observed that the difference between AI and BI is that in BI we have the data and the rules, and we try to find the answers.  In AI we have the data and the answers, and we try to find the rules.  Whether results derived from either technology are answers to questions, or rules governing patterns, both AI and BI are tools for increasing the value of data.

These are important goals because attaining them, or at least approaching them, will allow a more efficient use of valuable resources, which in turn will allow a system to be more sustainable, support more consumers of those resources, and produce more value for the owners of the resources.

[i] If we knew what the original data model looked like we would have no need for reverse engineering.

Identifying Motivators (“Why”)

February 22, 2018

This is the sixth and final post in this series about how to identify entities in data sources that can readily be classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Locations, Events and Motivators. The fifth post in the series (about Events, the “When” aspect) can be found at https://birkdalecomputing.com/2018/01/30/identifying-events-when/ .

The Motivators BOC is probably the most nuanced and least understood BOC. I have earlier devoted an entire article to the metadata structure of motivators, entitled “The Data Architecture of Business Plans”[i], which can be found at https://birkdalecomputing.com/6bi-home/the-data-architecture-of-business-plans/ .

The Motivators BOC identifies Why things get produced and consumed by parties.  Concepts and objects in this BOC capture data about the ends, means, influencers and assessments that provide the reasons why parties exchanged things (products and money) at a particular time and place.  Ends and means are in general too abstract to be found in object names, but you will find names such as Strength, Weakness, Opportunity, Threat, and Key Performance Indicator (KPI) all of which are assessment elements.

Data element and data element collection names you may encounter that belong to the Motivators BOC include, but are not limited to, names in the following table[ii]. The list gives you a hint of what kind of names to look for in putting together a 6BI Analytic Schema for enabling your data to answer business questions.

In terms of identifying motivator data elements (i.e. attributes and columns) and motivator data element collections (i.e. entity types and tables), the most likely candidates are documents, or at least those objects that have the word Document in their name.  You need to consider documents because you will quite often find the means (missions and courses of action) of an enterprise described in document form, especially if the document name contains words such as Strategy/Strategic, Tactic, Enablement/Enabled, Directive, Policy or Rule.  The ends of an enterprise (visions and desired results) can also be described in a document, quite often having a name like Goal or Objective.

As mentioned in the post about the Things BOC[iii], a document can also be considered a type of thing, such as a definition, as when “the definition” is being assessed for accuracy, for example.  However, if its purpose is to contain text that describes means or ends, it also belongs to the Motivators BOC.  An event, such as an Appeal or a Campaign, can also be a motivator.  But as was mentioned in the post about the Events BOC, events are primarily differentiated from other concepts and objects by their inclusion of a time data element, either a point in time or a duration.

Another source of motivators is reference data.  Reference data can describe business functions (see the post on the Activities BOC) and often determines the choices that users make on user interfaces, which in turn determine the logic paths an application will take when processing data and thus explain why certain results are derived.  Example data element and data element collection names that often become the basis of reference data management (RDM) include: Code, Type, Tag, Status and Class/Classification.  Often you may find these names in plural form as well.

So, if you are analyzing a legacy database and you come across a table with any of these words in its name, you need to study the content of the table and understand how the rows and columns of the table affect, or are designed to affect, the motivation for actions taken by the parties in the organization.
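
As a rough illustration, a first-pass scan of legacy table names for the Motivator keyword families discussed above might look like the following; the table names and keyword list are hypothetical and far from exhaustive.

```python
# A first-pass scan over legacy metadata for candidate Motivator objects,
# based on the name patterns discussed in the post.

MOTIVATOR_KEYWORDS = [
    "strategy", "strategic", "tactic", "directive", "policy", "rule",
    "goal", "objective", "code", "type", "tag", "status", "class",
]

table_names = [
    "CUSTOMER", "ORDER_STATUS", "PRICING_POLICY", "SALES_GOAL",
    "PRODUCT", "SHIPMENT_TYPE_CODES",
]

def motivator_candidates(names):
    hits = []
    for name in names:
        lowered = name.lower()
        if any(keyword in lowered for keyword in MOTIVATOR_KEYWORDS):
            hits.append(name)
    return hits

print(motivator_candidates(table_names))
# ['ORDER_STATUS', 'PRICING_POLICY', 'SALES_GOAL', 'SHIPMENT_TYPE_CODES']
```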

The Motivators BOC is especially relevant to the type of NoSQL database known as a document database, MongoDB being a prime example.  It is one thing to structure and access the data in a document store in an effective and efficient manner but, in terms of answering business questions, it is even more important to know what role the content of the document plays in the operation of the enterprise.  In other words, how does, or how should, the document provide the answer to “why” a business transaction took place between parties.

Another category of motivators deals with security and privacy, and sometimes is included in policies and procedures.  Names here include Authorization, Enforcement and Permission, among others.  The intersection between business motivation and security is ripe for further exploration.

This is the last post in this series.  I hope you will find them worthwhile and useful. To find each one just click the link in the first paragraph of each to take you to the previous one. The first in the series about the Parties BOC can be found at https://birkdalecomputing.com/2017/04/26/identifying-parties/ .

Thanks for reading them and best of luck in developing your 6BI Analytic Schemas.

 

[i] The title “The Data Architecture of Business Plans” is derived from the fact that Business Plans are the deliverable of the Motivation aspect (the “Why” interrogative) at the Business Management, or Conceptual perspective of the Zachman Framework for Enterprise Architecture.

[ii] As previously, I would like to thank Barry Williams and his excellent Database Answers website http://www.databaseanswers.org/data_models/ for providing many of the table name examples.

[iii] https://birkdalecomputing.com/2017/05/04/identifying-things/

Identifying Events (“When”)

January 30, 2018

This is the fifth in a series of posts about how to identify entities in data sources that can readily be classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Locations, Events and Motivators.  Entity types in the Events BOC identify When production and consumption of things by parties occurs. The fourth post in the series (on Locations, the “Where” aspect) can be found at https://birkdalecomputing.com/2017/08/23/identifying-locations/ .

Concepts and objects in this BOC capture data about a point in time or the duration of time over which products or payments flow from one party to another, or when an enterprise carries out its work. Data element and data element collection names you may encounter that belong to the Events BOC include, but are not limited to, names in the following table[i]. The list gives you a hint of what kind of names to look for in putting together a 6BI Analytic Schema for enabling your data to answer business questions.

Events break down into two major sub-types: (1) Occurrence types, which include EventAlert, Notification, and Incident from the list above; and (2) Duration types, which include Year, Month, Week, Day, Hour, Minute, Second, Date and Time from the list.  Duration type entities, as is no doubt obvious, are units of time and can be used to aggregate facts in a star schema across a temporal hierarchy.  Occurrence types are more like things.  Instead of being produced and consumed, they occur; they are something that can be referred back to and that, in addition to any other properties it may have, always has an aspect of time, a “when”, about it.  This aspect is important for data analysis.
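
As a small illustration of the Duration side of this BOC, the sketch below rolls hypothetical fact rows up a temporal hierarchy (day to month to year); the rows and amounts are invented for the example.

```python
# Using the Duration sub-type of the Events BOC to aggregate facts
# across a temporal hierarchy (day -> month -> year).

from collections import defaultdict
from datetime import date

facts = [
    {"event_date": date(2018, 1, 5),  "amount": 100.0},
    {"event_date": date(2018, 1, 20), "amount": 50.0},
    {"event_date": date(2018, 2, 3),  "amount": 75.0},
]

def aggregate(facts, level):
    """level maps an event date to a member of the time hierarchy."""
    totals = defaultdict(float)
    for fact in facts:
        totals[level(fact["event_date"])] += fact["amount"]
    return dict(totals)

by_month = aggregate(facts, lambda d: (d.year, d.month))
by_year = aggregate(facts, lambda d: d.year)
print(by_month)   # {(2018, 1): 150.0, (2018, 2): 75.0}
print(by_year)    # {2018: 225.0}
```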

Unlike the other BOCs, the Events BOC has both dimensional and fact characteristics.  On the one hand, time is already defined as a hierarchy and is standard for everyone.  An hour is always an hour (sixty minutes), a minute is always a minute (sixty seconds), and so on.  On the other hand, event occurrences are things that happen and can be measured and compared.  They are data, not metadata as the hierarchy of time is.  Events happen and then they are over, but there can be much to learn from their having occurred. This BOC is conceived to capture important data about when something happens in your data: not where, not who, not how, not why, not even what has happened, but when it happened, or will happen.

This BOC captures the characteristics of time that most influence results.  It is also important to understand how events differ from either locations or activities, two other previously covered BOCs, with which events are often confused.

A location is concrete.  It is a point in space, a place, even if that space is virtual. You can go away and come back to a location, and if most (not necessarily all) other factors are the same, or within tolerances, the location is still there.  Not so with an event.  An event, though all relevant data may be captured about it, once it occurs, is done and goes away forever.  Another instance of a particular class of events can subsequently occur, but each event is unique and has a time when it occurred.

Events and activities are closely related and co-dependent but are not the same.  Activities are event-driven.  They receive and react to events and create new events which are sent to other activities.  Each activity is an independent entity and can execute in parallel with other activities.  Coordination and synchronization is by means of events communicated between the activities.  Activities react to input events by changing state and creating output events[ii].

The important thing, from a 6BI perspective, is that an event provides a temporal association for a result.  If the parties, things, activities, locations and motivators are known (or estimated), you still need to know when these aspects came together to create something of significance.

Another instance of the importance of the “When” aspect is in Big Data solutions.  Since system owners often cannot control when data becomes available to the solution, it is important to be able to record when each event occurs.  There can be literally millions of events in a short unit of time, and the time of each occurrence provides a way to uniquely aggregate the results.

[i] I would like to thank Barry Williams and his excellent Database Answers website http://www.databaseanswers.org/data_models/ for providing many of the table name examples.

[ii] David Luckham, various writings.

6BI and Marketing Attribution

December 26, 2017

Six Basic Interrogatives (6BI) can be used to analyze marketing attribution. In marketing, attribution is the assigning of credit to the interactions in the sequence of interactions that led up to what is called a conversion[i].  A conversion is an action, or an event which results in an action, that has value for the campaign, the means of interaction, which is seen to be the motivator of the visitor’s interactions and eventual conversion. The interactions take place through channels, which, when associated with a campaign, are called touchpoints.

To pursue the most effective marketing strategy it is important to know which touchpoints, and which sequences of touchpoints, are the most likely to result in conversions.  A typical scoring system for assessing these sequences of actions consists of assigning credit to the touchpoints in a sequence according to some attribution rule or rules.  There are several popular attribution rules in use across the field of marketing analytics.  These rules fall into three broad categories.[ii]

  • Single Source Attribution (Single Touch Interaction) models assign all the credit to one event, such as the last click, the first click or the last channel to show an ad. Simple or last-click attribution is widely considered less accurate than alternative forms of attribution, as it fails to account for all the contributing factors that led to a desired outcome.
  • Fractional Attribution (Multi-Touch Interaction) includes equal weights, customer credit, and multi-touch / curve models. Equal weight models give the same amount of credit to all events; customer credit models use past experience, and sometimes simply guesswork, to allocate credit. Multi-touch models assign credit across all the touchpoints in set amounts.
  • Algorithmic Attribution uses statistical modeling and machine learning techniques to derive probability of conversion across all marketing touchpoints which can then be used to weight the value of each touchpoint preceding the conversion. Algorithmic attribution analyzes both converting and non-converting paths across all channels to determine probability of conversion. With a probability assigned to each touchpoint, the touchpoint weights can be aggregated by a dimension of that touchpoint (channel, campaign, interaction placement, visitor type, content type, etc.) to determine a total weight for that dimension.

Examples of each category of attribution model include the following:

Single Source Attribution[iii]

  • The Last Interaction model attributes 100% of the conversion value to the last channel with which the customer (or visitor) interacted before buying or converting.
  • The Last Non-Direct Click model ignores direct traffic and attributes 100% of the conversion value to the last channel that the customer clicked through from before buying or converting. Google Analytics uses this model by default when attributing conversion value in non-Multi-Channel Funnels reports.
  • The Last AdWords Click model attributes 100% of the conversion value to the most recent AdWords ad that the customer clicked before buying or converting.
  • The First Interaction model attributes 100% of the conversion value to the first channel with which the customer interacted.

Fractional Attribution[iii]

  • The Linear model gives equal credit to each channel interaction on the way to conversion.
  • The Time Decay model may be appropriate if the conversion cycle involves only a short consideration phase. This model is based on the concept of exponential decay and most heavily credits the touchpoints that occurred nearest to the time of conversion. The Time Decay model could have a half-life of 7 days, meaning that a touchpoint occurring 7 days prior to a conversion will receive 1/2 the credit of a touchpoint that occurs on the day of conversion. Similarly, a touchpoint occurring 14 days prior will receive 1/4 the credit of a day-of-conversion touchpoint.
  • The Position Based model allows you to create a hybrid of the Last Interaction and First Interaction models. Instead of giving all the credit to either the first or last interaction, you can split the credit between them. One common scenario is to assign 40% credit each to the first interaction and last interaction, and assign 20% credit to the interactions in the middle. (Both this and the Time Decay weighting are sketched in the code after this list.)
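
A minimal sketch of the two fractional schemes just described, assuming a hypothetical three-touchpoint path: Time Decay with a 7-day half-life and a 40/20/40 Position Based split.

```python
# Fractional attribution weights: Time Decay (exponential, 7-day half-life)
# and Position Based (40% first, 40% last, remainder split across the middle).

touchpoints = [
    {"channel": "Paid Search", "days_before_conversion": 14},
    {"channel": "Email",       "days_before_conversion": 7},
    {"channel": "Direct",      "days_before_conversion": 0},
]

def time_decay_weights(points, half_life_days=7.0):
    raw = [0.5 ** (p["days_before_conversion"] / half_life_days) for p in points]
    total = sum(raw)
    return [w / total for w in raw]

def position_based_weights(n, first=0.4, last=0.4):
    if n == 1:
        return [1.0]
    if n == 2:
        return [0.5, 0.5]
    middle = (1.0 - first - last) / (n - 2)
    return [first] + [middle] * (n - 2) + [last]

print([round(w, 3) for w in time_decay_weights(touchpoints)])   # [0.143, 0.286, 0.571]
print(position_based_weights(len(touchpoints)))                 # [0.4, 0.2, 0.4]
```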

Algorithmic Attribution[iv]

Algorithmic attribution is a more advanced way to model attribution data in order to most accurately represent the visitor interaction event flow.  Algorithms tend to be proprietary, so what factors are considered in the algorithm, and what weight each factor gets, can vary by attribution provider.  However, the most accurate algorithmic attribution models use machine learning to take in vast amounts of data, all of the touchpoints, both historical and ongoing, that went into closed-won deals, closed-lost deals, deals that fell apart at or before the opportunity stage, etc., to create enterprise-specific models.

The algorithm then creates custom weights for each of your stages to represent how your visitors go through the funnel. It’s important to note that it should also use new data as you continue to engage prospects and close deals to refine and improve the model, which is the machine learning aspect.

The 6BI Analytics Schema in Figure 1 lays out the fundamental base entities that support marketing attribution.  This diagram also enumerates the process by which business value is extracted from that schema. Keep in mind this is a high level logical data model (LDM) and certainly not intended to be sufficient for generating database tables without far more domain specific modeling.

Figure 1.

 

From a 6BI perspective the Visitor is a type of Party because it represents “who” initiates the sequence of events.  Interaction and its sub-type Conversion are types of Events; they identify “when” an action takes place.  Credit, a type of Thing, more specifically a Thing of Value to the campaign, is “what” the action produces.  Attribution, a type of Activity, is “how” a credit is produced.  The Channel, a type of Location, is “where” the events occurred. The assumption as to “why” the visitor interacts and converts is that it is due to the influence of a Campaign, which is a type of Motivator.

The assigning of Campaign Credits to Campaign Channels is identified in Figure 1 by a series of five (5) steps.  This process begins with a Visitor performing a type of Interaction, through a Channel, which causes it, the Interaction, to become a Conversion.  The Conversion generates Attributions which, based on the application of an Attribution Rule produce Credits which are assigned to a Campaign. The use of a Channel by a Campaign identifies the Touchpoints which ultimately get evaluated based on how much Credit they produce for the Campaign.

To get the net benefit of attribution you need to capture the cost side as well. You need to know and use in your assessments not only the costs of applying the attribution rules, but the costs of channels, touchpoints, impressions and campaigns as well.  Not only do you need to determine how much influence, for example, your Paid Search feed had in generating conversions when it was the second touchpoint, but also the cost of the Paid Search feed service to your enterprise as a whole.[v]

The goal of attribution is to determine which touchpoints are producing a positive result, and, by using the cost of each touchpoint, an attribution system can then show which touchpoints are profitable. This allows optimization of marketing expenditures.[vi]

 

[i] Conversion is a generalized term for the desired result of a marketing effort. This can include other actions besides sales such as sign-ups, survey completions, favorable ratings, etc.

[ii] https://en.wikipedia.org/wiki/Attribution_(marketing)

[iii] https://support.google.com/analytics/answer/

[iv] https://www.bizible.com/

[v] The cost of a touchpoint might vary depending on whether it is first, last or some intermediate (assisting) interaction in the conversion event flow.

[vi] https://www.convertro.com/

Identifying Locations (“Where”)

August 23, 2017

My apologies for the long delay since the last post in this series, but the real world got in the way, in the form of a job opportunity I just could not say “no” to.

This is the fourth in a series of posts about how to identify entities in data sources that can readily be classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Locations, Events and Motivators.  The third post in the series (on Activities, the “How” aspect) can be found at https://birkdalecomputing.com/2017/05/24/identifying-activities/.  Please note also that I changed the order of Locations and Events because I want to discuss Locations next and save until last what I consider to be the two most complex BOCs, Events and Motivators.

The Locations BOC identifies Where things get produced and consumed by parties.  Concepts and objects in this BOC capture data about, not only physical locations, but virtual locations as well.  Data element and data element collection names you may encounter that belong to the Locations BOC include, but are not limited to, names in the following table[i].  The list gives you a hint of what kind of names to look for in putting together a 6BI Analytic Schema for enabling your data to answer business questions. 

Usage often determines whether Location or Thing is the appropriate BOC for any given object or concept.  For example, Webpage, Portal, Database, and Dashboard are all objects that, depending on context, can either be a thing or describe where a thing is located.  Also, Document, Source and Destination can designate places as well as being things.  The key to the Locations BOC is to include only those objects and/or concepts that actually refer to a Place or a Site, and not just the description of the place or the site.  Sites and Addresses are synonymous and can be either physical or virtual.

Locations exist whether we use them or not.  However, it is usually, at least for business purposes, only when a member of one of the other BOCs, usually parties or things, is placed at a specific site during an activity or event that locations become relevant. This can best be determined by asking yourself how important it is to know where some activity took place, where an event occurred, or where some party or thing is located.  Does being located at one site as opposed to any other site make a difference?  Is the measurement of performance impacted by the site of one or more of the objects or concepts under consideration?  If the answer is “yes” within the context being considered, then location is a factor to consider in our data analysis.

Locations often form nested hierarchies, as in Country, Region, State, County, City, Street, Postal Code, Zone, Building, etc.  This hierarchy affects the level of aggregation of the data.  The larger the scope of the location, or the further apart the sites are within a given level of the hierarchy, generally the more parties and things are included.  The more of these objects and concepts (parties and things) are included in the measurement, the less likely the details of any one specific party or thing are to influence the data at that level.  This is an application of the law of large numbers. This fact is one reason why it is important to be able to dis-aggregate your data when needed.  Enterprise performance is measured by buying and selling lots of things, but those large numbers are generated one item at a time as often as not.
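
As a small illustration, the sketch below rolls hypothetical sales rows up a location hierarchy while the item-level rows remain available for dis-aggregation; the rows and hierarchy levels are invented for the example.

```python
# Rolling facts up a location hierarchy (City -> State -> Country) while
# keeping the item-level rows so the data can still be dis-aggregated.

sales = [
    {"country": "US", "state": "WA", "city": "Seattle",  "amount": 10.0},
    {"country": "US", "state": "WA", "city": "Spokane",  "amount": 4.0},
    {"country": "US", "state": "OR", "city": "Portland", "amount": 6.0},
]

def roll_up(rows, *levels):
    totals = {}
    for row in rows:
        key = tuple(row[level] for level in levels)
        totals[key] = totals.get(key, 0.0) + row["amount"]
    return totals

print(roll_up(sales, "country", "state"))   # {('US', 'WA'): 14.0, ('US', 'OR'): 6.0}
print(roll_up(sales, "country"))            # {('US',): 20.0}
```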

[i] I would like to thank Barry Williams and his excellent Database Answers website http://www.databaseanswers.org/data_models/ for providing many of the table name examples.

 

 

Identifying Activities (“How”)

May 24, 2017

This is the third in a series of posts about how to identify entities in data sources that can readily be classified as belonging to each of the 6BI Business Object Categories (BOCs): Parties, Things, Activities, Events, Locations and Motivators.  The second post in the series (on Things, the “What” aspect) can be found at https://birkdalecomputing.com/2017/05/04/identifying-things/ .

The Activities BOC identifies How things get produced and consumed by parties. Concepts and objects in this BOC capture data about the means by which products or payments flow from one party to another, or how an enterprise carries out its work[i].  Data element and data element collection names you may encounter that belong to the Activities BOC include, but are not limited to, names in the following table[ii].  The list gives you a hint of what kind of names to look for in putting together a 6BI Analytic Schema for enabling your data to answer business questions.

An Activity is the most general super-type in this BOC, encompassing Function and Process[iii].  Functions are intended to describe how an organization’s mission is planned to be carried out, and Processes describe how the mission is made real.  In the design of the Analytic Schema, data that identifies and describes functions is almost always used in the source system as a type of reference data and will typically be brought into the Analytic Schema as text data in dimension members.  The maintenance of this data should be under the control of Data Governance.  The governance of data is itself both a function and a process, and as such its performance can also be measured.  If one were to design a schema for measuring the performance of the Data Governance function, a hierarchical collection of its sub-functions would be identified.  As we will see, functions also play a significant role in the Motivators BOC, but that will come later.

In data modeling, I have observed that we more often model processes than functions.  A process can be either a Business_Process or a System_Process, but in either case the “process” is how something gets done.  This is accomplished by transforming either concepts or objects (or both) into different states.  It is the contribution of this transformation toward some goal that we need to measure.  Keep in mind it is “how” something gets done that we are measuring here, not “what” gets done.  This is vitally important to analytics and business intelligence because there is a lot of potential gain in improving how something is done, even if what is produced (or consumed) remains unchanged.  For example, decreasing processing time, reducing waste and realigning responses to demand are all readily actionable.  For marketing purposes, how a product is produced or provided[iv] disappears into the product itself, and so is quite often overlooked as a separate factor in measurement.  In business systems, the names we look for to identify activities quite often contain the word Transaction in some way.

Another feature of a process is that it transforms things, and these transformations usually take place via some Mechanism.  Mechanisms include Sales, Purchases, Receiving and Shipments.  A process can also be represented by a document such as a Request, an Order, an Invoice, a Manifest, or a Receipt.  It is the data about the transformation, perhaps recorded in a document, or perhaps not, that we want to measure.  We measure the impact on the parties and things participating in the transformation and not the parties and things themselves.  This is a subtle but important difference.  An activity’s quantities, values and description are the record of “How” the process produced a result.

An activity is often the source of one or more events, and an event is often the source of one or more activities, but activities and events are not exactly the same thing, and are not interchangeable. We will visit the Events BOC in a future post.

[i] David C. Hay, Data Model Patterns, A Metadata Map, 2006.

[ii] I would like to thank Barry Williams and his excellent Database Answers website http://www.databaseanswers.org/data_models/ for providing many of the table name examples.

[iii] David C. Hay, Data Model Patterns, A Metadata Map, 2006.

[iv]  The distinction between “produced” and “provided” is made to distinguish between, for example, manufacturing and retailing.