
6BI and Aggregate-Orientation, Part 2

September 15, 2016

This is the second part of a two-part post.  The first part can be found at https://waynekurtz.wordpress.com/2016/09/06/introduction-to-6bi-and-aggregate-orientation/.

To transform a 6BI (“Six Basic Interrogatives”) derived logical data model into an aggregate-oriented data model, several steps need to be taken.  We will use the logical business data model shown in Figure 3 as our example of this transformation.  It is very simplified but fairly typical of commercial enterprises.

Figure 3. A Business Data Model Example.

Semantically we can say the following about our business data model.  A Customer can have many Contacts, and a Contact can have many Customers.  The Customer_Contact entity[i] resolves this by linking a Customer to its Contacts through the “has contact” relationship and a Contact to its Customers through the “has customer” relationship.   A Customer “places” Orders, “receives” Order_Invoices, and “makes” Order_Payments.  A Provider “receives” Orders.  An Order “contains” Order_Items and a Product “identifies” Order_Items.  An Order_Invoice “is paid for by” Order_Payments and a Payment “is applied to” an Order_Payment.

Unlike traditional data modeling, where the outcome, as mentioned in Part 1, is typically the creation of tables which separately store instances of entity types linked together by key-based references, the outcome of an aggregate-oriented data model is the creation of aggregates.  An aggregate is a complex data structure based on a “root entity”.  The root is determined by the business requirements and is identified as an entity type in an LDM.  This entity is often derived from a 6BI BOC (“Business Object Category”).  Aggregates are stored as a nested collection of entity instances whose relationship to the root is also determined by business requirements.  How deeply an entity type is nested within the aggregate structure is determined by where it sits along a dependency gradient.
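To make the contrast concrete, here is a minimal sketch in Python (the attribute names are invented for illustration, not taken from the model) showing the same data held first as separately stored, key-linked entity instances and then as a single nested aggregate rooted at Customer.

```python
# Relational style: entity instances stored separately, linked by
# key-based references (foreign keys).
customers   = {1:   {"customer_id": 1, "name": "Acme Corp"}}
orders      = {100: {"order_id": 100, "customer_id": 1, "order_date": "2016-09-01"}}
order_items = {500: {"order_item_id": 500, "order_id": 100, "quantity": 2}}

# Aggregate style: one nested collection of entity instances based on a
# root entity (Customer), stored and accessed as a unit.
customer_aggregate = {
    "customer_id": 1,
    "name": "Acme Corp",
    "orders": [
        {
            "order_date": "2016-09-01",
            "order_items": [{"quantity": 2}],
        }
    ],
}
```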

Dependency Gradients

A dependency gradient is a continuum of degrees of dependency: dependency flows from a higher concentration to a lower one, from entities with a greater number of dependencies, both direct and indirect, to entities with fewer.  An entity that consumes no other entity’s identifier (i.e. it has no foreign keys) is referred to as the “apex entity” of the continuum and has the least dependency.  An entity whose identifier is not consumed by any other entity along the continuum is referred to as the “base entity”.  The relationship between a base entity and an apex entity can be direct; however, there are often many intermediate entities between the base and the apex.

Figure 4 shows two dependency gradients that flow from the same Base Entity Order_Payment (a type of Exchange).

  • Dependency Gradient-01 flows through the relationship “is applied to” directly to Apex Entity-01 Payment (a type of Thing).
  • Dependency Gradient-02 flows through the relationships “is paid for by” to Order_Invoice (a type of Exchange), “is billed thru” to Order (a type of Exchange), and finally “places” indirectly to Apex Entity-02 Customer (a type of Party).


Figure 4. Dependency gradients flowing from Order_Payment.

Dependency gradients form dependency trees when the flow from base to apex takes more than one path.  The term gradient[ii] is used to indicate the relative degree and direction of dependency along a tree.  Dependency always flows from a higher concentration (more foreign keys) to a lower concentration.  Aggregates in an aggregate-oriented data model are based on dependency trees.
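As a rough sketch of how dependency gradients could be traced mechanically, the Python fragment below walks from a base entity toward the apex entities by following foreign keys.  The foreign_keys table is a hypothetical, simplified subset of the example model’s relationships, so the output is a small dependency tree rather than the exact two gradients of Figure 4.

```python
# Each entity maps to the entities whose identifiers it consumes
# (i.e. its foreign keys). A hypothetical subset of the example model.
foreign_keys = {
    "Order_Payment": ["Payment", "Order_Invoice"],
    "Order_Invoice": ["Order"],
    "Order": ["Customer", "Provider"],
    "Payment": [], "Customer": [], "Provider": [],
}

def gradients(entity, path=()):
    """Yield every dependency gradient from a base entity to an apex
    entity (one that consumes no other entity's identifier)."""
    path = path + (entity,)
    parents = foreign_keys.get(entity, [])
    if not parents:                    # no foreign keys: an apex entity
        yield path
    for parent in parents:
        yield from gradients(parent, path)

for g in gradients("Order_Payment"):
    print(" -> ".join(g))
# Order_Payment -> Payment
# Order_Payment -> Order_Invoice -> Order -> Customer
# Order_Payment -> Order_Invoice -> Order -> Provider
```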

Steps for Transforming the LDM

The following ordered set of steps is required to transform an LDM into an aggregate-oriented data model.

  • Select the root entity for each aggregate.
  • Identify relationships to be eliminated by “rolling down” parent entity attributes into the child entity.
  • Identify relationships to be collapsed by “rolling up” child entity attributes into the parent entity.
  • Eliminate unneeded apex entities by rolling down their attributes to an entity higher on the dependency gradient.
  • Roll up attributes from base entities in the direction of the root entity along selected dependency gradients.
  • Create the aggregate-oriented data model.

Select the Root Entity for each Aggregate

The root entity of an aggregate, identified by the ROOT operator, can occur anywhere along the dependency gradient and is independent of the relational association (i.e. relationship) between itself and any other entity along the gradient.  The other entities in a dependency tree, called aggregate participants, are embedded into the aggregate root entity instance.  The embedding of participant entity attributes into the aggregate root occurs regardless of whether the participant is in the direction of the apex or in the direction of the base relative to the root.  Identification of the root entity will influence which relationships get eliminated and to a greater extent determine which relationships get collapsed.

Identify Relationships to be Eliminated

Relationships to be eliminated are tagged with the REFI[iii] tag.  To distinguish each individual action triggered by a REFI tag, the tags are numbered sequentially starting with REFI-01.  When a relationship is eliminated, the parent entity is removed and all or some of its attributes are rolled down into the child entity.

Which attributes to roll down is determined by how normalized the entity is.  Where an entity has attributes that are not part of one of the dependency gradients being aggregated, those non-dependent attributes are not rolled down; the relationship, however, is still eliminated in transforming the logical data model into an aggregate-oriented data model.

Figure 5 shows the relationship between Product and Order_Item, named “identifies”, tagged as REFI-03.  In this case it was decided that this relationship would be eliminated, so the parent (Product) attributes were rolled down into the child (Order_Item) entity, removing both the relationship and its parent entity from the aggregate design.


Figure 5. A relationship tagged for elimination.
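A minimal sketch of what the REFI-03 roll-down might look like in code, with invented attribute names standing in for the real Product and Order_Item attributes:

```python
# Hypothetical parent (Product) and child (Order_Item) instances.
product    = {"product_id": 77, "product_name": "Widget", "unit_price": 9.95}
order_item = {"order_item_id": 500, "order_id": 100, "product_id": 77, "quantity": 2}

def roll_down(parent, child):
    """Eliminate a relationship by copying the parent's attributes into
    the child. The parent's primary key survives as a non-key attribute
    (optional; see the Figure 8 discussion below)."""
    merged = dict(child)
    for name, value in parent.items():
        merged.setdefault(name, value)   # existing child values win
    return merged

order_item = roll_down(product, order_item)
# {'order_item_id': 500, 'order_id': 100, 'product_id': 77, 'quantity': 2,
#  'product_name': 'Widget', 'unit_price': 9.95}
```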

Identify Relationships to be Collapsed

This step requires modelers to take an entity, like Order in our example data model, which has other entities that refer to it, and roll up the attributes of its selected child entities into the parent entity.  Order and its dependent entities, Order_Item through the “contains” relationship and Order_Invoice through the “is billed thru” relationship, are shown in Figure 6.  Only the relationship with Order_Item is tagged, as EMBED-02.  The relationship with Order_Invoice does not need to be tagged for this transformation because Order_Invoice has a separate relationship with Customer, named “receives”, and its attributes are rolled up along that dependency gradient (see Figure 7).


Figure 6. A relationship tagged for collapse.
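Under the same hypothetical attribute names, the EMBED-02 roll-up could be sketched as follows, with the selected child instances collapsed into their parent as a nested collection:

```python
def roll_up(parent, children, collection_name):
    """Collapse a relationship by embedding the child entity instances
    inside the parent instance as a nested collection."""
    embedded = dict(parent)
    embedded[collection_name] = [dict(child) for child in children]
    return embedded

order = {"order_id": 100, "order_date": "2016-09-01"}
items = [{"quantity": 2, "product_name": "Widget", "unit_price": 9.95}]

order = roll_up(order, items, "order_items")
# {'order_id': 100, 'order_date': '2016-09-01',
#  'order_items': [{'quantity': 2, 'product_name': 'Widget', 'unit_price': 9.95}]}
```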

Figure 7 shows the whole example model tagged in preparation for transforming it into an aggregate-oriented data model with Customer as its root entity.


Figure 7. A logical data model tagged for transformation.

Eliminate Unneeded Apex Entities

Figure 8 shows what the physical data model (PDM) looks like after certain relationships, and their parent entities, are eliminated.  In this model we have eliminated the relationship “has customer”, rolling Contact into Customer_Contact (REFI-01); “receives”, rolling Provider into Order (REFI-02); and “identifies”, rolling Product into Order_Item (REFI-03)[iv].  Please note that in the roll-downs we have included the parent primary keys (now as non-key attributes); strictly speaking, we would not always need to do this.  If we were to implement this data model in a relational database management system (RDBMS), the entity Provider would become a table and its attributes columns.

Again, this is actually more of a business decision than strictly a technical one, though I suspect it is rarely considered so.  This is another reason why I say selection of a NOSQL database is at least as much a business decision as a technical decision.


Figure 8. Our example data model after eliminating unneeded apex entities.

Create the Aggregate-Oriented Data Model

Figure 9[v] shows an aggregate-oriented data model based on Customer as its root entity.  All the attributes of the entities that survive the roll-down are rolled up along dependency gradients and collapsed into the Customer_Aggregate, forming the nested collection of entity instances that is called an aggregate.

We have the same issue here that we had in the previous section: what do we do with the entity identifiers (i.e. primary keys)?  One identifier that we must keep, of course, is the primary key of the root entity, Customer_Aggregate.customer_id; it is the identifier of the aggregate.  The answer for an aggregate is not the same as the answer for a table.  In an RDBMS we are optimizing the design of tables; in a NOSQL database we do not have tables.  All the attributes of all entity types included in the creation of the aggregate are considered attributes of the aggregate and are no longer “stand-alone” entities or entity instances.  The identifiers of all aggregate participants are removed from the model.


Figure 9. Aggregate-oriented data model.
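Rendered as the kind of JSON document a NOSQL store might actually hold, the Customer_Aggregate could look something like the sketch below.  The attribute names are hypothetical; the point to notice is that only the root identifier, customer_id, is retained, while the identifiers of all aggregate participants are gone.

```python
import json

customer_aggregate = {
    "customer_id": 1,                  # root identifier: the only key kept
    "name": "Acme Corp",
    "customer_contacts": [             # Contact rolled down via REFI-01
        {"contact_name": "J. Smith", "contact_phone": "555-0100"}
    ],
    "orders": [
        {
            "order_date": "2016-09-01",
            "provider_name": "Parts Inc",   # Provider rolled down via REFI-02
            "order_items": [
                {"quantity": 2, "product_name": "Widget"}  # Product via REFI-03
            ],
        }
    ],
    "order_invoices": [{"invoice_date": "2016-09-02", "amount_due": 19.90}],
    "order_payments": [{"payment_date": "2016-09-10", "amount_paid": 19.90}],
}

print(json.dumps(customer_aggregate, indent=2))
```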

Not having tables as your “modeling target artifacts” may be a difficult and even disturbing concept for data modelers experienced in identifying and normalizing entities.  For this reason an aggregate-oriented data model is a new type of data model.  It is not a logical model, but it is also not a physical model in the strict sense of one that can be used to generate pre-defined database schema structures (e.g. tables).  It is used to relate the entity types and relationships designed in a logical data model, and the business requirements they represent, to the chosen type of NOSQL database (e.g. key-value, column family, document, and others) and the unique metadata structures each type of NOSQL database has.

A final transformation is required to create code for the construction of the appropriate aggregate-oriented NOSQL database structures.  Because we typically see NOSQL data structures expressed as a mixed metadata-and-data structure, such as a JSON string, it is easy to forget that JSON still has a schema.  That schema is often maintained in the application code rather than in the database engine.  The advantage is more flexibility in application development, but the cost is often a lack of the uniform constraints needed to maintain the reusable data structures required for non-deterministic query construction and compliance with data governance standards.
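One way to keep that application-maintained schema explicit, and to recover some of the uniform constraints mentioned above, is a declarative validator.  Below is a minimal sketch using the Python jsonschema package; the article does not prescribe a tool, so this is only one possibility.

```python
# pip install jsonschema
from jsonschema import validate, ValidationError

# The schema lives in application code, not in the database engine.
customer_aggregate_schema = {
    "type": "object",
    "required": ["customer_id", "name"],
    "properties": {
        "customer_id": {"type": "integer"},
        "name": {"type": "string"},
        "orders": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["order_date"],
                "properties": {"order_date": {"type": "string"}},
            },
        },
    },
}

aggregate = {"customer_id": 1, "name": "Acme Corp",
             "orders": [{"order_date": "2016-09-01"}]}
try:
    validate(aggregate, customer_aggregate_schema)
    print("aggregate conforms to the application-side schema")
except ValidationError as err:
    print("aggregate rejected:", err.message)
```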

In summary, the steps needed to transform an LDM into an aggregate-oriented data model are the following: first, identify the root entity for the aggregate in the LDM; next, identify non-root parent entities to be eliminated by rolling down their attributes into their child entities; and finally, identify child entities to be collapsed by rolling up their attributes into the root entity along dependency gradients that have the aggregate root as apex entity.

In a future paper I will describe how the concepts in this article can be applied to data targeted to be stored in popular NOSQL databases.

 

[i] It’s conventional for data modeling tools to call the rectangles in LDMs “entities”. They can also be called “types” or “entity types”. The physical form should be called an “entity instance”.  In this paper I will use “entity” and “entity type” interchangeably because most of the paper deals with logical data modeling, and most of the tools used for logical data modeling call them entities.  Where I refer to physical modeling I use either “entity instance” or, if “entity” is understood, just “instance”.

[ii] In physics “gradient” is defined as an increase or decrease in the magnitude of a property, the change of which can usually be measured. The word seemed appropriate to name the concept of increasing or decreasing dependency.

[iii] The REFI and EMBED tags were identified by Jovanovic and Benson.  Their usage in this article is derived from the paper but may differ slightly from the original meaning, including the numbering of each instance of a tag type for easier identification of each individual roll-up and roll-down action.

[iv] It is important to understand that the three eliminated entities are not removed from the data model, either logically or physically.  They are simply not part of the aggregate-oriented data model.

[v] Any reader familiar with the Erwin Data Modeler tool will recognize that I’ve used the symbol for a View to represent the aggregate. This is not to say that an aggregate is the same as a view.  Neither Erwin nor any other data modeling tool I am familiar with yet supports aggregate-oriented modeling.  The COMN data modeling technique introduced by Theodore Hills is the closest I’ve seen to actually recognizing and describing a methodology for doing so.  Please see http://www.tewdur.com.

Introduction to 6BI and Aggregate-Orientation

September 6, 2016

I wrote this article after reading NoSQL Distilled, A Brief Guide to the Emerging World of Polyglot Persistence by Pramod J. Sadalage and Martin Fowler in the summer of 2016. I was prompted to read the book after seeing Martin Fowler give a presentation on NoSQL at a 2013 GoTo Conference, which can be viewed at https://www.youtube.com/watch?v=ASiU89Gl0F0.

Up to that point the whole concept of “NoSQL” was very fuzzy to me, but after experiencing his presentation and reading the book I realized that the “linchpin” holding RDBMS and NoSQL together was a concept that many SQL database architects have been working with for years: the common de-normalized data structure. One problem (among several) with de-normalization has always been how to systematically de-normalize data so that two de-normalized data sets can be compared to each other in a consistent and repeatable manner, assuring reliable analysis by different parties, over time, and in different locations.  In other words, how do we maintain the old “apples-to-apples” and “oranges-to-oranges” paradigm?

————————————————————————————————————

Six Basic Interrogatives, or 6BI for short, provides an approach for organizing and understanding the context of data used to support business decision making.  I first wrote about 6BI back in the 2004-2005 time frame, when I was working full time as a Business Intelligence Data Architect and needed a quick way to get started analyzing existing data in order to organize it to answer business questions.  Since that time I’ve had other opportunities to put 6BI to use in situations not normally identified as business intelligence, certainly not the traditional type which focuses on a central RDBMS-based data warehouse.  I have found it holds up well, as I thought it would, with versatility and applicability in just about every computing situation I have found myself in.

With the increased popularity of distributed computing, fueled mostly by the overwhelming uptake of mobile devices for both business and personal use, new highly scalable technologies for persisting data have been developed.  These technologies tend to rely on an expanding network of smaller computing units, growing by scaling out to more units, as opposed to a small number of large and ever-growing computing units that add capacity by scaling up.  Scaling out is most often referred to as “horizontal scaling” while scaling up is referred to as “vertical scaling”, and there is no doubt that horizontal scaling is more popular today.  Going hand-in-hand with the horizontal scaling trend has been the introduction of alternative data stores now collectively known as NOSQL databases.

(Sadalage and Fowler 2013)[i] identified three types of NOSQL databases that are designed to store a rich structure of closely related data that is accessed as a unit.  They call this type of database an aggregate-oriented database because its basic storage structure is conceptualized differently from that of a traditional relational database management system (RDBMS).  An RDBMS stores its data in tables, or more precisely in relations.  Relations allow data to be normalized into units (tables) containing data that is dependent only upon the primary key of the entity.  From a logical perspective we can say that each set of attributes (i.e. a row in a table) describes one and only one instance of an entity, which in turn is differentiated from all other instances of all other entities by its unique identifier. The columns of the table store the attribute values of the entity instance where its row intersects with its columns.

In practice this quite often results in data about customers stored in a Customer table, data about products in a Product table, data about orders in an Order table, and so forth.  This is a very powerful design because it does not pre-suppose any fixed relationship between Customers, Products, and Orders, but lets designers and developers link them together in any way that meets business requirements, even when those requirements are not known in advance.  This ad hoc linking is made possible by the SQL language[ii].
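As a small illustration of that ad hoc linking, the sketch below uses Python’s built-in sqlite3 module (table and column names are invented for the example).  The relationship between Customer and Order is asserted at query time, in the join, rather than fixed in the storage design.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE "order"  (order_id INTEGER PRIMARY KEY,
                           customer_id INTEGER REFERENCES customer,
                           order_date TEXT);
    INSERT INTO customer VALUES (1, 'Acme Corp');
    INSERT INTO "order"  VALUES (100, 1, '2016-09-01');
""")

# The link between the tables is expressed ad hoc, at query time.
for row in con.execute("""
        SELECT c.name, o.order_date
        FROM customer c
        JOIN "order" o ON o.customer_id = c.customer_id"""):
    print(row)   # ('Acme Corp', '2016-09-01')
```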

NOSQL databases, in order to provide more performance across distributed data stores, even when the nodes are distributed globally, were conceptualized and built with a different set of priorities.  Non-deterministic query structure was not a priority; capturing and persisting the data as quickly as possible was.  There was no need to build SQL language interpreters into the architecture of these “new age” data stores.  As a result, NOSQL databases end up sacrificing some of the flexibility needed to support non-deterministic query construction in favor of higher-performance access in a distributed landscape, or cluster.

Inevitably this trade-off has led to the need for designers to know or predict, in advance, how the data will be accessed, which entities will need to be joined, and in what order, to answer their business questions.  The loss of the luxury of not needing to know in advance exactly how your data will be accessed is the price paid for better performance.  Fair enough!  Nothing comes without its cost, especially in the world of computing. The important thing to remember is that there will always be a trade-off[iii].

There are three aggregate-oriented NOSQL database categories that have emerged over the last several years:  Key-Value, Document, and Column Family.  Each is different, but they share a conceptually similar data storage unit called an aggregate[iv].  A data modeler (or a developer modeling an application’s data) using these new types of models needs to start thinking about how the data will be accessed earlier in the design and development process than ever before.  As (Sadalage and Fowler 2013) state, aggregate awareness needs to be made part of the design of the data model[v].  I can see how this could be the case when there is a lack of clarity between the role of database design and that of database development.  With the advent of aggregates and aggregate-orientation, the job of a business intelligence data modeler has changed from being able to concentrate on business requirements to needing to be more aware of how the data will be accessed and stored.  This challenge is stated succinctly by (Jovanovic and Benson 2013) when they say NoSQL databases have opened a new non-trivial problem area for data modelers[vi].

If, however, we do the aggregate data modeling at a level independent of the style of the data store (i.e. Key-Value, Document, or Column Family), we can create implementation-neutral and reusable data models that reflect the requirements of the business as much as they reflect the requirements of the technology used to implement them.  Data organized with an aggregate-orientation needs to be modeled at a logical level just as much as data organized with a relational orientation ever did.  In both cases we need to start with the basic entities that describe business value for persisting and manipulating data.  The need for an organizing framework for data still exists.  We still need to know which real-world entities (regardless of how they are instantiated in the data store) the business cares about and how these entities relate to each other.

Business Object Categories

6BI is built on a framework of six universal and re-usable Business Object Categories (BOCs).  Each BOC represents a non-overlapping category of business data.  Each category corresponds to one of the six basic interrogatives: Who, What, Where, When, How and Why.  Figure 1 shows the six basic interrogatives and the business object category derived from each, including example sub-categories that will be used in subsequent business data model examples.

Basic Interrogative → Business Object Category

  • Who produces the data used to measure performance? → Parties[vii] (e.g. Customer and Provider)
  • What is being manipulated or exchanged to produce measurable performance? → Things[viii] (e.g. Product and Payment)
  • How are the data values that measure performance produced? → Activities (e.g. Exchange and Process)
  • When does the activity take place, or for how long is the performance measured? → Events (e.g. Point in Time and Period of Time)
  • Where does the activity used to measure performance take place? → Locations (e.g. Address and Placement)
  • Why does the data actually measure performance? → Motivators[ix] (e.g. Ends and Means)

Figure 1. Six Basic Interrogatives and Business Object Categories.

In the 6BI Framework, Parties and Things have no direct association with each other.  It is only through the other four Business Object Categories (Activities, Events, Locations, and Motivators) that Parties and Things are associated.  The left side of Figure 2 shows the relationship of Parties to Activities, Events, Locations and Motivators.  The right side shows the relationship of Things to the same four BOCs.  This makes sense because only when parties engage in activities, exchanging and manipulating things between them, do events get generated at certain locations and reasons why become relevant.  Figure 2 illustrates the relationships between business object categories, both direct and indirect.


Figure 2. The indirect association of Parties to Things.

These business object categories provide the framework and starting point for classifying the data needed for non-deterministic query structure.  This design allows enough flexibility to support ad hoc reporting by focusing on the basic questions that all decision-support data structures need to answer, while not requiring a strictly relational storage structure underneath.
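As a toy sketch of using the framework as a classification starting point, the fragment below tags each entity from the business data model examples with its interrogative and BOC.  The mapping code itself is my own shorthand, not part of 6BI.

```python
boc_by_interrogative = {
    "who": "Party", "what": "Thing", "how": "Activity",
    "when": "Event", "where": "Location", "why": "Motivator",
}

# Entities from the example model, tagged with their interrogative.
entity_interrogative = {
    "Customer": "who", "Provider": "who",
    "Product": "what", "Payment": "what",
    "Order": "how", "Order_Invoice": "how", "Order_Payment": "how",
}

for entity, question in sorted(entity_interrogative.items()):
    print(f"{entity}: {question} -> {boc_by_interrogative[question]}")
```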

In the next part of this article we will show a method for transforming a 6BI derived logical data model into an aggregate-oriented data model.

 

[i] Pramod J. Sadalage and Martin Fowler, NOSQL Distilled, A Brief Guide to the Emerging World of Polyglot Persistence, Addison-Wesley, 2013. Sadalage and Fowler identify four types of NOSQL databases, the fourth being Graph databases, which we will not discuss in this article.

[ii] SQL is not the first relational database language.  In the 1980s each relational database product had its own language, each of which did pretty much the same thing.  SQL simply won the “database language war”.  It is often criticized for being, among other shortcomings, “less than perfect”. But like the human communication vehicles known as “natural languages”, it too has evolved to fit its purpose and has become widely accepted as a result.

[iii] This is the reason why the choice of a NOSQL-style database is, or should be, as much a business decision as a technical one.  What always needs to be taken into consideration are the possible hidden costs of implementing a database solution that does not support SQL, especially given that SQL has become a virtual “lingua franca” of data analysis and business intelligence. These costs may be hidden from the group making the NOSQL database decision, since it is quite likely they are not the same group that uses SQL to answer business questions.

[iv] Sadalage and Fowler, p. 13.

[v] Sadalage and Fowler, p. 17.

[vi] Vladan Jovanovic and Steven Benson, Aggregate Data Modeling Style, Proceedings of the Southern Association for Information Systems Conference, Savannah, GA, USA, March 8th–9th, 2013, p. 75.

[vii] Parties are often first specialized into “Persons” and “Organizations” but whether a party is only one or a group is not relevant to our discussion here.  Instead I used a sub-categorization that is more reflective of the role that a party plays in an exchange.

[viii] In earlier articles on 6BI, the term “Product” was used as the name of the first-level BOC corresponding to the “what” interrogative. Product, however, is not universal enough.  For lack of a better term I use “Thing” to include Money along with Goods and Services.

[ix] In earlier articles on 6BI the terms “Plans” and “Rules” were used as BOCs that correspond with the “why” interrogative. But these are only Means, not Ends, and “Why” needs to include both Ends and Means.

Attribute Enabling

August 29, 2010

This article describes a method for designing data which allows attributes to be added to and removed from the instances of a single entity. At the physical database level this is equivalent to adding or removing columns from a database table depending on the rows of that table, a technical impossibility: one row of a table might have fifteen (15) columns while another row has twelve (12) or eighteen (18). The idea revolves around the concept of a generic attribute pool from which each row in a table can “borrow” additional columns as needed. These “borrowable” columns are stored in the form of key-value pairs in their own table. Sometimes the value side of the pair has a fixed value, for example “5”, and sometimes the value side is a variable whose value is inserted programmatically. The technique is presented as a way to give flexibility to the otherwise rigid nature of RDBMS design.

The article was first published in February 2001 on the DM Review website http://www.dmreview.com/article_sub.cfm?articleID=3067 under the title “Attribute Enabling, a Database Design Method”. As of this posting, and for a few years now, that site no longer exists. Since the publication of Attribute Enabling, similar design schemes have appeared in solutions such as the Trading Community Architecture (TCA) of the Oracle E-Business Suite applications, and others.

Attribute Enabling

Have you ever been in a situation when the company you work for has changed its line of business? Perhaps the change is so great that an entire rewrite of a major business system is needed, or the purchase of an entirely new system is required to meet the changing business needs. If so, you are well aware of how much work this can be for an IT staff and how disruptive it can be for end users, analysts and managers, not to mention possibly customers and vendors. There are horror stories of business impacts so great that it has taken years, if ever, to recover.

Unfortunately, this same disruptive process is often experienced unnecessarily when less extensive changes are required. One of the reasons why this is so is the static and inflexible nature of the design of many application databases.

For example, suppose your company has been in the business of producing small metal parts used in the fabrication of larger and more complex electronic equipment. Each of the items you produce can be described by a set of attributes (e.g., length, width, gauge, etc.). There could be dozens (or even hundreds) of attributes describing any one item. Each item type can be described using a different set of values for these attributes or, in some cases, a different set of attributes altogether. For instance, part #101 could have a length of 1 inch, a width of .5 inches and a gauge of .03 of an inch. Part #102 would, undoubtedly, have different values for these and all the other attributes that your information system uses to store a description of an item.

Let’s say that up until this time color was never a factor in the production of an item. Either all products were the same color, or color simply did not matter. Now, because of increased competition or a decision to expand the business, items will be produced in several colors along with all the other traditional attributes. Perhaps some items will have color as an attribute and others will not. This last requirement (simplified as it is) typically poses a challenge to the database designers in an IT shop. You can either add color as a new attribute to the entity in your data model that represents an item, or add a new entity to the data model to hold color and all the other additional attributes, with a relationship back to the “base” item entity.  Either way you could be initiating far more change in your application system than at first appears because, as we all know, data changes have a way of propagating themselves throughout a system with a kind of rippling effect. The impact of this effect is often difficult and time-consuming to estimate and manage.

Suppose there was a way to enable a unique set of attributes to describe each specific item type. Only those attributes that apply to part #101 would be used to describe it; attributes that do not apply could be ignored, saving both human and processing resources. If an attribute, for example density, had no relevance to part #101, it would not be used to describe it. The attribute color, on the other hand, would be enabled and could have a value of “red” or “blue” or any color from the domain of allowable values for the color attribute. If, at some point, density becomes relevant to part #101, the density attribute could then be enabled for that item and a value assigned to it. If color becomes no longer relevant to part #101, the color attribute could be disabled, for even more flexibility.

Attribute enabling allows this flexibility in describing products as well as customers, facilities, and even specific orders by selectively “turning on” the set of attributes that describe each unique type of business object, without requiring the inclusion of “unused” attributes in the descriptions of business object types where those attributes do not apply.  Flexibility is accomplished through very fine-grained entity attribution.  Figure 1 shows a simplified schema for attribute enabling a product line.  The generic attribute pool is represented by Attribute_Header and Attribute_Detail. The header describes characteristics that are common to all attributes, such as name and effective date, and the detail describes the domain of values for each generic attribute.

These allowable values will almost always take two forms: a range, used to describe contiguously valued attributes, or a list, used to describe noncontiguously valued attributes. A range is used to describe an attribute such as width, where any attribute instance assigned to any specific object instance can have any one value from a contiguous bounded range of values. This characteristic of a range lends itself to, among other things, run-time calculations based on enabled-attribute values, and only the upper and lower bounds of the range are stored. A list, on the other hand, is used to describe all nonnumeric attributes, such as color, as well as noncontiguous numeric attributes, such as identification numbers, which are not used in calculations. For a list of allowable values, each discrete value is stored separately. Thus, for a range attribute, we have a relationship of one attribute header to two details (i.e., 1:2) and, for a list attribute, we have a one-to-many relationship (i.e., 1:N).

Figure 1: Schema for Attribute Enabling
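A minimal sketch of the generic attribute pool from Figure 1, expressed through Python’s built-in sqlite3 module. The column names are guesses reconstructed from the description above, not the original figure.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- The generic attribute pool.
    CREATE TABLE attribute_header (
        attribute_id   INTEGER PRIMARY KEY,
        attribute_name TEXT NOT NULL,
        effective_date TEXT,
        domain_type    TEXT CHECK (domain_type IN ('range', 'list'))
    );
    CREATE TABLE attribute_detail (
        attribute_id INTEGER REFERENCES attribute_header,
        detail_value TEXT    -- a bound for ranges, a discrete value for lists
    );

    -- A range attribute stores only its two bounds (1:2)...
    INSERT INTO attribute_header VALUES (1, 'width', NULL, 'range');
    INSERT INTO attribute_detail VALUES (1, '0.25'), (1, '2.00');

    -- ...while a list attribute stores each allowable value (1:N).
    INSERT INTO attribute_header VALUES (2, 'color', NULL, 'list');
    INSERT INTO attribute_detail VALUES (2, 'red'), (2, 'blue'), (2, 'green');
""")
```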

The Product_Attribute entity resolves the many-to-many relationship between the pool of generic attributes (Attribute_Header and Attribute_Detail) and the product hierarchy (Product_Line, Product_Category and Product_Item). Note that Product_Attribute has a “discriminator” attribute (hierarchy_descriminator) indicating which level of the product hierarchy a particular enabled attribute is associated with. This enables the second feature of attribute enabling: inheritance.

Inheritance is the ability to associate an enabled attribute with an element higher in the hierarchy and have that attribute inherited by all or some of the lower elements in the hierarchy. Using our previous example, suppose part #101 is an item in category #23, which is a category under product line #4. The color attribute can be enabled for product line #4 and inherited by all of its categories (including category #23) and then by their items (including part #101). Inheritance can be complete, as just described, when a particular enabled attribute is inherited by all the categories and items under a product line, or discretionary, where only selected lower hierarchy elements inherit selectively enabled attributes, giving the schema even more flexibility.
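A sketch of how that inheritance could be resolved in application code, assuming hypothetical enabled-attribute rows and hierarchy links. Complete inheritance is shown; discretionary inheritance would filter the inherited set.

```python
# Hypothetical enabled-attribute rows: (hierarchy level, element id, attribute).
enabled = [
    ("line", 4, "color"),     # enabled at the product-line level
    ("item", 101, "gauge"),   # enabled directly on part #101
]

# Hypothetical hierarchy: part #101 is in category #23 under line #4.
hierarchy = {"item": {101: ("category", 23)}, "category": {23: ("line", 4)}}

def enabled_attributes(level, element_id):
    """Collect the attributes enabled on an element plus those inherited
    from every ancestor up the product hierarchy."""
    found = {attr for (lvl, eid, attr) in enabled
             if (lvl, eid) == (level, element_id)}
    parent = hierarchy.get(level, {}).get(element_id)
    if parent:
        found |= enabled_attributes(*parent)
    return found

print(enabled_attributes("item", 101))   # {'color', 'gauge'}
```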

Even more flexibility can be achieved when values are taken into consideration. For example, let’s say the attribute “color” is completely inherited by all the categories and items of product line #4. An impact, usually a price impact, can be associated with each value of an enabled attribute. A customer ordering a red part #101 might be charged more, because of the price impact of the value “red”, than a customer ordering a blue part #101.

There are several advantages to attribute enabling. As business conditions change, it can be used to extend the life of legacy systems by broadening the attribution of essential business objects. Once an item or a customer has been attribute-enabled, that object can be described using a flexible set of attributes. Attribute enabling can be used to bridge the gap between an older system’s view of business data and a newer system’s view. Perhaps the largest benefit of attribute enabling is that it lets businesses customize their essential business objects (i.e., products, customers and business units) to meet changing business requirements. Imagine the benefit of enabling a sales representative to customize a product interactively with the customer by drawing on a pool of attributes and their values to offer the customer uniquely what they want. If, as Alvin Toffler[1] has said, we are entering an era of “particle marketing”, then attribute enabling is a technology that can meet that challenge.

References:

1. Alvin Toffler, keynote speech, International Sybase User Group Meeting, 1994, San Jose, California.