Attribute Enabling

This article describes a method for designing data which allows for the adding and removing of attributes from the instances of a single entity. At the physical database level this is equivalent to adding or removing columns from a database table depending on the rows of that table, a technical impossibility. One row of the same table might have fifteen (15) columns and another row might have twelve (12) or eighteen (18) columns. The idea revolves around the concept of a generic attribute pool from which each row in a table can “borrow” additional columns as needed. These “borrowable” columns are stored in the form of key-value pairs in their own table. Sometimes the value side of the pair can have a fixed value, for example: “5”, and sometimes the value side can be a variable with a value inserted programatically. The technique is presented as a way to give flexibility to the otherwise rigid nature of RDBMS design.

The article was first published in February 2001 on the DM Review website http://www.dmreview.com/article_sub.cfm?articleID=3067 under the title “Attribute Enabling, a Database Design Method”. As of this posting and for a few years now this site no longer exists. Since the publication of Attribute Enabling, similar design schemes have popped up in solutions such as the Trading Community Area (TCA) of the Oracle e-Business Suite Applications, and others.

Attribute Enabling

Have you ever been in a situation when the company you work for has changed its line of business? Perhaps the change is so great that an entire rewrite of a major business system is needed, or the purchase of an entirely new system is required to meet the changing business needs. If so, you are well aware of how much work this can be for an IT staff and how disruptive it can be for end users, analysts and managers, not to mention possibly customers and vendors. There are horror stories of business impacts so great that it has taken years, if ever, to recover. Unfortunately, this same disruptive process is often experienced unnecessarily when less extensive changes are required. One of the reasons why this is so is because of the static and inflexible nature of the design of many application databases. For example, suppose your company has been in the business of producing small metal parts used in the fabrication of larger and more complex electronic equipment. Each of the items you produce can be described by a set of attributes (e.g., length, width, gauge, etc.). There could be dozens (or even hundreds) of attributes describing any one item. Each item type can be described using a different set of values for these attributes or, in some cases, a different set of attributes all together. For instance, part #101 could have a length of 1 inch, a width of .5 inches and a gauge of .03 of an inch. Part #102 would, undoubtedly, have different values for these and all the other attributes that your information system uses to store a description of an item.

Let’s say that up until this time color was never a factor in the production of an item. Either all products were the same color, or color simply did not matter. Now, because of increased competition or a decision to expand the business, items will be produced in several colors along with all the other traditional attributes. Perhaps some items will have color as an attribute and others will not. This last requirement (simplified as it is) typically poses a challenge to the database designers in an IT shop. You can either add color as a new attribute to the entity in your data model that represents an item or add a new entity to include color and all the other additional attributes to the data model to create a relationship back to the “base” item entity. Either way you could be initiating far more change in your application system than at first it would appear because, as we all know, data changes have a way of propagating themselves throughout a system with a kind of rippling effect. The impact of this effect is often difficult and time-consuming to estimate and manage.

Suppose there was a way to enable a unique set of attributes to describe each specific item type. Only those attributes that apply to part #101 would be used to describe it, attributes that do not apply could be ignored saving both human and processing resources. If an attribute – for example, density – had no relevance to part #101, it would not be used to describe it. The attribute color, on the other hand, would be enabled and could have a value of “red” or “blue” or any color from the domain of allowable values for the color attribute. If, at some point, density becomes relevant to item #101, the density attribute could then be enabled for that item and a value assigned to it. If color becomes no longer relevant to item #101, the color attribute could be disabled for even more flexibility.

Attribute enabling allows this flexibility in describing products as well as customers, facilities and even specific orders by selectively “turning on” the set of attributes that describe each unique type of business object and not requiring the inclusion of “unused” attributes in the descriptions of business object types where these attributes do not apply. Flexibility is accomplished through very fine grained entity attribution. Figure 1 shows a simplified schema for attribute enabling a product line. The generic attribute pool is represented by Attribute_Header and Attribute_Detail. The header describes characteristics that are common to all attributes such as name and effective date, and the detail describes the domain of values for each generic attribute. These allowable values will almost always take two forms:a range, used to describe contiguously valued attributes, or a list, used to describe noncontiguously valued attributes. A range is used to describe an attribute such as width, where any attribute instance assigned to any specific object instance can have any one value from a contiguous bounded range of values. This characteristic of a range lends itself to, among other things, run-time calculations based on enabled-attribute values, and only the upper and lower bounds of the range are stored. A list, on the other hand, is used to describe all nonnumeric attributes such as color as well as noncontiguous numeric attributes such as identification numbers which are not used in calculations. For a list of allowable values, each discrete value is stored separately. Thus, for a range attribute, we have a relationship of one attribute header to two details (i.e., 1:2) and, for a list attribute, we have a one to many relationship (i.e., 1:N).

Figure 1: Schema for Attribute Enabling

The Product_Attribute entity resolves the many-to-many relationship between the pool of generic attributes (Attribute_Header and Attribute_Detail) and the product hierarchy (Product_Line, Product_Category and Product_Item). Note on Product_Attribute there is a “discriminator” attribute (hierarchy_descriminator) that indicates which level of the product hierarchy a particular enabled attribute is associated with. This enables the second feature of attribute enabling, inheritance.

Inheritance is the ability to associate an enabled attribute with an element higher in the hierarchy and have that attribute inherited by all or some of the lower elements in the hierarchy. Using our previous example, suppose part #101 is an item in category #23, which is a category under product line #4. The color attribute can be enabled for product line #4 and inherited by to all of its categories (including category #23) and then to its items (including part #101). Inheritance can be complete as described when a particular enabled attribute is inherited by all the categories and items under a product line, or discretionary where only selected lower hierarchy elements inherit selectively enabled attributes to give more flexibility to the schema.

Even more flexibility can be achieved when values are taken into consideration. For example, let’s say the attribute “color” is completely inherited by all the categories and items of product line #4. An impact, usually a price impact, can be associated with each value of an enabled attribute. A customer ordering a red part #101 might be charged mor, because of the price impact of the value “red” than a customer ordering a blue part #101.

There are several advantages to attribute enabling. As business conditions change, it can be used to extend the life of legacy systems by broadening the attribution of essential business objects. Once an item or a customer has been attribute-enabled, the ability to describe that object using a flexible set of attributes can be realized. Attribute enabling can be used to bridge the gap between an older system’s view of business data and a newer system’s view. Perhaps the largest benefit to attribute enabling is that it enables businesses to customize their essential business objects (i.e., products, customers and business units) to meet the changing business requirements. Imagine the benefit of enabling a sales representative to customize a product interactively with the customer by drawing on a pool of attributes and their values to offer the customer uniquely what they want. If, as Alvin Toffler1 has said we are entering an era of “particle marketing,” then attribute enabling is a technology that can meet that challenge.

References:

1. Alvin Toffler, keynote speech, International Sybase user Group Meeting, 1994, San Jose, California.

Tags: Adaptive Data Design, BI, Business Intelligence, Data Architecture, NOSQL

This entry was posted on August 29, 2010 at 9:12 am and is filed under Business Intelligence, Data Modeling. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

Birkdale Computing