Does Your Data Warehouse Need a Fix?

Data Warehouse Governance

The goal of a data warehouse is to integrate and conform the data from across the organization.

To conform means to bring into agreement, or to make similar. Conforming data entails having business units agree on terms that describe the important entities in the business. For example, a bank that originates and services mortgage loans, and then sells them on the secondary market will often have a discrete application for each function in the mortgage lending lifecycle.

Each application has its own set of codes and descriptions for the property and for the loan product. One application may call the property a town home, while another may call it a single-family attached dwelling. In order to evaluate or segment loans by type of property and product, a core set of descriptions must be agreed to by the people in origination, servicing and capital markets.

In turn, the codes of the respective operational systems must be mapped to this agreed-to set of descriptions when the data moves into the data warehouse.

In addition, the users of a data warehouse will often tell you that consistency means knowing exactly what they are getting from the data warehouse. If the users ask for revenue or cost totals, they want to know what the components of revenue and cost are, and they want those components to be the same for all queries.

The first thing your checkup must do is determine whether your data warehouse strategy is committed to the philosophy of making data consistent. Then you must examine how well your governance process for maintaining and growing the data warehouse functions in light of this commitment:

  • What is the method by which new attributes, new dimensions and new facts are released to the user community?
  • How do you determine the business impact and feasibility of getting new data into the data warehouse?
  • Do you have people from various business units dedicated to specific dimensions in your data warehouse, e.g., customers, products, locations, promotions?
  • What is the procedure for maintaining the consistency of existing dimensions as new products and codes are added in respective source applications?
  • Simply put, conforming dimensions means bringing consistency to the textual attributes of an organization. These are the attributes by which users slice and dice information — customer demographics, product categories, organizational hierarchies, time periods, and many more.

    Conforming facts means bringing consistency to the measures of an organization — the number of customers, total revenue, gross profit return on inventory, end-of-month balances, and a host of others.

    Examine your commitment to and process for conforming data. Unless data is conformed across the organization, the full value of the data warehouse will never be achieved.

    Data Warehouse Usage

    The real purpose for a data warehouse is to enable the business users to make decisions. To find out how well the data warehouse is assisting the decision making process, ask the users.

    Since the data warehouse has been functioning for some time, tackle the user interviews from three directions: the intended users, the neglected potential users, and the entrepreneurs.

    The data warehouse was built with a set of users in mind, and it was designed to answer their questions. Ask the intended users who are currently using the data warehouse:

  • What works well and what doesn’t?
  • What questions do you ask consistently and which decisions do they affect?
  • Do the answers provide adequate information?
  • Do you always understand the information you are getting from the data warehouse?
  • What are the most useful slice-and-dice capabilities?
  • What other slice-and-dice capacity would you add?
  • How do you rate the quality of the data in the data warehouse?
  • How would you improve the quality?
  • What questions would you like to ask?
  • How easy is it to formulate a query?
  • How quick is the response?
  • What forum would you like to see for adding new information?
  • Ask the intended users who are not using the data warehouse what prevents them from using it, and what would make them change their minds.

    Next, turn to acknowledged potential users of the data warehouse whose needs were not addressed in initial releases. These neglected information consumers will give you a better understanding of future wants: the key data elements to be migrated to the data warehouse, the data sources, the volume and frequency of migration, and the impact the information will have on making decisions.

    This will help shape the conforming committees and the governance process by which new information is released to current and new consumers.

    Finally, engage those select individuals who are using the data warehouse to do great things. Find out what hidden stories they unearth in the data. The entrepreneurs on the periphery are critical change agents, and they will help make your data warehouse a success.

    If there are no entrepreneurs to be found, consider whether everyone is asleep, or whether your data warehouse is too cumbersome to use, contains no useful information, or both.

    Data Model Composition

    Now we turn to the underpinnings of the data warehouse.

    The data model of a data warehouse should be the optimal model for delivering data to the end users. The model must be easy to understand and provide speedy query performance. Anything else is beside the point.

    Dimensional modeling is the best way to model decision support data, and a well-designed dimensional data model is the prerequisite to a high-performing data warehouse.

    The data model for your data warehouse should be checked by a data architect or data modeler versed in dimensional data modeling techniques.

    This assessment looks at: data mart alignment with business functions; fact table granularity; and dimensional structure.

    Data Mart Alignment with Business Functions: Each data mart or family of fact tables should focus on a business theme that’s important to the business, possibly tracking that theme through various phases of the business lifecycle.

    For example, managers and analysts who worry about quality in the order fulfillment process would be interested in knowing about:

  • The quality of goods received into inventory, inventory turns and replenishment times.
  • Machine down time, throughput, scrap, and breakage during the manufacturing activities.
  • Picking, packing, and shipping execution times.
  • Customer complaints and returns.
  • The data modeler examines each family of fact tables or individual facts to ascertain how well they align with the business requirements.

    Fact Table Granularity: The cause of much user heartburn and complaints regarding data warehouse queries can often be laid to the granularity of the individual fact table record.

    The design of a data mart begins with a declaration of the grain. The grain might be an end-of-day snapshot of individual account balances or every parking ticket issued by a parking garage.

    If the fact table contains every segment of a telephone call as it is routed through the system, then each call segment is the grain. Every question tied to a business theme depends on the grain of the fact table.

    Very often, not enough consideration has been given to visualizing and declaring the correct grain. The data modeler studies the impact the granularity of each fact table has on user queries. Changing the granularity can have a substantial impact on the choice of attributes in both dimension and fact tables.