Taming BIG Data: Taking Back Control, Part 3 – Retention

CIOs and others responsible for corporate technology initiatives are challenged to gain control of the ever-expanding amount of data available today. The Taming Big Data series focuses on a solution that builds a sustainable model to keep up with this growth.

This solution is to formalize enterprise information management (EIM) programs, enabling a company to provide accurate, consistent information to all of its resources (employees, computer databases, etc.) so that they can perform their jobs more effectively.

A key objective of the EIM program is to transform the vast amount of information collected every day into a strategic advantage. To this end, CIOs seek a tactical solution whose benefits can be realized early and which can work its way into the overall enterprise information strategy.

One way to start this journey is to look at information life cycle management (ILM). ILM is the process of managing specific data assets of an organization from creation to disposition. The five areas of ILM to be addressed are data usage, creation, retention, availability and maintenance. This article focuses on the importance and best practices of the third phase, retention.

At this point, a company using the approach clearly understands its information assets and has controlled the data received or created by the company (see Taming Big Data – part I: Usage, and part II: Creation).

Additionally, the effective use of resources and collaboration techniques should be maturing, allowing the remaining phases to be even more successful. This third phase focuses on the retention of data. Defining retention allows the company to better set and communicate expectations, which improves the perceived quality of the information.

The retention phase defines the life span of the data by periodically reassessing and reprioritizing the value of the underlying data assets. Supporting policies and metrics are defined so that this value can be determined more reliably over time.

The direct cost of holding data is most relevant in this phase. Items related to the retention of a company’s information assets (e.g., disk space, full-time-equivalent person-hours for backup and recovery, disaster recovery and business continuity planning) are comparatively easy to quantify. It is critical that the business, technology and EIM program management resources work together to implement a solution that supports the corporate goals. A common theme in all of these phases is that collaboration is required to minimize risk, so it is worth clarifying the role each group plays in the journey.

Roles and responsibilities

Business team members: The business workgroup members understand the company’s core competencies and can best assess the needs of the organization.

Technology team members: The technical workgroup members support the business by providing tools that people in the organization can use to perform their jobs effectively.

EIM program management: EIM program management structures the project, defines the goals and outcomes, facilitates the workgroup interactions and formally submits recommendations to the governance review board.

Collaboration is most effective when individuals are clear about their workgroup roles and the goals of the initiative. During this third phase, the company focuses on studying and maturing the business processes and technical maintenance procedures used to source important information assets.

Business group

The business workgroup members are responsible for:

  • Determining the legal data retention requirements;
  • Reviewing and updating the functional requirements for data retention; and
  • Prioritizing the list of requirements.

Data retention discussions can often lead to disagreements across and within the functional and technical groups. Individuals typically take one of two perspectives. The first group believes that the company should capture whatever it can for as long as it can because the data has some value. These individuals argue that the intrinsic value of the information may not yet be fully known and that advances in technology may someday allow the organization to mine this data better.

The second group believes that too much information is counterproductive because it detracts from the standard metrics required for maintaining operational efficiency. Both schools of thought can make a case. During this phase, as the company focuses on the data retention strategy and requirements, it is most important to understand the minimum amount of data required for corporate decision making and the information value of all data assets above this minimum requirement.

By leveraging the prioritized information assets catalogue created during the first two phases, the team will be in a position to have a candid discussion about which data elements are must-haves. The team can then work on the value proposition of the remaining information assets and implement them as necessary, based on budget and other corporate constraints.
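The must-have discussion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DataAsset` fields, the single annual budget figure and the rule "keep every must-have, then add the remaining assets in priority order while the budget allows" are all hypothetical, not part of any prescribed EIM method.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DataAsset:
    # All fields are hypothetical stand-ins for entries in the
    # prioritized information assets catalogue.
    name: str
    must_have: bool
    priority: int        # 1 = highest priority
    annual_cost: float   # assumed yearly cost of retaining the asset


def select_assets(catalogue: List[DataAsset],
                  budget: float) -> Tuple[List[DataAsset], float]:
    """Keep all must-haves, then add other assets by priority while budget lasts.

    Returns the selected assets and the budget left over. A negative
    remainder means the must-haves alone exceed the budget.
    """
    selected = [a for a in catalogue if a.must_have]
    remaining = budget - sum(a.annual_cost for a in selected)

    optional = sorted((a for a in catalogue if not a.must_have),
                      key=lambda a: a.priority)
    for asset in optional:
        # Skip anything that no longer fits; cheaper, lower-priority
        # assets further down the list may still fit.
        if asset.annual_cost <= remaining:
            selected.append(asset)
            remaining -= asset.annual_cost
    return selected, remaining
```

A simple priority-ordered pass keeps the outcome easy to explain in a workgroup meeting, which is arguably more useful at this stage than a mathematically optimal selection.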

Specifically, the business workgroup members can start by reviewing the legal documents and working with the legal department to determine any industry or company constraints (e.g., accounting information should be kept for seven years, personally identifiable information may be subject to disposal criteria, etc.).

Next, these members can assess the needs of the company and determine other self-imposed retention requirements. For example, a company may choose to retain twenty years’ worth of de-identified medical data if one of the primary objectives of the company is to analyze the historical trends of medical conditions.

A marketing company, on the other hand, that captures a lot of information on potential customers may retain only the details of good prospects, identified by a formula developed over the history of the organization. Data on these valued prospects may be retained for 10 years, whereas data on less valued prospects may be removed after six months.
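The retention periods used as examples above can be written down as a simple policy table. The Python sketch below is a minimal illustration, not a recommended implementation: the category names and the helper functions are hypothetical, and the periods are only the illustrative figures from the text (seven years, twenty years, 10 years, six months), not legal guidance.

```python
import calendar
from datetime import date

# Hypothetical category names; the periods mirror the examples in the text.
RETENTION_MONTHS = {
    "accounting": 7 * 12,
    "deidentified_medical": 20 * 12,
    "valued_prospect": 10 * 12,
    "other_prospect": 6,
}


def add_months(start: date, months: int) -> date:
    """Return the date `months` after `start`, clamping to the month's last day."""
    total = start.month - 1 + months
    year = start.year + total // 12
    month = total % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def disposal_date(category: str, created: date) -> date:
    """Earliest date on which a record in `category` may be disposed of.

    Raises KeyError for an unknown category, so that no record is
    silently given a default retention period.
    """
    return add_months(created, RETENTION_MONTHS[category])


def is_due_for_disposal(category: str, created: date, today: date) -> bool:
    """True once the retention period for the record has elapsed."""
    return today >= disposal_date(category, created)
```

Keeping the periods in one table makes it easier for the business and legal teams to review them in one place and to revisit them as requirements are reassessed.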

The role of the business workgroup members is to help the organization determine the right mix for the company. Most importantly, they should have a clear understanding of the priorities so that when the integrated workgroup meets, they are in a better position to negotiate in the best interest of the company.