The eDiscovery Implications of Structured Data

3. Validation and authentication

Once the required information has been collected, it still needs to be validated and authenticated. Validation means confirming that the query was accurate and complete: it must in fact return all “California employees” as they were defined, and only those, so that the report is complete and “correct.”
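One way to picture validation is as a reconciliation between what the agreed definition says should be in the report and what the extract actually contains. The following is a minimal sketch, not a prescribed method: the `Employee` record, the `is_california_employee` rule, and the `validate_extract` helper are all hypothetical names, and the real definition would be whatever the parties agreed on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    employee_id: int
    work_state: str


def is_california_employee(e: Employee) -> bool:
    # Hypothetical definition for illustration only; in practice this is
    # the definition negotiated for the matter.
    return e.work_state == "CA"


def validate_extract(system_of_record, extract, definition):
    """Compare an extract against the definition applied to the source."""
    expected = {e.employee_id for e in system_of_record if definition(e)}
    returned = {e.employee_id for e in extract}
    return {
        "missing": sorted(expected - returned),     # completeness failures
        "unexpected": sorted(returned - expected),  # accuracy failures
    }
```

An extract passes only when both lists are empty. A non-empty "missing" list means the report is incomplete, and a non-empty "unexpected" list means the query returned records outside the definition.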

Authentication, a legal construct, is concerned with tracing the chain of custody from the physical copy of the information you are producing to opposing parties all the way back to the system of record. The goal is to ensure the information has not been changed or tampered with along the way and is still accurate and complete.

Frequently, the data extracts are done by internal IT with the help of external resources. Expert witnesses may perform some filtering and analysis on this data, and outside counsel may do their own filtering and masking, redacting social security numbers, for example. When the reports are finally presented, it is important to fully document the chain of custody: Who pulled the data? Who else touched the data along the way? Exactly what was done to the data?
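The three chain-of-custody questions (who pulled the data, who else touched it, and what was done to it) can be captured in a simple log that fingerprints the data before and after each step. The sketch below is illustrative only and assumes SHA-256 hashes as the fingerprint; `CustodyLog`, `CustodyEntry`, and `mask_ssns` are hypothetical names, and the masking pattern covers only the common 123-45-6789 format.

```python
import hashlib
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    """Fingerprint a data set so that any later change is detectable."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class CustodyEntry:
    actor: str        # who touched the data
    action: str       # exactly what was done
    input_hash: str   # fingerprint of the data received
    output_hash: str  # fingerprint of the data handed on
    timestamp: str


@dataclass
class CustodyLog:
    entries: list = field(default_factory=list)

    def record(self, actor: str, action: str, data_in: bytes, data_out: bytes) -> None:
        self.entries.append(CustodyEntry(
            actor, action, sha256_hex(data_in), sha256_hex(data_out),
            datetime.now(timezone.utc).isoformat(),
        ))

    def is_unbroken(self) -> bool:
        # Each step must start from exactly what the previous step produced.
        return all(prev.output_hash == cur.input_hash
                   for prev, cur in zip(self.entries, self.entries[1:]))


def mask_ssns(data: bytes) -> bytes:
    # Assumes SSNs appear in the dashed 3-2-4 digit format.
    return re.sub(rb"\b\d{3}-\d{2}-\d{4}\b", b"XXX-XX-XXXX", data)


raw = b"id,name,ssn\n1,Pat,123-45-6789\n"
log = CustodyLog()
log.record("internal IT", "extracted from system of record", raw, raw)
log.record("outside counsel", "masked social security numbers", raw, mask_ssns(raw))
print(log.is_unbroken())  # True
```

If any party altered the data without logging the step, the next entry's input hash would no longer match the previous output hash, and the gap in the chain would be visible.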

4. Custody and control

Believe it or not, e-discovery has some limits. One of these is the limitation that parties are obligated to preserve only information within their custody and control. Making the determination regarding custody and control has typically been relatively easy with unstructured data such as email and user files, although cloud computing is now complicating this process.

With structured databases, however, custody and control can be a much trickier issue to address. This is because not all data elements within large enterprise databases are owned or controlled by the same entity, and portions of data processing and aggregation may be outsourced to third parties. When building a compliance program, it is very important to be able to determine exactly who owns and controls what at any given point in time and what portions of the information may therefore be subject to the preservation and production obligations of each entity.

This issue is further complicated by the use of cloud services. Here, an organization may own and control the raw data in the system but have no control over the source code, architecture and algorithms, or certain aggregated or processed outputs of the database. For example, Salesforce.com considers its database structure to be a proprietary trade secret, and disclosing the tables, fields and relationships within the database during e-discovery would likely be a breach of its user license. The same holds true for many solutions offered by Google Apps.

Similarly, custody and control can be impacted by the corporate structure. For example, a parent or subsidiary company may own or control the database and database software and be leasing its use to child entities or other subsidiaries. In these cases, it may not be clear who controls what at any given time, creating a complex legal question to work through before responding to discovery from the system.

5. Privacy and access

Structured databases commonly store huge amounts of private and protected information, including employee records, customer records, financial information, health records, social security numbers, credit card numbers, IP addresses, geo-location data, and more. A robust e-discovery compliance program must therefore also confront state, national, and international information privacy regulations.

Privacy laws are typically agnostic as to where the data is physically located, focusing instead on the domicile or residency of the data subjects. This means that a company doing business with or employing California residents, for example, must comply with California’s data privacy regulations even if the data is stored in a different state.

Similarly, international companies must also ensure they are complying with relevant non-U.S. data privacy regulations, which can be far more complex to decipher. For example, unlike most U.S. privacy laws, Article 4 of the EU Data Protection Directive is not concerned with the domicile or residency of the data subjects. Instead, it is focused on how data was collected and for what purpose.

If the business purpose relates in any way to established EU business activities, the data is considered subject to the directive and to the many subsequent and varied local implementations of it. Even if the data legitimately resides in the U.S., companies must be aware of how the data is being used throughout the organization and which other jurisdictions’ regulations it may be subject to.

Successfully managing these privacy issues clearly requires a multi-dimensional approach:

  • Data: What data is subject to protection?
  • People: To whom does the data relate, and which individuals and groups hold relevant privacy rights?
  • Systems: What systems are used to store, transport, access and manage the data?
  • Jurisdictions: What are all the jurisdictions that may come into play as data is collected, stored, and moved around the enterprise and beyond?
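
The four dimensions above can be kept together in a simple inventory, so that the jurisdictions question can be asked of every data set. This is only a sketch under stated assumptions: `PrivacyRecord` and `jurisdictions_in_play` are hypothetical names, the location codes are illustrative labels, and the output is a list of candidate jurisdictions to review, not a legal determination of which laws apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrivacyRecord:
    data_category: str         # Data: what is subject to protection
    subject_residency: str     # People: where the data subjects reside
    systems: tuple             # Systems: where the data is stored or managed
    transfer_locations: tuple  # Jurisdictions the data is stored in or moved through


def jurisdictions_in_play(records):
    """Collect every jurisdiction touched by residency, storage, or transfer."""
    found = set()
    for r in records:
        found.add(r.subject_residency)
        found.update(r.transfer_locations)
    return sorted(found)


inventory = [
    PrivacyRecord("payroll", "US-CA", ("HR database",), ("US-TX",)),
    PrivacyRecord("health records", "DE", ("claims system",), ("US-TX", "IE")),
]
print(jurisdictions_in_play(inventory))  # ['DE', 'IE', 'US-CA', 'US-TX']
```

Note that California appears in the result even though neither data set is stored there, which mirrors the point that residency of the data subjects, not physical location, often drives which rules apply.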

Keep in mind that when an e-discovery request relates to data that is also subject to privacy regulations, you must be keenly aware of how you handle the extracted data, where it will be transferred, and how it will continue to be protected, because the company’s liability remains even after transfer to third parties.