“Horizontal expansion loses the depth, though excessive depth that only provokes darkness is futile. Therefore a balance between depth and vastness is essential in learning” ― Priyavrat Thareja
This year, consider spending more time protecting the vast bytes that make up your corporate data. As an IT or security professional, do you know how much data your organization has? Do you have the right controls and risk mitigations to protect your business data? One hundred, 70, or even 35 years ago, our big worries about data focused on accidental loss, such as Hemingway’s loss of a complete collection of manuscripts at the Gare de Lyon in 1922, or the 1973 fire at the National Personnel Records Center that destroyed 16-18 million military records.
Today, it is estimated that about 205 billion emails are sent every day[1]. Email is likely important, but perhaps not a critical set of data for your organization. What is shared or stored in Box, Google®, SharePoint® or your CRM or customer databases? Is it critical data that the business depends on operationally? Is the data subject to compliance?
IT and security are often tasked with safeguarding all corporate data, from servers to cloud solutions. However, the staff does not always have the best information to ensure that the business’ data management services are appropriate for the type of data the business has.
To help IT and security gain perspective on data management, this blog discusses a streamlined process that teams without dedicated data management resources can use.
Data Life Cycle
We begin with the data life cycle, which describes the primary states that data exists in. This life cycle is commonly used in data privacy, compliance, and data management. The states are:
- Collect or derive: Direct collection, such as messages or files, or derived data from mobile or Web forms, data feeds, and third parties.
- Use/process: Actions taken with or about data.
- Disclose/transfer: Disclose to another entity, or transfer to another data processing system.
- Retain/store/archive/delete: What do you do with data after processing or transfer?
The data life cycle is helpful in all aspects of data management. It is also useful to share with your security team. Steps where data is moving usually warrant network access control, VPNs, and, if forms are used, application security. Steps where data is being processed can involve database security, API security, and authorization. At the end of the data life cycle, data encryption and access auditing become more important.
Throughout the data life cycle, log management is essential. Log management can prove compliance, help characterize accidental disclosures, and audit data handling throughout the life cycle.
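As a rough illustration of what life cycle-aware logging might look like, the sketch below writes one structured audit line per data-handling event and tags it with the life cycle state. The field names, the `data_audit` logger name, and the sample actor and dataset are assumptions made for this example, not a standard schema.

```python
import json
import logging
from datetime import datetime, timezone
from enum import Enum


class Stage(Enum):
    """The four life cycle states listed above."""
    COLLECT = "collect_or_derive"
    USE = "use_process"
    DISCLOSE = "disclose_transfer"
    RETAIN = "retain_store_archive_delete"


def audit_record(stage: Stage, actor: str, dataset: str, action: str) -> str:
    """Build one JSON audit line. All field names are illustrative."""
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "stage": stage.value,
            "actor": actor,
            "dataset": dataset,
            "action": action,
        },
        sort_keys=True,
    )


# Emit the raw JSON line so a log management tool can parse it.
logging.basicConfig(level=logging.INFO, format="%(message)s")
audit_log = logging.getLogger("data_audit")  # hypothetical logger name

# Hypothetical event: a customer list is exported to a partner.
audit_log.info(
    audit_record(Stage.DISCLOSE, "jdoe", "customer_list", "export_to_partner")
)
```

Recording the state on every event makes it easier to answer questions such as "who transferred this data set, and when?" when you need to characterize an accidental disclosure.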
Data Classification
Next, you need some level of classification. This does not mean labeling, marking, or a loss prevention solution, but a set of containers for data that establish the IT and security responsibilities for those containers. With these containers, you can apply actions, whether that means backups, integrity, verification capabilities, records management, or implementing compliance controls.
You also need to help the business determine which container their data belongs in. Asking a few guiding questions should help you begin to categorize corporate data and put in place good data management practices. Ask business owners the following questions:
Critical Data
If this data becomes unavailable for more than 24 hours, will the business be able to operate effectively?
If the business owners aren’t certain, initiate your own examination. Look for data whose loss would put the daily operations of the organization at risk. For example:
- Real-time financial systems.
- Inventory and order entry systems.
- Process controls.
- Customer-accessed systems.
- Collaboration tools.
Sensitive Data
Is this data sensitive data? For example:
- Corporate strategy (M&A, go-to-market, product development).
- Intellectual property (trade secrets, patents, source code).
- Financial data.
- Prospective or current customer lists.
- Internal organizational changes.
- Cryptographic or API keys.
Regulated Data
Is this data governed by any compliance regulation or scheme? For example:
- HIPAA – Employee health information, or other patient data.
- SEC – Private financial data.
- PCI – Cardholder data, such as credit cards or debit cards.
- COPPA – Data of children younger than 13 years old.
- GLBA – Relating to consumer data held by financial institutions.
- FCRA – Relating to consumer credit information.
- FERPA – Relating to student education records.
- ITAR – Defense-related technical data controlled by the U.S. government.
- NERC CIP – Power plant and transmission data.
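One way to record the answers to the guiding questions above is a small helper that maps them to containers. This is a minimal sketch: the `OwnerAnswers` fields and the `classify` function are hypothetical names invented for this example, and a data set can land in more than one container.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class OwnerAnswers:
    """A business owner's answers to the three guiding questions."""
    # Can the business operate effectively if the data is
    # unavailable for more than 24 hours?
    operates_without_it_for_24h: bool
    # Is the data sensitive (strategy, IP, keys, and so on)?
    is_sensitive: bool
    # Name of the governing regulation, e.g. "PCI"; None if unregulated.
    regulation: Optional[str] = None


def classify(answers: OwnerAnswers) -> Set[str]:
    """Return every container the data belongs in (possibly none)."""
    containers = set()
    if not answers.operates_without_it_for_24h:
        containers.add("critical")
    if answers.is_sensitive:
        containers.add("sensitive")
    if answers.regulation:
        containers.add("regulated")
    return containers


# Hypothetical example: an order entry system that stores card numbers.
order_entry = OwnerAnswers(
    operates_without_it_for_24h=False,
    is_sensitive=True,
    regulation="PCI",
)
print(sorted(classify(order_entry)))  # ['critical', 'regulated', 'sensitive']
```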
There are other containers, but by focusing on these three, you prioritize and align your efforts with the most important data for the business. Once you know how to categorize your data (critical, sensitive, and/or regulated), you can establish the last step for each category (retain/store/archive/delete) in the data life cycle. You may already have this done for email, either through a corporate retention policy or through settings on email servers.
For each category of data, establish your commitments to the business. Keep in mind that you may need to encourage the business to determine when data should be deleted. With storage capacity growing exponentially (IDC predicts global data will reach 40,000 exabytes by 2020), business leaders may want to hoard their data.
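The commitments for each category can be written down in a form that is easy to review with business owners. The sketch below is only an illustration: the `Commitment` fields and every number in `COMMITMENTS` are placeholder assumptions, not recommendations, and the rule for combining categories (most frequent backup, encryption if any category requires it, longest retention period) is one possible design choice that your legal and compliance teams may override.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable


@dataclass(frozen=True)
class Commitment:
    backup_interval_hours: int
    encrypt_at_rest: bool
    delete_after_days: int


# Placeholder values for illustration only; agree on real ones
# with the business owners.
COMMITMENTS = {
    "critical": Commitment(4, False, 365),
    "sensitive": Commitment(24, True, 730),
    "regulated": Commitment(24, True, 2555),
}


def combined(containers: Iterable[str]) -> Commitment:
    """Merge commitments for data that falls in several categories."""
    chosen = [COMMITMENTS[name] for name in containers]
    if not chosen:
        raise ValueError("data is not in any known category")
    return Commitment(
        backup_interval_hours=min(c.backup_interval_hours for c in chosen),
        encrypt_at_rest=any(c.encrypt_at_rest for c in chosen),
        delete_after_days=max(c.delete_after_days for c in chosen),
    )


def deletion_due(created: date, commitment: Commitment, today: date) -> bool:
    """True once the agreed retention period has elapsed."""
    return today >= created + timedelta(days=commitment.delete_after_days)


policy = combined(["critical", "regulated"])
print(policy)
print(deletion_due(date(2020, 1, 1), policy, date.today()))
```

Having an explicit `delete_after_days` value for every category gives you a concrete number to negotiate, which can help counter the instinct to keep everything forever.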
Simplified Data Management
Here is a simplified table that you can use with your business owners to discuss their data management needs. This approach is designed to get you started in data management. It is not a replacement for a full data management program, as defined by DAMA International[2], which offers a complete program and certification.
There are other issues with data, including quality, longevity, and handling, but these are topics for another time. It’s easy to become overwhelmed if you think about your legacy data. The explosion of SharePoint sites and shadow IT file-sharing services has allowed corporate data to be any place, any time. You may not be able to effect change for the old data, but if you don’t begin the process, you’ll never establish good data practices.
As a saying often attributed to Einstein reminds us, “Insanity is doing the same thing over and over and expecting different results.”
[1] http://www.radicati.com/wp/wp-content/uploads/2015/02/Email-Statistics-Report-2015-2019-Executive-Summary.pdf
[2] http://www.dama.org/