Tuesday, March 13, 2012

Classification and Labeling of Data

In the early days, much of computer security research was aimed at developing computers that could be relied upon to enforce the DoD scheme for restricting access to data "classified" in the national security interest. Out of this research emerged the Bell-LaPadula model, the Trusted Computer System Evaluation Criteria (TCSEC), and rules-based access control. These assumed that data is "classified," that is, labeled as to its sensitivity. The research assumed that data is "born classified," without paying much attention to who says so or why.


Data is not really born classified. Someone has to decide. Classification is an economic decision. While some people think that one is simply making a statement about an inherent property of the data, what one is really doing is making a statement about how one believes the data should be protected and at what cost. The class maps the data to a set of methods and procedures to be used to protect it. We call this mapping "policy." Since these methods have a cost, by assigning a class or a label, one says that this is how much the enterprise or the authority is prepared to spend to protect this data.
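To make the idea of "policy" as a mapping concrete, here is a minimal sketch in Python. The class names and measure names are hypothetical, chosen only to echo the "confidential" and "top secret" examples in this post; a real enterprise would define its own classes and measures.

```python
from enum import Enum


class DataClass(Enum):
    """Hypothetical classes; a real scheme defines its own."""
    UNCLASSIFIED = "unclassified"
    CONFIDENTIAL = "confidential"
    TOP_SECRET = "top secret"


# "Policy" is the mapping from a class to the protective measures it calls for.
# The measure names are illustrative assumptions, not a standard vocabulary.
POLICY = {
    DataClass.UNCLASSIFIED: frozenset(),
    DataClass.CONFIDENTIAL: frozenset({"need_to_know"}),
    DataClass.TOP_SECRET: frozenset({"need_to_know", "locked_when_not_in_use"}),
}


def required_measures(label: DataClass) -> frozenset:
    """Return the measures a custodian must apply for a given label."""
    return POLICY[label]
```

Because every measure has a cost, choosing a label in this sketch amounts to choosing how much to spend: the more measures a class maps to, the more expensive it is to handle data carrying that label.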

The label can be thought of as a code that the author/classifier uses to communicate to the users of the data how he, the author/classifier, wants it to be protected. For example, when one labels an object "confidential," one communicates to all custodians and users of the data that it is only to be seen by those with "need to know." When one labels it "top secret," one asserts that, among other measures, the data should be locked up when not in use.

We choose our label or class based upon the "sensitivity," a term of art, of the data. Sensitivity is a function of context and association. A single bit of data is sensitive only if one knows what it signifies. A social security number, standing by itself, may not be sensitive, but the binding of a social security number to a name starts to be sensitive. As one associates date and place of birth, the names of parents, address, credit scores, and credit card numbers, sensitivity increases.


Sensitivity increases along an axis from raw data, to organized and analyzed data, to conclusions or intelligence derived, to plans of action based upon the intelligence. Thus for most business enterprises, competitive intelligence, product plans, and business plans tend to be very sensitive.


The sensitivity of data is a function of its timeliness. Reuters charges a premium for data that it plans to give away for free in fifteen minutes. At the other extreme, the identity of spies remains sensitive for the life of their grandchildren. The secrets of Ultra remained sensitive until the inventions of modern cryptography made them obsolete; we kept them secret for another twenty years just to be on the safe side. While there are exceptions, the sensitivity of data tends to decrease with age. The sensitivity of Thomas Eagleton's mental health records was decaying along a predictable curve until he ran for the US Senate. It spiked again when he was chosen to run for Vice President of the United States.


Similarly, the sensitivity also tends to increase with quantity. One credit card number is sensitive but the sensitivity of a set of such numbers goes up with the number in the set.


So to recap, the sensitivity of data is a function of context or association, age, organization and analysis, and quantity.


Today we have default sets. In the private sector these include intellectual property (IP), personally identifiable information (PII), and payment card information (PCI) that must be protected from disclosure, and the books of account that must be protected from manipulation or contamination. In law enforcement we have investigations in progress and wants and warrants. In intelligence we must protect not only the conclusions and recommendations but also the sources and methods by which they were developed. Sources and methods are among the most sensitive data we have because compromise may cost lives.


Classification decisions must be made by those who know the most about the data. For business functional data, such as accounts receivable and the payroll, that is usually the manager of the function. By default, it is the person who creates the object. We often refer to this individual as the "owner," because the role usually includes the authority and discretion to say who, and in what circumstances, can see or modify the data.


Because the decision requires judgment and experience, it is normally reserved to executives, managers, and professionals. Our job as staff is to ensure that the process of creating an object includes the step of labeling it, then to note variances and ensure that they are corrected. However, it is often difficult to identify the responsible individual.


In most organizations, the authority to classify information includes the authority to re-classify. An exception is the US National Security system, where once classified, the data must go through a rigorous declassification process applied by specialists.


In business the label should include the identity of the classifying manager or authority and a date whereon the classification expires or must be extended. Enterprise policy may require the former and limit the latter.
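As a rough illustration of such a label, the sketch below records the class, the classifying manager, and the dates, and then checks the label against enterprise policy. The field names, the violations() helper, and the one-year MAX_LIFETIME are all assumptions made for the example, not a recommendation.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Hypothetical enterprise limit on how long a classification may run
# before it must be extended or allowed to expire.
MAX_LIFETIME = timedelta(days=365)


@dataclass(frozen=True)
class Label:
    data_class: str      # e.g. "confidential"
    classified_by: str   # identity of the classifying manager or authority
    classified_on: date
    expires_on: date     # date the classification expires or must be extended


def violations(label: Label, today: date) -> List[str]:
    """List the ways a label departs from the (assumed) enterprise policy."""
    problems = []
    if not label.classified_by.strip():
        problems.append("missing classifier")
    if label.expires_on - label.classified_on > MAX_LIFETIME:
        problems.append("lifetime exceeds policy limit")
    if today > label.expires_on:
        problems.append("expired; extend or reclassify")
    return problems
```

A label that names its classifier and expires within the limit produces an empty list; anything else is a variance for staff to note and for the owner to correct.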


The objective of the system should be to ensure that all data gets the protection that it warrants but that expensive measures are reserved for the data that needs them. Underclassification may result in leakage or contamination. Overclassification is inefficient.


An issue is binding the classification label to the data object. The integrity of the system relies upon the label being tamper resistant, or at least tamper evident. In ink-based systems the paper binds the data and the label together. Because of the mutability of electronic data, the label is part of the meta-data and the bind may not be as reliable. Closed systems like the AS/400 or Lotus Notes help. Encryption can be used to bind the label to the object in such a way that tampering with either will be evident. The integrity of the bind is checked at open time, and the object is flagged if the content and the label do not agree.
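One way to get a tamper-evident bind can be sketched with a keyed message authentication code (HMAC) rather than encryption proper; that substitution is mine, as are the function names bind() and check_at_open(). The sketch assumes a secret key held by the system and not by ordinary users.

```python
import hashlib
import hmac


def bind(key: bytes, label: str, content: bytes) -> bytes:
    """Compute a tag that ties this label to this content under the key."""
    label_bytes = label.encode("utf-8")
    # Length-prefix the label so bytes cannot be shifted between the label
    # and the content without changing the tag.
    message = len(label_bytes).to_bytes(4, "big") + label_bytes + content
    return hmac.new(key, message, hashlib.sha256).digest()


def check_at_open(key: bytes, label: str, content: bytes, tag: bytes) -> bool:
    """Return True only if the label and content still agree with the tag."""
    return hmac.compare_digest(bind(key, label, content), tag)
```

The tag is stored with the object's meta-data when it is labeled. At open time, a False result from check_at_open() means the label, the content, or the tag has been altered since then, and the object should be flagged.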


An effective classification and labeling system has to be baked into the culture of the enterprise or organization, including its policy, methods, and procedures. All members of the organization have to expect labels and must know the measures for all classes that they are likely to see in their roles. Moreover, those who create data objects must know how to classify them. Both of these things require constant reinforcement and training. Supervisors, managers, and executives must note variances between the sensitivity of the object and the label assigned and take timely corrective action.


Fifty years of electronic computing have resulted in an explosion of data, if not intelligence. Efficient, even effective, information assurance requires not only that we identify the sensitivity of our data, that is, classify it, but also that we then communicate that judgment to all users of the data, that is, label it. We have seen at RSA and Lockheed-Martin what comes from trying to protect all data the same.


Luckily, the computer has given us powerful tools, for example, uniform and consistent processes, user identification and authentication, rules-based access control, and encryption to protect the information. In some cases, simply applying the proper label may go a long way toward ensuring that it is protected appropriately.


Many organizations have nominal classification and labeling programs that are not effective. When a client tells me that they have a program, I make a quick test. I ask a couple of executives if they would notice a misclassified document if it hit their desk, and, if so, what they would do about it. I get unsatisfactory answers more often than not.


Because such a program must be woven into the warp and woof of the fabric of the enterprise, creating one is difficult, time-consuming, and high maintenance. Of course, that is why those of us who implement classification schemes are called professionals and are paid the big bucks.
