
An Update on Network Neutrality

May 12, 2010
Filed under: law, Online Privacy 

            Last fall the FCC issued a statement asserting that it would begin to enforce Net Neutrality regulations against broadband internet carriers (see article here).  In April, however, a federal Court of Appeals struck down an FCC order that enforced Net Neutrality, leading many to question the FCC's ability to enforce Net Neutrality without a specific mandate from Congress.  The FCC, however, still has a power play in reserve should it wish to reassert its authority over broadband internet providers.

 The Court Decision – Comcast Corp. v. FCC

            In 2008 and earlier, Comcast enacted a policy of slowing the internet connections of users it assumed were using peer-to-peer networks and engaged in significant (presumably illegal) downloading.  Comcast's argument was essentially that these users were resource hogs who were slowing other customers' connections.  The problem was that Comcast targeted users accessing a particular service over the internet and slowed their connections, a direct violation of the concept of Net Neutrality, under which a user gets equal access to different sites and internet services.  The worry is that in the future your ISP may charge you extra to use services like VoIP, or even email.

            The FCC determined in August of 2008 that Comcast's slowing of users' connections in this manner violated FCC policy, and ordered Comcast to cease the practice.  Comcast appealed, and the case made it all the way to the U.S. Court of Appeals for the D.C. Circuit.  That court overturned the FCC's decision and held that the FCC lacked the authority to enforce Net Neutrality rules.  Despite this ruling, the FCC plans to move ahead with creating formalized Net Neutrality rules.  It is unclear how the FCC will claim legal authority to create them in light of this decision.

Beefing up the FCC’s regulatory authority

            At any time Congress could pass a law granting the FCC the authority to enforce Network Neutrality.  There have been mixed feelings toward Network Neutrality on Capitol Hill, and with everything else on Congress's plate it is unlikely that Net Neutrality will make it to the top of the agenda any time soon.

            The FCC, however, has the ability to give itself more regulatory power.  In the late 1990s the FCC mandated that companies operating phone lines (referred to as telecommunications services) allow third parties to use their lines to provide internet access.  This essentially separated the company that owned the phone lines from the company that provided the internet access.  The FCC's ability to enforce these mandates was limited to companies that provided a “telecommunications service.”  In 2002 the FCC issued a ruling stating that cable companies were not a “telecommunications service” but rather an “information service,” and thus were not subject to the same heavy regulation as the phone companies.

            This classification of cable companies as “information service” providers was made internally by the FCC, and thus can be overturned on the agency's own authority.  Reclassifying cable companies as “telecommunications service” providers would give the FCC ample power to enforce Network Neutrality rules on cable companies, along with much more extensive regulation generally.  There is concern, however, that reclassifying cable companies would stifle investment in the industry.  That concern may trump Net Neutrality for the FCC, as the US lags behind many advanced nations in access to broadband.  If the FCC doesn't reclassify cable companies, it is unclear how it will be able to enforce Network Neutrality, or whether the effort will be abandoned altogether until Congress can grant express authority.

            Unfortunately, just as the last time I discussed Network Neutrality, there are still more questions than answers, and this debate is far from finished.

What information is “personally identifiable”?

September 12, 2009
Filed under: Online Privacy 

A medical record includes the facts that Mr. X lives in ZIP code 02138 and was born July 31, 1945.  Sounds like Mr. X is pretty anonymous, right?

Not if you’re Latanya Sweeney, a Carnegie Mellon University computer science professor who showed in 1997 that this information was enough to pin down Mr. X’s more familiar identity — William Weld, the governor of Massachusetts throughout the 1990s.

Gender, ZIP code, and birth date feel anonymous, but the combination is unique for about 87% of the U.S. population.   That is, if you live in the United States, there's an 87% chance that you don't share all three of these attributes with any other U.S. resident.  After narrowing down the potential identity of the person, one can use additional data sources, such as voter registration records, property records, and other online sources, to “bootstrap” to the person's name and address.
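To see why a figure like 87% is plausible, a back-of-envelope count helps: there are far more possible (gender, ZIP code, birth date) combinations than U.S. residents, so most combinations that occur at all occur exactly once. The sketch below uses rough, illustrative assumptions (roughly 42,000 ZIP codes and about 79 years of plausible birth dates) and a uniform approximation; Sweeney's actual result was computed from real census data, not from this estimate.

```python
# Back-of-envelope estimate: if quasi-identifier values were spread evenly,
# how many people would share a given (gender, ZIP code, birth date) combo?
# All constants below are rough, illustrative assumptions.

us_population = 300_000_000   # rough late-2000s figure
zip_codes = 42_000            # approximate number of U.S. ZIP codes
birth_dates = 365 * 79        # ~79 years of plausible birth dates
genders = 2

combinations = zip_codes * birth_dates * genders
people_per_combination = us_population / combinations

print(f"{combinations:,} possible combinations")      # ~2.4 billion
print(f"~{people_per_combination:.2f} people per combination on average")
```

Since the average is well below one person per combination, unique combinations are the norm rather than the exception, even before accounting for the fact that real populations cluster unevenly.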

Contemporary privacy rules and debates center on the notion of “personally identifiable information” (PII).  PII is information that identifies a particular person, typically by name and address, and such data is considered more sensitive than information that does not identify anyone. For example,

  • Federal health privacy laws use “individually identifiable health information” about a patient as a basis for the category called Protected Health Information (PHI);
  • Federal telecommunications privacy regulations use “individually identifiable information” about a subscriber as a basis for the category called Customer Proprietary Network Information (CPNI); and
  • Federal financial privacy laws, the EU Data Protection Directive, and state privacy laws use similar concepts for categorizing PII data.

In each of the above categories, data deemed “personally identifiable” or “individually identifiable” receives increased protections in order to safeguard the identity of the individual.

However, research by Prof. Sweeney and others demonstrates that surprisingly many facts, including those that seem quite innocuous, neutral, or “common,” can actually be used to identify an individual.  Privacy law is not keeping up with technical reality, and if your information is available online, you have likely been identified and profiled (what better way to market than by knowing who a buyer might be?).

So, what types of data are mined to identify and profile you?  Demographic data; your search terms; your purchase habits; your preferences or opinions about music, books, or movies; and the structure of your social networks (even when the identities of your friends and contacts are not shared).   As our society interacts and communicates over the world wide web, more and more sources are being used to narrow down exactly who a particular record refers to.   Accordingly, you should think about the long-term privacy ramifications of your “hobby” of online publishing (e.g., blogging, tweeting, etc.) and how this data may subsequently be analyzed and associated with other records to identify you.

What information is “personally identifiable”?

Technical Analysis by Seth Schoen

Mr. X lives in ZIP code 02138 and was born July 31, 1945.

These facts about him were included in an anonymized medical record released to the public. Sounds like Mr. X is pretty anonymous, right?

Not if you’re Latanya Sweeney, a Carnegie Mellon University computer science professor who showed in 1997 that this information was enough to pin down Mr. X’s more familiar identity — William Weld, the governor of Massachusetts throughout the 1990s.

Gender, ZIP code, and birth date feel anonymous, but Prof. Sweeney was able to identify Governor Weld through them for two reasons. First, each of these facts about an individual (or other kinds of facts we might not usually think of as identifying) independently narrows down the population, so much so that the combination of (gender, ZIP code, birthdate) was unique for about 87% of the U.S. population. If you live in the United States, there’s an 87% chance that you don’t share all three of these attributes with any other U.S. resident. Second, there may be particular data sources available (Sweeney used a Massachusetts voter registration database) that let people do searches to bootstrap what they know about someone in order to learn more — including traditional identifiers like name and address. In a very concrete sense, “anonymized” or “merely demographic” information about people may be neither. (And a web site that asks “anonymous” users for seemingly trivial information about themselves may be able to use that information to make a unique profile for an individual, or even look up that individual in other databases.)
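The linkage (“bootstrap”) attack described above can be sketched in a few lines: index a public data set by the quasi-identifiers, then look each “anonymized” record up in that index. All records below are fabricated for illustration (Mr. X's quasi-identifier values follow the example in the text); a real attack would use an actual voter roll or similar public source.

```python
# Toy illustration of a linkage attack on "anonymized" records.
# All data is fabricated for demonstration purposes.

# An "anonymized" medical release: no names, just quasi-identifiers.
medical_records = [
    {"zip": "02138", "birth_date": "1945-07-31", "sex": "M", "diagnosis": "X"},
    {"zip": "02139", "birth_date": "1962-01-15", "sex": "F", "diagnosis": "Y"},
]

# A public voter roll: names attached to the same quasi-identifiers.
voter_roll = [
    {"name": "William Weld", "zip": "02138", "birth_date": "1945-07-31", "sex": "M"},
    {"name": "Jane Roe",     "zip": "02139", "birth_date": "1962-01-15", "sex": "F"},
]

QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")

def key(record):
    """Project a record onto its quasi-identifier fields."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)

# Index the voter roll by quasi-identifier, then join the medical records.
by_key = {}
for voter in voter_roll:
    by_key.setdefault(key(voter), []).append(voter["name"])

for record in medical_records:
    matches = by_key.get(key(record), [])
    if len(matches) == 1:  # a unique match re-identifies the record
        print(f"{matches[0]} -> diagnosis {record['diagnosis']}")
```

The join itself is trivial; the attack's power comes entirely from how often the quasi-identifier combination is unique in both data sets.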

Many contemporary privacy rules and debates center on the notion of “personally identifiable information” (PII). The PII concept is used by several legal regimes and many organizations’ privacy policies; generally, information that identifies a particular person is considered much more sensitive than information that does not. For instance,

  • Federal health privacy laws use “individually identifiable health information” about a patient as a basis for the category called Protected Health Information (PHI);
  • Federal telecommunications privacy regulations use “individually identifiable information” about a subscriber as a basis for the category called Customer Proprietary Network Information (CPNI); and
  • Federal financial privacy laws, the EU Data Protection Directive, and state privacy laws use similar concepts for categorizing PII data;

and, in each case, facts deemed “personally identifiable” or “individually identifiable” may receive dramatically higher protections under these laws and regulations.

But research by Prof. Sweeney and other experts has demonstrated that surprisingly many facts, including those that seem quite innocuous, neutral, or “common”, could potentially identify an individual. Privacy law, mainly clinging to a traditional intuitive notion of identifiability, has largely not kept up with the technical reality.

A recent paper by Paul Ohm, “Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization”, provides a thorough introduction and a useful perspective on this issue. Prof. Ohm’s paper is important reading for anyone interested in personal privacy, because it shows how deanonymization results achieved by researchers like Latanya Sweeney and Arvind Narayanan seriously undermine traditional privacy assumptions. In particular, the binary distinction between “personally-identifiable information” and “non-personally-identifiable information” is increasingly difficult to sustain. Our intuition that certain information is “anonymous” is often wrong. Given the proper circumstances and insight, almost any kind of information might tend to identify an individual; information about people is more identifying than has been assumed, and in the long run the whole enterprise of classifying facts as “PII” or “not PII” is questionable.

Statistical inference and clever use of databases have resulted in impressive examples of deanonymization of supposedly anonymous data, the kinds of data that most organizations have not regarded as PII. Apart from combinations of demographic data, some of the sorts of things that may well uniquely identify you include your search terms; your purchase habits; your preferences or opinions about music, books, or movies; and even the structure of your social networks — in a purely abstract sense, even when shorn of the identities of your friends and contacts. Deanonymization is effective, and it’s dramatically easier than our intuitions suggest. Given the number of variables that potentially distinguish us, we are much more different from each other than we expect, and there are more sources of data than we realize that may be used to narrow down exactly who a particular record refers to.

Many of these papers were meant as proofs of concept: they show that people can potentially be re-identified by these kinds of data, not that everyone will be. Not everyone’s medical records were as easy to put a name to as Governor Weld’s. And Narayanan and Shmatikov’s research definitively identified only two Netflix users from their movie ratings — not every user whose ratings were published by Netflix. Still, many of these research results deliberately do not use all the data available about individuals because their goal is to show the effectiveness of mathematical techniques, not to violate individuals’ privacy. Real-world attacks will use many more kinds of available information simultaneously to home in on people’s identities. As Bruce Schneier has observed, such attacks only get better over time; they never get worse.

Ohm argues that it’s more appropriate to think of identifiability as a continuum. The notion of “anonymized” or “sanitized” data is then problematic; researchers habitually share, or even publish, data sets which assign code numbers to individuals. There have already been conspicuous problems with this practice, like when AOL published “anonymized” search logs which turned out to identify some individuals from the content of their search terms alone.
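The weakness of the code-number practice mentioned above can be made concrete with a quick check of equivalence-class sizes, in the spirit of Sweeney's k-anonymity work: a pseudonymized row whose quasi-identifier combination is unique within the data set (a class of size 1) can still be linked to outside sources, code number or not. The rows below are fabricated for illustration.

```python
# Sketch: measuring how "anonymous" a pseudonymized data set really is.
# Replacing names with code numbers does not help a row whose
# quasi-identifier combination is unique (equivalence class of size 1).
# All rows are fabricated.
from collections import Counter

# (pseudonym, ZIP code, birth year, sex)
rows = [
    ("subj-001", "02138", "1945", "M"),
    ("subj-002", "02139", "1962", "F"),
    ("subj-003", "02139", "1962", "F"),
]

# Count how many rows share each (zip, birth_year, sex) combination.
class_sizes = Counter(row[1:] for row in rows)

for pseudonym, *quasi in rows:
    k = class_sizes[tuple(quasi)]
    status = "UNIQUE - linkable to outside data" if k == 1 else f"k={k}"
    print(pseudonym, status)
```

Here subj-001 sits alone in its equivalence class, so the pseudonym offers no real protection; the other two rows at least hide among each other. Identifiability is a matter of degree, exactly the continuum Ohm describes.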

We hope “Broken Promises of Privacy” encourages people who work with personal data to think more critically about their retention and sharing practices and the effectiveness of the anonymization or pseudonymization techniques they’re using. We also hope it finds a broad audience and helps start a wider discussion among researchers, technologists, and lawyers about what “privacy protection” should mean in the era of deanonymization.

Protect Online Personal Information

March 14, 2009
Filed under: Online Privacy, Uncategorized 

As we continue to conduct more business online, such as banking, shopping and other activities, our personal information (such as name, credit card account, address, etc.) is increasingly utilized.  Personal information has become a frequent target for data thieves, and the volume of breaches involving personal information continues to grow.  According to the Privacy Rights Clearinghouse, more than 240 million records containing sensitive personal information have been involved in security breaches to date nationally.
