When security professionals think of “good” data and “bad” data, we think of qualities such as accuracy, reliability, recency and applicability. But a more significant question concerns data collection and use. All too often we focus our attention on data utility rather than on data ethics; on whether we can do something, not whether we should. Whenever we engage in a new enterprise that relies on data or new technologies that collect, store or analyze data, we should ask whether the process or technology can be misused and what social impact the data use will have. Sometimes, the better part of discretion is valor.
In this vein, IBM’s CEO recently wrote an open letter to Congress indicating that:
IBM no longer offers general purpose IBM facial recognition or analysis software. IBM firmly opposes and will not condone uses of any technology, including facial recognition technology offered by other vendors, for mass surveillance, racial profiling, violations of basic human rights and freedoms, or any purpose which is not consistent with our values and Principles of Trust and Transparency. We believe now is the time to begin a national dialogue on whether and how facial recognition technology should be employed by domestic law enforcement agencies.
Sounds moral and ethical, no? But it’s not that easy. The problem is that “mass surveillance” can be “mass surveillance” of peaceful protesters, or it could be “mass surveillance” of peaceful protesters to search for known (right- or left-wing) agitators or agents provocateurs, or it could be “mass surveillance” of people to find those with outstanding arrest warrants for violent crime. The “mass surveillance” could include automated license plate readers searching for stolen vehicles, or speed or red-light cameras, traffic cameras or surveillance cameras in courthouses, police stations and your average 7-11. “Mass surveillance” may also include contact tracing, epidemiological modeling or other techniques helpful in public health. Focusing on the technology (facial recognition) or the description (mass surveillance) is where you begin, not where you end.
Privacy in Public Places
From a legal perspective, U.S. law has focused its privacy regime on two related questions: whether, in a particular circumstance, a person has a subjective expectation of privacy regarding where they are and what they are doing, and whether that subjective expectation of privacy is one that society, in general, is prepared to accept as “objectively reasonable.” While the U.S. Constitution does not use the term “privacy,” the framework of the Constitution, including the concepts of limited government, checks and balances and fundamental rights of persons (what the Declaration of Independence calls “unalienable Rights”), bears on the dual U.S. notion of “privacy” in general and, more recently, data privacy. Privacy includes the right “to be left alone,” a libertarian view which includes the right to be free from regulation, and the right “to be secure” in your person, houses, papers and effects. Modern data privacy builds on these concepts and adds ideas such as fair data collection practices, openness and transparency, and legitimate purpose of both collection and use.
But it’s by no means perfect, or even mature. The same data collected by similar means about the same person for the same purposes may be entitled to privacy protection or not based on circumstances.
Location data is one such example. If you are driving around and the police spot you changing lanes without a turn signal, certainly their “viewing” of you in a public space is not an invasion of privacy; indeed, you likely have little or no expectation of privacy in the fact that you are in a particular place at a particular time outside. If you begin with the doctrine that people have no reasonable expectation of privacy in their location when they are outside, then you enable a host of technologies: tracking devices, aircraft, drones, satellites, GPS transmitters, automated license plate readers, cell site location data, IMSI catchers, cell phone tracking (apps and cookies) and yes, facial recognition. After all, these technologies simply reveal what the cops can see when you are outside: who and where you are. The Supreme Court has been of several minds about this, approving things such as the “open fields” doctrine allowing overhead surveillance of a person’s back yard (with a 40-foot fence) because the person had no expectation of privacy in what the police could see, and approving the tracking of an electronic beeper installed in a bottle of chemicals used as a precursor for manufacturing drugs. The installation of a GPS transmitter without a warrant was disapproved (because the installation invaded the property interest of the car owner) and the use of infrared tracking inside a home without a warrant was too much of a privacy invasion for the court. Similarly, the capture from the phone company of cell location data without a warrant was also considered to be too much of a privacy invasion. Other technologies including “pole cams” (a camera mounted on a pole adjacent to the place to be surveilled) have met with mixed results in the courts, although the Supreme Court has not yet weighed in on them.
And these are just technologies designed to track movements of people, mostly outside. Even the basic question, “Do you have a reasonable expectation of privacy in your movements (location) in public?” is one we have not answered, and one we cannot answer, because it assumes a binary result: yes or no. But that’s not how privacy works. If we conclude that, well, the cops can pull me over for speeding so there’s no problem with them photographing me at a protest rally and tracking my movements and associations with facial recognition and then exposing my meetings and associations to the public without a warrant, then we have learned nothing about data privacy. The problem is not just the data (location) or the technology (GPS tracking, facial recognition) or the expectation or knowledge that the data is being collected and can be used; the problem is the purpose for which the privacy invasion is being conducted.
The same facial recognition technology that can be used to identify a terrorist entering the Super Bowl can be used, for example, to identify a law enforcement officer who brutally attacks a citizen or a cyclist who assaults girls putting up flyers on a suburban Washington, D.C., bike trail. In the latter case, the police admitted that they had used facial recognition to identify the suspect. But after the police put out incorrect information about the date and location of the assault, private citizens accessed a different man’s data through an app used to track bike rides and incorrectly identified him as the suspect, causing him to be threatened and doxed. The bike tracking technology is great to keep up with your workouts, but not so great for doxing and threatening. Same technology.
It’s not just about data collection. It’s also about data sharing, data aggregation and data analytics, each of which presents unique threats to privacy and personal integrity. The same software that can tell whether you are a Coke or Pepsi person (Pepsi slogan: “We’re out of Coke, is Pepsi OK?”) can be used to determine if you are Antifa, neo-Nazi or Al-Qaeda. Or a Democrat, Republican or Libertarian. Or Catholic, Unitarian or Buddhist. Indeed, the IBM open letter also observed:
Artificial Intelligence is a powerful tool that can help law enforcement keep citizens safe. But vendors and users of AI systems have a shared responsibility to ensure that AI is tested for bias, particularly when used in law enforcement, and that such bias testing is audited and reported.
It’s not just that AI can be biased, or that data on which AI relies can be incomplete or inaccurate. It’s that AI relies on data that often was not collected with the knowledge and intent that it be used in the way the AI program uses the data.
Another problem with the use of AI and personal data is that it can be used for nefarious purposes in other ways. For example, if I want a Democrat to win an election, AI and big data can help me draw up district lines to favor a candidate and decide how many polling stations to open (and close) and what kind of ID to require (or accept) to achieve my goal. Without the underlying data and the ability to crunch it, the manipulation (and the invisibility of that manipulation) is more difficult.
Ethics By Design
We speak of things such as “privacy by design,” but we rarely think of the moral and ethical implications of technology. A children’s toy that recognizes the child by face and personalizes the interaction might seem like a good idea and a fun toy but presents moral issues not only for its misuse but also for its use as intended. While Clearview AI can use “public” data to identify people, the question is not whether it can or whether it is legal, but whether it is moral and ethical. What impact will it have on society in general? What impact will it have on individuals? The debate about Facebook and other social media’s “censorship” (I do not think that word means what you think it means) practices is an example of morals and ethics in computing. Similarly, the tendency of Google’s YouTube algorithm to keep people engaged on the platform by suggesting coarser and more extreme (but engaging) videos, highlighted in The New York Times Daily podcast series “Rabbit Hole,” has profound consequences for our ability to even have a dialogue about race, politics or anything else. “It’s just data,” or, “It’s just a platform,” is not a convincing answer. Maybe we need to have ethicists working for these companies. We certainly need to enhance not only legal but also moral and ethical teachings for computer professionals and for others. When it comes to data collection and use (or any technology), perhaps the answer is summarized by the apocryphal definition of a gentleman as a person who knows how to play the accordion but doesn’t.
Sometimes we can judge a society not by what it does, but by what it chooses not to do.