Stanford Gave Birth to Google. Now Its Computer Scientists Are Cleaning Up the Mess

Mar 13, 2014 at 3:29 PM ET

In 1995, two graduate students in Stanford’s computer-science program rented a friend’s garage and began working on their thesis project, a solution to the web’s then-biggest problem: There was lots of interesting stuff on the web, but finding it was insanely difficult.

Each day, people and businesses were uploading more and more information, from home addresses and telephone numbers to spinach dip recipes, but few search engines were able to surface it effectively. The students, Larry Page and Sergey Brin, set about building the world’s most powerful web “crawler,” which combed vast expanses of the Internet and returned results almost instantly to the searcher. It worked fantastically well: people were able to retrieve and share information faster than ever. As more people used the search engine, the young scientists began collecting vast troves of data on their customers, a sort of ancillary benefit of being in the business of information. They called it Google, of course.

Google wasn’t alone: other Stanford graduates launched their own data-hungry Internet powerhouses in the same era. David Filo and Jerry Yang, graduate students at Stanford in the early 1990s, started Yahoo in 1994. Paul Flaherty, the co-founder of Alta Vista, another early Internet search engine, graduated from Stanford with a PhD.

Twenty years later, the computer scientists of Stanford are trying to devise solutions to a new form of chaos—one that’s been unleashed by Google and its ilk. Many of today’s problems—the concerns over privacy, fears over government surveillance—were created (or at least enabled) by the companies founded by fellow alumni: computer scientists who graduated in the 1990s.

Check out this report released yesterday. It’s a study by two current Stanford computer-science graduate students, Jonathan Mayer and Patrick Mutchler. Their investigation looks at the type of data captured by NSA surveillance. The conclusion, which really shouldn’t shock anyone who’s been following the Edward Snowden drama, is that metadata is actually a lot more sensitive than the NSA makes it out to be. Metadata includes everything from where, geographically, the calls were made, to the recipients of the calls, to how long each of the conversations lasted.

By studying metadata from people’s Android phones (Android is the operating system built and owned by Google), the researchers were able to decipher patterns in phone usage that were “highly indicative of sensitive activities or traits.” For instance, with one user, the researchers were able to detect an early morning call with her sister. “Two days later, she placed a series of calls to the local Planned Parenthood location.” Even though the content of a call is not necessarily recorded, a researcher could potentially piece together this information to create a compelling narrative about a person’s activity.

The researchers made it clear that the NSA metadata collection program, which accesses technology built by some of the world’s foremost tech companies, is plenty capable of spying, even if the agency says it isn’t doing so.

“Reasonable minds can disagree about the policy and legal constraints that should be imposed on those databases,” they conclude. “The science, however, is clear: Phone metadata is highly sensitive.”

The research is interesting in its own right, but it speaks to a larger shift within the computer science community, especially at Stanford. Twenty years ago, the emphasis at Stanford (and other schools, too) was on building the speed and efficiency of the networks, not necessarily safety or security. The products its graduates created, especially at Google (from Gmail to Maps), were so good, and so easy to use, that most people overlooked the obvious: It was incredibly easy for someone to track their online behavior.

A parallel might be the automotive industry. In the early 20th century, engineers were obsessed with building faster cars. It wasn’t until the latter part of the century that engineers began focusing on safety. Hell, safety belts in cars weren’t even standardized until 1958.

The same thing is happening to the web industry, and it’s most obvious at Stanford.

One PhD dissertation from 2012, titled “Protecting Privacy when Mining and Sharing User Data,” sought to understand how companies like Google protect their users. Privacy seems well on its way to becoming a lucrative business model, too. Snapchat, the ephemeral messaging app that doesn’t leave a digital trace (well, claims not to), has surged in popularity, and now has about 30 million users. The founders, Evan Spiegel and Bobby Murphy, are both graduates of Stanford.

And talks like “How Can We Protect Government Surveillance” and “Privacy-Preserving Distributed Stream Monitoring,” not to mention dozens of new classes focused on privacy, are becoming increasingly common on campus.

The backdrop for this is people freely sharing huge amounts of digital information every day. “The degree of sensitivity among contacts took us aback,” noted the researchers in yesterday’s report. “Participants had calls with Alcoholics Anonymous, gun stores, NARAL Pro-Choice, labor unions, divorce lawyers, sexually transmitted disease clinics, a Canadian import pharmacy, strip clubs and much more. This was not a hypothetical parade of horribles. These were simple inferences, about real phone users, that could trivially be made on a large scale.”