The origins of the term Big Data are found in Big Data itself, reports Steve Lohr in The New York Times (2/4/13). It used to be that tracing the origins of a term was a matter of finding its earliest mention in “news or journal” archives. The term “software,” for instance, first cropped up in a 1958 article by John Tukey, a mathematician. But today any such search must also include “digital artifacts now posted on technical websites,” or, in other words, Big Data. This is certainly fitting, as the “unruly digital data of the web is a big ingredient of what is now being called Big Data.”
The evidence points to John R. Mashey, the “chief scientist at Silicon Graphics in the 1990s,” as the term’s originator, although not because he wrote an academic paper about it: “Instead, he gave hundreds of talks to small groups in the middle and late 1990s to explain the concept and pitch Silicon Graphics products.” At the time, Silicon Graphics “dealt with new kinds of data, and lots of it.” Mashey himself thinks the term is so simple that it is not much to brag about.
“I was using one label for a range of issues, and I wanted the simplest, shortest phrase to convey that the boundaries of computing keep advancing,” he says. Indeed they do: “Just last month, for example, the Library of Congress said its archive of public Twitter messages had reached 170 billion posts and was rising by about 500 million messages a day.” Fred R. Shapiro, editor of The Yale Book of Quotations, says that, among other things, Big Data opens up “new linguistic terrain.” He observes: “What you’re seeing is a marriage of structured databases and novel, less structured materials … It can be a powerful tool to see much more.”