Google’s Ngram attracts a crowd

Take 15 million books and 4 billion words, use a simple interface to search for a few words or phrases, and presto – you have Google’s Ngram Viewer.

Ngram creates charts that show how often words or phrases have occurred in books since 1800. It can be used to trace the rise and decline of certain words, giving clues to researchers. Or it can be used for simple fun (try “Red Sox, Yankees”).

Jon Orwant, leader of Google’s Digital Humanities effort and one of three co-creators of Ngram, spoke to about 75 people at a meeting of the Boston group of Hacks/Hackers at Google in Cambridge on Feb. 9.

Orwant, a former publisher, described the Google Books project and how Ngram can be used. The Books project has now scanned about 15 million books, or more than 10 percent of the estimated 129 million books “printed since Gutenberg,” he said.

(To try out: http://ngrams.googlelabs.com/)

Users can see how often words or phrases (up to 5 words) have appeared in print since 1800.
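
For readers curious about what sits behind such a chart, here is a minimal sketch in Python of the basic idea: count how often a phrase of one to five words appears in each year's texts, relative to all phrases of the same length from that year. It is an illustration only, not Google's implementation. The function name yearly_frequency and the tiny sample corpus are invented for this example.

```python
from collections import Counter


def yearly_frequency(phrase, texts_by_year):
    """Return {year: share of same-length n-grams that match the phrase}.

    texts_by_year is a hypothetical stand-in for a scanned-book corpus:
    a dict mapping a year to a list of text strings published that year.
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    if not 1 <= n <= 5:
        raise ValueError("phrase must contain between 1 and 5 words")

    result = {}
    for year, texts in sorted(texts_by_year.items()):
        counts = Counter()
        for text in texts:
            words = text.lower().split()
            # Slide a window of n words across the text and count each n-gram.
            counts.update(
                tuple(words[i:i + n]) for i in range(len(words) - n + 1)
            )
        total = sum(counts.values())
        result[year] = counts[target] / total if total else 0.0
    return result


# Invented sample data, for illustration only.
sample = {
    1900: ["the red sox beat the yankees"],
    1950: ["the yankees won", "the yankees lost again"],
}

print(yearly_frequency("red sox", sample))
print(yearly_frequency("yankees", sample))
```

Plotting those yearly shares against the year would give a line like the ones the viewer draws. Dividing by each year's total keeps years with more printed text from dominating the chart simply because more was published.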

While it can be a lot of fun, it’s also a scholarly tool. But Orwant, who grew up in Fitchburg where his father was a reporter for the local paper, cautioned that he sees it more as a tool for discovering which questions need to be asked, and “not an oracle.”