Google

Wednesday, 07 May 2008

Word 'Bursts' May Reveal Online Trends

18:24 18 February 2003
NewScientist.com news service
Will Knight


Searching for sudden "bursts" in the usage of particular words could be used to rapidly identify new trends and sort information more efficiently, says a US computer scientist.

Jon Kleinberg, at Cornell University in New York, has developed computer algorithms that identify bursts of word use in documents.

While other popular search techniques simply count how often words or phrases occur in documents, Kleinberg's approach also takes into account the rate at which a word's usage increases.
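The difference between counting a word and tracking how fast its usage grows can be illustrated with a minimal Python sketch. This is not Kleinberg's algorithm, which the article does not describe in detail. The function names, the whitespace tokenisation and the toy weblog posts are all assumptions made for illustration.

```python
from collections import Counter

def usage_rates(documents_by_period, word):
    """Return, for each period, the share of all tokens that are `word`."""
    rates = []
    for docs in documents_by_period:
        # Naive tokenisation (an assumption): lowercase and split on whitespace.
        tokens = [t for doc in docs for t in doc.lower().split()]
        counts = Counter(tokens)
        rates.append(counts[word] / len(tokens) if tokens else 0.0)
    return rates

def rate_increases(rates):
    """Return the change in usage rate from each period to the next."""
    return [later - earlier for earlier, later in zip(rates, rates[1:])]

# Invented toy data: three periods of short weblog posts.
periods = [
    ["my cat sat on the mat", "the weather is fine today"],
    ["the cat likes the new scooter", "the weather is cold today"],
    ["scooter scooter everywhere", "everyone wants a scooter now"],
]

rates = usage_rates(periods, "scooter")
increases = rate_increases(rates)
print(rates)
print(increases)
```

In this toy data the usage rate of "scooter" rises faster in the last period than in the one before it. That acceleration is the kind of signal a plain count would not reveal.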

Kleinberg suggests that the method could be applied to weblogs to track new social trends. For example, identifying word bursts in the hundreds of thousands of personal diaries now on the web could help advertisers quickly spot an emerging craze.

Hot or not

The algorithms used to identify these sudden bursts are relatively simple, but very powerful, says Christos Papadimitriou, at the University of California at Berkeley.

"The key is to find unexpected changes in the frequency of the appearance of words," he told New Scientist. Papadimitriou agrees the method could prove valuable when searching for new trends in weblogs.

The approach could also be applied to sifting through other types of information. Identifying word bursts within email messages sent to a company's customer support address might help maintenance staff spot a major new problem.
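One simple way to make "unexpected changes in frequency" concrete is to compare each new count with the recent past. The Python sketch below flags a day as a burst when a word's count sits several standard deviations above the average of the preceding days. It is a generic baseline comparison, not the method Kleinberg or Papadimitriou describe. The function name `find_bursts`, the `window` and `threshold` parameters and the daily counts are hypothetical.

```python
from statistics import mean, pstdev

def find_bursts(counts, window=4, threshold=3.0):
    """Return indices whose count is far above the trailing-window average."""
    bursts = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        baseline = mean(history)
        # Fall back to 1.0 when the history is perfectly flat (zero spread).
        spread = pstdev(history) or 1.0
        if (counts[i] - baseline) / spread >= threshold:
            bursts.append(i)
    return bursts

# Invented daily counts of one word in customer support email.
daily_counts = [2, 3, 2, 3, 2, 3, 15, 4, 3]
print(find_bursts(daily_counts))  # [6]
```

Only day 6 is flagged. The days after the spike are not, because the spike itself raises the baseline they are compared against.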

Researchers at Google, the world's most widely used internet search engine, have already shown that identifying spikes in search terms can be used to track the spread of news and rumours around the world. The algorithms that run Google's automated news aggregation service remain secret, but it is not difficult to imagine that word bursts could, or do, have a useful role.

British savages

In a simple historical test of the technique, Kleinberg analysed all the annual State of the Union addresses given by US Presidents since 1790. He found that particular word "bursts" could indeed be linked to important events at the time the speeches were delivered.

In the years that immediately followed the American Revolution, for example, sudden bursts in the use of words such as "militia", "British" and "savages" are found.

From 1930 to 1937 a spike in the use of the word "depression" is seen. And from 1949 to 1959 "atomic" is the word with the greatest "burstiness". Later in the 20th century, words such as "Vietnam", "Soviet", "communist" and "Afghanistan" increase sharply in usage.
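Year ranges like those above could be produced by grouping adjacent years in which a word is unusually common. The Python sketch below does this with a fixed cutoff relative to the overall average. This is a deliberate simplification, not the procedure Kleinberg used on the speeches. The function name, the `factor` parameter and the per-year figures are invented for illustration and are not taken from the State of the Union addresses.

```python
def burst_intervals(rate_by_year, factor=2.0):
    """Group adjacent years whose rate is at least `factor` times the mean."""
    years = sorted(rate_by_year)
    if not years:
        return []
    cutoff = factor * sum(rate_by_year.values()) / len(years)
    intervals, start, prev = [], None, None
    for year in years:
        hot = rate_by_year[year] >= cutoff
        if hot and start is None:
            start = year                      # a burst begins
        elif not hot and start is not None:
            intervals.append((start, prev))   # the burst ended last year
            start = None
        prev = year
    if start is not None:                     # burst runs to the final year
        intervals.append((start, prev))
    return intervals

# Invented occurrences per 10,000 words for one hypothetical word.
rate_by_year = {2000: 1, 2001: 2, 2002: 1, 2003: 15,
                2004: 18, 2005: 16, 2006: 2, 2007: 1}
print(burst_intervals(rate_by_year))  # [(2003, 2005)]
```

A fixed cutoff is easy to follow, but it is sensitive to the choice of `factor` and to noisy years.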

Kleinberg presents his findings on Tuesday at the American Association for the Advancement of Science's annual meeting in Denver, Colorado.