Feb. 05 2009

Understanding how many adjectives a community uses, or how many are associated with a given topic, is fundamental to understanding how opinion is expressed. Adjectives are an important sub-area that any opinion mining technology needs to master (along with all the other forms of opinion expression).

Understanding the growth of nouns in a community, and the appearance of new ones, is an important signal when tracking conversations and social trends at all levels.

A simple experiment to explore this space is to scan a collection of documents and graph the appearance of hitherto unseen terms. The graph below shows this for a sample of blog data. The x-axis shows the number of documents inspected; the y-axis shows the cumulative number of distinct word types seen so far for each part of speech (NN = nouns, JJ = adjectives, VB = verbs, RB = adverbs).
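
The counting behind such a graph can be sketched roughly as follows. This is a minimal illustration, not the original author's code. It assumes the documents have already been run through a part-of-speech tagger that emits Penn Treebank style tags, and it assumes that sub-tags such as NNS or VBZ are folded into their parent class by prefix. The function name `lexical_growth` and the sample data are made up for the example.

```python
# Assumption: the four classes from the text, matched by tag prefix
# (so NNS counts as NN, VBZ as VB, and so on).
TRACKED = ("NN", "JJ", "VB", "RB")


def lexical_growth(tagged_docs):
    """Return {tag: [cumulative distinct word types after each document]}.

    tagged_docs is a list of documents, each a list of (token, tag) pairs.
    """
    seen = {tag: set() for tag in TRACKED}
    growth = {tag: [] for tag in TRACKED}
    for doc in tagged_docs:
        for token, tag in doc:
            for prefix in TRACKED:
                if tag.startswith(prefix):
                    # Case-fold only; no lemmatization, so "dog" and
                    # "dogs" count as two different types.
                    seen[prefix].add(token.lower())
                    break
        # One y-value per class for this point on the x-axis.
        for prefix in TRACKED:
            growth[prefix].append(len(seen[prefix]))
    return growth


# Tiny hand-tagged sample standing in for real blog posts.
docs = [
    [("The", "DT"), ("quick", "JJ"), ("fox", "NN"), ("runs", "VBZ")],
    [("A", "DT"), ("lazy", "JJ"), ("fox", "NN"), ("sleeps", "VBZ"),
     ("soundly", "RB")],
    [("Quick", "JJ"), ("dogs", "NNS"), ("bark", "VBP")],
]
growth = lexical_growth(docs)
print(growth["NN"])  # [1, 1, 2]
print(growth["JJ"])  # [1, 2, 2]
```

Each list holds the y-values for one curve, with the list index (plus one) as the number of documents inspected. A curve that keeps climbing steeply means new types of that class are still appearing; one that flattens means the vocabulary for that class has largely been seen.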
(via Data Mining: Text Mining, Visualization and Social Media: Lexical Growth in the Blogosphere)