Sentences that endure in the public mind are evolutionary success stories, a view that compares "the fitness of language and the fitness of organisms." On April 30, 2012, Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon Kleinberg, and Lillian Lee of the Department of Computer Science at Cornell University published "You had me at hello: How phrasing affects memorability" (arXiv:1203.6360v2 [cs.CL], 30 Apr 2012). Using the "memorable quotes" selected from the Internet Movie Database (IMDb) and the number of times that a particular movie line appeared on the Internet, they compared the memorable lines to the complete scripts of the movies in which they appeared, about 1,000 movies in all.
"To train their statistical algorithms on common sentence structure, word order and most widely used words, they fed their computers a huge archive of articles from news wires. The memorable lines consisted of surprising words embedded in sentences of ordinary structure. 'We can think of memorable quotes as consisting of unusual word choices built on a scaffolding of common part-of-speech patterns,' their study said.
"Consider the line 'You had me at hello,' from the movie 'Jerry Maguire.' It is, Mr. Kleinberg notes, basically the same sequence of parts of speech as the quotidian 'I met him in Boston.' Or consider this line from 'Apocalypse Now': 'I love the smell of napalm in the morning.' Only one word separates that utterance from this: 'I love the smell of coffee in the morning.'
"This kind of analysis can be used for all kinds of communications, including advertising. Indeed, Mr. Kleinberg’s group also looked at ad slogans. Statistically, the ones most similar to memorable movie quotes included 'Quality never goes out of style,' for Levi’s jeans, and 'Come to Marlboro Country,' for Marlboro cigarettes.
"But the algorithmic methods aren’t a foolproof guide to real-world success. One ad slogan that didn’t fit well within the statistical parameters for memorable lines was the Energizer batteries catchphrase, 'It keeps going and going and going.'
"Quantitative tools in the humanities and the social sciences, as in other fields, are most powerful when they are controlled by an intelligent human. Experts with deep knowledge of a subject are needed to ask the right questions and to recognize the shortcomings of statistical models.
"'You’ll always need both,' says Mr. [Matthew] Jockers, the literary quant. 'But we’re at a moment now when there is much greater acceptance of these methods than in the past. There will come a time when this kind of analysis is just part of the tool kit in the humanities, as in every other discipline'" (http://www.nytimes.com/2013/01/27/technology/literary-history-seen-through-big-datas-lens.html?pagewanted=2&_r=0&nl=todaysheadlines&emc=edit_th_20130127, accessed 01-27-2013).
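The idea quoted above, that a memorable line pairs surprising word choices with ordinary sentence structure, can be illustrated with a minimal sketch. This is not the Cornell group's method: it assumes a toy add-one-smoothed unigram model, and the five-sentence background corpus and the names `BACKGROUND`, `train_unigram`, and `avg_log_prob` are hypothetical stand-ins for the news archive and statistical models described in the article.

```python
import math
from collections import Counter

# Hypothetical toy background corpus standing in for a news-wire archive.
BACKGROUND = [
    "i love the smell of coffee in the morning",
    "she drinks coffee in the morning",
    "i met him in boston",
    "the smell of rain in the evening",
    "we love the city in the summer",
]


def train_unigram(sentences):
    """Count how often each word occurs in the background text."""
    counts = Counter(word for s in sentences for word in s.split())
    return counts, sum(counts.values())


def avg_log_prob(line, counts, total):
    """Mean add-one-smoothed log-probability per word.

    Lower values mean the wording is more surprising relative to the background.
    """
    words = line.lower().split()
    vocab = len(counts) + 1  # one extra slot reserved for unseen words
    return sum(math.log((counts[w] + 1) / (total + vocab)) for w in words) / len(words)


counts, total = train_unigram(BACKGROUND)
ordinary = avg_log_prob("I love the smell of coffee in the morning", counts, total)
memorable = avg_log_prob("I love the smell of napalm in the morning", counts, total)

# The two lines differ in a single word, so the gap in scores comes
# entirely from how unexpected that word is in the background text.
print(f"ordinary:  {ordinary:.3f}")
print(f"memorable: {memorable:.3f}")
```

Under these assumptions the "napalm" line scores lower than the "coffee" line, since "napalm" never occurs in the background text. A fuller treatment would also compare part-of-speech sequences, which this sketch leaves out.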