Why What You Say Reveals More Than You Think

From Thomas Jefferson’s “All men are created equal” to John F. Kennedy’s “Ask not what your country can do for you — ask what you can do for your country,” simple words strung together in distinctive ways have the power to move people. But imagine if Jefferson instead said, “Each person is not worse than the next,” or Kennedy rephrased to “Don’t just take for yourself but give to your country” — would these quotes have become just as famous?

Academics tackled these and other linguistic analysis questions at the recent interdisciplinary “Behavioral Insights from Text Conference,” organized by Wharton’s Risk Management and Decision Processes Center. More than a theoretical exercise, the multi-university research presented at the conference examined the subtle psychology behind word choice and showed how it can have deep implications for people, business and society.

People’s word choices can reveal such things as their mental health, their ability to persuade or even whether they’ll default on a loan. A company’s choice of pronouns can affect a customer’s experience and whether it will lead to a purchase. Words used by the media influence how the public thinks about social issues like casino gambling. And the ordering of genders (“men and women” vs. “women and men”) affects whom the reader believes is on top.

“The whole world of text analysis is so exciting,” said James Pennebaker, psychology professor at the University of Texas at Austin who co-developed the Linguistic Inquiry and Word Count (LIWC) system, which is widely used for text analysis. He said word analysis is more reliable than asking people to document what they are thinking or feeling because these “self-reports are poorly related to real world behaviors.”

“Self-reports are self-theories. They are theories about who we think we are,” continued Pennebaker, a keynote speaker at the conference. It’s tough to change the narrative people believe about themselves. “To change a self-theory is really hard. That is why language [analysis] is interesting.” Language betrays what the speaker or writer is truly feeling, even on a subconscious level, much like a Freudian slip betrays one’s real thoughts.

Today, as social media, mobile apps and web technologies fuel an explosion of virtual conversations, text analysis is having a field day. “This is the beauty of Big Data,” Pennebaker said. “It’s allowing us to see things we haven’t seen before.”

How Public Opinion Changes

The importance of word choices, as well as how words are framed, is exemplified by their ability to influence public debates, with widespread implications for society. Lillian Lee, a professor at Cornell University interested in natural language processing and social interaction, cited as an example the words used in the debate over genetically modified foods.

Supporters call it a “green revolution,” connoting sustainability, which is a good thing. But detractors call it “Frankenfood,” framing it in terms of an out-of-control monster created by science. “There are people putting a lot of thought into trying to use phrasing to get the public to think about issues a certain way,” Lee said. “Public opinion matters.”

Ashlee Humphreys, a journalism professor at Northwestern University, took three decades’ worth of archived articles from the nation’s largest newspapers to understand why public outrage over casino gambling has changed over time.

When the idea first arose of cities building casinos as a revenue generator, people were concerned that it would lead to mass gambling addictions and increased crime in their neighborhoods. To gauge sentiment, she analyzed categories of words used in the stories at the time. Words such as ethical, bible and law signified ‘purity’; guilty, illegal, arrested and sin denoted ‘filth’; junket, limo and yacht signaled ‘wealth’; welfare, slum and ghetto spoke of ‘poverty.’
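The category approach described above can be sketched in a few lines of Python. This is only an illustration, not the method used in the study: the word lists contain just the example words quoted in this article (the actual dictionaries were presumably larger), and the function name `category_counts` and the sample sentence are made up for the example.

```python
import re

# Example words taken from the article; the study's full word lists
# are not given here, so these tiny sets are an assumption.
CATEGORIES = {
    "purity": {"ethical", "bible", "law"},
    "filth": {"guilty", "illegal", "arrested", "sin"},
    "wealth": {"junket", "limo", "yacht"},
    "poverty": {"welfare", "slum", "ghetto"},
}


def category_counts(text):
    """Count how many words in `text` belong to each category."""
    # Lowercase the text and split it into alphabetic words.
    words = re.findall(r"[a-z]+", text.lower())
    counts = {name: 0 for name in CATEGORIES}
    for word in words:
        for name, vocabulary in CATEGORIES.items():
            if word in vocabulary:
                counts[name] += 1
    return counts


# Hypothetical sentence, not drawn from the newspaper archive.
sample = "Critics called the casino a sin and warned of illegal junket deals."
print(category_counts(sample))
```

Running counts like these over articles grouped by year would show whether one category of words is becoming more or less common over time.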

Over time, the use of ‘filth’ words decreased. “People were no longer talking about casino gambling in terms of good and evil,” Humphreys said. The news story became more about local governments raising tax revenue from casinos. “A more ‘rational’ discourse took hold as the ‘purity’ and ‘filthy’ discourse waned and was about on par with ‘wealth’ and ‘poverty,’” she said. “As this happened, casinos became more legitimate.”

Humphreys also looked at the declining public outrage around oil spills. She looked at the massive 2010 BP oil spill in the Gulf of Mexico (which spawned a movie called Deepwater Horizon). Right after the spill, BP’s stock dropped by 40%, public support for offshore drilling fell, and consumer confidence also dipped.

Two years later, people had forgotten about the spill and sentiment had recovered. Overall oil production also exceeded pre-accident levels. BP’s stock price rebounded to 80% of its pre-spill value, and public support for offshore drilling went back to near pre-accident levels. Consumer trust in the energy industry came back as well.

“The question is, why did this happen? Why didn’t people stay upset?” Humphreys asked. “What media narratives are used to explain and contain these fears?” To find the answer, she compared news coverage of the BP oil spill to the Exxon Valdez oil spill in 1989 in Alaska. With Exxon, the “cultural narratives and public templates had to be worked out — how do you deal with such a disaster? This kept Exxon in the news much longer.”

With BP, the media brought up the consequences of past oil spills — such as lawsuits and government fines for Exxon — and then closed the issue. “A year out, nobody was talking about oil anymore. That’s not where the discourse moved,” Humphreys said. News coverage shifted to containing the spill, investigating the causes of the accident and focusing on the folks responsible.

Something similar is happening to public opinion about the legalization of marijuana. Pot is currently legally cleared for medical use in 30 states and recreational use in eight states, Humphreys said. The image of marijuana users is slowly changing, from lazy stoners to health buffs as the plant is increasingly being incorporated into legitimate products like chocolate bars and body butters. “We see growing emergence, acceptance,” she said.

Linguistic ordering of genders can also affect how the public views who is more powerful. According to research by Selin Kesebir, professor of organisational behaviour at the London Business School, if a man is mentioned before a woman, he is seen to be in a more dominant or central position — and vice versa.

In an experiment, Kesebir showed two versions of a news article about townspeople protesting a power plant proposal. In one version, the story said “some of the town’s men and women are out on the streets.” The other version reversed the genders, “some of the town’s women and men are out on the streets.”

When readers were asked which gender played a more central role in the protest, 66% of those who read “men and women” chose the men. Among those who read the version with “women and men,” 71% said women played a more central role. These results have implications for how public perceptions can be influenced based on the placement of language.

How to be Persuasive

Word choice also has a profound impact on one’s ability to persuade, according to Cornell’s Lee. Being persuasive is a handy skill, whether you are trying to prevail in business meetings or get your kids to go to bed on time. Lee analyzed the posts and threads in an online debating forum called ChangeMyView on Reddit. Users would post opinions and explain why they held these beliefs. Other people would post counter-arguments to try to change their minds. The successful post would be flagged with a delta symbol.

Lee discovered that successful counter-arguments were ones that provided new information but were written in a style similar to that of the original poster. “They told me something I didn’t know before,” she said. However, it does pay to know when to stop: people who kept arguing didn’t change minds. “Too much back and forth equals lost cause,” Lee said. “If you go on that long, stop talking. The kind of people who keep going that long aren’t necessarily the kind of people who are persuasive.”

Paul DiMaggio, sociology professor at New York University, also looked at persuasiveness but in a corporate setting. He analyzed the discussions at IBM’s 2004 World Jam, a 54-hour online conversation to brainstorm solutions to global challenges faced by the company. There was no anonymity — everyone had to register. Moderators did not remove or edit posts. Out of more than 31,000 comments only 282 were selected for further development. Why were they singled out?

By applying text analysis to the chosen comments, DiMaggio discovered that successful posts were of higher quality (longer, more thoughtful, generating more discussion, with the writer taking time to respond) or were focused on core topics important to IBM. There also was one unexpected finding: Successful posts tended to be ones that differed in style from those of executives. So mimicking the way the top brass talked didn’t work. He also found that men were not favored, nor were executives.

How about the discarded comments? DiMaggio discovered that hasty responses typically were not chosen. Posts that had a high level of excitement also didn’t get an edge. Displaying pride at being an IBM employee had no bearing on being chosen. Responses from the U.S. also generally were not favored. Moreover, nobody liked to talk about IBM products.

What Drives Popularity?

Another insight about words is that they can predict popular success. That’s a finding by Grant Packard, marketing professor at Wilfrid Laurier University in Canada, and Wharton marketing professor Jonah Berger. Packard presented results from their paper “Are Atypical Songs More Popular?” at the conference.

Their research used text analysis and natural language processing methods to determine why some songs become more popular than others. They pulled the lyrics of the top 50 Billboard songs for every three months spanning three years for each of the seven major genres (Christian, country, dance, pop, rap, rock and R&B). Their data set also included the artist, promotional activity and support, as well as radio airplay.

What they found was that songs that shot up the charts were more distinctive than other songs in the same genre. And it doesn’t take much: A 16% differentiation is enough to make a song move one notch up the charts. “Subtle variation in lyrical topics produces a relatively big incremental in commercial success,” Packard said. These results held true even when accounting for differences in artist, promotional activity and other factors.

However, songs cannot be too different or else they turn off the listener. “We look for novelty and experience,” Packard said. “We want things that are known to us but novel to make us engage further. It needs to fit with our experience but push us slightly away from it.… Novelty has to be distinguished by the bounds of our own experience.” For example, a blue rubber duck will attract people because it’s not the typical yellow — as long as it retains the shape and texture of the original.

Lee came to a similar conclusion with another experiment she ran involving movie quotes. She tried to discover why certain movie quotes go viral while others are forgettable. Lee found out that “on average, memorable quotes significantly contain more surprising combination of words. … When things are unusual, people remember them.” However, the sentences tend to be simpler in structure. For instance, she said, “you’re gonna need a bigger boat” is more memorable than “you’re going to need a boat that is bigger.”

Emotional volatility also predicts how movies will fare, according to other research by Berger. He studied movie scenes and plotted their emotional trajectory using text analysis and natural language processing. He discovered that movies that are more emotionally volatile — they have higher peaks and lower lows — overall get higher ratings. Berger said that a 10% increase in emotional volatility translates to a 1% increase in ratings. However, he cautioned that if a movie whipsaws audiences too often with highs and lows, it backfires. Viewers get exhausted.

Language as a Health Predictor

There’s also research showing that social media chatter can help predict a person’s mental and physical health. Lyle Ungar, professor of computer and information science at the University of Pennsylvania, parsed through troves of Facebook data to measure psychological traits. “How does language inform what we can learn about people?” he asked.

Words people use can predict their gender 92% of the time, Ungar said. For example, women tend to use the following words on Facebook more often: shopping, excited, “love you,” yay and birthday. Men tend to use more profanity as well as the words Xbox, girlfriend, war, YouTube and PS3. Words can also help pinpoint whether someone is extraverted: They use words or phrases like “can’t wait,” chillin, party, weekend, girls. Introverts favor anime, internet, manga, computer, sigh, Pokemon and others.

Word choice can also pinpoint mental health, Ungar discovered. More neurotic people tend to post online that they are “sick of” or hate something. Other words they use more often are kill, dead, bloody, alone, bored and stupid. Less neurotic people talk about religion and sports, use phrases such as “life is good” and “beautiful day,” and use words like beach, success, workout, soccer, church and blessed.

People with high stress talk online about pain, anxiety, being tired, hurting, depression and headaches. Low-stress people convey enthusiasm about today, vacations, breakfast and being “pumped.” “Why is this useful? We can estimate people’s personality and how this personality correlates with behavior, such as showing up in the hospital” if they’re sick and being willing to take care of themselves, Ungar said.

Deadbeats and Fitting In

When it comes to giving out loans, a person’s financial information and credit score are what lenders rely on to make sure the money borrowed will be paid back. But if lenders also analyzed what borrowers write when applying for loans, the prediction of default or repayment would be even more accurate, according to Oded Netzer, a Columbia Business School professor.

Netzer looked at data from a peer-to-peer lender in which borrowers didn’t need to put down any collateral and everything was executed online. Borrowers provided their personal financial information, debt-to-income ratio and credit scores, among others. They also had the option to explain in their own words why they wanted the loan. There were 140,000 loan requests, and Netzer focused on the 18,000 loans granted, of which a third defaulted. Using several machine learning methods, he discovered that “text significantly helped predict [default] than just financials alone.”

What words were used more often by loan defaulters? They would talk about external things such as God, relatives, their mother. They also were more likely to explain why they needed the money (child support) and why their credit score wasn’t better. They used the future tense more often and referred to shorter time spans (weeks, not months). They mentioned how hardworking they were and preferred “extreme” words like “perfect, all, final, great, everything.” They were polite: “Hello,” “Thank you,” “God bless you.” They also tended to include a desperate plea: “I need help.”

What do good borrowers sound like? Netzer said they tended to use more complex language. They talked about the longer term (months and years) more than the short term. They understood debt and finances and mentioned upcoming events signaling a brighter financial future, such as graduations and weddings.

Another business application of text analysis is predicting cultural fit. Amir Goldberg, professor of organizational behavior at Stanford University, used text analysis to examine what makes an employee fit in better with an organization. Specifically, he tested to see which trait was better for the employee: perceptual accuracy — the ability to accurately read the corporate culture — or value congruence, the worker’s personality being already similar to the company’s culture. (For instance, a Type A personality would fit in with a hard-charging, driven company.)

For his research, Goldberg looked through seven years of emails sent and received by more than 1,200 employees at a midsized U.S. tech firm. He used linguistic models to measure behavioral fit. Goldberg found out that the ability to accurately read the corporate culture and adjust one’s linguistics accordingly makes an employee a better fit. “Perceptual accuracy is more consequential for the ability to read the cultural code and behave compliantly than value congruence,” he said. “Peers matter. Culture is learned from those with whom one interacts.”

Reaching Customers

Kartik Hosanagar, Wharton professor of operations, information and decisions, looked at another business use case for text analysis: How companies can elicit greater engagement from customers on social media, such as likes, comments and clicks. He noted that social networks reach over 3 billion people and account for a quarter of people’s time spent online. Still, only 1% of Facebook fans engage with brands and only about 0.2% of posts actually reach the audience for whom they were intended.

That means companies have to do a better job to get their followers to interact with them. To find out what content was effective, Hosanagar looked at 160,000 unique messages posted by 782 firms and recorded followers’ responses. He tested two types of content: informative (product facts, brands, prices) and brand personality (humor, remarkable facts and others).

The result? “Emotional and philanthropic messages … drive higher levels of likes and comments but product and price information elicit lower levels of likes and comments,” Hosanagar said. That means “informative content, when used, should be combined with persuasive content.” The results hold for click-throughs, with brand personalities doing better than just informative posts. One exception: promotional deals. People tend to click on them.

Sarah Moore, a professor at the University of Alberta in Canada, used text analysis to figure out how companies can deliver better customer service. In particular, she looked at the use of pronouns “I,” “we,” and “you” in a company’s communications with customers. “We look at how personal pronouns signal to customers things about the firm,” she said. Customers hear these pronouns all the time: “Your call is important,” “Your patience is greatly appreciated,” “Thank you for your question.”

In her experiment, she sent emails to a random sample of the top 100 online retailers. Half inquired about international shipping and the other half complained about the website. Moore discovered that 40% of customer service agents did not use “I” in their emails when responding to her inquiry or complaint. They would say things like, “We’re happy to help you.”
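A rough sketch of how such replies could be screened for first-person singular pronouns is shown below. It is an illustration only, not Moore's actual procedure: the pronoun list, the function names `uses_i` and `share_without_i`, and the two sample replies are all assumptions made for the example.

```python
import re

# Assumed list of first-person singular forms; the study's exact
# definition of "I" pronouns is not given in the article.
FIRST_PERSON_SINGULAR = {"i", "i'm", "i'll", "i've", "i'd", "me", "my"}


def uses_i(text):
    """Return True if the reply contains a first-person singular pronoun."""
    # Normalize curly apostrophes so "I’m" and "I'm" are treated alike.
    normalized = text.lower().replace("’", "'")
    words = re.findall(r"[a-z']+", normalized)
    return any(word in FIRST_PERSON_SINGULAR for word in words)


def share_without_i(replies):
    """Fraction of replies that never use a first-person singular pronoun."""
    if not replies:
        return 0.0
    missing = sum(1 for reply in replies if not uses_i(reply))
    return missing / len(replies)


# Hypothetical replies, not taken from the retailers' actual emails.
replies = [
    "We're happy to help you.",
    "I'm happy to help you with that order.",
]
print(share_without_i(replies))
```

A simple word list like this misses context, so a real analysis would likely need a fuller pronoun dictionary and some manual checking.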

Moore said her research showed that the customer would be happier if the agent used “I” because it suggests that the agent feels for the customer and acts on her behalf. It signals that the agent is attempting to understand the problem, empathizes with the customer, takes responsibility for what happened and has some level of autonomy to act. “‘I’ pronouns increase satisfaction with the agent and increase purchase intentions with the firm,” she said.

However, agents should be careful about using “you” when talking to the customer. “‘You’ is bad when it’s used not as the object but the subject: ‘you do this’ or ‘you do that,’” Moore said. So saying something like “If you have your username, you can look into the account” would not be well received, she said. “It shifts responsibility to the customer, and they don’t like that.”

Article by Knowledge@Wharton