What Do Your Social Media Posts Reveal About Your Health? by Knowledge@Wharton
What have you shared on social media today? Did you comment on last night’s election results; mention that you’re going to the gym later; sympathize with a friend who’s been in the hospital; describe your meal at a favorite burger joint, or display pictures of your daughter’s jazz dance recital?
And what do those posts reveal about your health and your risk for serious medical conditions?
That last question may seem odd, but not to the researchers at the Penn Social Media & Health Innovation Lab at the University of Pennsylvania. Director Raina Merchant and her team are investigating how people’s social media language on sites such as Facebook, Twitter and Yelp can be used to assess their health and predict diseases. The conditions they are looking at are some of the main culprits for premature death and disability (not to mention skyrocketing health care costs) in America, including heart disease, diabetes, hypertension, obesity, chronic lung problems, depression and drug abuse.
Part of the larger Penn Medicine Center for Health Care Innovation, the lab also has a partnership with the Leonard Davis Institute of Health Economics (LDI), which studies ways to improve America’s health care system. Merchant is a senior fellow at LDI as well as an assistant professor of emergency medicine at Penn.
Merchant explains that there are differences in people’s language structure, or the kinds of words they use, that might indicate a disorder or cognitive decline. “Someone might post directly about having a condition, or some [conditions] may be more revealed when people talk about it,” says Merchant. “If someone has a lot of posts that may suggest that they’re depressed, they may not be as overt as ‘feeling sad,’ or ‘blue,’ or ‘unhappy,’ but there may be other words … that suggest depression, that aren’t as obvious.”
While much of the lab’s research is at a relatively early stage, there have been some intriguing initial findings. The team published a study involving Facebook in October 2015 in BMJ (formerly British Medical Journal) in which more than 1,000 patients in the University of Pennsylvania Health System agreed to have their social media data compared with their electronic health record.
One finding: Individuals who were clinically obese according to their medical records were significantly more likely to use words related to being stationary: “sitting, being still, planted, at rest; these sorts of things,” says Merchant. The results were not what the team had predicted; they had thought this group might make frequent references to food or exercise.
David Asch, who directs Penn’s Center for Health Care Innovation, mentions an even more unexpected association that was revealed by another of the team’s ongoing studies: Patients with high blood pressure post more frequently about their children than do people without the condition.
“Dealing with your kids doesn’t cause high blood pressure, although people think it does, colloquially,” noted Asch, who is also a professor of health care management and of operations, information and decisions at Wharton. “We find associations that are on the surface hard to explain, [and which] we wouldn’t have thought of in advance.”
The Privacy Question
Would most Americans agree to this type of surveillance, if they were told it was for the purpose of improving their health? Data mining is not new, of course; marketers have been using it for years to stealthily capture our online behavior and tempt us with ads. Some of this research may even call to mind the 2014 controversy involving Facebook and “emotional contagion.” The company reportedly manipulated the news feeds of nearly 700,000 people without their knowledge, to test whether it could influence individuals to post more positive or negative content. (Facebook asserted that consent was given via its stated Data Use Policy.)
By contrast, in Merchant’s research the idea is to obtain explicit consent and to funnel “actionable” data to patients. “Our hope is, can we collect this information and give it back to patients so that they could really learn from these assumptions we’re making? And how do we also make this available for health care providers, if patients wanted to share with them?”
In the lab’s Facebook study, a large percentage of individuals were in fact willing to participate. The study showed that of 1,432 patients in the University of Pennsylvania Health System who were Facebook and Twitter users and expressed interest in the study, the majority — about 71% — consented to share their social media activity and have it compared with their electronic medical records.
“That was a big finding,” says Merchant. “We don’t know of anyone really having done that before — being able to demonstrate [that people would give consent] and to engage in a very transparent way for data collection.”
Asch says that in his experience with the lab’s experiments so far, people seem to feel comforted by the idea that their health might be “watched over” by their local hospital or health system. “My intuition was that people would think of this as Big Brother,” he said, but he found that the opposite appears to be true. Plus, “a main finding is that although people do care about privacy, they also recognize the value of sharing, to themselves or to society.”
With 3,000 patients currently in the database, the team plans to collect data over the next decade and, according to Merchant, “build this map, this database of digital footprints that people are sharing as information.”
Separating the Signal from the Noise
Is it really possible to get useful health data from social media posts? People say a lot of spur-of-the-moment things online. How does a computer program cope with human beings’ colloquial language, metaphors, sarcasm, and humor? What if the lab’s computer program interprets “BTW, I could have died!” as “I’m depressed and thinking about killing myself”?
“I think [those questions] get at the crux of this,” agrees Merchant. But even joking comments may be relevant. “Even something that is said in jest may be more likely to be used by people with certain conditions than others.”
The team’s task, she says, is to try to separate the signal from the noise. This effort is spearheaded by the lab’s computer scientists, including Lyle Ungar and Andy Schwartz. Ungar, whose expertise is also in biomolecular engineering and operations, runs the group that performs natural language processing: using computers to automatically “read” people’s social media. Schwartz is based at Stony Brook University and works remotely with the Penn Social Media and Health Innovation Lab.
“Social media is an unstructured data source. It doesn’t come with these variables that you can just cleanly plug into your statistical software,” Schwartz points out. “So you have to, first of all, run algorithms that turn the social media — these strings of characters — into some sort of meaningful piece of statistical information.” He also applies the latest machine learning techniques from computer and information science. But even so, the process is challenging.
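To make Schwartz's point concrete, here is a minimal sketch of one common way to turn raw post text into numbers: counting how often each word appears, relative to everything a person has written. This is an illustration only, not the lab's actual pipeline. The function names, the simple tokenizing rule, and the example posts are all hypothetical.

```python
import re
from collections import Counter

# Assumption: a "word" is any run of lowercase letters or apostrophes.
# Real systems use far more careful tokenizers (emoji, hashtags, URLs).
TOKEN_PATTERN = re.compile(r"[a-z']+")


def tokenize(post):
    """Split one post into lowercase word tokens."""
    return TOKEN_PATTERN.findall(post.lower())


def word_frequencies(posts):
    """Return each word's share of all words across a person's posts."""
    counts = Counter()
    for post in posts:
        counts.update(tokenize(post))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


# Hypothetical posts, invented for illustration.
example_posts = [
    "Sitting on the couch all day",
    "Still sitting. Kids are at a recital.",
]
features = word_frequencies(example_posts)
print(features["sitting"])  # share of all words that are "sitting"
```

A table of such frequencies, one row per person, is the kind of structured input that statistical software can compare against a diagnosis in a medical record. Note that word counts alone cannot tell a joke from a sincere statement, which is part of why the process remains challenging.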
Tracking Public Health
In addition to looking at