What Happens When An Algorithm Labels You As Mentally Ill?


Adam Hofmann is an internal medicine physician and founder of Medmo.ca and Preventia Clinics.

SAINT JEROME, Canada — While you might think that your mental health is known only to you, your health practitioner or those closest to you, you might be unwittingly revealing it to strangers online. A series of emojis, words, actions or even inactions can communicate how you feel at a given moment and when collected over time, comprise your “socionome” — a digital catalogue of your mental health that is similar to how your genome can provide a picture of your physical health.

Today, a number of efforts have been made to design algorithms that scan online behaviors for markers of mental illness. Crisis Text Line, a messaging service that connects users to crisis counselors, uses a chatbot to flag texters at risk of suicide and bump them to the front of the helpline. The service’s data scientists have found, for example, that when someone texts the word “Advil” or “Ibuprofen” to Crisis Text Line, his or her risk of attempting suicide is up to 14 times higher than that of the average texter.

Social media platforms such as Twitter, Facebook and Instagram have also implemented, or been used to deploy, algorithms that attempt to identify people at risk of suicide and steer them away from self-harm by directing them to appropriate health services. And while there are certainly potential positive applications of using technology to identify and address mental illness, such as redirecting people to professional help, the potential negatives are significant and must be addressed.

To start, the tools themselves might be crude or rudimentary, designed by people untrained in psychiatry or psychology and thus untethered from the clinical standards used to determine whether or not a patient is depressed. In 2013, Microsoft Research, the company’s research arm, reported that a depression screening tool used data gleaned from over 69,000 social media posts to predict depression with roughly 70 percent accuracy. To the non-physician, this number might seem impressive. But as most clinicians know, this level of accuracy is low. For statistical reasons, screening tests below roughly the 85 percent threshold are rarely useful.
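The statistical problem is one of base rates: when the condition being screened for is uncommon, even a modestly accurate test flags far more healthy people than sick ones. A minimal sketch of that arithmetic, using assumed illustrative numbers — treating the reported ~70 percent accuracy as both sensitivity and specificity, and assuming roughly 7 percent of users are actually depressed:

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """P(actually ill | the test flags you) via Bayes' rule."""
    true_positives = sensitivity * prevalence            # ill and flagged
    false_positives = (1 - specificity) * (1 - prevalence)  # healthy but flagged
    return true_positives / (true_positives + false_positives)

# Assumed numbers for illustration only (not from the Microsoft study):
# 70% sensitivity, 70% specificity, 7% prevalence of depression.
print(f"{positive_predictive_value(0.70, 0.70, 0.07):.0%}")  # → 15%
```

Under these assumptions, only about 15 percent of the people the algorithm labels as depressed would actually be depressed — the other 85 percent would be false positives, which is why clinicians demand much higher accuracy before trusting a screening tool.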

Microsoft Research’s technology was based on common clinical criteria established by the Center for Epidemiologic Studies Depression Scale (CES-D), which screens for depression with a higher accuracy of roughly 87 percent. Similarly, two scholars, a psychologist and a computer scientist, created a tool to determine whether Instagram users were clinically depressed based on pixel-by-pixel analysis of their posts. This tool had, at best, less than 70 percent accuracy as a screening test. Under certain test conditions, it performed only slightly better, and sometimes even worse, than a coin toss.

In more mature fields of medical research, such as pharmacology and therapeutics, physicians and other health professionals know that prior to unleashing any new drugs or devices to the public, whether curative or diagnostic, they must test them extensively to prove that they demonstrate significant benefits and minimize the likelihood of harm. The medical field is also subject to strict health and privacy laws for safeguarding patient data.

Fields involving artificial intelligence and algorithm-based diagnostic tools are not subject to this level of scrutiny and regulation, even when they affect a person’s well-being. There was no regulation, for example, in the creation of the Samaritans Radar app in 2014, which notified users when someone among their Twitter followers appeared to be at risk of suicide, based on phrases such as “help me” and “need someone to talk to.” It was only after significant public outcry about the risks of publicly flagging Twitter users’ emotional state, which could subject already vulnerable people to bullying, that the controversial mental health app was shuttered.

As a physician, I see the risks and consequences of algorithmic overreach in mental health falling into two camps: there are the risks of “false positives,” of labelling healthy people as being mentally ill, and the risks of “false negatives,” of missing cases of mental illness that actively require our attention.

Could false positives, for example, lead people who are not yet depressed to believe they are? One’s mental health is a complex interplay of genetic, physical and environmental factors. We know of the placebo and nocebo effects in medicine, in which blinded recipients of a sugar pill experience either the positive or negative effects of a medicine because they have positive or negative expectations of it. Being told you are unwell might literally make it so.

Then imagine if those labelling tools, accurate or not, were used by third parties — insurers, employers or even the government. Would a person be ineligible for life insurance based on the color scheme or filters used in a series of Instagram photos? Could a government agency decide to rescind an individual’s right to bear arms based on his or her Twitter posts?

New technological tools for the screening and identification of mental illness could also lead to dangerous false negatives. Microsoft Research’s new algorithm, for example, raises the possibility, as the company notes in a paper, of creating “a social media depression index that may serve to characterize levels of depression in populations” — in other words, a heat map for depressed people. From the perspective of a public health official, this might be used to appropriately direct resources such as counsellors, crisis centers or doctors to an at-risk region or perhaps serve as an early-warning system for a substance abuse epidemic, such as with our current opioid crisis.

But false negatives may result from inadvertently excluding certain populations from these heat maps. For example, rural and poor communities, which already suffer from fewer mental health resources, are at risk of being left out because they are more likely to be technologically under-represented. Under the direction of a misguided algorithm, we may be distracted from helping the sick by overanalyzing the digitally active healthy.

While the medical application of the socionome holds a great deal of potential, how data science influences medical diagnostics raises thorny issues. Doctors, after all, are not allowed to practice untested diagnostics on patients, never mind on large numbers of unsuspecting social media users, and then go on to share that data with others without their patients’ consent. If social media data is health data, and data scientists are using it to diagnose mental illness, then we must ask ourselves: shouldn’t we hold them to the same privacy and health standards as medical practitioners?

This was produced by The WorldPost, a partnership of the Berggruen Institute and The Washington Post.