I want to use this blog to look at how data and emerging technologies affect us – or, more precisely, YOU. As a tech ethics researcher, I’m perpetually reading articles and reports that detail the multitude of ways in which data can be used to anticipate bad societal outcomes: criminality, abuse, corruption, disease, mental illness, and so on. Some of these get oxygen, some of them don’t. Some of them have integrity, some don’t. Often these tests, analyses, and studies identify problems that gesture toward ethically “interesting” solutions.
Just today, this article caught my attention. It details a Canadian study that tries to get to grips with an endemic problem: suicide among young people. Just north of the border, suicide accounts for no less than 24% of deaths among those aged 15 to 24 (Canadian Mental Health Association). Clearly, this is not a trivial issue.
In response, a group of researchers have tried to identify the signs of self-harm and suicide by studying the social media posts of those in the most vulnerable age bracket. The team – from SAS Canada – have even speculated that “these new sources could provide early indication of possible trends to guide more formal surveillance activities.” So, with the prospect of officialdom being dangled before us, it’s important to ask how this social media analysis works. In short, might any one of us end up being surveilled as a suicide risk if we happen to make a trigger comment or two on Twitter?
Well, the answer seems to be “possibly”. The work harvested 2.3 million tweets, of which 1.1 million were identified as “likely to have been authored by 13 to 17-year-olds in Canada”. That determination was made by a machine learning model trained to predict age from the way young people use language. So, if the algorithm thinks you tweet like a teenager, you’re potentially on the hook. From there, the team looked at which of these tweets related to depression and suicide, and “picked some specific buzzwords and created topics around them, and our software mined those tweets to collect the people.”
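To make that pipeline a little more concrete, here is a minimal sketch of what this kind of keyword-based “people collection” might look like. To be clear, the age markers, buzzwords, and helper functions below are illustrative stand-ins of my own, not SAS Canada’s actual models or terms.

```python
# Illustrative sketch only: crude stand-ins for the two filters described above
# (an age guesser and a buzzword filter), not the study's actual method.

SUICIDE_BUZZWORDS = {"hopeless", "end it all", "no point anymore"}  # hypothetical list


def looks_like_teen(text: str) -> bool:
    """Stand-in for a machine learning model that guesses age from language style."""
    teen_markers = {"lol", "omg", "literally", "lowkey"}  # crude proxy features
    return any(marker in text.lower() for marker in teen_markers)


def mentions_buzzword(text: str) -> bool:
    """Flag a tweet if it contains any of the hand-picked buzzwords."""
    lowered = text.lower()
    return any(word in lowered for word in SUICIDE_BUZZWORDS)


def collect_people(tweets: list[dict]) -> set[str]:
    """Return the authors of tweets that pass both filters."""
    return {
        tweet["author"]
        for tweet in tweets
        if looks_like_teen(tweet["text"]) and mentions_buzzword(tweet["text"])
    }
```

Even at this toy scale, notice that the output is a set of people rather than an aggregate statistic – which is exactly where the discomfort begins.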
Putting aside the undoubtedly harrowing idea of “people collection”, it’s important to highlight the usefulness of this study. The data scientists involved insist that the data they’ve collected can help them narrow down the Canadian regions that have a problem (although one might contest that the suicide statistics themselves should reveal this), and/or identify a particular school or time of year in which the tell-tale signs are more widespread or stronger. This in turn can help better target campaigns and resources, which – of course – is laudable, particularly if it is an improvement on existing suicide statistics. It only starts to get ethically icky once we consider what further steps might be taken.
The technicians on the project speculate as to how this data might be used in the future. Remember, we are not dealing with anonymized surveys here, but real teen voices “out in the wild”: “He (data expert Jos Polfliet) envisions the solution being used to find not only at-risk teens, but others too, like first responders and veterans who may be considering suicide.”
Eh? Find them? Does that mean it might be used to actually locate real people based on what they’ve tweeted in their own time? As with many well-meaning data projects, everything suddenly begins to feel a little Minority Report at this point. Although this study is quite obviously well-intentioned, we are fooling ourselves if we don’t acknowledge the levels of imprecision we’re dealing with here.
Firstly, without revealing the actual identities of every account holder picked out by the machine learning, we have no way of knowing how accurately the researchers have identified 13 to 17-year-olds. Although the use of certain language and terminologies might be a good proxy for the age of the user, it certainly isn’t an infallible one in the wacky world of the internet.
Secondly, the same is true of the suicide and depression-related buzzwords. Using a word or phrase typically associated with teen suicide is not a sufficient condition for a propensity towards suicide (indeed, it is unlikely even to be a necessary condition). As Seth Stephens-Davidowitz discusses in his new book Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are, in 2014 there were 6,000 Google searches for the exact phrase “how to kill your girlfriend”, and yet there were “only” 400 murders of girlfriends. In other words, not everyone who vents on the internet is in earnest, and many who are earnest in their intentions may never surface on the internet at all. So, in short, we don’t know exactly what we’ve got when we look at these tweets.
Lastly, without having read the full methodology, it appears that these suicide buzzwords were hand-picked by the team. In other words, they were selected by human beings, presumably based on what sorts of things they deemed suicidal teens might tweet. Fair enough, but not particularly scientific. In fact, this sort of process can be riddled with guesswork and human bias. How could you possibly know with any certainty, even if guided by a physician or psychiatrist, exactly which kinds of words or phrases denote true intention and which denote teenage angst?
Hang on a second – you might protest – couldn’t these buzzwords have been chosen by a very clever, objective algorithm? Yet even if a clever algorithm could somehow tell the difference between an “I hate my life” tweeted by a genuinely suicidal teen and an “I hate my life” tweeted by a tired and hormonal teenager (perhaps based on the language it was couched in), to make that call it would have to have been trained on the tweets of teens who had either a) committed suicide or b) been diagnosed with or treated for depression. To harvest such tweets, the researchers would have to rely on more than Twitter alone… all of that information would have to be cross-referenced with other databases (like medical records) in ways that would undoubtedly de-anonymize.
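For what it’s worth, here is a minimal sketch of why that is: a supervised classifier only learns to make that distinction from labelled examples, and each label has to come from somewhere outside Twitter. The data, labels, and model choice below are invented for illustration and are not taken from the study.

```python
# Hypothetical sketch: to separate "genuinely at risk" from "just venting",
# a supervised model needs a ground-truth label for every training tweet.
# Those labels can only come from cross-referencing accounts with outside
# sources (e.g. clinical records) -- precisely the de-anonymizing step at issue.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = [
    "I hate my life",                        # account later known to be at risk
    "I hate my life, so much homework lol",  # ordinary venting
    "nothing matters anymore",               # at risk
    "this weather is the worst",             # venting
]
labels = [1, 0, 1, 0]  # 1 = known at-risk outcome, 0 = not

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(tweets, labels)
print(model.predict(["I hate my life, exams tomorrow"]))
```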
So, with no guarantees of accuracy, the prospect of physical intervention by social services or similar feels like a scary one – as does the idea of ending up on a watchlist because of a bad day at school. Particularly when we don’t know how this data would be propagated forward…
Critically, I am not trying to say that the project isn’t useful, and SAS Canada are forthcoming in their acknowledgment that ethical conversations need to take place. Nevertheless, this feels like the usual ethical caveat that acts as a disclaimer on work which has already taken place and – one might reasonably assume – is already informing actions, policies, and future projects.
Some of the correlations this work has unveiled clearly have some value. For example, there is a 39% overlap between conversations about suicide and conversations about bullying. This is a broad trend and a helpful addition to an important narrative. Where it becomes unhelpful, however, is when it enables and/or legitimizes the external surveillance of all bullying-related conversations on social media and – to carry that thought forward – some kind of ominous, state-sanctioned “follow-up” for selected individuals…