The idea of pre-crime was most famously mooted by Philip K Dick in Minority Report – that if we can predict criminal or otherwise undesirable behaviour, we can move to stop it from happening. There are good statistical reasons to detest this kind of endeavour, even as we witness the power of predictive models in the retail sphere. It's developments like this that get my goat: a company called Social Intelligence, based in California, mines social media for publicly available information on potential employees, then compiles a report assessing each candidate's potential for misdemeanour in a variety of ways.
Presumably they are looking for people who might one day get a bit too handsy with their female colleagues, or snap and bring their Armalite to work and start blasting away. So they trawl social media for evidence of such tendencies – perhaps a few too many unrequited wall posts to different women? Too many drunken photos with hands creeping where they shouldn't? Marilyn Manson quotes? A "Like" for Nietzsche?
The problem, as will be apparent, is that for every truly bad egg, there are plenty of people who share one or more of the indicators without ever coming close to being a bad employee or citizen. If you are looking for behaviour that is relatively rare (workplace killers, sexual abusers and so on) among a large pool, then even a system that only raises a false alarm 1% of the time (loosely, one that is "99% accurate") will produce a lot of false positives. For example, say that one in every 10,000 people is a sex pest, i.e. 0.01%. Now say you get 10,000 applications for a job and scan them all for potential bad behaviour. The system may well find the one bad egg, but it will also mistakenly flag around 100 of the other 9,999 applicants. In other words, when a person is flagged, there is only about a 1% chance that they are actually a sex pest. Note that this does not apply if the behaviour in question is relatively common: if 19% of people were actually sex pests, your sex pest detector would flag up 1,900 genuine offenders plus around 80 false positives (1% of the remaining 8,100). In that case, the chance of an individual flagged by the system being an actual sex pest is about 96%.
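To make the arithmetic concrete, here is a minimal sketch in Python (the function name and the perfect-detection assumption are mine; the rates are the ones above) of how the chance that a flagged person is genuinely bad – what statisticians call the positive predictive value – falls out of Bayes' theorem:

```python
def chance_flag_is_correct(prevalence, false_alarm_rate, detection_rate=1.0):
    """Probability that a flagged person really is a bad egg (Bayes' theorem).

    prevalence:       fraction of the population who genuinely behave badly
    false_alarm_rate: fraction of innocent people the system wrongly flags
    detection_rate:   fraction of the guilty the system catches
                      (assumed perfect here, as in the example above)
    """
    true_positives = prevalence * detection_rate
    false_positives = (1 - prevalence) * false_alarm_rate
    return true_positives / (true_positives + false_positives)

# Rare behaviour: 1 in 10,000, with a 1% false-alarm rate
print(chance_flag_is_correct(0.0001, 0.01))  # ~0.0099, i.e. about 1%

# Common behaviour: 19 in 100, same false-alarm rate
print(chance_flag_is_correct(0.19, 0.01))    # ~0.96, i.e. about 96%
```

The same formula shows why rarity is the killer: as the prevalence shrinks, the innocent pool (and hence the false alarms) swamps the handful of genuine positives, no matter how small the false-alarm rate.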
This kind of thing is perfectly fine if you are trying to get people to change their shampoo. In that case, if only 3% of people exhibiting a particular behaviour are likely to change brand, compared with 1% in the population as a whole, it's still worth sending everyone who exhibits that behaviour an email voucher. The cost of a false positive in this case is extremely low. When it comes to hiring (or detecting terrorists), the cost of a false positive is very high: the right candidate misses out on a job; the wrong man gets sent to Guantanamo. Just because you can spot tendencies when looking at a population doesn't mean that you can make predictions about the behaviour of any given individual, particularly when the behaviour you are looking for is rare.
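By way of contrast, a quick back-of-envelope sketch (with entirely made-up costs – the penny-per-email and profit-per-convert figures below are assumptions, not data) shows why the shampoo campaign survives its false positives:

```python
# Hypothetical figures: what one voucher email costs to send, and the
# profit from one customer who actually switches brand.
EMAIL_COST = 0.01     # assumed: a penny per email
SWITCH_PROFIT = 5.00  # assumed: five pounds per converted customer

def expected_profit_per_email(conversion_rate):
    # A recipient who never switches (a "false positive") costs only the email.
    return conversion_rate * SWITCH_PROFIT - EMAIL_COST

print(expected_profit_per_email(0.03))  # targeted group:     ~0.14 per email
print(expected_profit_per_email(0.01))  # general population: ~0.04 per email
```

Either way the campaign pays, because a false positive here costs a penny; in hiring or counterterrorism, it costs a career or a liberty.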