- Posted by bsimms
- On March 28, 2017
There are plenty of people who fall victim to this every day, but I am talking about a more sinister version: mathematical models.
As the buzz continues for terms like “Big Data”, “Predictive Analytics” and “Machine Learning”, it’s important to reflect on just how pervasive these things are. After half-finishing a book called Weapons of Math Destruction by Cathy O’Neil (I have a 7-month-old at home and a library membership), I was once again struck by how easily our lives are controlled by the predictive algorithms of others.
Have you ever checked to see what’s “new” on Netflix and then, moments later, done the same on another account? The results are amazing. Almost certainly you’ll find a completely different selection of shows, tailored, of course, to what Netflix believes each account likes. Admittedly this can be handy, but recognize it is also a limitation imposed on you by someone else.
In Weapons of Math Destruction Cathy describes many instances (the stories are the best part of her book) where mathematical models start out with the best of intentions but end up ruining, or significantly damaging, lives. For the record, I don’t think the models used by Netflix are going to ruin anyone’s life! She posits that the most detrimental models scale easily and are too often immune to re-testing (models need new data to learn). Neither of those is good, but what really bothers me is how opaque the majority of models are, which means their parameters are unknown to the majority of people whose fates they shape.
With Netflix there are clues to how my “new” shows are selected. For instance, because I watched Braveheart, Netflix may offer me the chance to watch Gladiator, so I can infer that my previous viewing history at least partially determines my “new” content. How, on the other hand, do online stores decide to offer me a product after I purchase a seemingly unrelated item? “It’s proprietary” is often the answer, and it is unlikely that anyone outside of a select few will ever know how that second item was chosen.
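To make the Braveheart-to-Gladiator inference concrete, here is a toy sketch of the simplest version of that idea: recommend titles that co-occur with what you've already watched in other people's histories. This is not Netflix's actual algorithm (which, as above, is proprietary); the users and titles are invented for illustration.

```python
from collections import defaultdict

# Invented viewing histories for illustration only.
histories = {
    "alice": {"Braveheart", "Gladiator", "Troy"},
    "bob":   {"Braveheart", "Gladiator"},
    "carol": {"Braveheart", "Troy"},
}

def recommend(watched, histories):
    """Rank unwatched titles by how often they co-occur with titles you've seen."""
    counts = defaultdict(int)
    for titles in histories.values():
        if watched & titles:              # this viewer overlaps with you
            for title in titles - watched:
                counts[title] += 1
    return sorted(counts, key=counts.get, reverse=True)

print(recommend({"Braveheart"}, histories))
```

Even this crude co-occurrence count reproduces the behaviour described above: watching Braveheart surfaces Gladiator, because other Braveheart viewers also watched it.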
The examples I have given are fairly innocuous, but as Cathy discusses in her book, even the simplest of data points with the wrong bias attached can negatively impact lives. Postal codes have been particularly abused, as they are often treated as proxies for highly complex realities. It takes little imagination to extrapolate how innocent people can be disadvantaged when postal codes are placed in models used to govern police routes or guide loan approvals. “Want to go to college and need a student loan? Well, it’s too bad you live in neighborhood X, because our model has determined you are too great a risk to default.” Not exactly fair.
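The loan example can be sketched in a few lines. This is a deliberately crude toy, not any real lender's model; the weights, threshold, and postal codes are all invented. The point is only to show how a single proxy feature can override everything else about an applicant.

```python
# Toy loan-approval score where a postal code acts as a proxy for risk.
# All values here are invented for illustration.
RISKY_CODES = {"X1"}  # the model's flagged "neighborhood X"

def loan_score(income, postal_code):
    score = income / 10_000            # rough ability-to-repay signal
    if postal_code in RISKY_CODES:
        score -= 5                     # heavy penalty purely for the address
    return score

def approved(score):
    return score >= 3                  # arbitrary cutoff

# Two applicants identical in every way except where they live:
print(approved(loan_score(60_000, "X1")))   # neighborhood X applicant
print(approved(loan_score(60_000, "Y2")))   # everyone else
```

Identical incomes, opposite outcomes: the postal code alone flips the decision, which is exactly the unfairness described above.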
So, I encourage you to read her book if this sort of thing interests you. Cathy has a prolific past in data science and some intimate insights into how “Big Data” really works on Wall Street and inside large online vendors. At the very least I hope you realize that whenever something is automated for you, there is a mathematical model making choices on your behalf.
Perhaps it’s best to ask yourself, “What am I not seeing/getting/hearing about?”