I was at re:publica the other day and someone confidentially asked me: what exactly is an algorithm? And why is Frau Merkel herself upset and concerned? So here’s an understandable explanation.
In the old days, decisions of all sorts were made wholly by humans. If you applied for a loan, for instance, the bank would look at your proposal, but also look at who you were. Had they heard of you or your family? What is your background? And so on.
But then something new happened.
Credit was given depending on a score. The bank began to look at objective things, like how old you were, how long you had been living at your current address and other things, not subjective things, like what sort of car you were going to buy or whether you were optimistic about the future. This seems obvious to us now, but it is actually a pretty new development in the history of lending money, less than six decades old, say the academics who research such things. So instead of the subjective view of your local assistant bank manager, there is a scoring system to figure out whether you can get your loan or not.
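To make that concrete, here is a toy sketch of what such a scorecard might look like. Everything in it is made up for illustration: the point values, the cut-off of 50 and the `years_in_job` field are my assumptions, not any real bank's criteria. A real scorecard would be fitted to actual repayment history.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    age: int
    years_at_address: float
    years_in_job: float  # assumed extra criterion, for illustration only


CUTOFF = 50  # hypothetical pass mark


def credit_score(a: Applicant) -> int:
    """Add up points for a few objective facts (all point values invented)."""
    score = 20 if a.age >= 30 else 10
    score += min(int(a.years_at_address), 10) * 3  # capped at 10 years
    score += min(int(a.years_in_job), 10) * 4      # capped at 10 years
    return score


def approve(a: Applicant) -> bool:
    """The whole 'decision' is a comparison against the cut-off."""
    return credit_score(a) >= CUTOFF


settled = Applicant(age=45, years_at_address=12, years_in_job=8)
newcomer = Applicant(age=22, years_at_address=1, years_in_job=0.5)
print(credit_score(settled), approve(settled))
print(credit_score(newcomer), approve(newcomer))
```

Notice that nobody in the branch needs to form an opinion about the applicant. They only need to fill in three fields.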
Now this was 90 percent a good thing. For starters, the bank was a lot more likely to get its money back, and this is the benefit that banks and the literature understandably focus upon. The costs went down for the bank too, because the guy in the bank just had to be able to get you to fill in forms and answer questions. The bank wasn’t depending on him being a person of great insight or a canny judge of character, so less skilled people could do credit assessment. There were also social benefits. You didn’t have to be ‘liked’ by the bank staff in order to get a loan. You didn’t have to have gone to particular schools or to know particular people in your town. What mattered was whether you met the criteria. That might mean that a family that was cash poor, but worked hard and had a strong income as a result, could get the money to buy a replacement fridge-freezer, or even a house. A good thing.
So far so good. But there were problems. Over-lending was one problem, for people who didn’t manage it well. Another problem was that previously homely institutions became quite formal and suddenly had a lot of strictures. Bank staff didn’t have discretion anymore to deal with special cases. There were also deeper problems. The credit scoring techniques worked well for most cases, but every so often cases came up that didn’t fit the model. This could go both ways. You wouldn’t make loans that you should have, and every so often you’d make a loan that you really shouldn’t have.
So the models became more advanced. In the beginning it was a pretty simple score, but now there is a lot of complex analysis in these scoring systems. This presentation gives an outline of some of the techniques a young data scientist on the make might apply to improve an institution’s credit scoring. These techniques can take all sorts of variables, however irrelevant-seeming, into the credit equation. And the analysis seems to get better and better.
And the use of techniques like this (‘algorithms’, which is really just a word for a complicated-looking mathematical formula) isn’t confined to banks anymore. Banks got into it early because they had a lot of computers and a lot of data lying around, even 50 or 60 years ago. As other businesses have built up data of their own, they are getting in on the game. Now even the local grocery supermarket has scores and algorithms and all the rest of it.
So far so good. Obviously there are a lot of potential benefits here. Just as algorithms made credit more accessible, they could make healthier eating and regular exercise more accessible, now that so much data is available to gyms and food retailers. In practice it hasn’t turned out to be so socially focused. Retailers use this information to try to come up with offers and deals that suit your locality and even you personally. In itself this is not a bad thing at all.
But problems begin to arise. Where do the weightings come from anymore? If you want to factor something like people’s hobbies into a credit score decision, for instance, you will probably find that the model will give good results. If somebody likes playing polo, or lives in a wealthy part of the city, there is a good chance they will repay their loans. That does not mean it makes sense to give any weight to the fact that the person plays polo or lives in a wealthy part of the city when it comes to giving out a loan.
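Here is one way that can happen, sketched with invented numbers. I am assuming a hidden factor, high income, that makes people both more likely to play polo and more likely to repay. The counts below are made up to show the effect. They are not real lending data.

```python
# Each row: (high_income, plays_polo, number_of_people, number_who_repaid)
# All counts are invented for illustration.
groups = [
    (True, True, 40, 36),
    (True, False, 60, 54),
    (False, True, 5, 3),
    (False, False, 95, 57),
]


def repayment_rate(rows):
    """Share of people in the given rows who repaid their loan."""
    people = sum(r[2] for r in rows)
    repaid = sum(r[3] for r in rows)
    return repaid / people


# Looking at polo alone, it seems to predict repayment.
polo_rate = repayment_rate([g for g in groups if g[1]])
no_polo_rate = repayment_rate([g for g in groups if not g[1]])
print(f"polo players: {polo_rate:.0%}, everyone else: {no_polo_rate:.0%}")

# Within each income band, polo makes no difference at all.
for high_income in (True, False):
    band = [g for g in groups if g[0] == high_income]
    with_polo = repayment_rate([g for g in band if g[1]])
    without_polo = repayment_rate([g for g in band if not g[1]])
    print(f"high income={high_income}: {with_polo:.0%} vs {without_polo:.0%}")
```

In this made-up data the polo players repay more often overall, so a model would happily reward polo. But the hobby is only standing in for income, and a low earner who takes up polo is no better a credit risk than before.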
Equally, if people in an area seem to drink a lot of fizzy drinks, does this mean the store should widen the store space devoted to Coca-Cola in order to cater to that demand? Maybe by cutting back the fruit-n-veg stall? These ideas are all the more appealing because they will certainly result in a short-term increase in trade and even customer satisfaction. But are they healthy for the community? And are they even commercially sustainable? The disastrous performance of Tesco over the last few years was certainly not prevented by its commitment to data analysis at the heart of its business. When I look at their cold, empty-looking, unfriendly shops now, I can’t help thinking they acted on the results of data analysis and algorithms, but that they didn’t really understand what they were doing or what the consumer wanted. There is a possibility that the reliance on data analytics actually hastened the decline (and maybe Tesco top brass agreed, as they seemed anxious enough to sell their analytics arm, DunnHumby, last year).
The problem is that when algorithms begin to be used this way, the whole thing descends into prejudice. Suddenly people can’t get credit, insurance or access to education because of what are basically subjective criteria, which seem to be relevant, but are not. It’s easy to look at all that data and think that by analysing it, you can find some ‘truth’. You certainly can, but you need judgement. And you almost certainly need a lot more data and a lot more research before you make decisions like that.
And algorithms can be self-perpetuating. Google and Facebook use algorithms to decide which posts and search results to put at the top of the page. So not only are they judging the behaviour of you and your community to decide what to show you, they are slowly and surely influencing your behaviour by showing you particular things. This is what Frau Merkel is so upset about. She isn’t against algorithms (you couldn’t have a search engine that didn’t use some criterion to decide what to show first). But she wants to see some transparency. She thinks we have a right to know why we see what we see on Facebook or Google, and why not something else.
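The self-perpetuating part can be shown with a toy loop. This is emphatically not how Google or Facebook actually rank anything. It is a deliberately crude sketch in which whatever has the most clicks goes on top, and whatever is on top gets most of the new clicks. The post names and click numbers are invented.

```python
def simulate(clicks, rounds, top_clicks=10, lower_clicks=2):
    """Rank posts by clicks so far, then hand out new clicks by position."""
    clicks = dict(clicks)  # work on a copy
    for _ in range(rounds):
        ranking = sorted(clicks, key=clicks.get, reverse=True)
        for position, post in enumerate(ranking):
            # The top slot attracts far more attention than anything below it.
            clicks[post] += top_clicks if position == 0 else lower_clicks
    return clicks


# Two posts that start almost level: one extra click decides the first ranking.
start = {"post_a": 11, "post_b": 10}
end = simulate(start, rounds=20)
print(start, "->", end)
```

A one-click head start turns into a gap of more than a hundred and fifty clicks, and nothing in the click counts tells you whether post_a was ever the better post. That is the kind of thing you can only question if you know what the ranking rule is.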
So the arrival of the algorithm has brought many good things, but not only good things. Algorithms are not (yet) a substitute for thinking.