After a major feature release, one of the engineers on the team rushed to tell me that the search conversion rate had increased by 300% according to the team’s data analysis. I challenged the number quite a few times, but he and the team had double- and triple-checked it and insisted that they were counting 3 times more clicks than before the feature release. I pushed back: if the conversion rate has increased by that much, where is all the additional revenue?

Our main revenue source at the time was clicks to merchants, so a 300% increase in conversions should have produced a similar increase in revenue, yet the daily revenue hadn’t changed at all. Well, it turned out that the data were flawed; the actual conversion rate increase was marginal.
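That cross-check can be sketched in a few lines of Python. This is only a rough illustration, assuming revenue per click stays about constant; the function name, the tolerance and all the figures are made up for the example and are not the actual data from that release.

```python
def revenue_matches_clicks(clicks_before, clicks_after,
                           revenue_before, revenue_after,
                           tolerance=0.25):
    """Return True if revenue grew roughly in line with clicks.

    Assumes revenue per click is about constant, so the two growth
    ratios should be close. `tolerance` is an arbitrary, illustrative
    limit on the relative gap between them.
    """
    click_growth = clicks_after / clicks_before
    revenue_growth = revenue_after / revenue_before
    relative_gap = abs(click_growth - revenue_growth) / click_growth
    return relative_gap <= tolerance


# Hypothetical figures: a reported 300% increase in clicks, flat revenue.
print(revenue_matches_clicks(1_000, 4_000, 5_000.0, 5_000.0))  # False

# Hypothetical figures: clicks and revenue both up by a few percent.
print(revenue_matches_clicks(1_000, 1_050, 5_000.0, 5_200.0))  # True
```

A False here does not say which number is wrong, only that the two cannot both be right.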

We all have to deal with this type of number validation in our daily and professional lives. Being able to perform a basic validation of any number is crucial not only for making quick decisions but for understanding the world as well. Take fake news, for example: it almost always contains some numbers. Those numbers are usually exaggerated in order to draw attention and make the story believable, but that exaggeration is also the reason such stories can be easily debunked by a quick number validation.

Numbers are connected, not only in the realm of math but in the physical world as well.

What if I told you that North American periodical cicadas remain underground for 13 or 17 years before they emerge? The number is hard to grasp: it seems absurd that a species would choose such a long life cycle, and even harder to understand why 13 or 17 years rather than 9 or 21. You would have every reason to believe that I just made that number up, but before you go check Wikipedia, let me assure you that it is a fact and those cicadas actually spend that much time underground.

There are many explanations as to why, the most widely cited one being that 13 and 17 are prime numbers, which ensures that when the cicadas emerge their natural predators will not be that “hungry”. If a bird, for example, has a 3-year cycle, it will take 51 years before its cycle and that of a 17-year cicada coincide again, so that as many cicadas as possible survive.
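The 51 is simply the least common multiple of 3 and 17, which is easy to verify. Here is a small Python sketch; the predator cycles of 2, 3 and 4 years and the non-prime cicada cycles of 12 and 18 years are hypothetical values chosen only for comparison.

```python
from math import gcd


def years_until_coincide(cicada_cycle, predator_cycle):
    """Least common multiple of the two cycle lengths, in years."""
    return cicada_cycle * predator_cycle // gcd(cicada_cycle, predator_cycle)


# The example from the text: a 17-year cicada and a 3-year bird.
print(years_until_coincide(17, 3))  # 51

# Compare prime cycles (13, 17) with nearby non-prime ones (12, 18)
# against hypothetical predator cycles of 2, 3 and 4 years.
for cicada_cycle in (12, 13, 17, 18):
    overlaps = {p: years_until_coincide(cicada_cycle, p) for p in (2, 3, 4)}
    print(cicada_cycle, overlaps)
```

With a 12-year cycle the cicadas would meet each of those predators at every single emergence, while with 13 or 17 years the cycles line up only once every 26 to 68 years.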

Numbers are connected with little strings that you can push and pull any time you need to validate a number. If a number is off, you may not be able to validate it on its own, but you can follow its dependencies and effects and validate those.

As another example, what if I asserted that a factory produces X million items of product Y daily? One quick validation is to look at the materials required for that product: are there enough in the world? The same goes for those conspiracy-theory government spaceships that require amounts of energy the planet doesn’t have (or, in some cases, even the solar system).

I have many examples where I, or a team I worked with, made decisions based on a number that was obviously wrong yet derived from real observations. With huge amounts of data and complex attribution schemes, there is always the possibility that a critical number is calculated in a way that is disconnected from reality, yet everyone takes it for granted. Circulating it so that other people can validate it independently is a good way to reduce errors. Do you have any examples to share?
