Sprockler in combination with statistics
Sprockler’s statistician René van der Heijden shares his view on combining Sprockler with statistical research.
Sprockler is all about collecting stories. Respondents value and classify their own stories, and our Visualizer then presents the stories together with these characteristics. With the right presentation, patterns become visible that can provide valuable insights. But all of this is done without statistical proof, and some of our users struggle with that. Occasionally they have doubts themselves; more often they find it hard to convince their clients that Sprockler is a suitable method despite this. Let me first state that all information in Sprockler can be downloaded in a format suitable for statistical analysis in, for example, SPSS. I have occasionally done this myself to help a client calculate the significance of a series of associations. What had become clear visually was then further supported with ‘hard statistics’.
In this piece, I want to draw your attention to these ‘hard statistics’ and put them in perspective. As a statistician, I will of course not claim that statistics are nonsense. There are many applications where statistics are extremely useful, but their importance is often exaggerated.
Embellished truth
My impression is that this is mainly due to an overly dogmatic view. If a phenomenon passes the 95% test, it is treated as ‘true’, and if not, as ‘not true’. A moment’s thought shows that such a black-and-white reading paints a distorted picture.
Additionally, people often lie when responding to surveys (‘lie’ might be too strong, but people answer in line with how they would like to see themselves). This explains the high percentages of embellished answers, even when people are convinced that the survey is completely anonymous. The effect persists even when questions are worded to make it easy to tell the truth. This is common knowledge, but it is often not taken into account when results are interpreted. The numbers are presented as the best we have to build conclusions on. With the question “Do you always stick to the speed limit or do you sometimes drive faster than the allowed limit of 10 km/h in the parking lot?” I have seen 40% of respondents indicate that they do not always stick to the speed limit, whereas speed measuring devices show that 75% of people consistently drive too fast! Seen in this light, what does it mean when the research is repeated a year later and shows that significantly fewer people break the speed limit?
The influence of researchers is also often grossly underestimated. Consciously or not, people can steer the sample, the survey, the order and wording of the questions, the moment and setting of recording, the way outliers are treated and countless other aspects. One aspect that one cannot influence is the group of people who refuse to answer the questions, which is rarely a random group and which can cause considerable bias. When processing data, one has to choose which relationships are examined, which target audiences are included, what the level of significance is, and, last but not least, which researchers interpret the report. It is not unusual for that interpretation to deviate from the actual statistical results. Researchers seem to hold certain convictions in advance, which they all too often see confirmed in the research. Dr Sanne Blauw (econometrician and journalist) says about this: “Somebody else, with the same research questions but with different beliefs or perspective, probably arrived at different results. Numbers are supposed to be objective, but I suddenly saw how much they are connected with the researcher.”
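To make the speed-limit question concrete, here is a minimal sketch in Python of a standard two-proportion z-test. The 40% comes from the example above; the sample sizes of 1,000 and the 35% in the repeat survey are assumptions invented purely for illustration, as is the function name.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two proportions.

    x1, x2: number of respondents who admit speeding in each survey
    n1, n2: number of respondents in each survey
    """
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion under the null hypothesis "no change between years"
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical numbers: 40% of 1,000 admit speeding in year one,
# 35% of 1,000 in year two.
z, p_value = two_proportion_z_test(400, 1000, 350, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

With these assumed numbers the drop counts as ‘significant’ at the 95% level, yet both self-reported figures sit far below the 75% that the measuring devices recorded. The test cannot tell whether people actually drove slower or simply answered differently.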
A convincing story
You can neatly calculate significance from the collected data, but that alone should not convince anyone. In the end, it is not about the statistics but about the convincing story. Statistics can play a role in that story, but on their own they are never sufficient. We also have to be convinced of the quality of the analysed data, the objectivity of the researchers, the methods used, and the extent to which alternative explanations have been investigated. We must also ask what the significance actually means. Significance is not about ‘the truth’; it only says something about chance, and then only chance under the special conditions of the null hypothesis.
What do X-rays, wireless telegraphy, the CCD image sensor, the existence of blood groups, electrolysis, penicillin and the MRI scanner have in common? They are all discoveries and inventions that were awarded a Nobel Prize! How many statistics do we need to see that penicillin works? That you can use an MRI scanner to make an image of the inside of a human? That blood types exist? Nobel Prize worthy discoveries are so obvious that we do not need statistics. The same goes for other connections. With clear connections, we do not need statistics. With unclear connections, statistics can more or less demonstrate the relation, but the value of that connection remains unclear.
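What a significance level does and does not say can be illustrated with a small calculation. The sketch below is a plain exact binomial test in Python; the scenario (100 respondents choosing between two options, with a null hypothesis of a 50/50 split) and the function name are hypothetical and chosen only for illustration.

```python
from math import comb

def binomial_p_value(successes, trials, p_null=0.5):
    """Two-sided exact binomial p-value.

    Adds up the probabilities, computed under the null hypothesis, of every
    outcome that is at most as likely as the observed one.
    """
    def pmf(k):
        return comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # The small tolerance guards against floating-point ties
    return sum(
        pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9)
    )

# Hypothetical survey: 60 of 100 respondents choose option A.
p_value = binomial_p_value(60, 100)
print(f"p = {p_value:.3f}")
```

The result, about 0.057, only says how often pure chance would produce a split at least this lopsided if the true split were 50/50. It is not the probability that a preference for option A is real. With 61 instead of 60 respondents the p-value drops below 0.05, so a single answer separates ‘not significant’ from ‘significant’.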
Data and stories
It is often difficult to make the right statistical calculations for a complex reality. The result of the wrong (simplified) calculations often seems convincing, but of course it should not. When it comes to causality, things are even more complicated. Causal connections are very often suggested as soon as a clear association is found. This happens mainly when people look for certain connections with tunnel vision. The facts are important, but in the end it is about their meaning. This meaning does not simply follow from statistics. Stories, however, do contain information about cause and effect. Stories are often very convincing, especially when they are not isolated. That is what Sprockler is about: collecting stories and showing that they are not one-off events.
At the same time, anyone can download the data and, as with other survey tools, conduct a statistical analysis with external software. That being said, at Sprockler we do have the ambition to add statistics as a feature of our product in the long term.
René van der Heijden
Statistician