Big data: Software replaces statisticians in analysis work

The growing importance of big data in business and research is increasing the demand for statisticians and mathematicians on the job market. As early as July of last year, the British Royal Statistical Society warned of an imminent shortage of skilled workers able to recognise the economic value of large quantities of data. New research approaches, as described in a recent technology article, therefore address the question of whether computer programs can do the job of a statistician.

The goal is to develop software that generates readable reports from raw data, i.e. software that describes the trends concealed within the data in words and diagrams. Zoubin Ghahramani, Professor of Information Engineering at the University of Cambridge, recently introduced just such a system, which is already delivering interesting results. For instance, it distilled an automated report from a century’s worth of air traffic data that not only provides mathematical explanations for the trends it identified, but also allows forecasts to be made.

Nevertheless, a human statistician will probably always be needed to carry out the final evaluation. Although the software recognised that air traffic rose regularly during the summer months, it was unable to offer any real explanation for this (holiday travel). Yet there is no doubt that the computerised statistician is a great help in working with the data.

Prof. Ghahramani wants to improve his system further and is considering a commercial version. His software would then be in direct competition with the American start-up Skytree, which a few weeks ago introduced a product that the company claims can automatically select the model best suited to interpreting a given collection of data. The American company Narrative Science is also active in this field and is working on a product that translates numerical data into natural language.
