Abstention Rate and New Information Technologies in Spain. Part III

Analysing the correlation between the abstention rate and the spread of the new information technologies in Spain with Python, and drawing conclusions.

Francisco Herrera González
5 min read · Jun 25, 2021

That’s it! We have arrived at the last part of this analysis. If you have made it this far, let me congratulate you!

In the last two articles we analysed the evolution of voter turnout in Spain and the influence of ICTs in Spanish households. Now we are going to find out whether there is a relationship between the two variables, using a statistical measure: correlation.

First of all, let us recall the data frames we are going to work with:

So, can you spot any problem with these data tables?

Indeed, we still have some formatting problems. Since elections are held every four years, the abstention table has fewer entries (one per election year) than the ICTs table. Furthermore, as ICTs are relatively new, their data only start in 1995, while the abstention series starts in 1979.

These differences between the tables lead us to a typical data analysis problem, one that we will meet repeatedly over our careers: which method of aligning the data gives the most accurate result?

Indeed, there are plenty of ways to resolve this kind of problem. For example, you can group the ICTs table into four-year blocks and take the mean of each block, or you can compute the values by hand and load them into a new data frame. If you are lucky enough to work in a team, this kind of decision is normally taken together with the rest of the project’s crew.
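The four-year grouping mentioned above could look roughly like the following. This is only a sketch of that alternative, not the method used in this article: the frame `icts`, the column `internet_last_month` and every value in it are made-up placeholders, and the blocks simply start at the first year in the table.

```python
import pandas as pd

# Hypothetical yearly ICT table; names and values are placeholders, not real data.
icts = pd.DataFrame({
    "year": list(range(1996, 2004)),
    "internet_last_month": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})

# Label every row with the first year of the 4-year block it belongs to.
first_year = icts["year"].min()
icts["period_start"] = icts["year"] - (icts["year"] - first_year) % 4

# One row per block, holding the mean of the yearly values inside it.
period_means = icts.groupby("period_start", as_index=False)["internet_last_month"].mean()
print(period_means)
```

In a real analysis the blocks would probably need to be aligned with the election years rather than with the first year of the ICTs table.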

In this case, as I want to keep the example easy, I’ve chosen the simplest way, which is to work only with the years that appear in both tables.

Let’s start!
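As a minimal sketch of that choice, assume two hypothetical frames, `abstention` and `icts`, that both carry a `year` column. The column names and all the values below are placeholders rather than the real data. An inner merge keeps only the years present in both tables:

```python
import pandas as pd

# Hypothetical abstention table: one row per election year (placeholder values).
abstention = pd.DataFrame({
    "year": [1993, 1996, 2000],
    "abstention_rate": [1.0, 2.0, 3.0],
})

# Hypothetical yearly ICT table (placeholder values).
icts = pd.DataFrame({
    "year": [1996, 1997, 1998, 1999, 2000],
    "tv": [10.0, 11.0, 12.0, 13.0, 14.0],
})

# how="inner" drops every year that is missing from either table.
merged = abstention.merge(icts, on="year", how="inner")
print(merged)
```

The price of this simplicity is that every ICT observation from a non-election year is discarded, which shrinks an already small sample.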

Good!

Now we have a single data set ready to work with. We are going to analyse whether there is a correlation between each ICT variable and the abstention rate:

What conclusions can we draw from this individual correlation analysis?
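One way to run such a variable-by-variable check is sketched below, assuming Pearson's coefficient computed with `scipy.stats.pearsonr`. The frame `data`, its column names and its values are placeholders for illustration only, and the original notebook may use a different routine.

```python
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical merged data set; every value is a placeholder, not real data.
data = pd.DataFrame({
    "abstention_rate": [1.0, 2.0, 3.0, 4.0, 5.0],
    "tv": [5.0, 3.0, 4.0, 1.0, 2.0],
    "internet_last_day": [1.0, 3.0, 2.0, 5.0, 4.0],
})

results = {}
for column in data.columns.drop("abstention_rate"):
    # pearsonr returns the correlation coefficient and a two-sided p-value.
    r, p_value = pearsonr(data[column], data["abstention_rate"])
    results[column] = (float(r), float(p_value))
    print(f"{column}: r = {r:.3f}, p-value = {p_value:.3f}")
```

The coefficient gives the direction and strength of the linear relationship, while the p-value indicates how easily a coefficient of that size could appear if the two variables were in fact uncorrelated.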

First of all, the small amount of data that we have makes the analysis less accurate. Although we get a rough approximation, the low correlation coefficients and the high p-values in each of the tests mean that we cannot rule out the absence of any correlation: the data give little support to our hypothesis (that voter turnout rises with higher exposure to ICTs).

In other words, according to our statistical results, correlations of this size could easily arise by chance in a sample this small, so they are not reliable evidence of a real relationship.

That said, and stressing again the high p-values of every correlation, we can nevertheless see a slightly negative correlation between TV consumption and the abstention rate, i.e. as one variable increases the other decreases a little. On the other hand, there is a slight positive correlation between the abstention rate and internet consumption (both for use in the last day and for use in the last month): the two variables showed a gentle tendency to increase together.

Could this be the tip of the iceberg, showing us that increased exposure to ICTs may indeed lead to a rise in political apathy?
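To see why sample size matters so much here, the sketch below computes the two-sided p-value of one and the same Pearson coefficient for different numbers of observations, using the standard t-statistic with n - 2 degrees of freedom. The coefficient 0.3 and the sample sizes are illustrative choices, not results from this analysis, and the formula assumes roughly normally distributed variables.

```python
from math import sqrt

from scipy.stats import t


def pearson_p_value(r: float, n: int) -> float:
    """Two-sided p-value for a Pearson coefficient r observed on n pairs (|r| < 1)."""
    dof = n - 2
    t_stat = r * sqrt(dof / (1.0 - r * r))
    # sf is the survival function (1 - cdf); doubling it gives the two-sided value.
    return 2.0 * t.sf(abs(t_stat), dof)


# Same illustrative coefficient, increasingly large hypothetical samples.
for n in (6, 12, 48):
    print(f"n = {n}: p-value = {pearson_p_value(0.3, n):.3f}")
```

With only a handful of election years, even a moderate coefficient stays far from the usual 0.05 threshold, whereas the same coefficient over a much larger sample would fall below it.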

It would, of course, be excessively presumptuous to draw a firm conclusion from the data analysed here. We should picture this study as a big puzzle to be solved from scratch: first we place individual pieces to form fragments of the image and then, little by little, we join the fragments into the whole picture. For this purpose, it helps to ask ourselves the questions that those pieces raise. For example: Do people trust the internet? Where do they get their news? Has political propaganda increased online? What causes lie behind the maximum and minimum values?

By analysing how the answers to those questions relate to one another, we might begin to build a consolidated image of what lies behind voter turnout and the effect of the new ICTs and, in the future, have a clear answer to our question.

And now, which way forward? First of all, I would propose analysing the rest of the country’s electoral processes. We have looked only at the general elections, which reduces our effective sample size and, by extension, undermines the accuracy and reliability of the results. For this reason, it would be worthwhile to repeat the process of this article for the regional elections in every autonomous community, analysing the evolution of ICT consumption and voter turnout in each one, so that any correlation between the two variables can be estimated a little more accurately. Would you like to try this process with the data of your own autonomous community?

This is as far as we have come with our analysis of the abstention rate and household ICT consumption in Spain. Thank you so much if you have followed the whole process! To make it easier to follow, I invite you to visit my GitHub and download this Notebook and my other projects: https://github.com/FHERREGON.

I hope you’ll enjoy the next articles too!

See you soon!
