ANALYSIS OF TWEETS OF SPANISH POLITICIANS AND STUDY THROUGH TEXT MINING TECHNIQUES
DOI: 10.22533/at.ed.3172222203104
Keywords: Text mining, discourse analysis, statistical techniques, politics.
Abstract:
The objective is to extract information from the social network Twitter, specifically from four accounts belonging to political figures in Spain, and to study it with different statistical and text mining techniques. The users studied are Pedro Sánchez, current president of the government and of the Spanish Socialist Workers' Party (PSOE); Pablo Iglesias, former vice president of the government and of United We Can; Inés Arrimadas, president of Ciudadanos; and Pablo Casado, president of the Popular Party (PP). Starting from an average of 1,900 tweets per user and a total of more than 7,600 extracted tweets, the methodology applies hierarchical and non-hierarchical clustering, association rules, classification of tweets by authorship using support vector machines and, finally, sentiment analysis based on a word dictionary. For the classification, a mathematical model is trained and applied to two pairs of politicians, grouped by their a priori similarity: Sánchez with Iglesias, and Arrimadas with Casado. The results show the heterogeneity of the data collected. Classification by authorship fails in 8.72% of cases for the president and former vice president of the government, and in 10.44% of cases for the other two politicians. Sentiment analysis shows that the users' tweets stand out for a high degree of "trust". The classification results also show that, despite having formed a government coalition in the past, Sánchez and Iglesias publish tweets that differ more from each other than those of Arrimadas and Casado; the latter pair had no particular reason to publish more similar content, yet the model failed more often when classifying their tweets.
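The dictionary-based sentiment step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon `TOY_LEXICON`, the helper names, and the two sample sentences are all invented for illustration, and the study's actual dictionary and preprocessing are not specified here. The idea is only to show how each token is looked up in a word-to-emotion dictionary and how the share of each emotion, such as "trust", is obtained.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (word -> emotion categories), invented for this
# sketch. It is NOT the dictionary used in the study.
TOY_LEXICON = {
    "acuerdo": {"trust"},
    "compromiso": {"trust"},
    "garantizar": {"trust"},
    "gracias": {"joy", "trust"},
    "crisis": {"fear"},
    "amenaza": {"fear"},
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens (accents kept)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def emotion_counts(tweets, lexicon=TOY_LEXICON):
    """Count how many lexicon hits each emotion receives across all tweets."""
    counts = Counter()
    for tweet in tweets:
        for token in tokenize(tweet):
            for emotion in lexicon.get(token, ()):
                counts[emotion] += 1
    return counts


def emotion_shares(tweets, lexicon=TOY_LEXICON):
    """Return each emotion's share of all lexicon hits (empty if no hits)."""
    counts = emotion_counts(tweets, lexicon)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {emotion: n / total for emotion, n in counts.items()}


# Made-up example sentences, not real tweets from the dataset.
sample_tweets = [
    "Gracias por el compromiso y el acuerdo alcanzado.",
    "Vamos a garantizar la estabilidad ante la crisis.",
]

shares = emotion_shares(sample_tweets)
dominant = max(shares, key=shares.get)
print(dominant, round(shares[dominant], 2))
```

With a real emotion lexicon in place of the toy one, the same counting scheme would give a per-user emotion profile, which is the kind of summary behind a statement such as tweets showing a high degree of "trust".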
- Úrsula Torres Parejo
- Raquel Enrique Guillén
- Dolores Ruiz Jiménez