14. Open code and data

This chapter is structured in three sections. The first, Code repositories, lists the code repositories used in this research; they contain all the code developed to produce the data visualizations. The second, Databases, expands on the information about the databases. The last section contains the list and transcripts of the interviews with newsrooms of Spanish news media.

14.1 Code Repositories: data gathering, processing and visualizing

These code repositories host all the scripts used for the data gathering, data analysis and data visualization in this research. I hope they are documented well enough that other researchers can reuse the code¹.

Newspaper front pages

PageOneX is software that helps analyze newspaper front pages and measure the surface area dedicated to pre-selected topics.
Code: https://github.com/montera34/pageonex/
Online instance: http://pageonex.com
Color Corrupción: how to download the monthly threads at pageonex.com (2009-2019) that measure corruption coverage on the front pages of Spanish newspapers.
Code: https://github.com/numeroteca/colorcorrupcion
R scripts to analyze PageOneX data.
Code: https://code.montera34.com/numeroteca/pageonexR
Agenda y voto CIS study (Q-ÍNDICE, 2006) analysis: R scripts to read data from existing study and calculate percentage of coverage by party, year and framing. https://code.montera34.com/numeroteca/agendavotocis
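The surface-area measure that PageOneX is described as producing can be illustrated with a minimal sketch. This is not code from the PageOneX repository: the function name `topic_share`, the rectangle format and the sample numbers are all hypothetical, and the sketch assumes the highlighted rectangles on a front page do not overlap.

```python
from collections import defaultdict


def topic_share(page_width, page_height, boxes):
    """Return the percentage of the page area covered by each topic.

    boxes is an iterable of (topic, width, height) tuples describing
    highlighted rectangles. Hypothetical format; assumes no overlaps.
    """
    page_area = page_width * page_height
    if page_area <= 0:
        raise ValueError("page dimensions must be positive")

    # Sum the area of every rectangle, grouped by topic.
    totals = defaultdict(float)
    for topic, width, height in boxes:
        totals[topic] += width * height

    # Express each topic's total area as a percentage of the page.
    return {topic: 100.0 * area / page_area for topic, area in totals.items()}


# Made-up front page of 1000 x 1500 units with three highlighted areas.
shares = topic_share(
    1000,
    1500,
    [("corruption", 500, 300), ("corruption", 250, 600), ("other", 1000, 150)],
)
print(shares)
```

Averaging such percentages over the front pages of a month would give one possible monthly coverage figure per newspaper, although the repositories above may aggregate differently.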

News sites home pages
Homepagex is software that analyzes news site home pages. It has two parts: 1) it downloads home pages by web scraping with Python, and 2) it parses the HTML code with R to track headlines. https://code.montera34.com/numeroteca/homepagex
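The headline-tracking step can be sketched as follows. Homepagex does this parsing in R; the example below swaps in Python's standard `html.parser` module purely for illustration and is not taken from the repository. It assumes, hypothetically, that headlines sit inside `h1`, `h2` or `h3` tags, and the sample HTML is made up; real news sites usually need site-specific selectors.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collect the text found inside heading tags (assumed to be headlines)."""

    HEADLINE_TAGS = {"h1", "h2", "h3"}  # assumption, not Homepagex's rule

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._depth = 0    # how many heading tags are currently open
        self._chunks = []  # text pieces of the heading being read

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADLINE_TAGS:
            self._depth += 1
            if self._depth == 1:
                self._chunks = []

    def handle_endtag(self, tag):
        if tag in self.HEADLINE_TAGS and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace into single spaces.
                text = " ".join("".join(self._chunks).split())
                if text:
                    self.headlines.append(text)

    def handle_data(self, data):
        if self._depth > 0:
            self._chunks.append(data)


def extract_headlines(html):
    """Return the list of headline strings found in an HTML document."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.headlines


# Made-up snapshot of a home page, standing in for a downloaded file.
sample = """
<html><body>
  <h1><a href="/a">Main   story</a></h1>
  <p>Teaser text</p>
  <h2><a href="/b">Second story</a></h2>
</body></html>
"""
headlines = extract_headlines(sample)
print(headlines)
```

Running the same extraction over snapshots downloaded at different times would show when a headline appears on or disappears from a home page.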

TV newscasts

VerbaR is software that analyzes TV newscast subtitles from the Verba database (Fundación Ciudadana Civio, 2022). https://code.montera34.com/numeroteca/verbar
Online instance of the “When they speak about you in the Telediario” software: https://r.montera34.com/users/numeroteca/verbar/app

Twitter
Set of scripts written in R that help analyze tweets gathered with t-hoarder and twarc: https://code.montera34.com/numeroteca/tuits-analysis

Multiple news and social media channels
Scripts to compare news and social media channels for the in-depth study of the Cifuentes’ Master scandal: https://code.montera34.com/numeroteca/ekosystemedia

Public opinion
This set of scripts processes and analyzes CIS barometer microdata. It is also used to analyze and compare the evolution of public opinion with news coverage data from the Spanish Policy Agendas (Chaqués-Bonafont et al., 2014, 2015) and Color Corrupción (Rey-Mazón, 2022a) databases. It also processes front page news coverage data from the Spanish Policy Agendas project. It is developed in R: https://code.montera34.com/numeroteca/barometro_cis

14.2 Databases

This section expands on the databases of news media, Twitter and public opinion data used in the present research, which were already presented in chapter 7.

As a backup, and to simplify access to the data, all the datasets have been compiled in this repository, https://osf.io/gpm8x, at the Open Science Framework (OSF), an online platform hosted by the non-profit Center for Open Science (Rey-Mazón, 2022c).

(…)

¹ Write to pablo@montera34.com if you have questions.
