PageOneX, ready steady go!

View this data visualization full size at Gigapan, and read the related post about the May 2012 social mobilizations in Spain.

Today’s post presents the tool we are building this summer: PageOneX. The idea behind it is to move the process of coding newspaper front pages online and make it easier, so that this kind of visualization is available to researchers, advocacy groups and anyone interested. I’ll give some background about this process.

How things started
Approximately one year ago I started diving into the world of front pages. It was days after the occupations of squares in many cities in Spain, and I was living in Boston. I made a front page visualization to show what people were talking about: the media blackout of the indignados #15M movement. You can read more about the story in the Civic Media blog. Since then I’ve been making more visualizations of print newspaper front pages, testing different methods and possible ways to use them. I’ve also made a tool, built in Processing, to scrape front pages and build a .svg matrix.


PageOneX: first tool
Before starting with the new tool, I’d like to present the current one, which I’ve been using these past months: a semi-automated process with Processing and Inkscape. The code is available at

The process is split into two main steps: 1. Get the front pages and 2. Code articles.

  1. Get the front pages
    Open the pageonex.pde file in Processing:

    1. Select the newspapers and the start and end dates (remember to adjust the size of the newspaper).
    2. Run the program to scrape the front pages. It downloads them automatically.
    3. It will construct the matrix of front pages in a .svg (Scalable Vector Graphics) file.
  2. Code articles
    1. Open the matrix.svg in Inkscape. It is a file with 3 layers:
      1. Highlights (multiply option to show transparency)
      2. Filter to make images look lighter.
      3. Images of front pages.
    2. Now highlight the news we want by hand, drawing rectangles over it (merge several rectangles for news with non-rectangular shapes).
    3. Export the file to a pretty .png file.
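
To make the structure of that matrix file more concrete, here is a minimal Python sketch of a three-layer .svg like the one described above. It is not the Processing code of the tool: the function names (`layer`, `build_matrix`), the thumbnail size, the gap, the layer labels and the opacity value are all assumptions chosen for illustration. It also assumes the multiply option can be expressed with the CSS `mix-blend-mode` property, which recent Inkscape versions understand.

```python
import xml.etree.ElementTree as ET

SVG = "http://www.w3.org/2000/svg"
INK = "http://www.inkscape.org/namespaces/inkscape"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("", SVG)
ET.register_namespace("inkscape", INK)
ET.register_namespace("xlink", XLINK)


def layer(parent, label):
    """Add a group that Inkscape shows as a named layer."""
    return ET.SubElement(parent, f"{{{SVG}}}g", {
        f"{{{INK}}}groupmode": "layer",
        f"{{{INK}}}label": label,
    })


def build_matrix(image_paths, thumb_w=200, thumb_h=300, gap=10):
    """Build the SVG text for a grid of front pages.

    image_paths: one list per newspaper (row), one image path per day (column).
    Sizes are hypothetical defaults, not the values used by the real tool.
    """
    n_rows = len(image_paths)
    n_cols = max((len(row) for row in image_paths), default=0)
    width = n_cols * (thumb_w + gap)
    height = n_rows * (thumb_h + gap)
    svg = ET.Element(f"{{{SVG}}}svg", {"width": str(width), "height": str(height)})

    # SVG draws later elements on top, so the bottom layer is created first.
    images = layer(svg, "Images of front pages")
    veil = layer(svg, "Filter")
    highlights = layer(svg, "Highlights")
    # Assumption: multiply blending set through a CSS property.
    highlights.set("style", "mix-blend-mode:multiply")

    for r, row in enumerate(image_paths):
        for c, path in enumerate(row):
            ET.SubElement(images, f"{{{SVG}}}image", {
                f"{{{XLINK}}}href": path,
                "x": str(c * (thumb_w + gap)),
                "y": str(r * (thumb_h + gap)),
                "width": str(thumb_w),
                "height": str(thumb_h),
            })

    # A semi-transparent white rectangle makes the images look lighter.
    ET.SubElement(veil, f"{{{SVG}}}rect", {
        "x": "0", "y": "0", "width": str(width), "height": str(height),
        "fill": "white", "fill-opacity": "0.4",
    })
    return ET.tostring(svg, encoding="unicode")
```

Calling `build_matrix([["a.jpg", "b.jpg"], ["c.jpg"]])` returns the text of an .svg with two rows of front pages and an empty Highlights layer on top, ready for the rectangles to be drawn by hand in Inkscape.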

That was the process until now. It works, but it is not easy for people who are not tech savvy: you need to have Processing installed, change the parameters and be able to ‘play’ with Inkscape at an intermediate level. That is why we decided to build this tool online, to broaden its use. PageOneX (the site is now temporarily redirected to this blog) is an online platform for analyzing and visualizing news coverage on newspapers’ front pages. We’ll be coding this summer to have at least part of the tool available by August or September. We’ll be running some beta tests, so if you are interested, just ask! The idea is to open a co-design process in which future users take part in the design of the tool.

Why analyze front pages? Some ideas behind the project:



Analyzing front pages seems to be a good method, a shortcut to follow how news is being covered in the media. Front pages are a very special piece of the media ecology: newsrooms spend a lot of time deciding what goes on their A1, fighting over which stories have to be on page one.


With this kind of visualization we are able to show the data and the analysis at the same time. We can show quantitative data about the coverage in a bar chart, but also the data source itself: the front pages. We want to offer a visual and direct way to see all the coding. Check the Global Media Monitoring Project Report methodology to see other interesting approaches to this coding process. We have tested Gigapan as a way to explore these huge graphics, so that you can read the newspapers and also get a sense of the whole data visualization.



We’ll be using the percentage of each front page, instead of the actual size: this will allow us to compare newspapers of different sizes, as we did with this example of US and Spanish newspapers covering the Japanese tsunami in 2011.
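
As a rough illustration of that percentage, here is a small Python sketch. It assumes each highlight is stored as an `(x, y, width, height)` rectangle in the same units as the page, and that overlapping rectangles (for example, several merged to cover a non-rectangular story) should be counted only once. The names `union_area` and `coverage_percent` are hypothetical, not part of the tool.

```python
from typing import Iterable, Tuple

Rect = Tuple[float, float, float, float]  # x, y, width, height


def union_area(rects: Iterable[Rect]) -> float:
    """Area covered by at least one rectangle, counting overlaps once."""
    rects = list(rects)
    # Cut the plane along every rectangle edge; each resulting cell is
    # either fully inside some rectangle or fully outside all of them.
    xs = sorted({v for (x, _, w, _) in rects for v in (x, x + w)})
    ys = sorted({v for (_, y, _, h) in rects for v in (y, y + h)})
    area = 0.0
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
            if any(x <= cx <= x + w and y <= cy <= y + h
                   for (x, y, w, h) in rects):
                area += (x1 - x0) * (y1 - y0)
    return area


def coverage_percent(page_w: float, page_h: float,
                     rects: Iterable[Rect]) -> float:
    """Share of the front page covered by highlights, from 0 to 100."""
    return 100.0 * union_area(rects) / (page_w * page_h)


# The same relative coverage on two page sizes gives the same percentage.
print(coverage_percent(100, 200, [(0, 0, 50, 100)]))   # 25.0
print(coverage_percent(200, 400, [(0, 0, 100, 200)]))  # 25.0
```

Working with the share of the page rather than absolute area is what makes a broadsheet and a tabloid comparable: a story that fills a quarter of either page gets the same value.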

Here are the slides about the project and a basic draft of the user interface.

Arab Spring, Spanish Revolution, and Occupy Movement: Mainstream Media vs Social Media coverage

The project is incorporating new collaborators: Ahmd Refat as coder, thanks to the Google Summer of Code programme, and Sasha Costanza-Chock and Nathan Mathias from the Center for Civic Media at MIT Media Lab. Nathan is also developing Mediameter with more people at the Center for Civic Media; we might use their framework to build our tool instead of starting from scratch. Mediameter is used for crowdsourced analysis of articles. We are also looking at Mapmill by Jeff Warren, built in Ruby on Rails, as it is also a system for coding images.

Stay tuned!


