Data byline: “How does the London Marathon compare to other races worldwide” – The Guardian


I wrote some of this and also put together the basis for the Tableau visualisation at the bottom.

What I found quite interesting was how the larger marathons seemed to spread by continent. They were being founded in the US sporadically for the first half of the twentieth century and then suddenly there was a rush in the late 70s and early 80s in Europe. A similar thing happened later on for Asia.

The sample size is not that big, because there are not that many marathons with more than 10,000 runners, but it is still quite a nice thing to watch on the map. I could get the Tableau animation to play in worksheet view but not in dashboard view. Has anybody else had this problem?

 


Using Many Eyes to create data visualisations

Many Eyes is one of the better free online tools for making quick data visualisations – here is our guide to putting one together.

The Interhacktives were introduced to it in our latest class with James Ball. It’s a pretty foolproof process so I thought I would take you through it:

Firstly, you need to look at your data and decide what can be visualised and why you are visualising it. Just like any good journalism you need to tell a story with your infographic – so do not just throw in any old chart to dress up an article.

The dataset I used was the ONS’s recently released statistics on the Live Births in England and Wales by Characteristics of Mother (2011). I decided to pluck out the age of parents giving birth during that year.

Open up Many Eyes – create a profile – and click “Create a visualisation”. You can use datasets already uploaded to the site or upload your own, which is what we are doing.

Make sure that your data is formatted cleanly, as you are going to have to copy and paste it into this rectangle. With this data in particular I combined the births within marriage and the births outside marriage and then copied and pasted the result onto a new sheet – you can get my cleaned data here.

I removed the total amounts (unnecessary for my visualisation) and made the row and column titles clearer – instead of having one row saying “Father” with the age brackets below, I put the word “Father aged” before each age category. I did the same for the age of the mother – the row and column headers will be used within the Many Eyes visualisation so you want to make sure they are as clear as possible.
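For anyone who would rather script the clean-up than do it by hand in a spreadsheet, here is a minimal sketch of the same three steps: add the two tables together, drop the totals and relabel the headers. I did this in Excel, so everything below is an assumption for illustration: the counts are made up and are not the ONS figures, the "All ages" totals label is a guess at how totals are marked, and `combine_and_clean` and `to_tsv` are hypothetical helper names.

```python
# Made-up counts, NOT the ONS figures.
# Assumed layout: one row per father's age bracket, one column per mother's age bracket.
MOTHER_AGES = ["Under 20", "20-24", "All ages"]
TOTAL_LABEL = "All ages"  # assumed label for the totals row and column

within_marriage = {
    "Under 20": [1, 2, 3],
    "20-24": [4, 5, 9],
    "All ages": [5, 7, 12],
}
outside_marriage = {
    "Under 20": [10, 20, 30],
    "20-24": [40, 50, 90],
    "All ages": [50, 70, 120],
}


def combine_and_clean(within, outside, mother_ages, total_label=TOTAL_LABEL):
    """Add the two tables cell by cell, drop totals, and make the headers explicit."""
    # Positions of the columns we keep (everything except the totals column).
    keep = [i for i, age in enumerate(mother_ages) if age != total_label]
    rows = [[""] + [f"Mother aged {mother_ages[i]}" for i in keep]]
    for father_age, counts in within.items():
        if father_age == total_label:
            continue  # skip the totals row
        summed = [counts[i] + outside[father_age][i] for i in keep]
        rows.append([f"Father aged {father_age}"] + [str(n) for n in summed])
    return rows


def to_tsv(rows):
    """Tab-separated text, which pastes into a box the same way spreadsheet cells do."""
    return "\n".join("\t".join(row) for row in rows)


table = combine_and_clean(within_marriage, outside_marriage, MOTHER_AGES)
print(to_tsv(table))
```

The printed text can then be copied straight into the paste box in the same way as cells copied from a spreadsheet.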

Paste the data

Many Eyes will then show you how it has processed the data – make sure that it has read the text and the numbers correctly. Also make sure that your data appears exactly as it does in your original spreadsheet.

Check that we understood
You can then upload your data to the site. One disclaimer: ANY DATA UPLOADED TO MANY EYES WILL BE PUBLICLY AVAILABLE. If you do not want people to know what you are doing, either give your data an obscure title or do not use the site.

Go to the next step and it will show you your fully formatted dataset. This is the fun part: click “Visualize”.

You will then be offered a number of visualisations from which to choose. Think about what you want your data to show and the best way in which to show it. James pointed out that when you have some large numbers and some small numbers bubble charts often work best for giving a true comparison – so that is the visualisation I went for.

You have the option to “Flip” if the visualisation is not showing what you want it to show. But if you are using the same data as me it should look like this:

Age of new parents in England and Wales in 2011 (Many Eyes)

What I learned: Although most mothers have children with fathers in the same age bracket, the chart shows that the father is much more likely to be the older parent than the mother is. This is much more pronounced when you pluck out the live births to married parents, which I also created a graphic for. There are also some curious outliers: two children were born to a mother under 20 and a father aged between 60 and 64.

It is a pretty simple process and we all got the hang of it relatively quickly so I would encourage you to make your own. Let us know if you do and please share them with us.

Take a look at the data here.

Originally posted on Interhacktives.com

Winter break – using data in the newsroom

Over the past few weeks I have had my first experience of using data journalism in a national newsroom during placements at the Independent and the Telegraph online.

Telegraph

At the Telegraph I was working on the Interactive team, who are commissioned regularly to make graphic-led news stories. What made this placement so interesting was the chance to work with the team of designers who were more than willing to tell me about what makes a good graphic.

It also gave me a chance to compile a series of datasets to make stories. The story that I am proudest of was definitely this:


Interactive Graphic - telegraph

Using season ticket data for the past two seasons (compiled by the Telegraph last season and the BBC this season) we worked out how much each goal cost the fan with the cheapest season ticket last season.

Fans at Liverpool were paying in excess of £30 for each goal scored while fans of Manchester City could pay less than £5 for a goal. Clicking on each club will also tell you how much the price of the season ticket has risen or fallen. As City’s cheapest ticket has not increased by much at all, it seems like a pretty sound investment for an exciting game.

The Independent

I was not working on the news desk at The Independent but sitting with the online team. I did have a play around with the census data in my own time when it came in and found which towns now had more part-time female workers than full-time female workers. Most of these places seemed to be holiday destinations (such as the Derbyshire Dales and bits of Cornwall/Devon).

However, the statistical change from the last census was so minimal that it did not seem significant enough to run a story on. This underlined to me the importance of approaching data with a hypothesis and not just aimlessly fishing around – fun though it is.

The highlight of my placement was undoubtedly this:

Independent front page!

I am still buzzing about being on the front page of a national but am not going to go into too much depth on what I did for this story. My only comments are that a decent knowledge of Excel is so useful when working with surveys and that good admin sometimes equals good journalism.

Data byline: “Interactive graphic: the real value of every Premier League club compared” – Telegraph online


 

Using Premier League clubs that were in the league over the two seasons, we divided the cost of the cheapest season ticket by the number of goals. Manchester City came out as the best value by far (despite being winners of the league). It was actually relatively difficult to find this data, as price surveys of the British game, such as the BBC’s Price of Football, change their goalposts – or as I play the game, jumpers on the ground in the park – over a couple of years. They were not collecting data for the cheapest season ticket during the 2010/11 survey. Luckily the Telegraph had collected what we wanted itself, so we worked with that.
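The sum itself is simple enough to sketch in a few lines. This is only an illustration of the division described above, not the spreadsheet we actually used: the club names and figures are placeholders rather than the real prices or goal tallies, `cost_per_goal` is a hypothetical helper name, and exactly which goals went into the count is left as an assumption.

```python
def cost_per_goal(cheapest_season_ticket, goals_scored):
    """Price of the cheapest season ticket divided by the goals it bought."""
    if goals_scored <= 0:
        raise ValueError("a club with no goals has no meaningful cost per goal")
    return cheapest_season_ticket / goals_scored


# Placeholder figures for illustration only, not real prices or goal tallies.
clubs = {
    "Club A": {"ticket": 600.0, "goals": 20},
    "Club B": {"ticket": 300.0, "goals": 60},
}

# Best value first: the lower the cost per goal, the better the deal for the fan.
ranked = sorted(
    ((name, cost_per_goal(c["ticket"], c["goals"])) for name, c in clubs.items()),
    key=lambda pair: pair[1],
)

for name, cost in ranked:
    print(f"{name}: £{cost:.2f} per goal")
```

Sorting by the result puts the best-value club at the top, which is the order a reader scanning the graphic is most likely to care about.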

This also brought home to me the abundance of great sport data there is – I am only scratching the surface with this.