Local news and data journalism

Mapping America – A New York Times data driven journalism project

Finding stories from datasets need not be something exclusive to large global news organisations populated with specialist journalists and developers devoted to producing data driven articles.

Regional and local news websites and blogs in the UK have a great deal of data available to them – with hopefully more on the way – and the potential to interrogate that data to produce interesting stories and visualisations is certainly there.

Doing data journalism on a local news website

Earlier this month, as web editor of the Hackney Post, a local news website run periodically by journalism students at City University as part of our production weeks, my strategy was to have one data driven story a day accompanied by interactives such as maps, graphs or charts wherever possible.

To plan for these stories, we looked back at what data had been recently released specific to Hackney or London boroughs. Possibly the best website for London-specific datasets is the London Datastore, built by the Greater London Authority (GLA) in a bid towards making more of the capital’s data freely available.

The GLA actually claims to be committed to “influencing and cajoling other public sector organisations into releasing their data” on the site, which is a very noble and praiseworthy aim, and there are some really interesting datasets to explore.

I found some recently published statistics on teenage pregnancies across London and some updated data on food hygiene, while the Bureau of Investigative Journalism had released the regional breakdown of the data behind their story about the prevalence of payday lender shops across the UK.

We looked at each dataset, trying to identify potential stories and we did not come out empty-handed. By analysing the food hygiene data for London, Joe Hall found that half of all Hackney’s food outlets were hit with hygiene warnings in 2013, a figure which constituted the highest rate in London. This is the story we published.

How London boroughs rate in terms of food hygiene – chart by Joe Hall

When it came to teenage pregnancy across London over a 15-year period, Sophie Murray Morris found that while the rate of under-18 pregnancies has fallen by 38 per cent in the capital since 1998, Hackney saw the third biggest decrease of any London borough, at almost 50 per cent. This was the resulting story.

In both cases, the data was freely available in a clean, structured format and only needed some basic analysis in Excel to compare by region, providing some added context to the figures. The maps, created with Google Fusion Tables, also include charts in the info-window to show how each borough compares to the rest of London and England. (Here is a tutorial I wrote showing how to do that.)
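For readers who prefer a script to a spreadsheet, the kind of basic comparison described above can be sketched in a few lines of Python. This is only an illustration: the borough names and rates below are invented placeholders, not the published figures, and the function name is my own.

```python
def percent_change(old, new):
    """Percentage change from old to new; negative values are decreases."""
    return (new - old) / old * 100.0


# Hypothetical start and end rates per borough (NOT the real figures)
rates = {
    "Borough A": (60.0, 30.0),
    "Borough B": (50.0, 40.0),
    "Borough C": (80.0, 36.0),
}

# Work out the change for each borough, then rank from biggest decrease
changes = {name: percent_change(old, new) for name, (old, new) in rates.items()}
ranked = sorted(changes.items(), key=lambda item: item[1])

for rank, (name, change) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: {change:+.1f}%")
```

Ranking every area by the same measure is what turns a headline figure for one borough into a comparison with the rest.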

Hackney Post data journalism interactive on teenage pregnancy rates in London

Proof lies in the analytics

Evidence of the value that data driven stories have in engaging readers lies in the analytics. In the month from February 24 to March 26, readers on average spent two minutes and thirty seconds on a page of The Hackney Post.

When it came to the stories involving interactive data content, all five stories published in the week I was editor were well above average in terms of this dwell time, with four of five posts enjoying almost double that at around 5 minutes.

These are by no means ground-breaking investigative data driven stories and could perhaps have been built up further with more analysis had we had more time to explore other potentially interesting patterns in the data.

However, it just shows that with some simple spreadsheet experience and basic data visualisation skills, regional news outlets can produce interesting stories going beyond citing just the headline figures for their area that generally lack context, as in this example, which is one of many similar local stories online.

The Trinity Mirror’s ‘data unit’ established last year, where data journalists supply their regional titles with engaging data driven stories on a daily basis, looks like a step in the right direction and hopefully an example of things to come.

Here’s an example of one of the most successful stories they’ve done, a schools database, as featured in the Birmingham Mail. The fact that they are also expanding the team shows that there is a real future in the work they do.

Hopefully, this is just the start when it comes to producing interesting data-driven content to engage regional audiences and not a one-off. As the ability to analyse and build stories using a large volume of data becomes an increasingly important skill for journalists to have, regional outlets need not get left behind.

How to build charts in your Google Fusion Tables info-window

How a chart in a Google Fusion Table map looks

Interactive maps are often the go-to tool for data journalists wanting to provide readers with a visual aid to make sense of a story.

For reasons largely associated with the way it is collected and distributed, data is generally available by region – whether electoral ward, constituency, local authority, country – and the list goes on. This makes maps an ideal way of visualising it, giving people a sense of how their own area is doing relative to other areas of their town, other cities, regions or countries.

Furthermore, with data published regularly in a consistent format, which tends to be the case in national government data releases such as that published by the UK’s Office for National Statistics, creating maps in a tool like Google Fusion Tables for example, is relatively straightforward once you get the hang of it.

But sometimes a simple choropleth map (a tutorial on how to do this here) on its own as a way of visualising data may lack analysis or context.

One way to introduce more insight into your interactive map is by adding charts in the info-window box, showing another layer of analysis.

During a recent data journalism class, our lecturer John Burn-Murdoch went through an example and gave us a few lines of the ‘code’ you need to put into the custom fusion table info-window as a starting point from a previous example.

This allowed us to adapt it to our own examples, and I used it to make the map accompanying this story on the levels of young female graduates in the EU.

Clicking on each country will bring up a column chart showing how the rate of female graduates compares to that of male graduates.

google fusion table map with chart

How to make charts appear in the info-window

Essentially, this works by drawing data from your fusion table and the Google Charts API, displaying a bespoke chart for each region when you click on it. It all depends on the chart parameters you input and what data you ‘ask’ it to display.

To try it out, merge your KML fusion table with the table containing your data, make the necessary adjustments to how you want to visualise your map, then click on ‘Change info window’ on the left side of your screen in your configure map options.

configure Google Fusion table info-window

After that, you need to click custom in the box that pops up, which in turn allows you to erase the automated content of the info-window and paste in your own, or as in this case, an example from a previous map which you can then adapt to fit your own.

Below, I have pasted the ‘code’ from my map for you to modify to fit your map.

I will then go through each of the elements to guide you through how to adapt it to your own map.

<div class='googft-info-window'>
<style="font-family: sans-serif">EU Tertiary education attainment, 2012<br><font size="+2"><font color="#980043">{Country}</font></font><br><p><br><img src="http://chart.apis.google.com/chart?
chxt=x,y
&chxr=1,0,60
&chds=0,60
&chbh=a
&chs=310x150
&cht=bvg
&chco=980043
&chd=t:{% of women aged 30 to 34 with a degree},{% of men aged 30 to 34 with a degree}
&chtt={difference in rate between women and men}%+difference+between+women+and+men
&chxl=0:|Women|Men|
"></p>
</div>

Title and appearance:

<div class='googft-info-window'>
<style="font-family: sans-serif">EU Tertiary education attainment, 2012<br><font size="+2"><font color="#980043">{Country}</font></font><br><p><br><img src="http://chart.apis.google.com/chart?

So, this part dictates the appearance of your chart, from the font to the title which is “EU Tertiary education attainment, 2012”, to the size of the title, colour of text and what is displayed at the top.

The thing to remember is that anything in curly brackets is a value from the fusion table and changes according to what country you click on.

So in this case, with {Country}, each time you click on a specific country, it will display the name of that country. The same applies every time the curly brackets are used and it is especially important further down when it is drawing on values in your columns.
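To make the curly-bracket idea concrete, here is a small Python sketch that mimics the substitution: every {Column name} in a template is swapped for the matching value from one row of the table. It is not how Fusion Tables itself is implemented; the function name, the shortened template and the row values (including the country “Exampleland”) are all made up for illustration.

```python
import re


def fill_template(template, row):
    """Replace each {Column name} placeholder with that column's value in row."""
    def lookup(match):
        column = match.group(1)
        # Leave unknown placeholders untouched so mistakes are easy to spot
        return str(row.get(column, match.group(0)))

    return re.sub(r"\{([^{}]+)\}", lookup, template)


# A cut-down, hypothetical info-window template and one hypothetical table row
template = "<b>{Country}</b>: {% of women aged 30 to 34 with a degree}% of women"
row = {
    "Country": "Exampleland",
    "% of women aged 30 to 34 with a degree": 41.5,
}

print(fill_template(template, row))
```

The column name inside the brackets has to match the column header in your table exactly, spaces and symbols included, which is why a misspelt placeholder is left as it is in this sketch.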

A <br> tag indicates a linebreak, while <p> is a paragraph tag.

After that, the rest of the information will determine the type and features of your chart.

Chart Type

The cht parameter, for example, denotes your chart type. What comes after cht= will determine what type of chart it will be: bvg is a simple column chart, while lc, for example, will give you a line chart if that’s the best way of displaying your data.

Here is a list of all the different chart types and their code.

When you change chart type, you may need to modify the other parameters you have to get it to display and a lot of trial and error is the best way of figuring this out.

Chart features – other parameters

&chco=980043 

Series colour

&chs=310x150

Size

&chbh=a

Bar width and spacing (a sizes the bars automatically to fit the chart)

chxt=x,y 

Visible axes

&chxr=1,0,60

Axis ranges

&chds=0,60 

Scale

&chd=t:{% of women aged 30 to 34 with a degree},{% of men aged 30 to 34 with a degree}

The data you want your chart to display. You can either use a curly bracket to bring in data from your table or input a value. After each data range use | to indicate a new value range in the next column/line etc.

&chtt={difference in rate between women and men}%+difference+between+women+and+men

Chart title. You can either use a column from your fusion table to change based on each value, or just text. As you can see, I have used both. It is important to say here that you need to use a + sign wherever you want a space in the text, as it will not recognise spaces.

&chxl=0:|Women|Men|

Axis labels

Here is the full list of chart parameters when making your chart and you will need to change these around based on your values and what type of chart you want to display, but having the code already is a good starting point.
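If you want to check what a finished chart address looks like for one row before pasting anything into the info-window, the parameters above can be assembled in Python. This is a rough sketch under my own assumptions: the function name and the sample numbers are invented, and it only builds the URL text without requesting it. Python's urlencode turns spaces into + signs, which is the same trick described for the chart title.

```python
from urllib.parse import urlencode

BASE_URL = "http://chart.apis.google.com/chart"


def chart_url(women, men, difference):
    """Build the chart URL for one row, using the parameters from the example map."""
    params = {
        "chxt": "x,y",         # visible axes
        "chxr": "1,0,60",      # axis range: axis 1 runs from 0 to 60
        "chds": "0,60",        # scale for the data values
        "chbh": "a",           # bar width and spacing
        "chs": "310x150",      # size in pixels
        "cht": "bvg",          # chart type: column chart
        "chco": "980043",      # series colour
        "chd": f"t:{women},{men}",
        "chtt": f"{difference}% difference between women and men",
        "chxl": "0:|Women|Men|",
    }
    # Keep commas, colons and pipes readable; spaces become + signs
    return BASE_URL + "?" + urlencode(params, safe=",:|")


# Hypothetical values for a single country
url = chart_url(40.0, 27.5, 12.5)
print(url)
```

Note that the percent sign in the title comes out as %25, which is its encoded form in a URL.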

Happy mapping!

Getting data from a PDF

Tabula, one of the five tools I examine in my article

We live in a world where PDF is king. Perhaps we could even go as far as calling it the tyranny of the PDF.

Developed in the early 90s as a way to share documents among computers running incompatible software, the Portable Document Format (PDF) offers a consistent appearance on all devices, ensuring content control and making it difficult for others to copy the information contained within.

However, for a data journalist whose job depends on being able to extract bulk data for analysis and visualisation, the PDF as the filetype of choice does not tend to go down well.

In a field of journalism where the spreadsheet needs to rule the roost, I look at a few ways of turning data enclosed within PDFs into spreadsheets (Excel XLS or CSV), primed for data analysis.

What’s always important to remember in trying to get data out of PDF files is that there is no single catch-all way that works for every occasion; sometimes it’s just a matter of trying each one until you find the one that works.

For the rest of the article and the tutorials, published on the Interhacktives website, please click here. 

Data journalism tools – using Quartz’s chart builder

Quartz chart

There appears to be a real appetite for new tools attempting to make life easier for data journalists.

Ranging from the advanced, like OpenRefine, powerful software for cleaning messy data, to the simple, like the interactive chart-maker Datawrapper, a real favourite among the data driven journalism community, data journalists have an arsenal of tools at their disposal.

I recently found out that the in-house application that digitally native global economy news website Quartz uses to make charts is open source and freely available for anyone to use, and it really is another tool worth adding to the list.

Quartz’s chartbuilder is very much like Datawrapper but without the added level of interactivity, which is actually neither always necessary nor possible, for example for wordpress.com accounts that do not allow embedding charts.

To use the chartbuilder, you simply paste the data into the tool and then pick between bar, column, line and scatter depending on how you want to visualise each data series.

Quartz’s open source chartbuilder is as simple a data journalism tool as they come

There are fields in the workspace for tweaking things like graph axes and labels, while any change you make is visible as you work.

I recently used it for a data-analysis of an EU survey on data roaming and found it to be extremely straightforward, while the option to export the chart as a picture file works a charm.

It’s also important to consider Quartz’s focus on mobile-optimised content: a lot of more interactive content does not really work on mobile, and more ambitious tools may not work so well on a smaller screen.

Also, for those into their design, you can export the chart as an SVG, then open and edit it in Illustrator.

Quartz initially created the chart builder in order to lower the barrier for what they described as “non-technical and less-designerly journalists” to create charts in their own newsroom.

According to the application’s creator, David Yanofsky, the chart-builder “has helped all of our reporters and editors become more responsible for their own content and less dependent on others with specialized graphics skills”.

And opening it up to the rest of the world will hopefully see journalists elsewhere become more responsible for the data and visual content to accompany their articles.

How to verify a photo – Google image search

When it comes to photos and the internet, the mantra is “if it looks too good to be true, then it probably isn’t true”.

Surprisingly, that’s something many people, including journalists, often ignore, and as a result they tend not to apply the care that online photo verification requires.

From disaster movie wallpapers masquerading as Hurricane Sandy pictures to an image of a radioactively mutated 50 metre giant squid, internet photo hoaxes are regular occurrences, with journalists often joining gullible social media users by falling into the trap.

There are a number of ways to test whether a photo is genuine, and while some cleverly photoshopped images may escape scrutiny it is always worth checking up on an image you have doubts about.

This is where Google image search comes in handy.

Google image search

While it is not completely foolproof, running a picture through Google’s image archive should be the first stage of the process, and it will often allow you to dismiss a photo as being fake in a matter of seconds.

A Google image search can be done in one of two ways: either by dragging the photo to a tab or window in your browser open on the Google image search page, or by using a Chrome extension.

Google image search – Method 1

Go to images.google.com/ and click the image button (PHOTO 1). Then simply drag the photo into the image search box, as in the video.

Google image search – Method 2

Install this Chrome extension and right click on an image found online to run it through Google image search.

Google image search extension

The results it gives will show the extensive history of the image, including when and where this image has appeared previously and what other, visually similar images are available. This can offer an indication of whether the image has been tampered with.

In our example, the photograph of the shark supposedly swimming in the streets of New Jersey can easily be identified as fake, since articles linking to the image were available a long time before Hurricane Sandy took place.

This may seem like an easy example to pick, and it probably is. However, a lot of the fake images that start trending at the time of, for example, an extreme weather event have appeared online in the past, either as an internet hoax or images of an actual event that may have taken place at a completely different place and time. This just shows how easily they can be debunked with a simple Google image search.
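The logic of that check is simple enough to write down. As a rough sketch, assuming you have noted by hand the dates of earlier appearances that a reverse image search turned up, the test is just whether the oldest one predates the event the photo supposedly shows. The dates, sources and function names below are hypothetical.

```python
from datetime import date


def earliest_appearance(appearances):
    """Return the (date, source) pair with the oldest date."""
    return min(appearances, key=lambda item: item[0])


def predates_event(appearances, event_start):
    """True if the image was already online before the event began."""
    first_seen, _source = earliest_appearance(appearances)
    return first_seen < event_start


# Hypothetical search results, noted down by hand (not real findings)
appearances = [
    (date(2012, 11, 1), "social media post"),
    (date(2011, 8, 30), "older blog article"),
    (date(2012, 10, 30), "news site gallery"),
]
event_start = date(2012, 10, 20)  # placeholder date for the event in question

first_seen, source = earliest_appearance(appearances)
print(f"Earliest appearance: {first_seen} ({source})")
print("Predates the event:", predates_event(appearances, event_start))
```

An image that fails this test is not from the event; one that passes it still needs the other checks, since a new photo can be manipulated too.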

Here’s more on that from this Storyful blog post, taking you through how easy it was to debunk fake photographs that appeared during the lead up to Hurricane Sandy.

Other useful resources for photo verification:

TinEye reverse image search

http://imageedited.com/

Poynter – Three ways to spot if an image has been manipulated

This article was initially published on interhacktives.com here

The ultimate guide to liveblogging

BBC liveblog of the Boston Marathon bombings

If there is one thing that covering breaking news online has brought the world, it’s the liveblog. Liveblogging has become the default format for engaging audiences in ongoing news stories, allowing websites to compete with rolling news TV coverage.

According to a City Journalism School study conducted last year, liveblogs on the Guardian website were receiving 300 per cent more views and 233 per cent more visitors than conventional online news articles on the same subject. They also outperformed online picture galleries, getting 219 per cent more visitors.
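It is worth pausing on what “per cent more” means here, because it is easy to misread. Taken literally, 300 per cent more views is four times as many, not three. The short sketch below does the conversion for the three figures quoted; the function name is my own, and the literal reading is my assumption about how the study's figures are meant.

```python
def multiplier_from_percent_more(percent_more):
    """Convert 'X per cent more' into 'N times as many'."""
    return 1 + percent_more / 100


figures = [
    ("views than articles", 300),
    ("visitors than articles", 233),
    ("visitors than picture galleries", 219),
]

for label, pct in figures:
    times = multiplier_from_percent_more(pct)
    print(f"{pct} per cent more {label} = {times:.2f} times as many")
```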

Whether it’s an extreme weather event, a breaking financial story, a US election or events like the Oscars and the Grammys, a liveblog is the go-to format to cover it.

I recently had the opportunity to be one of the official livebloggers at the digital journalism conference news:rewired, organised by journalism.co.uk.

Here are a few tips and lessons I learnt about liveblogging.

Don’t be fooled by the name – prepare prepare prepare

While the word liveblog may suggest that all the work takes place in real time, one of the most important things in a great liveblog is the effort you put in beforehand. Researching the story, finding the individuals involved and keeping up to date with any latest developments from them, setting up your social media lists and preparing links to give context are all essential to a high quality liveblog. Making sure you actually do this before an event gets started or as soon as a story breaks will also set you up for a much easier time when things get going.

The day before news:rewired, I prepared notes on each talk that I would be liveblogging, including the Twitter handles of those speaking, as well as some background on the organisations and their work, anticipating that some of their articles might come up during the discussion. Having the link ready and primed to post instead of having to look for it while the discussion was going on was a real help.

Preparation is key to a good liveblog

Some liveblog platforms even give you the option of creating a ‘raw’ feed where you can prepare posts and pull them into the live feed when you need to, making it even simpler when the time comes. ScribbleLive, the platform we used, offered this option and it made life so much easier.

Don’t try to do everything on your own – Use the crowd

Use the crowd to help you liveblog an event

You can’t do everything alone – and nor should you try to. Especially when reporting on a short live event like a conference or a football match, there is real value in gathering and presenting a variety of opinions. There are always other people there who can offer a different – and at times more specialist – perspective and really enhance your liveblog while ensuring you don’t miss anything. Remembering that it’s not necessarily a competition and linking to others will make your liveblog much more relevant. During the news:rewired conference, Sarah Marshall, an experienced former journalism.co.uk reporter and currently Wall Street Journal (WSJ) social media editor, was covering the event through an open Google Doc. Linking to it regularly and checking anything I may have missed was a real asset.

Vary your style

It took me a couple of attempts to get a good variety in the length of my posts – in the first liveblog I did I tried too hard to get everything on a particular issue in a single post before pushing it out, while as a reaction in the second liveblog I was perhaps posting too frequently, interrupting the flow and providing contextless posts.

The ideal is to post short, snappy, interesting facts as they come in, as with breaking news when covering a news liveblog. Varying it with longer texts providing analysis or background whenever possible is vital to keep the audience engaged, while links to additional information are also welcome. Liveblogs of cricket matches do this very well, providing over by over updates as well as a lot of analysis, background, audience engagement and plenty of randomness, which is always fun.

Make it visual

Keeping a liveblog visual is key

Long and continuous blocks of text are hard to keep up with, while breaking up the text with headings, subheadings, bullet points and lists – as well as adding images and videos – is crucial in providing some colour and scannability to whatever you are covering. So there are a number of reasons to focus on keeping your liveblog visual. Whether it’s breaking news and giving a feeling of the situation on the ground or an awards ceremony and giving your audience an insight into what is happening there, keep the photos and videos coming. Keep tabs on Instagram hashtags (using third-party services like Gramfeed and Statigram), Twitter search filtered just by pictures and video, and Vines, and use your own phone to take a quick snap if that’s an option – anything to keep the content interesting and your audience engaged.

Think of the future generations

Well, maybe not quite future generations, but it’s worth realising that a liveblog can often provide a reference long after an event. Turning a liveblog into an ‘as it happened’ post allows people an in-depth catch up on a news story or an event. Taking this into account, it’s important to add a summary, key points and links at the top after the event. Collecting links and useful bookmarks could be extremely helpful for readers who want to read up on particular topics later, when they have more time on their hands.

Here are my bookmarks from news:rewired, collected with pinboard and worth taking a look at if you’re interested in what was said during the conference.

Each talk also has its own tag, so for the BuzzFeed keynote speech on making shareable content, the tag is shareablecontent, for the data journalism on a budget talk the tag is ddj, for short form video it is imaginatively tagged shortformvideo and for Instagram – wait for it – it’s instagram.

What I learnt from my first hackathon

The Interhacktives team busy working on their idea at Build the News

Photo credit: MattieTK/Flickr 

Well, officially it wasn’t a hackathon but, as a journalist and not a developer, Build the News – a two-day event for student journalists and web developers to team up and compete in the production of a digital journalism project – was as close to one as I’d got.

Known as events where computer programmers and others involved in software development collaborate to produce projects in a short and intense period of time, hackathons (or hackdays) where journalists and coders work in tandem offer great potential for the future of digital newsrooms.

Since September I’ve become a big fan of Hacks/Hackers meetups here in London, but there is a fundamental difference between listening to others speak about something they’ve done or built (however helpful and eye-opening it can be) and actually attempting to build it yourself with them.

Build the News, which was organised by The Times digital development team this weekend, was a great experience, facilitating some brilliant innovative ideas and projects, as well as plenty of fun.

I thought I’d write a few brief observations from the weekend while it was still fresh in my mind, so I apologise in advance if this post turns out a little incoherent; it’s been a long week. These are things I’ve learnt from Build the News to apply to future journalism work and projects, not necessarily lessons for future hackathons.

1) Make friends with developers

By far the most important lesson. Talk to them, sit with them, listen to them talk about how they’ve done something – even though you may spend the next few hours trying to understand exactly what they said. In order to understand what is and is not possible to do online, this is step one. Learn about APIs and what they do, ask what GitHub is and how it works (I am still trying to figure out both of those questions myself), try to find out as much as you can about the different programming languages and generally make a real effort to keep up instead of switching off, as you may naturally be inclined to do. I am still struggling with this, but the only real option is to learn through interaction. This is certainly no easy process, but it is one that true digital journalists cannot afford to ignore.

2) Open source your work as a journalist

Journalists are traditionally fiercely protective of the process behind their work and, given the nature of the industry, often with good reason. The developer community generally takes a very different approach to this, and it is certainly one we could learn from. Publicly opening up the thinking behind your work, as well as the process involved in the developing phase of your idea, generally gives you a different perspective on what you are doing, as well as generating interest and potential for feedback. Halfway through the event our group, which consisted of a few students from City University’s Interactive Journalism MA, belatedly set up a Tumblr blog to detail our experience, thoughts, ideas, and the overall progression of our project. This turned out to be extremely helpful to us in terms of understanding our own idea better by communicating it, as well as encouraging interaction. Obviously opening up your work process in journalism is not always applicable, but whenever possible the more you do it, the more you and others can potentially learn and improve.

screenshot from interhacktives build the news tumblr

3) Be nosy

People – and especially tech savvy journalists and programmers – all use technology in different ways. Staring at someone’s screen while they are working may seem bad etiquette, but it can offer an insight into a new world. Whether it’s a programme, an app, a browser extension or even something as minor as a keyboard shortcut, you can learn a new way of doing something online, improve how you use a particular programme or save time from the mundane tasks you run on your device. Look at how others work and don’t be afraid to ask what they are doing and how they did that when they talk about, or show you, their work. I feel the need to clarify – always within reason.

I’m sure there are many more points I could make, but I’ll stop here and hope to write an updated post once I’ve been to a few more similar events.

Can we work out whether the 50p tax rate raised money?

Photo: Howard Stanbury/Flickr – Creative Commons license

The debate following the commitment made by Labour Party Shadow Chancellor Ed Balls of restoring the 50p tax rate for the top earners has been emblematic of how politics and electoral campaigning are played out in the media.

First, a party leader comes out with a policy announcement, backed up by figures and often a new study assessing its financial impact or implications.

Not long after, other party leaders, using a different set of figures or perhaps taking a very different interpretation of the available data, attempt to show why their counterpart is wrong, demonstrating that in fact it is their own plan that has the taxpayers’ best interests at heart.

This will often continue for months on end, with the different sides trying to poke holes in the other’s stats, while both sets of figures are highly unlikely to show the full picture, cherry picked to within an inch of their statistical lives.

When it comes to the 50p tax rate and its potential benefits or pitfalls, the two sides of the argument have cited two rather different reports, which came out at different times.

Ed Balls/Labour:

“Latest figures from the HMRC show that people earning over £150,000 paid almost £10bn more in tax in the three years when the 50p top rate of tax was in place than was estimated at the time when the government did its assessment back in 2012.”

Chancellor George Osborne:

“The direct cost (of reducing the top rate from 50p to 45p) is only 100 million pounds a year. HMRC calculate the loss of other tax revenues may cancel that out. It raises at most a fraction of what we were told and may raise nothing at all.”

So who should we believe?

The answer is… well, probably no one.

In customary fashion, Tim Harford, the Financial Times undercover economist and presenter of Radio 4’s “More or Less” programme that investigates the numbers in the news, said in the latest edition of the show:

“If only life were so simple… and all the taxable income in the country was a delicious gigantic cake and all the Chancellor had to do was decide how big a slice to take. However, taxable income is a moveable cake, it’s a cake that shrinks from the taxman’s cake slice and grows again when the taxman is out of the room.”

Can Ed Balls have the tax cake and eat it too?

Photo: Tattooed_Mummy/Flickr – Creative Commons license

Essentially the main point here is that taxable income changes in response to tax rates. This is especially true in the case of the 50p tax rate, as both its introduction in 2010 by Alistair Darling and George Osborne’s decision to cut it to 45p were pre-announced.

This allowed for major behavioural responses as people adjusted when they paid bonuses and collected dividends based on the rate at the time, either by taking them early in anticipation of the impending tax hike, or by waiting to collect them later once the rate was dropped.

It was in place for such a brief time that it tells us very little about “how much it might have raised in the long run when everything had settled down”, according to the analysis by More or Less.

Data journalism site Ampp3d have done a good job in explaining the debate here, raising the issue of other factors at play, not just the revenue raised by the 50p tax rate.

They also analyse how many people a reintroduction of the 50p tax rate will potentially affect, comparing it to other policies and of course answer the question on everyone’s lips, how its potential introduction will affect Wayne Rooney. This is a great way to provide context to the debate.

Given how tough it is to draw conclusions, especially considering the uncertainty surrounding income tax revenue as a whole (according to More or Less, studies by the Institute for Fiscal Studies show that the top rate of tax at which the Treasury would scoop up the maximum revenue could be anywhere from as low as 30p to as high as 75p), a definitive answer is unlikely to emerge any time soon.
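To see why the revenue-maximising rate can swing so widely, here is a deliberately crude toy model in Python. It assumes taxable income shrinks as the rate rises, following base × (1 − rate) ^ elasticity. That formula, the function names and the elasticity values are all my own illustrative assumptions, not figures or methods from HMRC, the Institute for Fiscal Studies or More or Less.

```python
def revenue(rate, elasticity, base=100.0):
    """Toy revenue: the taxable 'cake' shrinks as the tax rate rises."""
    taxable_income = base * (1 - rate) ** elasticity
    return rate * taxable_income


def revenue_maximising_rate(elasticity):
    """Try every whole-pence rate from 0p to 100p and keep the best one."""
    rates = [pence / 100 for pence in range(0, 101)]
    return max(rates, key=lambda r: revenue(r, elasticity))


# Hypothetical elasticities: higher means income reacts more to the rate
for elasticity in (0.5, 1.0, 2.0):
    best = revenue_maximising_rate(elasticity)
    print(f"elasticity {elasticity}: revenue peaks at about {best * 100:.0f}p")
```

Even in this toy version, a modest change in how strongly income responds moves the peak by tens of pence, which is the heart of the problem with pinning down a single answer.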

In the next year we will undoubtedly have these stats thrown at us time and time again, used to back up the respective arguments.

It is therefore important to acknowledge that these figures are most probably being used to reinforce an ideological point of view ahead of the general election, rather than reflecting a position adopted after a clear and thorough analysis of the data.

If you haven’t already, it is definitely worth listening to the full version of More or Less, available here

Interview with Sarah Hartley, Editor of Contributoria

screenshot of Contributoria

Contributoria: A new model for journalism?

What do Edward Snowden, Corn Exchanges, erotica for women and fish and chips all have in common?

They are just some of the topics that feature in the first issue of Contributoria, the new crowdfunded collaborative journalism platform launched last month.

Backed by the Guardian Media Group (GMG), Contributoria allows its community of journalists and interested readers to decide what articles they would like to see written and support each pitch accordingly – with the community involved in all stages of an article’s development.

The underlying aim is to enable the creation of transparent, high-quality collaborative journalism that might otherwise not have been produced.

I spoke to editor Sarah Hartley about how Contributoria is making collaborative, transparent journalism work.

You can read the interview and more about Contributoria here

Finding meaning in the metrics

In the week or so when the Internet was going through its ‘best *insert literally any word here* of 2013’ phase, I jumped on the bandwagon and wrote a post about the best-received stories of the year gone by on the interhacktives website.

*Interhacktives is the website run by the students on the Interactive Journalism MA at City University London.

Essentially the post was born out of a growing desperation to keep the site updated during a slow couple of weeks in which work experience and the badly needed Christmas holidays put writing for interhacktives on the back burner for a while.

At the same time, this quick analysis of the type of articles that do well for the website offered valuable insight for refining our content strategy and provided added focus for the year ahead.

After Adam Tinworth, our lecturer for the Social Media and Community Engagement module, pointed to what we could learn from analysing this type of data, I thought it was worth delving into the analytics a little further to find some more meaning in the metrics.

The most-read post published on interhacktives – by some distance – was an article on the top ten tools for data journalism. Not only did it receive the most pageviews, but, equally importantly, readers spent around two and a half times longer on that post (7 minutes and 25 seconds) than the average for all pages on the site (2 minutes and 56 seconds).
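The “two and a half times” figure can be checked directly from the two durations quoted above. This is a minimal sketch in Python; the helper name to_seconds is only an illustration and not part of any analytics tool.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes-and-seconds duration into total seconds."""
    return minutes * 60 + seconds


# Durations quoted in the paragraph above.
top_post = to_seconds(7, 25)      # time spent on the top ten tools post
site_average = to_seconds(2, 56)  # average for all pages on the site

ratio = top_post / site_average
print(f"{top_post}s vs {site_average}s: {ratio:.2f} times longer")
# prints "445s vs 176s: 2.53 times longer"
```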

This is a great example of a type of article that can live through time and keep getting pageviews months after initial publication.

As data driven journalism becomes more popular in the industry and as upcoming journalists join media professionals in trying to stay up to date with the skillset needed to do some basic-level data analysis and visualisation, the article’s prominence is no surprise.

pageviews - trends over time

Google Analytics showing pageviews over time for the Top 10 Data Journalism Tools

A look at its popularity over the months shows spikes at different times. Interestingly, new tweets about it from other sources kept coming after its initial publication.

screenshot of a tweet about the top ten journalism article

Top Ten Data Journalism Tools

Another popular story, the third most viewed of 2013, was in fact a post written in April 2012 about making a website compliant with EU cookie law. Looking at the source of its traffic over the 13-month period, more than two out of three views came from Google, as this was a topic that bloggers and others were presumably still searching for. It is in this context that this article’s enduring popularity makes sense.

traffic source data for interhacktives article

Traffic source data for the third most popular post of 2013 on interhacktives

Two more recent how-to guides, on making a choropleth map and on using Raw to make advanced data visualisations, have regularly generated traffic over the last couple of months, often featuring on the trending content widget on the homepage.

Rank Article
1 Top Ten Tools for Data Journalism
2 Who did it best: Data coverage of the 2013 local elections
3 Your website, now illegal: How to comply with the EU cookie law
4 How to Make a choropleth map with google fusion
5 A beginners pre-guide to data journalism
6 Friday Interview: Anne Marie Tomchak, presenter of BBC Trending
7 How to make an alluvial diagram
8 Making data accessible: Interview with Nick Scott from import.io
9 Pivot Tables are your best friend
10 Interview with Andrew Hill of CartoDB

Most interviews, for example, may receive attention at the time, especially via social media, but are unlikely to keep generating traffic to your site in bulk.

Google analytics graph of an interview article on interhacktives

Trend over time graph for an interview article

In fact, analysis of the top 10 articles from January 1st 2013 to February 1st 2014 in terms of pageviews shows that the majority are timeless, durable pieces of content of use to readers beyond their publication date. They are what you would describe as ‘stock content’.

Coursemate Sophie Murray Morris offers an excellent analysis of stock and flow content, a concept originating from economics.

Here’s what she says:

“…stock content is durable. Examples of stock content include podcasts, videos, guides and research work.

Flow content is the stream of daily and sub-daily updates. For instance, news articles, surveys, live blogs and social media updates.

While flow content helps to keep newspapers or brands in the public eye, stock content drives steady and continuous traffic to websites over a long period of time. This is why it is really important not to remove good-quality archived content from a website. Good quality archived content can still drive views in if people are researching the topic, for instance.”
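One rough way to test the claim that most of the top 10 is stock content is to tag each title in the ranking above as stock or flow. The Python sketch below is purely illustrative. The rule that any title containing “interview” counts as flow is an assumption made for this example, not the method behind the analysis, and it is crude: the local elections piece, for instance, is arguably flow content too.

```python
# Titles copied from the top 10 ranking above.
TOP_TEN = [
    "Top Ten Tools for Data Journalism",
    "Who did it best: Data coverage of the 2013 local elections",
    "Your website, now illegal: How to comply with the EU cookie law",
    "How to Make a choropleth map with google fusion",
    "A beginners pre-guide to data journalism",
    "Friday Interview: Anne Marie Tomchak, presenter of BBC Trending",
    "How to make an alluvial diagram",
    "Making data accessible: Interview with Nick Scott from import.io",
    "Pivot Tables are your best friend",
    "Interview with Andrew Hill of CartoDB",
]


def classify(title: str) -> str:
    """Hypothetical heuristic: interviews are flow, everything else is stock."""
    return "flow" if "interview" in title.lower() else "stock"


# Tally how many titles fall into each category.
counts = {"stock": 0, "flow": 0}
for title in TOP_TEN:
    counts[classify(title)] += 1

print(counts)
```

Under this assumed rule only the three interviews count as flow. Moving the elections piece into the flow column by hand would still leave stock content in the majority.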

Articles explaining the difference between Sunni and Shia Muslims are a great example of stock content that will likely drive views long after the publication date.

screenshot of google search difference between sunni and shia

Difference between sunni and shia google search

As the question will undoubtedly crop up regularly over time, any explainers on the issue will regularly attract traffic. The first two search results are from the BBC, from 2009 and 2011, while the Economist’s May 2013 guide comes in third. Both websites will certainly keep getting hits from these one-off explainers based on people’s searches.

Perhaps in the media industry content such as explainers, how-to guides and reviews of apps and tools is often perceived to be of secondary importance, something to accompany a major development or news piece. That may be so, but given the nature of the internet, such pieces can live much longer online than a news article and are a core part of the journalistic task of informing the population.

At interhacktives we have perhaps been guilty of not focusing enough on this and on the potential the website offers to create long-lasting stock content based on the skills we are regularly taught and experiment with as part of the course.

Over the next few months, that is something we should perhaps turn our attention to a little more, so that we leave a lasting legacy on the interhacktives website and hopefully ensure traffic for the site many months after our involvement with it ends.