Local news and data journalism

Screenshot of 'Mapping America' - A New York Times data journalism project

Mapping America – A New York Times data driven journalism project

Finding stories from datasets need not be something exclusive to large global news organisations populated with specialist journalists and developers devoted to producing data driven articles.

Regional and local news websites and blogs in the UK have a great deal of data available to them – with hopefully more on the way – and the potential to interrogate that data to produce interesting stories and visualisations is certainly there.

Doing data journalism on a local news website

Earlier this month, as web editor of the Hackney Post, a local news website run periodically by journalism students at City University as part of our production weeks, my strategy was to have one data driven story a day accompanied by interactives such as maps, graphs or charts wherever possible.

To plan for these stories, we looked back at what data had been recently released specific to Hackney or London boroughs. Possibly the best website for London-specific datasets is the London Datastore, built by the Greater London Authority (GLA) in a bid towards making more of the capital’s data freely available.

The GLA says it is committed to “influencing and cajoling other public sector organisations into releasing their data” on the site, which is a noble and praiseworthy aim, and there are some really interesting datasets to explore.

I found some recently published statistics on teenage pregnancies across London and some updated data on food hygiene, while the Bureau of Investigative Journalism had released the regional breakdown of the data behind their story about the prevalence of payday lender shops across the UK.

We looked at each dataset, trying to identify potential stories, and we did not come out empty-handed. By analysing the food hygiene data for London, Joe Hall found that half of all Hackney’s food outlets were hit with hygiene warnings in 2013, the highest rate in London. This is the story we published.

London boroughs food hygiene chart

How London boroughs rate in terms of food hygiene – chart by Joe Hall

When it came to teenage pregnancy across London over a 15-year period, Sophie Murray Morris found that while the rate of under-18 pregnancies in the capital has fallen by 38 per cent since 1998, Hackney saw the third biggest decrease of any London borough, at almost 50 per cent. This was the resulting story.

In both cases, the data was freely available in a clean, structured format and only needed some basic analysis in Excel to compare by region, providing some added context to the figures. The maps, created with Google Fusion Tables, also include charts in the info-window to show how each borough compares to the rest of London and England. (Here is a tutorial I wrote showing how to do that.)
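The kind of basic comparison described above can also be sketched outside Excel. What follows is a minimal Python sketch and not the analysis actually run for these stories: the borough names and rates are made-up placeholders, and percent_change is a helper name chosen here purely for illustration.

```python
# Hypothetical start and end rates per borough (placeholder figures,
# not the real London Datastore numbers).
rates = {
    "Borough A": (60.0, 31.0),
    "Borough B": (50.0, 35.0),
    "Borough C": (40.0, 22.0),
}


def percent_change(start, end):
    """Percentage change from start to end (negative means a decrease)."""
    return (end - start) / start * 100


# Work out the change for each borough.
changes = {name: percent_change(start, end) for name, (start, end) in rates.items()}

# Sort so the biggest decrease (most negative change) comes first.
ranked = sorted(changes.items(), key=lambda item: item[1])

for position, (name, change) in enumerate(ranked, start=1):
    print(f"{position}. {name}: {change:.1f}%")
```

Ranking the boroughs this way is what turns a headline figure into a comparison, which is where the added context comes from.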

Hackney Post interactive map screenshot

Hackney Post data journalism interactive on teenage pregnancy rates in London

Proof lies in the analytics

Evidence of the value that data driven stories have in engaging readers lies in the analytics. In the month from February 24 to March 26, readers on average spent two minutes and thirty seconds on a page of The Hackney Post.

When it came to the stories involving interactive data content, all five published in the week I was editor were well above average for dwell time, with four of the five posts enjoying almost double the average at around five minutes.

These are by no means ground-breaking investigative data driven stories and could perhaps have been built up further with more analysis had we had more time to explore other potentially interesting patterns in the data.

However, it just shows that with some simple spreadsheet experience and basic data visualisation skills, regional news outlets can produce interesting stories going beyond citing just the headline figures for their area that generally lack context, as in this example, which is one of many similar local stories online.

Trinity Mirror’s ‘data unit’, established last year, where data journalists supply the group’s regional titles with engaging data driven stories on a daily basis, looks like a step in the right direction and hopefully an example of things to come.

Here’s an example of one of the most successful stories they’ve done, a schools database, as featured in the Birmingham Mail. The fact that they are also expanding the team shows that there is a real future in the work they do.

Hopefully, this is just the start when it comes to producing interesting data-driven content to engage regional audiences and not a one-off. As the ability to analyse and build stories using a large volume of data becomes an increasingly important skill for journalists to have, regional outlets need not get left behind.

How to build charts in your Google Fusion Tables info-window

Google Fusion Table chart

How a chart in a Google Fusion Table map looks

Interactive maps are often the go-to tool for data journalists wanting to provide readers with a visual aid to make sense of a story.

For reasons largely associated with the way it is collected and distributed, data is generally available by region, whether electoral ward, constituency, local authority or country, and the list goes on. This makes maps an ideal way of visualising it, giving people a sense of how their own area is doing relative to other areas of their town, other cities, regions or countries.

Furthermore, with data published regularly in a consistent format, which tends to be the case with national government data releases such as those published by the UK’s Office for National Statistics, creating maps in a tool like Google Fusion Tables is relatively straightforward once you get the hang of it.

But sometimes a simple choropleth map (a tutorial on how to do this here) on its own as a way of visualising data may lack analysis or context.

One way to introduce more insight into your interactive map is by adding charts in the info-window box, showing another layer of analysis.

During a recent data journalism class, our lecturer John Burn-Murdoch went through an example and gave us a few lines of the ‘code’ you need to put into the custom fusion table info-window as a starting point from a previous example.

This allowed us to adapt it to our own examples, which I did to make the map to accompany this story on the levels of young female graduates in the EU.

Clicking on each country will bring up a column chart showing how the rate of female graduates compares with that of male graduates.

google fusion table map with chart

How to make charts appear in the info-window

Essentially, this works by drawing data from your fusion table and the Google Charts API, displaying a bespoke chart for each region when you click on it. It all depends on the chart parameters you input and what data you ‘ask’ it to display.

To try it out, merge your KML fusion table with the table containing your data and make the necessary adjustments to how you want to visualise your map, then click on ‘Change info window’ on the left side of your screen in your configure map options.

configure Google Fusion table info-window

After that, you need to click custom in the box that pops up, which in turn allows you to erase the automated content of the info-window and paste in your own, or as in this case, an example from a previous map which you can then adapt to fit your own.

Below, I have pasted the ‘code’ from my map for you to modify to fit your map.

I will then go through each of its elements to guide you through how to adapt it to your own map.

<div class='googft-info-window' style="font-family: sans-serif">
EU Tertiary education attainment, 2012<br><font size="+2"><font color="#980043">{Country}</font></font><br><p><br><img src="http://chart.apis.google.com/chart?
chxt=x,y
&chxr=1,0,60
&chds=0,60
&chbh=a&chs=310x150
&cht=bvg&chco=980043
&chd=t:{% of women aged 30 to 34 with a degree},{% of men aged 30 to 34 with a degree}
&chtt={difference in rate between women and men}%+difference+between+women+and+men
&chxl=0:|Women|Men|
"></p>
</div>

Title and appearance:

<div class='googft-info-window' style="font-family: sans-serif">
EU Tertiary education attainment, 2012<br><font size="+2"><font color="#980043">{Country}</font></font><br><p><br><img src="http://chart.apis.google.com/chart?

So, this part dictates the appearance of your chart, from the font to the title which is “EU Tertiary education attainment, 2012”, to the size of the title, colour of text and what is displayed at the top.

The thing to remember is that anything in curly brackets is a value from the fusion table and changes according to what country you click on.

google fusion table customising info-window

So in this case, with {Country}, each time you click on a specific country, it will display the name of that country. The same applies every time the curly brackets are used and it is especially important further down when it is drawing on values in your columns.

A <br> tag indicates a linebreak, while <p> is a paragraph tag.

After that, the rest of the information will determine the type and features of your chart.

Chart Type

The cht parameter, for example, denotes your chart type. What comes after cht= will determine what type of chart it will be: bvg is a simple column chart, while lc will give you a line chart if that’s the best way of displaying your data.

Here is a list of all the different chart types and their code.

When you change chart type, you may need to modify the other parameters you have to get it to display and a lot of trial and error is the best way of figuring this out.

Chart features – other parameters

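&chco=980043

Series colour (a hex colour code)

&chs=310x150

Size (width x height in pixels)

&chbh=a

Bar width and spacing (a sizes the bars automatically to fit the chart)

chxt=x,y

Visible axes

&chxr=1,0,60

Axis ranges (axis index, start value, end value)

&chds=0,60

Scale (the minimum and maximum used to scale the data values)

&chd=t:{% of women aged 30 to 34 with a degree},{% of men aged 30 to 34 with a degree}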

The data you want your chart to display. You can either use a curly bracket to bring in data from your table or input a value. After each data range use | to indicate a new value range in the next column/line etc.

&chtt={difference in rate between women and men}%+difference+between+women+and+men

Chart title. You can either use a column from your fusion table, so that the title changes based on each value, or just text. As you can see, I have used both. It is important to say here that you need to use a + sign wherever you want a space in the text, as it will not recognise spaces.

&chxl=0:|Women|Men|

Axis labels

Here is the full list of chart parameters when making your chart and you will need to change these around based on your values and what type of chart you want to display, but having the code already is a good starting point.
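To make the structure of that chart URL easier to see, here is a minimal Python sketch that assembles the same parameters from a dictionary. It is only an illustration under stated assumptions: build_chart_url is a hypothetical helper name, the figures passed in are made-up placeholders standing in for the curly-bracket values that Fusion Tables fills in for each country, and the sketch only builds the string without contacting the chart service.

```python
from urllib.parse import urlencode

# Base address taken from the info-window code above.
BASE_URL = "http://chart.apis.google.com/chart"


def build_chart_url(women, men, difference, colour="980043"):
    """Assemble a chart URL for one country (hypothetical helper)."""
    params = {
        "chxt": "x,y",         # visible axes
        "chxr": "1,0,60",      # axis ranges: axis index, start, end
        "chds": "0,60",        # data scale
        "chbh": "a",           # bar width and spacing sized automatically
        "chs": "310x150",      # size in pixels, width x height
        "cht": "bvg",          # chart type: column chart
        "chco": colour,        # series colour
        "chd": f"t:{women},{men}",  # the data series
        "chtt": f"{difference}% difference between women and men",  # title
        "chxl": "0:|Women|Men|",    # axis labels
    }
    # urlencode turns spaces into + signs; commas, colons and pipes are
    # left untouched because the chart parameters rely on them.
    return BASE_URL + "?" + urlencode(params, safe=",:|")


# Placeholder values, not real Eurostat figures.
url = build_chart_url(40.1, 30.5, 9.6)
print(url)
```

Writing the parameters out as a dictionary makes the trial and error easier, since you can change one value at a time and see which part of the URL it affects.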

Happy mapping!

Getting data from a PDF

Tabula tutorial

Tabula, one of the five tools I examine in my article

We live in a world where PDF is king. Perhaps we could even go as far as calling it the tyranny of the PDF.

Developed in the early 90s as a way to share documents among computers running incompatible software, the Portable Document Format (PDF) offers a consistent appearance on all devices, ensuring content control and making it difficult for others to copy the information contained within.

However, for a data journalist whose job depends on being able to extract bulk data for analysis and visualisation, the PDF as the filetype of choice does not tend to go down well.

In a field of journalism where the spreadsheet needs to rule the roost, I look at a few ways of turning data enclosed within PDFs into spreadsheets (Excel XLS or CSV), primed for data analysis.

What’s always important to remember in trying to get data out of PDF files is that there is no single catch-all way that works for every occasion. Sometimes it’s just a matter of trying each one until you find the one that works.

For the rest of the article and the tutorials, published on the Interhacktives website, please click here. 

Data journalism tools – using Quartz’s chart builder

Quartz chart

There appears to be a real appetite for new tools attempting to make life easier for data journalists.

Ranging from the advanced, like OpenRefine, powerful software for cleaning messy data, to the simple, like the interactive chart-maker Datawrapper, a real favourite among the data driven journalism community, data journalists have an arsenal of tools at their disposal.

I recently found out that the in-house application which Quartz, the digitally native global economy news website, uses to make charts is open source and freely available for anyone to use, and it really is another tool worth adding to the list.

Quartz’s chartbuilder is very much like Datawrapper but without the added level of interactivity, which is neither always necessary nor always possible, for example on WordPress.com accounts that do not allow embedding charts.

To use the chartbuilder, you simply paste the data into the tool and then pick between bar, column, line and scatter depending on how you want to visualise each data series.

Quartz chartbuilder back end

Quartz’s open source chartbuilder is as simple a data journalism tool as they come

There are fields in the workspace for tweaking things like graph axes and labels, while any change you make is visible as you work.

I recently used it for an analysis of an EU survey on data roaming and found it to be extremely straightforward, while the option to export the chart as a picture file works like a charm.

It’s also important to consider Quartz’s focus on mobile-optimised content: a lot of interactive content does not really work on mobile, and charts from more ambitious tools may not display so well on a smaller screen.

Also, for those into their design, you can export the chart as an SVG, which you can then open and edit in Illustrator.

Quartz initially created the chart builder in order to lower the barrier for what they described as “non-technical and less-designerly journalists” to create charts in their own newsroom.

According to the application’s creator, David Yanofsky, the chart-builder “has helped all of our reporters and editors become more responsible for their own content and less dependent on others with specialized graphics skills”.

And opening it up to the rest of the world will hopefully see journalists elsewhere become more responsible for the data and visual content to accompany their articles.

How to verify a photo – Google image search

When it comes to photos and the internet, the mantra is “if it looks too good to be true, then it probably isn’t true”.

Surprisingly, that’s something many people, including journalists, often ignore, and as a result they tend not to apply the care that online photo verification requires.

From disaster movie wallpapers masquerading as Hurricane Sandy pictures to an image of a radioactively mutated 50-metre giant squid, internet photo hoaxes are regular occurrences, with journalists often joining gullible social media users in falling into the trap.

There are a number of ways to test whether a photo is genuine, and while some cleverly photoshopped images may escape scrutiny it is always worth checking up on an image you have doubts about.

This is where Google image search comes in handy.

Google image search

While it is not completely foolproof, running a picture through Google’s image archive should be the first stage of the process, and it will often allow you to dismiss a photo as being fake in a matter of seconds.

A Google image search can be done in one of two ways: either by dragging the photo to a tab or window in your browser open on the Google image search page, or by using a Chrome extension.

Google image search – Method 1

Go to images.google.com/ and click the image button (PHOTO 1). Then simply drag the photo into the image search box, as in the video.

Google image search – Method 2

Install this Chrome extension and right click on an image found online to run it through Google image search.

Google image search extension

The results it gives will show the extensive history of the image, including when and where this image has appeared previously and what other, visually similar images are available. This can offer an indication of whether the image has been tampered with.

In our example, the photograph of the shark supposedly swimming in the streets of New Jersey can easily be identified as fake, since articles linking to the image were available a long time before Hurricane Sandy took place.

This may seem like an easy example to pick, and it probably is. However, a lot of the fake images that start trending at the time of, for example, an extreme weather event have appeared online in the past, either as an internet hoax or images of an actual event that may have taken place at a completely different place and time. This just shows how easily they can be debunked with a simple Google image search.
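The date check at the heart of this can be sketched in a few lines of Python. This is only an illustration: the appearance dates below are made-up placeholders for dates you would note down by hand from the search results, and earlier_appearances is a hypothetical helper name. The event date used is 29 October 2012, when Hurricane Sandy made landfall in New Jersey.

```python
from datetime import date


def earlier_appearances(appearances, event_date):
    """Return the appearance dates that fall before the event, earliest first."""
    return sorted(d for d in appearances if d < event_date)


# Date Hurricane Sandy made landfall in New Jersey.
event = date(2012, 10, 29)

# Placeholder dates, standing in for those noted from the search results.
appearances = [date(2012, 10, 30), date(2011, 8, 25), date(2012, 10, 29)]

earlier = earlier_appearances(appearances, event)

if earlier:
    print(f"Image was already online on {earlier[0]}, before the event.")
else:
    print("No earlier appearance found, so keep checking by other means.")
```

Finding no earlier copy does not prove an image is genuine, which is why the search is only the first stage of the process.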

Here’s more on that from this Storyful blog post, taking you through how easy it was to debunk fake photographs that appeared during the lead up to Hurricane Sandy.

Other useful resources for photo verification:

Tin Eye reverse image search

http://imageedited.com/

Poynter – Three ways to spot if an image has been manipulated

This article was initially published on interhacktives.com here

The ultimate guide to liveblogging

Example of a liveblog

BBC liveblog of the Boston Marathon bombings

If there is one thing that covering breaking news online has brought the world, it’s the liveblog. Liveblogging has become the default format for engaging audiences in ongoing news stories, allowing websites to compete with rolling news TV coverage.

According to a City Journalism School study conducted last year, liveblogs on the Guardian website were receiving 300 per cent more views and 233 per cent more visitors than conventional online news articles on the same subject. They also outperformed online picture galleries, getting 219 per cent more visitors.

Whether it’s an extreme weather event, a breaking financial story, a US election or events like the Oscars and the Grammys, a liveblog is the go-to format to cover it.

I recently had the opportunity to be one of the official livebloggers at the digital journalism conference news:rewired, organised by journalism.co.uk.

Here are a few tips and lessons I learnt about liveblogging.

Don’t be fooled by the name – prepare prepare prepare

While the word liveblog may suggest that all the work takes place in real time, one of the most important things in a great liveblog is the effort you put in beforehand. Researching the story, finding the individuals involved and keeping up to date with any latest developments from them, setting up your social media lists and preparing links to give context are all essential to a high quality liveblog. Making sure you actually do this before an event gets started, or as soon as a story breaks, will also set you up for a much easier time when things get going.

The day before news:rewired, I prepared notes on each talk that I would be liveblogging, including the Twitter handles of those speaking, as well as some background on the organisations and their work, anticipating that some of their articles might come up during the discussion. Having the link ready and primed to post instead of having to look for it while the discussion was going on was a real help.

preparing for a liveblog

Preparation is key to a good liveblog

Some liveblog platforms even give you the option of creating a ‘raw’ feed where you can prepare posts and pull them into the live feed when you need to, making it even simpler when the time comes. ScribbleLive, the platform we used, offered this option and it made life so much easier.

Don’t try to do everything on your own – Use the crowd

using the crowd in a liveblog

Use the crowd to help you liveblog an event

You can’t do everything alone, and nor should you try to. Especially when reporting on a short live event like a conference or a football match, there is real value in gathering and presenting a variety of opinions. There are always other people there who can offer a different (and at times more specialist) perspective and really enhance your liveblog while ensuring you don’t miss anything. Remembering that it’s not necessarily a competition and linking to others will make your liveblog much more relevant. During the news:rewired conference, Sarah Marshall, an experienced former journalism.co.uk reporter and now social media editor at the Wall Street Journal (WSJ), was covering the event through an open Google Doc. Linking to it regularly and checking it for anything I may have missed was a real asset.

Vary your style

It took me a couple of attempts to get a good variety in the length of my posts. In the first liveblog I did, I tried too hard to get everything on a particular issue into a single post before pushing it out, while as a reaction in the second liveblog I was perhaps posting too frequently, interrupting the flow and providing contextless posts.

The ideal is to post short, snappy, interesting facts as they come in, as you would with breaking news on a news liveblog. Varying this with longer texts providing analysis or background whenever possible is vital to keep the audience engaged, while links to additional information are also welcome. Liveblogs of cricket matches do this very well, providing over-by-over updates as well as a lot of analysis, background, audience engagement and plenty of randomness, which is always fun.

Make it visual

liveblog visual

Keeping a liveblog visual is key

Long and continuous blocks of text are hard to keep up with, so breaking up the text with headings, subheadings, bullet points and lists, as well as adding images and videos, is crucial in providing some colour and scannability to whatever you are covering. Whether it’s breaking news, where you want to give a feeling of the situation on the ground, or an awards ceremony, where you want to give your audience an insight into what is happening there, keep the photos and videos coming. Keep tabs on Instagram hashtags (using third-party services like Gramfeed and Statigram), Twitter search filtered by pictures and video, and Vines, and use your own phone to take a quick snap if that’s an option: anything to keep the content interesting and your audience engaged.

Think of the future generations

Well, maybe not quite future generations, but it’s worth realising that a liveblog can often provide a reference long after an event. Turning a liveblog into an ‘as it happened’ post allows people an in-depth catch-up on a news story or an event. Taking this into account, it’s important to add a summary, key points and links at the top after the event. Collecting links and useful bookmarks can be extremely helpful for readers who want to read up on particular topics later, when they have more time on their hands.

Here are my bookmarks from news:rewired, collected with Pinboard and worth taking a look at if you’re interested in what was said during the conference.

Each talk also has its own tag, so for the BuzzFeed keynote speech on making shareable content, the tag is shareablecontent, for the data journalism on a budget talk the tag is ddj, for short form video it is imaginatively tagged shortformvideo and for Instagram (wait for it) it’s instagram.

What I learnt from my first hackathon

journalists and developers working together

The Interhacktives team busy working on their idea at Build the News

Photo credit: MattieTK/Flickr 

Well, officially it wasn’t a hackathon but, as a journalist and not a developer, Build the News – a two-day event for student journalists and web developers to team up and compete in the production of a digital journalism project – was as close to one as I’d got.

Known as events where computer programmers and others involved in software development collaborate to produce projects in a short and intense period of time, hackathons (or hackdays) where journalists and coders work in tandem offer great potential for the future of digital newsrooms.

Since September I’ve become a big fan of Hacks/Hackers meet-ups here in London, but there is a fundamental difference between listening to others speak about something they’ve done or built (however helpful and eye-opening it can be) and actually attempting to build it yourself with them.

Build the News, which was organised by The Times digital development team this weekend, was a great experience, facilitating some brilliant, innovative ideas and projects, as well as plenty of fun.

I thought I’d write a few brief observations from the weekend while it was still fresh in my mind, so I apologise beforehand in case this post turns out a little incoherent. It’s been a long week. These are things I’ve learnt from Build the News to apply to future journalism work and projects, not necessarily lessons for future hackathons.

1) Make friends with developers

By far the most important lesson. Talk to them, sit with them, listen to them talk about how they’ve done something, even though you may spend the next few hours trying to understand exactly what they said. In order to understand what is and is not possible to do online, this is step one. Learn about APIs and what they do, ask what GitHub is and how it works (I’m still trying to figure out both of those questions myself), try to find out as much as you can about the different programming languages and generally make a real effort to keep up instead of switching off, as you may naturally be inclined to do. I am still struggling with this, but the only real option is to learn through interaction. This is certainly no easy process, but it is one that true digital journalists cannot afford to ignore.

2) Open source your work as a journalist

Journalists are traditionally fiercely protective of the process behind their work and, given the nature of the industry, often with good reason. The developer community generally takes a very different approach to this, and it is certainly one we could learn from. Publicly opening up the thinking behind your work, as well as the process involved in the development phase of your idea, generally gives you a different perspective on what you are doing, as well as generating interest and potential for feedback. Halfway through the event our group, which consisted of a few students from City University’s Interactive Journalism MA, belatedly set up a Tumblr blog to detail our experience, thoughts, ideas and the overall progression of our project. This turned out to be extremely helpful to us in terms of understanding our own idea better by communicating it, as well as encouraging interaction. Obviously, opening up your work process in journalism is not always applicable, but whenever possible, the more you do it, the more you and others can potentially learn and improve.

screenshot from interhacktives build the news tumblr

3) Be nosy

People – and especially tech savvy journalists and programmers – all use technology in different ways. Staring at someone’s screen while they are working may seem bad etiquette, but it can offer an insight into a new world. Whether it’s a programme, an app, a browser extension or even something as minor as a keyboard shortcut, you can learn a new way of doing something online, improve how you use a particular programme or save time from the mundane tasks you run on your device. Look at how others work and don’t be afraid to ask what they are doing and how they did that when they talk about, or show you, their work. I feel the need to clarify – always within reason.

I’m sure there are many more points I could make, but I’ll stop here and hope to write an updated post once I’ve been to a few more similar events.