Data is a journalist’s best friend

Nate Silver (Randy Steward/Creative Commons)

Data isn’t just for Nate Silver.

Many journalists are scrambling to learn data journalism skills in schools, at conferences and through free online courses. Interactive graphics and maps seem to be the key to engaging an ever-more distracted public. However, before journalists can tell stories with data, they need to understand what the numbers are saying.

The most critical tool for data journalism may not be D3 or Fusion Tables, but simple statistical analysis.

Journalists with strong statistical skills are the exception, not the norm. Few of the major journalism programs offer complete courses in statistics, leaving journalists to pick up the skills on the job. Big data is already here and will only become bigger in the coming years. Right now, many journalists simply don’t know how to use it.

“People have been doing some version of data journalism for much longer than the term has been around. It’s just the tools and techniques have changed, in addition to some of the ways that data comes to us,” said Chrys Wu, who takes issue with what she calls the current fascination with data journalism.

Wu is an organizer of the New York City chapter of Hacks/Hackers, a Meetup group that brings together technologists and journalists to collaborate on projects, and has worked with The New York Times, the Los Angeles Times and the Knight Foundation. “The current attention around data visualization is partly driven by the very human tendency to always pay attention to the new stuff,” she said.

Wu admits that the recent abundance of digitized data presents challenges for journalists. “Having a better foundation in statistical analysis techniques is going to be really important. If you’re working in news, you know that there’s a chunk of source material out there that’s really data you need to understand,” she said.

Data stars like Nate Silver have demonstrated the mass-market appeal of statistical forecasting. But the ability to analyze numbers is crucial for journalists who cover beats far from the realm of election forecasts, such as health care and education.

The amount of data available to journalists is growing at an explosive rate. A recent International Data Corp. study estimated that the digital information in the world is doubling every two years. Governments, private corporations and lobbying groups are all devoting their resources to capturing the electronic traces of daily life. But this abundance of data brings with it a dilemma for journalists – how can they sift through such unprecedented quantities of information?
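To put the doubling estimate in more familiar terms, the short Python sketch below converts "doubling every two years" into an annual growth rate and a ten-year multiple. It assumes smooth exponential growth, which the study's summary here does not spell out, and the function name growth_factor is purely illustrative.

```python
# Assumption: "doubling every two years" is modeled as smooth exponential growth.
DOUBLING_PERIOD_YEARS = 2


def growth_factor(years, doubling_period=DOUBLING_PERIOD_YEARS):
    """Return how many times larger the data volume is after `years`."""
    return 2 ** (years / doubling_period)


# One year of growth is the square root of 2, or roughly 41 percent.
annual_rate = growth_factor(1) - 1
print(f"Implied annual growth: {annual_rate:.1%}")

# Ten years contains five doubling periods: 2 ** 5 = 32.
print(f"Growth over 10 years: {growth_factor(10):.0f}x")
```

Under that assumption, a reporter who can comfortably handle today's data sets would face roughly 32 times as much material a decade from now.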

The early attempts to address the overabundance of data led to the development of computer-assisted reporting, or CAR. Over time CAR evolved into “data journalism,” a term that encompasses both the back-end skills to sift through vast amounts of data and the visualization skills to translate data into graphics and maps. Some in the field argue that data journalism is just a new name for the in-depth reporting that good journalists have always done.

Lena Groeger, a reporter and designer at ProPublica, echoes the need to be able to sift through large quantities of data without getting lost. “It’s a matter of picking which sources you think are worth spending time on and trying to find the value there,” Groeger said. “Much of the data is just not that interesting.”

Statistical analysis enables journalists to spot the significant patterns and outliers in a sea of data. Formal journalism programs have begun to recognize the value of math. One example: Statistician Mark Hansen is now head of the Brown Institute for Media Innovation at the Columbia Graduate School of Journalism. Journalism programs at New York University and Northwestern’s Medill School of Journalism have hired faculty with advanced degrees in statistics. Investigative Reporters and Editors and the Society of Professional Journalists both have substantial guides to working with statistics and data.
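As a rough illustration of what spotting outliers can look like in practice, the Python sketch below flags values that fall far outside the middle half of a data set, using the common interquartile-range rule of thumb. The spending figures are invented for the example, and the 1.5 multiplier is a convention, not a method endorsed by anyone quoted here.

```python
import statistics


def find_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the middle 50 percent."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical per-pupil spending figures (in dollars) for eight districts.
per_pupil_spending = [9800, 10100, 10250, 9900, 10400, 10050, 18700, 9950]

print(find_outliers(per_pupil_spending))
```

A flagged value is a prompt for reporting rather than a finding: it could be a data-entry error, a unit mismatch or a genuine story.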

At times, it can be hard to hear the voices calling for statistical training above the roar of data visualization enthusiasts. Free statistics courses are available online – some of them even tailored to journalists. But none of these courses is as popular as the Knight Center’s Infographics and Data Visualization, which attracted more than 3,300 participants to the most recent class.

Paul Bradshaw, the founder of HelpMeInvestigate and head of Birmingham City University’s MA in Online Journalism, attributes the skew to math phobia in the journalism community. “Many people become journalists because they want to write,” said Bradshaw, adding that ignoring analytical skills could hinder these math-averse journalists.

Following the 2012 presidential election, Bradshaw criticized American journalists for failing to engage with polling data. “With campaigning camps communicating directly to the electorate, mere access to information has become ever more devalued. In this election it wasn’t that access which won out, but ultimately the analysis, understanding, and data literacy,” he wrote in a blog post. “The reporters who couldn’t handle this witchcraft were left with egg on their faces.”

As the amount of data available soars, statistical skills will only become more valuable. Before journalists can build infographics, they need to evaluate the data and question its integrity, as they would any other source. Before learning Python, they need to revisit regressions. Data journalism may be trendy right now, but a growing contingent of journalists is recognizing that statistics are here to stay.