Tag Archives: data visualization

How Hacks/Hackers Rosario Made Its Map of Intentional Homicides of Rosario City

The Rosario Intentional Homicides Map 2013 is the first digital data journalism platform developed especially for publication in Rosario, Argentina, media. The platform and the map are the result of several months of intense work, exchange of ideas, information and building mutual trust between a team of journalists, designers and programmers from Hacks/Hackers Rosario (HHROS) and members from La Capital newspaper.

As with all firsts, there is a story behind those who were involved and Hacks/Hackers Rosario wanted to share it with all the Hacks/Hackers community.

The Rosario group launched in April 2013. Currently, it has 133 active members and has already held seven meetups. Community activities have ranged from discussions about what data journalism means to workshops on D3.js data visualization, digital security seminars, social events, crash courses and presentations of proposed projects for future hackathons.

The idea of making a map of intentional homicides arose from the concerns of the HHROS co-organizers during one of the usual rounds of project presentations. The purpose of the whole enterprise was to create a platform that would help demonstrate, through data visualization, the increase in social violence across the city: 214 intentional homicides in 2013 (the city average is 21 murders per 100,000 inhabitants; the national average is 5.5 per 100,000).
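As a rough check on the figures quoted above, the per-100,000 rate can be worked through in a few lines. This is only a sketch: the function names are illustrative, and the population it derives is implied by the rounded rate of 21, not an official census figure.

```python
def rate_per_100k(count, population):
    """Events per 100,000 inhabitants."""
    return count / population * 100_000


def implied_population(count, rate):
    """Population implied by an event count and a per-100,000 rate."""
    return count / rate * 100_000


homicides_2013 = 214   # city of Rosario, from the article
city_rate = 21.0       # per 100,000, from the article (rounded)
national_rate = 5.5    # per 100,000, from the article

# Population implied by the two city figures (approximate, since 21 is rounded).
population_estimate = implied_population(homicides_2013, city_rate)

# How many times higher the city rate is than the national rate.
ratio = city_rate / national_rate

print(round(population_estimate))  # 1019048
print(round(ratio, 1))             # 3.8
```

In other words, the quoted numbers are consistent with a city of roughly one million inhabitants and a homicide rate a little under four times the national average.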

The data for the platform came from the Ministry of Justice in Santa Fe, through the police beat reporters of the newspaper La Capital. This information was used to create the dataset of points and geolocation on the map corresponding to each of the casualties. The data related to context was obtained from reading all the daily chronicles published in 2013.

The team that developed the platform was just two unpaid people: a programmer and a journalist. Together they devised its design and its short- and medium-term objectives.

Pablo Cuadrado handled platform development and design adjustments; Ezequiel Clerici normalized the dataset and fact-checked the information shown on the map. The newspaper La Capital, represented by Hernán Lascano (police beat editor), provided the original database and oversaw the contextualization of events.

Our Process

Because each report and location had to be manually confirmed, we decided to limit the mapping of intentional homicides to the city of Rosario. This meant places like Villa Gobernador Gálvez, Baigorria, Pérez, Ibarlucea and Funes, which are part of the Rosario Department (greater metro area), were left out. Our decision reduced the number of homicides appearing on the map from 264 for the entire department to 214 just for the city.

The purpose of this decision was to reduce the amount of standardization work on the dataset provided by La Capital, since we had a small team of people and limited time available to work on the project.

With respect to the dataset, the first thing we did was remove all homicides that did not occur in the city of Rosario. Then we searched La Capital’s digital archive for the chronicle of every homicide to obtain an article link to include on the map.
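The filtering step can be sketched as follows. The `locality` field and the sample rows are assumptions made for illustration; the article does not say how the original spreadsheet recorded where each homicide took place.

```python
# Hypothetical sample rows; the real dataset came from La Capital.
rows = [
    {"id": 1, "locality": "Rosario"},
    {"id": 2, "locality": "Funes"},
    {"id": 3, "locality": "rosario "},
    {"id": 4, "locality": "Villa Gobernador Gálvez"},
]


def in_city(row, city="Rosario"):
    """True when the row's locality matches the city, ignoring case and padding."""
    return row["locality"].strip().casefold() == city.casefold()


# Keep only records inside the city; set the rest aside rather than discarding
# them silently, so the totals can be checked against the source figures.
city_rows = [row for row in rows if in_city(row)]
dropped_rows = [row for row in rows if not in_city(row)]

print([row["id"] for row in city_rows])     # [1, 3]
print([row["id"] for row in dropped_rows])  # [2, 4]
```

Keeping the dropped rows makes it easy to confirm that the kept and excluded counts add up to the department-wide total.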

The links to the chronicles were essential for georeferencing every event on the map. The stories were also useful for corroborating ages, full names, police stations, courts and addresses.

To avoid headaches, it was necessary to obtain accurate addresses. The problem was that a significant number referred to the intersection of two streets (“Ezeiza and Filiberto”, “Rueda and Pascual Roses”, etc.), recreational spaces (“Pools of Saladillo”) or parks (“Independence Park”). These descriptions did not allow a precise location, so in many cases it was necessary to go to the news reports and skim them for further details. This manual work allowed accurate geolocation in most cases, and approximations otherwise.

Despite these precautions, we had to add new columns of information to the original dataset to georeference each event reliably. The original Excel file had 8 columns: homicide number, date of death, the victim’s name, age, approximate address, type of weapon and motive, police station, and in some cases the competent court. To those we added three new columns: exact address; city, state, country; and district. With this change, the georeferencing problem was solved and we were able to map the 214 events smoothly.
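One way to picture how the added columns help is to assemble a single geocoding query per record. The sketch below is not the team's actual code: the key names are hypothetical, and the street address is a made-up placeholder.

```python
def build_geocoding_query(row):
    """Return (query, is_exact) for one record.

    Prefers the manually verified exact address and falls back to the
    approximate one, flagging the result so approximations stay visible.
    """
    exact = (row.get("exact_address") or "").strip()
    approximate = (row.get("approximate_address") or "").strip()
    place = (row.get("city_state_country") or "").strip()
    street = exact or approximate
    query = ", ".join(part for part in (street, place) if part)
    return query, bool(exact)


# Made-up record: the exact address below is a placeholder, not real data.
verified = {
    "approximate_address": "Ezeiza and Filiberto",
    "exact_address": "Calle Ejemplo 1234",
    "city_state_country": "Rosario, Santa Fe, Argentina",
    "district": "Sur",
}
unverified = {
    "approximate_address": "Independence Park",
    "exact_address": "",
    "city_state_country": "Rosario, Santa Fe, Argentina",
}

print(build_geocoding_query(verified))
# ('Calle Ejemplo 1234, Rosario, Santa Fe, Argentina', True)
print(build_geocoding_query(unverified))
# ('Independence Park, Rosario, Santa Fe, Argentina', False)
```

Appending the city, state and country to every query reduces the chance that a geocoder matches a street of the same name in another city, and the district column stays available for filtering on the map.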

At first, we used the free version of CartoDB to georeference the killings. It allowed work on five tables and drew polygons quickly and easily. Later, because of the free version’s monthly view limits and the slow response time when combining multiple filters, we moved the platform and its contents (police stations, districts, timeline, milestones, etc.) to D3.js and replaced CartoDB with OpenStreetMap (via Leaflet) to deliver the response speed users demanded.


The publication of the map made a big impact because it brought into focus a range of issues that go beyond just the high number of homicides. This was made possible, in part, by the analytical perspective that data visualization gives journalism.

The platform and the project were submitted to the Global Editors Network Data Journalism Awards 2014 and will soon be submitted to the Fundación Nuevo Periodismo Iberoamericano (FNPI) Gabriel Garcia Marquez journalism awards. Also, as a result of this work, the team formed around Hacks/Hackers Rosario was invited to speak at the Media Party 2014 (CABA, Argentina).

This work has led to the creation of VisPress, a startup focused on developing platforms and tools built around data visualization and analysis.

The aim of the VisPress founders is to provide data visualization services to local and international media, as well as to public and private sector organizations that handle large volumes of data and want to put them to productive use, improving their decision processes and achieving higher efficiency.

Data Journalism: A Showcase of Viz Projects in India

Building on the excitement around data journalism at our previous hackathon, Hacks/Hackers New Delhi recently paired up with the Hindustan Times — one of India’s leading English newspapers — to host a showcase of innovative data journalism work going on in the country.

The goal was to share best practices when it comes to using data to source, tell and visualize stories.

Avinash Celestine of the Economic Times started off by explaining how he’s using open data, particularly Indian government census data, to answer big questions about socioeconomic trends on his Datastories blog. He stressed the importance of putting data in context, giving the example of how he recently tried to understand women’s declining participation in the labor force by contrasting it with data about increased studying and housework.

Cordelia Jenkins of Mint newspaper explained Trading Up: Slum Economics, the data journalism project that recently won the GEN Editor’s Lab hackathon in New Delhi. The team’s goal is to create an app that visualizes and displays detailed household and economic data for slums. Users can compare different slums side-by-side, graph the parameters they choose, and ideally, for journalists, pick out story ideas. The project is at the concept stage, and will take another three months to become operational.

Ravi Bajpai of Down to Earth talked about how he produces data visualizations in a hurry, in a day or less, for the organization’s blog. He parses survey results, pulls out relevant data, and creates interactive infographic-driven posts that draw a lot of users, get good traction on social media, and keep users on page for longer.

Neeta Verma, who works with the government’s National Informatics Centre, presented data.gov.in, a new open data site from the Indian government. The site features several tools, including inbuilt visualizations that users can embed on outside blogs and pages, the opportunity to request and vote for more data sets, and a developers community. The site is free, but has lagged in adoption as NIC works to get more data onsite.

Guneet Narula, of the data science collective DataMeet, presented some of the work that their members have done. Their projects include the Geohackers blog and the India Water Portal. Most of their members, he said, come from a coding and tech background, and would like to work more with journalists.

Between them, the presenters mentioned several freely available tools for quickly packaging/presenting data, including the D3 viz library, Tableau Public, Datawrapper, Leaflet, and MapBox.

Overall, it was an informative set of presentations that explored some of the creative data-driven work going on in India today. About 50 people came to Sunday’s event at the HT House, and several more followed along on Twitter. For more info about Hacks/Hackers New Delhi or to join, check out the meetup page.

Hacks/Hackers Austin: Tableau Public

On April 1, Hacks/Hackers ATX (in conjunction with ONA Austin) hosted Ben Jones and Jewell Loree of Tableau Public. The pair went through the comprehensive offering that Tableau provides for data visualization.

The meeting opened with a presentation from Harsh Patel of MakerSquare, a new organization providing Web development training in the Austin area.

Many thanks to Christian McDonald for arranging this event, with refreshments sponsored by Tableau Public. And we are always grateful for the use of space at The Austin American-Statesman.


Hacks meet Hackers in packed Ottawa pub

Walking and road biking to work is most popular in Nunavut. Canada’s federal Conservative Party raises more funds through personal donations than the rival Liberal Party does overall. And in Ottawa, you’re most likely to get a parking ticket on Lynda Lane, not far from the Ottawa Hospital.

Each of these tidbits, a story in its own right, and many more tales buried, sometimes deeply, in publicly available data were revealed at the inaugural Hacks/Hackers Ottawa event on May 12.

In an overcrowded pub basement, the beer was pouring as freely as ideas about the future of storytelling in a data-driven world. The house draught list had two speakers: Glen McGregor of the Ottawa Citizen and Alice Funke of punditsguide.ca.

While he was humble about his own work in the field, Glen put the crowd, half hacks and half hackers, on level footing. He provided an introduction to tech-assisted journalism, explained how journalists shouldn’t depend on governments to provide important data, and spoke about how every column in a spreadsheet could lead to a story.

Alice, a hero on Parliament Hill in all political corners thanks to her electoral data crunching, happily managed to out-geek Glen. She showcased how she reverse-engineered inaccessible public elections data into gigabytes of relational databases. The hackers were wowed by her smug SQL, while visions of headlines danced in hacks’ heads. If every column is a story, her work could be a multi-volume epic.

Between presentations, one of Ottawa’s leading hackers offered beer to any hack/hacker pair who came forward with one collaborative idea. That beer, perhaps unsurprisingly, was soon claimed. Journalists from across the country joined local developer groups, data visualizers, political parties, public servants (and one accordion guy) to launch Ottawa’s chapter.

The event couldn’t have happened without the support of the Ottawa Citizen, OpenFile Ottawa and Open Data Ottawa. The next meet-up will be sometime in July, and we hope to see you there. Join the Hacks/Hackers Ottawa meetup group to be notified.

Nick Taylor-Vaisey and Alex Lougheed are a Hack and Hacker, respectively, who helped get Hacks and Hackers Ottawa off the ground. They can be reached through the group’s meetup page at hackshackers.com/chapter/ottawa.

Hacks/Hackers NYC: Wikileaks – Data Science & Data Journalism

When WikiLeaks released the Afghanistan and Iraq war logs, news organizations and the public alike sprang into action to understand the documents.

The New York Times was instrumental in analyzing and reporting the story in articles, photographs, maps and graphic information.

Meanwhile, several local hackers worked on their own data visualizations and were featured soon after on Wired, NPR and the New York Times.

RSVP now to join Hacks/Hackers NYC on March 9 at New Work City to learn how the analyses were done, the importance of independent validation checks on data, and see further examples of their work.


  • Drew Conway, PhD student, Dept. of Politics – NYU
  • Mike Dewar, Post-doc, Applied Mathematics – Columbia University
  • John Myles White, PhD student, Dept. of Psychology – Princeton University 
  • Jacob Harris, senior software architect, The New York Times

Registration for the event is $10, payable in advance. If the cost of registration is beyond your budget, email nyc [at] hackshackers [dot] com to volunteer.

Doors open at 6:30 p.m.
Presentations begin at 7 p.m.

MIT project looking for WordPress users to beta test data visualization tools

An MIT research project is looking for beta testers for its Knight News Challenge proposal for a WordPress data visualization plugin. Sign up on their blog.

As Professor David Karger writes, his team has created a WordPress plugin called Datapress that lets you author interactive visualizations of any data, WYSIWYG-style, without any programming. Using the tool, you can drop maps, timelines, tables, charts, lists, thumbnail grids, and graphs into your article the same way you drop in an image. You can include widgets that let your readers sort and filter the data by the criteria you specify. The data you’re presenting can be in a file uploaded to your blog or can live in a Google spreadsheet or a wiki, where it can be maintained over time; your article will automatically incorporate your changes. All these pieces are incorporated in the standard WordPress blog-post editor.

Datapress uses the Exhibit framework, which has been used to create several hundred interesting data visualizations on the web, including some by the San Francisco Chronicle, the Star Tribune, and the St. Petersburg Times. But Datapress is intended to make it even easier to author these views and incorporate them in blogs. A couple of brave bloggers at Factory Portland and Quantnet have already used it successfully for music and finance.

You can see examples on the demo site, watch a tutorial on the Datapress blog, or just download the plugin from the WordPress plugins site. And sign up.

NYC Data Visualization Extravaganza

The New York City chapter of Hacks / Hackers met on Nov. 9 for a jam-packed session on information and data visualization. The four presenters ran the gamut of information visualization, from online data-viz products, to just-launched prototypes, to critical analyses of how graphics are being used in the media.

The four speakers for the evening included Marc Rueter from Tableau Software, Matthew Ericson from the New York Times, Alex Lundry from TargetPoint Consulting, and Santiago Ortiz from Bestiario.

Marc kicked off the presentations by talking about Tableau’s mission to democratize data publishing and visualization online using Tableau Public, their online tool. He demoed the tool showing how easily data could be imported and dragged and dropped to create different views (he chose a dataset on bird strikes to airplanes in honor of his trip to New York). A lot of journalists are turning to Tableau Public to support their data-based reporting and it’s easy to see why: the tool is slick, it’s point and click, and the results are embeddable and look great online. Here are some examples using it.

Matt Ericson, the Deputy Graphics Director at the New York Times, then took the stage to talk about how the NYT designed its election graphics for the just-completed election season. Real-time, data-driven graphics for election night pose many challenges, among them designing visuals for dynamic results and providing insight into undecided races as new information is released. Among the visuals that Matt talked about was a new tool developed expressly to help people compare results from past elections, which really gives a sense of how voting patterns have changed.

All of the NYT election graphics are still online here. The platform is implemented with data from the AP, an in-house Flash library for the mapping, and Ruby on Rails to bake pages to .html (for faster loading), and everything is hosted on EC2 for mass scalability.

To contrast with the objective and news-y graphics of the New York Times, Alex Lundry then took the floor to present his ideas on the inherent subjectivity and manipulability of information graphics. He showed some fantastic examples of data visualization being used as a tool of political persuasion (and yes, people with information-graphic-laden picket signs). Below is the masterpiece which sparked a partisan visualization volley Alex calls Chart Wars. Alex gave a powerful and dynamic talk, waking the crowd up to how shapes, color, and iconography can be used to bias graphics, and how point-of-view journalism is invading the ostensibly objective realm of data visualization.

The presentations were closed out by Santiago Ortiz, who demoed a practically brand-spanking-new visualization environment called Impure. In contrast to the non-programmer approach to visualization taken by Tableau, Impure is oriented more toward the gear-heads. For anyone who’s done music programming with Max/MSP, it’s that, but for data visualization. It has a powerful data flow model (which might be recognizable to someone who’s used Yahoo Pipes) which lets programmers connect data sources to filters, tables, and interactive visualizations.
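For readers unfamiliar with the data flow idea, here is a minimal sketch of the general pattern in plain Python: a source feeds a filter, which feeds a table. It is not Impure's interface or API, and the function names and sample records are invented for illustration.

```python
def source(records):
    """A data source node: emits records one at a time."""
    yield from records


def keep(predicate, stream):
    """A filter node: passes along only the records that satisfy predicate."""
    return (record for record in stream if predicate(record))


def to_table(stream, columns):
    """A table node: collects the chosen columns of each record into rows."""
    return [tuple(record[column] for column in columns) for record in stream]


# Made-up sample records.
records = [
    {"name": "a", "value": 3},
    {"name": "b", "value": 12},
    {"name": "c", "value": 7},
]

# Wire the nodes together: source -> filter -> table.
table = to_table(keep(lambda r: r["value"] > 5, source(records)), ["name", "value"])
print(table)  # [('b', 12), ('c', 7)]
```

Each node only knows about the stream it receives, so nodes can be rearranged or swapped without touching the others, which is the appeal of this style of tool.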

Sign up for the beta release at impure.com.

The program for the night was co-organized by Hacks / Hackers’ @hoenikker and the NYViz Groups’ @jpmarcum and was sponsored by Tableau Software and Dogpatch Labs. For those who missed it the video of the event can still be found online via Livestream.

Introducing Hacks/Hackers Los Angeles

We’re proud to announce the addition of a Los Angeles branch to the ever-growing Hacks/Hackers nationwide network.

Thursday, July 8, more than a dozen members of the Los Angeles journogeek scene joined up for the first introductory Hacks/Hackers LA meetup at Redwood Bar and Grill, conveniently located across the street from the LA Times.

Our first meetup was a mingling event for our community members to get to know each other and for us, the organizers, to get a feel for the kinds of ideas and interests floating around the Los Angeles community.

What we talked about

Most of the conversations throughout the night were in small, circulating circles. A few recurring topics:

Data visualization: What works, what’s possible, what has fallen flat in the past.

Multimedia strategies: One-man-band strategy vs. the in-house multimedia inspection team.

Journalism education: Yes, it’s a topic that has been discussed over and over again in journogeek circles, but for a good reason: we all agree that it still needs work. A few memorable notes from discussions about journalism education:

  • Many students still care too much about the “grade” rather than the value of the experience they gain. This isn’t the fault of the students, but of the general mindset and educational structure upheld by traditional institutions
  • How valuable are entrepreneurial journalism classes/programs? Can you truly teach entrepreneurialism in a classroom setting?

And, just for fun: Hidden food gems in the Southland: Korean clambakes and SF/SD-worthy burritos


Now that we’ve all had the chance to meet and talk, future meetups will be more structured and thematic with speakers, panels, presentations — you name it. We’re open to ideas from everyone.

Keep an eye on this blog and our freshly created Facebook page for updates on the place and theme of the next meetup.

Michelle Minkoff, Eric Zassenhaus and Lauren Rabaino are co-organizers of Hacks/Hackers LA.

The art of data visualization: Stamen Design event wrapup

Eric Rodenbeck, Stamen Design

The art of making sense of data — and it is truly an art — is a key element in building the future of journalism. Interactive presentations created from data can be personalized by the reader, giving a more engaging news experience. Data-based applications can also lead to new business models, through paid or subscription-based applications that give extra value to readers by providing a new dimension on news coverage.

One of the leaders in data visualization is Stamen Design, which has worked with news organizations and museums alike to help make sense of the world through its unique views of data.

Speaking at a Hacks/Hackers event this week at the Gray Area Foundation For The Arts, Stamen founder Eric Rodenbeck discussed some of his firm’s work and philosophy.

Some highlights from the presentation:

Create maps as tools for exploration

Stamen doesn’t have any preconceptions about what they want their visualizations to show. They aim to create interfaces that allow users to come to their own conclusions about what they see. Part of this is ensuring that the data they use is as complete and accurate as possible. They also don’t try to clean up outliers in their data that might appear to be unexpected noise cluttering up a visualization.

Issues of data access

When Stamen was about to release their visualization of crime data in Oakland, the city shut off access to the data pipe. Access to open data is obviously essential for these applications. This is one area where journalists and developers can work together. With their experience finding sources and doing Freedom of Information Act (FOIA) requests, reporters can help obtain the right information in the right format so that designers and developers can build a complete visualization. That takes an understanding on both sides of what’s available and what format is required.

The best data come from human actions

Data that come from actual human activity are the best starting point for creating visualizations. This means information based on how people behave in the real world, rather than on something like filling out surveys.


Current tools are complicated and expensive

There are some easy-to-use tools for doing this work, such as Google Maps. But when you want to go beyond just sticking red pins on a map, it can get complicated very quickly. Stamen’s projects require complex and expensive tools that aren’t easily usable by non-techies. Perhaps this will change and there are some people working in this space, such as IBM’s Many Eyes project.

Print-on-demand as a way to bridge the digital divide

Print media through small on-demand runs could be a way to bridge the digital divide and bring some of the information gleaned from data analysis to a wider audience. There have recently been some efforts to use downtime on big presses for short runs.

Want to hear more? The video of the event is embedded below, and here are some photos.