9 things I had to think about on my latest data visualisation project

Usually people are excited about the possibility of working on a project with millions of rows of data. "It's a challenge!" they say. However, some "normal data" projects can still allow you to showcase the best of your skills as a designer and visualisation engineer.

This was the case on the latest project I took part in with the awesome guys at Red Badger (who happen to be looking for talented devs to join their London office).

We were faced with the interesting prospect of updating the product website of an insurance company which produces risk ratings for all the countries in the world, across several different areas (e.g. risk of a civil war starting, risk of strikes and political unrest). Our task was to produce a mixture of a CMS (to update news, articles, etc.) and dashboards that would display data at different geographical levels.

This post collects some ideas from the development work, and might prove useful to you when working on your own project.

That map was going to be a dynamically generated SVG with pins flying around and news popping up from each country. We decided to tone it down to just a nice optimised PNG for several resolutions.

1. Static sites are fast - generators, not so much.

From the outset we decided to make the site static. This made sense to us because the site wasn't changing often enough to warrant serving our users a freshly generated copy of the site every time. How about generating the site when it actually changes, and then just serving speedy HTML files that can be cached by your CDN?

We had a look at several platforms, like Jekyll, but in the end we decided to go with Docpad for consistency with the rest of our platform, which was going to be built in Node.js. Furthermore, Red Badger's website also runs on Docpad, so people around us had experience with it and we could ask them whenever we were in doubt.

Docpad worked brilliantly for the first few weeks of the project, when we had a small number of documents to generate each time. Docpad has very useful features for development, such as watching files and reloading automatically, which I tend to lean on a lot when working: Make a little change, watch the results, change it a bit, see how it looks, etc. Docpad was doing great, rebuilding the whole site in a few seconds.

However, once we added all the countries and news content to the site, the build time began to skyrocket. At one point generating the entire site on my laptop was taking around 100 seconds, which was definitely a problem. Working became pretty difficult, and it was clear we needed to find an alternative.

2. Livescript to the rescue.

The very talented Stuart and Viktor had stumbled upon Livescript and, enticed by its elegant functional programming style, with clever list comprehensions and compact syntax, claimed they would be able to entirely replace Docpad's generation step with our own version, and that it would only take them a day.

It only took them a day. And the build was back down to a few seconds. We teamed up our new generator (which we have dubbed generator) with grunt to run tasks and handle file changes, live reload, etc. Working became a breeze again.

Red Badger are thinking about open-sourcing this generator. Its aim is to be highly focused, providing just enough features for compiling your templates into HTML. It provides helper functions that run at generation time to help generate the content for the templates, and little else.
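To give a feel for how small such a generator can be, here is a rough sketch in plain JavaScript. It is not the Livescript generator described above: the `{{key}}` template syntax, the `renderTemplate` and `generateSite` functions, the `upper` helper and the sample page are all made up for illustration.

```javascript
// Hypothetical sketch: none of these names come from the real generator.
// Replaces {{key}} with page data and {{helper key}} with a helper's result.
function renderTemplate(template, page, helpers) {
  return template.replace(/\{\{\s*(\w+)(?:\s+(\w+))?\s*\}\}/g, function (match, first, second) {
    if (second !== undefined) {
      var helper = helpers[first];
      // Leave unknown helpers untouched so mistakes are visible in the output.
      return typeof helper === 'function' ? String(helper(page[second], page)) : match;
    }
    return first in page ? String(page[first]) : match;
  });
}

// Compile every page once, at generation time, into a map of path -> HTML.
function generateSite(template, pages, helpers) {
  var output = {};
  pages.forEach(function (page) {
    output[page.path] = renderTemplate(template, page, helpers);
  });
  return output;
}

var helpers = {
  // Runs at generation time, not in the browser.
  upper: function (value) { return String(value).toUpperCase(); }
};

var site = generateSite(
  '<h1>{{upper name}}</h1><p>Risk: {{rating}}</p>',
  [{ path: 'countries/chile.html', name: 'Chile', rating: 2 }],
  helpers
);
```

Writing each entry of `site` to disk is all that is left, and a file watcher only needs to call `generateSite` again when a template or a page changes.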

3. Componentify everything.

One of the best choices we made for the project was working with component, a client-side package management system that helps you modularise your code. You load very small, very focused components that get compiled into a single file, in the same vein as require.js and other AMD modules, but with the advantage of using the nicer CommonJS paradigm.

For every small section of our site, for every visualisation, we built a component. Inside of it, we put all the templates, javascript files and structural styling we needed for that part of the site. Tests, specs and other content can all be part of the component.

You can go as far as to have a repo for each component, and then simply list them on your component.json to have the newest version freshly picked for you.

In this spirit, I've been looking at web components, which you should read more about here. They are a set of standards being developed by the W3C which allow you to create custom HTML elements, bundled with your own styles and markup. They're currently best supported by Chrome Canary, which has been bringing some impressive features (device emulation being one of our favourites).

The markup and styling you write inside your components become part of the shadow DOM (wicked name) and the browser ensures they are always rendered properly, regardless of what other code and markup might be around them.

4. D3 is still great but tough for UX

This was the first time I had ever worked with a proper UX team behind me, and having such great people to work with was a great experience. Working in d3 towards already designed visualisations means you have a focused goal in mind, and makes the entire process much more enjoyable.

Achieving the interaction patterns they had designed unearthed some of the hardest challenges of this project, and probably some of the hardest in my experience with d3 so far: things like zooming, panning and handling external events from other parts of the DOM.

With d3 it is possible to achieve successful user interaction, although with a high degree of difficulty. Take a look at this map:

The global heat map exposed many areas of interaction with users: A timeline slider, a zoom slider, zoom buttons and a mini toolbar to download the data and copy the visualisation to a custom report.

There were several things the users needed to be able to do with this map:

  • Zoom and pan with usual gestures (mouse, touch, pinch).
  • Zoom and pan with the slider and buttons.
  • Change the colors of the choropleth map with the slider, updating the title.
  • Change the category of the data displayed.
  • Expose its current data view so that it can be downloaded as a CSV.
  • Expose its current state, including current level of zoom and pan, selected series, time etc., so that it can be loaded onto a custom report.
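The last two requirements are easier to reason about with a concrete shape in mind. The sketch below is only an illustration: the `mapState` fields, the sample rows and the `toCsv` and `snapshot` functions are assumptions of mine, not the code we shipped.

```javascript
// Hypothetical state shape; the real map's fields are not shown in this post.
var mapState = {
  zoom: 2.5,
  pan: [120, -40],
  series: 'political-unrest',
  year: 2013,
  rows: [
    { country: 'Chile', rating: 2 },
    { country: 'Korea, Republic of', rating: 3 }
  ]
};

// Quote a CSV field only when it contains a comma, quote or line break.
function csvField(value) {
  var text = String(value);
  return /[",\n\r]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Turn the rows currently in view into CSV text with a header line.
function toCsv(rows, columns) {
  var lines = [columns.map(csvField).join(',')];
  rows.forEach(function (row) {
    lines.push(columns.map(function (column) { return csvField(row[column]); }).join(','));
  });
  return lines.join('\n');
}

// Everything except the data rows, as JSON a report could store and replay.
function snapshot(state) {
  return JSON.stringify({ zoom: state.zoom, pan: state.pan, series: state.series, year: state.year });
}

var csv = toCsv(mapState.rows, ['country', 'rating']);
var saved = snapshot(mapState);
```

Keeping the view state in one plain object like this means the download button and the custom report can both read it without knowing anything about the map's internals.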

D3 offers custom behaviours, such as zoom and drag. These create listeners for handling gestures on an element, but usually with the intent of modifying the element itself, for example by applying a transformation to it.

However, D3 is less sophisticated for creating custom UI elements, such as sliders or buttons, except for the brush (not so wicked name). The brush allows you to create and modify a range selection with gesture input from the mouse or touch. It has an extent, and can take x and y scales to translate those inputs into altering the extent of your brush.

5. D3 Brushes

Brushes are very powerful, but a bit hard to understand. A brush can let you select a time period on a time axis, or act as a slider. As the user interacts with the element the brush is attached to, you get brushstart, brush and brushend events, and the brush's extent reflects the change. You can use these to update the position of the UI elements and the ranges of the scales of your visualisation, and then redraw with the new values.

With that, building a slider is simple: make a long SVG rectangle and attach a brush to it. This will act as the slider itself. On top of it, you can put a small element, such as a circle, to act as the slider handle. When the user drags over the slider rectangle, you change the position of the slider handle, by taking the minimum value of the extent. Have a look at this example for a better idea.
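The arithmetic behind such a slider is just a clamped linear mapping between pixels and values. Here is a minimal sketch in plain JavaScript, without d3 so that it runs on its own. The `track` geometry, the `range` of years and the function names are all hypothetical, and in a real d3 project a clamped linear scale would most likely do this job.

```javascript
// Hypothetical slider geometry and range; adjust to your own layout.
var track = { x: 20, width: 200 };      // left edge and length of the rectangle, in pixels
var range = { min: 2000, max: 2013 };   // values the slider can select

// Map a pointer position on the track to a value, clamped to the ends.
function positionToValue(px) {
  var t = (px - track.x) / track.width;
  t = Math.max(0, Math.min(1, t));
  return range.min + t * (range.max - range.min);
}

// Inverse mapping: where the handle's centre should sit for a given value.
function valueToPosition(value) {
  var t = (value - range.min) / (range.max - range.min);
  return track.x + t * track.width;
}

// Called with the brush extent [low, high]; the handle follows the minimum.
function handlePosition(extent) {
  return valueToPosition(Math.min(extent[0], extent[1]));
}
```

In the brush handler you would read the current extent, pass it to something like `handlePosition`, and set the circle's `cx` to the result.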

This means that the users aren't interacting directly with the handle, but with the slider's extent. There is a feeling of lack of immediacy that you get from this that I never quite managed to solve. I think part of this comes from how I wired up the events: d3's event dispatch keeps only one listener per event type, unless each listener is registered under its own namespace.

With a single listener per event, everything that needs to happen on brush has to happen inside one event handler. It makes more sense to attach several listeners to one event, so that the handle and the visualisation update in separate functions. At the moment, my listener functions look pretty messy, with many calls to update all of the things that need updating.
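For what it's worth, d3's documentation describes namespaced event types, such as `brush.handle` and `brush.chart`, as the way to register more than one listener for the same event. The sketch below is a stand-in dispatcher written for this post, not d3's own code, and `createDispatcher`, `fire` and the two listeners are made-up names. It only shows how namespacing lets the handle and the chart update in separate functions.

```javascript
// Stand-in dispatcher, NOT d3's implementation: one listener per "type.namespace" key.
function createDispatcher() {
  var listeners = {}; // e.g. { brush: { handle: fn, chart: fn } }
  return {
    on: function (typeName, listener) {
      var parts = typeName.split('.');
      var type = parts[0];
      var namespace = parts[1] || '';
      listeners[type] = listeners[type] || {};
      // Registering again under the same namespace replaces the old listener.
      listeners[type][namespace] = listener;
      return this;
    },
    fire: function (type, payload) {
      var group = listeners[type] || {};
      Object.keys(group).forEach(function (namespace) {
        group[namespace](payload);
      });
    }
  };
}

var calls = [];
var events = createDispatcher();
events
  .on('brush.handle', function (extent) { calls.push('move handle to ' + extent[0]); })
  .on('brush.chart', function (extent) { calls.push('redraw chart for ' + extent[0]); });
events.fire('brush', [2006, 2010]);
```

With a real brush, the same idea would presumably read `brush.on('brush.handle', moveHandle).on('brush.chart', redraw)`, where `moveHandle` and `redraw` are your own functions.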

6. Cross-Browser means IE, use aight

Lately, Cross Browser usually means 'also works in IE'. For this, I've used aight. It helps bring IE8 and IE9 up to a bare minimum of d3 support. It also provides reliable IE detection, so you can hide some parts away from IE8's crashy engine. Don't try to use it for detecting other browsers, as their user agent strings mislead it.

You still don't get SVG in IE8, but you can still use divs to build a decent treemap:

A treemap visualisation of trend data results from elasticsearch tag search counts

All you need is absolute/fixed positioned divs, and a way to calculate their sizes. You can do that with d3.layout.treemap. 
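To show the div side of this without pulling in d3, here is a deliberately simplified sketch. `sliceLayout` is a hypothetical stand-in for d3.layout.treemap that only cuts the area into vertical strips, and the tag counts are invented. The part that carries over is `divStyle`, which turns a computed rectangle into the inline style of an absolutely positioned div.

```javascript
// Simplified slice layout, a stand-in for d3.layout.treemap: each item gets a
// vertical strip whose width is proportional to its value.
function sliceLayout(items, width, height) {
  var total = items.reduce(function (sum, item) { return sum + item.value; }, 0);
  var x = 0;
  return items.map(function (item) {
    var w = total > 0 ? (item.value * width) / total : 0;
    var rect = { name: item.name, x: x, y: 0, width: w, height: height };
    x += w;
    return rect;
  });
}

// Inline style for an absolutely positioned div, usable where SVG is missing.
function divStyle(rect) {
  return 'position:absolute;' +
    'left:' + Math.round(rect.x) + 'px;' +
    'top:' + Math.round(rect.y) + 'px;' +
    'width:' + Math.round(rect.width) + 'px;' +
    'height:' + Math.round(rect.height) + 'px;';
}

// Hypothetical tag counts.
var tags = [
  { name: 'strikes', value: 50 },
  { name: 'elections', value: 30 },
  { name: 'protests', value: 20 }
];
var styles = sliceLayout(tags, 400, 200).map(divStyle);
```

The container div needs `position: relative` so that the children are placed relative to it rather than to the page.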

7. Give them tables.

One important thing to take into account when trying to do data visualisation for many different browsers is that sometimes a map or bar chart will not work properly in an old browser. Also, some people get a fuzzy feeling when seeing tabulated numbers. Perhaps it reminds them of their beloved Excel.

A table can be easily enhanced to display differences or change very visually by mapping a value to a color scale.

Tables can be visually enhanced to show differences and change, and colored to become heat-maps. These will help readers find patterns in your data and also draw their attention to important aspects of your data. You can use d3 to build these, or just plain old JavaScript. It's fun, you should try making a nice, responsive table.
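As a taste of the plain JavaScript route, here is a small sketch of a color scale for table cells. The 1 to 5 rating range, the white-to-red ramp and the names `colorScale`, `riskColor` and `cell` are assumptions for the example, and a d3 linear scale with a color range would be the likely equivalent.

```javascript
// Linear interpolation between two RGB colours; t is clamped to [0, 1].
function interpolateColor(from, to, t) {
  var clamped = Math.max(0, Math.min(1, t));
  var channels = from.map(function (start, i) {
    return Math.round(start + (to[i] - start) * clamped);
  });
  return 'rgb(' + channels.join(',') + ')';
}

// Build a scale that maps [min, max] onto the colour ramp.
function colorScale(min, max, from, to) {
  return function (value) {
    var t = max === min ? 0 : (value - min) / (max - min);
    return interpolateColor(from, to, t);
  };
}

// Hypothetical ratings from 1 (low risk, white) to 5 (high risk, red).
var riskColor = colorScale(1, 5, [255, 255, 255], [255, 0, 0]);

// Render one table cell as an HTML string with a background colour.
function cell(value) {
  return '<td style="background:' + riskColor(value) + '">' + value + '</td>';
}
```

Because `cell` returns a string, the same function works at generation time on the server and in the browser.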

8. Use gifs to communicate UI changes to your team

When you add a pull request with a bunch of UI changes, the best way to communicate it is by capturing a little GIF of the new behaviour, so testers and others know what to expect and what you achieved:

A gif is worth a thousand github issues.

Capture your videos with QuickTime (press ctrl + cmd + n to make a new screen recording), then use gifrocket to convert them to a GIF. You can then attach them to your GitHub issues after uploading them somewhere public like your Dropbox:

![image](url-to-image.gif)

9. Do it all again

Every time you take on a project, it is a learning experience. I've learnt a lot this time around, especially around d3's event handling. Next time around I know I'll be far better prepared to learn whatever needs to be done to improve my work a little bit more. I think web components will finally help make data visualisation easier in the browser: line chart elements and bar chart elements that can be reused in any of your apps, just by importing one component. I can't wait!