We filtered articles from the GDELT dataset to keep only those that reference cyber crime. The data science team then clustered them to distill topical themes and identify individual stories, each made up of several related articles.
For every day, the team computed positive and negative sentiment averages, and I was tasked with plotting them. I drew a lot of inspiration from Moritz Stefaner's Emoto project, which analysed the overall sentiment of the internet during the 2012 London Olympics.
The piece uses a Poisson-disc distribution to progressively fill each sentiment arc with evenly distributed points, and then applies a Delaunay triangulation to obtain an organic-looking tessellation of the space, highlighting the idea that sentiments are organic and changing.
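To give a feel for the first step, here is a minimal sketch of Poisson-disc sampling using Bridson's grid-based approach. It is not the code from the piece: it fills a plain rectangle rather than a sentiment arc, and the names `poissonDisc` and `seededRandom` as well as the parameters are made up for illustration.

```javascript
// Small seeded random generator so runs are repeatable (illustrative helper;
// Math.random works just as well).
function seededRandom(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Returns [x, y] points inside width x height, no two closer than `radius`.
function poissonDisc(width, height, radius, random = Math.random, maxTries = 30) {
  const cell = radius / Math.SQRT2; // cell diagonal equals radius: one point per cell
  const cols = Math.ceil(width / cell);
  const rows = Math.ceil(height / cell);
  const grid = new Array(cols * rows).fill(null);
  const points = [];
  const active = []; // points that may still accept neighbours

  function farEnough(x, y) {
    const cx = Math.floor(x / cell);
    const cy = Math.floor(y / cell);
    // Only cells within two steps can hold a point closer than `radius`.
    for (let j = Math.max(cy - 2, 0); j <= Math.min(cy + 2, rows - 1); j++) {
      for (let i = Math.max(cx - 2, 0); i <= Math.min(cx + 2, cols - 1); i++) {
        const p = grid[j * cols + i];
        if (p && Math.hypot(p[0] - x, p[1] - y) < radius) return false;
      }
    }
    return true;
  }

  function add(x, y) {
    const p = [x, y];
    points.push(p);
    active.push(p);
    grid[Math.floor(y / cell) * cols + Math.floor(x / cell)] = p;
  }

  add(random() * width, random() * height);
  while (active.length > 0) {
    const idx = Math.floor(random() * active.length);
    const [px, py] = active[idx];
    let placed = false;
    for (let k = 0; k < maxTries; k++) {
      const angle = 2 * Math.PI * random();
      const dist = radius * (1 + random()); // candidate lies between r and 2r away
      const x = px + dist * Math.cos(angle);
      const y = py + dist * Math.sin(angle);
      if (x >= 0 && x < width && y >= 0 && y < height && farEnough(x, y)) {
        add(x, y);
        placed = true;
        break;
      }
    }
    if (!placed) active.splice(idx, 1); // no room left around this point
  }
  return points;
}

const samples = poissonDisc(200, 100, 10, seededRandom(42));
console.log(samples.length + " points");
```

The resulting points could then be handed to whatever Delaunay triangulation routine is available to produce the tessellation. Restricting the samples to an arc would, under this sketch, only mean adding one more condition next to the bounds check.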
The visualisation is the main focus of the piece, which consists of a dashboard that lets users choose dates and shows two information areas: the visualisation above and a list of stories, akin to tweets.
The stories are presented in a continuous scrollable list on the right-hand side of the dashboard, each with a title chosen from the titles of the articles that belong to that story cluster.
Each story invites the user to explore the topic further, and includes a list of the people, themes and organisations involved, drawn from the articles that the clustering algorithm grouped together. The user can click a story to go to a story focus page, where she can study it in detail:
The story page displays the people mentioned in the articles, a tree-map of the themes from the stories, a map with the locations where the stories have taken place, and a list of articles for the user to go off and read.
GDELT does not provide images for the people mentioned in the themes, so I wrote a Node.js Express app whose routes can be used directly as the src of an <img> tag and point to a Bing search. The images are cached (obviously), but when a person isn't in the cache yet, an API query is sent to Bing Images to fetch a new one. The search is tailored so that portrait images are always returned. The results are good about 90% of the time.
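The caching part can be sketched roughly as follows. This is not the app's code: `createPortraitResolver` is a made-up name, and `searchPortrait` is a hypothetical function standing in for the Bing Images query, assumed to return a Promise that resolves to an image URL.

```javascript
// Wraps a (hypothetical) image search so each person is only looked up once.
function createPortraitResolver(searchPortrait) {
  const cache = new Map(); // normalised name -> Promise of an image URL

  return function resolve(name) {
    const key = name.trim().toLowerCase();
    if (!cache.has(key)) {
      // Cache the promise itself, so simultaneous requests share one lookup.
      const pending = searchPortrait(key).catch((err) => {
        cache.delete(key); // don't keep failures around, allow a retry later
        throw err;
      });
      cache.set(key, pending);
    }
    return cache.get(key);
  };
}

// Usage with a stub in place of the real search:
const resolvePortrait = createPortraitResolver((name) =>
  Promise.resolve("https://example.invalid/" + encodeURIComponent(name) + ".jpg")
);
resolvePortrait("Ada Lovelace").then((url) => console.log(url));
```

In an Express app, a route handler could call a resolver like this and redirect the browser to the resulting URL, which is what lets an <img> tag use the route as its src.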
The picture collage immediately invites the user to connect the people in the pictures to a theme, and users relate to this straight away. The only downside is that the GDELT dataset sometimes identifies places as people, which then get translated into strange portraits, so a certain degree of editorial eye was required.
One of the requirements of the dashboard was to display a map of the locations where the articles had taken place. Instead of displaying a flat square projection, I decided to use an orthographic projection to provide a 3D effect with an orbiting globe. The result is surprisingly performant on desktops, although it would need an alternative on phones (e.g. rendering it on canvas), as the orbiting animation runs very slowly on my Nexus 5.
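For the curious, the maths behind an orthographic projection is short enough to sketch by hand. The function below is an illustration only (D3 ships its own orthographic projection, which is the sensible thing to use in practice); the name `orthographic` and its parameters are made up here.

```javascript
const RAD = Math.PI / 180;

// Projects a longitude/latitude (in degrees) onto a globe of the given radius,
// as seen from directly above (centerLon, centerLat). Returns screen-style
// coordinates relative to the globe's centre, with y growing downwards.
function orthographic(lon, lat, centerLon, centerLat, radius) {
  const lambda = (lon - centerLon) * RAD;
  const phi = lat * RAD;
  const phi0 = centerLat * RAD;

  // Cosine of the angular distance from the centre of the view;
  // negative values are on the far side of the globe.
  const cosC =
    Math.sin(phi0) * Math.sin(phi) +
    Math.cos(phi0) * Math.cos(phi) * Math.cos(lambda);

  return {
    x: radius * Math.cos(phi) * Math.sin(lambda),
    y: -radius *
      (Math.cos(phi0) * Math.sin(phi) -
        Math.sin(phi0) * Math.cos(phi) * Math.cos(lambda)),
    visible: cosC >= 0,
  };
}

// "Orbiting" amounts to moving the centre longitude a little on every frame.
for (let centerLon = 0; centerLon <= 180; centerLon += 60) {
  console.log(centerLon, orthographic(-0.13, 51.5, centerLon, 0, 100));
}
```

Points whose `visible` flag is false sit behind the globe and should be skipped when drawing, which is what produces the 3D effect.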
Tools used (in no particular order):
- Gulp for tasks (first time using it over Grunt, I like it, but the watch task needs to stop dying every time I forget to add a comma to my JS!)
- Browserify for dependencies in the browser. npm >> Bower for sure, although I had a few headaches getting some things to work (nudge nudge Masonry)
- React - it's just so quick to sketch things, evolve designs, etc.
- D3 - well yeah.
Thanks for reading, you can have a look at it here