I wrote a tutorial for d3-annotations. Hope it’s helpful.
Colour is a big part of what makes up graphics, and making colours accessible is important.
We’re used to thinking of colours in terms of red, green and blue. Add these colours together and you get white. We can describe colours using RGB, breaking up each colour channel into 256 levels (0 to 255). Increase the red channel and you get red. Add the blue channel and you get purple. Add the green channel and you’ll get white.
This works quite well for computers: just combine three numbers to make a colour. The problem comes when we apply human physiology to the issue. Inside the eye, there are two types of cells that sense light: rods and cones. Rods mostly deal with low light, and cones are used to tell colour in well-lit conditions.
The three types of cone are sensitive to different wavelengths of light, peaking roughly at blue, green and red. But these don’t respond linearly (i.e. a light twice as bright doesn’t send a signal twice as strong). The graph of these responses has been normalised so every peak is at 1. It doesn’t show that our eyes are more sensitive to green light than red, and more sensitive to red than blue. What this means is that we should consider what colours our eyes are drawn to, what colours make up the data vis and what parts of the visual you’d like to draw attention to. Perhaps you need to use a colour to highlight a specific aspect.
An alternative way of thinking about colour, instead of RGB, is Hue, Chroma and Lightness (HCL). The advantage of using HCL is that it takes into account the way the human eye perceives colour.
Hue is the shade (red, green, blue). Chroma is the richness of the colour (it’s a bit like saturation, but judged relative to a similarly lit white). Lightness is the perceived brightness of that colour.
RGB can be imagined as a cubic colour space with each dimension going from 0 to 255, whereas HCL works in a cylindrical colour space. Hue ranges from 0 to 360°. Chroma starts at 0, but its maximum varies with hue and lightness. Lightness runs from 0 to 100, and the range you can reach also depends on hue and chroma.
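To make the cube-to-cylinder idea concrete, here is a minimal sketch that converts an 8-bit RGB colour into hue, chroma and lightness. I'm assuming HCL means the polar form of CIE Lab with a D65 white point, which is the common choice; the function name `srgb_to_hcl` is my own and not from any library.

```python
import math

def srgb_to_hcl(r, g, b):
    """Convert 8-bit sRGB to (hue in degrees, chroma, lightness) via CIE Lab."""
    def linear(c):
        # Undo the sRGB gamma curve to get linear light.
        c /= 255
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linear(r), linear(g), linear(b)

    # Linear sRGB -> CIE XYZ (D65 white point).
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl

    def f(t):
        # The non-linear step that makes Lab roughly perceptually uniform.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    xn, yn, zn = 0.9505, 1.0, 1.0890  # XYZ of the reference white
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)

    lightness = 116 * fy - 16
    a_star = 500 * (fx - fy)
    b_star = 200 * (fy - fz)

    # Cartesian (a*, b*) -> polar (chroma, hue): this is the cylinder.
    chroma = math.hypot(a_star, b_star)
    hue = math.degrees(math.atan2(b_star, a_star)) % 360
    return hue, chroma, lightness

hue, chroma, lightness = srgb_to_hcl(255, 0, 0)
print(f"red: hue={hue:.0f}, chroma={chroma:.0f}, lightness={lightness:.0f}")
```

Pure red and pure blue both sit at the corner of the RGB cube, but they come out with quite different lightness values here, which is the point of switching colour space.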
The most relevant part of the Web Content Accessibility Guidelines (WCAG) regarding colour relates to text. For WCAG AA compliance, text and images of text should have a minimum contrast ratio of 4.5:1 for normal text and 3:1 for large text. The contrast ratio is calculated from the relative luminance of the two colours: (L1 + 0.05) / (L2 + 0.05), where L1 is the relative luminance of the lighter colour and L2 that of the darker colour. What’s important to note is that it doesn’t depend on hue, because people’s colour vision differs; what matters is the contrast in lightness.
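As a rough sketch of that calculation, the snippet below works out the contrast ratio for two hex colours using the WCAG 2.x definition of relative luminance. The function names are my own, and the `passes_aa` helper only covers the two text thresholds mentioned above.

```python
def relative_luminance(hex_colour):
    """WCAG 2.x relative luminance of an sRGB colour such as '#1f77b4'."""
    hex_colour = hex_colour.lstrip("#")
    channels = [int(hex_colour[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Linearise each channel (threshold as written in WCAG 2.0/2.1).
    r, g, b = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    # Green contributes most and blue least, matching the eye's sensitivity.
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(colour_a, colour_b):
    """(lighter + 0.05) / (darker + 0.05); ranges from 1 to 21."""
    la, lb = relative_luminance(colour_a), relative_luminance(colour_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)

def passes_aa(foreground, background, large_text=False):
    """True if the pair meets WCAG AA for text (4.5:1, or 3:1 for large text)."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold

print(round(contrast_ratio("#000000", "#ffffff"), 2))
print(passes_aa("#767676", "#ffffff"), passes_aa("#777777", "#ffffff"))
```

The order of the two arguments doesn’t matter, because the function sorts out which colour is lighter.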
The first thing to note is that this relates to text and having enough colour difference to make out letterforms. Charts and interactives can contain many things other than text such as bars, lines, squares, circles and other shapes. All of these shapes can be big or small or a mix. Smaller objects would need a higher contrast ratio whereas a high contrast colour for large blocks would be too strong.
Also, with most interactives and especially maps, you have colours next to each other rather than on a background. So you need to consider the difference between neighbouring colours and make sure it is large enough that they are distinguishable.
Be mindful that when you use colour to represent your data, it should show the relationships in your data. For example, if your data is in different categories, your colours should be as distinct as possible. If your data is sequential or represents a range, colour should change in a sensible way.
Colour also has semantic meaning. We’re tired of seeing blue for males and pink for females for any dataset, but it’s hard to break away from the associated meaning of those colours. Datawrapper did a recent review of what colours people are using to represent gender.
Be careful to check what those colours could mean for people. Meanings also vary culturally and with language (e.g. see this Wikipedia article on blue-green), so colours may mean different things outside what you’re used to.
I am colourblind myself (slightly red/green) which is useful when it comes to calling out bad colours on charts. Approximately 8% of men are colourblind and 0.5% of women. There are two main types, difficulty seeing red/green and difficulty seeing blue/yellow.
The best write-up I’ve seen about testing colours in charts for colourblindness comes from Gregor Aisch of Datawrapper. He applies simulated colourblind vision to a set of colours and then looks at the differences between them. Where the differences are not great enough, a warning is given.
You can check your colour palette in Datawrapper and see whether it gives any warnings.
So now we’ve learnt about colour and we’re aware of all the considerations we have to take into account, we can start choosing colours.
Let’s start with the easy one first.
To make a good sequential colour scale, you need to vary chroma and lightness of the colours through the scale.
Let’s start with a colour that’s low in chroma and high in lightness. This is going to be a pale blue.
We want another colour that’s the opposite, so high in chroma and low in lightness.
And let’s make a scale that adds three steps in between.
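Here is a minimal sketch of that step: take two end colours and add evenly spaced steps in between. To keep it short I'm interpolating in a straight line through RGB, which is the simplest option but not the most perceptually even one (palette tools usually interpolate in Lab or HCL instead). The two hex values are the ends of ColorBrewer's 5-class blues and are my assumption about the colours used here.

```python
def hex_to_rgb(hex_colour):
    """'#eff3ff' -> (239, 243, 255)"""
    h = hex_colour.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

def rgb_to_hex(rgb):
    """(239, 243, 255) -> '#eff3ff' (channels are rounded to whole numbers)."""
    return "#" + "".join(f"{round(c):02x}" for c in rgb)

def sequential_scale(start, end, steps_between=3):
    """Return start, the in-between steps, then end, evenly spaced in RGB."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    segments = steps_between + 1
    return [
        rgb_to_hex(tuple(x + (y - x) * i / segments for x, y in zip(a, b)))
        for i in range(segments + 1)
    ]

# Assumed end colours: pale blue (low chroma, high lightness)
# and dark blue (high chroma, low lightness).
palette = sequential_scale("#eff3ff", "#08519c", steps_between=3)
print(palette)
```

The middle colours from this won’t exactly match a hand-tuned palette, because RGB interpolation doesn’t space the steps evenly to the eye.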
Analysing the colours we can see chroma increases and lightness decreases. If you think this palette looks familiar you’d be right. It’s the blue palette from colorbrewer.
This counts as a single-hue palette: although the hue varies a little, it stays pretty constant.
Although this colour scale is good, there are benefits to using multi-hue sequences. From Gregor Aisch’s article on colour:
“Hue variation provides a better color contrast and thus makes the colors easier to differentiate.”
But as the creators of colorbrewer say, they are tricky to create because
(The reason why multi-hue palettes are better is explained in more depth in this article.)
Luckily for us, Gregor Aisch has created a tool to help create smooth palettes by interpolating between colours in three-dimensional colour space with Bézier curves. I highly recommend reading his article Mastering Multi-hued Color Scales with Chroma.js to understand more. He also includes a neat trick to make sure lightness increases linearly.
Now we know how to make sequential palettes, we can make divergent palettes by sticking two sequential ones back to back. You may need to put in a neutral shade in the middle. Gregor has even made a tool for that.
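The back-to-back idea can be sketched in a few lines. The function name is my own, and the hex values (a red ramp, a blue ramp and a light grey midpoint) are placeholders for illustration rather than a recommended palette.

```python
def diverging_palette(low_side, high_side, neutral=None):
    """Join two sequential palettes, each ordered light -> dark.

    The low side is reversed so the result runs
    dark -> light -> (neutral) -> light -> dark.
    """
    middle = [neutral] if neutral is not None else []
    return list(reversed(low_side)) + middle + list(high_side)

# Placeholder sequential palettes, each ordered light -> dark.
reds = ["#fee0d2", "#fc9272", "#de2d26"]
blues = ["#deebf7", "#9ecae1", "#3182bd"]

palette = diverging_palette(reds, blues, neutral="#f7f7f7")
print(palette)
```

Leaving out `neutral` gives an even number of colours, which suits data with no meaningful midpoint class.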
As advised by graphiq, choose colours that make sense. This generally means faint colours for low numbers and stronger colours for high numbers, although this might depend on your data.
You are going to need to think through the starting points for your colours. The more colours you have in your scale, the more you’ll need to move your start and end further away from each other, to ensure your colours have enough distance between them. For example see how Colorbrewer does it (from this paper).
Choosing distinct colours is hard. We know that variation in chroma and lightness makes colours easier to distinguish.
For categorical colours, the difficulty is that we need to keep chroma and lightness similar so no colour seems stronger than the others. But if we want the colours to work in greyscale, we need variation in lightness. Getting some difference in lightness also helps viewers with colour blindness, so they aren’t relying on hue alone.
I Want Hue is a tool to help you “generate and refine palettes of optimally distinct colors.” You set the possible colour space in HCL and it uses some maths to pick colours as far away from each other as possible in colour space.
If we set the chroma range to 50-55 and lightness to 65-70 and ask it to generate 4 distinct colours we get.
On the surface, these look quite different. I Want Hue looks at the difference between colours and grades how well each pair does. 5 out of the 6 colour pairs have smiling faces for colour distance, so it’s easy to tell these colours apart. This drops to 1/6 if we consider colour blindness. But if we desaturate these colours, we find they are almost all the same.
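You can run a rough version of that desaturation check yourself. The sketch below converts each colour to a lightness value (CIE L*, from 0 to 100) and lists any pairs that end up too close. The function names and the threshold of 10 are my own assumptions, not what I Want Hue uses, and the example palette is made up for illustration.

```python
from itertools import combinations

def lightness(hex_colour):
    """CIE L* (0-100) of an sRGB hex colour: roughly what survives in greyscale."""
    h = hex_colour.lstrip("#")
    rgb = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    lin = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4 for c in rgb]
    y = 0.2126 * lin[0] + 0.7152 * lin[1] + 0.0722 * lin[2]
    delta = 6 / 29
    f = y ** (1 / 3) if y > delta ** 3 else y / (3 * delta ** 2) + 4 / 29
    return 116 * f - 16

def greyscale_clashes(palette, min_difference=10):
    """Return the colour pairs whose lightness differs by less than min_difference."""
    return [
        (a, b)
        for a, b in combinations(palette, 2)
        if abs(lightness(a) - lightness(b)) < min_difference
    ]

# Made-up palette for illustration only.
palette = ["#ff0000", "#00ff00", "#0000ff", "#fe0000"]
for a, b in greyscale_clashes(palette):
    print(f"{a} and {b} will look alike in greyscale")
```

An empty list means every pair keeps some lightness difference once the hue is taken away.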
So we need to introduce a bigger range of chroma and lightness. Taking inspiration from colorbrewer, their 5 colour qualitative palette has a chroma range from 21-50 and lightness from 45-98. Using these settings we get an example palette of
These work quite well with 6/10 smiley faces for normal vision and 2 of the colour blind modes.
With I Want Hue, you can set colours and lock them, so if you need to use a certain colour that is possible too. For example, you might have to include one brand colour and find 4 other colours that are equally distinct.
Now you’ve got your palette(s), why don’t you test them out in Susie Lu’s palette tool? So we didn’t quite come up with fully accessible colours, but hopefully I’ve shown you what to think about to make your colour palette as accessible as possible.
When I moved jobs to work at the Office for National Statistics, I moved from London to Hampshire, and I suddenly had more places to visit. Some of these places were National Trust, some were English Heritage. I wanted to know whether I would visit enough places to merit buying membership.
I wanted to build a calculator where you put in your group size (adult, pair of adults, family) and whether you wanted to pay gift-aid or not. You could then see a map with National Trust attractions and click to add the ones you might visit, and it would tell you if the total went over the cost of membership, making membership better value.
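The core of that calculator is just a comparison, which could be sketched like this. The function name, place names and prices are all hypothetical, purely to show the shape of the idea.

```python
def compare_membership(planned_visits, membership_price):
    """planned_visits maps a place name to its entry price in pounds."""
    total = sum(planned_visits.values())
    return {
        "pay_per_visit_total": total,
        "membership_price": membership_price,
        "membership_is_cheaper": membership_price < total,
    }

# Hypothetical places and prices, for illustration only.
visits = {"Place A": 12.50, "Place B": 15.00, "Place C": 10.00}
result = compare_membership(visits, membership_price=30.00)
print(result)
```

The group size and gift-aid options would simply change which price gets looked up for each place.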
This ended up being quite complicated, so I decided to make an MVP (minimum viable product): a map of all National Trust attractions with prices for adults with gift-aid.
First, I had to scrape the National Trust website. This was my first go at learning how to scrape with Python and it was quite easy. Thanks to Jure for teaching me. The data was scraped on 23 July 2018, so I expect prices may have changed since.
Second, I found out each National Trust place comes with its own titles for its tickets. For example, there were:
1 adult and up to 3 children
1 adult & up to 3 children
1 adult 3 children
So I used OpenRefine to cluster these into one entry. Here’s the data that’s been cleaned and clustered.
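For a feel of how this kind of clustering works, here is a minimal sketch loosely modelled on key-collision clustering: reduce each title to a normalised key, then group titles that share a key. The filler-word list is my own assumption, added so these three examples collapse together, and this is not a description of exactly what OpenRefine did with my data.

```python
import re
from collections import defaultdict

# Assumption: words that don't change which ticket type is meant.
FILLER_WORDS = {"and", "up", "to"}

def fingerprint(title):
    """Lowercase, drop punctuation and filler words, then sort the unique tokens."""
    cleaned = re.sub(r"[^\w\s]", " ", title.lower())
    tokens = sorted(set(cleaned.split()) - FILLER_WORDS)
    return " ".join(tokens)

def cluster(titles):
    """Group titles by fingerprint; each group is a candidate for merging."""
    groups = defaultdict(list)
    for title in titles:
        groups[fingerprint(title)].append(title)
    return dict(groups)

titles = [
    "1 adult and up to 3 children",
    "1 adult & up to 3 children",
    "1 adult 3 children",
]
for key, members in cluster(titles).items():
    print(key, "<-", members)
```

It’s worth eyeballing each group before merging, since dropping words like “up to” can lump together tickets that really are different.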
I realised I had scraped my data in a funny way, so I used the reshape package in R to fiddle about with it. The plan to make a calculator was out the window, so I just filtered the data for an adult paying gift aid. Now my data was ready to map.
I wanted a simple map to just show places and allow people to click through and see what was around their area and the prices. I settled on using Flourish and was impressed with the configurability of their maps.
The lesson for future me is not to let projects drag on or get overwhelmed by their scope, and to just make something.
It’s been just over one year since I joined the Office for National Statistics to work in the data visualisation team. As with most jobs it took a while to find my feet, but I no longer stumble when I explain my job. I help communicate the data the ONS collects about the UK economy and society in a way that’s more understandable. Moving away from long reports and Excel sheets, our multidisciplinary teams use plain English and visuals to improve communication. By improving how we talk about stats, we hope to raise the level of debate in society about important issues.
When I took the job, it felt like jumping fields. But there is a lot of overlap with my previous roles in user research, parliament communications, public engagement and science communication. You need to know your audience, work to get yourself where they are, and give them a meaningful engagement. I’m doing more programming and more maths now, though, which suits me.
I started doing interactive data visualisations as a bit of a hobby. I set myself weekend projects putting data into existing visualisations as a way to learn to code with d3.js. I read other people’s blog posts about issues in data vis, for example using colour wisely or the best way to represent data truthfully.
But now I’m a practitioner I’ve learnt so much more. I’ve improved my coding and can create things from scratch rather than just repurpose examples. I understand more web technologies. I am more familiar with principles of data vis so I can talk about why things should be a certain way. I’ve done more writing and have more experience of integrating storytelling into projects. I’ve managed more projects simultaneously than ever before. I’m a manager for the first time which is a big learning experience but it’s rewarding to see someone flourish and grow.
Being in a team of data vis specialists means we talk a lot about data vis. We also get approached by colleagues in the ONS about the best way to represent a particular story in a dataset. Talking critically about data vis and learning to articulate what makes a good visualisation only really happen when you have other knowledgeable people around. Having talked to other government analysts, I think our team is in a unique position with so many data vis specialists in one place.
On a personal level, there are several areas I want to develop. I want to improve my data wrangling, my story generation and project management. I want to use some of my user research skills to feed into the evidence behind what we do. There is also the team challenge of integrating the learnings from our visual.ons prototype website into the way the organisation works and perhaps even wider than the ONS.
When sitting down to write this blog post, I realised that although I learnt how to code through my visualisation experiments, I only learnt how to talk about data vis by having other people around me. I wonder if other analysts aren’t able to have these conversations because they don’t have data vis people around them. And if we created a friendly space for these conversations to happen, would this help graphical literacy? I feel there’s an appetite, as we often have people on our data vis courses from other public bodies.
So if you’re interested in starting something, let’s talk.
This is a repost from the ONS’s digital blog.
Conferences are a great way to learn and view the wider field of your profession. Often in data visualisation, you feel you’re working in an area so specialised that no one else does anything similar to you. Imagine my surprise when 350 other data visualisation experts and practitioners turned up in Paris for the OpenVis 2018 conference.
To quote Lynn Cherny, OpenVisConf programme co-chair, “OpenVis is a top-tier conference about ‘open source data visualization’ tools and techniques (“openvis”)”.
The conference was inspiring, full of high-quality talks from people leading the field and from a mixture of academics, journalists and industry types. I also got to ride up and down the Seine in a party boat in the Parisian drizzle.
My takeaways can be grouped into the following four areas.
There was so much inspiration. There were technical showcases of web technology to visualise dinosaurs in 3D or how to handle drawing a billion stars. There were explanations of the analytical side of things, using machine learning to train neural nets or classify drawings. There were also breakdowns of design processes.
One idea that I thought could be applied to visualising ONS data is t-SNE clustering. t-SNE is a machine learning algorithm for visualising data with lots of dimensions. Ian Johnson showed what this technique could do on the quick draw dataset (a dataset of people drawing objects). Previous attempts at characterising this dataset focused on the average (How long does it take to draw a cat?) but there is an argument that it’s more interesting to show the distribution rather than focus on summary statistics.
The t-SNE algorithm visualises groups that are similar but doesn’t specify what attribute it is matching them on. It could be any feature (eyes, ears, shapes, strokes) or a combination. This method could be applied to a number of our statistics where we create groupings, census being the most obvious one but also well-being, households, earnings and other surveys.
There were talks that forced me to consider how we do things. Can we bring aspects of gaming into data visualisation? How can we learn not to fall for fallacies, and how does the brain process information?
Steven Franconeri looked at how the brain can process visual information either quickly or slowly. It is quick at shape recognition and at picking out feature distributions (mean, outliers, trends or clusters), but slow at comparing properties of objects.
Try spotting the odd ones out in these pictures.
Source: Steve Franconeri on Twitter
We can apply these insights to make our visualisations more understandable.
Sometimes it’s good to know the best people out there are also doing what you do. We have put user-centered design at the heart of what we do, as do many others.
One talk was about disagreements between two of the top data vis editors at the New York Times, Amanda Cox and Kevin Quealy. It was great to see the honest conversations that go on behind making visualisations. Disagreements are part of the process: there will always be design choices to make, these are subjective, and even the best disagree.
There was lots to take away from the diverse presentations and the range of other attendees. Talking to people from design agencies and big tech companies as well as freelancers, we found we had common challenges and discussed how we overcome them.
I’ll be sharing more of what I’ve learnt with the data visualisation team and from there into our work in the future.