Thursday February 22, 2007
Information Design for the Web
By Colin Lieberman
The art and science of preparing data for human consumption is Information Design. Every train schedule, product data sheet, and customer survey we encounter was designed, either expertly or crudely, to organize information to facilitate interaction.
Good information design solves problems and communicates effectively. Bad design muddles facts and misses causality. When we’re lucky, bad design results merely in confusing or misleading information displays; at its worst, ineffective information design can cost lives.
Not Hyperbole: The Challenger Disaster
Edward Tufte, one of the most widely published voices in the field, makes a strong case in his book Visual Explanations: Images and Quantities, Evidence and Narrative that the cause of the Challenger explosion was chiefly a failure of information presentation.
Excellent summaries of his argument are available online. In short: when trying to make a case that cooler temperatures were related to O-ring failures, engineers chose a visual representation of the data that obscured their point.
Had they represented the data in a manner that directed the viewer to the causal relationship being explained, the decision to launch in subfreezing temperatures might not have been made.
The Flip Side: Solving Real Problems
On a more positive note, strong information design can solve very real problems. Epidemiologist John Snow discovered the source of London’s 1854 cholera epidemic through a brilliant organization of data.
By grouping cholera deaths geographically and by quantity on a map that featured public water pumps (suspected sources of contamination), he was able to correctly identify the offending pump as the one most closely situated to the most deaths.
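The grouping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a reconstruction of Snow's method: the pump names and all coordinates below are hypothetical placeholders on an arbitrary grid, and straight-line distance is an assumption (walking distance through real streets would differ).

```python
from collections import Counter
from math import hypot

# Hypothetical pump locations and death locations on an arbitrary grid.
# These are NOT Snow's data; they exist only to exercise the code.
PUMPS = {"Pump A": (0.0, 0.0), "Pump B": (10.0, 0.0), "Pump C": (0.0, 10.0)}
DEATHS = [(1.0, 1.0), (2.0, 0.5), (0.5, 2.0), (9.0, 1.0), (1.5, 8.5), (1.0, 0.0)]


def nearest_pump(point, pumps):
    """Return the name of the pump closest to `point` (straight-line distance)."""
    x, y = point
    return min(pumps, key=lambda name: hypot(x - pumps[name][0], y - pumps[name][1]))


def deaths_by_pump(deaths, pumps):
    """Count how many deaths fall nearest to each pump."""
    return Counter(nearest_pump(d, pumps) for d in deaths)


tally = deaths_by_pump(DEATHS, PUMPS)
# The pump with the largest count is the first one worth investigating.
suspect, count = tally.most_common(1)[0]
```

The tally is only the first step. Snow's contribution was drawing the counts on the map, so the cluster around one pump was visible at a glance.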
Life and Death on the Web?
The above examples are extreme, but the principles of information design at work in those events are no different from the ones at work helping people catch the right train.
Much of what’s written on information design focuses on print design, but the principles are equally applicable online. We have the added benefit online of user interaction: our information designs can live and breathe, and meet user needs in a way that print data cannot. But again, the principles are the same.
Principles of Information Design
I’ve tried to distill a lot of reading into just three topics (all of which are Tufte’s), which though fuzzy and slightly overlapping, may help give you specific ideas for your own work. They are:
- Data Richness
- The Smallest Effective Difference
- Parallelism
Data Richness
Data richness is the principle of incorporating as much information as possible into a single graphic or chart. By including as much information as possible, we invite the viewer to make comparisons and give her the opportunity to see the forest as well as the trees, coming away with a fuller understanding of the information available.
The Onion’s weekly Statshot feature does a superb job of poking fun at the bad, information-lean designs that are the hallmark of television news graphics.
This graphic (taken from theonion.com) is perfectly terrible:
- The boxes are all the same size: the viewer cannot take in the big picture of the data at a glance.
- The heavy weight of the numbers, compared to the much lighter weight of the descriptions, makes it difficult for the eye to rest on the text: it does not invite an explanation of the numbers.
- There is zero visual interaction among the table cells: each data point seems to exist on its own island, not inviting comparison to other data points. This effect is heightened by the rule lines, which are of stronger weight than the text explaining the data.
- The strong design elements (it’s meant to look like a touch screen voting machine) outweigh the data themselves, hiding the paucity of information behind a tarted-up graphical veneer.
Information is presented much more effectively by Google Analytics website traffic software:
Taken from jeffooi.com, this pie chart showing web browser usage among visitors to a website clearly presents several layers of information, and does it very well:
- The percentage representation of each browser is clearly identified (although duplicating the percentage scores in the legend would be a bonus).
- A pie chart is a useful representation for this data, as these charts naturally invite comparisons among relative data points.
- The colors chosen are distinct enough to fit 10 browser versions and an ‘other’ category into the graphic.
- The colors are well muted — heavily saturated colors vibrate against one another, which in such large numbers would have made the graphic unappealing to the eye. The use of pastel colors invites the eye to linger, allowing the viewer to make inferences and comparisons more easily.
- Unsurprisingly for Google, this graphic clearly knows its medium: the graphic invites viewer interaction through buttons for showing or hiding additional information, as well as an option to transform the pie chart into a bar chart (for those who prefer).
- The three-dimensionality of the presentation provides a graphical polish that does not distract from the data.
When presenting information graphically, including sufficient information, and organizing it to provide both the big picture as well as individual data points, allows viewers to fully understand the subject being presented. Information-scarce design, and design that stresses graphical contrivances over content, tends to miss the forest for the trees (or vice versa), and does not invite comparison or understanding.
The Smallest Effective Difference
The smallest effective difference is the principle of weighting individual design elements only as much as is needed.
The principle comes into play most frequently with charts and other tabular information. Heavy black rules between rows and columns can distract from the data they contain, and row or column shading, which is meant to aid the eye in perceiving data, can become a distraction when overdone.
Here’s a table showing percentages of children with disabilities in various U.S. metro areas from the Annie E. Casey Foundation’s Kids Count census data site:
The table needs work in a few areas:
- The horizontal and vertical rules are very dark, the same or nearly the same weight as the text. The rules leap out at the eye, making a strong grid pattern. By reducing the rules to a lighter gray, the data would stand out more.
- Most of the columns are too wide. Unless the rankings will get up into the 10-to-15-digit range, these columns could be a fraction of their current width, and the same applies to the Geographic Area column. With the data spread unnecessarily far apart, it’s hard for the eye to take in everything there is to know about an area in one movement.
- There is no highlighting of rows or columns (except the one manually selected). Highlighting alternating rows would help the eye parse the table.
The manual selection highlighting is a strong feature, though. The target audience for this table is children, and nothing interests the young more than themselves. A feature that helps kids find their own region in a large table makes the data more interesting to this audience.
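The fixes suggested in the list above (lighter rules, tighter cells, and shading on alternating rows) can be sketched as a small Python function that emits an HTML table. This is a rough sketch and not the Kids Count site's markup: the function name, the two color values, and the padding are my own assumptions, chosen only to be pale.

```python
from html import escape

RULE_COLOR = "#dddddd"    # assumed pale gray for cell rules
STRIPE_COLOR = "#f5f5f5"  # assumed faint shading for alternating rows


def render_table(headers, rows):
    """Return an HTML table with light rules and shading on every other row."""
    cell = f"border:1px solid {RULE_COLOR};padding:2px 6px"
    out = ['<table style="border-collapse:collapse">']
    head = "".join(f'<th style="{cell}">{escape(h)}</th>' for h in headers)
    out.append(f"<tr>{head}</tr>")
    for i, row in enumerate(rows):
        # Shade the second, fourth, sixth... data rows.
        stripe = f' style="background:{STRIPE_COLOR}"' if i % 2 == 1 else ""
        cells = "".join(f'<td style="{cell}">{escape(str(v))}</td>' for v in row)
        out.append(f"<tr{stripe}>{cells}</tr>")
    out.append("</table>")
    return "\n".join(out)
```

Calling `render_table(["Area", "Rank"], [["A", 1], ["B", 2], ["C", 3]])` with placeholder values shades only the middle row. Narrow padding keeps the columns close together, so a whole row can be read in one movement.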
This next table, from Geoscience Australia, gets a little more right, but still could use some work.
The pale gray highlight is an excellent use of the principle of smallest effective difference. It’s just dark enough to allow the eye to accurately follow each row, but not so dark that it stands out as a design element. It serves its purpose without distracting.
However, the cell borders could use some fixing up. The white space between cells is completely unnecessary, and is only made worse by the black borders on the tops and left edges of the cells. The strong black-on-white contrast of these borders makes reading difficult — the contrast is stronger than the contrast of the text itself (except for the bold table names). This high contrast distraction makes the table appear to vibrate on the screen, and pulls the eye away from the data.
Replace both the heavy borders and the white space between cells with a narrow, pale gray line, and this table would be much more readable.
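One way to put a number on "the contrast of the border is stronger than the contrast of the text" is the contrast ratio defined in the W3C's Web Content Accessibility Guidelines. That formula is my choice of yardstick here, not something the principle of smallest effective difference prescribes, and the pale gray value below is an assumed example color.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def luminance(rgb):
    """Relative luminance of an (r, g, b) color with 0-255 channels."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(a, b):
    """WCAG contrast ratio between two colors, from 1 (identical) to 21."""
    la, lb = luminance(a), luminance(b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
PALE_GRAY = (221, 221, 221)  # assumed replacement rule color

black_rule = contrast_ratio(BLACK, WHITE)     # the maximum possible contrast
gray_rule = contrast_ratio(PALE_GRAY, WHITE)  # far lower, so the rule recedes
```

A rule whose ratio against the background is well below that of the text stays visible without competing with the data.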
This last table, of Indian census data, despite the color choices, has a lot going for it:
- The muted row colors guide the eye without providing much distraction (though the green could be toned down).
- The light warm colors for data contrast nicely with the cool, dark colors for the column headings (even if the colors themselves could use work).
- The white vertical and horizontal rules are almost perfect. The white works well with the light colored rows: it provides separation without distraction. However, it doesn’t work as well with the darker colored headings, where the contrast with the background is as strong as the contrast of the text - an unnecessary visual distraction.
The blue-on-teal of the table title is awful, but easily fixed, and the numbered column headings serve no discernible purpose and could be safely removed. The table also suffers from excessive color, and the columns could be narrower.
Those issues aside, however, this is generally a very strong table: the data is clearly presented, and treated as the most important element of the presentation.
The principle of smallest effective difference dictates that by paying attention to line weight, contrast, and color, we can aid the viewer’s eye in easily perceiving data, without causing unnecessary distraction.
Parallelism
The simplest, clearest way to invite the viewer to make comparisons among data is to show similar information close together. This is parallelism.
An example of this with which we are all familiar is a typical search engine results page.
These pages, when done well, are masterpieces of information design:
- The numerous results are a clear example of strong parallelism. The different results to be compared constitute the bulk of the page content.
- The large blue text of the links, coupled with the white space above, creates a strong contrast, clearly delineating one result blurb from the next.
- The linked page title, as well as the summary text below, provide information about the content of the linked page.
- The green URL provides information about the source, and perhaps credibility, of the content of each result.
Good search engine results pages like Google’s allow the viewer to compare a wealth of parallel information.
Toyota.com’s Model Selector tool is another example of strong parallelism:
The tool shows 21 models of cars, trucks, vans, and SUVs, clearly depicting the body shape of each, in more than a half-dozen available colors. The vehicles are arranged on a well-conceived grid by vehicle type and price (note the strong use of the principle of smallest effective difference).
At a glance, the viewer can compare these 21 vehicles by appearance and cost, and consider color options.
The viewer may then mouse-over any vehicle for additional information, including MSRP, fuel-efficiency, and passenger capacity, as well as for a slightly larger graphic. If these other options are more important to the viewer, she can use the menu on the left to request a new chart with other criteria for axes. This display is highly effective.
The examples of good parallelism we’ve looked at so far compared information in the present. Naturally, we also want to compare changes in data over time and geography.
An excellent map of hummingbird migration at hummingbirds.net clearly shows the range of the journey across the eastern United States, with individual data points color-coded by month, and with individual legends by sighting date.
Note: due to its very large size I’ve shrunk the image. See the original for the full picture.
Parallels among the colored swaths, as well as among individual sightings, provide a broad and deep picture of the birds’ migration. The political boundaries between U.S. states are clear enough to provide context, but not so sharply delineated that they distract from the point of the map (a very strong example of the principle of smallest effective difference).
Effective parallelism provides the viewer with information needed to fully take in the point the data is trying to convey. The viewer is not left wondering “but what about this?” or “I wonder how that fits into this.”
Case Study: bart.gov
The subway in the San Francisco area, the Bay Area Rapid Transit (BART) system, like all major metropolitan transport, has a website with a great deal of information available. A quick look at the information design of bart.gov may help clarify any of these principles I’ve failed to explain clearly.
The home page of bart.gov lacks parallelism. The Quick Planner looks useful, but if I don’t know where in the system stations are located, it’s not terribly helpful. If the information about the sports sponsorship were replaced with a system map, I could more effectively compare my options on the left with geographical information.
So, I’ll click the “System Map” link…
The System Map
The BART system map, an icon of the Bay Area.
This map gets a lot right:
- The simple background draws attention to train lines and stations.
- The saturated colors for the train lines contrast well with each other and the background.
- There is good (at least better than other systems) geographical data to provide context.
While it would be nice to have some other information, such as major freeways, or popular tourist spots, this map does provide a much greater degree of geographical context than metro maps in London or Tokyo (for example).
The page itself has a major problem though: where’s the trip planner we had on the home page? I’ve got the map now, but I can’t look up a schedule for my trip. These features should be on the same page.
Buried under the “Stations and Schedules” menu is a Line Schedules link, and voilà:
A schedule lookup form with the map adjacent.
The radio button options for the desired train line could stand some improvement. A complete lack of table rules makes it difficult to correlate the round radio option with the vertical line color selection, and the legends are far away.
It’s not impossible to use, but it could be better. Here’s my suggestion:
The timetable is also in need of improvement:
The station names written vertically across the top are difficult to read, especially once you scroll down towards later trains.
There are a lot of repeated numbers taking up unnecessary space; removing the unnecessary hours would trim hundreds of characters from this display.
There is no shading to guide the eye.
Here’s my improvement:
| Station | 5 am | 6 am | 7 am |
|---|---|---|---|
| El Cerrito del Norte | 16, 31, 46 | 6:01, 16, 31, 46 | 7:01, 16, 31 |
| El Cerrito Plaza | 19, 34, 49 | 04, 19, 34, 49 | 04, 19, 34 |
This still doesn’t solve the problem of using only color to indicate when bicycles are not permitted, but overall, it’s a little easier to read.
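The idea of stating each hour once and listing only the minutes beneath it can be sketched in Python. The departure times below are the first station's row from my table written out in full; the function names and the output layout are just one possible arrangement, not anything bart.gov provides.

```python
def group_by_hour(times):
    """Group 'H:MM' strings by hour, keeping only the minutes for each hour."""
    grouped = {}
    for t in times:
        hour, minute = t.split(":")
        grouped.setdefault(int(hour), []).append(minute)
    return grouped


def format_row(station, times):
    """Render one timetable row: station name, then one cell of minutes per hour."""
    groups = group_by_hour(times)
    cells = [" ".join(minutes) for _, minutes in sorted(groups.items())]
    return " | ".join([station, *cells])


departures = ["5:16", "5:31", "5:46", "6:01", "6:16",
              "6:31", "6:46", "7:01", "7:16", "7:31"]
row = format_row("El Cerrito del Norte", departures)
# row == "El Cerrito del Norte | 16 31 46 | 01 16 31 46 | 01 16 31"
```

The hour appears once, in the column heading, so every entry shrinks from four or five characters to two.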
We’ve dealt entirely with visual representations of information. When designing for the web, it is important to remember that our work is accessible to users who cannot perceive information visually (such as Googlebot and those with vision disabilities). Many excellent references already exist on how to make graphical data accessible. Proper markup of tables (lacking in my own examples here) is important. Don’t forget to do it.
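As a starting point, here is a minimal Python sketch of what "proper markup" for a data table usually involves: a `<caption>`, header cells marked with `<th>`, and a `scope` attribute saying whether each header describes a column or a row. The function name and its arguments are my own invention, and real tables may need more (such as `headers`/`id` pairs for complex layouts).

```python
from html import escape


def accessible_table(caption, column_headers, rows):
    """Build an HTML table with a caption and scoped header cells.

    `rows` is a list of (row_header, values) pairs; `column_headers`
    includes the heading for the row-header column itself.
    """
    head = "".join(f'<th scope="col">{escape(h)}</th>' for h in column_headers)
    body = []
    for row_header, values in rows:
        cells = "".join(f"<td>{escape(str(v))}</td>" for v in values)
        body.append(f'<tr><th scope="row">{escape(row_header)}</th>{cells}</tr>')
    return "\n".join([
        "<table>",
        f"<caption>{escape(caption)}</caption>",
        f"<thead><tr>{head}</tr></thead>",
        "<tbody>",
        *body,
        "</tbody>",
        "</table>",
    ])
```

With scoped headers, a screen reader can announce the station and the hour alongside each cell, which is the non-visual equivalent of following a row with the eye.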
Good information design makes data clear, and its interpretation readily apparent to viewers. Poor information design results in confusion, lack of usability, or worse. Good information design, accessibly marked up and with text descriptions, makes clear data widely consumable, a goal that, when achieved, drives understanding as well as traffic.
As responsible web professionals, our mission is to keep traffic and conversions up, while simultaneously making the web a better place. Strong, accessible information design achieves these goals.