Wednesday, October 26, 2011

Hexbins!


The hexbin method applied to the locations of all Walmart stores in the United States. By Zachary Forest Johnson. From: http://indiemaps.com/blog/2011/10/hexbins/
What are hexbins, and why would one use them in cartography, data visualization, and spatial analysis? Hexbinning is a way of representing a large number of geocoded locations (points) on a map by grouping them into hexagonal cells. It gives a clear overview of the data while still allowing the map reader to drill down to specifics.

“Binning can be good for both the users and the creators/developers of static or interactive thematic maps or other visualizations. For the user, showing every single point can lead to cognitive overload, and may even be inaccurate, as overlapping points lead to a misreading of density. A binned representation may reveal patterns not readily seen in the raw point representation of the data… The idea of hexagonal binning is to break a two-dimensional plane into different bins. First, the bins make interlocking hexagons. It is possible to use squares (or interlocking triangles or another shape), but hexagons look ‘rounder’ than squares.
Hexbinning consists of 1) laying a hexagonal grid or lattice atop a 2-dimensional field of data and 2) determining data point counts for each hexagon. This says nothing of the symbolization or representation method that can then be employed to communicate these counts to the graphic’s reader… These [hexbin] plots are notable for allowing the user to simultaneously view generalities and retrieve specifics.”
From: http://indiemaps.com/blog/2011/10/hexbins/
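To make the two steps in the quote above concrete (lay a hexagonal lattice over the plane, then count the points in each hexagon), here is a minimal Python sketch. It is not the code behind the Walmart map. The function names (hex_cell, hex_center, hexbin_counts), the pointy-top orientation, and the axial (q, r) indexing with cube rounding are assumptions chosen for illustration, and the points are treated as already-projected planar x/y coordinates.

```python
import math
from collections import Counter


def hex_cell(x, y, size):
    """Return the axial (q, r) index of the pointy-top hexagon containing (x, y).

    `size` is the hexagon's circumradius (center to corner). Hypothetical helper.
    """
    # Convert planar coordinates to fractional axial coordinates.
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size

    # Cube rounding: round all three cube coordinates, then recompute the one
    # with the largest rounding error so that they still sum to zero.
    cx, cz = q, r
    cy = -cx - cz
    rx, ry, rz = round(cx), round(cy), round(cz)
    dx, dy, dz = abs(rx - cx), abs(ry - cy), abs(rz - cz)
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return rx, rz


def hex_center(q, r, size):
    """Return the planar (x, y) center of the hexagon with axial index (q, r)."""
    return (size * math.sqrt(3) * (q + r / 2), size * 1.5 * r)


def hexbin_counts(points, size):
    """Count how many (x, y) points fall in each hexagon; empty cells are absent."""
    return Counter(hex_cell(x, y, size) for x, y in points)


# Made-up sample points, not real store locations.
sample = [(0, 0), (0.1, 0.1), (math.sqrt(3), 0), (math.sqrt(3) + 0.1, -0.1), (0, 3)]
counts = hexbin_counts(sample, size=1.0)
```

The result is just a mapping from hexagon index to count. How those counts are drawn (color, size, something else) is a separate decision, as the quote notes.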

“Hexagon binning is a form of bivariate histogram useful for visualizing the structure in datasets with large n.  The underlying concept of hexagon binning is extremely simple;
1. the xy plane over the set (range [x], range [y]) is tessellated by a regular grid of hexagons.
2. the number of points falling in each hexagon are counted and stored in a data structure.
3. the hexagons with count > 0 are plotted using a color ramp or varying the radius of the hexagon in proportion to the counts.
The underlying algorithm is extremely fast and effective for displaying the structure of datasets with n ≥ 10^6.  If the size of the grid and the cuts in the color ramp are chosen in a clever fashion, then the structure inherent in the data should emerge in the binned plots.  The same caveats apply to hexagon binning as apply to histograms and care should be exercised in choosing the binning parameters.
Why hexagons?  There are many reasons for using hexagons, at least over squares. Hexagons have symmetry of nearest neighbors which is lacking in square bins.  Hexagons are the maximum number of sides a polygon can have for a regular tessellation of the plane, so in terms of packing a hexagon is 13% more efficient for covering the plane than squares.  This property translates into better sampling efficiency at least for elliptical shapes.  Lastly hexagons are visually less biased for displaying densities than other regular tessellations.  For instance with squares our eyes are drawn to the horizontal and vertical lines of the grid.  
When the data are plotted as squares centered on a regular lattice our eye is drawn to the regular lines which are parallel to the underlying grid.  Hexagons tend to break up the lines.”
From: Hexagon Binning: an Overview, by Nicholas Lewin-Koh
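Step 3 of the quoted list, turning counts into symbols, might look something like the Python sketch below. It assumes the counts are already available as a plain dictionary keyed by some hexagon identifier. The names radius_scale and color_classes, the sample counts, and the cut values are all hypothetical, and the cuts here are hand-picked rather than chosen by any particular classification method.

```python
from bisect import bisect_left


def radius_scale(counts, max_radius):
    """Hexagon radius in linear proportion to count; the fullest bin gets max_radius.

    Bins with a count of 0 are dropped, since only non-empty hexagons are plotted.
    """
    nonzero = {cell: n for cell, n in counts.items() if n > 0}
    peak = max(nonzero.values())
    return {cell: max_radius * n / peak for cell, n in nonzero.items()}


def color_classes(counts, cuts):
    """Assign each non-empty bin a color-ramp class index.

    `cuts` is an ascending list of upper bounds: class i holds counts that are
    greater than cuts[i - 1] and at most cuts[i]; counts above the last cut
    land in class len(cuts).
    """
    return {cell: bisect_left(cuts, n) for cell, n in counts.items() if n > 0}


# Made-up counts keyed by made-up hexagon labels.
example_counts = {"a": 2, "b": 10, "c": 0, "d": 37}
radii = radius_scale(example_counts, max_radius=1.0)
classes = color_classes(example_counts, cuts=[5, 20])
```

Scaling the radius linearly follows the wording of the quote. Because a hexagon's area grows with the square of its radius, a map maker who wants area rather than radius to track the count could scale by the square root of the count instead.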

For more information see the following:
The Hex Bin method

Walmart locations all hexed up!

