Step 1: A Clear Goal

Every successful data visualization begins with a question or a curiosity, followed by dedicated research. Data sets can be found in a wide variety of places. Census tracts are useful, free resources that can benefit almost any topographic endeavor, especially those focused on discovering correlations among demographic, behavioral, and geographic data to learn more about the people whom the data represent. There are scores of resources that maintain lists of usable information, some of which are part of the open data movement and others that are accessible only for a fee. Another option for finding information, sometimes overlooked, is to ask questions on forums, in blog comments, and in similar venues. Before beginning a project, the analyst should always confirm that the dataset is reliable.
Step 2: Cleaning

The next step, especially when constructing something to publish, is to clean the data. Capitalization should be consistent, spelling errors and extra spaces eradicated, and formatting developed with purpose. You may have 100,000 rows of data or more, so there is no time to waste! We call this technique “data wrangling,” and there are many useful tools to help in this sort of effort. One such tool is an open source application called OpenRefine. OpenRefine allows the user to group the data using pieces of information, such as a string (a sequence of characters), a numeric value, or any other similarity that the data share. This is helpful for discovering typos, capitalization patterns, or other differences within a cell of data. The application can also transform the data into other formats, combine it with other data sets, and more. OpenRefine fits most of CATMEDIA’s data wrangling needs. There are tools available for a fee that can help with other specific needs.
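To make the grouping idea concrete, here is a minimal Python sketch of how similar values can be clustered so that typos and capitalization differences surface together. It is loosely modeled on the "fingerprint" style of key-collision clustering that tools such as OpenRefine offer, but it is not OpenRefine's own code. The function names `fingerprint` and `cluster` and the sample city list are assumptions made up for illustration.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key so that near-identical strings collide."""
    # Fold accented characters to their closest ASCII form.
    text = unicodedata.normalize("NFKD", value)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Trim extra spaces, lowercase, and drop punctuation.
    text = text.strip().lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, then sort and de-duplicate the tokens.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Group raw values by fingerprint; keep only groups with several variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Hypothetical sample column with inconsistent capitalization and spacing.
cities = ["Atlanta", "atlanta ", "ATLANTA", "Marietta", "Marietta, GA", "GA Marietta", "Decatur"]
clusters = cluster(cities)
for key, variants in clusters.items():
    print(key, "->", variants)
```

Each cluster lists the raw spellings that probably refer to the same thing, so a person can review them and pick one consistent form. The sketch only flags candidates; it does not decide which spelling is correct.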
Step 3: Discovery

The next step in this method is to import the cleaned data into suitable geographical data visualization software and begin analysis. Today, there are many innovative mapping software projects and presentation tools to choose from, making it difficult to recommend just one. I’ll let you in on a little secret. The most important part of creating a successful data visualization is uncovering a story, trend, or insight from the data using curiosity, ingenuity, and creativity. This part is fun and exciting, full of adventure and unexpected surprises. Thoughtful experimentation will help guide this process and reveal insight from what was once a series of numbers. There are tried and true methods for representing geographical data that were developed and implemented long before digital interactive mapping was a glimmer in a cartographer’s eye. For the purposes of this blog, I have laid out a brief description of a few types of thematic mapping. Thematic maps are usually used as tools to present statistical data connected with geographical locations, as opposed to general reference maps, like those in atlases. Here are some of the most widely used types of thematic maps:
- Choropleth maps: use color to represent rates, quantities, values, etc. across geographic areas. The name comes from the Greek stems “choro,” meaning area, and “plethos,” meaning multitude.
- Dot maps: use markings of the same shape, size, etc., each representing a single unit of data. Trends can be identified by looking for patterns, or the lack thereof, among the symbols.
- Graduated and proportional symbols: use markings of different sizes to represent different values. For example, a larger value is drawn with a larger symbol.
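
For readers who like to see the arithmetic, the following Python sketch shows one common way to turn values into a choropleth color class and a proportional symbol size. It assumes equal-width classes for the choropleth and scales each symbol so that its area (rather than its radius) tracks the value; both are conventions chosen for this illustration, not the only options. The function names, the three regions, and their numbers are made up.

```python
import math


def equal_interval_class(value, minimum, maximum, n_classes=5):
    """Return the 0-based color class for a choropleth with equal-width bins."""
    if maximum == minimum:
        return 0
    width = (maximum - minimum) / n_classes
    # The top value would land one past the last bin, so clamp it back in.
    return min(int((value - minimum) / width), n_classes - 1)


def proportional_radius(value, max_value, max_radius=20.0):
    """Scale a circle so its AREA, not its radius, is proportional to the value."""
    if max_value <= 0 or value <= 0:
        return 0.0
    return max_radius * math.sqrt(value / max_value)


# Made-up values for three hypothetical regions.
rates = {"Region A": 12.0, "Region B": 30.0, "Region C": 48.0}
low, high = min(rates.values()), max(rates.values())

classes = {name: equal_interval_class(v, low, high, n_classes=3) for name, v in rates.items()}
radii = {name: proportional_radius(v, high) for name, v in rates.items()}

for name in rates:
    print(name, "class", classes[name], "radius", round(radii[name], 1))
```

Taking the square root matters because the eye compares the areas of symbols: a region with one quarter of the largest value gets half the radius, which yields one quarter of the area.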