Why Checkpoints?
Between the launch of the competition and selection of the finalists, there are two checkpoints. These are designed to keep you on track, and, particularly if you are a beginner, to give you a little bit of guidance on the typical process of a data analysis. They are for you to use as you like, or not use at all.
The checkpoints give you opportunities to reflect on how your project is coming along as you move through the stages of creating your final product. Summarizing the details you have been immersed in so that someone else can follow them is a valuable exercise, and it can help you deliver your best work.
One week before each checkpoint, you will receive a reminder email that a checkpoint is coming up.
The Idea Checkpoint
By this point you should have a clear idea of what your project will be about, and of whether you will design a data-driven visual essay/story/infographic or an exploratory data visualization.
You don’t yet need to have a detailed plan of what your final product will look like, but you do need to know what information you will use to make it.
The purpose behind this checkpoint is to ensure that the dataset does in fact contain the information that you need in order to answer your question of interest. A common pitfall in data analysis is thinking that you are asking one question, when the information available to you lends itself to answering something (slightly or more than slightly) different.
Are you able to refine your question after having taken a closer look at the data and the metadata? What are some possible sources of bias and is there any way that you can avoid them without spending time and resources collecting more data?
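One way to check whether a dataset can support your question is to list the fields the question depends on and see which are present and how often they are blank. The sketch below is only an illustration: the sample rows, the column names, and the `required_fields` list are all hypothetical and stand in for whatever dataset and question you have chosen.

```python
import csv
import io

# Hypothetical extract standing in for a competition dataset.
sample_csv = """incident_id,date,neighbourhood,injury_severity
1,2019-01-04,Riverside,minor
2,2019-01-09,,
3,,Hilltop,serious
"""

# Hypothetical list of the fields your question depends on.
required_fields = ["date", "neighbourhood", "injury_severity", "cyclist_age"]

rows = list(csv.DictReader(io.StringIO(sample_csv)))
available = list(rows[0].keys()) if rows else []

# Fields the question needs but the dataset does not contain at all.
missing_fields = [f for f in required_fields if f not in available]

# Share of rows where a required field is present as a column but left blank.
blank_share = {
    f: sum(1 for r in rows if not r[f].strip()) / len(rows)
    for f in required_fields
    if f in available
}

print("Not in the dataset:", missing_fields)
for field, share in blank_share.items():
    print(f"{field}: {share:.0%} blank")
```

A field that is absent, or blank in a large share of rows, is a signal to refine the question or to note the limitation in your deliverable.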
Deliverable: a paragraph describing the question that drives your visualization, emailed to firstname.lastname@example.org. You may describe the issue that you care about, why it is important to you to raise awareness about it, any limitations of the data for answering your question, and what you plan to do about these limitations (if anything can be done).
The Analysis Checkpoint
By this point you should have done any pre-processing and cleaning of the dataset that you are interested in, and you should have a fairly good idea of the basic features and relationships in the data.
Pre-processing refers to manipulations you make to your variables to get them ready to use. For example, some of the datasets have location encoded as a single string, address (latitude, longitude), and you need to separate latitude and longitude into different columns. This manipulation turns a string of characters into numeric values, changing the data type of the variable. Latitude and longitude need to be represented numerically in order to map the data point.
The process is iterative: you are not expected to know that you need numeric values before you try the mapping functions and find out that their input has to be numeric. You may feed the location variable to the mapping function, get an error, identify the source of the error (wrong data type), and only then find out that you need to pre-process that variable before using it. Being able to work in this cycle will make you better at data analysis.
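The location example can be sketched in a few lines of Python. This is a minimal illustration, not the required method: the sample strings are made up, and the exact "address (latitude, longitude)" layout, the regular expression, and the `split_location` helper are assumptions that you would adapt to the dataset you are actually using.

```python
import re

# Hypothetical raw values: an address followed by "(latitude, longitude)".
raw_locations = [
    "12 Sample Rd (43.6534, -79.3841)",
    "1 Example Ave (45.5017, -73.5673)",
    "no coordinates recorded",
]

# Two signed decimal numbers inside a trailing pair of parentheses.
COORD_PATTERN = re.compile(
    r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)\s*$"
)


def split_location(value):
    """Return (latitude, longitude) as floats, or (None, None) if absent."""
    match = COORD_PATTERN.search(value)
    if match is None:
        return None, None
    return float(match.group(1)), float(match.group(2))


# First attempt, as in the cycle described above: the raw string is not numeric.
try:
    float(raw_locations[0])
except ValueError as error:
    print("Raw value is not numeric:", error)

# After pre-processing, each record has two numeric values (or None, None).
coordinates = [split_location(value) for value in raw_locations]
for latitude, longitude in coordinates:
    print(latitude, longitude)
```

Returning `(None, None)` for rows without coordinates keeps the row count unchanged, so you can decide later whether to drop those rows or look the coordinates up some other way.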
You should also have run some descriptive statistics and explored any relationships between variables that you care about. By this point the story should start to emerge from the data!
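As a rough sketch of what this can look like, the snippet below summarizes two variables and measures how strongly they move together, using only Python's standard library. The records, the variable names, and the `describe` and `pearson` helpers are hypothetical; a data analysis library would give you the same numbers with less code.

```python
import statistics

# Hypothetical cleaned records: (trip_distance_km, trip_duration_min).
records = [(1.0, 5.0), (2.0, 9.0), (3.0, 16.0), (4.0, 19.0), (5.0, 26.0)]
distances = [distance for distance, _ in records]
durations = [duration for _, duration in records]


def describe(values):
    """Basic descriptive statistics for one numeric variable."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return covariance / (spread_x * spread_y)


print(describe(distances))
print(describe(durations))
print("correlation:", round(pearson(distances, durations), 3))
```

A correlation close to 1 or -1 suggests a relationship worth plotting; a value near 0 suggests there is no linear relationship, although other kinds of relationship may still show up in a graph.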
Deliverable: a paragraph outlining any manipulations that you did on the data and why they were necessary. You may also submit graphs, charts, maps, diagrams, etc. along with observations that give you an insight into the data.
The Visualization Checkpoint
Now that the story is starting to emerge, shift your attention to how you want to present it.
Deliverable: a mockup of your final visualization, indicating the positioning of key elements, and a paragraph explaining your reasoning behind it. This may, of course, change as you make progress in creating it.