You can use Microsoft Excel if you're familiar with it, but for simplicity's sake all of my examples will be in Google Spreadsheets.
All of our data wrangling will be done on this spreadsheet. Open it up and make your own local copy. In case you're wondering: yes, this is actual, real data about deer hunting accidents in Wisconsin. Cool.
Spreadsheet programs come in many different flavors, but for the most part they are all copies of the same program. The same commands are there; they may just be in different menus or locations.
At this point you should have made a copy of this file: https://docs.google.com/spreadsheets/d/1mzd4F_slzOL6BYfLUaKhrXf9xV-KDfO4r5W4jwlgzSc/edit#gid=0
If not, go to that link and choose File > Make a Copy.
Now you have your own copy! Hurray. Save it locally.
Here are some things you should do with every dataset you encounter:
1. The Eyeball Test. Scroll through the spreadsheet and look for patterns or missing spots. Some questions/notes that you should keep in mind:
Typically, you can't understand most data in the real world without making some phone calls.
All that said, we are going to continue anyway...
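If you'd like a second opinion on the eyeball test, the "missing spots" check can also be roughed out in a few lines of Python. This is only a sketch: it assumes you've exported the sheet as CSV, and the column names and rows below are made up for illustration, not taken from the real deer hunting data.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for a CSV export of the spreadsheet.
# These column names and values are invented for illustration.
SAMPLE_CSV = """date,county,injury,gun_brand
2010-11-20,Dane,Fatal,Remington
2010-11-21,,Non-fatal,
2010-11-21,Vilas,,Winchester
"""

def count_blanks(csv_text):
    """Return a Counter mapping each column name to its number of blank cells."""
    blanks = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            # A cell counts as blank if it is missing or only whitespace.
            if value is None or value.strip() == "":
                blanks[column] += 1
    return blanks

blank_counts = count_blanks(SAMPLE_CSV)
for column, n in blank_counts.items():
    print(f"{column}: {n} blank")
```

A count like this tells you where the holes are, not why they're there. That part still takes scrolling through the data and, usually, a phone call.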
2. Set Up Shop. A few things to do before you get started that will make it easier to work with your data.
Freeze top row: View > Freeze
Turn on filters: Data > Filter
3. Data Wrangling. Techniques that will help you with most of your initial data questions.
Pivot Tables. Say I want to find out which gun brand is most often associated with fatal accidents. What is the best way to do that?
Data > Pivot Tables. We'll use this technique to answer the following questions:
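If it helps to see what a pivot table is doing under the hood, here is a minimal Python sketch of the same idea: keep only the rows you care about, group them by one column, and count. The field names (`gun_brand`, `injury`) and the rows are hypothetical and are not copied from the real spreadsheet.

```python
from collections import Counter

# Hypothetical rows; the real sheet's column names and values may differ.
ROWS = [
    {"gun_brand": "Remington", "injury": "Fatal"},
    {"gun_brand": "Winchester", "injury": "Non-fatal"},
    {"gun_brand": "Remington", "injury": "Non-fatal"},
    {"gun_brand": "Marlin", "injury": "Fatal"},
    {"gun_brand": "Remington", "injury": "Fatal"},
]

def pivot_count(rows, row_field, filter_field, filter_value):
    """Count rows per value of row_field, keeping only rows where
    filter_field equals filter_value (like a pivot table with a filter)."""
    counts = Counter()
    for row in rows:
        if row[filter_field] == filter_value:
            counts[row[row_field]] += 1
    return counts

fatal_by_brand = pivot_count(ROWS, "gun_brand", "injury", "Fatal")
# most_common() sorts from the highest count to the lowest.
for brand, n in fatal_by_brand.most_common():
    print(brand, n)
```

In the spreadsheet, the equivalent is putting the brand column in the pivot table's rows, counting records as the value, and filtering on the injury column.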
4. More Things to Know
Make phone calls. Find the nerd who works with the data and talk to them.
Save early and often
Keep a data diary
Back up your data
If you can, go see it in the real world.
Bulletproof, Bulletproof, Bulletproof. Have other colleagues look at your data before it's published. Also make sure to do the following: