Vancouver Bicycle Crime pandas Project Setup

The Data

This little starter project is centered on a couple of data sets that I found in CSV format in the City of Vancouver's open data catalog. The first file, with about 1,700 entries, contains information on all the bike racks in the city, while the second, with about 136,000 entries, contains crime data for the city dating back to 2015. The second data set conveniently has a category under its 'Type' field called 'Theft of Bicycle', so naturally I'm interested in finding the worst place in the city to park my bike.
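To give a rough idea of where this is headed, here is a minimal sketch of pulling the bike thefts out of the crime data with pandas. Only the 'Type' field and the 'Theft of Bicycle' value come from the real data set; the 'Neighbourhood' column, the sample rows, and the `bike_thefts` helper are made up for illustration, and the real file would be loaded from disk rather than from an inline string.

```python
import io

import pandas as pd

# Tiny stand-in for the crime CSV. Only the 'Type' column and the
# 'Theft of Bicycle' value are taken from the post; the 'Neighbourhood'
# column and every row below are hypothetical.
SAMPLE_CSV = """Type,Neighbourhood
Theft of Bicycle,Downtown
Mischief,Downtown
Theft of Bicycle,Kitsilano
Theft of Bicycle,Downtown
"""


def bike_thefts(crime: pd.DataFrame) -> pd.DataFrame:
    """Keep only the rows whose 'Type' is 'Theft of Bicycle'."""
    return crime[crime["Type"] == "Theft of Bicycle"]


# Read the sample data the same way a CSV file on disk would be read.
crime = pd.read_csv(io.StringIO(SAMPLE_CSV))
thefts = bike_thefts(crime)

# Count thefts per (hypothetical) neighbourhood, most thefts first.
counts = thefts["Neighbourhood"].value_counts()
print(counts)
```

With the real file, the `io.StringIO(...)` argument would simply be swapped for the path to the downloaded CSV.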

The Tools

I went back and forth for a while on whether to use R or pandas for this project. I ultimately settled on pandas for the following reasons:
  1. I've used R before and I wanted to try something fresh.
  2. pandas is built on top of NumPy, and therefore on Python.
  3. With pandas I expect to have more control (likely because I come from a programming background).
  4. I saw a YouTube video that got me kinda psyched about it.
I didn't want to code Python in interactive mode, and I also didn't want to waste time running code from a prompt, so I picked PyCharm as my IDE for the following reasons:
  1. It's a JetBrains product, and I've used other JetBrains IDEs.
  2. It apparently gets along well with Git.
  3. It has R language support (something I might take advantage of later).
  4. Some guy on the internet (who's probably sponsored) told me to.

Summary

I have my environment set up now, so I'll probably post something more interesting in my next post.

Cheers,
Joe
