In May of this year, the city of Milwaukee opened a new open-data portal. I am learning some Python visualization tools at the moment, so I thought this would be a good opportunity to explore Milwaukee's data and learn something for myself. I hope to do a few of these, so if you like it, stay tuned.

My first post is on Milwaukee crime data, available at https://data.milwaukee.gov/dataset/wibr. Even if you don't visit the link, please note that the city attaches some disclaimers about the accuracy of the data. My personal disclaimer is that I am not a crime expert. All of this analysis should be taken with a grain of salt.

In [24]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import matplotlib.dates as mdates
plt.style.use("ggplot")
In [25]:
data = pd.read_csv("./gunViolenceData/wibr.csv")
In [26]:
data.ReportedDateTime = data.ReportedDateTime.apply(lambda string : string[0:7]) #7 for month, 10 for day
data.ReportedDateTime = pd.to_datetime(data.ReportedDateTime, format="%Y-%m")#-%d %H:%M:%S")
data["OnesForCounting"] = np.ones(data.shape[0])
data.index = pd.DatetimeIndex(data.ReportedDateTime.values)
In [27]:
data.WeaponUsed.unique()
gunNames = ['HANDGUN',
       'FIREARM', 'FIREARM,HANDGUN', 'FIREARM,PERSONAL WEAPON', 'HANDGUN,FIREARM', 'FIREARM,OTHER',
       'SHOTGUN', 'HANDGUN,PERSONAL WEAPON', 'BLUNT OBJECT,HANDGUN', 'OTHER FIREARM',
       'PERSONAL WEAPON,FIREARM',
       'FIREARM,LETHAL CUTTING INSTRUMENT',
       'HANDGUN,LETHAL CUTTING INSTRUMENT',
       'FIREARM,BLUNT OBJECT,PERSONAL WEAPON','RIFLE', 'HANDGUN,RIFLE',
       'FIREARM,RIFLE',
       'FIREARM,HANDGUN,PERSONAL WEAPON', 'BLUNT OBJECT,FIREARM',
       'LETHAL CUTTING INSTRUMENT,HANDGUN',
       'GUN','HANDG', 'HANDS/HANDG', 'BRICK/HANDG','DRUGS/HANDG', 'GUN/HANDG',
       'HANDG/ROCK', 'HANDG/GUN/VEHIC', 'SHTGN','GUN/HANDG/SHTGN', 'HANDS/GUN',
       'BRICK/GUN', 'HANDG/BLUNT', 'HANDG/RIFLE',
       'HANDG/KNIFE', 'GUN/HANDS', 'BLUNT/HANDG',
       'HANDG/HANDS/TOOLS', 'HANDS/HANDG/RIFLE', 
       'HANDG/OTHER',
       'HANDS/SHTGN', 'BBBAT/SHTGN', 'HANDG/SHTGN',
       'GUN/KNIFE', 'GUN/ROCK', 'HANDG/GUN', 'HANDG/VEHIC', 'BBBAT/GUN',
       'IMPLI/GUN', 'HANDG/BLUNT/VEHIC', 'BLUNT/GUN',
       'GUN/SHTGN', 'HANDG/OTHER/HANDS',
       'GUN/RIFLE/HANDS', 'HANDG/RIFLE/SHTGN', 'GUN/RIFLE',
       'IMPLI/HANDG', 'HANDG/BBGN', 'GUN/HANDG/HANDS',
       'HANDG/KITUT/HANDS',
       'RIFLE/GUN',  'NONE/SHTGN',
       'HANDG/RIFLE/HANDS',
       'GUN/IMPLI', 'IMPLI/HANDS/GUN',
       'HANDG/KNIFE/HANDS', 'ASPHY/HANDS/HANDG',
       'GUN/FIRE', 'HANDG/HANDS/BLUNT', 'GUN/OTHER',
       'BBBAT/BLUNT/HANDG', 'HANDG/IMPLI', 'BBGN/BLUNT',
       'OTHER/GUN', 'HANDS/GUN/HANDG', 'HANDG/PEPPE',
       'HANDG/KITUT', 'GUN/BLUNT', 'GUN/HANDS/UNKNO',
       'GUN/PEPPE', 'IMPLI/GUN/FIRE', 'BBBAT/HANDG',
       'SHTGN/HANDS/PEPPE', 'VEHIC/GUN',
       'TASER/HANDG', 'GUN/HANDG/RIFLE', 'RIFLE/HANDS',
       'HANDG/BRICK', 'GUN/HANDS/KNIFE', 'SHTGN/HANDG','HANDG/UNKNO', 'TOOLS/GUN',
       'HANDG/PEPPE/TASER', 'GUN/HANDG/PEPPE',
       'HANDG/HANDS/GUN','FIRE/GUN/HANDG',
       'HANDG/HANDS/KNIFE', 'RIFLE/HANDG', 'HANDS/BLUNT/HANDG',
       'HANDS/HANDG/BLUNT',
       'RIFLE/SHTGN', 'GUN/RIFLE/SHTGN', 'BLUNT/HANDG/VEHIC',
       'BLUNT/HANDG/KNIFE', 'HANDS/HANDG/SHTGN',
       'GUN/HANDS/OTHER', 'GUN/HANDG/OTHER',
       'GUN/UNKNO', 'KNIFE/HANDG', 'GUN/TOOLS', 'GUN/HANDG/TOOLS', 'OTHER/HANDG',
       'GREAS/HANDG', 'BOTTL/HANDG', 'GUN/GUN', 'GUN/TASER',
       'BLUNT/HANDS/HANDG', 'HANDG/RIFLE/TASER',
       'GUN/VEHIC', 'HANDG/BBBAT',
       'HANDS/HANDG/OTHER', 'HANDG/TOOLS', 
       'HANDS/HANDG/KNIFE', 'HANDG/HANDS/OTHER', 'GUN/HANDS/BLUNT',
       'HANDG/HANDS/HANDG', 'BLUNT/GUN/HANDS',
       'HANDS/RIFLE', 'UNKNO/HANDG', 'FIRE/GUN', 'KNIFE/SHTGN',
       'RIFLE/BLUNT', 'ROCK/HANDG/HANDS',
       'HANDG/TASER/FIRE', 'HANDG/VEHIC/HANDS', 'PEPPE/GUN', 'ROCK/GUN','VEHIC/HANDG', 
       'HANDG/TIRE', 'PEPPE/HANDG',
       'SHTGN/HANDS','KITUT/HANDG/HANDS',
       'HANDG/TASER', 'DRUGS/HANDG/GUN',
       'KNIFE/HANDS/HANDG', 'HANDG/HANDS/VEHIC',
       'GUN/HANDG/KNIFE', 'VEHIC/HANDS/HANDG',
       'HANDS/HANDG/PEPPE', 'BBBAT/HANDG/HANDS', 'IMPLI/GUN/HANDS',
       'HANDS/OTHER/HANDG', 'PEPPE/HANDG/HANDS', 
       'HANDS/BBBAT/HANDG', 'TOOLS/HANDG', 'HANDG/FIRE', 'SHTGN/SHTGN', 'BLUNT/HANDG/HANDS',
       'SHTGN/GUN/HANDG', 'GUN/HANDS/HANDG',
       'HANDS/KITUT/GUN', 'HANDS/IMPLI/GUN', 'VEHIC/SHTGN', 'HANDG/VEHIC/OTHER',
       'HANDG/FIRE/ROCK','BRICK/HANDG/ROCK', 
       'KNIFE/HANDG/HANDS', 'HANDG/BLUNT/HANDS',
       'BOARD/GUN', 'BLUNT/BRICK/HANDG', 'HANDG/HANDS/TASER', 'GUN/BRICK', 
       'TOOLS/HANDG/IMPLI', 'SHTGN/KNIFE/HANDS',
       'BBBAT/HANDS/HANDG',
       'GUN/KITUT', 'HANDS/GUN/TASER',
       'BOTTL/HANDS/HANDG', 'KNIFE/GUN',
       'HANDS/PEPPE/HANDG', 'BBBAT/GUN/SHTGN', 'BBBAT/HANDG/TASER',
       'UNKNO/GUN', 
       'HANDS/KNIFE/GUN', 'GUN/BBBAT', 'HANDG/HANDS/IMPLI',
       'HANDS/SHTGN/HANDG', 'HANDG/GUN/OTHER','SHTGN/OTHER', 'BLUNT/HANDG/IMPLI',
       'FIRE/HANDG/SHTGN',
       'GUN/HANDS/ROCK',
       'HANDS/GUN/TOOLS', 'HANDG/HANDG', 'SHTGN/HANDS/TOOLS','HANDG/GUN/HANDS',
       'BRICK/HANDG/HANDS', 'HANDG/TOOLS/BLUNT',
       'HANDG/HANDG/GUN', 'HANDG/HANDS/PEPPE', 'HANDS/HANDG/VEHIC',
       'KITUT/KNIFE/HANDG', 'GUN/GUN/HANDG',
       'HANDG/GUN/RIFLE', 
       'RIFLE/HANDS/HANDG', 'TIRE/HANDG', 'OTHER/HANDG/HANDS',
       'HANDG/HANDG/HANDS', 'ASPHY/GUN',
       'HANDG/RIFLE/PEPPE',
       'FIRE/HANDG', 'HANDS/SHTGN/RIFLE', 'UNKNO/GUN/IMPLI',
       'BLUNT/HANDS/GUN', 
       'GUN/KNIFE/SHTGN', 'RIFLE/KNIFE',
       'SHTGN/GUN', 'TIRE/OTHER/GUN',
       'SHTGN/HANDS/IMPLI',
       'ROCK/HANDG', 'RIFLE/HANDG/HANDS', 'ASPHY/HANDG/HANDS', 'IMPLI/HANDG/HANDS',
       'IMPLI/RIFLE', 'GUN/BBGN', 'KNIFE/GUN/HANDS',
       'BLUNT/SHTGN', 'GUN/HANDS/BOTTL',
       'HANDG/SHTGN/GUN', 'HANDG/HANDS/RIFLE',
       'SHTGN/HANDG/HANDS', 'GUN/KNIFE/HANDS',
       'HANDG/UNKNO/PEPPE', 'NONE/HANDG', 'GUN/FIRE/HANDS',
       'SHTGN/GUN/HANDS',
       'UNKNO/SHTGN/OTHER',
       'HANDS/BBBAT/GUN','GUN/HANDG/FIRE',
       'OTHER/RIFLE',
       'GUN/HANDS/KITUT', 'HANDS/GUN/KNIFE', 
       'GUN/HANDS/SHTGN', 'KITUT/HANDG', 'HANDG/SHTGN/HANDS', 'TIRE/GUN',
       'SHTGN/RIFLE',
       'GUN/HANDG/BLUNT', 
       'HANDS/IMPLI/HANDG', 'GUN/SHTGN/HANDG', 'GUN/RIFLE/ROCK',
       'GUN/HANDG/IMPLI', 'GUN/BRICK/HANDS', 'HANDG/HANDS/SHTGN', 
       'BBBAT/BBBAT/GUN', 'BBGN/GUN', 'HANDS/OTHER/GUN',
       'HANDG/SHTGN/OTHER',
       'BOARD/HANDG', 'GUN/HANDG/VEHIC',
       'HANDS/HANDG/IMPLI',
       'BOTTL/HANDG/HANDS', 
       'HANDG/GUN/BRICK', 'GUN/HANDG/HANDG', 'RIFLE/SHTGN/HANDG',
       'HANDG/SHTGN/RIFLE',
       'BBGN/HANDG/SHTGN', 'HANDS/GUN/RIFLE',
       'HANDG/BBBAT/HANDS', 'GUN/HANDG/GUN',
       'BBGN/GUN/HANDG',
       'HANDS/GUN/KITUT/',
       'HANDG/BRICK/ROCK', 'ASPHY/SHTGN',
       'RIFLE/OTHER', 'OTHER/IMPLI/HANDG',
       'KNIFE/RIFLE/SHTGN', 'BLUNT/HANDG/KITUT', 'BBGN/SHTGN',
       'BLUNT/IMPLI/HANDG', 'RIFLE/GUN/HANDG','KNIFE/IMPLI/GUN', 'IMPLI/HANDG/OTHER',
       'GUN/KITUT/RIFLE', 'ASPHY/HANDG',
       'UNKNO/IMPLI/HANDG', 'HANDS/RIFLE/SHTGN', 'HANDG/BOARD', 'OTHER/SHTGN',
       'HANDG/TOOLS/TIRE',
       'OTHER/UNKNO/HANDG', 'BBBAT/GUN/HANDG', 'OTHER/HANDS/GUN', 'HANDG/KNIFE/RIFLE', 'SHTGN/UNKNO',
       'EXPLO/RIFLE', 'HANDG/HANDS/HANDS', 'POISN/HANDG',
       'IMPLI/GUN/HANDG']
In [28]:
def findGunIncedents(gunNames, data):
    # Boolean mask: True where WeaponUsed exactly matches one of the gun-related strings
    return data.WeaponUsed.isin(gunNames).values
In [29]:
mask = findGunIncedents(gunNames, data)
In [30]:
gunData = data.iloc[mask,:].copy()  # copy so that adding columns later does not touch a view of data
In [31]:
# datetime64[ns] as int64 is nanoseconds since the epoch; dividing by 1e6 gives milliseconds
newSeries = pd.Series(gunData.ReportedDateTime.astype("int64")/(1e6))
gunData.insert(column="ms",value=newSeries,loc=1)

The crime data being analyzed is a list of Grade A offenses since 2005, collected by the City of Milwaukee. What are "Grade A offenses"? Based on the dataset, they appear to be the following: Theft, Assault, CriminalDamage, Burglary, VehicleTheft, LockedVehicle, Robbery, SexOffense, Arson, and Homicide. A pretty good list of definitions can be found at https://city.milwaukee.gov/commoncouncil/District11/AnalysisShows11thDis19773.htm#.W6uBLpMzbxg. The distribution of crimes in the dataset is shown below. On the left is a bar chart of the totals. On the right is a donut plot; the numbers inside are the percentage of all reported crimes in each category.

In [32]:
# Sum only the offense indicator columns, so non-numeric columns are never summed
categoryTotals = data.loc[:, "Arson":"VehicleTheft"].sum().sort_values(ascending=False)
cmap = plt.get_cmap("tab10")
colors = cmap(np.arange(categoryTotals.index.values.shape[0]))

plt.figure(figsize=(20,9))
plt.subplot(1,2,1)
plt.bar(categoryTotals.index.values, height=categoryTotals, color=colors)
plt.xticks(rotation=45, ha="right")
plt.grid(axis="x")
plt.subplot(1,2,2)
# had some help from here: https://matplotlib.org/gallery/pie_and_polar_charts/nested_pie.html
plt.pie(categoryTotals, radius=1, wedgeprops=dict(width=0.3, edgecolor='w'), autopct='%.1f', labels=categoryTotals.index.values, colors=colors);

What story do the figures above tell? Theft makes up a majority of the reported crimes. In almost 13 years, over 140,000 thefts have been reported. That's a bit mind-boggling when you consider that about 600,000 people live in Milwaukee. The same could be said of assaults. Homicides are the least frequently committed crime (whew), though I am a bit surprised there is over twice as much arson as homicide. Maybe arson just doesn't make the news as much? I don't know. I would be happy to hear your input.
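
To put those totals on a per-resident footing, here is a rough back-of-the-envelope calculation. It uses only the rounded figures quoted above (over 140,000 thefts, almost 13 years, about 600,000 residents), so treat the result as an order-of-magnitude estimate and not an official rate.

```python
# Rounded figures quoted in the text above; none of these are exact.
thefts_reported = 140_000   # "over 140,000" thefts in the dataset
years_covered = 13          # "almost 13 years" of data
population = 600_000        # "about 600,000" residents

# Average thefts per year, then scaled to a rate per 1,000 residents.
thefts_per_year = thefts_reported / years_covered
thefts_per_1000_residents = thefts_per_year / population * 1000

print(f"{thefts_per_year:,.0f} thefts per year")
print(f"{thefts_per_1000_residents:.1f} thefts per 1,000 residents per year")
```

With these inputs that works out to roughly 18 reported thefts per 1,000 residents per year.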

Next I took a look at the number of crimes reported in each month. Note that these plots are not population adjusted. However, Milwaukee's population has been relatively constant over the past 10 or so years. The first plot shows total crime.

In [33]:
def plotMonthlyStuff(data, field, color="blue"):
    dateFormat = "%Y-%m"
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter(dateFormat))
    a = data.groupby("ReportedDateTime")
    plt.plot(a[field].sum()[0:-1], label=field, c=color)
    plt.xticks(rotation=20, ha="right")
    plt.ylabel("Times Reported")
    plt.xlabel("Month")
    plt.grid()
    plt.title(field)
In [34]:
plt.figure(figsize=(15,6))
plotMonthlyStuff(data,"OnesForCounting", color="black")
plt.grid()
plt.title("Total");

Total crime has generally decreased since 2005. Crime peaks in the summer and bottoms out in the winter. Nothing too surprising.
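
One way to check the summer peak with numbers, not just by eye, would be to average the counts by calendar month. The sketch below uses a synthetic series (a cosine that I made peak in July), not the Milwaukee data, so the names and values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic monthly counts for illustration only (not the Milwaukee data):
# a cosine-shaped seasonal swing that peaks in July, over three years.
months = pd.date_range("2015-01-01", periods=36, freq="MS")
month_numbers = months.month.to_numpy()
counts = pd.Series(1000 + 200 * np.cos(2 * np.pi * (month_numbers - 7) / 12),
                   index=months)

# Average count for each calendar month (1 = January ... 12 = December).
by_month = counts.groupby(counts.index.month).mean()

peak_month = by_month.idxmax()
low_month = by_month.idxmin()
print(peak_month, low_month)
```

On the real data, the same grouping could be applied to the monthly totals that `plotMonthlyStuff` plots.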

Next we look at each type of crime over time. Theft is going down, which is probably the main driver of the downward trend in the previous graph. Assault is going up; it would be interesting to find out why. Criminal damage is going down, and the rest seem pretty constant.
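
Because the seasonal swings are large, trends like these can be easier to judge after smoothing. A 12-month rolling mean is one common option; this is a sketch on synthetic data, not something applied in the plots here.

```python
import pandas as pd

# Synthetic monthly counts for illustration only: a fixed seasonal pattern
# repeated for three years, with no underlying trend.
seasonal_pattern = [0, 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, 0]
months = pd.date_range("2015-01-01", periods=36, freq="MS")
counts = pd.Series([100 + s for s in seasonal_pattern] * 3, index=months)

# A 12-month rolling mean averages over one full seasonal cycle,
# so what remains is the longer-term level.
smoothed = counts.rolling(window=12).mean().dropna()
print(smoothed.min(), smoothed.max())
```

For this trend-free series the smoothed line is flat, so any slope that survives the same smoothing on real data would reflect a change in level and not the season.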

In [35]:
plt.figure(figsize=(14,18))
for iField in range(categoryTotals.index.values.shape[0]):
    plt.subplot(categoryTotals.index.values.shape[0], 1, iField+1)
    plotMonthlyStuff(data,categoryTotals.index.values[iField], color=colors[iField])
    plt.grid()
plt.tight_layout()
In [36]:
# Sum only the offense indicator columns for the gun-related rows
gunCategoryTotals = gunData.loc[:, "Arson":"VehicleTheft"].sum().sort_values(ascending=False)

Next I wanted to look only at the crimes where a gun was used as a weapon. This required some pretty rough data cleaning, so the data here is definitely inaccurate to some degree; take it with a grain of salt. I think the main takeaway is that gun crime has been going up, with the exception of the last two years.
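
The cleaning relies on a hand-built list of exact `WeaponUsed` strings, which has to enumerate every combination. A possibly less brittle alternative, offered as a sketch and not what was done for these plots, is to split each value into tokens and check whether any token is a gun code. The token set below is an assumption based on reading the codes in that list; `BBGN` is left out here, so results would differ slightly from the hand-built list, which includes strings such as 'BBGN/BLUNT'.

```python
import re

# Assumed gun codes, taken from reading the WeaponUsed strings in the list;
# adjust this set if the dataset's code book says otherwise.
GUN_TOKENS = {"HANDGUN", "FIREARM", "OTHER FIREARM", "SHOTGUN", "RIFLE",
              "GUN", "HANDG", "SHTGN"}

def mentions_gun(weapon_used):
    """Return True if any '/'- or ','-separated token is a gun code."""
    if not isinstance(weapon_used, str):
        return False  # missing values (NaN/None) count as "no gun recorded"
    tokens = {token.strip() for token in re.split(r"[/,]", weapon_used)}
    return not GUN_TOKENS.isdisjoint(tokens)

examples = ["HANDG/KNIFE", "FIREARM,PERSONAL WEAPON", "HANDS", None,
            "HANDS/GUN/KITUT/"]
flags = [mentions_gun(value) for value in examples]
print(flags)
```

With pandas, the same function could be used as `data.WeaponUsed.apply(mentions_gun)` to build the mask.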

In [37]:
plt.figure(figsize=(15,6))
plotMonthlyStuff(gunData,"OnesForCounting", color="black")
plt.title("Total - Gun Related Crimes");

Looking at each gun crime category, we see that assault and robbery are the main drivers of gun crime.

In [38]:
plt.figure(figsize=(14,18))
for iField in range(gunCategoryTotals.index.values.shape[0]):
    plt.subplot(gunCategoryTotals.index.values.shape[0], 1, iField+1)
    plotMonthlyStuff(gunData,categoryTotals.index.values[iField], color=colors[iField])
    plt.grid()
plt.tight_layout()

The next figure shows the weapons used in homicide. Yes, guns do kill people.

In [40]:
weaponsList = data.iloc[(data.Homicide == 1).values,3].dropna().apply(lambda string : string.split("/")).values.tolist()
flat_list = np.array([item for sublist in weaponsList for item in sublist])
np.unique(flat_list)
categories = pd.Categorical(flat_list)
categoryCounts = categories.value_counts().sort_values(ascending=False)
plt.figure(figsize=(12,9))
plt.bar(categoryCounts.index.values.tolist(),height=categoryCounts.values)
plt.xticks(rotation=40, ha="right");
In [41]:
def getXAndYForScatter(x):
    y = np.zeros(x.shape[0])
    for iUniqueX in np.unique(x):
        y[x == iUniqueX] = np.cumsum([x == iUniqueX])[x == iUniqueX] - np.floor((np.sum([x == iUniqueX])/2))
    return y
In [42]:
def makeCustomLines(colorsUsed, names, day0):
    customLines = []
    for iName in range(len(names)):
        customLines.append(plt.Line2D([day0],[0],marker="o", c=colorsUsed[iName], label=names[iName]))
    return customLines
In [43]:
x =gunData.iloc[:,13:].stack()
crime = pd.Series(pd.Categorical(x[x!=0].index.get_level_values(1)))
In [44]:
colorsUsed = ['tab:blue','tab:orange','tab:green','tab:red','tab:purple','tab:brown','tab:pink','tab:gray','tab:olive','tab:cyan']
names = ["Arson","AssaultOffense","Burglary","CriminalDamage","Homicide","LockedVehicle","Robbery","SexOffense","Theft","VehicleTheft"]
def dummyToColor(df) :
    if df["Arson"]:
        return 'tab:blue'
    elif df.AssaultOffense:
        return 'tab:orange'
    elif df.Burglary:
        return 'tab:green'
    elif df.CriminalDamage:
        return 'tab:red'
    elif df.Homicide:
        return 'tab:purple'
    elif df.LockedVehicle:
        return 'tab:brown'
    elif df.Robbery:
        return 'tab:pink'
    elif df.SexOffense:
        return 'tab:gray'
    elif df.Theft:
        return 'tab:olive'
    elif df.VehicleTheft:
        return 'tab:cyan'
In [45]:
colors = []
for iRow in range(gunData.shape[0]):
    colors.append(dummyToColor(gunData.iloc[iRow,:]))
colors = np.array(colors)
In [ ]:
yAxisStepSize = 30
xTicks = 12
dateFormat = '%Y-%m'#'%Y-%m-%d'

plt.figure(figsize=(30,30))
is2017 = gunData.index.year == 2017  # rows reported in 2017
gunData2017 = gunData[is2017]
colors2017 = colors[is2017]  # keep the colors aligned with the 2017 rows
counts = getXAndYForScatter(gunData2017.ReportedDateTime.values)
plt.scatter(gunData2017.ReportedDateTime.values, counts, s=5, c=colors2017)
customLines = makeCustomLines(colorsUsed, names, gunData.ReportedDateTime.values[0])
plt.gca().xaxis.set_major_locator(plt.MaxNLocator(xTicks))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter(dateFormat))
# np.int was removed from NumPy; the built-in int does the same job here
plt.yticks(np.arange(counts.min(),counts.max() + 1, yAxisStepSize),np.abs(np.arange(counts.min(),counts.max() + 1, yAxisStepSize)).astype(int))
#plt.gca().xaxis.set_major_locator(mdates.DayLocator())
plt.legend(handles=customLines)
#plt.grid(axis="y")
plt.savefig("GunViolence.svg")
In [49]:
import geopandas as gpd  # provides gpd.read_file, used in the next cell
In [57]:
sf = gpd.read_file("/Users/kaftand/Documents/MachineLearningMarquette/kaggle/gunViolenceData/alderman.shp")
In [85]:
sf["crime"] = np.zeros((15))
for i in range(15):
    sf.loc[i,("crime")] = np.sum((data.ALD == sf.loc[i, ("ALD")]).values)
    sf.loc[i,("gunCrime")] = np.sum((gunData.ALD == sf.loc[i, ("ALD")]).values)

I wanted to visualize the location of crimes on a map, but I ran into two problems. 1: the crime data is organized geographically by aldermanic district, which is less familiar than neighborhoods. 2: I could not find population data for aldermanic districts, so none of these maps are population adjusted, which makes them potentially misleading. Please keep that in mind; if there are more people in a district, there are going to be more crimes regardless of how dangerous the district actually is. If you are totally lost on the aldermanic map, Wisconsin Avenue runs right through district 4.
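
To show why the missing population adjustment matters, here is a tiny sketch with made-up numbers. The district labels, counts, and populations are all hypothetical and are not Milwaukee figures.

```python
# Made-up numbers for illustration only; these are NOT real Milwaukee figures.
crime_counts = {"A": 9000, "B": 6000}      # reported crimes per district
populations = {"A": 60_000, "B": 30_000}   # residents per district (hypothetical)

# Crimes per 1,000 residents in each district.
rates = {d: crime_counts[d] / populations[d] * 1000 for d in crime_counts}

most_crimes = max(crime_counts, key=crime_counts.get)
highest_rate = max(rates, key=rates.get)
print(most_crimes, highest_rate)
```

In this toy case district A has more crimes in total, but district B has the higher rate per resident, which is exactly the kind of reversal the raw-count maps cannot show.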

In [89]:
f, ax = plt.subplots(1, figsize=(12, 12))
sf.plot(ax=ax, column="crime", legend=True)
plt.axis('equal')
plt.axis("off")
for idx, row in sf.iterrows():
    # pass the label positionally: the `s` keyword was renamed to `text` in newer matplotlib
    plt.annotate(row['ALD'], xy=row.geometry.centroid.coords[0],
                 horizontalalignment='center')

I repeated this map for gun crimes:

In [90]:
f, ax = plt.subplots(1, figsize=(12, 12))
sf.plot(ax=ax, column="gunCrime", legend=True)
plt.axis('equal')
plt.axis("off")
for idx, row in sf.iterrows():
    # pass the label positionally: the `s` keyword was renamed to `text` in newer matplotlib
    plt.annotate(row['ALD'], xy=row.geometry.centroid.coords[0],
                 horizontalalignment='center')

That's all for now. Please share some feedback on whatever platform you saw this post. Thanks for reading - David

In [92]:
from IPython.display import HTML

HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')