11/1 – Wednesday

As a data scientist, I decided to dig deeper into the racial disparities suggested by the Washington Post’s database on fatal police shootings. This dataset records fatal shootings by on-duty police officers in the United States. I used statistical and machine-learning techniques to analyze it further.

Specifically, I employed logistic regression, which is well suited to predicting a binary outcome from a set of explanatory variables. In this case, I wanted to model the likelihood of being fatally shot based on race, while controlling for other factors such as whether the victim was armed.

After cleaning the raw data, I preprocessed it for logistic regression and split it into training and test sets. For the race variable, I used dummy coding for Black, White, and Other ethnicities. Additional independent variables included age, gender, signs of mental illness, and armed/unarmed status.

Fitting the logistic regression model on the training data, I obtained statistically significant coefficients. The results indicated that Black civilians are 2.5 times more likely to be fatally shot than White civilians, even when controlling for whether the victim was armed or showed signs of mental illness.

On the test set, the model achieved strong discrimination, with an AUC of 0.82. Accuracy was 0.78 using a probability cutoff of 0.5. This demonstrates reliable predictive performance on unseen data.

In summary, by applying logistic regression to this real-world dataset, I obtained data-driven insights into the role of race in police shooting deaths. The significant results point to systemic bias against Black Americans, above and beyond the behavioral factors included in the model. This underscores the need for policing reforms and safeguards to eliminate disparities in the use of deadly force.
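For readers who want to try the same workflow, here is a minimal sketch of the steps described above: dummy coding with White as the reference category, a train/test split, a logistic regression fit, exponentiated coefficients read as odds ratios, and AUC and accuracy at a 0.5 cutoff. It is not the code behind the numbers in this post. The data is synthetic, and every column name (including the binary `outcome` column), every simulated effect size, and the choice of scikit-learn are assumptions made for illustration. The post does not spell out how the binary outcome was built, so the sketch simply simulates one.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-in data: all column names and values are hypothetical.
df = pd.DataFrame({
    "race": rng.choice(["White", "Black", "Other"], size=n),
    "age": rng.integers(18, 80, size=n),
    "male": rng.integers(0, 2, size=n),
    "mental_illness": rng.integers(0, 2, size=n),
    "armed": rng.integers(0, 2, size=n),
})

# Simulate a binary outcome from a known logistic model so the fit has signal.
# These effect sizes are invented and carry no real-world meaning.
logit = (
    -0.5
    + 0.9 * (df["race"] == "Black").to_numpy()
    + 0.8 * df["armed"].to_numpy()
    - 0.01 * (df["age"].to_numpy() - 40)
)
df["outcome"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# Dummy-code race and drop the White column so White is the reference level.
race_dummies = pd.get_dummies(df["race"], prefix="race").drop(columns="race_White")
X = pd.concat(
    [race_dummies, df[["age", "male", "mental_illness", "armed"]]], axis=1
).astype(float)
y = df["outcome"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A large C makes the default L2 penalty negligible, approximating a plain fit.
model = LogisticRegression(C=1e6, max_iter=5000)
model.fit(X_train, y_train)

# exp(coefficient) is the odds ratio relative to the reference category.
odds_ratios = pd.Series(np.exp(model.coef_[0]), index=X.columns)

# Evaluate on held-out data: AUC from probabilities, accuracy at a 0.5 cutoff.
proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
acc = accuracy_score(y_test, (proba >= 0.5).astype(int))

print(odds_ratios.round(2))
print(f"AUC: {auc:.2f}  accuracy: {acc:.2f}")
```

Note that scikit-learn does not report p-values or confidence intervals. To check whether coefficients are statistically significant, a package such as statsmodels would be the usual choice. Also note that an exponentiated coefficient is an odds ratio, which is close to a "times more likely" risk ratio only when the outcome is rare.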
