After evaluating several options for my third analytics project, I have decided to work with the Employee Earnings Report dataset, which the City of Boston publishes on its analytics portal.
The dataset provides a detailed breakdown of the earnings of all full-time city employees each year, including overtime pay. It covers over 30 departments and lists compensation figures like base pay, overtime pay, detail pay, and more for each employee. I chose this public sector salary and wage data because it presents opportunities for interesting analysis while allowing me to sharpen my data manipulation and modeling skills. A few high-level questions I plan to examine:
– How have total earnings across departments changed over time? Are some departments showing much higher growth than others?
– What insights can statistical modeling reveal about key drivers of overtime pay? How do factors like job type and years of experience correlate with overtime?
– Can clustering algorithms identify groups of departments with similar compensation patterns that may inform salary standardization policies?
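To make the first question concrete, here is a minimal sketch of how I might compare earnings growth across departments. I have not loaded the real file yet, so the field names (`year`, `department`, `total_earnings`) and every figure in the sample records are placeholders I made up, not values from the actual report.

```python
from collections import defaultdict

# Hypothetical sample rows: field names and amounts are invented for
# illustration and do not come from the real Employee Earnings Report.
records = [
    {"year": 2020, "department": "Police", "total_earnings": 100.0},
    {"year": 2020, "department": "Police", "total_earnings": 50.0},
    {"year": 2022, "department": "Police", "total_earnings": 120.0},
    {"year": 2022, "department": "Police", "total_earnings": 60.0},
    {"year": 2020, "department": "Library", "total_earnings": 40.0},
    {"year": 2022, "department": "Library", "total_earnings": 44.0},
]


def totals_by_department_year(rows):
    """Sum employee earnings into one total per (department, year) pair."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["department"], row["year"])] += row["total_earnings"]
    return dict(totals)


def growth_by_department(totals, start_year, end_year):
    """Return fractional growth in total earnings between two years.

    Departments missing either year, or with a zero starting total,
    are skipped because growth is undefined for them.
    """
    growth = {}
    departments = {dept for dept, _ in totals}
    for dept in sorted(departments):
        start = totals.get((dept, start_year))
        end = totals.get((dept, end_year))
        if start and end is not None:
            growth[dept] = (end - start) / start
    return growth


totals = totals_by_department_year(records)
growth = growth_by_department(totals, start_year=2020, end_year=2022)

# List departments from fastest to slowest growth.
for dept, rate in sorted(growth.items(), key=lambda item: item[1], reverse=True):
    print(f"{dept}: {rate:.1%}")
```

On the real data I would likely switch to a dataframe library and adjust for headcount changes, since a department's total can grow simply because it hired more people.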
I am still exploring additional angles of analysis for this multidimensional dataset. Techniques such as regression, clustering, and data visualization should help surface compensation trends across Boston's public sector. Now that I have selected the dataset, I’m eager to dive deeper. My next steps are cleaning and preprocessing the data, conducting initial exploratory analysis, refining the questions above into concrete analytic questions, and ultimately building models that explain what drives employee earnings.