The purpose of this research is to determine the trends of homicide
in the United States and, furthermore, in Ohio. Through the close
observation of data frames and manipulation of the data, a “unsolved”
homicide rate was calculated, which takes the number of unsolved
homicides in a region over the total number of homicides there. While
California, Texas, and New York are the states with the highest number
of deaths, the states with the highest rate of unsolved homicides are
New York, Maryland, and Illinois. A similar difference can be seen in
Ohio’s counties: Cuyahoga, Franklin, and Hamilton have the highest count
of homicides while Knox, Auglaize, and Paulding have the highest
unsolved rates. Through graphical analysis, we see that African American
males between the ages of 18 to 30 in Ohio face homicide cases that go
unsolved more often than others. Finally, almost two-thirds of all
homicides in Ohio are committed using some sort of firearm.
Variables considered throughout this analysis include:
• City &
State of Homicide
• Year & Month of Homicide
• Whether the
case was solved or not
• Victim Sex, Race, and Age
•
Perpetrator Sex, Race, and Age
• Relation
• & Weapon
Used
Every year, at least 5,000 perpetrators get away with murder; this
means that, today, about a third of homicide cases go unsolved. National
statistics on murder are estimates and projections based on incomplete
police reports, as no agency’s are assigned to actually monitor failed
cases.
The Murder
Accountability Project is a non-profit organization that compiles
data from federal, state, and local governments about unsolved homicides
in order to educate Americans about the importance of accurately
accounting for these failed cases. All of the data is gathered by Thomas
Hargrove, a retired investigative journalist as well as former White
House correspondent. The data used in this analysis was obtained from Kaggle.
There are 628384 observations in our data once we drop Alaska,
Hawaii, and the District of Columbia. Overall, there is about a rate of
29.4% for unsolved homicide cases in the United States. Specifically,
New York, Maryland, Illinois, Massachusetts, and California have the
highest state wide rates. On the other hand, we see the most solved
cases coming out of North Dakota, Montana, South Dakota, South Carolina,
and Idaho. I would say that these results make sense logically; we see
more murders going unsolved in areas with denser populations. As sad as
it is to say, more people would probably notice a person going missing
in a small North Dakota town than bustling New York City. One might
argue that smaller, spread out towns would make it easier to get away
with murder, but our numbers seem to dismiss this belief.
There is a dramatic difference seem between the overall amount of
homicides in California compared to any other state. With a total of
99,783 murders, they make up about 15.9% of the data. To see the
difference, the next highest count is in Texas, with only 62,095
homicide cases appearing in the dataset. With a population already
accounting for about 12% of the United States, it was not a surprise
when California had the highest amount of homicides. For a bit of
background on United States population density, most of the American
population lives within a 100 miles radius from any coast, so we can
infer that the inland states (such as the Dakotas) probably just have
less people to commit crimes in general.
Ohio is rated number
11 for most counts of homicides in the United States. Further analysis
will be applied to the Ohio data as a case study.
The Ohio data contains 19158 observations. About 28.3% of homicide
cases remain unsolved. We would suspect the highest number of unsolved
cases to correspond to the counties with the highest population, just
due to the sheer amount of people located there. These would be the
counties with cities like Columbus, Toledo, Cleveland, Cincinnati, and
Dayton; so, Franklin, Lucas, Cuyahoga, Hamilton, and Montgomery,
respectively. However, we see that the counties with the highest rate of
unsolved homicides are Knox, Auglaize, and Paulding. When observing the
data more closely in order to determine why this is, we see that all
three of the counties have less than ten homicides observed in general.
So, this is making the rates look a lot larger than they are when
comparing to counties with thousands of homicides.
Through the
distribution of victim sex, we see that almost three times the amount of
males than females were victims of homicide in Ohio. About 24.26% of
female homicide cases remain unsolved, while cases with male victims are
about 29.70% unsolved.
The majority of homicide cases in Ohio
involve victims who are either White or Black. This is most likely
because the majority of the population of the state identify as one of
the two races, so let us consider primarily consider these. Based on the
Ohio Census, we see
that approximately 81% of the population is White and 13% is Black.
However, about 41% of homicides in Ohio involve White victims and a
whopping 58% of cases invovle Black victims.
The highest
amount of homicides happen to victims between the age of 18-30, with the
amount of cases decreasing exponentially as the victims get older. We
do, however, see about 11% of cases occur to victims who are minors;
though, these cases, at least visually, appear to be solved more often.
The overwhelming majority of murders in Ohio involve death by
a firearm. In order to more accurately visualize the distribution of
murder weapons, a pie chart is created. When firearms are removed, the
next three highest categories are : Knives, Blunt Objects, and Unknown.
Unfortunately, I am not sure if this says much about Ohio specifically,
as firearms seem to be the weapon of choice all over the United States
due to current gun regulations.
A major limitation of this data is the amount of crimes that remain
unreported. As we have already seen, not every homicide is able to be
solved. However, there are also plenty of homicides that just may never
be reported, due to factors such as no body ever being found or person
ever being reported missing.
Another limitation is that the
rates of unsolved homicides were not calculated with respect to
population, whether it be for state or for county. This makes some
unsolved homicides rates appear inflated, when really, they could be
better explained by the lack of homicides occurring there in general.
Finally, the “type” variable is so non-specific and could be
better developed. There is one category that is “murder or
manslaughter’, which I think could explain more homicide trends if split
into two. Manslaughter is defined as the unintentional killing of
another person and is often viewed as less severe than murder. By
separating the two, we might be able to find other underlying trends.
For example, there may be locations where murder is more prevalent than
manslaughter, as well as different weapons used between the two.
A possible future study using this same data could be seeing if we
can detect the presence of a serial killer. This could be potentially
done through a closer analysis of the precise locations of the murders.
While we do not have access to specific coordinates of where bodies may
have been found, we can focus on cities with high counts of unsolved
homicides.
Further, we could see which of these homicides had
similar weapons used, as well as similar victim characteristics. Most
serial killers target a specific type of person, so race/age/gender
could help us see any trends. While we can keep the year and months of
the incidents in mind, there are times where serial killers may remain
dormant for years.
While we focused on unsolved homicides in
our data, we were provided with characteristics of perpetrators for
those crimes that were solved. This could assist us in further profiling
potential killers, as we could see if there are similar trends between
them. Also, there could be some perpetrators who were caught for one
homicide, yet never confessed to another they did. We could use the same
idea of seeing victim trends and times to see if they connect to any
known perpetrators.
• Dataset
• Murder Accountability
Project
• Ohio
Census
• Special shout out to Dr. Chen for all of her time and
support ! :)
---
title: "Unsolved Homicides in the United States"
output:
flexdashboard::flex_dashboard:
theme:
version: 4
bootswatch: pulse
navbar-bg: "#F393AB"
orientation: columns
vertical_layout: fill
source_code: embed
---
<style>
.chart-title { /* chart_title */
font-size: 18px;
font-family: Helvetica;
}
body{ /* Normal */
font-size: 16px;
}
</style>
```{r setup, include=FALSE}
library(flexdashboard)
```
Introduction
================================
Column {data-width=500}
-----------------------------------------------------------------------
### Abstract
```{r data}
library(pacman)
p_load(ggplot2, tidyverse, maps, plotly, viridis)
## reading the dataset and dropping irrelevant data
crime <- read.csv("/Users/erin/Desktop/R/data/crime.csv")
crime <- crime %>%
rename(Type = Crime.Type, Solved = Crime.Solved, VicSex = Victim.Sex,
VicAge = Victim.Age, VicRace = Victim.Race, PerpSex = Perpetrator.Sex,
PerpAge = Perpetrator.Age, PerpRace = Perpetrator.Race) %>%
select(-c(Record.ID, Agency.Code, Agency.Name, Agency.Type,
Incident, Record.Source, Victim.Ethnicity,
Perpetrator.Ethnicity, Victim.Count, Perpetrator.Count,
Record.Source))
## remove hawaii alaska and dc for map purposes, still enough data points
crime <- crime[crime$State != "Hawaii" & crime$State != "Alaska" & crime$State !="District of Columbia",]
## combining firearm values
crime$Weapon[crime$Weapon=="Handgun"]<-"Firearm"
crime$Weapon[crime$Weapon=="Shotgun"]<-"Firearm"
crime$Weapon[crime$Weapon=="Rifle"]<-"Firearm"
crime$Weapon[crime$Weapon=="Gun"]<-"Firearm"
```
The purpose of this research is to determine the trends of homicide in the United States and, furthermore, in Ohio. Through the close observation of data frames and manipulation of the data, a "unsolved" homicide rate was calculated, which takes the number of unsolved homicides in a region over the total number of homicides there. While California, Texas, and New York are the states with the highest number of deaths, the states with the highest rate of unsolved homicides are New York, Maryland, and Illinois. A similar difference can be seen in Ohio's counties: Cuyahoga, Franklin, and Hamilton have the highest count of homicides while Knox, Auglaize, and Paulding have the highest unsolved rates. Through graphical analysis, we see that African American males between the ages of 18 to 30 in Ohio face homicide cases that go unsolved more often than others. Finally, almost two-thirds of all homicides in Ohio are committed using some sort of firearm.
<br>
<br>
Variables considered throughout this analysis include:
<br>
• City & State of Homicide
<br>
• Year & Month of Homicide
<br>
• Whether the case was solved or not
<br>
• Victim Sex, Race, and Age
<br>
• Perpetrator Sex, Race, and Age
<br>
• Relation
<br>
• & Weapon Used
Column {data-width=650}
-----------------------------------------------------------------------
### Background Information
Every year, at least 5,000 perpetrators get away with murder; this means that, today, about a third of homicide cases go unsolved. National statistics on murder are estimates and projections based on incomplete police reports, as no agency's are assigned to actually monitor failed cases.
<br>
<br>
The [Murder Accountability Project](https://www.murderdata.org) is a non-profit organization that compiles data from federal, state, and local governments about unsolved homicides in order to educate Americans about the importance of accurately accounting for these failed cases. All of the data is gathered by Thomas Hargrove, a retired investigative journalist as well as former White House correspondent. The data used in this analysis was obtained from [Kaggle](https://www.kaggle.com/datasets/murderaccountability/homicide-reports).
### Research Question(s)
1) What is the distribution of unsolved homicides across the United States?
2) More specifically, what is the distribution of unsolved homicides in Ohio?
3) What qualities are most prevalent in unsolved murder victims?
a) What is the distribution of Victim Sex?
b) What is the distribution of Victim Race?
c) What is the distribution of Victim Age?
d) What is the distribution of murder weapons?
Exploring the Data
================================
Column {data-width=500, .tabset}
-----------------------------------------------------------------------
### US Unsolved Map
```{r map1}
## creating us map
us_map <- map_data("state") %>%
filter(region != "district of columbia") %>%
select(-subregion)
us_map$region <- unname(sapply(us_map$region, str_to_title))
## finding total count of unsolved homicides
## 0 - solved, 1 - unsolved
crime$Solved <- crime$Solved %>%
recode("Yes" = 0,
"No" = 1)
usUnsolved <- aggregate(crime$Solved, list(crime$State), FUN=sum)
## finding total count of homicides
## joining data and finding unsolved rate
us_count <- crime %>%
group_by(State) %>%
mutate(count = n()) %>%
select(State, count) %>%
distinct()
us_count <- us_count %>%
left_join(usUnsolved, by = c("State" = "Group.1"))
colnames(us_count)[colnames(us_count) == "x"] <- "unsolvCount"
us_count <- us_count %>%
mutate(unsolvRate = round((unsolvCount / count), 5))
## left joining datasets
crime_map <- us_count %>%
left_join(us_map, by = c("State" = "region"))
us_count <- crime_map %>%
group_by(State) %>%
summarise(long = mean(long), lat = mean(lat), count = mean(count),
unsolvCount = mean(unsolvCount), unsolvRate = mean(unsolvRate))
us_count$abb <- state.abb[-c(2,11)]
## creating united states graph
g1 <- ggplot(crime_map, aes(x = long, y = lat)) +
geom_polygon(aes(group = group, fill = unsolvRate,
text = paste0("State: ", State, "\n",
"Total Homicides: ", count, "\n",
"Unsolved Homicides: ", unsolvCount, "\n",
"Unsolved Homicide Rate: ", unsolvRate)), colour = "white") +
scale_fill_viridis_c(option = "F") +
theme_minimal() +
theme(axis.title.x=element_blank(), axis.text.x=element_blank(),
axis.ticks.x=element_blank(), axis.title.y=element_blank(),
axis.text.y=element_blank(), axis.ticks.y=element_blank(),
panel.grid.major=element_blank(),
panel.background=element_blank()) +
labs(fill = "Unsolved Rate", title = "Distribution of US Homicide Cases")
ggplotly(g1, tooltip="text")
```
### US Count Map
```{r map2}
## bubble map
us_top15 <- us_count %>%
arrange(desc(count)) %>%
head(15)
us_top15 <- semi_join(crime_map, us_top15, by = "State")
top15_map <- us_top15 %>%
group_by(State) %>%
summarise(long = mean(long), lat = mean(lat), count = mean(count))
g3 <- crime_map %>%
ggplot() +
geom_polygon(aes(x = long, y = lat, group = group, text = State),
fill = "grey", alpha = 0.5, colour = "white") +
geom_point(data = top15_map,
aes(x = long, y = lat, size = count, color = count, alpha = count,
text = paste0(State, ":", count))) +
scale_size_continuous(range=c(1,25)) +
scale_color_viridis(option="F", name = "Number of Homicides") +
coord_map() +
theme_minimal() +
theme(axis.title.x=element_blank(), axis.text.x=element_blank(),
axis.ticks.x=element_blank(), axis.title.y=element_blank(),
axis.text.y=element_blank(), axis.ticks.y=element_blank(),
panel.grid.major=element_blank(),
panel.background=element_blank()) +
labs(title = "Top 15 Counts of Homicides")
ggplotly(g3, tooltip = "text")
```
### Glimpse of the Data
```{r glimpse}
glimpse.crime <- crime %>%
select(State, Year, Month, Solved, VicSex, VicRace, VicAge, Weapon) %>%
arrange(Year)
glimpse.crime <- glimpse.crime[sample(1:nrow(glimpse.crime), 1000,
replace=FALSE),]
DT::datatable(glimpse.crime, colnames = c("State", "Year", "Month", "Unsolved",
"Victim Sex", "Victim Race",
"Victim Age", "Weapon"))
```
### Raw Counts
```{r counts}
count_table <- us_count %>%
arrange(desc(count)) %>%
select(State, count, unsolvCount)
DT::datatable(count_table, colnames = c("State", "Homicide Count", "Unsolved Count"))
```
Ohio Analysis
================================
Column {data-width=500, .tabset}
-----------------------------------------------------------------------
### Ohio Unsolved Map
```{r map3}
## narrowing down to only ohio data
ohio <- crime[crime$State=="Ohio",]
## same process for map making as us map
ohio_county <- map_data("county", region = "ohio")
ohio_county$subregion <- str_to_title(ohio_county$subregion)
ohio_count <- ohio %>%
group_by(City) %>%
mutate(count = n()) %>%
select(City, count) %>%
distinct()
ohUnsolved <- aggregate(ohio$Solved, list(ohio$City), FUN=sum)
ohio_count <- ohio_count %>%
left_join(ohUnsolved, by = c("City" = "Group.1"))
colnames(ohio_count)[colnames(ohio_count) == "x"] <- "unsolvCount"
ohio_count <- ohio_count %>%
mutate(unsolvRate = round((unsolvCount / count), 5))
ohio_map <- ohio_count %>%
left_join(ohio_county, by = c("City" = "subregion"))
ohio_count <- ohio_map %>%
group_by(City) %>%
summarise(long = mean(long), lat = mean(lat), count = mean(count),
unsolvCount = mean(unsolvCount), unsolvRate = mean(unsolvRate))
g2 <- ggplot(ohio_map, aes(x = long, y = lat)) +
geom_polygon(aes(group = group, fill = unsolvRate,
text = paste0("County: ", City, "\n",
"Total Homicides: ", count, "\n",
"Unsolved Homicides: ", unsolvCount, "\n",
"Unsolved Homicide Rate: ", unsolvRate)), colour = "white") +
scale_fill_viridis_c(option = "F") +
theme_minimal() +
theme(axis.title.x=element_blank(), axis.text.x=element_blank(),
axis.ticks.x=element_blank(), axis.title.y=element_blank(),
axis.text.y=element_blank(), axis.ticks.y=element_blank(),
panel.grid.major=element_blank(),
panel.background=element_blank()) +
labs(fill = "Unsolved Rate", title = "Distribution of Ohio Homicide Cases")
ggplotly(g2, tooltip="text")
```
### Ohio Count Map
```{r map4}
## bubble map
ohio_top10 <- ohio_count %>%
arrange(desc(count)) %>%
head(10)
ohio_top10 <- semi_join(ohio_map, ohio_top10, by = "City")
top10oh_map <- ohio_top10 %>%
group_by(City) %>%
summarise(long = mean(long), lat = mean(lat), count = mean(count))
g4 <- ohio_map %>%
ggplot() +
geom_polygon(aes(x = long, y = lat, group = group, text = City),
fill = "grey", alpha = 0.5, colour = "white") +
geom_point(data = top10oh_map,
aes(x = long, y = lat, size = count, color = count, alpha = count,
text = paste0(City, ":", count))) +
scale_size_continuous(range=c(1,25)) +
scale_color_viridis(option="F", name = "Number of Homicides") +
coord_map() +
theme_minimal() +
theme(axis.title.x=element_blank(), axis.text.x=element_blank(),
axis.ticks.x=element_blank(), axis.title.y=element_blank(),
axis.text.y=element_blank(), axis.ticks.y=element_blank(),
panel.grid.major=element_blank(),
panel.background=element_blank()) +
labs(title = "Top 10 Counts of Homicides")
ggplotly(g4, tooltip = "text")
```
### Ohio Raw Counts
```{r ohcounts}
ohcount_table <- ohio_count %>%
arrange(desc(count)) %>%
select(City, count, unsolvCount)
DT::datatable(ohcount_table, colnames = c("County", "Homicide Count", "Unsolved Count"))
```
Column {data-width=500, .tabset}
-----------------------------------------------------------------------
### VicSex
```{r vicsex}
## reverting solved back to character values for graph purposes
ohio$Solved[ohio$Solved == 0]<-"Yes"
ohio$Solved[ohio$Solved == 1]<-"No"
## bar charts of victim characteristics
## dropping unknown in the graph just bc only 52 observations
p1 <- ggplot(ohio, aes(x = VicSex, fill = as.factor(Solved))) + geom_bar() +
scale_fill_discrete(name = "Solved?", type = c("#ffcde3", "#aa6785")) +
scale_x_discrete(limits = c("Female", "Male")) +
labs(title = "Distribution of Victim Sex in Ohio", x = "Victim Sex", y = "Number of Homicides") +
theme_classic()
ggplotly(p1)
```
### VicRace
```{r vicrace}
## renaming some races to better fit graph
ohio$VicRace[ohio$VicRace=="Asian/Pacific Islander"]<-"Asian/PI"
ohio$VicRace[ohio$VicRace=="Native American/Alaska Native"]<-"Native Amer"
p2 <- ggplot(ohio, aes(x = VicRace, fill = as.factor(Solved))) + geom_bar() +
scale_fill_discrete(name = "Solved?", type = c("#ffcde3", "#aa6785")) +
labs(title = "Distribution of Victim Race in Ohio", x = "Victim Race", y = "Number of Homicides") +
theme_classic()
ggplotly(p2)
```
### VicAge
```{r age}
## some ages entered incorrectly as 998 so we need to fix this...
## just gonna narrow down data
ohio_age <- ohio %>%
mutate(AgeGroup = case_when(
VicAge < 18 ~ "18-",
18 <= VicAge & VicAge < 30 ~ "18-30",
30 <= VicAge & VicAge < 40 ~ "30-40",
40 <= VicAge & VicAge < 50 ~ "40-50",
50 <= VicAge & VicAge < 60 ~ "50-60",
60 <= VicAge & VicAge < 70 ~ "60-70",
70 <= VicAge & VicAge < 80 ~ "70-80",
80 <= VicAge & VicAge < 90 ~ "80-90")) %>%
na.omit()
p3 <- ggplot(ohio_age, aes(x = AgeGroup, fill = as.factor(Solved))) + geom_bar() +
scale_fill_discrete(name = "Solved?", type = c("#ffcde3", "#aa6785")) +
labs(title = "Distribution of Victim Age in Ohio", x = "Age Group", y = "Number of Homicides") +
coord_flip() +
theme_classic()
ggplotly(p3)
```
### Weapon
```{r weapon}
p4 <- ggplot(ohio, aes(x = Weapon, fill = as.factor(Solved))) + geom_bar() +
scale_fill_discrete(name = "Solved?", type = c("#ffcde3", "#aa6785")) +
labs(title = "Distribution of Murder Weapons Used in Ohio", x = "Weapon", y = "Number of Homicides") +
coord_flip() +
theme_classic()
ggplotly(p4, tooltip = "text")
```
### Weapon*
```{r weapon2}
### pie chart to see more of the weapon distribution since so many types
weapon_count <- count(ohio, Weapon)
weapon_count$percent <- round(weapon_count$n/sum(weapon_count$n)*100,2)
colList = c("#FEF6F8","#FAD1DB","#B8143D","#F07594","#EB4770","#E6194C",
"#F5A3B8","#A11236","#8A0F2E","#5C0A1F","#450817","#2E050F")
pie <- plot_ly(weapon_count, labels = ~Weapon, values = ~n, type = 'pie', marker=list(colors=colList))
pie <- pie %>% layout(title = 'Weapon Distribution in Ohio',
xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
pie
```
Discussion
================================
Column {data-width=550, .tabset}
-----------------------------------------------------------------------
### US Results
There are 628384 observations in our data once we drop Alaska, Hawaii, and the District of Columbia. Overall, there is about a rate of 29.4% for unsolved homicide cases in the United States. Specifically, New York, Maryland, Illinois, Massachusetts, and California have the highest state wide rates. On the other hand, we see the most solved cases coming out of North Dakota, Montana, South Dakota, South Carolina, and Idaho. I would say that these results make sense logically; we see more murders going unsolved in areas with denser populations. As sad as it is to say, more people would probably notice a person going missing in a small North Dakota town than bustling New York City. One might argue that smaller, spread out towns would make it easier to get away with murder, but our numbers seem to dismiss this belief.
<br>
<br>
There is a dramatic difference seem between the overall amount of homicides in California compared to any other state. With a total of 99,783 murders, they make up about 15.9% of the data. To see the difference, the next highest count is in Texas, with only 62,095 homicide cases appearing in the dataset. With a population already accounting for about 12% of the United States, it was not a surprise when California had the highest amount of homicides. For a bit of background on United States population density, most of the American population lives within a 100 miles radius from any coast, so we can infer that the inland states (such as the Dakotas) probably just have less people to commit crimes in general.
<br>
<br>
Ohio is rated number 11 for most counts of homicides in the United States. Further analysis will be applied to the Ohio data as a case study.
### Ohio Results
The Ohio data contains 19158 observations. About 28.3% of homicide cases remain unsolved. We would suspect the highest number of unsolved cases to correspond to the counties with the highest population, just due to the sheer amount of people located there. These would be the counties with cities like Columbus, Toledo, Cleveland, Cincinnati, and Dayton; so, Franklin, Lucas, Cuyahoga, Hamilton, and Montgomery, respectively.
However, we see that the counties with the highest rate of unsolved homicides are Knox, Auglaize, and Paulding. When observing the data more closely in order to determine why this is, we see that all three of the counties have less than ten homicides observed in general. So, this is making the rates look a lot larger than they are when comparing to counties with thousands of homicides.
<br>
<br>
Through the distribution of victim sex, we see that almost three times the amount of males than females were victims of homicide in Ohio. About 24.26% of female homicide cases remain unsolved, while cases with male victims are about 29.70% unsolved.
<br>
<br>
The majority of homicide cases in Ohio involve victims who are either White or Black. This is most likely because the majority of the population of the state identify as one of the two races, so let us consider primarily consider these. Based on the [Ohio Census](https://www.census.gov/quickfacts/OH), we see that approximately 81% of the population is White and 13% is Black. However, about 41% of homicides in Ohio involve White victims and a whopping 58% of cases invovle Black victims.
<br>
<br>
The highest amount of homicides happen to victims between the age of 18-30, with the amount of cases decreasing exponentially as the victims get older. We do, however, see about 11% of cases occur to victims who are minors; though, these cases, at least visually, appear to be solved more often.
<br>
<br>
The overwhelming majority of murders in Ohio involve death by a firearm. In order to more accurately visualize the distribution of murder weapons, a pie chart is created. When firearms are removed, the next three highest categories are : Knives, Blunt Objects, and Unknown. Unfortunately, I am not sure if this says much about Ohio specifically, as firearms seem to be the weapon of choice all over the United States due to current gun regulations.
Column {data-width=500}
-----------------------------------------------------------------------
### Limitations
A major limitation of this data is the amount of crimes that remain unreported. As we have already seen, not every homicide is able to be solved. However, there are also plenty of homicides that just may never be reported, due to factors such as no body ever being found or person ever being reported missing.
<br>
<br>
Another limitation is that the rates of unsolved homicides were not calculated with respect to population, whether it be for state or for county. This makes some unsolved homicides rates appear inflated, when really, they could be better explained by the lack of homicides occurring there in general.
<br>
<br>
Finally, the "type" variable is so non-specific and could be better developed. There is one category that is "murder or manslaughter', which I think could explain more homicide trends if split into two. Manslaughter is defined as the unintentional killing of another person and is often viewed as less severe than murder. By separating the two, we might be able to find other underlying trends. For example, there may be locations where murder is more prevalent than manslaughter, as well as different weapons used between the two.
### Future Studies
A possible future study using this same data could be seeing if we can detect the presence of a serial killer. This could be potentially done through a closer analysis of the precise locations of the murders. While we do not have access to specific coordinates of where bodies may have been found, we can focus on cities with high counts of unsolved homicides.
<br>
<br>
Further, we could see which of these homicides had similar weapons used, as well as similar victim characteristics. Most serial killers target a specific type of person, so race/age/gender could help us see any trends. While we can keep the year and months of the incidents in mind, there are times where serial killers may remain dormant for years.
<br>
<br>
While we focused on unsolved homicides in our data, we were provided with characteristics of perpetrators for those crimes that were solved. This could assist us in further profiling potential killers, as we could see if there are similar trends between them. Also, there could be some perpetrators who were caught for one homicide, yet never confessed to another they did. We could use the same idea of seeing victim trends and times to see if they connect to any known perpetrators.
### References
• [Dataset](https://www.kaggle.com/datasets/murderaccountability/homicide-reports)
<br>
• [Murder Accountability Project](https://www.murderdata.org)
<br>
• [Ohio Census](https://www.census.gov/quickfacts/OH)
<br>
• Special shout out to Dr. Chen for all of her time and support ! :)