Final Report

Authors

Kausthub Satluri, Jerry Chen, Opally Taebunyanuphap

The Dataset

Synopsis/Dataset Table

The dataset we used for our analysis is a Kaggle dataset that gathers information from every documented nuclear explosion before 2000. Each observation is a nuclear explosion, and each explosion has 16 corresponding variables. The variables are displayed in the table below:

Column | Data Type | Description | Example
Data.Source | String | Source that reported the explosion event | "DOE"
WEAPON SOURCE COUNTRY | String | Country deploying the nuclear device | "USA"
WEAPON DEPLOYMENT LOCATION | String | Region where the nuclear device was deployed | "Alamogordo"
Location.Coordinates.Latitude | Float | Latitude position | 32.54
Location.Coordinates.Longitude | Float | Longitude position | -105.57
Data.Magnitude.Body | Float | Body wave magnitude of explosion (mb) | 0.0
Data.Magnitude.Surface | Float | Surface wave magnitude of explosion (Ms) | 0.0
Location.Coordinates.Depth | Float | Depth at detonation in km (positive = below ground; negative = above ground) | -0.1
Data.Yield.Lower | Float | Explosion yield lower estimate in kilotons of TNT | 21.0
Data.Yield.Upper | Float | Explosion yield upper estimate in kilotons of TNT | 21.0
Data.Purpose | String | Purpose of detonation: COMBAT, FMS, ME, PNE, SAM, SSE, TRANSP, WE, WR | "Wr"
Data.Name | String | Name of event or bomb | "Trinity"
Data.Type | String | Method of deployment (e.g., TOWER, AIRDROP, UNDERGROUND) | "Tower"
Date.Day | Integer | Day of detonation | 16
Date.Month | Integer | Month of detonation | 7
Date.Year | Integer | Year of detonation | 1945

The following code block loads the necessary dependencies and the dataset so they can be used in later code blocks. Note that several column names in the raw CSV are misspelled (e.g., Data.Yeild.Lower, Data.Yeild.Upper, Location.Cordinates.Latitude), which is why the code throughout this report references those spellings.

library(ggplot2)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
nuclear_tests <- read.csv("nuclear_explosions.csv")

Explaining Dataset Variables

  1. Data.Source: Indicates the organization or entity that reported the nuclear explosion event. Example values include “DOE” (Department of Energy).

  2. WEAPON SOURCE COUNTRY: Identifies the country that deployed or detonated the nuclear device, such as “USA,” “USSR,” etc.

  3. WEAPON DEPLOYMENT LOCATION: Specifies the specific region, test site, or location where the nuclear device was deployed. Examples include “Alamogordo,” “NTS” (Nevada Test Site), “Bikini,” etc.

  4. Location.Coordinates.Latitude: The latitude position of the explosion site, expressed as a decimal number (e.g., 32.54).

  5. Location.Coordinates.Longitude: The longitude position of the explosion site, expressed as a decimal number (e.g., -105.57).

  6. Data.Magnitude.Body: Represents the body wave magnitude (mb) of the explosion, which is a measure of the seismic signal strength.

  7. Data.Magnitude.Surface: Represents the surface wave magnitude (Ms) of the explosion, another measure of seismic activity.

  8. Location.Coordinates.Depth: Indicates the depth at detonation in kilometers. Positive values represent below-ground detonations, while negative values indicate above-ground detonations.

  9. Data.Yield.Lower: The lower estimate of the explosion yield measured in kilotons of TNT equivalent.

  10. Data.Yield.Upper: The upper estimate of the explosion yield measured in kilotons of TNT equivalent.

  11. Data.Purpose: Indicates the purpose of the detonation, with codes such as:

  • COMBAT: Wartime use
  • FMS: Flight (missile) system tests
  • ME: Military effects
  • PNE: Peaceful nuclear explosion
  • SAM: Safety experiment
  • SSE: Storage-transportation safety experiment
  • TRANSP: Transportation
  • WE: Weapons effects testing
  • WR: Weapons related
  12. Data.Name: The name assigned to the event or bomb (e.g., “Trinity”).

  13. Data.Type: Describes the method of deployment or detonation, such as “TOWER,” “AIRDROP,” “UNDERGROUND,” etc.

  14. Date.Day: The day of the month when the detonation occurred.

  15. Date.Month: The month when the detonation occurred, represented as an integer (1-12).

  16. Date.Year: The year when the detonation occurred.

How we use the dataset

In this project, we investigate three main research questions related to nuclear testing patterns over time. First, we examine how nuclear explosion yields have changed over time by country, identifying shifts in testing strategies across different nations. Second, we analyze how the purposes of nuclear test drops have evolved, rebucketing purpose categories to capture major historical trends. Third, we explore how the location and method of nuclear test drops have shifted over time by consolidating drop types and tracking changes across decades. Together, these questions aim to uncover how nuclear testing strategies and objectives have transformed throughout modern history.

Research Question 1: How have Nuclear Explosion Yields Changed Over Time By Country?

Visualization 1: Time-Series Line Plot

Because I am analyzing how explosion yields have changed over time, I first considered that a time-series evaluation would be the best way to see the change in yield over time. I first wanted to see the overall trend in average nuclear yield, so I mutated the dataset to add a decade variable and calculated the average yield per decade for each country as well as for all countries combined. The dataset provides two yield estimates, so I took the yield to be the average of the upper and lower estimates. The following code block performs these mutations and creates the time-series line plot:

# Take Month, Day, and Year and make them into a test date variable
nuclear_data <- nuclear_tests |> 
  mutate(test_date = as.Date(paste(Date.Day, Date.Month, Date.Year, sep="/"), 
                             format = "%d/%m/%Y"))

# Calculate average yield for each country per decade
decade_avg_yields_by_country <- nuclear_data |>
  mutate(decade = floor(Date.Year/10)*10) |>
  group_by(WEAPON.SOURCE.COUNTRY, decade) |>
  summarize(avg_yield = mean((Data.Yeild.Lower + Data.Yeild.Upper)/2, na.rm = TRUE)) |>
  ungroup()
`summarise()` has grouped output by 'WEAPON.SOURCE.COUNTRY'. You can override
using the `.groups` argument.
# Calculate the overall average yield per decade
decade_total_avg_yield <- nuclear_data |>
  mutate(decade = floor(Date.Year/10)*10) |>
  group_by(decade) |>
  summarize(avg_yield = mean((Data.Yeild.Lower + Data.Yeild.Upper)/2, na.rm = TRUE)) |>
  ungroup() 

# Create the plot with country-specific lines and overall average overlay
ggplot() +
  # Country-specific lines
  geom_line(data = decade_avg_yields_by_country, 
            aes(x = decade, y = avg_yield, color = WEAPON.SOURCE.COUNTRY), 
            linetype = "dashed") +
  geom_point(data = decade_avg_yields_by_country,
             aes(x = decade, y = avg_yield, color = WEAPON.SOURCE.COUNTRY), size = 3) +
  # Overall average line
  geom_line(data = decade_total_avg_yield,
            aes(x = decade, y = avg_yield), 
            color = "black", size = 1.2) +
  geom_point(data = decade_total_avg_yield,
             aes(x = decade, y = avg_yield),
             color = "black", size = 4) +
  # Rest of the plot formatting
  labs(title = "Average Nuclear Test Yield by Country (By Decade)",
       subtitle = "Black line shows overall average across all countries",
       x = "Decade",
       y = "Average Yield per Test (kilotons)",
       color = "Country") +
  theme_minimal() 

Interpretation

The overall trend in the graph shows a significant rise in nuclear test yields from the 1940s through to the 1960s, followed by a steady decline from the 1970s into the 1980s and 1990s. This pattern reflects early breakthroughs in thermonuclear weapon design during the 1950s and early 1960s. Subsequent reductions in yield occurred largely due to international agreements, such as the Partial Test-Ban Treaty of 1963, which tremendously reduced atmospheric testing, and the Threshold Test-Ban Treaty of 1974, limiting underground explosions to 150 kilotons. Consequently, later tests shifted from large multi-megaton demonstrations to smaller, tactical detonations.

At a country-specific level, notable peaks and troughs correspond to major historical events. For example, the USSR’s spike in average yield in the 1960s can be traced to large atmospheric tests, including the infamous 50-megaton “Tsar Bomba” in 1961. Similarly, China’s high initial yields in the 1960s reflect fewer but more powerful tests, notably their 1967 thermonuclear test. The United States’ large average yield in the 1950s comes from thermonuclear tests like “Ivy Mike” and the “Castle” series, while the UK’s high yields in the same decade result from their early thermonuclear tests during Operation Grapple, after which yields sharply decreased.

Visualization 2: Scatterplot

While the previous plot was very helpful in understanding the overall trend in the evolution of nuclear yields over time, it got me wondering whether there were smaller nuances not represented by the time-series line plot. To capture those details, I decided to make a simple scatterplot of all of the nuclear explosions, colored by source country. The x-axis is the test date as defined in the previous code block, and the y-axis is the log nuclear yield of each explosion. I use a log transformation on the y-axis because there are immense variations in yield between the smallest and largest bombs; without it, the vast majority of points would cluster very close to the bottom of the graph with a few outliers far above, making it difficult to see the distribution of the points as well as the underlying relationship. Log transformations, however, cannot handle zero-yield tests (which were often done to conduct material science experiments), so the upper yield estimate was increased by 0.0005 kilotons to avoid infinite values after the log transformation while shifting the data as little as possible. With all of this in mind, the following code block creates the scatterplot as well as a linear regression line and its summary.

#scatterplot
nuclear_data |>
  ggplot(aes(x = test_date, y = log10((Data.Yeild.Lower + Data.Yeild.Upper + 0.0005)/2))) + 
  geom_point(alpha = 0.5, size = 1, aes(color = WEAPON.SOURCE.COUNTRY)) + 
  geom_smooth(method="lm") +
  labs(x = "Test Date", y = "log Yield (kilotons)",
       title = "Global Nuclear Test Yields (colored by country)", 
       color = "Country")
`geom_smooth()` using formula = 'y ~ x'

#linear model
yield_lm <- lm(log10((Data.Yeild.Lower + Data.Yeild.Upper + 0.0005)/2) ~ test_date, data = nuclear_data)
summary(yield_lm)

Call:
lm(formula = log10((Data.Yeild.Lower + Data.Yeild.Upper + 5e-04)/2) ~ 
    test_date, data = nuclear_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.9074 -0.2457 -0.1116  0.6474  3.6270 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 1.140e+00  2.632e-02  43.304  < 2e-16 ***
test_date   2.276e-05  6.878e-06   3.309 0.000951 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.179 on 2044 degrees of freedom
Multiple R-squared:  0.00533,   Adjusted R-squared:  0.004843 
F-statistic: 10.95 on 1 and 2044 DF,  p-value: 0.0009513

Interpretation

The single linear model fitted across all countries shows a statistically significant positive trend (p-value ≈ 0.00095), indicating that when looking at all tests together, there’s evidence that the log yield tended to increase slightly over time. However, this relationship is very weak (R-squared ≈ 0.0053, or 0.53%), with the date of the test explaining less than 1% of the variation in log yields. While a slight upward trend exists mathematically across the whole dataset, the date is a very poor predictor of yield overall. The large residual standard error (1.179) confirms the visual impression: there’s a massive amount of scatter (variation) in yields that the simple time trend doesn’t capture.
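
One detail worth spelling out when interpreting this model is the unit of the slope: because test_date is a Date, the coefficient is in log10 yield units per day. As a rough back-of-the-envelope conversion (a sketch only; treating a decade as about 3,652 days), the fitted trend corresponds to roughly +0.08 log10 units per decade, or about a 21% multiplicative increase in the (offset) average yield per decade:

# Sketch: convert the fitted slope from log10 units per day to a per-decade change
slope_per_day <- coef(yield_lm)["test_date"]   # about 2.28e-05 log10 units per day
slope_per_decade <- slope_per_day * 3652       # about 0.083 log10 units per decade
10^slope_per_decade                            # about 1.21, i.e. roughly a 21% increase per decade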

The scatterplot reveals country-specific observations, with the USA (purple) and USSR (pink) clearly dominating the testing, especially in the early period (roughly 1950s-1970s), exhibiting the widest range of yields, including both the lowest and the highest observed yields (log yields well above 2.5, corresponding to megaton-range weapons). There appears to be a decrease in the frequency and possibly the maximum yield of their tests in later decades (post-1970s/1980s). France (gold) conducted tests throughout much of the period, with many tests clustering between log yields of roughly 0 and 2.5, and a noticeable band of lower-yield French tests (log yield around 0) prominent from the mid-1970s onwards. The UK (light blue) and China (red) have fewer tests shown compared to the USA/USSR/France, with UK tests more concentrated earlier, often in the lower-to-mid yield range, and China’s tests more spread out chronologically, with some relatively high-yield tests visible. India (green) and Pakistan (teal) entered the nuclear testing arena much later, with their tests clustered near the end of the plotted timeframe (late 1990s) and yields appearing to be in the mid-range relative to the overall historical spread. Even within a single country (especially the USA and USSR), there is huge variability in yields over time. All of these insights suggest that fitting a separate regression for each country could help identify whether relationships between test date and test yield exist at the national level.

Statistical Analysis: Multiple Linear Regressions

(Not to be confused with multiple linear regression.) In this part, I conduct several simple linear regressions akin to the one in the previous section, but for each country individually. I omit India and Pakistan from these regressions as they do not have enough observations in the data to support a reasonable linear regression. Every other country has at least 40 observations, which should be sufficient to fit a linear model. These regressions use the same log transformation, since it decreases the spread of the data and can improve the fit of the linear model. First, I create five filtered datasets, one per country, each with all of the same variables as the original dataset.

# remove India and Pakistan
nuclear_data_no_IP <- nuclear_data |> filter(!(WEAPON.SOURCE.COUNTRY %in% c("INDIA", "PAKIST")))

# split into multiple datasets by country
list_of_country_datasets <- split(nuclear_data_no_IP, nuclear_data_no_IP$WEAPON.SOURCE.COUNTRY)

Now I will run linear regressions for each of the filtered datasets so that we can identify whether there is a positive linear relationship between test date and nuclear yield, and to characterize that relationship.

USA_lm <- lm(log10((Data.Yeild.Lower + Data.Yeild.Upper + 0.0005)/2) ~ test_date, data = list_of_country_datasets$USA)
summary(USA_lm)

Call:
lm(formula = log10((Data.Yeild.Lower + Data.Yeild.Upper + 5e-04)/2) ~ 
    test_date, data = list_of_country_datasets$USA)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.9597 -0.2841 -0.0557  0.5211  3.3816 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 1.163e+00  3.459e-02  33.632  < 2e-16 ***
test_date   6.373e-05  9.524e-06   6.692 3.62e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.107 on 1029 degrees of freedom
Multiple R-squared:  0.0417,    Adjusted R-squared:  0.04077 
F-statistic: 44.78 on 1 and 1029 DF,  p-value: 3.616e-11
USSR_lm <- lm(log10((Data.Yeild.Lower + Data.Yeild.Upper + 0.0005)/2) ~ test_date, data = list_of_country_datasets$USSR)
summary(USSR_lm)

Call:
lm(formula = log10((Data.Yeild.Lower + Data.Yeild.Upper + 5e-04)/2) ~ 
    test_date, data = list_of_country_datasets$USSR)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.4006 -0.2151 -0.1286  0.7470  3.4645 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.186e+00  4.577e-02  25.920   <2e-16 ***
test_date   -1.607e-05  1.269e-05  -1.266    0.206    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.2 on 712 degrees of freedom
Multiple R-squared:  0.002248,  Adjusted R-squared:  0.0008464 
F-statistic: 1.604 on 1 and 712 DF,  p-value: 0.2058
CHINA_lm <- lm(log10((Data.Yeild.Lower + Data.Yeild.Upper + 0.0005)/2) ~ test_date, data = list_of_country_datasets$CHINA)
summary(CHINA_lm)

Call:
lm(formula = log10((Data.Yeild.Lower + Data.Yeild.Upper + 5e-04)/2) ~ 
    test_date, data = list_of_country_datasets$CHINA)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.3259 -0.6298 -0.0479  0.4555  1.9894 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.739e+00  2.519e-01   6.901 2.28e-08 ***
test_date   -1.157e-05  4.896e-05  -0.236    0.814    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.187 on 41 degrees of freedom
Multiple R-squared:  0.001361,  Adjusted R-squared:  -0.023 
F-statistic: 0.05587 on 1 and 41 DF,  p-value: 0.8143
FRANCE_lm <- lm(log10((Data.Yeild.Lower + Data.Yeild.Upper + 0.0005)/2) ~ test_date, data = list_of_country_datasets$FRANCE)
summary(FRANCE_lm)

Call:
lm(formula = log10((Data.Yeild.Lower + Data.Yeild.Upper + 5e-04)/2) ~ 
    test_date, data = list_of_country_datasets$FRANCE)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.5746 -0.5125  0.0847  0.9123  1.9170 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 8.768e-01  1.419e-01   6.179 3.39e-09 ***
test_date   1.318e-05  3.050e-05   0.432    0.666    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.34 on 206 degrees of freedom
Multiple R-squared:  0.0009053, Adjusted R-squared:  -0.003945 
F-statistic: 0.1867 on 1 and 206 DF,  p-value: 0.6662
UK_lm <- lm(log10((Data.Yeild.Lower + Data.Yeild.Upper + 0.0005)/2) ~ test_date, data = list_of_country_datasets$UK)
summary(UK_lm)

Call:
lm(formula = log10((Data.Yeild.Lower + Data.Yeild.Upper + 5e-04)/2) ~ 
    test_date, data = list_of_country_datasets$UK)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.1081 -0.4869  0.1815  0.2798  2.0392 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 1.562e+00  1.636e-01   9.544 3.49e-12 ***
test_date   2.905e-05  3.472e-05   0.837    0.407    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.093 on 43 degrees of freedom
Multiple R-squared:  0.01601,   Adjusted R-squared:  -0.00687 
F-statistic: 0.6998 on 1 and 43 DF,  p-value: 0.4075
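
The same five fits could also be produced in one pass over the split datasets; the following is a minimal sketch (equivalent to the regressions above, not a different analysis) using lapply:

# Sketch: fit the same log-yield ~ test_date model to every country at once
country_lms <- lapply(list_of_country_datasets, function(d) {
  lm(log10((Data.Yeild.Lower + Data.Yeild.Upper + 0.0005)/2) ~ test_date, data = d)
})
# Inspect any single fit, e.g. France
summary(country_lms$FRANCE)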

Conclusion

The complete analysis reveals a complex evolution of nuclear test yields, differing significantly across countries and time periods. Globally, the average yield per test peaked dramatically in the 1950s (~520 kilotons) and 1960s (~450 kilotons) before declining sharply to below 150 kilotons from the 1970s onward, as shown in the time-series plot. However, the scatterplot of individual tests highlights immense yield variability throughout the entire period, with log yields spanning roughly -4 to 4.5. A simple linear regression across all tests indicated a statistically significant (p ≈ 0.00095) but practically negligible positive trend between test date and log yield, explaining less than 1% (R-squared ≈ 0.0053) of the variance. This confirms that while overall average yields decreased after the 1960s, individual test yields remained highly diverse, driven by factors beyond a simple linear time progression.

Country-specific analyses reinforce this point; average yields peaked at different times and magnitudes, such as the USA (~650 kT) and UK (~400 kT) in the 1950s versus the USSR (~1050 kT) and China (~1000 kT) in the 1960s. Subsequent per-country linear regressions of log yield on test date found a statistically significant, though weak, positive linear relationship only for the USA (p < 0.001, R² ≈ 0.042). In contrast, no significant linear trend was found for the USSR (p ≈ 0.206), China (p ≈ 0.814), France (p ≈ 0.666), or the UK (p ≈ 0.408), indicating their test yields did not follow a simple increasing or decreasing linear pattern over the timeframe analyzed. From this, we can conclude that, aside from the weak trend for the USA, there is no direct linear relationship between test date and nuclear yield for individual tests.

Further Research

There were some elements that this analysis was also not able to touch on, prompting future questions that are still open to research. Firstly, how did the cumulative nuclear yield tested by each country evolve over time, which might offer a different perspective on a country’s nuclear capability compared to average yields per test? Secondly, would incorporating data on the specific purpose or environment (e.g., atmospheric vs. underground) of each test reveal yield trends hidden within the aggregate data? Thirdly, is there a quantifiable relationship between the frequency of testing (number of tests per year or decade) and the average or maximum yield levels observed for each nation during specific periods? Answering these would require more visualizations, integrating additional categorical variables, or performing time-interval analyses, which were beyond the focus of this analysis.
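
As an illustration of how the first of these questions might be approached, the following is a minimal sketch (not part of the analysis above) that accumulates each country’s tested yield over time, using the same average-of-estimates yield as before:

# Sketch: cumulative yield tested by each country over time
# coalesce() treats missing yield estimates as 0 so the running sum stays defined
cumulative_yields <- nuclear_data |>
  arrange(WEAPON.SOURCE.COUNTRY, test_date) |>
  group_by(WEAPON.SOURCE.COUNTRY) |>
  mutate(cumulative_yield = cumsum(coalesce((Data.Yeild.Lower + Data.Yeild.Upper)/2, 0))) |>
  ungroup()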

Research Question 2: How do the nuclear explosion yields differ across locations and geographical regions?

Visualization 1: World Map

To address how nuclear explosion yields vary geographically, I created a world map visualization that plots each test location with color-coding to represent the upper explosion yield estimate. This approach allows for simple identification of testing patterns across different regions and countries. I used the upper yield estimate from the dataset to represent the maximum potential impact of each test, and applied a logarithmic color scale to effectively display the wide range of yields, spanning from extremely small explosions to massive nuclear weapons. The following gets the world map from Stadia Maps and plots the explosion data points onto it, colored by the upper explosion yield estimate:

library(tidyverse)  
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ lubridate 1.9.4     ✔ tibble    3.2.1
✔ purrr     1.0.2     ✔ tidyr     1.3.1
✔ readr     2.1.5     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggmap)      
ℹ Google's Terms of Service: <https://mapsplatform.google.com>
  Stadia Maps' Terms of Service: <https://stadiamaps.com/terms-of-service>
  OpenStreetMap's Tile Usage Policy: <https://operations.osmfoundation.org/policies/tiles>
ℹ Please cite ggmap if you use it! Use `citation("ggmap")` for details.
register_stadiamaps("bc3a831e-0917-4fed-9acc-8637bb222157",
                          # write = TRUE saves this once
                          write = TRUE)
ℹ Replacing old key (bc3a831e) with new key in C:\Users\Opall\OneDrive\Documents/.Renviron
nuclear_data <- read_csv("nuclear_explosions.csv")
Rows: 2046 Columns: 16
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (6): WEAPON SOURCE COUNTRY, WEAPON DEPLOYMENT LOCATION, Data.Source, Da...
dbl (10): Location.Cordinates.Latitude, Location.Cordinates.Longitude, Data....

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
world <- c(left = -170, bottom = -60, right = 170, top = 80)
map <- get_stadiamap(world, zoom = 2, maptype = "stamen_toner_lite")
ℹ © Stadia Maps © Stamen Design © OpenMapTiles © OpenStreetMap contributors.
ggmap(map) +
  geom_point(
    data = nuclear_data, 
    aes(
      x = Location.Cordinates.Longitude, 
      y = Location.Cordinates.Latitude, 
      color = Data.Yeild.Upper
    ),
    alpha = 0.3,
    size = 2.2
  ) +
  scale_color_viridis_c(
    trans = "log10", 
    name = "Explosion Yield\n(Upper Estimate, kt)",
    breaks = c(0, 10, 100, 1000, 10000, 50000),
    limits = c(0.1, 50000),  # Set limits from 0.1 to 50000
    option = "plasma"
  ) +
  labs(
    title = "Nuclear Explosions across the World by Explosion Yield Upper Estimate",
    subtitle = "US, Russia, and Kazakstan made up most of the Nuclear Arms Race testing sites ",
    x = "Longitude", 
    y = "Latitude",
    caption = "Data Source: Kaggle Nuclear Explosions Dataset"
  ) +
  theme_minimal() +
  theme(
    legend.key.size = unit(0.8, "cm"),  
    legend.key.height = unit(0.5, "cm"), 
    legend.text = element_text(size = 6),  
    legend.title = element_text(size = 7),  
    legend.margin = margin(t = 0, r = 0, b = 0, l = 0, unit = "cm"),  
    plot.title = element_text(face = "bold", size = 13),
    axis.title = element_text(size = 8)
  )
Warning in scale_color_viridis_c(trans = "log10", name = "Explosion
Yield\n(Upper Estimate, kt)", : log-10 transformation introduced infinite
values.
Warning: Removed 3 rows containing missing values or values outside the scale range
(`geom_point()`).

Interpretation

The visualization reveals that most of the extremely high-yield explosions, shown in yellow-orange, are concentrated in Novaya Zemlya, the northern archipelago of Russia, with a few other points in the Pacific Ocean. The highest concentration of medium-to-lower-yield nuclear explosions appears in three main regions: North America (particularly the southwestern United States), Kazakhstan, and parts of Russia. These locations correspond to the majority of nuclear testing sites during the Cold War arms race, including the Nevada Test Site in the United States and the Soviet Union’s primary testing grounds at the Semipalatinsk Test Site and the Novaya Zemlya archipelago, where the infamous 50-megaton “Tsar Bomba” and other high-yield thermonuclear devices were detonated.

This concentration reflects the significant testing sites during the Cold War, including the remote territories within superpower borders or their controlled areas that were designated for weapons development and testing. Other nuclear powers like France, the UK, and China conducted fewer tests, often in colonial possessions or remote territories, visible in their distinct clusters in the Pacific, Australia, and Africa.

Visualization 2: Boxplot

While the previous world map visualization provided a global perspective on nuclear testing locations, I wanted to explore the quantitative differences in test yields across specific geographical regions and locations in more detail by investigating the weapon deployment location variable. Since the variable contains more than 50 locations, I grouped them into broader categories: US Domestic, US Pacific, Russia, Kazakhstan, French Pacific, Australia, Algeria, South Asia, Japan, Central Asia, and Other. I created a boxplot with location category on the y-axis and the upper estimate of the explosion yield on the x-axis, which allows us to compare the statistical distribution of explosion yields between location categories, revealing both the central tendencies and outliers for each testing region. I also used a logarithmic scale for the upper yield estimate since it spans a very wide range of magnitudes (from <1 to >10,000 kilotons); without this transformation, the boxplot would compress most data points to one side of the plot, obscuring critical patterns in yield distribution across testing locations. This approach helps quantify the patterns observed in the map visualization and provides a more nuanced understanding of how explosion magnitudes varied across different testing sites.

location_categories <- c(
  # US Domestic Sites
  "Alamogordo" = "US Domestic", "NTS" = "US Domestic", "Carlsbad Nm" = "US Domestic",
  "Nellis Nv" = "US Domestic", "Fallon Nv" = "US Domestic", "Hattiesb Ms" = "US Domestic",
  "Hattiese Ms" = "US Domestic", "Amchitka Ak" = "US Domestic", "Farmingt Nm" = "US Domestic",
  "C. Nevada" = "US Domestic", "Grand V Co" = "US Domestic", "Rifle Co" = "US Domestic",
  "Mellis Nv" = "US Domestic",
  
  # US Pacific Sites
  "Bikini" = "US Pacific", "Enewetak" = "US Pacific", "Christmas Is" = "US Pacific",
  "Johnston Is" = "US Pacific", "Malden Is" = "US Pacific",
  
  # Russia/Soviet Union
  "Orenbg Russ" = "Russia", "N2 Russ" = "Russia", "Htr Russ" = "Russia", 
  "Bashkir Russ" = "Russia", "Tyumen Russ" = "Russia", "Perm Russ" = "Russia", 
  "Stavro Russ" = "Russia", "Ural Russ" = "Russia", "Arkhan Russ" = "Russia", 
  "Murm Russ" = "Russia", "Kalmyk Russ" = "Russia", "Komi Russ" = "Russia", 
  "Jakuts Russ" = "Russia", "Jakuts Ruse" = "Russia", "Krasno Russ" = "Russia", 
  "Chita Russ" = "Russia", "Irkuts Russ" = "Russia", "Bashki Russ" = "Russia", 
  "Astrak Russ" = "Russia", "Kz Russ" = "Russia", "Kemero Russ" = "Russia",
  "Tuymen Russ" = "Russia", "Lop Nor" = "Russia", "Mtr Russ" = "Russia",
  
  # Kazakhstan
  "Semi Kazakh" = "Kazakhstan", "Kazakh" = "Kazakhstan", "Marali Kazakh" = "Kazakhstan",
  "Azgir Kazakh" = "Kazakhstan", "Mangy Kazakh" = "Kazakhstan", "W Kazakh" = "Kazakhstan",
  "Azgie Kazakh" = "Kazakhstan", "Kazakhstan" = "Kazakhstan",
  
  # French Pacific
  "Mururoa" = "French Pacific", "Fangataufa" = "French Pacific", "Muruhoa" = "French Pacific",
  "Wsw Mururoa" = "French Pacific", "W Mururoa" = "French Pacific", "Hururoa" = "French Pacific",
  "Murueoa" = "French Pacific", "Mueueoa" = "French Pacific", "Fangataufaa" = "French Pacific",
  
  # Australia
  "Monteb Austr" = "Australia", "Emu Austr" = "Australia", "Marali Austr" = "Australia",
  
  # Algeria
  "Reggane Alg" = "Algeria", "In Ecker Alg" = "Algeria",
  
  # South Asian
  "Pokhran" = "South Asia", "Chagai" = "South Asia", "Kharan" = "South Asia",
  
  # Japan
  "Hiroshima" = "Japan", "Nagasaki" = "Japan",
  
  # Other Central Asian
  "Uzbek" = "Central Asia", "Mary Turkmen" = "Central Asia", "Ukraine" = "Central Asia",
  "Ukeaine" = "Central Asia", "Pamuk Uzbek" = "Central Asia",
  
  # Other Testing Locations
  "Offuswcoast Nz" = "Other", "S. Atlantic" = "Other", 
  "S.Atlantic" = "Other"
)

# add new column
nuclear_data <- nuclear_data |>
  mutate(Location_Category = case_when(
    `WEAPON DEPLOYMENT LOCATION` %in% names(location_categories) ~ 
      location_categories[`WEAPON DEPLOYMENT LOCATION`],
    TRUE ~ "Other"
  ))

nuclear_data |>
    ggplot(aes(x = Location_Category, y = Data.Yeild.Upper)) +
    geom_boxplot() + 
    scale_y_log10() +
    coord_flip() + 
    labs(
      title = "Upper Estimate of Nuclear Explosion Yield by Locations",
      x = "Location", 
      y = "Log Explosion Yield\n(Upper Estimate, kt)",
      caption = "Source: Nuclear Explosions Dataset"
    ) 
Warning in scale_y_log10(): log-10 transformation introduced infinite values.
Warning: Removed 36 rows containing non-finite outside the scale range
(`stat_boxplot()`).

Interpretation

From the visualization, the US Pacific testing sites demonstrate the highest median yield and upper quartile values among all locations, reflecting the United States’ strategy of conducting its larger thermonuclear tests on remote Pacific islands rather than on domestic soil. This aligns with the yellow-orange high-yield points visible in the Pacific region on our map visualization. The US Domestic sites show a lower median yield but still maintain considerable variability and a higher median than most other location categories, consistent with the Nevada Test Site’s role in hosting a wide range of both tactical and strategic weapon tests.

Russia and Kazakhstan exhibit similar yield distributions, with Russia showing slightly lower median values but both featuring numerous high-yield outliers. This confirms the concentration of high-yield tests in these regions that was visible in our map visualization; however, despite the many high-yield explosions present in those regions, their overall medians are still lower than the US Pacific’s. The “Other” category also has an extensive spread of outlier points, illustrating the wide range of test magnitudes conducted at these locations throughout the nuclear era.

The French Pacific testing areas show the same median yield as Kazakhstan, Other, South Asia, and Algeria, with no outliers. South Asia (comprising the Indian and Pakistani tests) and Japan (the Hiroshima and Nagasaki bombings) both display remarkably narrow interquartile ranges, which may be due to having few data points with similar yields compared to other location categories. Australia and Central Asia display lower yield ranges, consistent with their limited roles as testing grounds compared to the major nuclear powers’ primary sites.

This boxplot effectively quantifies and reinforces the geographic patterns observed in our map visualization, providing statistical evidence that testing location strategies varied significantly by country, with some powers preferring to conduct their highest-yield tests in remote territories while reserving domestic sites for lower-yield or more specialized testing.

Statistical Analysis: Analysis of Variance (ANOVA)

To determine if there are statistically significant differences in mean yields between different location categories, I will conduct a one-way Analysis of Variance (ANOVA). Since the visualization revealed that nuclear explosion yields span multiple orders of magnitude, I’ll apply a logarithmic transformation to the yield data to meet the ANOVA assumptions of normality and homogeneity of variance.

# Prepare data for ANOVA by log-transforming yields
nuclear_data <- nuclear_data |>
  mutate(log_yield = log10(Data.Yeild.Upper))

# Remove infinite values that might be caused by zero yields
nuclear_data_filtered <- nuclear_data |>
  filter(is.finite(log_yield))

# Run ANOVA on log-transformed yields by Location_Category
yield_anova <- aov(log_yield ~ Location_Category, data = nuclear_data_filtered)
summary(yield_anova)
                    Df Sum Sq Mean Sq F value Pr(>F)    
Location_Category   10  145.9  14.589   14.92 <2e-16 ***
Residuals         1999 1954.8   0.978                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The ANOVA results show a p-value below 2e-16, which is extremely small. This means we can reject the null hypothesis, and thus there are statistically significant differences in mean upper-estimate nuclear explosion yields between the different location categories. This statistical confirmation validates what we observed in the boxplot visualization: testing locations had significantly different explosion yield profiles. The differences in testing strategies across geographical regions represent true, statistically significant variations in how different sites were utilized for nuclear testing.
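
The ANOVA only indicates that at least one location category’s mean log yield differs from the others; it does not say which pairs differ. A natural follow-up, sketched below but not run as part of this analysis, would be Tukey’s Honest Significant Differences on the fitted ANOVA model:

# Sketch: pairwise comparisons between location categories on the fitted ANOVA,
# showing which pairs of mean log yields differ significantly
TukeyHSD(yield_anova)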

Conclusion

Through our visualizations and statistical analysis, we found that nuclear testing locations were not randomly distributed but strategically chosen by nuclear powers, with clear preferences for remote territories, colonial possessions, and designated domestic testing sites. There are statistically significant differences in explosion yields across location categories, as confirmed by our ANOVA analysis (p < 2e-16). This indicates that countries employed different testing strategies depending on geographical considerations.

The US Pacific testing sites demonstrated the highest median yield among all locations, reflecting the United States’ deliberate strategy to conduct larger thermonuclear tests away from the mainland in remote island territories. Russia and Kazakhstan show similar yield distributions with numerous high-yield outliers, particularly in Novaya Zemlya (northern Russia), where some of the most powerful nuclear devices were detonated, including the massive “Tsar Bomba.” Testing sites within domestic borders (particularly US Domestic sites) generally show lower median yields but still maintain considerable variability, consistent with their use for a wide range of both tactical and strategic weapon testing.

These findings address our research question regarding how nuclear explosion yields differ across locations and geographical regions. The world map visualization clearly illustrates the spatial distribution of testing sites and yield magnitudes, while the boxplot quantifies these differences and the ANOVA confirms their statistical significance.

Further Research

While this analysis has provided valuable insights into geographical patterns of nuclear testing yields, several questions remain that could be addressed in future research. For instance, why does the “Other” location category show such a wide variance in explosion yield? Perhaps there are inconsistencies in the weapon deployment location labels that need addressing. Furthermore, these visualizations only explore yield and location; other questions may involve additional nuclear test characteristics (underground vs. atmospheric, weapon type, purpose) and how they correlate with location and yield. These questions remain unanswered in our current analysis due to limitations in the available data and the need for more advanced statistical and geospatial analysis techniques. Future work would benefit from a deeper look into the data and its categories, combining nuclear testing records with historical information to develop a more comprehensive understanding of the complex relationships between geography and nuclear testing strategies.

Research Question 3: How did the purposes of nuclear test delivery evolve over time?

Visualization 1: Faceted Bar Plot

Because I am analyzing how the purposes of nuclear tests have shifted across decades, I first considered that the best approach would be to evaluate how the distribution of test purposes changed over time. To simplify the analysis and better capture major trends, I rebucketed the original purpose labels into five broader categories and similarly consolidated the drop type classifications into six major categories. I then created a new variable for the decade in which each test occurred and built a contingency table summarizing the number of tests for each purpose across decades. To visualize these patterns, I produced a bar plot showing how delivery methods varied over time within each purpose category, and a line plot tracking the number of tests by purpose over time. These visualizations, combined with a Chi-Square Test of Independence, allowed me to formally test whether nuclear test purposes varied significantly across different decades.

library(ggplot2)
library(ggrepel)
Warning: package 'ggrepel' was built under R version 4.4.3
library(dplyr)

df <- read.csv("nuclear_explosions.csv", stringsAsFactors = FALSE)

# Clean and categorize Purpose
df$Data.Purpose <- toupper(trimws(df$Data.Purpose))

Purpose_Simplified <- character(nrow(df))
for (i in 1:nrow(df)) {
  parts <- unlist(strsplit(df$Data.Purpose[i], "/"))
  parts <- trimws(parts)
  
  if ("COMBAT" %in% parts) {
    Purpose_Simplified[i] <- "Combat"
  } else if (any(parts %in% c("ME", "SAM", "FMS"))) {
    Purpose_Simplified[i] <- "Operational Impacts"
  } else if (any(parts %in% c("PNE", "SSE", "TRANSP"))) {
    Purpose_Simplified[i] <- "Noncombat"
  } else if (any(parts %in% c("WR"))) {
    Purpose_Simplified[i] <- "Weapons Testing"
  } else if (any(parts %in% c("WE"))) {
    Purpose_Simplified[i] <- "Damage Modeling"
  } else {
    Purpose_Simplified[i] <- "Noncombat"
  }
}
df$Purpose_Simplified <- Purpose_Simplified

# Classify drop type into broader categories
Drop_Category_Military <- character(nrow(df))
for (i in 1:nrow(df)) {
  if (df$Data.Type[i] %in% c("Airdrop", "Rocket", "Balloon", "Ship")) {
    Drop_Category_Military[i] <- "Military Delivery"
  } else if (df$Data.Type[i] %in% c("Surface", "Crater")) {
    Drop_Category_Military[i] <- "Surface"
  } else if (df$Data.Type[i] %in% c("Tower", "Space", "Atmosph")) {
    Drop_Category_Military[i] <- "Atmospheric"
  } else if (df$Data.Type[i] %in% c("Shaft", "Tunnel", "Gallery", "Mine", "Ug", "Shaft/Gr", "Shaft/Lg")) {
    Drop_Category_Military[i] <- "Underground"
  } else if (df$Data.Type[i] %in% c("Barge", "Uw", "Watersur", "Water Su")) {
    Drop_Category_Military[i] <- "Water-based"
  } else {
    Drop_Category_Military[i] <- "Other"
  }
}
df$Drop_Category_Military <- Drop_Category_Military

# Summarize test count
df_summary <- as.data.frame(table(df$Date.Year, df$Purpose_Simplified, df$Drop_Category_Military))
colnames(df_summary) <- c("Date.Year", "Purpose_Simplified", "Drop_Category_Military", "n")
df_summary$Date.Year <- as.integer(as.character(df_summary$Date.Year))
df_summary$n <- as.integer(df_summary$n)

# Plot: Facet by Purpose, fill = Drop Category
ggplot(df_summary, aes(x = Date.Year, y = n, fill = Drop_Category_Military)) +
  geom_bar(stat = "identity") +
  scale_fill_manual(values = c(
    "Military Delivery" = "#e41a1c",
    "Atmospheric" = "#00cea4",
    "Underground" = "#b8edff",
    "Surface" = "#a186ff",
    "Water-based" = "#1b1ea2",
    "Other" = "#999999"
  )) +
  facet_wrap(~Purpose_Simplified, scales = "free_y") +
  labs(
    title = "Test Delivery Methods Over Time by Purpose",
    x = "Year",
    y = "Number of Tests",
    fill = "Drop Type Category"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

Interpretation

This bar plot examines the number of nuclear tests over time, split by their purpose (Combat, Operational Impacts, Weapons Testing, Damage Modeling, Noncombat) and colored by drop type category (Military Delivery, Atmospheric, Underground, Surface, Water-based). Each facet corresponds to a different purpose. Weapons Testing had a major surge in the 1950s and 1960s, dominated first by atmospheric and surface tests and then transitioning to underground tests after the 1960s, likely following international treaties limiting atmospheric testing. Operational Impacts and Noncombat purposes show relatively steady but smaller-scale testing compared to Weapons Testing, with more underground or surface tests. Damage Modeling and Combat had very few tests overall, with no strong trend or clear preference for a delivery method. The main takeaway from this graph is that the method of delivery shifted significantly over time, especially from atmospheric to underground tests, particularly for weapons development purposes.

Visualization 2: Time-Series Line Plot

# Summarize count of tests per year and purpose
df_line <- as.data.frame(table(df$Date.Year, df$Purpose_Simplified))
colnames(df_line) <- c("Date.Year", "Purpose_Simplified", "n")
df_line$Date.Year <- as.integer(as.character(df_line$Date.Year))
df_line$n <- as.integer(df_line$n)

# Line plot: Year vs. Test Count by Purpose (no points)
ggplot(df_line, aes(x = Date.Year, y = n, color = Purpose_Simplified)) +
  geom_line(linewidth = 1.2) +
  scale_color_manual(values = c(
    "Combat" = "#e41a1c",
    "Weapons Testing" = "#4daf4a",
    "Damage Modeling" = "#984ea3",
    "Operational Impacts" = "#377eb8",
    "Noncombat" = "#ff7f00"
  )) +
  labs(
    title = "Number of Nuclear Tests Over Time by Purpose",
    x = "Year",
    y = "Number of Tests",
    color = "Test Purpose"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1),
    legend.position = "right"
  )

Interpretation

This time-series line plot tracks the yearly number of nuclear tests for each purpose category. Based on the graph, Weapons Testing heavily dominated nuclear testing through the 1950s-1970s, peaking around the early 1960s. After the 1970s, however, a sharp decline in all types of tests is evident, especially after the end of the Cold War. Noncombat purposes show small but noticeable testing activity after the decline of combat-focused tests, suggesting a potential shift toward peaceful or non-aggressive nuclear uses. Operational Impacts and Damage Modeling had very limited but steady levels across the decades. The main takeaway of this second graphic is that weapons testing was the primary driver of nuclear explosions historically, but the overall trend moved sharply downward toward minimal or peaceful purposes by the 1990s.

Statistical Analysis

df$Decade <- floor(df$Date.Year / 10) * 10

df$Purpose_Collapsed <- ifelse(
  df$Purpose_Simplified %in% c("Combat", "Damage Modeling"),
  "Damage Modeling",
  df$Purpose_Simplified
)

contingency_table <- table(df$Decade, df$Purpose_Collapsed)
contingency_table <- contingency_table[apply(contingency_table, 1, sum) > 0, apply(contingency_table, 2, sum) > 0]
print(contingency_table)
      
       Damage Modeling Noncombat Operational Impacts Weapons Testing
  1940               4         0                   0               5
  1950              27        34                   7             223
  1960              66        92                  26             519
  1970              49        90                  21             388
  1980              34        56                  21             327
  1990               3         1                   1              52
chi_result <- chisq.test(contingency_table)
Warning in chisq.test(contingency_table): Chi-squared approximation may be
incorrect
print(chi_result)

    Pearson's Chi-squared test

data:  contingency_table
X-squared = 33.412, df = 15, p-value = 0.004115

The summary statistics show that nuclear testing was heavily dominated by weapons testing during the 1950s and 1960s, while noncombat and operational impacts purposes became relatively more common in later decades. The Chi-Square Test of Independence conducted on the contingency table produced a p-value of about 0.004, which is below the significance threshold of 0.05. This result leads to rejecting the null hypothesis and indicates that the purpose of nuclear tests changed significantly across decades. The findings from both the summary statistics and the chi-square test are consistent with the research question, showing a clear historical shift in nuclear testing purposes over time.
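
One caveat is the warning that the chi-squared approximation may be incorrect, which typically arises when some expected cell counts are small (the 1940 and 1990 rows contain very few tests). A simple robustness check, sketched below but not run here, is to recompute the p-value by Monte Carlo simulation, which does not rely on the large-sample approximation:

# Sketch: Monte Carlo p-value for the same contingency table, avoiding the
# small-expected-count approximation behind the warning above
chisq.test(contingency_table, simulate.p.value = TRUE, B = 10000)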

Conclusion

Based on the evidence from the graphs and statistical analysis, the answer is clearly yes: the purpose of nuclear tests has shifted over time. For example, the bar plot and line graph both show that nuclear testing was dominated by weapons testing during the 1950s and 1960s, but shifted toward noncombat and operational impacts testing in later decades. The summary statistics confirm this visual pattern, showing a steep decline in weapons-related tests after the 1960s. The Chi-Square Test of Independence further justifies this conclusion statistically, producing a p-value of about 0.004, well below the significance threshold of 0.05. This allows us to reject the null hypothesis and conclude that test purpose and decade are not independent.

Further Research

While the project successfully analyzed changes in the purpose, yield, and location of nuclear testing over several decades, there are important questions left unanswered due to data limitations. The dataset used only includes nuclear tests up to the early 2000s, but nuclear testing activities have continued into the present, particularly in countries such as North Korea. Without extending the dataset to include more recent tests, we cannot fully capture whether newer nuclear programs follow similar historical trends or represent a new phase of nuclear testing behavior. Future work could address this limitation by incorporating updated data and applying similar analyses to examine whether the shift toward noncombat or operational purposes has persisted, reversed, or evolved in different geopolitical contexts. This would provide a more complete and current understanding of nuclear testing trends beyond the historical periods covered here.