Group 25 Report – Measuring Joy: What Makes a Nation Happy?

Author

Brook Xiao et al.

Published

April 28, 2025

Project Overview

We study why some countries report markedly higher life satisfaction than others, using the World Happiness Report for 2019 (pre-COVID) and 2021 (mid-pandemic). Focusing on the six official explanatory components, we pose four research questions covering economic resources, social cohesion, civil freedom, health, generosity, and regional clustering.

Dataset description (2019 and 2021 slices)

We analyse the 2019 sample (n = 156 countries) and the 2021 sample (n = 149) from the World Happiness Report.

The response variable is the Happiness Score – the national mean answer to Gallup’s Cantril Ladder (0 = worst possible life, 10 = best).

Six additive components


Factor                    Definition
GDP per capita            Ladder-point contribution from national income
Healthy life-expectancy   Contribution from expected healthy years
Social support            Contribution from having someone to rely on
Freedom                   Contribution from feeling free to choose life course
Generosity                Contribution from recent charitable giving
Perceived corruption      Contribution reflecting cleaner institutions

Country/region name and overall rank are descriptive only; all numeric analyses use the cleaned tibbles df2019 and df2021.

Research questions

  1. RQ 1 – To what extent do GDP-per-capita and social support jointly predict 2019 Happiness Score?

  2. RQ 2 – How strongly does perceived freedom to make life choices relate to 2019 Happiness Score?

  3. RQ 3 – Do healthy life-expectancy and generosity together shape 2021 Happiness Score?

  4. RQ 4 – Do countries from the same geographical region share similar happiness profiles?

Each question is addressed with at least two figures plus a formal statistical model.


RQ 1 · GDP × Social support

Figure 1 – Bivariate scatter

Statistical evidence

term estimate std.error statistic p.value conf.low conf.high
(Intercept) 0.696 0.039 17.733 <0.001 0.618 0.773
gdp 0.567 0.040 14.284 <0.001 0.489 0.645

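Each additional ladder-point of the GDP-per-capita contribution is associated with a 0.57-point increase in the social-support contribution (95% CI half-width ≈ 0.08; R² ≈ 0.57, p < 10⁻²⁹). This regression relates the two predictors to each other; it does not yet involve the Happiness Score.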
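The interval half-width quoted for the GDP slope can be cross-checked from the table alone. The lines below are a minimal sketch in base R; they assume all 156 countries of the 2019 sample entered the fit, so that the residual degrees of freedom are 154.

```r
# Values copied from the gdp row of the coefficient table.
estimate  <- 0.567
std_error <- 0.040
n         <- 156            # assumption: no rows were dropped before fitting
df_resid  <- n - 2          # intercept + one slope

# Half-width of the 95% confidence interval for the slope.
half_width <- qt(0.975, df_resid) * std_error
ci <- c(lower = estimate - half_width, upper = estimate + half_width)

print(round(half_width, 3))
print(round(ci, 3))
```

Small differences from the tabulated bounds are expected, because the table itself reports rounded estimates and standard errors.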

Figure 2 – Coefficient plot

In the two-predictor model of 2019 Happiness Score: GDP β ≈ 1.35; social-support β ≈ 1.54; adjusted R² ≈ 0.77.
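A two-predictor fit of this kind might be set up as sketched below. Because the cleaned tibble df2019 is not reproduced here, the sketch simulates a stand-in data frame (`sim2019`, a hypothetical name); the intercept and noise levels are invented for illustration, so the printed numbers will not match the report.

```r
set.seed(25)                        # reproducible simulated data
n <- 156

# Hypothetical stand-in for df2019 (column names are assumptions).
gdp     <- runif(n, 0, 1.7)
support <- 0.7 + 0.55 * gdp + rnorm(n, sd = 0.2)
score   <- 2.3 + 1.35 * gdp + 1.54 * support + rnorm(n, sd = 0.5)
sim2019 <- data.frame(score, gdp, support)

# Happiness Score regressed on both components at once.
fit <- lm(score ~ gdp + support, data = sim2019)

# Estimates, standard errors, t, p, and 95% confidence bounds in one matrix.
coef_table <- cbind(coef(summary(fit)), confint(fit))
adj_r2     <- summary(fit)$adj.r.squared

print(round(coef_table, 3))
print(round(adj_r2, 3))
```

With the real data, replacing `sim2019` by `df2019` (and the column names by their actual counterparts) should yield the quantities shown in the coefficient plot.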


RQ 2 · Freedom to choose life course

Figure 3 – Contour density

The density ridge peaks at freedom ≈ 0.50 and happiness ≈ 6.0, and its orientation is consistent with an upward trend.

Figure 4 – Linear fit

Statistical evidence

term estimate std.error statistic p.value conf.low conf.high
(Intercept) 3.679 0.215 17.075 <0.001 3.253 4.104
freedom_to_make_life_choices 4.403 0.516 8.536 <0.001 3.384 5.421

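A one-unit increase in the freedom component is associated with a 4.40-point higher Happiness Score (SE 0.52; 95% CI 3.38 to 5.42; R² ≈ 0.32, p < 10⁻¹³). The R² follows from the slope's t statistic with 154 residual degrees of freedom: 8.536² / (8.536² + 154) ≈ 0.32.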
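For a one-predictor least-squares fit, R² and the slope's t statistic are linked by R² = t² / (t² + df), which gives a quick way to cross-check the summary line against the table. The sketch below assumes n = 156 with no dropped rows.

```r
# Slope t statistic from the freedom_to_make_life_choices row.
t_stat   <- 8.536
n        <- 156             # assumption: full 2019 sample used
df_resid <- n - 2

# R-squared implied by the t statistic in a simple regression.
r_squared <- t_stat^2 / (t_stat^2 + df_resid)

# Two-sided p-value for the slope under the t distribution.
p_value <- 2 * pt(-abs(t_stat), df_resid)

print(round(r_squared, 3))
print(signif(p_value, 2))
```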


RQ 3 · Healthy life-expectancy × Generosity (2021)

Figure 5 – Hex-binned heat-map

Figure 6 – LOESS interaction

Statistical evidence

term estimate std.error statistic p.value conf.low conf.high
(Intercept) -2.570 0.546 -4.710 <0.001 -3.649 -1.492
healthy_life_expectancy 0.125 0.008 14.950 <0.001 0.109 0.142
generosity -5.518 4.097 -1.347 0.180 -13.615 2.579
healthy_life_expectancy:generosity 0.098 0.064 1.544 0.125 -0.028 0.224

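Healthy life-expectancy β ≈ 0.125 points per year (p < 2 × 10⁻¹⁶); the generosity main effect (p = 0.18) and the interaction (p = 0.125) are not significant at the 5% level; adjusted R² ≈ 0.60.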
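In a model with an interaction, the generosity coefficient is the generosity slope at a healthy life-expectancy of zero years, which lies far outside the observed range. Centring life-expectancy before fitting is one common way to make that coefficient interpretable without changing the fit. The sketch below uses simulated data (`sim2021` and its columns are hypothetical stand-ins for df2021), so the numbers are illustrative only.

```r
set.seed(2021)
n <- 149

# Hypothetical stand-in for df2021; distributions are assumptions.
hle        <- rnorm(n, mean = 65, sd = 7)       # healthy life-expectancy, years
generosity <- rnorm(n, mean = 0, sd = 0.15)
score      <- -2.5 + 0.125 * hle + rnorm(n, sd = 0.7)
sim2021    <- data.frame(score, hle, generosity)

# Interaction model on the raw scale, as in the table above.
raw_fit <- lm(score ~ hle * generosity, data = sim2021)

# Same model with life-expectancy centred at its sample mean.
sim2021$hle_c <- sim2021$hle - mean(sim2021$hle)
centred_fit   <- lm(score ~ hle_c * generosity, data = sim2021)

print(round(coef(summary(raw_fit)), 3))
print(round(coef(summary(centred_fit)), 3))
```

The interaction term and the fitted values are identical in both fits; only the intercept and the generosity coefficient change, the latter becoming the generosity slope at average life-expectancy.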


RQ 4 · Regional similarity

Figure 7 – Dendrogram

Figure 8 – PCA scatter with 95 % ellipses

Statistical evidence

term df sumsq meansq statistic p.value
regional_indicator 9 106.053 11.784 25.34 <0.001
Residuals 139 64.637 0.465 NA NA

One-way ANOVA: F(9, 139) ≈ 25.34, p < 2 × 10⁻¹⁶; regions explain ≈ 62% of cross-country happiness variance.
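The share of variance attributed to region (η²) and the F statistic can both be recomputed from the sums of squares in the ANOVA table. A minimal check in base R, using only the tabulated values:

```r
# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_region   <- 106.053
df_region   <- 9
ss_residual <- 64.637
df_resid    <- 139

# Eta-squared: between-region sum of squares over the total.
eta_squared <- ss_region / (ss_region + ss_residual)

# F is the ratio of the two mean squares; p is its upper-tail probability.
f_stat  <- (ss_region / df_region) / (ss_residual / df_resid)
p_value <- pf(f_stat, df_region, df_resid, lower.tail = FALSE)

print(round(eta_squared, 3))
print(round(f_stat, 2))
print(signif(p_value, 2))
```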

The average silhouette width for the four-cluster solution is 0.46, indicating moderately distinct regional groupings.
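An average silhouette width for a four-cluster cut of a hierarchical tree might be computed as sketched below. The data are simulated, the Ward linkage is an assumption (the report does not state which linkage was used), and `silhouette_widths()` is a hypothetical helper written out in base R; the `cluster` package's `silhouette()` function provides a packaged alternative.

```r
set.seed(4)

# Hypothetical stand-in for standardised component scores: four loose groups.
centres <- rbind(c(0, 0), c(4, 0), c(0, 4), c(4, 4))
x <- do.call(rbind, lapply(1:4, function(k) {
  cbind(rnorm(30, centres[k, 1]), rnorm(30, centres[k, 2]))
}))

d      <- dist(x)
tree   <- hclust(d, method = "ward.D2")   # linkage choice is an assumption
groups <- cutree(tree, k = 4)

# Silhouette width per point: (b - a) / max(a, b), where a is the mean distance
# to the point's own cluster and b the smallest mean distance to another cluster.
silhouette_widths <- function(d, groups) {
  m <- as.matrix(d)
  sapply(seq_along(groups), function(i) {
    own <- groups == groups[i]
    own[i] <- FALSE
    if (!any(own)) return(0)              # singleton clusters score 0 by convention
    other <- groups != groups[i]
    a <- mean(m[i, own])
    b <- min(tapply(m[i, other], groups[other], mean))
    (b - a) / max(a, b)
  })
}

avg_width <- mean(silhouette_widths(d, groups))
print(round(avg_width, 2))
```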


Conclusions

Across both years, economic output, social support, and healthy life-span explain the bulk of cross-national happiness.

A two-variable model (GDP + support) accounts for 77 % of 2019 variance; in a separate one-predictor model, perceived freedom is associated with 4.40 ladder-points per unit of the freedom component (SE 0.52), a significant relationship but a weaker fit.

In 2021, healthy life-expectancy remains the strongest single correlate (≈ 0.13 points per extra healthy year), while generosity shows no reliable main or interaction effect.

Regional patterns are pronounced: an ANOVA indicates that ≈ 62 % of between-country variation sits at the regional level, with coherent clusters in Western Europe, Latin America, and Sub-Saharan Africa.

These findings suggest that policies promoting both economic security and social cohesion yield the largest well-being gains, and that civil freedom offers an additional but smaller boost.


Future work

  • Add control variables we already have.

    Re-fit the 2019 multiple-regression model including healthy life-expectancy and generosity to see whether GDP and social support remain significant when the full set of components is considered.

  • Model-assumption checks.

    Use residual-versus-fitted and QQ plots (via plot(lm_obj)) to confirm linearity and normal-error assumptions for each regression used in the report.

  • Compare pre- and mid-pandemic means.

    Conduct paired t-tests (or Wilcoxon signed-rank tests if normality fails) on each component’s 2019 vs 2021 scores, restricted to countries present in both samples, to quantify how the pandemic shifted the drivers.

  • Evaluate multicollinearity.

    Compute variance-inflation factors (car::vif) for the full 2019 model; if VIF > 5, consider centred variables or principal-component regression (covered in lecture).

  • Visualise change over time.

    Produce a simple line chart for the ten largest countries, plotting their Happiness Score in 2015, 2019, and 2021 to illustrate trajectories without advanced time-series methods.

  • Report uncertainty for cluster analysis.

    Repeat the hierarchical clustering with single and average linkage and record whether regional groupings persist; discuss any instability instead of pursuing bootstrap or gap-stat techniques.
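Two of the items above, the paired pre/mid-pandemic comparison and the multicollinearity check, can be sketched in base R as follows. All data are simulated; `sim2019`, `sim2021`, `preds`, and `vif_by_hand()` are hypothetical names, and the hand-rolled VIF (1 / (1 − R²) from regressing each predictor on the others) is a stand-in for car::vif rather than a replacement.

```r
set.seed(1)
n_all   <- 160
country <- paste0("C", seq_len(n_all))
base    <- rnorm(n_all, mean = 5.4, sd = 1.1)    # latent happiness level

# Hypothetical stand-ins: the two years cover overlapping, not identical, countries.
sim2019 <- data.frame(country = country[1:156],
                      score   = base[1:156] + rnorm(156, sd = 0.2))
sim2021 <- data.frame(country = country[12:160],
                      score   = base[12:160] + rnorm(149, sd = 0.2))

# Paired tests need the same countries in both years, so join on country first.
both <- merge(sim2019, sim2021, by = "country", suffixes = c("_2019", "_2021"))
paired_t <- t.test(both$score_2021, both$score_2019, paired = TRUE)
paired_w <- wilcox.test(both$score_2021, both$score_2019, paired = TRUE)

# Variance-inflation factor for each predictor: 1 / (1 - R^2), where R^2 comes
# from regressing that predictor on all the others.
vif_by_hand <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  })
}

gdp   <- runif(156, 0, 1.7)
preds <- data.frame(gdp     = gdp,
                    support = 0.7 + 0.5 * gdp + rnorm(156, sd = 0.2),
                    health  = 0.3 + 0.4 * gdp + rnorm(156, sd = 0.15))
vifs <- vif_by_hand(preds, c("gdp", "support", "health"))

print(nrow(both))
print(paired_t$estimate)
print(round(vifs, 2))
```

Countries missing from either year drop out at the merge step, which is why the paired tests run on fewer rows than either sample alone.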