# Commentary on Kun Zhang’s presentation

16 March 2021, online causal inference seminar

# Step back to the statistical view

• Kun has laid out the ML/AI importance very clearly

# The two fundamental problems of “causal inference”

1. Estimation: Accepting a certain causal structure, what’re the effects of various configurations of the causes?
2. Discovery: What is the causal structure of the system anyway?

# Estimation problems

• Includes e.g. policy learning as an application
• Get clear about the estimand
• Get clear about the conditions under which the estimand is or is not identified from the distribution of observables
• Get clever about the estimation and/or testing
• An immense task, but all resting on assumptions about the causal structure
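
As one worked instance of the identification step, here is a sketch for the average treatment effect. The notation is an assumption, not from the talk: binary treatment $A$, covariates $X$, potential outcomes $Y(a)$. It also assumes consistency ($Y = Y(A)$), conditional ignorability ($Y(a) \perp A \mid X$), and positivity ($0 < P(A=1 \mid X) < 1$).

$$
\begin{aligned}
\mathrm{ATE} &= E[Y(1) - Y(0)] \\
&= E\big[\, E[Y(1) \mid X] - E[Y(0) \mid X] \,\big] \\
&= E\big[\, E[Y(1) \mid A=1, X] - E[Y(0) \mid A=0, X] \,\big] && \text{(ignorability, positivity)} \\
&= E\big[\, E[Y \mid A=1, X] - E[Y \mid A=0, X] \,\big] && \text{(consistency)}
\end{aligned}
$$

The last line involves only the distribution of observables, but every step leans on an assumption about the causal structure.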

# Discovery problems

• Given the distribution of observables, what’s the causal structure?
• Or what’s the range of possible causal structures?
• Experiment is one way of answering this!
• There had better be non-experimental ways
• E.g., geology explains the causes of earthquakes without randomized controlled trials on tectonic plate boundaries
• But maybe we’re fooling ourselves and the geologists don’t really know any more than astrologers
• Why think discovery problems are solvable?

# Perdition

• Hume (1739) (in modern language): all we can observe is association (“constant conjunction”), not counterfactuals or causes (“necessary connexion”)
• Anticipated by al-Ghazali (n.d.) in 1100 (it is not “habitual” for “a corpse to sit up and write an eloquent volume in a well-ordered script”, but habits can change)
• Bertrand Russell (1954): there is a special place in Hell for philosophers who think they have refuted Hume (or al-Ghazali)
• Is Kun putting his soul in danger?

# Skirting perdition

• We need to impose some assumptions
• These should be weaker or more plausible than the assumptions used in estimation problems
• Causal assumptions in $\Rightarrow$ causal conclusions out
• “The goal of therapy is to turn neurotic misery into everyday unhappiness” (attrib. Freud)
• The goal of causal discovery is to turn metaphysical misery into everyday statistical unhappiness

# What kind of statistical problem?

• Causal structure is qualitative
• The DAG, or the non-parametric structural equations, or Rubin-style ignorability
• Picking qualitative aspects of a statistical model is model selection
• Model selection has issues which don’t match continuous-parameter-estimation intuition, but it’s not impossible
• E.g. post-selection inference (perhaps just by data-splitting)
• Model selection also rests on assumptions
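
Data-splitting can be sketched in a few lines. This is a minimal illustration rather than a recommended procedure. The function name `split_select_infer`, the simulated data, and the selection rule (largest absolute correlation) are all assumptions made for the example. One half of the data picks the predictor; the other half, untouched by selection, gives an ordinary least-squares slope and a normal-approximation confidence interval.

```python
import numpy as np

def split_select_infer(X, y, rng):
    """Select one predictor on half the data, then do inference on the other half."""
    n = X.shape[0]
    perm = rng.permutation(n)
    sel, inf = perm[: n // 2], perm[n // 2:]

    # Selection half: pick the column most correlated (in absolute value) with y.
    corrs = [abs(np.corrcoef(X[sel, j], y[sel])[0, 1]) for j in range(X.shape[1])]
    j_hat = int(np.argmax(corrs))

    # Inference half: simple regression of y on the selected column, with intercept.
    x, yy = X[inf, j_hat], y[inf]
    xc, yc = x - x.mean(), yy - yy.mean()
    slope = (xc @ yc) / (xc @ xc)
    resid = yc - slope * xc
    se = np.sqrt((resid @ resid) / (len(x) - 2) / (xc @ xc))

    # Approximate 95% interval; valid because this half played no part in selection.
    return j_hat, slope, (slope - 1.96 * se, slope + 1.96 * se)

# Hypothetical data: only column 3 matters, with true slope 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = 2.0 * X[:, 3] + rng.normal(size=400)

j_hat, slope, ci = split_select_infer(X, y, rng)
print(j_hat, slope, ci)
```

The price is the usual one: each stage sees only half the data, so both selection and inference are less precise than they would be on the full sample.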

# What kind of results do we find in causal discovery?

• The kind Kun has just explained to us: if we make these assumptions about the causal structure, and we make those assumptions about the functional forms in the statistical model, then such-and-such a procedure will consistently select the right causal structure
• Constraint-based methods, noise assumptions, etc.
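
To make the constraint-based idea concrete, here is a toy sketch using a collider, $X \rightarrow Z \leftarrow Y$. It assumes a linear-Gaussian model, so that partial correlation with a Fisher z-test is a valid conditional independence test. That is exactly the sort of functional-form assumption such results depend on. The helper names `partial_corr` and `fisher_z_pvalue` and the simulated data are hypothetical.

```python
import math
import numpy as np

def partial_corr(x, y, z=None):
    """Correlation of x and y after linearly regressing out z (if given)."""
    if z is not None:
        Z = np.column_stack([np.ones_like(z), z])
        x = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
        y = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(x, y)[0, 1])

def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for zero (partial) correlation, k conditioning variables."""
    z = math.atanh(r) * math.sqrt(n - k - 3)
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical collider: X and Y are independent causes of Z.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
y = rng.normal(size=n)
z = x + y + 0.5 * rng.normal(size=n)

r_marg = partial_corr(x, y)        # should be near zero
r_cond = partial_corr(x, y, z)     # strongly negative: conditioning on a collider
p_marg = fisher_z_pvalue(r_marg, n, 0)
p_cond = fisher_z_pvalue(r_cond, n, 1)
print(r_marg, p_marg, r_cond, p_cond)
```

Marginal independence together with dependence given $Z$ is the signature that separates a collider from a chain or a fork over the same three variables. A constraint-based procedure reads it off the test results, provided the test's own assumptions hold.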

• Some assumptions are harder to buy than others
• Linear models are very hard for me personally to swallow
• Expanding the range of assumptions under which we know we have consistent discovery procedures (should) make this work an easier sell
• Once we hit the frontier, there will be a trade-off between harder-to-swallow assumptions and more-precise conclusions (at fixed data size)
• Nonparametric conditional independence tests (Zhang et al. 2011) have lower power than partial-correlation tests for linear-Gaussian relations
• This again is no different from any other statistical problem! (Manski 2003)
• Anyone willing to estimate an ATE by propensity-score matching has declared their reservation price…

# Summing up

• We know causal discovery is possible under causal and statistical assumptions comparable to or weaker than the assumptions for causal estimation problems
• We should think of this as a kind of model selection problem
• There is an immense field for statistical and econometric work here

# References

al-Ghazali, Abu Hamid Muhammad ibn Muhammad at-Tusi. n.d. The Incoherence of the Philosophers = Tahafut al-Falasifah: A Parallel English-Arabic Text. Provo, Utah: Brigham Young University Press.

Hume, David. 1739. A Treatise of Human Nature: Being an Attempt to Introduce the Experimental Method of Reasoning into Moral Subjects. London: John Noon.

Manski, Charles F. 2003. Partial Identification of Probability Distributions. New York: Springer-Verlag.

Russell, Bertrand. 1954. Nightmares of Eminent Persons. New York: Simon and Schuster.

Zhang, Kun, Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. 2011. “Kernel-Based Conditional Independence Test and Application in Causal Discovery.” In Proceedings of the Twenty-Seventh Annual Conference on Uncertainty in Artificial Intelligence (UAI-11), edited by Fabio Gagliardi Cozman and Avi Pfeffer, 804–13. Corvallis, Oregon: AUAI Press. http://arxiv.org/abs/1202.3775.