My colleague Tim Doggett and I recently traveled to London to participate in a Risk Prediction Initiative 2.0 workshop on Atlantic Hurricane Volatility. Several of the leading academic scientists who study tropical cyclones were there, as well as scientists from other catastrophe risk modeling companies and reinsurers. The exchange of ideas during workshop talks, coffee breaks, and dinner was fantastic.
Toward the end of the workshop, discussion turned to the connection between the number of storms that form in the open (Atlantic) ocean and the number that ultimately make landfall on the U.S. coast. The discussion reflected a widely held sense in the scientific community that basinwide activity has stronger physical ties to large-scale climate, and is therefore more predictable.
Because of this greater predictability, and because each of the storms forming in the basin can be considered a candidate for landfall, people often ask whether a season's landfalling activity can be predicted from basinwide activity.
From an informatics point of view, this is the wrong question.
One way to model the relationship between basinwide and landfalling activity that better respects their physical connection is as follows.
Landfalling tropical cyclone (TC) counts and non-landfalling, or open ocean, TC counts can be thought of as separate random variables (X and Y, respectively, in the diagram below). Each can be modeled as statistically conditional on the climate conditions known to affect it through physical atmosphere-ocean interactions.
If some of the climate variables used to model each TC count variable are shared, then the two random variables will be conditionally independent given those climate variables, yet still correlated with each other through their relationships to the underlying climate.
The quantities in boxes are random. The dashed and solid arrows between them indicate stochastic and deterministic dependences, respectively, with information flowing in the direction they point.
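To make the idea of counts that are conditionally independent yet correlated concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than part of any actual hurricane model: a single standard-normal climate index, Poisson-distributed counts, a log-linear link, and the particular coefficient values. It compares the correlation between X and Y when climate varies from season to season with the correlation when climate is held fixed.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def simulate_counts(n_seasons, seed, fixed_climate=None):
    """Simulate landfalling (X) and open ocean (Y) counts for n_seasons.

    Given the climate index c, X and Y are drawn independently; they are
    linked only through their shared dependence on c. All coefficients
    below are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_seasons):
        c = rng.gauss(0.0, 1.0) if fixed_climate is None else fixed_climate
        xs.append(sample_poisson(rng, math.exp(0.5 + 0.4 * c)))  # landfalling
        ys.append(sample_poisson(rng, math.exp(2.0 + 0.4 * c)))  # open ocean
    return xs, ys


# Climate varies between seasons: X and Y come out correlated.
x_all, y_all = simulate_counts(20000, seed=1)
marginal_corr = pearson(x_all, y_all)

# Climate held at one value: the correlation should be close to zero.
x_fix, y_fix = simulate_counts(20000, seed=2, fixed_climate=0.0)
conditional_corr = pearson(x_fix, y_fix)

print("correlation with varying climate:", round(marginal_corr, 3))
print("correlation with fixed climate:  ", round(conditional_corr, 3))
```

In this toy setup, all of the association between the two counts comes from the shared climate index; once that index is fixed, the counts carry no information about each other.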
Within this framework, basinwide activity (Z in the diagram) is simply the deterministic sum of landfalling and open ocean activity. In other words, if landfalling and open ocean TC counts are known, then so is the basinwide count.
If the covariance between X and Y is positive, it is possible for basinwide activity to have stronger climatic dependence than landfalling TC counts or open ocean TC counts alone. However, the landfalling TC count is the relevant quantity for hazard predictability.
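The role of the covariance can be written out explicitly. In the sketch below, the symbol C for the shared climate state is an assumption introduced here for illustration. X, Y, and Z are as defined above, and the only ingredients are the conditional independence of X and Y given C and the law of total covariance.

```latex
\begin{align*}
Z &= X + Y, \\
\mathbb{E}[Z \mid C] &= \mathbb{E}[X \mid C] + \mathbb{E}[Y \mid C], \\
\operatorname{Cov}(X, Y)
  &= \underbrace{\mathbb{E}\big[\operatorname{Cov}(X, Y \mid C)\big]}_{=\,0 \text{ by conditional independence}}
   + \operatorname{Cov}\big(\mathbb{E}[X \mid C],\, \mathbb{E}[Y \mid C]\big), \\
\operatorname{Var}\big(\mathbb{E}[Z \mid C]\big)
  &= \operatorname{Var}\big(\mathbb{E}[X \mid C]\big)
   + \operatorname{Var}\big(\mathbb{E}[Y \mid C]\big)
   + 2\operatorname{Cov}(X, Y).
\end{align*}
```

Under these assumptions, any covariance between X and Y arises entirely from climate, and a positive covariance adds directly to the climate-driven variance of the basinwide count Z.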
Predicting basinwide activity first as a route to the landfalling TC count only adds information about the count of open ocean TCs. In other words, basinwide activity won't add any more information about landfalls than is already available from climate conditions alone.
Using available climatic information directly to predict landfalls is a much more efficient use of information. A recent paper from AIR published in the peer-reviewed literature provides one example of building such a model for landfalls depending on climate state.
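As a rough illustration of what predicting landfalls directly from climate can look like, the sketch below fits a log-linear Poisson regression of landfall counts on a single climate index using Newton's method. This is not the model from the AIR paper. The synthetic data, the single standard-normal index, the coefficient values, and the function names are all hypothetical and chosen only to keep the example self-contained.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def fit_poisson_loglinear(climate, counts, iterations=25):
    """Fit log(mean count) = b0 + b1 * climate by Newton's method."""
    b0, b1 = math.log(sum(counts) / len(counts)), 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * c) for c in climate]
        # Gradient of the Poisson log-likelihood.
        g0 = sum(x - m for x, m in zip(counts, mu))
        g1 = sum((x - m) * c for x, m, c in zip(counts, mu, climate))
        # Entries of the 2x2 information matrix (negative Hessian).
        h00 = sum(mu)
        h01 = sum(m * c for m, c in zip(mu, climate))
        h11 = sum(m * c * c for m, c in zip(mu, climate))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic seasons (not real data): the true coefficients are assumptions.
TRUE_B0, TRUE_B1 = 0.5, 0.4
rng = random.Random(7)
climate_index = [rng.gauss(0.0, 1.0) for _ in range(5000)]
landfall_counts = [
    sample_poisson(rng, math.exp(TRUE_B0 + TRUE_B1 * c)) for c in climate_index
]

b0_hat, b1_hat = fit_poisson_loglinear(climate_index, landfall_counts)
print("estimated intercept:", round(b0_hat, 3))
print("estimated climate coefficient:", round(b1_hat, 3))


def expected_landfalls(c):
    """Expected landfall count for a season with climate index c."""
    return math.exp(b0_hat + b1_hat * c)
```

Note that neither the open ocean count nor the basinwide count enters the fit; the landfall model is driven by the climate index alone.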
In general, thinking carefully through the causal relationships between random quantities can help ground statistical models in a physical understanding of the system and ensure a more efficient use of the available information.