Revise _negbinom_ estimation and reorganize `make_forecast` (!5) · Merge requests · CSEP Group / Earthquake Forecast Testing Experiment for Italy / Models / Mock Model

Revise negbinom estimation and reorganize make_forecast

This PR is two-fold, with both aspects depending on each other somehow.

Details:

1. Change negbinom implementation

ATTENTION: this affects case H of floatCSEP!

correctly determine counts
- by using the full catalog span from the date of 1st event (catalog_start.date(), any magnitude) until 'now' (t0)
- to indeed end at 'now', need to adjust the last bin edge created with np.arange()
correctly estimate tau parameter (it's 1 / alpha, not 1. / alpha * mu)
correctly determine mean of the background/'past' seismicity (mean_bg)
- it's not the mu or lambd estimates for Poisson, since negbinom sampling must refer to the mean 'n_events' above mag_min, not above mag_compl adjusted to mag_min
- essentially, mean_bg is just the average of the counts array
- So why can mag_compl not be used in negbinom parameter estimation? Because it is based on a variance estimate (of events >= mag_min), which cannot be adjusted to a different magnitude threshold. I consider this is a weakness compared to using Poisson; perhaps there is a clever solution?
calculate parameters for 'recent' seismicity component differently than for background/'past' seismicity
- since variance for 'recent' seismicity is biased or not defined, use background's dispersion to estimate it from the 'recent' mean
outsource negbinom parameter estimation to a function to facilitate treating background and recent seismicity differently (only based on mean and var)
further simplifications
add some assertions and detailed comments (purely informative)
FYI: former condition if lambd != 0: ... else ... (no recent seismicity) is now caught by:
1. if mean == 0 : return 1, 1 (tau, theta) in calc_negbinom_params(),
2. leading n_events = numpy.random.negative_binomial(tau, theta) to always sample a zero.

2. Reorganize make_forecast, incl. completely separate Poisson/negbinom and use clear names for catalog subsets

Poisson-based and negbinom-based modeling share some catalog subsets for sampling event locations, but use different magnitude-thresholded catalog subsets for parameter estimation (see above). Clarifying those relations necessitated some reorganizations; specific changes:

create and use catalog subsets with self-explanatory names:
- cat_past (past events)
  - → cat_past_compl (formerly cat_total, above mag_compl), used by Poisson and always for sampling event locations
  - → cat_past_minmag (formerly cat_total_mag, above mag_min), used by negbinom
- cat_prev (recent events)
  - → cat_prev_thresh (formerly catalog_prev, above mag_compl or mag_min depending on apply_mc_to_lookback), used by Poisson and always for catalog sampling
  - → cat_prev_mingmag (recent events above mag_min), used by negbinom
completely separate Poisson- and negbinom-related operations (previously, the latter still depended on the former, see above)
rename extension apply_mc_to_lambda to a more general apply_mc_to_lookback ('lambda' suggests compatibility only with Poisson, but sampling locations of recent events above mag_compl can also be used with negbinom).
- ATTENTION: this also affects case H of floatCSEP! Please adapt models.yml!
More parameter explanations in args.txt
- incl. that mag_compl and apply_mc_to_lookback input arguments are not used in negbinom parameter estimation.

Edited Feb 19, 2025 by Marcus Herrmann

Revise _negbinom_ estimation and reorganize `make_forecast`

Merge request reports