Add bivariate copula implementation#137

Open
chrischu12 wants to merge 98 commits into simon-hirsch:main from chrischu12:bivariate_copula
Conversation

@chrischu12

No description provided.

Owner

@simon-hirsch simon-hirsch left a comment

The base class for the copulas is missing: right now, every distribution needs to implement the param_link parameter itself. This should be solved by a CopulaMixin class that implements the necessary methods.
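A minimal sketch of what such a CopulaMixin could look like. Everything here besides the name CopulaMixin (the method names, the demo link, and the demo copula class) is an illustrative assumption, not the PR's actual API:

```python
import numpy as np

class FisherZDemoLink:
    """Toy atanh/tanh link used only to exercise the mixin below."""

    def link(self, x):
        return np.arctanh(x)

    def inverse(self, x):
        return np.tanh(x)

class CopulaMixin:
    """Hypothetical mixin: copula distributions declare their parameter
    links once, and the shared methods below do the mapping, so each
    copula no longer re-implements the param_link plumbing."""

    # Subclasses are assumed to set this, e.g. {0: FisherZLink()}.
    param_links: dict = {}

    def param_link(self, theta, param: int = 0):
        # Natural copula parameter space -> unconstrained predictor space.
        return self.param_links[param].link(theta)

    def param_link_inverse(self, eta, param: int = 0):
        # Unconstrained predictor space -> natural copula parameter space.
        return self.param_links[param].inverse(eta)

class BivariateCopulaDemo(CopulaMixin):
    # A copula class now only declares its links.
    param_links = {0: FisherZDemoLink()}
```

With this shape, the estimator can call `param_link_inverse` on any copula without importing or checking for specific copula classes.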

@simon-hirsch simon-hirsch requested review from BerriJ and Copilot July 3, 2025 09:13
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds support for bivariate copulas by introducing new link functions and multivariate copula distributions, and integrates them into the online multivariate distribution regression estimator.

  • Introduce FisherZLink and TauToPar in copulalinks.py
  • Extend base Distribution to accept param_links and update existing distributions
  • Add BiCopNormal and MarginalCopula classes for Gaussian copula modeling
  • Integrate copula-based parameter handling into online_mvdistreg
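For orientation, the Fisher z transform behind a link like FisherZLink maps a correlation rho in (-1, 1) onto the whole real line via z = atanh(rho) = 0.5 * log((1 + rho) / (1 - rho)). The class below is a hedged, self-contained sketch of such a link, not the PR's actual implementation (the eps guard is an assumption):

```python
import numpy as np

class FisherZLink:
    """Illustrative Fisher z link for a correlation parameter in (-1, 1)."""

    def __init__(self, eps: float = 1e-8):
        self.eps = eps  # keep rho away from the +/-1 boundary

    def link(self, rho: np.ndarray) -> np.ndarray:
        # z = atanh(rho) = 0.5 * log((1 + rho) / (1 - rho))
        rho = np.clip(rho, -1 + self.eps, 1 - self.eps)
        return np.arctanh(rho)

    def inverse(self, z: np.ndarray) -> np.ndarray:
        # rho = tanh(z), strictly inside (-1, 1)
        return np.tanh(z)

    def link_derivative(self, rho: np.ndarray) -> np.ndarray:
        # d/d rho [atanh(rho)] = 1 / (1 - rho^2)
        rho = np.clip(rho, -1 + self.eps, 1 - self.eps)
        return 1.0 / (1.0 - rho**2)
```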

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Show a summary per file

src/ondil/link/copulalinks.py - New Fisher Z and Kendall's-tau link functions
src/ondil/link/__init__.py - Export copula link functions
src/ondil/base/distribution.py - Add support for param_links in base Distribution
src/ondil/distributions/normal.py - Update constructor signature for param_links
src/ondil/distributions/mv_t_low_rank.py - Pass param_links through to superclass
src/ondil/distributions/mv_normal_chol.py - Pass param_links through to superclass
src/ondil/distributions/mv_marg_cop.py - New MarginalCopula distribution
src/ondil/distributions/bicop_normal.py - New BiCopNormal Gaussian copula distribution
src/ondil/distributions/__init__.py - Register new copula distributions
src/ondil/estimators/online_mvdistreg.py - Integrate copula option and branching logic into the fitting routine
Comments suppressed due to low confidence (3)

src/ondil/distributions/normal.py:47

  • The param_links parameter should be typed as dict[int, LinkFunction] instead of LinkFunction, and default to an empty dict. Update the annotation to param_links: dict[int, LinkFunction] = {}.
        param_links: LinkFunction = {},

src/ondil/distributions/bicop_normal.py:33

  • The constructor uses a singular param_link but the base class expects param_links: dict[int, LinkFunction]. Consider renaming to param_links and accepting a mapping keyed by parameter index.
        param_link: LinkFunction = TauToPar(),

src/ondil/estimators/online_mvdistreg.py:690

  • The code previously scaled X and stored it in X_scaled but now passes the raw X to _outer_fit, bypassing the scaling step. It should likely be X_scaled here.
        self._outer_fit(X=X, y=y, theta=theta)

@simon-hirsch simon-hirsch mentioned this pull request Jul 3, 2025
@simon-hirsch simon-hirsch changed the base branch from mvdistreg to main September 11, 2025 14:09
@simon-hirsch
Copy link
Owner

@chrischu12 I have changed the base branch, as #134 is merged now.

@simon-hirsch simon-hirsch requested a review from Copilot October 8, 2025 07:53
Contributor

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 12 comments.



# If no regularization is allowed, we just return the theta
return theta

def _make_initial_eta(self, theta: np.ndarray):
Owner

Should this be on the copula class?

theta[a] = self.distribution.set_initial_guess(y, theta[a], p)
theta = self._handle_path_regularization(theta=theta, p=p, a=a)
# if (inner_iteration == 0) and (outer_iteration == 0) & (a == 0):
# theta[a] = self.distribution.set_initial_guess(y, theta[a], p)
Owner

This should not be commented out, as it is necessary for the multivariate case. Properly switch between the copula and the MV case in the fitting routine.

dl2_link = self.distribution.cube_to_flat(dl2_link, param=p)
dl2_link = dl2_link[:, k]
else:
eta = self._make_initial_eta(theta)
Owner

This is duplicated above

# If the likelihood is at some point decreasing, we're breaking
# Hence we need to store previous iteration values:
if (inner_iteration > 0) | (outer_iteration > 0):
if (inner_iteration == 0) and (outer_iteration == 0):
Owner

Duplicated below.

eta_elem, param=p, k=k, d=self.dim_
)

if issubclass(self.distribution.__class__, CopulaMixin):
Owner

This block should become an eta_to_theta_element method on the copula class; then we don't need to import and check for specific copulas in the main estimator class.

param=p,
).reshape(-1, 1)

if isinstance(
Owner

Same for this block: it should also be a function on the copula class (the same eta_to_theta_element as above).

old_likelihood = self._current_likelihood[a]

# If the LL is decreasing, we're resetting to the previous iteration
if ((outer_iteration > 0) | (inner_iteration > 1)) & decreasing:
Owner

Why was this changed to > 0?

Comment on lines +1650 to +1654
            if issubclass(self.distribution.__class__, CopulaMixin) and p == 0:
                out[a][p] = np.tanh(out[a][p] / 2) * (1 - 1e-8)
                out[a][p] = self.distribution.param_link_inverse(
                    out[a][p] * (1 - 1e-8), param=0
                ) * (1 - 1e-8)
Owner

This pattern is duplicated a few times across the code. Can this be moved in the param_link_inverse for the copula?
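One hedged way such a consolidation could look, as a free function for illustration (the helper name, signature, and the identity-link usage below are assumptions, not the PR's API):

```python
import numpy as np

SHRINK = 1 - 1e-8  # the (1 - 1e-8) guard that appears in each duplicated snippet

def squash_and_invert(eta, param_link_inverse, param: int = 0):
    """Hypothetical consolidation of the repeated
    tanh(eta / 2) -> param_link_inverse -> shrink pattern, so the estimator
    calls one method instead of repeating the three lines per call site.
    `param_link_inverse` stands in for the copula's own inverse link."""
    out = np.tanh(np.asarray(eta, dtype=float) / 2) * SHRINK
    return param_link_inverse(out * SHRINK, param=param) * SHRINK

# With an identity parameter link, the result stays strictly inside (-1, 1)
# even for arbitrarily large predictors.
identity = lambda x, param=0: x
```

Hanging this helper off the copula mixin (rather than a free function) would match the eta_to_theta_element suggestion above.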

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 14 comments.

Comments suppressed due to low confidence (2)

src/ondil/distributions/__init__.py:66

  • New copula distributions and copula link functions are added, but there are no corresponding unit tests under tests/ (the existing distribution/link tests exclude CopulaMixin). Adding at least basic shape/domain tests and a few numeric sanity checks for logpdf and derivatives would help prevent regressions in the estimator integration.
from .beta import Beta
from .betainflated import BetaInflated
from .betainflatedzero import BetaInflatedZero
from .bicop_clayton import BivariateCopulaClayton
from .bicop_gumbel import BivariateCopulaGumbel
from .bicop_normal import BivariateCopulaNormal
from .bicop_studentt import BivariateCopulaStudentT
from .exponential import Exponential
from .gamma import Gamma
from .gumbel import Gumbel
from .inversegamma import InverseGamma
from .inversegaussian import InverseGaussian
from .johnsonsu import JSU
from .logistic import Logistic
from .lognormal import LogNormal
from .lognormalmedian import LogNormalMedian
from .mv_normal_chol import MultivariateNormalInverseCholesky
from .mv_normal_low_rank import MultivariateNormalInverseLowRank
from .mv_normal_modchol import MultivariateNormalInverseModifiedCholesky
from .mv_t_chol import MultivariateStudentTInverseCholesky
from .mv_t_low_rank import MultivariateStudentTInverseLowRank
from .mv_t_modchol import MultivariateStudentTInverseModifiedCholesky
from .normal import Normal, NormalMeanVariance
from .poisson import Poisson
from .powerexponential import PowerExponential
from .reversegumbel import ReverseGumbel
from .skew_t import SkewT, SkewTMeanStd
from .studentt import StudentT
from .weibull import Weibull
from .zeroadjustedgamma import ZeroAdjustedGamma

__all__ = [
    "Normal",
    "NormalMeanVariance",
    "StudentT",
    "SkewT",
    "SkewTMeanStd",
    "JSU",
    "BetaInflated",
    "Gamma",
    "Beta",
    "LogNormal",
    "LogNormalMedian",
    "Logistic",
    "Exponential",
    "Poisson",
    "PowerExponential",
    "Gumbel",
    "InverseGaussian",
    "ReverseGumbel",
    "InverseGamma",
    "MultivariateNormalInverseCholesky",
    "MultivariateNormalInverseLowRank",
    "MultivariateNormalInverseModifiedCholesky",
    "MultivariateStudentTInverseCholesky",
    "MultivariateStudentTInverseLowRank",
    "MultivariateStudentTInverseModifiedCholesky",
    "BetaInflatedZero",
    "ZeroAdjustedGamma",
    "BivariateCopulaNormal",
    "BivariateCopulaGumbel",
    "BivariateCopulaClayton",
    "BivariateCopulaStudentT",
    "Weibull",
]

src/ondil/estimators/online_mvdistreg.py:1344

  • The decreasing-likelihood rollback guard uses outer_iteration >= 0, which is always true. If decreasing becomes true on the first iteration, prev_theta/prev_* were never initialized, leading to an UnboundLocalError. This condition likely should mirror the original intent (only rollback when previous-state snapshots exist).
            if ((outer_iteration >= 0) | (inner_iteration > 1)) & decreasing:
                warnings.warn("Likelihood is decreasing. Breaking.")
                # Reset to values from the previous iteration
                theta = prev_theta
                self._model_selection = prev_model_selection
                self._x_gram[p] = prev_x_gram
                self._y_gram[p] = prev_y_gram
                self.coef_ = prev_beta
                self.coef_path_ = prev_beta_path
                self._current_likelihood[a] = old_likelihood


Comment on lines +166 to +170
        return 1.0 / (1.0 - np.abs(x)) ** 2

    def link_second_derivative(self, x: np.ndarray) -> np.ndarray:
        return -2.0 * np.sign(x) / (np.abs(x) - 1.0) ** 3

Copilot AI Mar 4, 2026

GumbelParameterToKendallsTau.link_derivative / link_second_derivative are inconsistent with link() (which returns sign(x) - 1/x_safe). The derivative of -1/x is 1/x^2 (up to the near-zero safeguard), not a function of (1-|x|); this will make the chain rule terms incorrect.

Suggested change

        """
        First derivative of link(x) = sign(x) - 1/x_safe.
        For |x| >= eps, x_safe = x and d/dx[-1/x] = 1/x^2.
        For |x| < eps, x_safe is constant (eps), so the derivative is 0.
        """
        x = np.asarray(x, dtype=float)
        abs_x = np.abs(x)
        # For |x| < eps, link() uses a constant 1/x_safe term -> derivative 0.
        x_safe = np.where(abs_x < self.eps, self.eps, x)
        return np.where(abs_x < self.eps, 0.0, 1.0 / (x_safe ** 2))

    def link_second_derivative(self, x: np.ndarray) -> np.ndarray:
        """
        Second derivative of link(x) consistent with link_derivative().
        For |x| >= eps, d/dx[1/x^2] = -2/x^3.
        For |x| < eps, link_derivative is 0, so the second derivative is 0.
        """
        x = np.asarray(x, dtype=float)
        abs_x = np.abs(x)
        x_safe = np.where(abs_x < self.eps, self.eps, x)
        return np.where(abs_x < self.eps, 0.0, -2.0 / (x_safe ** 3))

Comment on lines +588 to +606
    def element_link_function(
        self,
        y: np.ndarray,
        param: int = 0,
        k: int = 0,
        d: int = 0,
    ) -> np.ndarray:
        """Apply the element-wise link function for a copula parameter."""
        return self.links[param].element_link(y)

    def element_link_function_derivative(
        self,
        y: np.ndarray,
        param: int = 0,
        k: int = 0,
        d: int = 0,
    ) -> np.ndarray:
        """Apply the first derivative of the element-wise link function."""
        return self.links[param].element_derivative(y)
Copilot AI Mar 4, 2026

element_link_function* methods call LinkFunction.element_link / element_derivative, but the base LinkFunction API only defines link / link_derivative (element-wise variants exist only for matrix link implementations). As written, these helpers will raise AttributeError for scalar copula links like FisherZLink; consider delegating to link / link_derivative (or gating on matrix-shaped parameters).

"""
fitted = self.flat_to_cube(eta, param=param)
fitted = self.link_inverse(fitted, param=param)
return self.log_likelihood(y, theta={**theta, param: fitted})
Copilot AI Mar 4, 2026

param_conditional_likelihood calls self.log_likelihood(...), but Distribution exposes loglikelihood(...) / logpdf(...) rather than log_likelihood. This will raise at runtime if used; update to the correct method name (and ensure the call signature matches how copula distributions represent theta).

Suggested change
return self.log_likelihood(y, theta={**theta, param: fitted})
return self.loglikelihood(y, theta={**theta, param: fitted})

Comment on lines +1658 to 1680
        for p in range(self.distribution.n_params):
            array = np.zeros((N, self.n_dist_elements_[p]))
            for k in range(self.n_dist_elements_[p]):
                array[:, k] = (
                    make_model_array(
                        X=X_scaled,
                        eq=self._equation[p][k],
                        fit_intercept=self._fit_intercept[p],
                    )
                    @ self.coef_[p][k][a, :]
                ).squeeze()

            out[a][p] = self.distribution.flat_to_cube(array, p)
            if not issubclass(self.distribution.__class__, CopulaMixin) or (
                issubclass(self.distribution.__class__, CopulaMixin) and p == 1
            ):
                out[a][p] = self.distribution.link_inverse(out[a][p], p)
            if issubclass(self.distribution.__class__, CopulaMixin) and p == 0:
                out[a][p] = np.tanh(out[a][p] / 2) * (1 - 1e-8)
                out[a][p] = self.distribution.param_link_inverse(
                    out[a][p] * (1 - 1e-8), param=0
                ) * (1 - 1e-8)

Copilot AI Mar 4, 2026

predict_all_adr() recomputes out[a][p] in a second unconditional loop (lines ~1658+), even after the a <= optimal_adr_/else branch. This overwrites the earlier results and also defeats the a > optimal_adr_ fast-path copy; it looks like an indentation/duplication bug and can yield inconsistent predictions.

Comment on lines +1543 to +1548
        # out = self.distribution.link_inverse(out, 0)
        out = np.tanh(out / 2) * (1 - 1e-8)
        if issubclass(self.distribution.__class__, CopulaMixin):
            out = self.distribution.param_link_inverse(out * (1 - 1e-8), param=0) * (
                1 - 1e-8
            )
Copilot AI Mar 4, 2026

predict() now applies tanh(out/2) unconditionally (even for non-copula distributions) and no longer calls distribution.link_inverse. This will return parameters on the wrong scale for standard multivariate distributions; apply the tanh/param-link logic only for CopulaMixin (and otherwise keep link_inverse).

Suggested change

        if issubclass(self.distribution.__class__, CopulaMixin):
            # For copula distributions, first map the linear predictor through
            # tanh to ensure parameters lie in (-1, 1), then apply the
            # copula-specific parameter link inverse.
            out = np.tanh(out / 2) * (1 - 1e-8)
            out = self.distribution.param_link_inverse(out, param=0)
        else:
            # For non-copula distributions, use the standard link inverse to map
            # from the linear predictor scale back to the distribution
            # parameter scale.
            out = self.distribution.link_inverse(out, 0)

Comment on lines +115 to +121
        return (np.pi / 2) * np.cos((np.pi / 2) * x)

    def link_second_derivative(self, x: np.ndarray) -> np.ndarray:
        return -((np.pi / 2) ** 2) * np.sin((np.pi / 2) * x)

    def inverse_derivative(self, x: np.ndarray) -> np.ndarray:
        return (2 / np.pi) / np.sqrt(1 - x**2)
Copilot AI Mar 4, 2026

ParameterToKendallsTau.link_derivative / link_second_derivative and inverse_derivative do not match the documented transforms. For tau = (2/pi)*arcsin(rho), link_derivative should be (2/pi)/sqrt(1 - rho^2) (and inverse_derivative should be (pi/2)*cos((pi/2)*tau)); the current implementation appears swapped, which will break gradient/Hessian-based estimation.

Suggested change

        # d/d rho [ (2/pi) * arcsin(rho) ] = (2/pi) / sqrt(1 - rho^2)
        return (2 / np.pi) / np.sqrt(1 - x**2)

    def link_second_derivative(self, x: np.ndarray) -> np.ndarray:
        # d^2/d rho^2 [ (2/pi) * arcsin(rho) ] = (2/pi) * rho * (1 - rho^2)^(-3/2)
        return (2 / np.pi) * x * (1 - x**2) ** (-1.5)

    def inverse_derivative(self, x: np.ndarray) -> np.ndarray:
        # d/d tau [ sin(pi/2 * tau) ] = (pi/2) * cos(pi/2 * tau)
        return (np.pi / 2) * np.cos((np.pi / 2) * x)

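Swapped derivatives like this are easy to catch numerically. The standalone sketch below (free functions with assumed names, not the project's code) implements the documented transforms tau = (2/pi)*arcsin(rho) and rho = sin((pi/2)*tau) so each analytic derivative can be compared against a central finite difference:

```python
import numpy as np

def link(rho):
    # Documented forward map: tau = (2/pi) * arcsin(rho)
    return (2 / np.pi) * np.arcsin(rho)

def link_derivative(rho):
    # d/d rho [(2/pi) * arcsin(rho)] = (2/pi) / sqrt(1 - rho^2)
    return (2 / np.pi) / np.sqrt(1 - rho**2)

def inverse(tau):
    # Documented inverse map: rho = sin((pi/2) * tau)
    return np.sin((np.pi / 2) * tau)

def inverse_derivative(tau):
    # d/d tau [sin((pi/2) * tau)] = (pi/2) * cos((pi/2) * tau)
    return (np.pi / 2) * np.cos((np.pi / 2) * tau)

def finite_diff(f, x, h=1e-6):
    # Central finite difference, O(h^2) accurate.
    return (f(x + h) - f(x - h)) / (2 * h)
```

A test of this form in the suite (the review notes the copula links currently have none) would have flagged the swap directly.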
Comment on lines +201 to +219
    def link(self, x: np.ndarray) -> np.ndarray:
        return x / (2 + np.abs(x))

    def inverse(self, x: np.ndarray) -> np.ndarray:
        abs_x = np.clip(np.abs(x), 0, 1 - self.eps)
        return 2 * x / (1 - abs_x)

    def link_derivative(self, x: np.ndarray) -> np.ndarray:
        abs_x = np.clip(np.abs(x), 0, 1 - self.eps)
        return 2 / (1 - abs_x) ** 2

    def link_second_derivative(self, x: np.ndarray) -> np.ndarray:
        abs_x = np.clip(np.abs(x), 0, 1 - self.eps)
        sign_x = np.sign(x)
        return 4 * sign_x / (1 - abs_x) ** 3

    def inverse_derivative(self, x: np.ndarray) -> np.ndarray:
        abs_x = np.clip(np.abs(x), 0, 1 - self.eps)
        return 2 / (1 - abs_x) ** 2
Copilot AI Mar 4, 2026

ClaytonParameterToKendallsTau.link_derivative / link_second_derivative currently implement derivatives of the inverse mapping (theta = 2*tau/(1-|tau|)), not the derivative of link() (tau = theta/(2+|theta|)). This will yield incorrect gradients/Hessians in estimation; update these methods to differentiate the actual link() expression (including handling of abs).

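Derivatives consistent with the forward map link(x) = x / (2 + |x|) follow directly from the quotient rule. This is a hedged sketch of what corrected versions might look like (free functions for illustration, not the PR's code), again checkable by finite differences:

```python
import numpy as np

def link(x):
    # Forward map: tau = x / (2 + |x|)
    x = np.asarray(x, dtype=float)
    return x / (2 + np.abs(x))

def link_derivative(x):
    # d/dx [x / (2 + |x|)] = 2 / (2 + |x|)^2 (valid for all x, incl. x = 0)
    x = np.asarray(x, dtype=float)
    return 2.0 / (2 + np.abs(x)) ** 2

def link_second_derivative(x):
    # d/dx [2 * (2 + |x|)^-2] = -4 * sign(x) / (2 + |x|)^3
    x = np.asarray(x, dtype=float)
    return -4.0 * np.sign(x) / (2 + np.abs(x)) ** 3
```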
Comment on lines +620 to +621
        else:
            self.adr_steps_ = 1
Copilot AI Mar 4, 2026

In fit(), when max_regularisation_size is None the code forces self.adr_steps_ = 1, overriding distribution.get_regularization_size(...). This effectively disables AD-R regularization for all distributions by default (not just copulas) and looks like an unintended behavior change; consider keeping the computed size unless explicitly capped.

Suggested change
else:
self.adr_steps_ = 1

Comment on lines +1238 to +1244
# Compute dI/dp via inbeder for masked positions
idxs = np.flatnonzero(mask)
dIdp_vals = np.empty(idxs.size, dtype=float)
_, dIdp, _ = inbeder_vec_numba(xmax, t2, 0.5)

dIdp = np.zeros_like(x_b, dtype=float)
dIdp.flat[idxs] = dIdp_vals
Copilot AI Mar 4, 2026

_diff_quantile_nu() allocates dIdp_vals but never fills it, and then assigns those uninitialized values into dIdp; the actual dIdp returned by inbeder_vec_numba(...) is ignored. This makes the quantile derivative incorrect (and can introduce NaNs due to uninitialized memory).

Suggested change

    # Compute dI/dp via inbeder
    _, dIdp_all, _ = inbeder_vec_numba(xmax, t2, 0.5)
    # Initialize the full dIdp array and fill only the masked positions
    dIdp = np.zeros_like(x_b, dtype=float)
    dIdp[mask] = dIdp_all[mask]

4 participants