R internals

These wrappers back the optional R-powered functionality exposed through soundscapy.spi and soundscapy.satp.

soundscapy.r_wrapper._r_wrapper

Internal R integration for skew-normal and circumplex model calculations.

Wraps rpy2 to expose R's sn package and the bundled CircE BFGS scripts. Session state is held in a single module-level RSession dataclass instance (_state). Call get_r_session to obtain it; the session is initialised lazily on first access.

Not intended for direct use; all public names are re-exported from soundscapy.spi and soundscapy.satp.

CLASS DESCRIPTION
RSession

State container for the active R session.

FUNCTION DESCRIPTION
get_r_session

Return the active R session, initialising lazily on first call.

reset_r_session

Reset all session state, forcing re-initialisation on the next call.

install_r_packages

Deprecated no-op; install the R sn package directly from an R session instead.

sample_mtsn

Sample from a multivariate truncated skew-normal distribution.

dp2cp

Convert Direct Parameters (DP) to Centred Parameters (CP).

cp2dp

Convert Centred Parameters (CP) to Direct Parameters (DP).

bfgs_fit

Fit a circumplex model and return extracted fit statistics.

RSession dataclass

RSession(
    sn: Any = None, base: Any = None, active: bool = False
)

State container for the active R session.

A single module-level instance (_state) holds the loaded R package objects. get_r_session initialises it lazily on first call; reset_r_session clears it to force re-initialisation.

ATTRIBUTE DESCRIPTION
sn

Loaded sn R package object.

TYPE: Any

base

Loaded base R package object.

TYPE: Any

active

True once initialisation has completed successfully.

TYPE: bool
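
The lazy-initialisation pattern described above can be sketched without R or rpy2. This is a minimal illustration only: Session, INIT_LOG, get_session, and reset_session are hypothetical stand-ins for RSession, get_r_session, and reset_r_session, and the setup step stores placeholder objects instead of loading R packages.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Session:
    """Hypothetical stand-in for RSession (same three fields)."""

    sn: Any = None
    base: Any = None
    active: bool = False


_state = Session()
INIT_LOG: list[str] = []  # records each time the setup step runs (illustration only)


def _initialize() -> None:
    """Stand-in for the real setup, which would load R packages via rpy2."""
    INIT_LOG.append("initialised")
    _state.sn = object()  # placeholder for the loaded sn package object
    _state.base = object()  # placeholder for the loaded base package object
    _state.active = True


def get_session() -> Session:
    """Return the module-level session, running setup only if it is inactive."""
    if not _state.active:
        _initialize()
    return _state


def reset_session() -> None:
    """Rebind the module-level state to a fresh, inactive instance."""
    global _state
    _state = Session()


first = get_session()
second = get_session()  # no second setup: the state is already active
reset_session()
third = get_session()  # setup runs again after the reset
```

Because a reset rebinds the module-level name to a new instance, a reference obtained before the reset (first above) keeps pointing at the old object. Calling the getter again after a reset avoids working with stale state.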

get_r_session

get_r_session() -> RSession

Return the active R session, initialising lazily on first call.

RETURNS DESCRIPTION
RSession

The module-level _state instance.

RAISES DESCRIPTION
ImportError

If R is not installed or its version is too old, if the sn R package is missing or too old, or if the embedded CircE scripts cannot be sourced.

RuntimeError

If session initialisation fails for any other reason.

Source code in src/soundscapy/r_wrapper/_r_wrapper.py
def get_r_session() -> RSession:
    """Return the active R session, initialising lazily on first call.

    Returns
    -------
    :
        The module-level ``_state`` instance.

    Raises
    ------
    ImportError
        If R is not installed or its version is too old, if the ``sn``
        R package is missing or too old, or if the embedded CircE scripts
        cannot be sourced.
    RuntimeError
        If session initialisation fails for any other reason.
    """
    if not _state.active:
        _initialize_r_session()
    return _state

reset_r_session

reset_r_session() -> bool

Reset all session state, forcing re-initialisation on the next call.

Note: the R process itself continues running, because rpy2 does not support terminating the embedded R interpreter.

RETURNS DESCRIPTION
bool

True if successful, False if an error occurred.

Source code in src/soundscapy/r_wrapper/_r_wrapper.py
def reset_r_session() -> bool:
    """Reset all session state, forcing re-initialisation on the next call.

    Note: the *R process itself continues running* — rpy2 does not support
    terminating the embedded R interpreter.

    Returns
    -------
    :
        ``True`` if successful, ``False`` if an error occurred.
    """
    global _state  # noqa: PLW0603

    try:
        import gc  # noqa: PLC0415

        was_active = _state.active
        _state = RSession()
        gc.collect()
        if was_active:
            logger.info("R session successfully reset")
        else:
            logger.debug("R session state cleared")
    except Exception:
        logger.exception("Error during R session reset")
        return False
    else:
        return True

install_r_packages

install_r_packages(
    packages: list[str] | None = None,
) -> None

Deprecated: this function is a no-op and will be removed in a future release. Install the R sn package directly from an R session:

    install.packages('sn')

Source code in src/soundscapy/r_wrapper/_r_wrapper.py
def install_r_packages(packages: list[str] | None = None) -> None:  # noqa: ARG001
    """
    .. deprecated::
        This function is a no-op and will be removed in a future release.
        Install the R ``sn`` package directly from an R session::

            install.packages('sn')
    """
    warnings.warn(
        "install_r_packages() is deprecated and will be removed in a future release. "
        "Install the R 'sn' package directly from R: install.packages('sn')",
        DeprecationWarning,
        stacklevel=2,
    )

sample_mtsn

sample_mtsn(
    selm_model: RS4 | None = None,
    xi: ndarray | None = None,
    omega: ndarray | None = None,
    alpha: ndarray | None = None,
    a: float = -1,
    b: float = 1,
    n: int = 1000,
    max_iter: int = 100000,
) -> np.ndarray

Sample from a multivariate truncated skew-normal distribution.

Uses rejection sampling to ensure samples fall within [a, b] for both dimensions.

PARAMETER DESCRIPTION
selm_model

Fitted SELM model from R's sn package. If provided, xi, omega, and alpha are ignored.

TYPE: RS4 | None DEFAULT: None

xi

Location parameter (2×1 array).

TYPE: ndarray | None DEFAULT: None

omega

Scale matrix (2×2 array).

TYPE: ndarray | None DEFAULT: None

alpha

Skewness parameter (2×1 array).

TYPE: ndarray | None DEFAULT: None

a

Lower truncation bound for both dimensions.

TYPE: float DEFAULT: -1

b

Upper truncation bound for both dimensions.

TYPE: float DEFAULT: 1

n

Number of samples to generate.

TYPE: int DEFAULT: 1000

max_iter

Maximum total candidate draws before raising RuntimeError.

TYPE: int DEFAULT: 100000

RETURNS DESCRIPTION
ndarray

Array of samples (n × 2).

RAISES DESCRIPTION
ValueError

If neither selm_model nor all of xi, omega, alpha are given.

RuntimeError

If max_iter draws are exhausted before n accepted samples are collected.

Source code in src/soundscapy/r_wrapper/_r_wrapper.py
def sample_mtsn(
    selm_model: RS4 | None = None,
    xi: np.ndarray | None = None,
    omega: np.ndarray | None = None,
    alpha: np.ndarray | None = None,
    a: float = -1,
    b: float = 1,
    n: int = 1000,
    max_iter: int = 100_000,
) -> np.ndarray:
    """
    Sample from a multivariate truncated skew-normal distribution.

    Uses rejection sampling to ensure samples fall within ``[a, b]`` for both
    dimensions.

    Parameters
    ----------
    selm_model
        Fitted SELM model from R's ``sn`` package.  If provided, ``xi``,
        ``omega``, and ``alpha`` are ignored.
    xi
        Location parameter (2×1 array).
    omega
        Scale matrix (2×2 array).
    alpha
        Skewness parameter (2×1 array).
    a
        Lower truncation bound for both dimensions.
    b
        Upper truncation bound for both dimensions.
    n
        Number of samples to generate.
    max_iter
        Maximum total candidate draws before raising ``RuntimeError``.

    Returns
    -------
    :
        Array of samples (n × 2).

    Raises
    ------
    ValueError
        If neither ``selm_model`` nor all of ``xi``, ``omega``, ``alpha`` are given.
    RuntimeError
        If ``max_iter`` draws are exhausted before ``n`` accepted samples are
        collected.
    """
    if selm_model is None and not (
        xi is not None and omega is not None and alpha is not None
    ):
        raise ValueError("Either selm_model or xi, omega, and alpha must be provided.")

    accepted: list[np.ndarray] = []
    total_drawn = 0
    batch_size = max(n, 64)

    while len(accepted) < n:
        remaining_budget = max_iter - total_drawn
        if remaining_budget <= 0:
            raise RuntimeError(
                f"sample_mtsn: reached max_iter={max_iter} without collecting "
                f"{n} accepted samples (got {len(accepted)}). "
                "The distribution may have negligible mass inside "
                f"[{a}, {b}]. Adjust the bounds or increase max_iter."
            )
        current_batch_size = min(batch_size, remaining_budget)
        candidates = sample_msn(
            selm_model=selm_model,
            xi=xi,
            omega=omega,
            alpha=alpha,
            n=current_batch_size,
        )
        total_drawn += current_batch_size
        in_bounds = (
            (candidates[:, 0] >= a)
            & (candidates[:, 0] <= b)
            & (candidates[:, 1] >= a)
            & (candidates[:, 1] <= b)
        )
        accepted.extend(candidates[in_bounds])

    return np.vstack(accepted[:n])
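
The budgeted rejection loop above can be tried without R. The following sketch keeps the same control flow (batched draws, a cap on total candidates, truncation to n) but swaps sample_msn for a hypothetical stand-in, draw_candidates, that draws from an untruncated 2-D normal using only the standard library. It illustrates the technique and is not the library's implementation.

```python
import random


def draw_candidates(n: int, rng: random.Random) -> list[tuple[float, float]]:
    """Hypothetical stand-in for sample_msn: n points from an untruncated 2-D normal."""
    return [(rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(n)]


def sample_truncated(
    n: int,
    a: float = -1.0,
    b: float = 1.0,
    max_iter: int = 100_000,
    seed: int = 0,
) -> list[tuple[float, float]]:
    """Collect n points with both coordinates inside [a, b] by rejection sampling."""
    rng = random.Random(seed)
    accepted: list[tuple[float, float]] = []
    total_drawn = 0
    batch_size = max(n, 64)  # draw at least 64 candidates per batch

    while len(accepted) < n:
        remaining_budget = max_iter - total_drawn
        if remaining_budget <= 0:
            raise RuntimeError(
                f"reached max_iter={max_iter} with only {len(accepted)} of {n} samples"
            )
        batch = draw_candidates(min(batch_size, remaining_budget), rng)
        total_drawn += len(batch)
        # Keep only candidates whose two coordinates both fall inside [a, b].
        accepted.extend(p for p in batch if a <= p[0] <= b and a <= p[1] <= b)

    return accepted[:n]  # the last batch may overshoot, so trim to n


samples = sample_truncated(200)
```

The max_iter cap bounds the total number of candidates drawn rather than the number of loop iterations, so the loop terminates with a RuntimeError when the distribution has almost no mass inside the bounds.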

dp2cp

dp2cp(
    xi: ndarray,
    omega: ndarray,
    alpha: ndarray,
    family: Literal["SN", "ESN", "ST", "SC"] = "SN",
) -> tuple

Convert Direct Parameters (DP) to Centred Parameters (CP).

PARAMETER DESCRIPTION
xi

Location parameter (2×1 array).

TYPE: ndarray

omega

Scale matrix (2×2 array).

TYPE: ndarray

alpha

Skewness parameter (2×1 array).

TYPE: ndarray

family

Distribution family.

TYPE: Literal['SN', 'ESN', 'ST', 'SC'] DEFAULT: 'SN'

RETURNS DESCRIPTION
tuple

Tuple of centred parameters (mean, sigma, skew).

Source code in src/soundscapy/r_wrapper/_r_wrapper.py
def dp2cp(
    xi: np.ndarray,
    omega: np.ndarray,
    alpha: np.ndarray,
    family: Literal["SN", "ESN", "ST", "SC"] = "SN",
) -> tuple:
    """Convert Direct Parameters (DP) to Centred Parameters (CP).

    Parameters
    ----------
    xi
        Location parameter (2×1 array).
    omega
        Scale matrix (2×2 array).
    alpha
        Skewness parameter (2×1 array).
    family
        Distribution family.

    Returns
    -------
    :
        Tuple of centred parameters ``(mean, sigma, skew)``.
    """
    r = get_r_session()
    dp_r = robjects.ListVector(
        {
            "xi": robjects.FloatVector(xi.T),
            "Omega": _np2rmat(omega),
            "alpha": robjects.FloatVector(alpha),
        }
    )
    cp_r = r.sn.dp2cp(dp_r, family=family)
    return tuple(_r2np(cp_r[i]) for i in range(len(cp_r)))
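
For intuition about what the conversion computes, the univariate skew-normal ("SN") case has a closed form. The sketch below is a pure-Python illustration of those standard formulas; sn_dp_to_cp is a hypothetical helper, not part of soundscapy or the R sn package, and it covers only one dimension, whereas the wrapper above passes 2-D parameters to R. In this univariate sketch the second returned value is a standard deviation.

```python
import math


def sn_dp_to_cp(xi: float, omega: float, alpha: float) -> tuple[float, float, float]:
    """Univariate skew-normal: direct (xi, omega, alpha) to centred (mean, sd, skew)."""
    delta = alpha / math.sqrt(1.0 + alpha**2)
    mu_z = math.sqrt(2.0 / math.pi) * delta  # mean of the standardised variable
    mean = xi + omega * mu_z
    sd = omega * math.sqrt(1.0 - mu_z**2)
    skew = (4.0 - math.pi) / 2.0 * mu_z**3 / (1.0 - mu_z**2) ** 1.5
    return mean, sd, skew


# With alpha = 0 the distribution is an ordinary normal, so CP equals (xi, omega, 0).
symmetric = sn_dp_to_cp(xi=0.2, omega=0.5, alpha=0.0)

# A positive alpha shifts the mean above xi and produces positive skewness.
skewed = sn_dp_to_cp(xi=0.2, omega=0.5, alpha=5.0)
```

The location xi coincides with the mean only when alpha is zero, which is why the two parameterisations must not be mixed when passing values between functions.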

cp2dp

cp2dp(
    mean: ndarray,
    sigma: ndarray,
    skew: ndarray,
    family: Literal["SN", "ESN", "ST", "SC"] = "SN",
) -> tuple

Convert Centred Parameters (CP) to Direct Parameters (DP).

PARAMETER DESCRIPTION
mean

Mean vector (2×1 array).

TYPE: ndarray

sigma

Covariance matrix (2×2 array).

TYPE: ndarray

skew

Skewness vector (2×1 array).

TYPE: ndarray

family

Distribution family.

TYPE: Literal['SN', 'ESN', 'ST', 'SC'] DEFAULT: 'SN'

RETURNS DESCRIPTION
tuple

Tuple of direct parameters (xi, omega, alpha).

Source code in src/soundscapy/r_wrapper/_r_wrapper.py
def cp2dp(
    mean: np.ndarray,
    sigma: np.ndarray,
    skew: np.ndarray,
    family: Literal["SN", "ESN", "ST", "SC"] = "SN",
) -> tuple:
    """Convert Centred Parameters (CP) to Direct Parameters (DP).

    Parameters
    ----------
    mean
        Mean vector (2×1 array).
    sigma
        Covariance matrix (2×2 array).
    skew
        Skewness vector (2×1 array).
    family
        Distribution family.

    Returns
    -------
    :
        Tuple of direct parameters ``(xi, omega, alpha)``.
    """
    r = get_r_session()
    cp_r = robjects.ListVector(
        {
            "mean": robjects.FloatVector(mean.T),
            "Sigma": _np2rmat(sigma),
            "skew": robjects.FloatVector(skew),
        }
    )
    dp_r = r.sn.cp2dp(cp_r, family=family)
    return tuple(_r2np(dp_r[i]) for i in range(len(dp_r)))
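
The inverse direction also has a closed form in the univariate skew-normal ("SN") case, with one restriction: the skewness of a skew-normal distribution is bounded (roughly 0.995 in absolute value), so not every centred triple maps to valid direct parameters. The sketch below illustrates this; sn_cp_to_dp is a hypothetical helper, not part of soundscapy or the R sn package, it takes a standard deviation rather than a covariance matrix, and it handles one dimension only.

```python
import math


def sn_cp_to_dp(mean: float, sd: float, skew: float) -> tuple[float, float, float]:
    """Univariate skew-normal: centred (mean, sd, skew) to direct (xi, omega, alpha)."""
    b = math.sqrt(2.0 / math.pi)
    # Largest attainable |skew|, reached in the limit |alpha| -> infinity.
    max_skew = (4.0 - math.pi) / 2.0 * b**3 / (1.0 - b**2) ** 1.5
    if abs(skew) >= max_skew:
        raise ValueError(f"|skew| must be below {max_skew:.4f} for a skew-normal")

    # Signed cube root recovers r = mu_z / sqrt(1 - mu_z**2).
    r = math.copysign((2.0 * abs(skew) / (4.0 - math.pi)) ** (1.0 / 3.0), skew)
    mu_z = r / math.sqrt(1.0 + r**2)
    delta = mu_z / b
    alpha = delta / math.sqrt(1.0 - delta**2)
    omega = sd / math.sqrt(1.0 - mu_z**2)
    xi = mean - omega * mu_z
    return xi, omega, alpha


# Zero skewness maps back to an ordinary normal: xi = mean, omega = sd, alpha = 0.
normal_case = sn_cp_to_dp(mean=1.0, sd=2.0, skew=0.0)
```

Checking the skewness bound up front gives a clear error instead of a math domain error from the square root further down.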

bfgs_fit

bfgs_fit(
    data_cor: DataFrame,
    n: int,
    scales: list[str] = PAQ_IDS,
    m_val: int = 3,
    *,
    equal_ang: bool = True,
    equal_com: bool = True,
) -> dict[str, Any]

Fit a circumplex model and return extracted fit statistics.

Calls the embedded CircE BFGS implementation and converts the result to a Python dict with scalar normalisation and a scipy-computed p-value.

PARAMETER DESCRIPTION
data_cor

Correlation matrix of the data.

TYPE: DataFrame

n

Number of observations used to compute data_cor.

TYPE: int

scales

List of scale names.

TYPE: list[str] DEFAULT: PAQ_IDS

m_val

Number of Fourier dimensions.

TYPE: int DEFAULT: 3

equal_ang

Whether to enforce equal-angles constraint.

TYPE: bool DEFAULT: True

equal_com

Whether to enforce equal-communalities constraint.

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
dict[str, Any]

Dictionary of fit statistics.

Source code in src/soundscapy/r_wrapper/_r_wrapper.py
def bfgs_fit(
    data_cor: pd.DataFrame,
    n: int,
    scales: list[str] = PAQ_IDS,
    m_val: int = 3,
    *,
    equal_ang: bool = True,
    equal_com: bool = True,
) -> dict[str, Any]:
    """Fit a circumplex model and return extracted fit statistics.

    Calls the embedded CircE BFGS implementation and converts the result to a
    Python dict with scalar normalisation and a scipy-computed p-value.

    Parameters
    ----------
    data_cor
        Correlation matrix of the data.
    n
        Number of observations used to compute ``data_cor``.
    scales
        List of scale names.
    m_val
        Number of Fourier dimensions.
    equal_ang
        Whether to enforce equal-angles constraint.
    equal_com
        Whether to enforce equal-communalities constraint.

    Returns
    -------
    :
        Dictionary of fit statistics.
    """
    r = get_r_session()

    with (robjects.default_converter + pandas2ri.converter).context():
        # Only the Python→R conversion needs the pandas2ri context.
        # Calling as_matrix() inside the context would cause its R-matrix
        # return value to be auto-converted back to numpy by the active
        # converter, producing a numpy array instead of an R matrix.
        r_data_cor = robjects.conversion.get_conversion().py2rpy(data_cor)

    r_cor_mat = r.base.as_matrix(r_data_cor)
    circe_bfgs = robjects.globalenv["CircE.BFGS"]

    bfgs_model = circe_bfgs(
        r_cor_mat,
        v_names=robjects.StrVector(scales),
        m=m_val,
        N=n,
        start_values="PFA",
        equal_ang=equal_ang,
        equal_com=equal_com,
        iterlim=1000,
        try_refit_BFGS=True,
        print_level=0,
        file=robjects.NULL,
    )

    return _extract_bfgs_stats(bfgs_model)