
SSM Analysis

circumplex.analysis.ssm_analyze

Main SSM analysis function.

This module provides the primary user-facing function for performing Structural Summary Method (SSM) analysis on circumplex data.

FUNCTION DESCRIPTION
ssm_analyze

Perform Structural Summary Method (SSM) analysis on circumplex data.

ssm_analyze

ssm_analyze(data: DataFrame, scales: list[str] | list[int], angles: ndarray | list[float] | None = None, measures: list[str] | str | None = None, grouping: str | None = None, boots: int = 2000, interval: float = 0.95, measures_labels: list[str] | None = None, *, contrast: bool = False, listwise: bool = True, seed: int | None = None) -> SSM

Perform Structural Summary Method (SSM) analysis on circumplex data.

This is the main entry point for SSM analysis. It automatically determines whether to perform mean-based or correlation-based analysis based on the measures parameter, calculates SSM parameters, and computes bootstrap confidence intervals.

PARAMETER DESCRIPTION
data

DataFrame containing circumplex scale scores (and optionally measures and grouping variables)

TYPE: DataFrame

scales

Column names or indices for circumplex scales. Must be ordered according to their angular positions (matching the order in angles).

TYPE: list[str] | list[int]

angles

Angular positions of the scales, in degrees. If None, the standard octant angles [90, 135, 180, 225, 270, 315, 360, 45] are used, which requires exactly 8 scales. Otherwise the length must match the number of scales.

TYPE: ndarray | list[float] | None DEFAULT: None
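The angle handling described above can be sketched in plain Python. This is only an illustration, not the library's code: `resolve_angles` and `OCTANT_ANGLES` are hypothetical names, and the checks simply mirror the documented rules (octant default for 8 scales, matching lengths otherwise).

```python
import math

# Standard octant angles quoted in the parameter description (degrees).
OCTANT_ANGLES = [90, 135, 180, 225, 270, 315, 360, 45]


def resolve_angles(n_scales, angles=None):
    """Return scale angles in radians (hypothetical helper for illustration)."""
    if angles is None:
        # The octant default only makes sense for an 8-scale circumplex.
        if n_scales != len(OCTANT_ANGLES):
            raise ValueError(
                f"angles=None requires exactly 8 scales (got {n_scales})"
            )
        angles = OCTANT_ANGLES
    elif len(angles) != n_scales:
        raise ValueError(
            f"Length of angles ({len(angles)}) must match "
            f"number of scales ({n_scales})"
        )
    return [math.radians(float(a)) for a in angles]


# A quadrant model needs explicit angles because it has only 4 scales.
quadrant_angles = resolve_angles(4, angles=[90, 180, 270, 360])
```

Passing explicit angles is therefore required for any instrument that is not an 8-scale octant model.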

measures

Column name(s) for external measure variable(s) to correlate with scales.

  • If None: Performs mean-based analysis (profile analysis)
  • If string or list: Performs correlation-based analysis

TYPE: list[str] | str | None DEFAULT: None
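To make the two modes concrete, the sketch below shows what kind of per-scale score each one produces: scale means when no measure is given, and scale-by-measure Pearson correlations otherwise. The helper names (`pearson`, `scale_scores`) and the toy columns are made up for illustration and are not part of the package.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def scale_scores(columns, scales, measure=None):
    """Hypothetical helper: one score per scale, by analysis mode."""
    if measure is None:
        # Mean-based analysis: the profile is the mean of each scale.
        return [mean(columns[s]) for s in scales]
    # Correlation-based analysis: correlate each scale with the measure.
    return [pearson(columns[s], columns[measure]) for s in scales]


# Toy data with two scales and one external measure (assumed names).
toy = {
    "PA": [1.0, 2.0, 3.0, 4.0],
    "BC": [4.0, 3.0, 2.0, 1.0],
    "M": [2.0, 4.0, 6.0, 8.0],
}
mean_profile = scale_scores(toy, ["PA", "BC"])
corr_profile = scale_scores(toy, ["PA", "BC"], measure="M")
```

In both modes the resulting vector of scale scores is what the SSM parameters are then computed from.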

grouping

Column name for grouping variable. If None, treats all data as a single group. The variable will be converted to a categorical factor.

TYPE: str | None DEFAULT: None
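The notes for this function state that bootstrap resampling is stratified when groups are present. A minimal sketch of that idea, assuming rows are plain dictionaries and using a hypothetical helper name, resamples within each group so that every bootstrap sample keeps the original group sizes:

```python
import random
from collections import defaultdict


def stratified_resample(rows, group_key, rng):
    """Resample rows with replacement separately within each group.

    Hypothetical helper: illustrates stratified bootstrapping only.
    """
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[group_key]].append(row)
    sample = []
    for members in by_group.values():
        # Draw as many rows as the group originally had.
        sample.extend(rng.choices(members, k=len(members)))
    return sample


rows = [
    {"Gender": "Female", "PA": 0.4},
    {"Gender": "Female", "PA": 0.9},
    {"Gender": "Female", "PA": 0.1},
    {"Gender": "Male", "PA": 0.7},
    {"Gender": "Male", "PA": 0.2},
]
boot_sample = stratified_resample(rows, "Gender", random.Random(12345))
```

Keeping group sizes fixed avoids bootstrap samples in which a small group is missing entirely.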

contrast

If True, calculates differences between groups or measures.

  • Requires exactly 2 groups (with 0 or 1 measures), OR
  • Requires exactly 2 measures (with 1 group)

Raises ValueError if requirements not met.

TYPE: bool DEFAULT: False
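The two permitted contrast configurations can be written as a small check. The function below is a hypothetical stand-in that mirrors the documented rule; it is not the package's own validation code.

```python
def check_contrast(n_groups, n_measures):
    """Raise ValueError unless a contrast is well defined.

    Hypothetical validator mirroring the two documented cases.
    """
    two_groups = n_groups == 2 and n_measures in (0, 1)
    two_measures = n_measures == 2 and n_groups == 1
    if not (two_groups or two_measures):
        raise ValueError(
            "contrast=True needs exactly 2 groups (with 0 or 1 measures) "
            "or exactly 2 measures (with 1 group); "
            f"got {n_groups} group(s) and {n_measures} measure(s)"
        )


check_contrast(n_groups=2, n_measures=0)  # group contrast on means
check_contrast(n_groups=1, n_measures=2)  # measure contrast
```

In the examples on this page, the contrast row equals the second profile minus the first (for instance, Male - Female).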

boots

Number of bootstrap resamples for confidence interval calculation. Default: 2000.

TYPE: int DEFAULT: 2000

interval

Confidence level for bootstrap intervals (e.g., 0.95 for 95% CI). Default: 0.95.

TYPE: float DEFAULT: 0.95
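How `boots`, `interval`, and `seed` interact can be illustrated with a generic percentile bootstrap for a mean. This is a simplified sketch using the standard library and a crude nearest-rank quantile; the function name is hypothetical and the package's actual resampling and quantile rules may differ.

```python
import random
from statistics import mean


def percentile_ci(values, boots=2000, interval=0.95, seed=None):
    """Percentile bootstrap CI for the mean (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(values)
    # One bootstrap estimate per resample drawn with replacement.
    estimates = sorted(mean(rng.choices(values, k=n)) for _ in range(boots))
    # A 95% interval leaves 2.5% in each tail.
    alpha = (1.0 - interval) / 2.0
    last = boots - 1
    lower = estimates[int(alpha * last)]
    upper = estimates[int(round((1.0 - alpha) * last))]
    return lower, upper


scores = [0.10, 0.40, 0.35, 0.80, 0.55, 0.20, 0.60, 0.45]
ci_first = percentile_ci(scores, boots=500, interval=0.95, seed=12345)
ci_again = percentile_ci(scores, boots=500, interval=0.95, seed=12345)
```

With the same seed the two calls return identical bounds; with `seed=None` they would generally differ slightly between runs.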

listwise

Missing data handling:

  • If True: Listwise deletion (remove rows with any missing value)
  • If False: Pairwise deletion (use all available data for each calculation)

Default: True.

TYPE: bool DEFAULT: True
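The difference between the two strategies shows up in how many observations each calculation can use. The sketch below uses plain dictionaries with `None` for missing values; the helper names and toy data are assumptions made for illustration.

```python
rows = [
    {"PA": 1.0, "BC": 2.0, "M": 3.0},
    {"PA": 2.0, "BC": None, "M": 1.0},
    {"PA": 3.0, "BC": 4.0, "M": None},
    {"PA": 4.0, "BC": 1.0, "M": 2.0},
]


def listwise(rows, columns):
    """Keep only rows that are complete on every listed column."""
    return [r for r in rows if all(r[c] is not None for c in columns)]


def pairwise(rows, a, b):
    """Keep the (a, b) pairs where both values are present."""
    return [
        (r[a], r[b]) for r in rows if r[a] is not None and r[b] is not None
    ]


complete_rows = listwise(rows, ["PA", "BC", "M"])  # 2 complete rows
pa_m_pairs = pairwise(rows, "PA", "M")  # 3 usable pairs
bc_m_pairs = pairwise(rows, "BC", "M")  # 2 usable pairs
```

Pairwise deletion retains more data per calculation, at the cost of each scale's score being based on a potentially different subset of rows.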

measures_labels

Optional custom labels for measures (same length as measures). If None, uses the column names.

TYPE: list[str] | None DEFAULT: None

seed

Random seed for reproducibility of bootstrap results. If None, results will vary across runs.

TYPE: int | None DEFAULT: None

RETURNS DESCRIPTION
SSM

SSM result object, built from a dictionary (via SSM.from_dict) with these entries:

  • results : pd.DataFrame SSM parameters and confidence intervals with columns:

    • Label: Profile label (group/measure name or combination)
    • Group: Group identifier
    • Measure: Measure identifier (None for mean-based analysis)
    • e_est, x_est, y_est, a_est, d_est, fit_est: Point estimates
    • e_lci, x_lci, y_lci, a_lci, d_lci: Lower confidence intervals
    • e_uci, x_uci, y_uci, a_uci, d_uci: Upper confidence intervals

    Note: Displacement (d) is in degrees. No CIs for fit parameter.

  • scores : pd.DataFrame Mean scores (mean-based) or correlation scores (correlation-based) for each scale, with Label column.

  • details : dict Analysis metadata including boots, interval, listwise, angles, contrast, and score_type.

  • type : str Analysis type: mean or correlation

RAISES DESCRIPTION
ValueError
  • If angles is None and the number of scales is not 8
  • If angles length doesn't match scales length
  • If contrast=True but requirements not met
  • If data contains no valid observations after missing data handling

Examples:

Mean-based analysis (single group)

>>> from circumplex import ssm_analyze
>>> from circumplex.data import load_dataset
>>> aw2009 = load_dataset('aw2009')
>>> results = ssm_analyze(aw2009, scales=list(range(8)), seed=12345)
>>> print(results['results'][['Label', 'e_est', 'a_est', 'd_est']])
     Label  e_est  a_est      d_est
0      All  0.423  0.981  344.358

Mean-based analysis (multiple groups with contrast)

>>> jz2017 = load_dataset('jz2017')
>>> results = ssm_analyze(jz2017, scales=list(range(1, 9)),
...                        grouping='Gender', contrast=True, seed=12345)
>>> print(results['results'][['Label', 'e_est', 'a_est']])
          Label  e_est  a_est
0        Female  0.635  0.158
1          Male  0.596  0.192
2  Male - Female -0.039  0.034

Correlation-based analysis (single measure)

>>> results = ssm_analyze(jz2017, scales=list(range(1, 9)),
...                        measures='PARPD', seed=12345)
>>> print(results['results'][['Label', 'e_est', 'a_est', 'd_est']])
   Label  e_est  a_est   d_est
0  PARPD  0.250  0.150  128.9

Correlation-based analysis (measure contrast)

>>> results = ssm_analyze(jz2017, scales=list(range(1, 9)),
...                        measures=['ASPD', 'NARPD'],
...                        contrast=True, seed=12345)
>>> print(results['results'][['Label', 'e_est', 'a_est']])
          Label  e_est  a_est
0          ASPD  0.253  0.055
1         NARPD  0.311  0.203
2  NARPD - ASPD  0.058  0.148

Notes

This function is a Python port of ssm_analyze() from the R circumplex package (Zimmermann & Wright, 2017). It maintains numerical parity with the R implementation to at least 3 decimal places.

SSM Parameters:

  • elevation (e): Mean of all scale scores
  • x_value (x): Projection onto x-axis (cosine component)
  • y_value (y): Projection onto y-axis (sine component)
  • amplitude (a): Vector length (prototypicality)
  • displacement (d): Angular position in degrees [0, 360)
  • fit: Model fit (R²), proportion of variance explained
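The parameter definitions above correspond to fitting a cosine curve to the scale scores. The following sketch uses the textbook SSM formulas for equally spaced scales; `ssm_parameters` here is a stand-alone illustrative function written for this page, and its internals are an assumption rather than a copy of the package's implementation.

```python
import math


def ssm_parameters(scores, angles_deg):
    """Compute SSM parameters from scale scores (illustrative sketch)."""
    n = len(scores)
    theta = [math.radians(a) for a in angles_deg]
    e = sum(scores) / n  # elevation: mean of all scale scores
    x = 2.0 / n * sum(s * math.cos(t) for s, t in zip(scores, theta))
    y = 2.0 / n * sum(s * math.sin(t) for s, t in zip(scores, theta))
    a = math.hypot(x, y)  # amplitude: vector length
    d_rad = math.atan2(y, x)
    d = math.degrees(d_rad) % 360.0  # displacement in [0, 360)
    # Fit: share of score variance reproduced by the cosine curve.
    # (Undefined for a perfectly flat profile, where ss_tot is zero.)
    predicted = [e + a * math.cos(t - d_rad) for t in theta]
    ss_res = sum((s - p) ** 2 for s, p in zip(scores, predicted))
    ss_tot = sum((s - e) ** 2 for s in scores)
    fit = 1.0 - ss_res / ss_tot
    return {"e": e, "x": x, "y": y, "a": a, "d": d, "fit": fit}


# A perfectly cosine-shaped profile: elevation 0.5, amplitude 0.3, peak at 45.
octants = [90, 135, 180, 225, 270, 315, 360, 45]
profile = [0.5 + 0.3 * math.cos(math.radians(t - 45)) for t in octants]
params = ssm_parameters(profile, octants)
```

For this synthetic profile the function recovers e = 0.5, a = 0.3, d = 45 and fit = 1, up to floating-point rounding.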

Bootstrap Confidence Intervals:

Uses percentile method with stratified sampling when groups are present. Displacement CIs use circular statistics to handle angular wrapping.
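Angular wrapping matters because bootstrap displacements near 0 and 360 degrees are close on the circle but far apart as raw numbers. One common way to handle this, shown here as an assumption about the general technique and not as the package's exact algorithm, is to center the angles on their circular mean, take percentiles of the wrapped deviations, and map the bounds back to [0, 360). The function names are hypothetical and the quantile is a crude nearest-rank version.

```python
import math


def circular_mean_deg(angles):
    """Circular mean of angles given in degrees."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0


def circular_percentile_ci(angles, interval=0.95):
    """Percentile interval for angles that may wrap around 360 (sketch)."""
    center = circular_mean_deg(angles)
    # Signed deviations from the center, wrapped into [-180, 180).
    deviations = sorted((a - center + 180.0) % 360.0 - 180.0 for a in angles)
    alpha = (1.0 - interval) / 2.0
    last = len(deviations) - 1
    lower = deviations[int(alpha * last)]
    upper = deviations[int(round((1.0 - alpha) * last))]
    return (center + lower) % 360.0, (center + upper) % 360.0


# Toy bootstrap displacements that straddle the 0/360 boundary.
boot_d = [350.0, 355.0, 0.0, 5.0, 10.0]
d_lci, d_uci = circular_percentile_ci(boot_d, interval=0.95)
```

Here the interval runs from about 350 to about 10 degrees, passing through 0. Taking percentiles of the raw values instead would give bounds of 0 and 355, an interval that covers almost the whole circle.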

See Also

load_dataset : Load example datasets

OCTANTS : Standard octant angles for 8-scale circumplex

References

Zimmermann, J., & Wright, A. G. C. (2017). Beyond description in interpersonal construct validation: Methodological advances in the circumplex Structural Summary Approach. Assessment, 24(1), 3-23. https://doi.org/10.1177/1073191115621795

Zimmermann, J., & Wright, A. G. C. (2017). The circumplex package [Computer software]. https://cran.r-project.org/package=circumplex

Source code in src/circumplex/analysis/ssm_analyze.py
def ssm_analyze(
    data: pd.DataFrame,
    scales: list[str] | list[int],
    angles: np.ndarray | list[float] | None = None,
    measures: list[str] | str | None = None,
    grouping: str | None = None,
    boots: int = 2000,
    interval: float = 0.95,
    measures_labels: list[str] | None = None,
    *,
    contrast: bool = False,
    listwise: bool = True,
    seed: int | None = None,
) -> SSM:
    """Perform Structural Summary Method (SSM) analysis on circumplex data.

    This is the main entry point for SSM analysis. It automatically determines
    whether to perform mean-based or correlation-based analysis based on the
    `measures` parameter, calculates SSM parameters, and computes bootstrap
    confidence intervals.

    Parameters
    ----------
    data
        DataFrame containing circumplex scale scores (and optionally measures
        and grouping variables)
    scales
        Column names or indices for circumplex scales. Must be ordered according
        to their angular positions (matching the order in `angles`).
    angles
        Angular positions of the scales, in degrees. If None, the standard
        octant angles [90, 135, 180, 225, 270, 315, 360, 45] are used, which
        requires exactly 8 scales. Otherwise the length must match the
        number of scales.
    measures
        Column name(s) for external measure variable(s) to correlate with scales.

        - If None: Performs mean-based analysis (profile analysis)
        - If string or list: Performs correlation-based analysis
    grouping
        Column name for grouping variable. If None, treats all data as a
        single group. The variable will be converted to a categorical factor.
    contrast
        If True, calculates differences between groups or measures.

        - Requires exactly 2 groups (with 0 or 1 measures), OR
        - Requires exactly 2 measures (with 1 group)

        Raises ValueError if requirements not met.
    boots
        Number of bootstrap resamples for confidence interval calculation.
        Default: 2000.
    interval
        Confidence level for bootstrap intervals (e.g., 0.95 for 95% CI).
        Default: 0.95.
    listwise
        Missing data handling:

        - If True: Listwise deletion (remove rows with any missing value)
        - If False: Pairwise deletion (use all available data for each calculation)

        Default: True.
    measures_labels
        Optional custom labels for measures (same length as measures).
        If None, uses the column names.
    seed
        Random seed for reproducibility of bootstrap results. If None, results
        will vary across runs.

    Returns
    -------
    :
        Dictionary containing:

        - **results** : pd.DataFrame
            SSM parameters and confidence intervals with columns:

            - Label: Profile label (group/measure name or combination)
            - Group: Group identifier
            - Measure: Measure identifier (None for mean-based analysis)
            - e_est, x_est, y_est, a_est, d_est, fit_est: Point estimates
            - e_lci, x_lci, y_lci, a_lci, d_lci: Lower confidence intervals
            - e_uci, x_uci, y_uci, a_uci, d_uci: Upper confidence intervals

            Note: Displacement (d) is in degrees. No CIs for fit parameter.

        - **scores** : pd.DataFrame
            Mean scores (mean-based) or correlation scores (correlation-based)
            for each scale, with Label column.

        - **details** : dict
            Analysis metadata including boots, interval, listwise, angles,
            contrast, and score_type.

        - **type** : str
            Analysis type: `mean` or `correlation`

    Raises
    ------
    ValueError
        - If angles is None and the number of scales is not 8
        - If angles length doesn't match scales length
        - If contrast=True but requirements not met
        - If data contains no valid observations after missing data handling

    Examples
    --------
    **Mean-based analysis (single group)**

    >>> from circumplex import ssm_analyze
    >>> from circumplex.data import load_dataset
    >>> aw2009 = load_dataset('aw2009')
    >>> results = ssm_analyze(aw2009, scales=list(range(8)), seed=12345)
    >>> print(results['results'][['Label', 'e_est', 'a_est', 'd_est']])
         Label  e_est  a_est      d_est
    0      All  0.423  0.981  344.358

    **Mean-based analysis (multiple groups with contrast)**

    >>> jz2017 = load_dataset('jz2017')
    >>> results = ssm_analyze(jz2017, scales=list(range(1, 9)),
    ...                        grouping='Gender', contrast=True, seed=12345)
    >>> print(results['results'][['Label', 'e_est', 'a_est']])
              Label  e_est  a_est
    0        Female  0.635  0.158
    1          Male  0.596  0.192
    2  Male - Female -0.039  0.034

    **Correlation-based analysis (single measure)**

    >>> results = ssm_analyze(jz2017, scales=list(range(1, 9)),
    ...                        measures='PARPD', seed=12345)
    >>> print(results['results'][['Label', 'e_est', 'a_est', 'd_est']])
       Label  e_est  a_est   d_est
    0  PARPD  0.250  0.150  128.9

    **Correlation-based analysis (measure contrast)**

    >>> results = ssm_analyze(jz2017, scales=list(range(1, 9)),
    ...                        measures=['ASPD', 'NARPD'],
    ...                        contrast=True, seed=12345)
    >>> print(results['results'][['Label', 'e_est', 'a_est']])
              Label  e_est  a_est
    0          ASPD  0.253  0.055
    1         NARPD  0.311  0.203
    2  NARPD - ASPD  0.058  0.148

    Notes
    -----
    This function is a Python port of ssm_analyze() from the R circumplex
    package (Zimmermann & Wright, 2017). It maintains numerical parity with
    the R implementation to at least 3 decimal places.

    **SSM Parameters:**

    - **elevation (e)**: Mean of all scale scores
    - **x_value (x)**: Projection onto x-axis (cosine component)
    - **y_value (y)**: Projection onto y-axis (sine component)
    - **amplitude (a)**: Vector length (prototypicality)
    - **displacement (d)**: Angular position in degrees [0, 360)
    - **fit**: Model fit (R²), proportion of variance explained

    **Bootstrap Confidence Intervals:**

    Uses percentile method with stratified sampling when groups are present.
    Displacement CIs use circular statistics to handle angular wrapping.

    See Also
    --------
    [`load_dataset`](../data/#circumplex.data.load_dataset) : Load example datasets
    [`OCTANTS`](../utils/angles/#circumplex.utils.angles.OCTANTS) :
        Standard octant angles for 8-scale circumplex

    References
    ----------
    Zimmermann, J., & Wright, A. G. C. (2017). Beyond description in
    interpersonal construct validation: Methodological advances in the
    circumplex Structural Summary Approach. *Assessment, 24*(1), 3-23.
    https://doi.org/10.1177/1073191115621795

    Zimmermann, J., & Wright, A. G. C. (2017). The circumplex package
    [Computer software]. https://cran.r-project.org/package=circumplex

    """
    # Validate and process scales
    if isinstance(scales[0], int):
        scale_names = [data.columns[i] for i in scales]
    else:
        scale_names = scales

    n_scales = len(scale_names)

    # Process angles
    if angles is None:
        # Use default octant angles
        if n_scales != 8:
            msg = (
                f"When angles=None, exactly 8 scales are required (got {n_scales}). "
                "Please provide custom angles for non-octant circumplex models."
            )
            raise ValueError(msg)
        angles_deg = OCTANTS
    else:
        # Convert to numpy array if needed
        angles_deg = np.array(angles, dtype=float)

        # Validate length
        if len(angles_deg) != n_scales:
            msg = (
                f"Length of angles ({len(angles_deg)}) must match "
                f"number of scales ({n_scales})"
            )
            raise ValueError(msg)

    # Convert angles to radians for internal use
    # Ensure numpy array type for downstream functions
    angles_rad = cast("np.ndarray", degrees_to_radians(angles_deg))

    # Route to appropriate analysis function
    if measures is None:
        # Mean-based analysis
        return SSM.from_dict(
            ssm_analyze_means(
                data=data,
                scales=scale_names,
                angles=angles_rad,
                grouping=grouping,
                contrast=contrast,
                boots=boots,
                interval=interval,
                listwise=listwise,
                seed=seed,
            )
        )
    # Correlation-based analysis
    return SSM.from_dict(
        ssm_analyze_corrs(
            data=data,
            scales=scale_names,
            angles=angles_rad,
            measures=measures,
            grouping=grouping,
            contrast=contrast,
            boots=boots,
            interval=interval,
            listwise=listwise,
            measures_labels=measures_labels,
            seed=seed,
        )
    )