ICC
This module contains functions for running intraclass correlation coefficient (ICC) analyses.
icc
Functions:
- calculate_intraclass_correlations – Computes intraclass correlation coefficients (ICCs) for a specified list of variables across multiple subjects and timepoints (or raters).
calculate_intraclass_correlations
calculate_intraclass_correlations(data: DataFrame, variables: List[str], variable_col: str = 'variable', subject_col: str = 'subjectID', timepoint_col: str = 'timepoint', value_col: str = 'value', icc_type: str = 'ICC3') -> DataFrame
Computes intraclass correlation coefficients (ICCs) for a specified list of
variables across multiple subjects and timepoints (or raters). This
function is designed for use with long-format DataFrames, where each row
represents a single observation, including the subject ID, timepoint,
variable name, and corresponding value. It supports various types of ICC
calculations, as implemented in the corresponding function from
pingouin. The output is a DataFrame containing the ICC
values, associated statistics, and confidence intervals for each variable.
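To make the long-format layout concrete, the sketch below groups a handful of hypothetical rows into one subject-by-timepoint table per variable, which is the shape an ICC calculation works on. It uses only the standard library. The helper name `to_subject_tables` and the sample values are assumptions made for illustration and are not part of this module.

```python
from collections import defaultdict

# Hypothetical long-format records using the default column names.
rows = [
    {"subjectID": 1, "timepoint": 1, "variable": "A", "value": 10.0},
    {"subjectID": 1, "timepoint": 2, "variable": "A", "value": 10.4},
    {"subjectID": 2, "timepoint": 1, "variable": "A", "value": 7.0},
    {"subjectID": 2, "timepoint": 2, "variable": "A", "value": 6.8},
    {"subjectID": 1, "timepoint": 1, "variable": "B", "value": 12.0},
    {"subjectID": 1, "timepoint": 2, "variable": "B", "value": 11.5},
    {"subjectID": 2, "timepoint": 1, "variable": "B", "value": 9.0},
    {"subjectID": 2, "timepoint": 2, "variable": "B", "value": 9.3},
]


def to_subject_tables(rows, variables, variable_col="variable",
                      subject_col="subjectID", timepoint_col="timepoint",
                      value_col="value"):
    """Group long-format rows into {variable: {subject: {timepoint: value}}}."""
    tables = {name: defaultdict(dict) for name in variables}
    for row in rows:
        name = row[variable_col]
        if name in tables:  # ignore variables that were not requested
            tables[name][row[subject_col]][row[timepoint_col]] = row[value_col]
    return {name: dict(table) for name, table in tables.items()}


tables = to_subject_tables(rows, ["A", "B"])
print(tables["A"])  # {1: {1: 10.0, 2: 10.4}, 2: {1: 7.0, 2: 6.8}}
```

Each variable is handled independently, so a variable missing from `variables` is simply skipped rather than raising an error in this sketch.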
Parameters:
- data (DataFrame) – The DataFrame containing the data variables. This should be a long-format DataFrame with one row per observation.
- variables (List[str]) – List of variables (as strings) for which to calculate intraclass correlations.
- variable_col (str, default: 'variable') – The name of the column containing the variable names. Defaults to "variable".
- subject_col (str, default: 'subjectID') – The name of the column containing the subject IDs. Defaults to "subjectID".
- timepoint_col (str, default: 'timepoint') – The name of the column containing the timepoint (rater). Defaults to "timepoint".
- value_col (str, default: 'value') – The name of the column containing the values to be rated. Defaults to "value".
- icc_type (str, default: 'ICC3') – The type of intraclass correlation. Defaults to "ICC3".
Returns:
- DataFrame – A DataFrame containing intraclass correlation results for each variable.
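The choice of icc_type matters most when timepoints (or raters) differ systematically. As a rough illustration, the sketch below computes the single-measure ICC2 (absolute agreement) and ICC3 (consistency) estimates by hand from the two-way ANOVA mean squares given by Shrout and Fleiss (1979). It is not this module's implementation, which delegates to pingouin. The function name `icc_single` and the sample table are assumptions made for this example.

```python
def icc_single(table):
    """Return (ICC2, ICC3) single-measure estimates.

    `table` holds one row per subject and one value per timepoint (rater).
    """
    n = len(table)       # number of subjects
    k = len(table[0])    # number of timepoints / raters
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    # Sums of squares for subjects (rows), timepoints (columns), and residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    icc2 = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    icc3 = (msr - mse) / (msr + (k - 1) * mse)
    return icc2, icc3


# Timepoint 2 equals timepoint 1 plus a constant offset of 2 for every subject.
table = [[1, 3], [2, 4], [3, 5], [4, 6], [5, 7]]
icc2, icc3 = icc_single(table)
print(round(icc2, 4), round(icc3, 4))  # 0.5556 1.0
```

Because the offset is identical for every subject, the subjects keep the same rank order and spacing, so ICC3 is 1.0, while ICC2 is lower because it also counts the shift between timepoints as disagreement.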
Example
import numpy as np
import pandas as pd

# Import path inferred from the source location listed below (an assumption)
from stats_utils.reliability.icc import calculate_intraclass_correlations

# Number of subjects
num_subjects = 5
subjects = np.arange(1, num_subjects + 1)

# Initial values for timepoint 1
values_timepoint_1 = {
    'A': np.random.randint(5, 15, num_subjects),
    'B': np.random.randint(5, 15, num_subjects)
}

# Gaussian noise added to create values for timepoint 2
noise = np.random.normal(0, 0.5, num_subjects)

# Construct the long-format DataFrame in blocks ordered by
# (timepoint, variable, subject) so that each value lines up with its labels
data = pd.DataFrame({
    'subjectID': np.tile(subjects, 4),
    'timepoint': np.repeat([1, 1, 2, 2], num_subjects),
    'variable': np.repeat(['A', 'B', 'A', 'B'], num_subjects),
    'value': np.concatenate([
        values_timepoint_1['A'],
        values_timepoint_1['B'],
        values_timepoint_1['A'] + noise,
        values_timepoint_1['B'] + noise
    ])
})

# Calculate intraclass correlations for variables A and B
icc_results = calculate_intraclass_correlations(
    data=data,
    variables=['A', 'B'],
    variable_col='variable',
    subject_col='subjectID',
    timepoint_col='timepoint',
    value_col='value',
    icc_type='ICC3'
)
print(icc_results)
Source code in stats_utils/reliability/icc.py