Analysis
This module contains functions for regression models.
analysis
Classes:
- ModelOutput – Dataclass to store the output of the sequential regression function.
Functions:
- add_bootstrap_methods_to_ols – Add bootstrap methods to the OLS results class.
- sequential_regression – Fits a series of regression models across different factor solutions.
ModelOutput
dataclass
ModelOutput(models: List[ols], anova_results: DataFrame, r2s: List[float], summaries: List[str], n_solutions: int, y_var: str)
Dataclass to store the output of the sequential regression function
Attributes:
- models (List[ols]) – List of fitted models.
- anova_results (DataFrame) – ANOVA results.
- r2s (List[float]) – List of adjusted R² values.
- summaries (List[str]) – List of model summaries.
- n_solutions (int) – Number of solutions.
- y_var (str) – Name of the dependent variable.
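As a rough illustration of how a container with these fields might be consumed, the sketch below defines a stand-in dataclass with the same attribute names. `ModelOutputSketch`, `best_solution`, and all placeholder values are hypothetical. The real class holds fitted statsmodels models and a pandas DataFrame, which are typed as `Any` here so the snippet runs with the standard library alone. It also assumes that `r2s` holds one value per fitted model, which this page does not state explicitly.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class ModelOutputSketch:
    """Hypothetical stand-in mirroring the documented ModelOutput fields."""

    models: List[Any]      # fitted models (statsmodels objects in the real class)
    anova_results: Any     # ANOVA table (a pandas DataFrame in the real class)
    r2s: List[float]       # adjusted R² values, assumed to be one per model
    summaries: List[str]   # text summaries, assumed to be one per model
    n_solutions: int       # number of factor solutions
    y_var: str             # name of the dependent variable


def best_solution(output: ModelOutputSketch) -> int:
    """Return the 0-based position of the entry with the highest adjusted R²."""
    return max(range(len(output.r2s)), key=lambda i: output.r2s[i])


# Placeholder values only; these are not real results.
example = ModelOutputSketch(
    models=[None, None],
    anova_results=None,
    r2s=[0.10, 0.25],
    summaries=["summary 1", "summary 2"],
    n_solutions=2,
    y_var="outcome",
)
```

Keeping the per-model lists in the same order makes it straightforward to look up the model and summary that correspond to a given adjusted R² by position.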
add_bootstrap_methods_to_ols
add_bootstrap_methods_to_ols(results: RegressionResults) -> RegressionResults
Add bootstrap methods to the OLS results class.
Parameters:
- results (RegressionResults) – The results of an OLS regression.
Returns:
- RegressionResults – The results object with the bootstrap methods added.
Example
# Assuming `results` is the output of an OLS regression
results = add_bootstrap_methods_to_ols(results)
results.bootstrap(n_bootstraps=2000)
conf_int = results.conf_int_bootstrap()
# Access the pvals
pvals = results.pvalues_bootstrap
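This page does not show how the added bootstrap methods are implemented. For orientation only, the following is a minimal sketch of a generic case-resampling (pairs) bootstrap with a percentile confidence interval for a single OLS slope, written with the standard library. The function names `ols_slope` and `bootstrap_slope_ci`, the synthetic data, and the choice of the percentile method are all assumptions made for illustration, not the library's method.

```python
import random
from typing import List, Tuple


def ols_slope(x: List[float], y: List[float]) -> float:
    """Least-squares slope of y on x (one predictor plus an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_slope_ci(
    x: List[float],
    y: List[float],
    n_bootstraps: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the slope, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_bootstraps:
        # Draw n row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # skip degenerate resamples where the slope is undefined
        ys = [y[i] for i in idx]
        slopes.append(ols_slope(xs, ys))
    slopes.sort()
    lower = slopes[int((alpha / 2) * n_bootstraps)]
    upper = slopes[int((1 - alpha / 2) * n_bootstraps) - 1]
    return lower, upper


# Synthetic data for illustration: true slope 2, intercept 1, unit-variance noise.
_rng = random.Random(42)
x = [float(i) for i in range(30)]
y = [1.0 + 2.0 * xi + _rng.gauss(0.0, 1.0) for xi in x]

slope = ols_slope(x, y)
lower, upper = bootstrap_slope_ci(x, y)
```

Resampling whole rows keeps each predictor paired with its own response, so the interval does not rely on the residuals having constant variance.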
Source code in stats_utils/regression/analysis.py
sequential_regression
sequential_regression(data: DataFrame, y: str, n_solutions: int = 4, covariates: List[str] = [], n_bootstraps: int = 2000) -> Tuple[List[ols], DataFrame, List[float]]
Fits a series of regression models across different factor solutions.
Parameters:
- data (DataFrame) – DataFrame containing the dependent variable, covariates (age and gender), and factor scores. Assumes that factor scores are named Sol{N}_ML{M}, where N is the total number of factors and M is the index of each factor within that solution.
- y (str) – Name of the dependent variable.
- n_solutions (int, default: 4) – Number of solutions. Defaults to 4.
- covariates (List[str], default: []) – List of covariates to include in the model (in addition to age and gender). Defaults to [].
- n_bootstraps (int, default: 2000) – Number of bootstraps to run. Defaults to 2000.
Returns:
- Tuple[List[ols], DataFrame, List[float]] – The list of fitted models, the ANOVA table, and a list of adjusted R² values.
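To make the Sol{N}_ML{M} naming convention for factor scores concrete, the sketch below lists the columns that one solution would be expected to contribute and assembles a formula string from them. The helpers `factor_columns` and `solution_formula` are hypothetical, M is assumed to run from 1 to N, and the column names `age`, `gender`, and `score` are placeholders. Whether sequential_regression builds its formulas this way is not shown on this page.

```python
from typing import List


def factor_columns(n_factors: int) -> List[str]:
    """Expected factor-score columns for one solution, assuming M runs 1..N."""
    return [f"Sol{n_factors}_ML{m}" for m in range(1, n_factors + 1)]


def solution_formula(y: str, n_factors: int, covariates: List[str]) -> str:
    """Build a formula string: dependent variable ~ covariates + factor scores."""
    predictors = list(covariates) + factor_columns(n_factors)
    return f"{y} ~ " + " + ".join(predictors)


# Placeholder column names, for illustration only.
columns = factor_columns(3)
formula = solution_formula("score", 2, ["age", "gender"])
```

Checking that every name returned by a helper like `factor_columns` is present in the DataFrame before fitting is a cheap way to catch misnamed factor-score columns early.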
Source code in stats_utils/regression/analysis.py