StepMix

For StepMixR, please refer to this repository.

A Python package following the scikit-learn API for generalized mixture modeling. The package supports categorical data (Latent Class Analysis) and continuous data (Gaussian Mixtures/Latent Profile Analysis). StepMix can be used for both clustering and supervised learning.

Additional features include:

Support for missing values through Full Information Maximum Likelihood (FIML);
Multiple stepwise Expectation-Maximization (EM) estimation methods based on pseudolikelihood theory;
Covariates and distal outcomes;
Parametric and non-parametric bootstrapping.

Reference

If you find StepMix useful, please leave a ⭐ and consider citing our arXiv preprint:

@article{morin2023stepmix,
  title={StepMix: A Python Package for Pseudo-Likelihood Estimation of Generalized Mixture Models with External Variables},
  author={Morin, Sacha and Legault, Robin and Lalibert{\'e}, F{\'e}lix and Bakk, Zsuzsa and Gigu{\`e}re, Charles-{\'E}douard and de la Sablonni{\`e}re, Roxane and Lacourse, {\'E}ric},
  journal={arXiv preprint arXiv:2304.03853},
  year={2023}
}

Install

You can install StepMix with pip, preferably in a virtual environment:

pip install stepmix

Quickstart

The example below fits a StepMix mixture with categorical variables to a preloaded data matrix. StepMix accepts either a numpy.array or a pandas.DataFrame. Categories should be integer-encoded and 0-indexed.

from stepmix.stepmix import StepMix

# Categorical StepMix Model with 3 latent classes
model = StepMix(n_components=3, measurement="categorical")
model.fit(data)

# Allow missing values
model_nan = StepMix(n_components=3, measurement="categorical_nan")
model_nan.fit(data_nan)
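
As a minimal sketch of the encoding requirement above (categories integer-encoded and 0-indexed), raw labels can be mapped to integer codes before fitting. The helper name encode_zero_indexed is hypothetical and not part of StepMix; it only uses the Python standard library, and the survey answers are made-up data.

```python
def encode_zero_indexed(column):
    """Map arbitrary category labels to integers 0, 1, 2, ... (sorted label order)."""
    categories = sorted(set(column))
    mapping = {label: code for code, label in enumerate(categories)}
    return [mapping[label] for label in column], mapping


# Hypothetical survey answers for a single categorical variable
answers = ["agree", "neutral", "disagree", "agree"]
codes, mapping = encode_zero_indexed(answers)
print(codes)    # integer codes, one per observation
print(mapping)  # label -> code lookup, useful for decoding results later
```

Keeping the mapping around makes it possible to translate the estimated class-conditional probabilities back to the original labels.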

For binary data you can also use measurement="binary" or measurement="binary_nan". For continuous data, you can fit a Gaussian Mixture with diagonal covariances using measurement="continuous" or measurement="continuous_nan".
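
To give some intuition for the _nan measurement options, the sketch below computes the log-likelihood of a single observation under a latent class model in which missing items are simply left out of the product. This is the usual full-information treatment when items are conditionally independent given the class. It is an illustration under stated assumptions, not StepMix's implementation: the function names, the use of NaN as the missing-value marker, and the toy probabilities are all chosen for this example.

```python
import math


def is_missing(value):
    """Assumption for this sketch: missing entries are encoded as float NaN."""
    return isinstance(value, float) and math.isnan(value)


def class_log_likelihood(row, item_probs):
    """Log-probability of one row given one class; missing items contribute nothing."""
    total = 0.0
    for value, probs in zip(row, item_probs):
        if is_missing(value):
            continue  # marginalizing a missing item drops its factor
        total += math.log(probs[int(value)])
    return total


def observation_log_likelihood(row, class_weights, class_item_probs):
    """Log of sum_k weight_k * prod_j p_kj(x_j), computed with log-sum-exp."""
    terms = [
        math.log(weight) + class_log_likelihood(row, item_probs)
        for weight, item_probs in zip(class_weights, class_item_probs)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(term - peak) for term in terms))


# Toy model: 2 latent classes, 2 binary items (made-up numbers)
class_weights = [0.5, 0.5]
class_item_probs = [
    [[0.8, 0.2], [0.7, 0.3]],  # class 0: P(item 1 = 0/1), P(item 2 = 0/1)
    [[0.1, 0.9], [0.4, 0.6]],  # class 1
]

complete = observation_log_likelihood([0, 1], class_weights, class_item_probs)
partial = observation_log_likelihood([0, float("nan")], class_weights, class_item_probs)
```

The partially observed row still contributes to the likelihood through its observed item, so no observation has to be discarded.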

Set verbose=1 for detailed output.

Please refer to the StepMix tutorials to learn how to combine continuous and categorical data in the same model.

Tutorials

Detailed tutorials are available in notebooks:

Generalized Mixture Models with StepMix: an in-depth look at how mixture models can be defined with StepMix. The tutorial uses the Iris Dataset as an example and covers:
1. Gaussian Mixtures (Latent Profile Analysis);
2. Binary Mixtures (LCA);
3. Categorical Mixtures (LCA);
4. Mixed Categorical and Continuous Mixtures;
5. Missing Values through Full-Information Maximum Likelihood.
Stepwise Estimation with StepMix: a tutorial demonstrating how to define measurement and structural models. The tutorial discusses:
1. LCA models with distal outcomes;
2. LCA models with covariates;
3. 1-step, 2-step and 3-step estimation;
4. Corrections (BCH or ML) and other options for 3-step estimation;
5. Putting it All Together: A Complete Model with Missing Values.
Model Selection:
1. Selecting the number of components in a mixture model (n_components) with cross-validation;
2. Selecting the number of components with the Parametric Bootstrapped Likelihood Ratio Test (BLRT);
3. Fit indices: AIC, BIC and other metrics.
Parameters, Bootstrapping and CI: a tutorial discussing how to:
1. Access StepMix parameters;
2. Bootstrap StepMix estimators;
3. Quickly plot confidence intervals.
Supervised and Semi-Supervised Learning with StepMix:
1. Binary Classification;
2. Multiclass Classification;
3. Semi-Supervised Learning;
4. Cross-Validation.
Deriving p-values in StepMix: a tutorial demonstrating how to transform StepMix parameters into conventional regression coefficients and how to derive p-values. The tutorial covers models with:
1. Continuous covariate;
2. Binary covariate;
3. Categorical covariate;
4. Multiple covariates (different distributions);
5. Binary distal outcome.
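
For readers unfamiliar with the fit indices mentioned in the Model Selection tutorial, the following sketch states the textbook definitions of AIC and BIC. The function names and the numbers are hypothetical and written for this example; they are not StepMix's API, and the tutorial remains the reference for how StepMix reports these metrics.

```python
import math


def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_parameters - 2 * log_likelihood


def bic(log_likelihood, n_parameters, n_samples):
    """Bayesian Information Criterion: k ln(n) - 2 log L (lower is better)."""
    return n_parameters * math.log(n_samples) - 2 * log_likelihood


# Made-up values: total log-likelihood, free parameters, sample size
example_aic = aic(log_likelihood=-100.0, n_parameters=5)
example_bic = bic(log_likelihood=-100.0, n_parameters=5, n_samples=100)
```

Because BIC scales its penalty with the logarithm of the sample size, it tends to favor fewer components than AIC on larger datasets.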
