Main: The foundations for training machine learning models. |
|
---|---|
mikropml: User-Friendly R Package for Robust Machine Learning Pipelines |
|
Preprocess data prior to running machine learning |
|
Run the machine learning pipeline |
|
Plotting helpers: Visualize performance to help you tune hyperparameters and choose model methods. |
|
Plot hyperparameter performance metrics |
|
Plot performance metrics for multiple ML runs with different parameters |
|
Tidy the performance dataframe |
|
Get hyperparameter performance metrics |
|
Combine hyperparameter performance metrics for multiple train/test splits |
|
Package Data |
|
datasets |
|
Small OTU abundance dataset |
|
Mini OTU abundance dataset |
|
Mini OTU abundance dataset with 3 categorical variables |
|
Groups for otu_mini_multi |
|
ML results |
|
Results from running the pipeline with L2 logistic regression on |
|
Results from running the pipeline with random forest on |
|
Results from running the pipeline with rpart2 on |
|
Results from running the pipeline with svmRadial on |
|
Results from running the pipeline with xgbTree on |
|
Results from running the pipeline with glmnet on |
|
Results from running the pipeline with glmnet on |
|
Results from running the pipeline with glmnet on |
|
misc |
|
Cross-validation on |
|
Replace spaces in all elements of a character vector with underscores |
|
Pipeline customization: These are functions called by preprocess_data() or run_ml(). We make them available in case you would like to customize various steps of the pipeline beyond the arguments provided by the main functions. |
|
Remove columns appearing in only |
|
Get preprocessed dataframe for continuous variables |
|
Randomize feature order to eliminate any position-dependent effects |
|
Select indices to partition the data into training & testing sets. |
|
Get outcome type. |
|
Set hyperparameters based on ML method and dataset characteristics |
|
Generate the tuning grid for tuning hyperparameters |
|
Define cross-validation scheme and training parameters |
|
Get default performance metric name |
|
Get default performance metric function |
|
Train model using |
|
Get performance metrics for test data |
|
Get model performance metrics as a one-row tibble |
|
Get feature importance using the permutation method |
|
Group correlated features |