
Package index
- 
          
mikropml mikropml-package - mikropml: User-Friendly R Package for Robust Machine Learning Pipelines
 
- 
          
preprocess_data() - Preprocess data prior to running machine learning
 
- 
          
run_ml() - Run the machine learning pipeline
 
- 
          
get_feature_importance() - Get feature importance using the permutation method
 
- 
          
get_performance_tbl() - Get model performance metrics as a one-row tibble
 
- 
          
calc_model_sensspec() calc_mean_roc() calc_mean_prc() - Calculate and summarize performance for ROC and PRC plots
 
- 
          
calc_mean_perf() - Generic function to calculate mean performance curves for multiple models
 
- 
          
calc_baseline_precision() - Calculate the fraction of positives, i.e. baseline precision for a PRC curve
 
- 
          
calc_balanced_precision() - Calculate balanced precision given actual and baseline precision
 
- 
          
compare_models() - Perform permutation tests to compare the performance metric across all pairs of a group variable.
 
- 
          
permute_p_value() - Calculate a permuted p-value comparing two models
 
- 
          
bootstrap_performance() - Calculate a bootstrap confidence interval for the performance on a single train/test split
 
- 
          
plot_mean_roc() plot_mean_prc() - Plot ROC and PRC curves
 
- 
          
plot_hp_performance() - Plot hyperparameter performance metrics
 
- 
          
plot_model_performance() - Plot performance metrics for multiple ML runs with different parameters
 
- 
          
tidy_perf_data() - Tidy the performance dataframe
 
- 
          
get_hp_performance() - Get hyperparameter performance metrics
 
- 
          
combine_hp_performance() - Combine hyperparameter performance metrics for multiple train/test splits
 
- 
          
otu_small - Small OTU abundance dataset
 
- 
          
otu_mini_bin - Mini OTU abundance dataset
 
- 
          
otu_mini_multi - Mini OTU abundance dataset with 3 categorical variables
 
- 
          
otu_mini_multi_group - Groups for otu_mini_multi
 
- 
          
otu_data_preproc - Mini OTU abundance dataset - preprocessed
 
- 
          
otu_mini_bin_results_glmnet - Results from running the pipeline with L2 logistic regression on otu_mini_bin with feature importance and grouping
- 
          
otu_mini_bin_results_rf - Results from running the pipeline with random forest on otu_mini_bin
- 
          
otu_mini_bin_results_rpart2 - Results from running the pipeline with rpart2 on otu_mini_bin
- 
          
otu_mini_bin_results_svmRadial - Results from running the pipeline with svmRadial on otu_mini_bin
- 
          
otu_mini_bin_results_xgbTree - Results from running the pipeline with xgbTree on otu_mini_bin
- 
          
otu_mini_cont_results_glmnet - Results from running the pipeline with glmnet on otu_mini_bin with Otu00001 as the outcome
- 
          
otu_mini_cont_results_nocv - Results from running the pipeline with glmnet on otu_mini_bin with Otu00001 as the outcome column, using a custom train control scheme that does not perform cross-validation
- 
          
otu_mini_multi_results_glmnet - Results from running the pipeline with glmnet on otu_mini_multi for multiclass outcomes
- 
          
otu_mini_cv - Cross validation on train_data_mini with grouped features.
- 
          
replace_spaces() - Replace spaces in all elements of a character vector with underscores
 
Pipeline customization
Customize various steps of the pipeline beyond the arguments provided by run_ml() and preprocess_data().
- 
          
remove_singleton_columns() - Remove columns appearing in only threshold row(s) or fewer.
- 
          
get_caret_processed_df() - Get preprocessed dataframe for continuous variables
 
- 
          
randomize_feature_order() - Randomize feature order to eliminate any position-dependent effects
 
- 
          
get_partition_indices() - Select indices to partition the data into training & testing sets.
 
- 
          
get_outcome_type() - Get outcome type.
 
- 
          
get_hyperparams_list() - Set hyperparameters based on ML method and dataset characteristics
 
- 
          
get_tuning_grid() - Generate the tuning grid for tuning hyperparameters
 
- 
          
define_cv() - Define cross-validation scheme and training parameters
 
- 
          
get_perf_metric_name() - Get default performance metric name
 
- 
          
get_perf_metric_fn() - Get default performance metric function
 
- 
          
train_model() - Train model using caret::train().
- 
          
calc_perf_metrics() - Get performance metrics for test data
 
- 
          
group_correlated_features() - Group correlated features