Better track your ML experiments with MLflow

We all know how painful keeping track of your machine learning experiments can be.

You train a bunch of models of different flavours (Random Forests, XGBoost, Neural Networks, etc.).

For each model, you explore a range of hyper-parameters. Then you compute the performance metrics on some test data.

Sometimes you change the training data by adding or removing features.

Other times, you work in a team and have to combine your results with those of other data scientists...

How do you manage these experiments so that they are easily traceable and therefore reproducible? MLflow is perfectly suited to this task.

To learn more about MLflow, watch the video tutorial.

Here's what I'll discuss:

  • Setting up MLflow locally to track some machine learning experiments I ran on a dataset
  • For each model fit, using MLflow to track (see the sketch after this list):

    • metrics
    • hyper-parameters
    • source scripts executing the run
    • code version
    • notes & comments
  • Comparing different runs through the MLflow UI
  • Setting up a remote tracking server on AWS
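
To give you a feel for what this looks like in practice, here is a minimal sketch of logging a single run with MLflow. It assumes a local tracking store and a scikit-learn Random Forest; the dataset, experiment name, hyper-parameter values, and metric are illustrative only, not the ones from my experiments.

```python
import mlflow
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Illustrative data split
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

params = {"n_estimators": 200, "max_depth": 5}

# Group runs under one experiment (name is a placeholder)
mlflow.set_experiment("random-forest-experiments")

with mlflow.start_run(run_name="rf-baseline"):
    # hyper-parameters
    mlflow.log_params(params)

    model = RandomForestClassifier(**params, random_state=42).fit(X_train, y_train)

    # metrics
    accuracy = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_metric("accuracy", accuracy)

    # notes & comments as tags
    mlflow.set_tag("notes", "baseline run on all features")
```

When the script lives in a git repository, MLflow also records the source file and commit hash as run tags, which covers the source script and code version. Runs logged this way can then be browsed and compared side by side in the UI, started locally with `mlflow ui`.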

To reproduce my experiments and set up MLflow on AWS, have a look at my GitHub repo.
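
Once a tracking server is running on AWS, pointing your code at it is a one-line change. The sketch below is a hedged example: the URI is a placeholder for your own server's address, not the one used in the repo.

```python
import mlflow

# Placeholder address of a remote MLflow tracking server (e.g. on an EC2 instance)
mlflow.set_tracking_uri("http://ec2-xx-xx-xx-xx.compute.amazonaws.com:5000")
mlflow.set_experiment("random-forest-experiments")

with mlflow.start_run():
    mlflow.log_metric("accuracy", 0.93)  # illustrative value
```

Everything else stays the same: the params, metrics, and tags you log are simply stored on the remote server, so the whole team can see and compare each other's runs.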

Happy coding 💻