Quickstart

Hydra application

The following example demonstrates how to use Hydraflow with a Hydra application. There are two main steps to using Hydraflow:

1. Set the MLflow experiment using the Hydra job name.
2. Start a new MLflow run that logs the Hydra configuration.

apps/quickstart.py
import logging
from dataclasses import dataclass

import hydra
import mlflow
from hydra.core.config_store import ConfigStore
from hydra.core.hydra_config import HydraConfig

import hydraflow

log = logging.getLogger(__name__)


@dataclass
class Config:
    width: int = 1024
    height: int = 768


cs = ConfigStore.instance()
cs.store(name="config", node=Config)


@hydra.main(config_name="config", version_base=None)
def app(cfg: Config) -> None:
    hc = HydraConfig.get()
    mlflow.set_experiment(hc.job.name)

    with hydraflow.start_run(cfg):
        log.info(f"{cfg.width=}, {cfg.height=}")


if __name__ == "__main__":
    app()

Start a new MLflow run

hydraflow.start_run starts a new MLflow run and logs the Hydra configuration to it. Used as a context manager, it yields the started run, which you can use to log metrics, parameters, and artifacts while the run is active.

with hydraflow.start_run(cfg) as run:
    pass

Run the application

Single-run

Run the Hydra application as a normal Python script.

$ python apps/quickstart.py
2025/03/01 09:47:50 INFO mlflow.tracking.fluent: Experiment with name 'quickstart' does not exist. Creating a new experiment.
[2025-03-01 09:47:50,852][__main__][INFO] - cfg.width=1024, cfg.height=768

Use the MLflow CLI to list the experiments and confirm that the quickstart experiment was created.

$ mlflow experiments search
Experiment Id       Name        Artifact Location                                                     
------------------  ----------  ----------------------------------------------------------------------
0                   Default     file:///home/runner/work/hydraflow/hydraflow/mlruns/0                 
100184320647032899  quickstart  file:///home/runner/work/hydraflow/hydraflow/mlruns/100184320647032899

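Multi-run

Pass Hydra's -m (--multirun) flag with comma-separated values to launch one job per combination of parameters.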

$ python apps/quickstart.py -m width=400,600 height=100,200,300
[2025-03-01 09:47:55,094][HYDRA] Launching 6 jobs locally
[2025-03-01 09:47:55,094][HYDRA]    #0 : width=400 height=100
[2025-03-01 09:47:55,266][__main__][INFO] - cfg.width=400, cfg.height=100
[2025-03-01 09:47:55,270][HYDRA]    #1 : width=400 height=200
[2025-03-01 09:47:55,348][__main__][INFO] - cfg.width=400, cfg.height=200
[2025-03-01 09:47:55,351][HYDRA]    #2 : width=400 height=300
[2025-03-01 09:47:55,428][__main__][INFO] - cfg.width=400, cfg.height=300
[2025-03-01 09:47:55,431][HYDRA]    #3 : width=600 height=100
[2025-03-01 09:47:55,507][__main__][INFO] - cfg.width=600, cfg.height=100
[2025-03-01 09:47:55,510][HYDRA]    #4 : width=600 height=200
[2025-03-01 09:47:55,587][__main__][INFO] - cfg.width=600, cfg.height=200
[2025-03-01 09:47:55,590][HYDRA]    #5 : width=600 height=300
[2025-03-01 09:47:55,665][__main__][INFO] - cfg.width=600, cfg.height=300
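
The six jobs above are the Cartesian product of the swept values. As a rough illustration, not Hydra's actual sweeper code, the same expansion can be reproduced with itertools.product. The `sweep` dictionary is a hypothetical stand-in for the command-line overrides.

```python
from itertools import product

# Hypothetical stand-in for the overrides: width=400,600 height=100,200,300
sweep = {"width": [400, 600], "height": [100, 200, 300]}

# One job per combination of values, in the order the parameters were given.
jobs = [dict(zip(sweep, values)) for values in product(*sweep.values())]

for i, job in enumerate(jobs):
    overrides = " ".join(f"{key}={value}" for key, value in job.items())
    print(f"#{i} : {overrides}")
```

The last parameter varies fastest, which matches the job order in the log above.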

Use Hydraflow API

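Run collection

hydraflow.list_runs collects the runs recorded under the given experiment name. Here the collection holds seven runs: one from the single run and six from the multi-run.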

>>> import hydraflow
>>> rc = hydraflow.list_runs("quickstart")
>>> print(rc)
RunCollection(7)

Retrieve a run

>>> run = rc.first()
>>> print(type(run))
<class 'mlflow.entities.run.Run'>

>>> cfg = hydraflow.load_config(run)
>>> print(type(cfg))
<class 'omegaconf.dictconfig.DictConfig'>
>>> print(cfg)
{'width': 1024, 'height': 768}

>>> run = rc.last()
>>> cfg = hydraflow.load_config(run)
>>> print(cfg)
{'width': 600, 'height': 300}

Filter runs

>>> filtered = rc.filter(width=400)
>>> print(filtered)
RunCollection(3)

>>> filtered = rc.filter(height=[100, 300])
>>> print(filtered)
RunCollection(4)

>>> filtered = rc.filter(height=(100, 300))
>>> print(filtered)
RunCollection(6)
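
The counts above are consistent with three matching rules: a scalar matches exactly, a list matches any of its values, and a tuple matches an inclusive range. The sketch below reproduces those counts on plain dictionaries under that reading. It is an assumption inferred from the output, not Hydraflow's implementation, and `matches` and `filter_configs` are hypothetical helpers.

```python
# The seven configurations from the single run and the multi-run.
configs = [{"width": 1024, "height": 768}] + [
    {"width": w, "height": h} for w in (400, 600) for h in (100, 200, 300)
]


def matches(value, criterion):
    """Assumed rules: list = membership, tuple = inclusive range, else equality."""
    if isinstance(criterion, list):
        return value in criterion
    if isinstance(criterion, tuple):
        low, high = criterion
        return low <= value <= high
    return value == criterion


def filter_configs(configs, **criteria):
    """Keep the configurations that satisfy every keyword criterion."""
    return [
        cfg
        for cfg in configs
        if all(matches(cfg[key], criterion) for key, criterion in criteria.items())
    ]


print(len(filter_configs(configs, width=400)))            # 3
print(len(filter_configs(configs, height=[100, 300])))    # 4
print(len(filter_configs(configs, height=(100, 300))))    # 6
```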

Group runs

>>> grouped = rc.groupby("width")
>>> for key, group in grouped.items():
...     print(key, group)
1024 RunCollection(1)
400 RunCollection(3)
600 RunCollection(3)

>>> grouped = rc.groupby(["height"])
>>> for key, group in grouped.items():
...     print(key, group)
('768',) RunCollection(1)
('100',) RunCollection(2)
('200',) RunCollection(2)
('300',) RunCollection(2)
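
To make the grouping behavior concrete, here is a minimal sketch on plain dictionaries. It assumes that a single key name produces scalar group keys and a list of names produces tuple keys, as the output above suggests. `group_by` is a hypothetical helper, not part of Hydraflow, and it keeps values as integers rather than the strings shown in the tuple keys above.

```python
from collections import defaultdict

# The seven configurations from the single run and the multi-run.
configs = [{"width": 1024, "height": 768}] + [
    {"width": w, "height": h} for w in (400, 600) for h in (100, 200, 300)
]


def group_by(configs, key):
    """Group configurations by one key name (scalar keys) or a list of names (tuple keys)."""
    groups = defaultdict(list)
    for cfg in configs:
        if isinstance(key, str):
            group_key = cfg[key]
        else:
            group_key = tuple(cfg[name] for name in key)
        groups[group_key].append(cfg)
    return dict(groups)


for key, group in group_by(configs, "width").items():
    print(key, len(group))

for key, group in group_by(configs, ["height"]).items():
    print(key, len(group))
```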

Config dataframe

>>> print(rc.data.config)
   width  height
0   1024     768
1    400     100
2    400     200
3    400     300
4    600     100
5    600     200
6    600     300