Quickstart
Hydra application
The following example demonstrates how to use Hydraflow with a Hydra application. There are two main steps to using Hydraflow:
- Set the MLflow experiment using the Hydra job name.
- Start a new MLflow run that logs the Hydra configuration.
apps/quickstart.py
Set the MLflow experiment
hydraflow.set_experiment sets the MLflow experiment using the Hydra job name. Optionally, it can also set the tracking URI through the uri argument. For example:
hydraflow.set_experiment(uri="sqlite:///mlruns.db")
Start a new MLflow run
hydraflow.start_run starts a new MLflow run that logs the Hydra configuration. It returns the started run so that it can be used to log metrics, parameters, and artifacts within the context of the run.
with hydraflow.start_run(cfg) as run:
    pass  # log metrics, parameters, and artifacts here
Run the application
Single-run
Run the Hydra application as a normal Python script.
$ python apps/quickstart.py
2025/01/28 14:46:20 INFO mlflow.tracking.fluent: Experiment with name 'quickstart' does not exist. Creating a new experiment.
[2025-01-28 14:46:20,928][__main__][INFO] - cfg.width=1024, cfg.height=768
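The log line above suggests that the configuration has two integer fields, width and height, with defaults of 1024 and 768. The following is a minimal sketch of such a configuration written as a dataclass. The class name Config and the use of the f-string "=" specifier to build the message are assumptions for illustration, since the application source is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """Hypothetical config matching the fields seen in the log output."""

    width: int = 1024
    height: int = 768


cfg = Config()

# The "=" specifier (Python 3.8+) renders the expression text and its value.
message = f"{cfg.width=}, {cfg.height=}"
print(message)
```

Overriding a field on the command line (for example width=400) would then change the value that appears in the log message.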
Use the MLflow CLI to view the experiment.
$ mlflow experiments search
Experiment Id       Name        Artifact Location
------------------  ----------  ----------------------------------------------------------------------
0                   Default     file:///home/runner/work/hydraflow/hydraflow/mlruns/0
245602949584624815  quickstart  file:///home/runner/work/hydraflow/hydraflow/mlruns/245602949584624815
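Multi-run
Run the application with Hydra's -m (--multirun) flag to launch one job for every combination of the comma-separated values.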
$ python apps/quickstart.py -m width=400,600 height=100,200,300
[2025-01-28 14:46:25,162][HYDRA] Launching 6 jobs locally
[2025-01-28 14:46:25,162][HYDRA] #0 : width=400 height=100
[2025-01-28 14:46:25,279][__main__][INFO] - cfg.width=400, cfg.height=100
[2025-01-28 14:46:25,281][HYDRA] #1 : width=400 height=200
[2025-01-28 14:46:25,425][__main__][INFO] - cfg.width=400, cfg.height=200
[2025-01-28 14:46:25,427][HYDRA] #2 : width=400 height=300
[2025-01-28 14:46:25,505][__main__][INFO] - cfg.width=400, cfg.height=300
[2025-01-28 14:46:25,507][HYDRA] #3 : width=600 height=100
[2025-01-28 14:46:25,584][__main__][INFO] - cfg.width=600, cfg.height=100
[2025-01-28 14:46:25,586][HYDRA] #4 : width=600 height=200
[2025-01-28 14:46:25,663][__main__][INFO] - cfg.width=600, cfg.height=200
[2025-01-28 14:46:25,665][HYDRA] #5 : width=600 height=300
[2025-01-28 14:46:25,742][__main__][INFO] - cfg.width=600, cfg.height=300
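The six jobs above are the Cartesian product of the two width values and the three height values, with the last parameter varying fastest. The plain-Python sketch below reproduces that ordering with itertools.product. It only illustrates the expansion and is not how Hydra implements its sweeper; the variable names are hypothetical.

```python
from itertools import product

# Sweep values taken from the command line above.
sweep = {"width": [400, 600], "height": [100, 200, 300]}

# Cartesian product: the last parameter varies fastest, matching the log order.
jobs = [dict(zip(sweep, values)) for values in product(*sweep.values())]

for i, job in enumerate(jobs):
    overrides = " ".join(f"{key}={value}" for key, value in job.items())
    print(f"#{i} : {overrides}")
```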
Use Hydraflow API
Run collection
>>> import mlflow
>>> mlflow.set_experiment("quickstart")
>>> import hydraflow
>>> rc = hydraflow.list_runs()
>>> print(rc)
RunCollection(7)
Retrieve a run
>>> run = rc.first()
>>> print(type(run))
<class 'mlflow.entities.run.Run'>
>>> cfg = hydraflow.load_config(run)
>>> print(type(cfg))
<class 'omegaconf.dictconfig.DictConfig'>
>>> print(cfg)
{'width': 1024, 'height': 768}
>>> run = rc.last()
>>> cfg = hydraflow.load_config(run)
>>> print(cfg)
{'width': 600, 'height': 300}
Filter runs
>>> filtered = rc.filter(width=400)
>>> print(filtered)
RunCollection(3)
>>> filtered = rc.filter(height=[100, 300])
>>> print(filtered)
RunCollection(4)
>>> filtered = rc.filter(height=(100, 300))
>>> print(filtered)
RunCollection(6)
>>> run = rc.find(height=100)
>>> print(run.data.params)
{'height': '100', 'width': '400'}
>>> run = rc.find_last(height=100)
>>> print(run.data.params)
{'height': '100', 'width': '600'}
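The counts above are consistent with reading a scalar condition as equality, a list as a membership test, and a two-element tuple as an inclusive range. The sketch below models that reading in plain Python over stand-in config dictionaries. It is an inference from the printed results, not Hydraflow's implementation, and the helpers matches and filter_configs are hypothetical.

```python
# Stand-in for the seven runs' configs (values from the output above).
configs = [{"width": 1024, "height": 768}] + [
    {"width": w, "height": h} for w in (400, 600) for h in (100, 200, 300)
]


def matches(config, **conditions):
    """Return True if config satisfies every condition (hypothetical helper)."""
    for key, expected in conditions.items():
        value = config[key]
        if isinstance(expected, list):  # list: membership test
            ok = value in expected
        elif isinstance(expected, tuple):  # tuple: inclusive (low, high) range
            low, high = expected
            ok = low <= value <= high
        else:  # scalar: equality
            ok = value == expected
        if not ok:
            return False
    return True


def filter_configs(configs, **conditions):
    """Keep the configs that satisfy all conditions, preserving order."""
    return [c for c in configs if matches(c, **conditions)]


print(len(filter_configs(configs, width=400)))
print(len(filter_configs(configs, height=[100, 300])))
print(len(filter_configs(configs, height=(100, 300))))

# find and find_last behave like "first match" and "last match" in this model.
first = filter_configs(configs, height=100)[0]
last = filter_configs(configs, height=100)[-1]
```

Under this model the run with height 768 falls outside the range (100, 300), which is why the tuple form matches six of the seven runs.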
Map runs
>>> params = rc.map(lambda x: x.data.params)
>>> for p in params:
...     print(p)
{'height': '768', 'width': '1024'}
{'height': '100', 'width': '400'}
{'height': '200', 'width': '400'}
{'height': '300', 'width': '400'}
{'height': '100', 'width': '600'}
{'height': '200', 'width': '600'}
{'height': '300', 'width': '600'}
>>> list(rc.map_id(print))
903a22469ebf46b49322d1b1e9e1bcc9
3102b7a7957c44f5874f073ca905ed80
1819fbefc8ad4b2391d9c51c26122e64
36e7967804ee4a8ab8646d1d290a63eb
9f8bed2dd6414126a91fd4cdc7b99240
e2620faeab254fe7957154437145843c
533ee2ca14ff48a696e6a9b1e7469648
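MLflow stores parameter values as strings, which is why the dictionaries printed by rc.map above hold '768' rather than 768. Below is a small sketch of converting such a dictionary back to integers. The helper to_int_params is hypothetical and is not part of Hydraflow or MLflow; it assumes every value is an integer literal.

```python
def to_int_params(params: dict[str, str]) -> dict[str, int]:
    """Convert string-valued params to ints (assumes every value is an integer)."""
    return {key: int(value) for key, value in params.items()}


params = {"height": "768", "width": "1024"}  # as printed by rc.map above
converted = to_int_params(params)
print(converted)
```

A function like this could be passed to rc.map in place of the lambda, composed with the parameter lookup, if numeric values are needed downstream.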
Group runs
>>> grouped = rc.groupby("width")
>>> for key, group in grouped.items():
...     print(key, group)
1024 RunCollection(1)
400 RunCollection(3)
600 RunCollection(3)
>>> grouped = rc.groupby(["height"])
>>> for key, group in grouped.items():
...     print(key, group)
('768',) RunCollection(1)
('100',) RunCollection(2)
('200',) RunCollection(2)
('300',) RunCollection(2)
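The output above shows that grouping by a single name yields scalar keys, while grouping by a list of names yields tuple keys, even when the list has one element. The plain-Python sketch below models that behavior over stand-in parameter dictionaries. The group_by helper is hypothetical and only mirrors what the printed results suggest; treating the keys as strings is an assumption based on the ('768',) output.

```python
from collections import defaultdict

# Stand-in for the runs' params; MLflow stores param values as strings.
params = [{"width": "1024", "height": "768"}] + [
    {"width": w, "height": h} for w in ("400", "600") for h in ("100", "200", "300")
]


def group_by(items, keys):
    """Group dicts by one key (scalar group key) or a list of keys (tuple group key)."""
    groups = defaultdict(list)
    for item in items:
        if isinstance(keys, str):
            group_key = item[keys]
        else:
            group_key = tuple(item[k] for k in keys)
        groups[group_key].append(item)
    return dict(groups)


by_width = group_by(params, "width")
by_height = group_by(params, ["height"])

for key, group in by_height.items():
    print(key, len(group))
```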
Config dataframe
>>> print(rc.data.config)
   width  height
0   1024     768
1    400     100
2    400     200
3    400     300
4    600     100
5    600     200
6    600     300