Use a sweep configuration to define the hyperparameters to optimize during training, the search strategy to use, and other sweep settings.
The following sections describe the top-level structure of a sweep configuration. For a comprehensive list of top-level keys, see Sweep configuration options.
Basic structure
Sweep configurations use key-value pairs and nested structures. You can define your sweep configuration in a YAML file or in a Python dictionary. The structure of the sweep configuration is the same regardless of where you define it.
Where to define your sweep configuration? Define your sweep configuration in a YAML file if you want to manage sweeps from the command line or keep the sweep configuration separate from your training code. Define your sweep configuration in a Python dictionary if your training algorithm is defined in a Python script or notebook, or if you want to keep the sweep configuration close to your training code.
Top-level keys define qualities of your sweep search such as the name of the sweep (name key), the parameters to search through (parameters key), the methodology to search the parameter space (method key), and more.
The values associated with each key can be a string, a number, a list, or another nested key-value pair. The value type depends on the key.
For example, the following code snippet shows a sweep configuration with the method, metric, and parameters keys. The method key specifies the search strategy (bayes). The metric key specifies the metric to optimize and whether to minimize or maximize it. The parameters key specifies the hyperparameters to optimize and their values or distributions.
The following code snippet shows how to define a sweep configuration in a YAML file named config.yaml:

```yaml
program: train.py
name: sweepdemo
method: bayes
metric:
  goal: minimize
  name: validation_loss
parameters:
  learning_rate:
    min: 0.0001
    max: 0.1
  batch_size:
    values: [16, 32, 64]
  epochs:
    values: [5, 10, 15]
  optimizer:
    values: ["adam", "sgd"]
```
Within the top-level parameters key, the following keys are nested: learning_rate, batch_size, epochs, and optimizer. For each nested key you specify, you can provide one or more values, a distribution, a probability, and more.

The following code snippet stores the same sweep configuration in a Python dictionary named sweep_configuration:

```python
sweep_configuration = {
    "name": "sweepdemo",
    "method": "bayes",
    "metric": {"goal": "minimize", "name": "validation_loss"},
    "parameters": {
        "learning_rate": {"min": 0.0001, "max": 0.1},
        "batch_size": {"values": [16, 32, 64]},
        "epochs": {"values": [5, 10, 15]},
        "optimizer": {"values": ["adam", "sgd"]},
    },
}
```
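For instance, with the random search method you can weight categorical values using the probabilities key. A minimal sketch as a Python dictionary (the weights here are illustrative):

```python
# Minimal sketch: weighting categorical values with the `probabilities` key.
# This applies to random search; the probabilities must sum to 1.
sweep_configuration = {
    "method": "random",
    "metric": {"goal": "minimize", "name": "validation_loss"},
    "parameters": {
        "optimizer": {
            "values": ["adam", "sgd"],
            "probabilities": [0.8, 0.2],  # pick "adam" 80% of the time
        },
    },
}
```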
Double nested parameters
Sweep configurations support nested parameters. Double nested parameters are useful for organizing your hyperparameters into categories. For example, you can group hyperparameters related to the optimizer under an optimizer category and group hyperparameters related to the model architecture under a model category.
To define a nested parameter, include an additional parameters key under the top-level parameter name.
The following example shows a sweep configuration with three nested parameter categories: nested_category_1, nested_category_2, and nested_category_3. Each category nests two parameters: momentum and weight_decay.
nested_category_1, nested_category_2, and nested_category_3 are placeholders. Replace them with names that fit your use case.
The following code snippets show how to define nested parameters in both a YAML file and a Python dictionary.
```yaml
program: sweep_nest.py
name: nested_sweep
method: random
metric:
  name: loss
  goal: minimize
parameters:
  optimizer:
    values: ['adam', 'sgd']
  fc_layer_size:
    values: [128, 256, 512]
  dropout:
    values: [0.3, 0.4, 0.5]
  epochs:
    value: 1
  learning_rate:
    distribution: uniform
    min: 0
    max: 0.1
  batch_size:
    distribution: q_log_uniform_values
    q: 8
    min: 32
    max: 256
  nested_category_1:
    parameters:
      momentum:
        distribution: uniform
        min: 0.0
        max: 0.9
      weight_decay:
        values: [0.0001, 0.0005, 0.001]
  nested_category_2:
    parameters:
      momentum:
        distribution: uniform
        min: 0.0
        max: 0.9
      weight_decay:
        values: [0.1, 0.2, 0.3]
  nested_category_3:
    parameters:
      momentum:
        distribution: uniform
        min: 0.5
        max: 0.7
      weight_decay:
        values: [0.2, 0.3, 0.4]
```
The same configuration as a Python dictionary:

```python
sweep_configuration = {
    "program": "sweep_nest.py",
    "name": "nested_sweep",
    "method": "random",
    "metric": {
        "name": "loss",
        "goal": "minimize",
    },
    "parameters": {
        "optimizer": {
            "values": ["adam", "sgd"],
        },
        "fc_layer_size": {
            "values": [128, 256, 512],
        },
        "dropout": {
            "values": [0.3, 0.4, 0.5],
        },
        "epochs": {
            "value": 1,
        },
        "learning_rate": {
            "distribution": "uniform",
            "min": 0,
            "max": 0.1,
        },
        "batch_size": {
            "distribution": "q_log_uniform_values",
            "q": 8,
            "min": 32,
            "max": 256,
        },
        "nested_category_1": {
            "parameters": {
                "momentum": {
                    "distribution": "uniform",
                    "min": 0.0,
                    "max": 0.9,
                },
                "weight_decay": {
                    "values": [0.0001, 0.0005, 0.001],
                },
            },
        },
        "nested_category_2": {
            "parameters": {
                "momentum": {
                    "distribution": "uniform",
                    "min": 0.0,
                    "max": 0.9,
                },
                "weight_decay": {
                    "values": [0.1, 0.2, 0.3],
                },
            },
        },
        "nested_category_3": {
            "parameters": {
                "momentum": {
                    "distribution": "uniform",
                    "min": 0.5,
                    "max": 0.7,
                },
                "weight_decay": {
                    "values": [0.2, 0.3, 0.4],
                },
            },
        },
    },
}
```
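Inside a run, nested parameters surface as nested dictionaries on run.config. A minimal sketch of a training function that reads the values above (the loss computation is a placeholder):

```python
import wandb

def train():
    # During a sweep run, the agent injects the sampled values into run.config.
    with wandb.init() as run:
        momentum = run.config["nested_category_1"]["momentum"]
        weight_decay = run.config["nested_category_1"]["weight_decay"]
        run.log({"loss": momentum + weight_decay})  # placeholder computation
```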
Nested parameters defined in a sweep configuration overwrite keys specified in a W&B run configuration. As an example, suppose you have a train.py script that initializes a run with a nested default:

```python
import wandb

def main():
    with wandb.init(config={"nested_param": {"manual_key": 1}}) as run:
        pass  # Your training code here
```
Your sweep configuration defines nested parameters under a top-level "parameters" key:

```python
sweep_configuration = {
    "method": "grid",
    "metric": {"name": "score", "goal": "minimize"},
    "parameters": {
        "top_level_param": {"value": 0},
        "nested_param": {
            "parameters": {
                "learning_rate": {"value": 0.01},
                "double_nested_param": {
                    "parameters": {"x": {"value": 0.9}, "y": {"value": 0.8}}
                },
            }
        },
    },
}

sweep_id = wandb.sweep(sweep=sweep_configuration, project="<project>")
wandb.agent(sweep_id, function=main, count=4)
```
During a sweep run, run.config["nested_param"] reflects only the subtree defined by the sweep configuration (learning_rate and double_nested_param); it does not include manual_key from wandb.init(config=...).
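Concretely, printing the subtree inside main() during a sweep run shows only the sweep-defined keys (the output shown is illustrative):

```python
# Inside main(), during a sweep run:
print(run.config["nested_param"])
# {'learning_rate': 0.01, 'double_nested_param': {'x': 0.9, 'y': 0.8}}
# 'manual_key' from wandb.init(config=...) is not present.
```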
Sweep configuration template
The following template shows how you can configure parameters and specify search constraints. Replace hyperparameter_name with the name of your hyperparameter, and replace any values enclosed in <> with your own values.
```yaml
program: <insert>
method: <insert>
parameters:
  hyperparameter_name0:
    value: 0
  hyperparameter_name1:
    values: [0, 0, 0]
  hyperparameter_name:
    distribution: <insert>
    value: <insert>
  hyperparameter_name2:
    distribution: <insert>
    min: <insert>
    max: <insert>
    q: <insert>
  hyperparameter_name3:
    distribution: <insert>
    values:
      - <list_of_values>
      - <list_of_values>
      - <list_of_values>
early_terminate:
  type: hyperband
  s: 0
  eta: 0
  max_iter: 0
command:
  - ${Command macro}
  - ${Command macro}
  - ${Command macro}
  - ${Command macro}
```
To express a numeric value in scientific notation, add the YAML !!float tag, which casts the value to a floating-point number. For example, min: !!float 1e-5. In a Python dictionary, write the float directly, for example "min": 1e-5. See Command example.
Sweep configuration examples
The following example defines a random search in a YAML file:

```yaml
program: train.py
method: random
metric:
  goal: minimize
  name: loss
parameters:
  batch_size:
    distribution: q_log_uniform_values
    max: 256
    min: 32
    q: 8
  dropout:
    values: [0.3, 0.4, 0.5]
  epochs:
    value: 1
  fc_layer_size:
    values: [128, 256, 512]
  learning_rate:
    distribution: uniform
    max: 0.1
    min: 0
  optimizer:
    values: ["adam", "sgd"]
```
The same configuration as a Python dictionary:

```python
sweep_config = {
    "method": "random",
    "metric": {"goal": "minimize", "name": "loss"},
    "parameters": {
        "batch_size": {
            "distribution": "q_log_uniform_values",
            "max": 256,
            "min": 32,
            "q": 8,
        },
        "dropout": {"values": [0.3, 0.4, 0.5]},
        "epochs": {"value": 1},
        "fc_layer_size": {"values": [128, 256, 512]},
        "learning_rate": {"distribution": "uniform", "max": 0.1, "min": 0},
        "optimizer": {"values": ["adam", "sgd"]},
    },
}
```
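To start this sweep from Python, pass the dictionary to wandb.sweep() and launch an agent with wandb.agent(), as shown in the nested-parameter example above. A minimal sketch (the project name and training function are placeholders you supply):

```python
import wandb

def train():
    with wandb.init() as run:
        # Read sampled hyperparameters from run.config and log the target metric.
        run.log({"loss": run.config["learning_rate"]})  # placeholder metric

sweep_id = wandb.sweep(sweep=sweep_config, project="my-project")  # placeholder project
wandb.agent(sweep_id, function=train, count=10)
```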
Bayes hyperband example
```yaml
program: train.py
method: bayes
metric:
  goal: minimize
  name: val_loss
parameters:
  dropout:
    values: [0.15, 0.2, 0.25, 0.3, 0.4]
  hidden_layer_size:
    values: [96, 128, 148]
  layer_1_size:
    values: [10, 12, 14, 16, 18, 20]
  layer_2_size:
    values: [24, 28, 32, 36, 40, 44]
  learn_rate:
    values: [0.001, 0.01, 0.003]
  decay:
    values: [!!float 1e-5, !!float 1e-6, !!float 1e-7]
  momentum:
    values: [0.8, 0.9, 0.95]
  epochs:
    value: 27
early_terminate:
  type: hyperband
  s: 2
  eta: 3
  max_iter: 27
```
The following examples show how to specify either a minimum or maximum number of iterations for early_terminate:
With min_iter: 3 and the default eta of 3, the brackets for this example are [3, 3*eta, 3*eta*eta, 3*eta*eta*eta], which equals [3, 9, 27, 81]:

```yaml
early_terminate:
  type: hyperband
  min_iter: 3
```
With max_iter: 27 and s: 2, the brackets for this example are [27/eta, 27/eta/eta], which equals [9, 3]:

```yaml
early_terminate:
  type: hyperband
  max_iter: 27
  s: 2
```
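The bracket arithmetic can be sanity-checked with a few lines of Python. This is an illustration of the formulas above, not W&B's internal implementation (eta defaults to 3):

```python
def brackets_from_min_iter(min_iter: int, eta: int = 3, count: int = 4) -> list[int]:
    # Brackets grow geometrically from min_iter: min_iter * eta**i.
    return [min_iter * eta**i for i in range(count)]

def brackets_from_max_iter(max_iter: int, s: int, eta: int = 3) -> list[int]:
    # Brackets shrink geometrically from max_iter, one bracket per step of s.
    return [max_iter // eta**i for i in range(1, s + 1)]

print(brackets_from_min_iter(3))        # [3, 9, 27, 81]
print(brackets_from_max_iter(27, s=2))  # [9, 3]
```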
Macro and custom command arguments example
For more complex command line arguments, you can use macros to pass environment variables, the Python interpreter, and additional arguments. W&B supports predefined macros and custom command line arguments that you can specify in your sweep configuration.
For example, the following sweep configuration (sweep.yaml) defines a command that runs a Python script (run.py), with the ${env}, ${interpreter}, and ${program} macros replaced with the appropriate values when the sweep runs. The --batch_size=${batch_size} and --optimizer=${optimizer} arguments use custom macros to pass the values of the batch_size and optimizer parameters defined in the sweep configuration, while --test=True passes a fixed value.
```yaml
program: run.py
method: random
metric:
  name: validation_loss
parameters:
  learning_rate:
    min: 0.0001
    max: 0.1
  # batch_size and optimizer must be defined here so that the
  # ${batch_size} and ${optimizer} custom macros resolve.
  batch_size:
    values: [16, 32, 64]
  optimizer:
    values: ["adam", "sgd"]
command:
  - ${env}
  - ${interpreter}
  - ${program}
  - "--batch_size=${batch_size}"
  - "--optimizer=${optimizer}"
  - "--test=True"
```
The associated Python script (run.py) can then parse these command line arguments using the argparse module.
```python
# run.py
import argparse

import wandb

parser = argparse.ArgumentParser()
parser.add_argument('--batch_size', type=int)
parser.add_argument('--optimizer', type=str, choices=['adam', 'sgd'], required=True)
# str2bool converts strings such as "True" to booleans; see the
# "Boolean arguments" section below for its definition.
parser.add_argument('--test', type=str2bool, default=False)
args = parser.parse_args()

# Initialize a W&B run
with wandb.init(project='test-project') as run:
    run.log({'validation_loss': 1})
```
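With sweep.yaml and run.py in place, you can start the sweep from the command line with wandb sweep sweep.yaml, then launch an agent with wandb agent <sweep_id>, where <sweep_id> is printed by the wandb sweep command.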
See the Command macros section in Sweep configuration options for a list of predefined macros you can use in your sweep configuration.
Boolean arguments
The argparse module does not support boolean arguments by default. To define a boolean argument, you can use the action parameter or a custom function that converts the string representation of the boolean value to a boolean type.

As an example, you can use the following code snippet to define a boolean argument. Pass action='store_true' or action='store_false' to ArgumentParser.add_argument():
```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--test', action='store_true')
args = parser.parse_args()
args.test  # True if --test is passed, otherwise False
```
You can also define a custom function to convert the string representation of the boolean value to a boolean type. For example, the following code snippet defines the str2bool function, which converts a string to a boolean value.
```python
def str2bool(v: str) -> bool:
    """Convert a string to a boolean. This is required because
    argparse does not support boolean arguments by default.
    """
    if isinstance(v, bool):
        return v
    return v.lower() in ('yes', 'true', 't', '1')
```
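A brief usage sketch, reusing str2bool from above (the flag name mirrors the earlier run.py example):

```python
import argparse

parser = argparse.ArgumentParser()
# str2bool is the helper defined above.
parser.add_argument('--test', type=str2bool, default=False)

print(parser.parse_args(['--test', 'yes']).test)    # True
print(parser.parse_args(['--test', 'False']).test)  # False
print(parser.parse_args([]).test)                   # False (the default)
```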