Automated search for optimal hyperparameters using Python conditionals, loops, and syntax
Efficiently search large spaces and prune unpromising trials for faster results
Parallelize hyperparameter searches over multiple threads or processes without modifying code
A simple optimization problem:
1. Define an objective function to be optimized. Let's minimize (x - 2)^2.
2. Suggest hyperparameter values using a trial object. Here, a float value of x is suggested from -10 to 10.
3. Create a study object and invoke the optimize method over 100 trials.

import optuna
def objective(trial):
    x = trial.suggest_uniform('x', -10, 10)
    return (x - 2) ** 2

study = optuna.create_study()
study.optimize(objective, n_trials=100)

study.best_params  # E.g. {'x': 2.002108042}
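The pruning and parallelization features highlighted above need only a few extra lines on top of this example. A minimal sketch, extending the quadratic objective with made-up intermediate values so the default MedianPruner has something to act on:

import optuna

def objective(trial):
    x = trial.suggest_uniform('x', -10, 10)
    for step in range(100):
        # A toy "training curve" that converges toward the final value.
        intermediate_value = (x - 2) ** 2 * (100 - step) / 100
        # Report intermediate results so the pruner can stop unpromising trials early.
        trial.report(intermediate_value, step)
        if trial.should_prune():
            raise optuna.TrialPruned()
    return (x - 2) ** 2

study = optuna.create_study()  # MedianPruner is the default pruner.
# n_jobs runs trials concurrently in threads; the objective needs no changes.
study.optimize(objective, n_trials=100, n_jobs=4)

To parallelize across processes or machines instead, create the study with a shared storage, e.g. optuna.create_study(study_name='example', storage='sqlite:///example.db', load_if_exists=True), and launch the same script several times.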
You can optimize PyTorch hyperparameters, such as the number of layers and the number of hidden nodes in each layer, in three steps:
1. Wrap model training with an objective function and return accuracy
2. Suggest hyperparameters using a trial object
3. Create a study object and execute the optimization

import torch
import optuna
# 1. Define an objective function to be maximized.
def objective(trial):
    # 2. Suggest values of the hyperparameters using a trial object.
    n_layers = trial.suggest_int('n_layers', 1, 3)

    layers = []
    in_features = 28 * 28
    for i in range(n_layers):
        out_features = trial.suggest_int('n_units_l{}'.format(i), 4, 128)
        layers.append(torch.nn.Linear(in_features, out_features))
        layers.append(torch.nn.ReLU())
        in_features = out_features
    layers.append(torch.nn.Linear(in_features, 10))
    layers.append(torch.nn.LogSoftmax(dim=1))
    model = torch.nn.Sequential(*layers).to(torch.device('cpu'))
    ...
    return accuracy
# 3. Create a study object and optimize the objective function.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)
You can optimize Chainer hyperparameters, such as the number of layers and the number of hidden nodes in each layer, in three steps:
1. Wrap model training with an objective function and return accuracy
2. Suggest hyperparameters using a trial object
3. Create a study object and execute the optimization

import chainer
import chainer.functions as F
import chainer.links as L
import optuna
# 1. Define an objective function to be maximized.
def objective(trial):
    # 2. Suggest values of the hyperparameters using a trial object.
    n_layers = trial.suggest_int('n_layers', 1, 3)

    layers = []
    for i in range(n_layers):
        n_units = int(trial.suggest_loguniform('n_units_l{}'.format(i), 4, 128))
        layers.append(L.Linear(None, n_units))
        layers.append(F.relu)
    layers.append(L.Linear(None, 10))
    model = L.Classifier(chainer.Sequential(*layers))
    ...
    return accuracy
# 3. Create a study object and optimize the objective function.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)
You can optimize TensorFlow hyperparameters, such as the number of layers and the number of hidden nodes in each layer, in three steps:
1. Wrap model training with an objective function and return accuracy
2. Suggest hyperparameters using a trial object
3. Create a study object and execute the optimization

import tensorflow as tf
import optuna
# 1. Define an objective function to be maximized.
def objective(trial):
    # 2. Suggest values of the hyperparameters using a trial object.
    n_layers = trial.suggest_int('n_layers', 1, 3)
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Flatten())
    for i in range(n_layers):
        num_hidden = int(trial.suggest_loguniform('n_units_l{}'.format(i), 4, 128))
        model.add(tf.keras.layers.Dense(num_hidden, activation='relu'))
    # CLASSES (the number of output classes) is assumed to be defined elsewhere.
    model.add(tf.keras.layers.Dense(CLASSES))
    ...
    return accuracy
# 3. Create a study object and optimize the objective function.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)
You can optimize Keras hyperparameters, such as the number of filters and kernel size, in three steps:
1. Wrap model training with an objective function and return accuracy
2. Suggest hyperparameters using a trial object
3. Create a study object and execute the optimization

import keras
from keras.models import Sequential
from keras.layers import Conv2D, Dense, Flatten
from keras.optimizers import RMSprop
import optuna
# 1. Define an objective function to be maximized.
def objective(trial):
    model = Sequential()

    # 2. Suggest values of the hyperparameters using a trial object.
    # input_shape and CLASSES are assumed to be defined elsewhere.
    model.add(
        Conv2D(filters=trial.suggest_categorical('filters', [32, 64]),
               kernel_size=trial.suggest_categorical('kernel_size', [3, 5]),
               strides=trial.suggest_categorical('strides', [1, 2]),
               activation=trial.suggest_categorical('activation', ['relu', 'linear']),
               input_shape=input_shape))
    model.add(Flatten())
    model.add(Dense(CLASSES, activation='softmax'))

    # We compile our model with a sampled learning rate.
    lr = trial.suggest_loguniform('lr', 1e-5, 1e-1)
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=RMSprop(lr=lr),
                  metrics=['accuracy'])
    ...
    return accuracy
# 3. Create a study object and optimize the objective function.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)
You can optimize MXNet hyperparameters, such as the number of layers and the number of hidden nodes in each layer, in three steps:
1. Wrap model training with an objective function and return accuracy
2. Suggest hyperparameters using a trial object
3. Create a study object and execute the optimization

import mxnet as mx
import optuna
# 1. Define an objective function to be maximized.
def objective(trial):
    # 2. Suggest values of the hyperparameters using a trial object.
    n_layers = trial.suggest_int('n_layers', 1, 3)

    data = mx.symbol.Variable('data')
    data = mx.sym.flatten(data=data)
    for i in range(n_layers):
        num_hidden = int(trial.suggest_loguniform('n_units_l{}'.format(i), 4, 128))
        data = mx.symbol.FullyConnected(data=data, num_hidden=num_hidden)
        data = mx.symbol.Activation(data=data, act_type="relu")
    data = mx.symbol.FullyConnected(data=data, num_hidden=10)
    mlp = mx.symbol.SoftmaxOutput(data=data, name="softmax")
    ...
    return accuracy
# 3. Create a study object and optimize the objective function.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)
You can optimize Scikit-Learn hyperparameters, such as the C parameter of SVC and the max_depth of the RandomForestClassifier, in three steps:
1. Wrap model training with an objective function and return accuracy
2. Suggest hyperparameters using a trial object
3. Create a study object and execute the optimization

import sklearn.ensemble
import sklearn.svm
import optuna
# 1. Define an objective function to be maximized.
def objective(trial):
    # 2. Suggest values for the hyperparameters using a trial object.
    classifier_name = trial.suggest_categorical('classifier', ['SVC', 'RandomForest'])
    if classifier_name == 'SVC':
        svc_c = trial.suggest_loguniform('svc_c', 1e-10, 1e10)
        classifier_obj = sklearn.svm.SVC(C=svc_c, gamma='auto')
    else:
        rf_max_depth = int(trial.suggest_loguniform('rf_max_depth', 2, 32))
        classifier_obj = sklearn.ensemble.RandomForestClassifier(max_depth=rf_max_depth, n_estimators=10)
    ...
    return accuracy
# 3. Create a study object and optimize the objective function.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)
You can optimize XGBoost hyperparameters, such as the booster type and alpha, in three steps:
1. Wrap model training with an objective function and return accuracy
2. Suggest hyperparameters using a trial object
3. Create a study object and execute the optimization

import xgboost as xgb
import optuna
# 1. Define an objective function to be maximized.
def objective(trial):
    ...
    # 2. Suggest values of the hyperparameters using a trial object.
    param = {
        'silent': 1,
        'objective': 'binary:logistic',
        'booster': trial.suggest_categorical('booster', ['gbtree', 'gblinear', 'dart']),
        'lambda': trial.suggest_loguniform('lambda', 1e-8, 1.0),
        'alpha': trial.suggest_loguniform('alpha', 1e-8, 1.0),
    }
    bst = xgb.train(param, dtrain)
    ...
    return accuracy
# 3. Create a study object and optimize the objective function.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)
You can optimize LightGBM hyperparameters, such as boosting type and the number of leaves, in three steps:
1. Wrap model training with an objective function and return accuracy
2. Suggest hyperparameters using a trial object
3. Create a study object and execute the optimization

import lightgbm as lgb
import optuna
# 1. Define an objective function to be maximized.
def objective(trial):
    ...
    # 2. Suggest values of the hyperparameters using a trial object.
    param = {
        'objective': 'binary',
        'metric': 'binary_logloss',
        'verbosity': -1,
        'boosting_type': 'gbdt',
        'lambda_l1': trial.suggest_loguniform('lambda_l1', 1e-8, 10.0),
        'lambda_l2': trial.suggest_loguniform('lambda_l2', 1e-8, 10.0),
        'num_leaves': trial.suggest_int('num_leaves', 2, 256),
        'feature_fraction': trial.suggest_uniform('feature_fraction', 0.4, 1.0),
        'bagging_fraction': trial.suggest_uniform('bagging_fraction', 0.4, 1.0),
        'bagging_freq': trial.suggest_int('bagging_freq', 1, 7),
        'min_child_samples': trial.suggest_int('min_child_samples', 5, 100),
    }
    gbm = lgb.train(param, dtrain)
    ...
    return accuracy
# 3. Create a study object and optimize the objective function.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)
Check out more examples, including PyTorch Ignite, Dask-ML, and MLFlow, in our GitHub repository.
Optuna also provides visualization features, as in the following demo:
from optuna.visualization import plot_intermediate_values
...
plot_intermediate_values(study)
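Note that plot_intermediate_values has something to show only when trials report intermediate values via trial.report, as in the pruning sketch earlier on this page. The optuna.visualization module contains further plots that work on any finished study; for instance, plot_optimization_history (the study variable is assumed to come from one of the snippets above):

from optuna.visualization import plot_optimization_history
...
fig = plot_optimization_history(study)
fig.show()  # renders an interactive plotly figure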
Released Optuna v1.5.0 with new features, a useful alias, and performance improvements:
🆕 Cross-validation support for LightGBM Tuner
🆕 A new multi-objective optimization algorithm: NSGA-II
🆕 Mean Decrease Impurity (MDI) hyperparameter importance
— OptunaAutoML (@OptunaAutoML) June 1, 2020
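The MDI importance feature above is exposed through optuna.importance.get_param_importances. A minimal sketch under assumed conditions: the two-parameter objective is made up for illustration, and the MDI evaluator requires scikit-learn to be installed:

import optuna

def objective(trial):
    x = trial.suggest_uniform('x', -10, 10)
    y = trial.suggest_uniform('y', -10, 10)
    return (x - 2) ** 2 + 0.1 * y  # x should dominate the importances

study = optuna.create_study()
study.optimize(objective, n_trials=100)

importances = optuna.importance.get_param_importances(
    study,
    evaluator=optuna.importance.MeanDecreaseImpurityImportanceEvaluator())
print(importances)  # an OrderedDict of parameters sorted by importance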
If you use Optuna in a scientific publication, please use the following citation:
Takuya Akiba, Shotaro Sano, Toshihiko Yanase, Takeru Ohta, and Masanori Koyama. 2019. Optuna: A Next-generation Hyperparameter Optimization Framework. In KDD.
Bibtex entry:
@inproceedings{optuna_2019,
    title={Optuna: A Next-generation Hyperparameter Optimization Framework},
    author={Akiba, Takuya and Sano, Shotaro and Yanase, Toshihiko and Ohta, Takeru and Koyama, Masanori},
    booktitle={Proceedings of the 25th {ACM} {SIGKDD} International Conference on Knowledge Discovery and Data Mining},
    year={2019}
}