Search Algorithms

AutoGluon System Implementation Logic

https://raw.githubusercontent.com/zhanghang1989/AutoGluonWebdata/master/doc/api/autogluon_system.png

Important components of the AutoGluon system include the Searcher, Scheduler and Resource Manager:

  • The Searcher suggests hyperparameter configurations for the next training job.

  • The Scheduler runs the training job when computation resources become available.
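The division of labor between the two components can be sketched in plain Python. This is a minimal illustration only: the classes RandomSearcher and SequentialScheduler and the function toy_train are hypothetical names invented for this sketch and are not part of the AutoGluon API.

```python
import random


class RandomSearcher:
    """Hypothetical searcher: proposes configurations uniformly at random."""

    def __init__(self, space, seed=0):
        self.space = space              # {name: (low, high)}
        self.rng = random.Random(seed)

    def suggest(self):
        return {name: self.rng.uniform(lo, hi)
                for name, (lo, hi) in self.space.items()}


class SequentialScheduler:
    """Hypothetical scheduler: runs one job at a time, in submission order."""

    def __init__(self, train_fn, searcher, num_trials):
        self.train_fn = train_fn
        self.searcher = searcher
        self.num_trials = num_trials
        self.results = []

    def run(self):
        for _ in range(self.num_trials):
            config = self.searcher.suggest()   # the searcher picks the next config
            reward = self.train_fn(config)     # the scheduler executes the job
            self.results.append((config, reward))
        # return the (config, reward) pair with the highest reward
        return max(self.results, key=lambda r: r[1])


def toy_train(config):
    # Stand-in objective: the reward is highest when lr is close to 0.005.
    return -abs(config["lr"] - 0.005)


scheduler = SequentialScheduler(toy_train,
                                RandomSearcher({"lr": (1e-3, 1e-2)}),
                                num_trials=20)
best_config, best_reward = scheduler.run()
print(best_config, best_reward)
```

A real scheduler would also wait for free resources before launching each job; that part is left out here to keep the loop readable.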

In this tutorial, we illustrate how various search algorithms work and compare their performance via toy experiments.

FIFO Scheduling vs. Early Stopping

In this section, we compare two schedulers: autogluon.scheduler.FIFOScheduler, which runs trials in First In, First Out (FIFO) order and trains each one to completion, and autogluon.scheduler.HyperbandScheduler, a preemptive scheduler that terminates training jobs early when they do not appear promising during their first epochs.

Create a Dummy Training Function

import numpy as np
import autogluon as ag

@ag.args(
    lr=ag.space.Real(1e-3, 1e-2, log=True),
    wd=ag.space.Real(1e-3, 1e-2))
def train_fn(args, reporter):
    for e in range(10):
        dummy_accuracy = 1 - np.power(1.8, -np.random.uniform(e, 2*e))
        reporter(epoch=e, accuracy=dummy_accuracy, lr=args.lr, wd=args.wd)
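Because the exponent in train_fn is drawn uniformly between e and 2*e, the dummy accuracy at each epoch is confined to a known interval. The short sketch below computes that interval; the helper name accuracy_bounds is introduced here for illustration only.

```python
import numpy as np


def accuracy_bounds(epoch, base=1.8):
    """Range of dummy_accuracy at a given epoch.

    The exponent is sampled from U(epoch, 2 * epoch), so the smallest
    exponent gives the lower bound and the largest gives the upper bound.
    """
    lower = 1 - np.power(base, -float(epoch))
    upper = 1 - np.power(base, -2.0 * epoch)
    return lower, upper


for e in (0, 1, 5, 9):
    lo, hi = accuracy_bounds(e)
    print(f"epoch {e}: accuracy in [{lo:.4f}, {hi:.4f}]")
```

Note that the hyperparameters lr and wd do not enter the formula, so every configuration follows the same noisy curve and differences between trials are due to randomness alone.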

FIFO Scheduler

This scheduler runs training trials in order. When there are more resources available than required for a single training job, multiple training jobs may be run in parallel.
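As a rough back-of-the-envelope check of how many trials could run side by side, one can divide the available resources by the per-trial request. The helper max_parallel_trials below is a hypothetical function written for this sketch, and the 8-CPU machine is an assumption; AutoGluon's own resource accounting is not shown in this tutorial and may differ.

```python
def max_parallel_trials(available, required):
    """Number of trials that fit at once, limited by the scarcest resource."""
    counts = [available.get(name, 0) // amount
              for name, amount in required.items() if amount > 0]
    return min(counts) if counts else 0


# Assumed machine with 8 CPUs and no GPUs; each trial asks for 2 CPUs.
print(max_parallel_trials({'num_cpus': 8, 'num_gpus': 0},
                          {'num_cpus': 2, 'num_gpus': 0}))
```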

scheduler = ag.scheduler.FIFOScheduler(train_fn,
                                       resource={'num_cpus': 2, 'num_gpus': 0},
                                       num_trials=20,
                                       reward_attr='accuracy',
                                       time_attr='epoch')
scheduler.run()
scheduler.join_jobs()
Starting Experiments
Num of Finished Tasks is 0
Num of Pending Tasks is 20
Finished Task with config: {'lr': 0.0031622777, 'wd': 0.0055} and reward: 0.9997084979745358
Finished Task with config: {'lr': 0.0023680037344273497, 'wd': 0.00980258795956366} and reward: 0.9995104379366804
Finished Task with config: {'lr': 0.0023046823367855198, 'wd': 0.009378780848622207} and reward: 0.9987710776248258
Finished Task with config: {'lr': 0.0025117716912458543, 'wd': 0.0038650137343571173} and reward: 0.9994946008106337
Finished Task with config: {'lr': 0.003168036351039268, 'wd': 0.002534119969228645} and reward: 0.9981324732151834
Finished Task with config: {'lr': 0.0020131145674300343, 'wd': 0.0021859370411764787} and reward: 0.9998127858805036
Finished Task with config: {'lr': 0.0011329846062054097, 'wd': 0.008824246292624089} and reward: 0.9996418284140945
Finished Task with config: {'lr': 0.003080530089521116, 'wd': 0.007726110773817898} and reward: 0.9995987038392062
Finished Task with config: {'lr': 0.005847107271205574, 'wd': 0.004942982079623443} and reward: 0.9991894482946119
Finished Task with config: {'lr': 0.008352945961095018, 'wd': 0.0012215813909021843} and reward: 0.9999440389170993
Finished Task with config: {'lr': 0.0015172111054862344, 'wd': 0.0012617866827924137} and reward: 0.9968406995955217
Finished Task with config: {'lr': 0.003246931011868301, 'wd': 0.005999891177389028} and reward: 0.997560417275033
Finished Task with config: {'lr': 0.0012479272335196967, 'wd': 0.0064839942038445115} and reward: 0.9998894026211204
Finished Task with config: {'lr': 0.0026442724878534417, 'wd': 0.008818400050521681} and reward: 0.9986559472047832
Finished Task with config: {'lr': 0.003352123698691432, 'wd': 0.0019266265560927182} and reward: 0.9984360433059107
Finished Task with config: {'lr': 0.004019654627141818, 'wd': 0.009997445340586348} and reward: 0.9999652768067441
Finished Task with config: {'lr': 0.0018891215483215682, 'wd': 0.009473543431092687} and reward: 0.9992405701254338
Finished Task with config: {'lr': 0.0033574826489634908, 'wd': 0.0059305101086370125} and reward: 0.9956597381250288
Finished Task with config: {'lr': 0.00725610071976362, 'wd': 0.006768250148069643} and reward: 0.9998253586497194
Finished Task with config: {'lr': 0.0010579233181392657, 'wd': 0.0016007869455174181} and reward: 0.9981920526565943

Visualize the results:

scheduler.get_training_curves(plot=True, use_legend=False)
../../_images/output_algorithm_17e5ae_5_0.png

Hyperband Scheduler

The Hyperband Scheduler terminates training trials that don’t appear promising during the early stages to free up compute resources for more promising hyperparameter configurations.
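The idea of early termination can be illustrated with a simplified, synchronous successive-halving loop. This is a sketch under stated assumptions and not AutoGluon's implementation: the function successive_halving, the reduction factor of 3, and the choice to number epochs from 1 (so that the first rung already separates trials) are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)


def dummy_accuracy(epoch):
    # Same toy curve as in train_fn, evaluated at a 1-based epoch index.
    return 1 - np.power(1.8, -rng.uniform(epoch, 2 * epoch))


def successive_halving(num_trials=20, max_epochs=10,
                       grace_period=1, reduction_factor=3):
    """Train all trials to a rung, keep the top fraction, and repeat."""
    active = list(range(num_trials))
    epochs_run = {t: 0 for t in active}
    score = {t: 0.0 for t in active}
    rung = grace_period
    while active:
        target = min(rung, max_epochs)
        for t in active:
            # Continue each surviving trial from where it stopped.
            for e in range(epochs_run[t] + 1, target + 1):
                score[t] = dummy_accuracy(e)
            epochs_run[t] = target
        if target == max_epochs:
            break
        # Keep only the best 1/reduction_factor of the trials (at least one).
        keep = max(1, len(active) // reduction_factor)
        active = sorted(active, key=score.get, reverse=True)[:keep]
        rung *= reduction_factor
    return epochs_run, score


epochs_run, score = successive_halving()
print("total epochs trained:", sum(epochs_run.values()))
print("epochs per trial:", sorted(epochs_run.values()))
```

With these settings most trials stop after their first epoch, so far fewer epochs are trained in total than the 20 x 10 epochs needed to run every trial to completion.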

scheduler = ag.scheduler.HyperbandScheduler(train_fn,
                                            resource={'num_cpus': 2, 'num_gpus': 0},
                                            num_trials=20,
                                            reward_attr='accuracy',
                                            time_attr='epoch',
                                            grace_period=1)
scheduler.run()
scheduler.join_jobs()
Starting Experiments
Num of Finished Tasks is 0
Num of Pending Tasks is 20
Finished Task with config: {'lr': 0.0013119769451703492, 'wd': 0.008420449686819006} and reward: 0.9363001183692645
Finished Task with config: {'lr': 0.0031622777, 'wd': 0.0055} and reward: 0.9997699754494808
Finished Task with config: {'lr': 0.003100250158951793, 'wd': 0.006954318479506093} and reward: 0.9994296287730586
Finished Task with config: {'lr': 0.006207899654437529, 'wd': 0.005115637879648105} and reward: 0.999338740990627
Finished Task with config: {'lr': 0.004399792123378237, 'wd': 0.006869939159367451} and reward: 0.6054625760180542
Finished Task with config: {'lr': 0.0010848797780777036, 'wd': 0.00756449663334364} and reward: 0.5906063229485552
Finished Task with config: {'lr': 0.0022223750998211973, 'wd': 0.0035697578608729005} and reward: 0.600553651320286
Finished Task with config: {'lr': 0.001816798784587603, 'wd': 0.007986850459136927} and reward: 0.999848989499412
Finished Task with config: {'lr': 0.005348622313046577, 'wd': 0.004502924001323096} and reward: 0.5301071514197571
Finished Task with config: {'lr': 0.0022738853694636613, 'wd': 0.008781111384177927} and reward: 0.5558467195070161
Finished Task with config: {'lr': 0.0016685773960210763, 'wd': 0.00710863043263675} and reward: 0.9416078718398605
Finished Task with config: {'lr': 0.0065987519696654225, 'wd': 0.008683030564565233} and reward: 0.5537434694264384
Finished Task with config: {'lr': 0.004254685027867591, 'wd': 0.004157280971186315} and reward: 0.5735568507416069
Finished Task with config: {'lr': 0.0018192983530967626, 'wd': 0.004973151625529599} and reward: 0.9078010080001341
Finished Task with config: {'lr': 0.009570899165962763, 'wd': 0.0067388585070934075} and reward: 0.5365378006043358
Finished Task with config: {'lr': 0.00943527319597958, 'wd': 0.009634438404685888} and reward: 0.6099215773475833
Finished Task with config: {'lr': 0.004397422638242178, 'wd': 0.0022976414780964266} and reward: 0.5770821737380238
Finished Task with config: {'lr': 0.0014676817813007168, 'wd': 0.00982514671442718} and reward: 0.9966026462873852
Finished Task with config: {'lr': 0.003547927953451838, 'wd': 0.006632577831804181} and reward: 0.4622331004750234
Finished Task with config: {'lr': 0.0017914032192079133, 'wd': 0.005011770337491084} and reward: 0.9999623401474572

Visualize the results:

scheduler.get_training_curves(plot=True, use_legend=False)
../../_images/output_algorithm_17e5ae_9_0.png

Random Search vs. Reinforcement Learning

In this section, we demonstrate the behaviors of random search and reinforcement learning in a simple simulation environment.

Create a Reward Function for Toy Experiments

Import the packages:

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

The input space is x ∈ [0, 99], y ∈ [0, 99]. The reward is a combination of two Gaussians, as shown in the figure below.

Generate the simulated reward as a mixture of two Gaussians:

def gaussian2d(x, y, x0, y0, xalpha, yalpha, A):
    return A * np.exp( -((x-x0)/xalpha)**2 -((y-y0)/yalpha)**2)

x, y = np.linspace(0, 99, 100), np.linspace(0, 99, 100)
X, Y = np.meshgrid(x, y)

Z = np.zeros(X.shape)
ps = [(20, 70, 35, 40, 1),
      (80, 40, 20, 20, 0.7)]
for p in ps:
    Z += gaussian2d(X, Y, *p)

Visualize the reward space:

fig = plt.figure()
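# fig.gca(projection='3d') no longer works in recent Matplotlib releases; add_subplot does
ax = fig.add_subplot(111, projection='3d')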
ax.plot_surface(X, Y, Z, cmap='plasma')
ax.set_zlim(0,np.max(Z)+2)
plt.show()
../../_images/output_algorithm_17e5ae_15_0.png
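As a sanity check on the reward surface, the location of its global maximum can be computed directly. The sketch below rebuilds Z from the same parameters so that it runs on its own; the variable names row and col are introduced here for illustration.

```python
import numpy as np


def gaussian2d(x, y, x0, y0, xalpha, yalpha, A):
    return A * np.exp(-((x - x0) / xalpha) ** 2 - ((y - y0) / yalpha) ** 2)


x, y = np.linspace(0, 99, 100), np.linspace(0, 99, 100)
X, Y = np.meshgrid(x, y)
Z = gaussian2d(X, Y, 20, 70, 35, 40, 1) + gaussian2d(X, Y, 80, 40, 20, 20, 0.7)

# np.meshgrid uses 'xy' indexing by default: rows follow y, columns follow x.
row, col = np.unravel_index(np.argmax(Z), Z.shape)
print(f"global peak at x={col}, y={row}, reward={Z[row, col]:.4f}")
print(f"reward at the centre of the second Gaussian (x=80, y=40): {Z[40, 80]:.4f}")
```

Since rows correspond to y and columns to x, a point (x, y) is looked up as Z[y][x]. The second Gaussian forms a lower, secondary bump, which gives a search algorithm a local optimum to get stuck in.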

Create Training Function

We can define an AutoGluon searchable function simply by applying the ag.args decorator. The reporter is used to communicate results to the AutoGluon search and scheduling algorithms.

@ag.args(
    x=ag.space.Categorical(*list(range(100))),
    y=ag.space.Categorical(*list(range(100))),
)
def rl_simulation(args, reporter):
    x, y = args.x, args.y
    reporter(accuracy=Z[y][x])
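To build intuition for how a learned policy can concentrate its samples on the high-reward region, here is a generic REINFORCE-style search on the same surface, written in plain NumPy. It is an illustrative sketch only: it uses one independent softmax policy per dimension, and the function reinforce_search, its learning rate, and its moving-average baseline are assumptions of this example, not a description of the controller inside ag.scheduler.RLScheduler.

```python
import numpy as np


def gaussian2d(x, y, x0, y0, xalpha, yalpha, A):
    return A * np.exp(-((x - x0) / xalpha) ** 2 - ((y - y0) / yalpha) ** 2)


x, y = np.linspace(0, 99, 100), np.linspace(0, 99, 100)
X, Y = np.meshgrid(x, y)
Z = gaussian2d(X, Y, 20, 70, 35, 40, 1) + gaussian2d(X, Y, 80, 40, 20, 20, 0.7)


def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()


def reinforce_search(Z, num_trials=300, batch_size=4, lr=2.0, seed=0):
    rng = np.random.default_rng(seed)
    n_y, n_x = Z.shape
    logits_x, logits_y = np.zeros(n_x), np.zeros(n_y)
    baseline = 0.0          # moving average of rewards, reduces variance
    history = []
    for _ in range(num_trials // batch_size):
        px, py = softmax(logits_x), softmax(logits_y)
        xs = rng.choice(n_x, size=batch_size, p=px)
        ys = rng.choice(n_y, size=batch_size, p=py)
        rewards = Z[ys, xs]
        advantages = rewards - baseline
        grad_x, grad_y = np.zeros(n_x), np.zeros(n_y)
        for xi, yi, adv in zip(xs, ys, advantages):
            # Gradient of log softmax w.r.t. logits: one_hot(choice) - probs.
            gx = -px.copy()
            gx[xi] += 1.0
            gy = -py.copy()
            gy[yi] += 1.0
            grad_x += adv * gx
            grad_y += adv * gy
        # Gradient ascent on the expected reward.
        logits_x += lr * grad_x / batch_size
        logits_y += lr * grad_y / batch_size
        baseline = 0.9 * baseline + 0.1 * rewards.mean()
        history.extend(zip(xs.tolist(), ys.tolist(), rewards.tolist()))
    return history, softmax(logits_x), softmax(logits_y)


history, px, py = reinforce_search(Z)
best = max(history, key=lambda h: h[2])
print(f"best sample: x={best[0]}, y={best[1]}, reward={best[2]:.4f}")
```

The batch size of 4 and the 300 trials mirror the settings used with the scheduler in the next section; the remaining hyperparameters are arbitrary choices for this toy.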

Reinforcement Learning

rl_scheduler = ag.scheduler.RLScheduler(rl_simulation,
                                        resource={'num_cpus': 1, 'num_gpus': 0},
                                        num_trials=300,
                                        reward_attr="accuracy",
                                        controller_batch_size=4,
                                        controller_lr=5e-3)
rl_scheduler.run()
rl_scheduler.join_jobs()
print('Best config: {}, best reward: {}'.format(rl_scheduler.get_best_config(), rl_scheduler.get_best_reward()))
Reserved DistributedResource(
    Node = Remote REMOTE_ID: 0,
    <Remote: 'inproc://172.31.45.231/9976/1' processes=1 threads=8, memory=33.24 GB>
    nCPUs = 0) in Remote REMOTE_ID: 0,
    <Remote: 'inproc://172.31.45.231/9976/1' processes=1 threads=8, memory=33.24 GB>
Starting Experiments
Num of Finished Tasks is 0
Num of Pending Tasks is 300
  0%|          | 0/76 [00:00<?, ?it/s]Finished Task with config: {'x.choice': 54, 'y.choice': 60} and reward: 0.4131320021194963
Finished Task with config: {'x.choice': 59, 'y.choice': 85} and reward: 0.25248241948989986
Finished Task with config: {'x.choice': 71, 'y.choice': 54} and reward: 0.452177348673922
Finished Task with config: {'x.choice': 84, 'y.choice': 84} and reward: 0.03655423885462522
  1%|▏         | 1/76 [00:00<00:30,  2.45it/s]Finished Task with config: {'x.choice': 42, 'y.choice': 43} and reward: 0.44561928050295485
Finished Task with config: {'x.choice': 62, 'y.choice': 29} and reward: 0.3129734289166378
Finished Task with config: {'x.choice': 64, 'y.choice': 89} and reward: 0.16521724999224768
Finished Task with config: {'x.choice': 38, 'y.choice': 5} and reward: 0.055140459703497166
  3%|▎         | 2/76 [00:00<00:28,  2.55it/s]Finished Task with config: {'x.choice': 27, 'y.choice': 80} and reward: 0.9025895808289532
Finished Task with config: {'x.choice': 96, 'y.choice': 79} and reward: 0.01675323386032184
Finished Task with config: {'x.choice': 38, 'y.choice': 52} and reward: 0.6328227532925973
Finished Task with config: {'x.choice': 47, 'y.choice': 47} and reward: 0.4369372160417949
  4%|▍         | 3/76 [00:01<00:28,  2.60it/s]Finished Task with config: {'x.choice': 56, 'y.choice': 7} and reward: 0.039952445147950336
Finished Task with config: {'x.choice': 39, 'y.choice': 33} and reward: 0.3257993021856096
Finished Task with config: {'x.choice': 92, 'y.choice': 8} and reward: 0.03906803139561688
Finished Task with config: {'x.choice': 83, 'y.choice': 64} and reward: 0.2004520333101668
  5%|▌         | 4/76 [00:01<00:27,  2.64it/s]Finished Task with config: {'x.choice': 2, 'y.choice': 77} and reward: 0.7444461288479184
Finished Task with config: {'x.choice': 36, 'y.choice': 14} and reward: 0.11531548824101008
Finished Task with config: {'x.choice': 83, 'y.choice': 86} and reward: 0.03682398109240761
Finished Task with config: {'x.choice': 95, 'y.choice': 87} and reward: 0.010053264415937721
  7%|▋         | 5/76 [00:01<00:26,  2.66it/s]Finished Task with config: {'x.choice': 97, 'y.choice': 46} and reward: 0.3161396814775167
Finished Task with config: {'x.choice': 46, 'y.choice': 52} and reward: 0.4974644053536026
Finished Task with config: {'x.choice': 79, 'y.choice': 77} and reward: 0.07935556115551717
Finished Task with config: {'x.choice': 79, 'y.choice': 67} and reward: 0.17085551866906823
  8%|▊         | 6/76 [00:02<00:26,  2.67it/s]Finished Task with config: {'x.choice': 11, 'y.choice': 15} and reward: 0.14131830986160224
Finished Task with config: {'x.choice': 71, 'y.choice': 53} and reward: 0.47455256379247157
Finished Task with config: {'x.choice': 63, 'y.choice': 94} and reward: 0.15444995894372754
Finished Task with config: {'x.choice': 57, 'y.choice': 75} and reward: 0.33073294541320397
  9%|▉         | 7/76 [00:02<00:26,  2.65it/s]Finished Task with config: {'x.choice': 51, 'y.choice': 26} and reward: 0.18846597355335637
Finished Task with config: {'x.choice': 10, 'y.choice': 19} and reward: 0.18136358228461952
Finished Task with config: {'x.choice': 46, 'y.choice': 72} and reward: 0.5774605629951269
Finished Task with config: {'x.choice': 40, 'y.choice': 76} and reward: 0.7058736551652897
 11%|█         | 8/76 [00:02<00:25,  2.67it/s]Finished Task with config: {'x.choice': 44, 'y.choice': 1} and reward: 0.03249007863969159
Finished Task with config: {'x.choice': 21, 'y.choice': 33} and reward: 0.4247735686936988
Finished Task with config: {'x.choice': 55, 'y.choice': 60} and reward: 0.39956895655479624
Finished Task with config: {'x.choice': 13, 'y.choice': 16} and reward: 0.1552861522645142
 12%|█▏        | 9/76 [00:03<00:25,  2.66it/s]Finished Task with config: {'x.choice': 60, 'y.choice': 94} and reward: 0.18915413038019477
Finished Task with config: {'x.choice': 21, 'y.choice': 90} and reward: 0.77816551129998
Finished Task with config: {'x.choice': 60, 'y.choice': 67} and reward: 0.3109689510968434
Finished Task with config: {'x.choice': 37, 'y.choice': 45} and reward: 0.5408988749006151
 13%|█▎        | 10/76 [00:03<00:24,  2.69it/s]Finished Task with config: {'x.choice': 34, 'y.choice': 69} and reward: 0.8520424635717397
Finished Task with config: {'x.choice': 60, 'y.choice': 11} and reward: 0.06220932769716972
Finished Task with config: {'x.choice': 42, 'y.choice': 6} and reward: 0.05312575413313245
Finished Task with config: {'x.choice': 89, 'y.choice': 97} and reward: 0.013178123417393045
 14%|█▍        | 11/76 [00:04<00:23,  2.71it/s]Finished Task with config: {'x.choice': 66, 'y.choice': 23} and reward: 0.2529084027182172
Finished Task with config: {'x.choice': 65, 'y.choice': 38} and reward: 0.49583631011448265
Finished Task with config: {'x.choice': 66, 'y.choice': 14} and reward: 0.10416741370431903
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 16%|█▌        | 12/76 [00:04<00:24,  2.66it/s]Finished Task with config: {'x.choice': 30, 'y.choice': 57} and reward: 0.829886169891878
Finished Task with config: {'x.choice': 35, 'y.choice': 46} and reward: 0.5846609186365146
Finished Task with config: {'x.choice': 59, 'y.choice': 3} and reward: 0.025054518512878866
Finished Task with config: {'x.choice': 31, 'y.choice': 62} and reward: 0.8709395257996267
 17%|█▋        | 13/76 [00:04<00:23,  2.68it/s]Finished Task with config: {'x.choice': 98, 'y.choice': 25} and reward: 0.17939595099821065
Finished Task with config: {'x.choice': 95, 'y.choice': 63} and reward: 0.11610891252206063
Finished Task with config: {'x.choice': 9, 'y.choice': 20} and reward: 0.18989747667591955
Finished Task with config: {'x.choice': 64, 'y.choice': 99} and reward: 0.12178118688087612
 18%|█▊        | 14/76 [00:05<00:23,  2.66it/s]Finished Task with config: {'x.choice': 63, 'y.choice': 50} and reward: 0.4368457895739434
Finished Task with config: {'x.choice': 55, 'y.choice': 52} and reward: 0.4028107985736251
Finished Task with config: {'x.choice': 24, 'y.choice': 31} and reward: 0.3817091079995325
Finished Task with config: {'x.choice': 40, 'y.choice': 62} and reward: 0.6969580893232793
 20%|█▉        | 15/76 [00:05<00:22,  2.70it/s]Finished Task with config: {'x.choice': 16, 'y.choice': 67} and reward: 0.9814913209712314
Finished Task with config: {'x.choice': 32, 'y.choice': 40} and reward: 0.5087969233381394
Finished Task with config: {'x.choice': 10, 'y.choice': 20} and reward: 0.19318127651343833
Finished Task with config: {'x.choice': 65, 'y.choice': 76} and reward: 0.20282352738135428
 21%|██        | 16/76 [00:05<00:22,  2.69it/s]Finished Task with config: {'x.choice': 19, 'y.choice': 78} and reward: 0.9600071681988102
Finished Task with config: {'x.choice': 93, 'y.choice': 1} and reward: 0.010896047938056276
Finished Task with config: {'x.choice': 34, 'y.choice': 15} and reward: 0.12939423666456223
Finished Task with config: {'x.choice': 63, 'y.choice': 62} and reward: 0.31372800819230695
Finished Task with config: {'x.choice': 93, 'y.choice': 1} and reward: 0.010896047938056276
 22%|██▏       | 17/76 [00:06<00:21,  2.71it/s]Finished Task with config: {'x.choice': 81, 'y.choice': 96} and reward: 0.031703038768058744
Finished Task with config: {'x.choice': 65, 'y.choice': 84} and reward: 0.17254207043832434
Finished Task with config: {'x.choice': 9, 'y.choice': 52} and reward: 0.7398755445448225
Finished Task with config: {'x.choice': 97, 'y.choice': 54} and reward: 0.21495484912430388
 24%|██▎       | 18/76 [00:06<00:21,  2.70it/s]Finished Task with config: {'x.choice': 5, 'y.choice': 3} and reward: 0.05032274511975561
Finished Task with config: {'x.choice': 97, 'y.choice': 72} and reward: 0.03416132366430215
Finished Task with config: {'x.choice': 55, 'y.choice': 9} and reward: 0.04922766858232881
Finished Task with config: {'x.choice': 41, 'y.choice': 52} and reward: 0.5806808361616504
 25%|██▌       | 19/76 [00:07<00:21,  2.68it/s]Finished Task with config: {'x.choice': 27, 'y.choice': 45} and reward: 0.650688955916307
Finished Task with config: {'x.choice': 97, 'y.choice': 53} and reward: 0.22935714262309276
Finished Task with config: {'x.choice': 12, 'y.choice': 24} and reward: 0.2529076252482878
Finished Task with config: {'x.choice': 32, 'y.choice': 69} and reward: 0.8888090750132936
 26%|██▋       | 20/76 [00:07<00:20,  2.69it/s]Finished Task with config: {'x.choice': 29, 'y.choice': 18} and reward: 0.1730263130667964
Finished Task with config: {'x.choice': 85, 'y.choice': 40} and reward: 0.6756957290216257
Finished Task with config: {'x.choice': 37, 'y.choice': 69} and reward: 0.7901920138784992
Finished Task with config: {'x.choice': 88, 'y.choice': 58} and reward: 0.286327467312913
 28%|██▊       | 21/76 [00:07<00:20,  2.67it/s]Finished Task with config: {'x.choice': 48, 'y.choice': 55} and reward: 0.488952436424344
Finished Task with config: {'x.choice': 82, 'y.choice': 90} and reward: 0.03511541242031658
Finished Task with config: {'x.choice': 24, 'y.choice': 31} and reward: 0.3817091079995325
Finished Task with config: {'x.choice': 44, 'y.choice': 90} and reward: 0.48670585177293596
 29%|██▉       | 22/76 [00:08<00:20,  2.68it/s]Finished Task with config: {'x.choice': 48, 'y.choice': 50} and reward: 0.4527992474270069
Finished Task with config: {'x.choice': 8, 'y.choice': 4} and reward: 0.058422749634424065
Finished Task with config: {'x.choice': 89, 'y.choice': 62} and reward: 0.19018552350948792
Finished Task with config: {'x.choice': 26, 'y.choice': 74} and reward: 0.9614044025404535
 30%|███       | 23/76 [00:08<00:19,  2.70it/s]Finished Task with config: {'x.choice': 16, 'y.choice': 46} and reward: 0.6886459183180196
Finished Task with config: {'x.choice': 54, 'y.choice': 49} and reward: 0.4009235733031741
Finished Task with config: {'x.choice': 63, 'y.choice': 45} and reward: 0.46885050272174905
Finished Task with config: {'x.choice': 77, 'y.choice': 52} and reward: 0.5350782292847762
 32%|███▏      | 24/76 [00:08<00:19,  2.71it/s]Finished Task with config: {'x.choice': 46, 'y.choice': 69} and reward: 0.5802833298863385
Finished Task with config: {'x.choice': 44, 'y.choice': 52} and reward: 0.5294533508693043
Finished Task with config: {'x.choice': 2, 'y.choice': 12} and reward: 0.0937625707459735
Finished Task with config: {'x.choice': 17, 'y.choice': 45} and reward: 0.6717131358286801
 33%|███▎      | 25/76 [00:09<00:18,  2.71it/s]Finished Task with config: {'x.choice': 53, 'y.choice': 69} and reward: 0.42463765730607606
Finished Task with config: {'x.choice': 43, 'y.choice': 52} and reward: 0.5462230869026983
Finished Task with config: {'x.choice': 25, 'y.choice': 74} and reward: 0.9700697273516173
Finished Task with config: {'x.choice': 55, 'y.choice': 52} and reward: 0.4028107985736251
 34%|███▍      | 26/76 [00:09<00:18,  2.70it/s]Finished Task with config: {'x.choice': 23, 'y.choice': 52} and reward: 0.8108532667658626
Finished Task with config: {'x.choice': 41, 'y.choice': 52} and reward: 0.5806808361616504
Finished Task with config: {'x.choice': 61, 'y.choice': 69} and reward: 0.2880558719901152
Finished Task with config: {'x.choice': 32, 'y.choice': 52} and reward: 0.7276509087726585
 36%|███▌      | 27/76 [00:10<00:18,  2.69it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 52} and reward: 0.8161012247440166
Finished Task with config: {'x.choice': 78, 'y.choice': 52} and reward: 0.5359258876938673
Finished Task with config: {'x.choice': 88, 'y.choice': 69} and reward: 0.09579273174126507
Finished Task with config: {'x.choice': 47, 'y.choice': 74} and reward: 0.5485750572832516
 37%|███▋      | 28/76 [00:10<00:18,  2.67it/s]Finished Task with config: {'x.choice': 46, 'y.choice': 52} and reward: 0.4974644053536026
Finished Task with config: {'x.choice': 16, 'y.choice': 52} and reward: 0.8061063575714768
Finished Task with config: {'x.choice': 27, 'y.choice': 52} and reward: 0.7850991782603625
Finished Task with config: {'x.choice': 55, 'y.choice': 52} and reward: 0.4028107985736251
 38%|███▊      | 29/76 [00:10<00:17,  2.67it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 72, 'y.choice': 52} and reward: 0.5059934264424871
Finished Task with config: {'x.choice': 38, 'y.choice': 52} and reward: 0.6328227532925973
Finished Task with config: {'x.choice': 27, 'y.choice': 52} and reward: 0.7850991782603625
 39%|███▉      | 30/76 [00:11<00:17,  2.67it/s]Finished Task with config: {'x.choice': 40, 'y.choice': 52} and reward: 0.5981207041323738
Finished Task with config: {'x.choice': 46, 'y.choice': 74} and reward: 0.5723229680537476
Finished Task with config: {'x.choice': 33, 'y.choice': 52} and reward: 0.713395029404709
Finished Task with config: {'x.choice': 35, 'y.choice': 52} and reward: 0.6827438813536933
Finished Task with config: {'x.choice': 40, 'y.choice': 52} and reward: 0.5981207041323738
 41%|████      | 31/76 [00:11<00:16,  2.67it/s]Finished Task with config: {'x.choice': 30, 'y.choice': 52} and reward: 0.7536095770338388
Finished Task with config: {'x.choice': 37, 'y.choice': 52} and reward: 0.6498556852950799
Finished Task with config: {'x.choice': 37, 'y.choice': 52} and reward: 0.6498556852950799
Finished Task with config: {'x.choice': 32, 'y.choice': 52} and reward: 0.7276509087726585
 42%|████▏     | 32/76 [00:11<00:16,  2.65it/s]Finished Task with config: {'x.choice': 37, 'y.choice': 52} and reward: 0.6498556852950799
Finished Task with config: {'x.choice': 21, 'y.choice': 52} and reward: 0.8161012247440166
Finished Task with config: {'x.choice': 21, 'y.choice': 52} and reward: 0.8161012247440166
Finished Task with config: {'x.choice': 27, 'y.choice': 74} and reward: 0.9512641104822159
 43%|████▎     | 33/76 [00:12<00:16,  2.62it/s]Finished Task with config: {'x.choice': 16, 'y.choice': 52} and reward: 0.8061063575714768
Finished Task with config: {'x.choice': 16, 'y.choice': 52} and reward: 0.8061063575714768
Finished Task with config: {'x.choice': 16, 'y.choice': 52} and reward: 0.8061063575714768
Finished Task with config: {'x.choice': 21, 'y.choice': 52} and reward: 0.8161012247440166
 45%|████▍     | 34/76 [00:12<00:15,  2.64it/s]Finished Task with config: {'x.choice': 31, 'y.choice': 52} and reward: 0.7410814359404405
Finished Task with config: {'x.choice': 16, 'y.choice': 52} and reward: 0.8061063575714768
Finished Task with config: {'x.choice': 32, 'y.choice': 52} and reward: 0.7276509087726585
Finished Task with config: {'x.choice': 80, 'y.choice': 52} and reward: 0.5316010536952981
 46%|████▌     | 35/76 [00:13<00:15,  2.65it/s]Finished Task with config: {'x.choice': 46, 'y.choice': 52} and reward: 0.4974644053536026
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 52} and reward: 0.8061063575714768
 47%|████▋     | 36/76 [00:13<00:15,  2.65it/s]Finished Task with config: {'x.choice': 35, 'y.choice': 52} and reward: 0.6827438813536933
Finished Task with config: {'x.choice': 27, 'y.choice': 52} and reward: 0.7850991782603625
Finished Task with config: {'x.choice': 31, 'y.choice': 74} and reward: 0.8970279379062928
Finished Task with config: {'x.choice': 21, 'y.choice': 52} and reward: 0.8161012247440166
 49%|████▊     | 37/76 [00:13<00:14,  2.64it/s]Finished Task with config: {'x.choice': 27, 'y.choice': 52} and reward: 0.7850991782603625
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 93, 'y.choice': 74} and reward: 0.038273278217563975
Finished Task with config: {'x.choice': 32, 'y.choice': 52} and reward: 0.7276509087726585
 50%|█████     | 38/76 [00:14<00:14,  2.65it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 52} and reward: 0.8061063575714768
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 51%|█████▏    | 39/76 [00:14<00:13,  2.68it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 52} and reward: 0.8161012247440166
Finished Task with config: {'x.choice': 27, 'y.choice': 74} and reward: 0.9512641104822159
Finished Task with config: {'x.choice': 21, 'y.choice': 52} and reward: 0.8161012247440166
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 53%|█████▎    | 40/76 [00:14<00:13,  2.68it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 52} and reward: 0.8161012247440166
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 54%|█████▍    | 41/76 [00:15<00:13,  2.65it/s]Finished Task with config: {'x.choice': 32, 'y.choice': 52} and reward: 0.7276509087726585
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 55%|█████▌    | 42/76 [00:15<00:12,  2.66it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 57%|█████▋    | 43/76 [00:16<00:12,  2.64it/s]Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 58%|█████▊    | 44/76 [00:16<00:12,  2.64it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 59%|█████▉    | 45/76 [00:16<00:11,  2.64it/s]Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 61%|██████    | 46/76 [00:17<00:11,  2.65it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 62%|██████▏   | 47/76 [00:17<00:10,  2.66it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 63%|██████▎   | 48/76 [00:17<00:10,  2.65it/s]Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 64%|██████▍   | 49/76 [00:18<00:10,  2.64it/s]Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 66%|██████▌   | 50/76 [00:18<00:09,  2.61it/s]Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 67%|██████▋   | 51/76 [00:19<00:09,  2.62it/s]Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 68%|██████▊   | 52/76 [00:19<00:09,  2.65it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 70%|██████▉   | 53/76 [00:19<00:08,  2.64it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 71%|███████   | 54/76 [00:20<00:08,  2.63it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 72%|███████▏  | 55/76 [00:20<00:07,  2.63it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 74%|███████▎  | 56/76 [00:21<00:07,  2.65it/s]Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 75%|███████▌  | 57/76 [00:21<00:07,  2.65it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 76%|███████▋  | 58/76 [00:21<00:06,  2.61it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 78%|███████▊  | 59/76 [00:22<00:06,  2.64it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 79%|███████▉  | 60/76 [00:22<00:06,  2.65it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 80%|████████  | 61/76 [00:22<00:05,  2.65it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 82%|████████▏ | 62/76 [00:23<00:05,  2.64it/s]Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 83%|████████▎ | 63/76 [00:23<00:04,  2.63it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 84%|████████▍ | 64/76 [00:24<00:04,  2.63it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 86%|████████▌ | 65/76 [00:24<00:04,  2.64it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
 87%|████████▋ | 66/76 [00:24<00:03,  2.66it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 88%|████████▊ | 67/76 [00:25<00:03,  2.57it/s]Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 89%|████████▉ | 68/76 [00:25<00:03,  2.60it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 91%|█████████ | 69/76 [00:25<00:02,  2.60it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 92%|█████████▏| 70/76 [00:26<00:02,  2.63it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 93%|█████████▎| 71/76 [00:26<00:01,  2.64it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 95%|█████████▍| 72/76 [00:27<00:01,  2.65it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 16, 'y.choice': 74} and reward: 0.9772040425573253
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 96%|█████████▌| 73/76 [00:27<00:01,  2.65it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
 97%|█████████▋| 74/76 [00:27<00:00,  2.63it/s]Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
Finished Task with config: {'x.choice': 21, 'y.choice': 74} and reward: 0.9892484241569526
100%|██████████| 76/76 [00:28<00:00,  2.69it/s]
Best config: {'x.choice': 21, 'y.choice': 74}, best reward: 0.9892484241569526

Compare the Performance

Get the result history:

# For each trial, take the accuracy from its first reported record.
results_rl = [v[0]['accuracy'] for v in rl_scheduler.training_history.values()]
results_random = [v[0]['accuracy'] for v in random_scheduler.training_history.values()]
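
The indexing above suggests that training_history maps each task to a list of reported records, and that v[0] picks the first record of a task. The following is a minimal sketch of that extraction on a hand-built stand-in dictionary. The name mock_history and its values are hypothetical and are not real AutoGluon output.

```python
# Hypothetical stand-in for scheduler.training_history:
# task id -> list of records reported by that task.
mock_history = {
    "0": [{"accuracy": 0.42}],
    "1": [{"accuracy": 0.97}],
    "2": [{"accuracy": 0.99}],
}

# Same comprehension as in the tutorial: first reported record of each task.
# Dictionaries keep insertion order, so results follow the order tasks were stored.
mock_results = [records[0]["accuracy"] for records in mock_history.values()]
print(mock_results)
```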

Average the results over consecutive blocks of 10 trials:

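import statistics
# results1: block averages for random search; results2: block averages for the RL searcher.
# If the number of trials is not a multiple of 10, the last block is shorter.
results1 = [statistics.mean(results_random[i:i+10]) for i in range(0, len(results_random), 10)]
results2 = [statistics.mean(results_rl[i:i+10]) for i in range(0, len(results_rl), 10)]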
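
To see what this block averaging does, here is a small self-contained sketch on made-up numbers. The helper name chunk_means and the sample values are illustrative assumptions, not part of AutoGluon. Note that when the number of trials is not a multiple of the block size, the last average covers fewer trials, so it is noisier than the others.

```python
import statistics

def chunk_means(values, size=10):
    """Mean of each consecutive block of `size` values; the final block may be shorter."""
    return [statistics.mean(values[i:i + size]) for i in range(0, len(values), size)]

# Made-up rewards for 25 trials: 10 at 0.5, 10 at 0.75, 5 at 1.0.
sample = [0.5] * 10 + [0.75] * 10 + [1.0] * 5
means = chunk_means(sample)
print(means)  # three averages; the last one covers only 5 trials
```

With 76 trials, as in the run above, the same slicing would give 8 blocks, the last of which holds only 6 trials.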

Plot the results (the first curve is random search, the second is the RL searcher):

plt.plot(range(len(results1)), results1, range(len(results2)), results2)
../../_images/output_algorithm_17e5ae_27_1.png