
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "gallery/print_train_and_save_agent.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_gallery_print_train_and_save_agent.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_gallery_print_train_and_save_agent.py:


==========================================
Train a PPO agent on Accenta's environment
==========================================

This example shows how to train a PPO agent.

.. GENERATED FROM PYTHON SOURCE LINES 13-14

Import required packages

.. GENERATED FROM PYTHON SOURCE LINES 14-22

.. code-block:: default


    from rlenv.envs.wall.core import AccentaEnv

    import torch as th
    from stable_baselines3 import PPO

    from rlagent.data import ppo_trained_model_example_path


.. GENERATED FROM PYTHON SOURCE LINES 23-24

Make the environment

.. GENERATED FROM PYTHON SOURCE LINES 24-28

.. code-block:: default


    env = AccentaEnv()


.. GENERATED FROM PYTHON SOURCE LINES 29-30

Make the agent's policy

.. GENERATED FROM PYTHON SOURCE LINES 30-38

.. code-block:: default


    # Custom actor (pi) and value function (vf) networks
    # of two layers of size 32 each with Relu activation function
    # https://stable-baselines3.readthedocs.io/en/master/guide/custom_policy.html#custom-network-architecture
    policy_kwargs = dict(activation_fn=th.nn.ReLU, net_arch=[dict(pi=[32, 32], vf=[32, 32])])


.. GENERATED FROM PYTHON SOURCE LINES 39-40

Make the agent

.. GENERATED FROM PYTHON SOURCE LINES 40-44

.. code-block:: default


    model = PPO("MlpPolicy", env, policy_kwargs=policy_kwargs, verbose=1)


.. GENERATED FROM PYTHON SOURCE LINES 45-46

Train the agent

.. GENERATED FROM PYTHON SOURCE LINES 46-50

.. code-block:: default


    model.learn(total_timesteps=10000)


.. GENERATED FROM PYTHON SOURCE LINES 51-52

Save the trained agent

.. GENERATED FROM PYTHON SOURCE LINES 52-56

.. code-block:: default


    model.save(ppo_trained_model_example_path())


.. GENERATED FROM PYTHON SOURCE LINES 57-58

Reload the trained agent

.. GENERATED FROM PYTHON SOURCE LINES 58-61

.. code-block:: default


    del model
    model = PPO.load(ppo_trained_model_example_path(), env=env)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 0.000 seconds)


.. _sphx_glr_download_gallery_print_train_and_save_agent.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example


    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: print_train_and_save_agent.py <print_train_and_save_agent.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: print_train_and_save_agent.ipynb <print_train_and_save_agent.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_