Add a loop to run SGD for several epochs for each update, which can make the training process faster #3

In the current version, the agent performs only a single gradient update per train_frequency interval, so gradient updates are infrequent. This may result in slow training.

If we allow multiple gradient updates per train_frequency interval, the agent may be able to learn more quickly, so we should add this as an option. SB3 also has this [option](https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/dqn/dqn.py#L194).

However, we cannot do too many gradient updates either, because the experience buffer from which the training data are sampled keeps changing. Too many gradient updates on one state of the buffer could cause the agent to overfit to that particular version of the buffer.
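
To make the proposal concrete, here is a minimal sketch of the loop, not this repository's actual training code. It is a toy in plain Python: a scalar `w` stands in for the network parameters and noisy targets stand in for environment transitions. The function name `train` and the parameters `gradient_steps`, `batch_size`, `buffer_size`, and `lr` are hypothetical; `gradient_steps` is named after the corresponding SB3 DQN argument.

```python
import random
from collections import deque


def train(num_env_steps, train_frequency=4, gradient_steps=1,
          batch_size=8, buffer_size=1000, lr=0.05, seed=0):
    """Toy loop: fit a scalar `w` to noisy targets sampled from a replay buffer.

    Returns the final value of `w` and the total number of gradient updates.
    All names here are illustrative assumptions, not the project's API.
    """
    rng = random.Random(seed)
    buffer = deque(maxlen=buffer_size)  # oldest samples are evicted when full
    w = 0.0                             # stand-in for the network parameters
    num_updates = 0

    for step in range(1, num_env_steps + 1):
        # Stand-in for one environment interaction: a noisy target around 1.0.
        buffer.append(1.0 + rng.gauss(0.0, 0.1))

        if step % train_frequency == 0 and len(buffer) >= batch_size:
            # Proposed change: several SGD steps per training event,
            # each on a freshly sampled minibatch.
            for _ in range(gradient_steps):
                batch = rng.sample(list(buffer), batch_size)
                # Gradient of the mean squared error (w - y)^2 with respect to w.
                grad = sum(2.0 * (w - y) for y in batch) / batch_size
                w -= lr * grad
                num_updates += 1

    return w, num_updates


if __name__ == "__main__":
    for k in (1, 4):
        w, n = train(num_env_steps=40, gradient_steps=k)
        print(f"gradient_steps={k}: updates={n}, w={w:.3f}")
```

With the same number of environment steps, a larger `gradient_steps` performs proportionally more updates, so `w` moves closer to the target. The buffer changes only between training events, so every extra step within one event resamples from the same buffer contents, which is the overfitting risk described above and the reason to keep the value modest.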
