Add a loop to run SGD for several epochs for each update, which can make the training process faster #3

Description

@yonatank93

In the current version, the agent performs only a single gradient update per train_frequency interval, so gradient updates are infrequent. This may result in slow training.

If we allow multiple gradient updates per train_frequency interval, the agent may be able to learn more quickly. We should add this as an option, especially since SB3 also has it.
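A minimal sketch of what the option could look like, using a toy problem instead of the real agent. The name `gradient_steps` is borrowed from SB3's off-policy algorithms and is only an assumption for this project; `train`, the buffer contents, and the scalar parameter `w` are hypothetical stand-ins.

```python
import random
from collections import deque


def train(total_steps, train_frequency, gradient_steps,
          batch_size=4, lr=0.1, seed=0):
    """Toy off-policy loop: one transition per step, updates every train_frequency steps."""
    rng = random.Random(seed)
    buffer = deque(maxlen=1000)  # stand-in for the experience buffer
    w = 0.0                      # stand-in for the agent's parameters
    n_updates = 0
    for step in range(1, total_steps + 1):
        buffer.append(rng.gauss(1.0, 0.1))  # stand-in for an environment transition
        if step % train_frequency == 0 and len(buffer) >= batch_size:
            # New inner loop: several SGD updates instead of a single one.
            for _ in range(gradient_steps):
                batch = rng.sample(list(buffer), batch_size)
                target = sum(batch) / batch_size
                w -= lr * (w - target)  # gradient of 0.5 * (w - target) ** 2
                n_updates += 1
    return w, n_updates
```

With `gradient_steps=1` this reproduces the current behavior (one update per interval), so the default would not change existing runs.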

However, we cannot do too many gradient updates either, because the experience buffer, from which the training data are sampled, keeps changing. Too many gradient updates would cause the agent to overfit to a specific snapshot of the buffer.
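One way to keep an eye on this trade-off is to track how many gradient updates are done per newly collected transition. The helper below is a hypothetical sketch: it assumes train_frequency is counted in environment steps, and `gradient_steps` and `n_envs` are assumed parameter names. What counts as "too many" is problem-dependent, so no threshold is hard-coded.

```python
def update_to_data_ratio(train_frequency, gradient_steps, n_envs=1):
    """Gradient updates per newly collected transition.

    Assumes one transition per environment per step and that training
    runs once every `train_frequency` steps.
    """
    if train_frequency <= 0 or n_envs <= 0:
        raise ValueError("train_frequency and n_envs must be positive")
    new_transitions = train_frequency * n_envs
    return gradient_steps / new_transitions


# Current behavior (single update) vs. a heavier setting.
print(update_to_data_ratio(train_frequency=4, gradient_steps=1))
print(update_to_data_ratio(train_frequency=4, gradient_steps=8))
```

A higher ratio means each buffer snapshot is reused more heavily before new data arrives, which is the overfitting risk described above.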

Metadata

Labels: enhancement (New feature or request)
