feature(cy): add BDQ algorithm #558

Merged: 12 commits into opendilab:main on Jan 3, 2023

@Cloud-Pku (Collaborator) commented Dec 15, 2022

Description

Add the BDQ algorithm and configs for the Hopper and HalfCheetah MuJoCo environments.

Related Issue

TODO

  • n-step TD for BDQ
  • gradient scaling for the backbone network
  • unify the bdq_nstep_td_error and q_nstep_td_error functions
  • unify the action_map function
  • write unit tests
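
The action_map item presumably concerns turning the discrete per-branch choices made by BDQ back into continuous actions for environments such as Hopper and HalfCheetah. A minimal sketch of that idea, assuming evenly spaced bins over [-1, 1] for each action dimension; the signature, parameter names, and bounds below are hypothetical and are not taken from this PR's code:

```python
from typing import List, Sequence


def action_map(
    indices: Sequence[int],
    bins_per_branch: int,
    low: float = -1.0,
    high: float = 1.0,
) -> List[float]:
    """Map one discrete bin index per action branch to a continuous value.

    Hypothetical helper: assumes every branch shares the same number of
    evenly spaced bins covering [low, high].
    """
    if bins_per_branch < 2:
        raise ValueError("bins_per_branch must be at least 2")
    step = (high - low) / (bins_per_branch - 1)
    continuous = []
    for index in indices:
        if not 0 <= index < bins_per_branch:
            raise ValueError(f"bin index {index} is out of range")
        # Bin 0 maps to `low`, the last bin maps to `high`.
        continuous.append(low + index * step)
    return continuous


# One index per action dimension, e.g. a 3-dimensional action with 5 bins each.
example_action = action_map([0, 2, 4], bins_per_branch=5)
```

Keeping a single mapping of this kind in one place is one way the policy and the environment wrapper could share the same discretization.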

Check List

  • merge the latest version source branch/repo, and resolve all the conflicts
  • pass style check
  • pass all the tests

@Cloud-Pku added the algo (Add new algorithm or improve old one) and config (Update config) labels Dec 15, 2022
Resolved review threads:
  • ding/model/common/head.py (5 threads)
  • ding/policy/bdq.py (2 threads)
  • ding/rl_utils/td.py (2 threads)
  • dizoo/mujoco/config/halfcheetah_bdq_config.py (4 threads)
  • dizoo/mujoco/envs/mujoco_env.py (2 threads)
  • ding/model/template/q_learning.py (2 threads)
extend n-step TD;
polished;
codecov bot commented Dec 27, 2022

Codecov Report

Merging #558 (880ed8b) into main (0a25e46) will decrease coverage by 0.14%.
The diff coverage is 64.28%.

❗ Current head 880ed8b differs from pull request most recent head 7138bc4. Consider uploading reports for the commit 7138bc4 to get more accurate results

@@            Coverage Diff             @@
##             main     #558      +/-   ##
==========================================
- Coverage   84.59%   84.44%   -0.15%     
==========================================
  Files         555      556       +1     
  Lines       45197    45405     +208     
==========================================
+ Hits        38233    38344     +111     
- Misses       6964     7061      +97     
Flag Coverage Δ
unittests 84.44% <64.28%> (-0.15%) ⬇️
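
The summary line above reports a 0.14% decrease while the table shows -0.15%. A quick recomputation from the Hits and Lines rows, sketched below, gives roughly 84.592% for main and 84.449% for this PR, a difference of about 0.143 points. Both displayed figures are consistent with those values; how Codecov rounds or truncates for display is an assumption here and is not stated in the report.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Return line coverage as a percentage."""
    return 100.0 * hits / lines


# Figures copied from the Coverage Diff table.
main_hits, main_misses, main_lines = 38233, 6964, 45197
pr_hits, pr_misses, pr_lines = 38344, 7061, 45405

# Sanity check: hits plus misses should account for every line.
assert main_hits + main_misses == main_lines
assert pr_hits + pr_misses == pr_lines

main_cov = coverage_percent(main_hits, main_lines)
pr_cov = coverage_percent(pr_hits, pr_lines)
delta = pr_cov - main_cov

print(f"main: {main_cov:.4f}%  PR: {pr_cov:.4f}%  delta: {delta:+.4f}")
```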

Impacted Files Coverage Δ
ding/model/common/__init__.py 100.00% <ø> (ø)
ding/rl_utils/__init__.py 100.00% <ø> (ø)
ding/policy/bdq.py 26.00% <26.00%> (ø)
ding/model/template/q_learning.py 85.82% <94.73%> (+0.69%) ⬆️
ding/model/common/head.py 98.56% <100.00%> (+0.13%) ⬆️
ding/model/template/__init__.py 100.00% <100.00%> (ø)
ding/model/template/tests/test_q_learning.py 99.18% <100.00%> (+0.05%) ⬆️
ding/policy/__init__.py 100.00% <100.00%> (ø)
ding/policy/command_mode_policy_instance.py 93.38% <100.00%> (+0.11%) ⬆️
ding/rl_utils/td.py 92.96% <100.00%> (+0.25%) ⬆️
... and 5 more

@PaParaZz1 changed the title from "WIP: feature(cy): add BDQ algorithm" to "feature(cy): add BDQ algorithm" Jan 2, 2023
@PaParaZz1 merged commit 9c689d2 into opendilab:main Jan 3, 2023
3 participants