Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature(xjx): multiprocess tblogger, fix circular reference problem #156

Merged
merged 8 commits into from
Dec 22, 2021

Conversation

sailxjx
Copy link
Member

@sailxjx sailxjx commented Dec 20, 2021

Description

Related Issue

TODO

Check List

  • merge the latest version source branch/repo, and resolve all the conflicts
  • pass style check
  • pass all the tests

@codecov
Copy link

codecov bot commented Dec 20, 2021

Codecov Report

Merging #156 (fc0d7ee) into main (b040b1c) will increase coverage by 0.32%.
The diff coverage is 96.40%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #156      +/-   ##
==========================================
+ Coverage   84.90%   85.22%   +0.32%     
==========================================
  Files         427      429       +2     
  Lines       32751    33567     +816     
==========================================
+ Hits        27806    28609     +803     
- Misses       4945     4958      +13     
Flag Coverage Δ
unittests 85.22% <96.40%> (+0.32%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
ding/utils/tests/test_log_helper.py 100.00% <ø> (ø)
ding/framework/parallel.py 88.51% <71.42%> (-0.38%) ⬇️
ding/framework/tests/test_wrapper.py 94.11% <87.50%> (-5.89%) ⬇️
ding/utils/log_writer_helper.py 97.77% <97.77%> (ø)
ding/framework/context.py 100.00% <100.00%> (ø)
ding/framework/task.py 97.79% <100.00%> (-0.52%) ⬇️
ding/framework/tests/test_parallel.py 82.92% <100.00%> (+0.42%) ⬆️
ding/framework/tests/test_task.py 100.00% <100.00%> (ø)
ding/utils/__init__.py 95.65% <100.00%> (+0.19%) ⬆️
ding/utils/log_helper.py 98.50% <100.00%> (ø)
... and 27 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b040b1c...fc0d7ee. Read the comment docs.

@PaParaZz1 PaParaZz1 added the enhancement New feature or request label Dec 20, 2021
if self._thread_pool:
self._thread_pool.shutdown()
self.middleware.clear()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add some comments for the sequence of calling clear method

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -0,0 +1,97 @@
from tensorboardX import SummaryWriter
from typing import TYPE_CHECKING
if TYPE_CHECKING:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add some comments for why use this TYPE_CHECKING, I guess it's because of circle import

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

and event components of the task (see ``writer.plugin``).
"""

def __init__(self, *args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add some comments for additional kwargs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Plugin ``task``, so when using this writer in the task pipeline, it will automatically send requests\
to the main writer instead of writing it to the disk. So we can collect data from multiple processes\
and write them into one file.
Usage:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use Example comments like this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

self._get_file_writer()
self._lazy_initialized = True

def __del__(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whether need to check some flag first to avoid calling close method repeatedly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double close check is done in the super method, so we don't need to check this

self._is_writer = False
self._lazy_initialized = False

def plugin(self, task: "Task", is_writer: False) -> "DistributedWriter":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_writer: bool = False

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done



ready_to_parallel_fns = [
'add_audio',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe you can wrapper all the attributes of SummaryWriter by this way

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have wrapped all the attributes starting with add_ in summarywriter, but I am not sure whether certain methods will succeed through rpc, such as add_image.

@sailxjx sailxjx merged commit 92d973c into main Dec 22, 2021
@sailxjx sailxjx deleted the feature/tblogger branch December 22, 2021 04:08
puyuan1996 pushed a commit to puyuan1996/DI-engine that referenced this pull request Apr 18, 2022
…pendilab#156)

* Fix recur reference in task and parallel, add distributed logger

* Update logger

* Clear ref list when exit task/parallel

* Put task in with statment

* Fix test

* FFix test

* Test is hard

* More comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants