fix: check the number of collected data in `post_fp_check_fail` #882

HuangJiameng · 2022-08-17T10:00:58Z

see #737
dpdispatcher uses `flag_if_job_task_fail` to mark failed jobs, so `post_fp_check_fail` can run before the frames are checked. If we instead take the alternative approach mentioned in the issue, we have to consider a different representative output for each `fp_style`, and I wonder whether that would just repeat the frame checks that follow. However, since `flag_if_job_task_fail` is set to True as soon as one task in a group fails, I am afraid `rfail` could be high even when only a few tasks fail. I'd like to ask for some suggestions.

codecov-commenter · 2022-08-17T10:04:55Z

Codecov Report

Merging #882 (f2df1af) into devel (d55d400) will increase coverage by 0.09%.
The diff coverage is 91.66%.

@@            Coverage Diff             @@
##            devel     #882      +/-   ##
==========================================
+ Coverage   38.12%   38.22%   +0.09%     
==========================================
  Files          99       99              
  Lines       17782    17823      +41     
==========================================
+ Hits         6779     6812      +33     
- Misses      11003    11011       +8

| Impacted Files | Coverage Δ | |
|---|---|---|
| dpgen/generator/run.py | `62.24% <91.66%> (+0.02%)` | ⬆️ |
| dpgen/dispatcher/AWS.py | `25.26% <0.00%> (-2.87%)` | ⬇️ |
| dpgen/auto_test/Elastic.py | `62.69% <0.00%> (-0.33%)` | ⬇️ |
| dpgen/tools/relabel.py | `14.60% <0.00%> (-0.14%)` | ⬇️ |
| dpgen/generator/arginfo.py | `100.00% <0.00%> (ø)` | |

njzjz · 2022-08-17T20:07:53Z

This is not correct, because dpdispatcher does not allow a command to fail: it retries and then raises `RuntimeError`. If you want to allow a command to return a non-zero exit code, consider running `some_command || touch some_flag` and then counting the `some_flag` files instead.
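To make the flag-file suggestion concrete, here is a minimal sketch. It is not code from dpgen or dpdispatcher. The flag name `fp_failed` and both helper functions are hypothetical, and the sketch assumes the tasks live in `task.*` directories and that the flag file is present in each local task directory when counting.

```python
from pathlib import Path
from typing import Tuple

FLAG_NAME = "fp_failed"  # hypothetical flag file name, not defined by dpgen


def allow_failure(command: str, flag: str = FLAG_NAME) -> str:
    """Wrap a shell command so a non-zero exit creates a flag file
    instead of making the whole command fail."""
    return f"{command} || touch {flag}"


def count_failed_tasks(work_path: str, flag: str = FLAG_NAME) -> Tuple[int, int]:
    """Return (nfail, ntask) over the task.* directories in work_path.

    A task counts as failed when its directory contains the flag file.
    """
    tasks = sorted(p for p in Path(work_path).glob("task.*") if p.is_dir())
    nfail = sum(1 for task in tasks if (task / flag).is_file())
    return nfail, len(tasks)
```

The wrapped string is what would be submitted as the task command; the count is only meaningful if the flag files are actually available where the counting runs.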

HuangJiameng · 2022-08-18T01:23:25Z

How about counting `*_task_tag_finished`? Each task has a `task_tag_finished` tag. $rfail = 1 - \frac{nfinished}{ntasks}$
Creating a new tag seems cumbersome.

njzjz · 2022-08-18T01:46:26Z

dpdispatcher expects all tasks to finish.

HuangJiameng · 2022-08-18T01:53:10Z

Then post_fp_check_fail seems useless? Is it reasonable to discard it? It doesn't work now in fact.

njzjz · 2022-08-18T02:03:12Z

It is used with the old dispatcher.

njzjz · 2022-08-18T02:38:15Z

Also, I think the tag file will not be downloaded back unless it is added to the list of backward files.

HuangJiameng · 2022-08-23T08:51:56Z

Since dpdispatcher raises an error if any task fails, and the ratio of failed frames already shows how much data is usable, shall we deprecate `post_fp_check_fail`?

njzjz · 2022-08-23T22:56:33Z

One can use `some_command || :` in the command to allow `some_command` to fail (the whole command still succeeds). dpdata will then skip non-converged data. After the new data is collected, we can count how much data there is. The failed ratio can be calculated as I mentioned in #737.
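The counting idea can be sketched as follows. This is an illustration only, not the code merged in this PR: the function names are hypothetical, the default threshold of 0.05 is an assumed value, and the sketch assumes each task contributes at most one frame, so the number of failed tasks is the number of tasks minus the number of collected frames.

```python
def failed_ratio(ntask: int, nframe: int) -> float:
    """Fraction of tasks that produced no collected frame.

    Assumes one frame per successful task, so nfail = ntask - nframe.
    """
    if ntask <= 0:
        raise ValueError("ntask must be positive")
    nfail = ntask - nframe
    return nfail / ntask


def check_failed_ratio(ntask: int, nframe: int, ratio_failed: float = 0.05) -> float:
    """Raise RuntimeError when the failed ratio exceeds ratio_failed.

    The 0.05 default is an assumption made for this sketch.
    """
    rfail = failed_ratio(ntask, nframe)
    if rfail > ratio_failed:
        raise RuntimeError(
            "too many failed tasks: %d of %d (%.2f %%)"
            % (ntask - nframe, ntask, rfail * 100.0)
        )
    return rfail
```

With 100 tasks and 97 collected frames the ratio stays under the assumed threshold; with only 90 frames the check raises.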

HuangJiameng · 2022-08-24T09:54:38Z

check fail according to the number of collected data

HuangJiameng · 2022-08-24T13:20:25Z

Oddly, the unit tests pass locally. @njzjz Do you have any idea?

njzjz · 2022-08-29T01:28:14Z

It looks like the failed unit test outputs a MultiSystems directory. Your method does not cover this situation. You may use dpgen.util.expand_sys_str.
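To illustrate the situation described here, the sketch below walks a data directory and returns every system directory found in it, whether the directory is a single system or holds several systems in subdirectories. It is a stdlib stand-in, not `dpgen.util.expand_sys_str` itself. The function name `find_system_dirs` is hypothetical, and it assumes each system directory is marked by a `type.raw` file.

```python
from pathlib import Path
from typing import List


def find_system_dirs(root: str) -> List[str]:
    """Return all directories at or below root that contain a type.raw file.

    Assumption: a type.raw file marks one system directory.
    """
    root_path = Path(root)
    found = [
        str(d)
        for d in root_path.rglob("*")
        if d.is_dir() and (d / "type.raw").is_file()
    ]
    # rglob("*") does not yield root itself, so check it separately.
    if (root_path / "type.raw").is_file():
        found.append(str(root_path))
    return sorted(found)
```

Frames would then be counted per returned directory and summed, so that a multi-system output is not treated as a single system.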

dpgen/generator/run.py

njzjz

LGTM

HuangJiameng added 2 commits August 17, 2022 15:15

fix post_fp_check_fail

0b23ba3

Update run.py

56afbdd

wanghan-iapcm requested review from AnguseZhang and njzjz and removed request for AnguseZhang August 17, 2022 12:41

update post_fp_check_fail

6a62a98

HuangJiameng added 2 commits August 24, 2022 19:16

Update run.py

4751c76

fix post_fp_check_fail

dac6073

njzjz reviewed Aug 28, 2022

check npy instead of raw in post_fp_check_fail

2634246

see pr deepmodeling#882

HuangJiameng added 2 commits August 29, 2022 10:33

update post_fp_check_fail using expand_sys_str

a9c8df2

update post_fp_check_fail using expand_sys_str

230e9ef

njzjz reviewed Aug 29, 2022

Update run.py

f2df1af

njzjz approved these changes Aug 29, 2022

njzjz linked an issue Aug 29, 2022 that may be closed by this pull request

[BUG] post_fp_check_fail does not work for dpdispatcher #737

Closed

AnguseZhang approved these changes Sep 1, 2022

HuangJiameng changed the title ~~Check failed tasks~~ fix: check the number of collected data in post_fp_check_fail Sep 1, 2022

AnguseZhang merged commit dba436a into deepmodeling:devel Sep 1, 2022

HuangJiameng deleted the check_failed_tasks branch September 26, 2022 08:22
