Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP Improve how movie and batch seek to clean up on premature exits #6016

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

PaulWessel
Copy link
Member

See #6007 for motivation. This PR moves the creation of the cleanup script up front and once we pass the section where the working dir has been created we no longer call Return on error but just set the error value and jump to the end of the program so that we can run the cleanup_script before exiting with the saved error. No impact on my tests since it only would kick in when a movie or batch job fails.

See #6007 for motivation.  This PR moves the creation of the cleanup script up front and once we pass the section where the working dir has been created we no longer Return on error but jump to the end of the program so that we can run the cleanup_script before exiting with the error.
@PaulWessel PaulWessel added enhancement Improving an existing feature maintenance Boring but important stuff for the core devs labels Nov 15, 2021
@PaulWessel PaulWessel added this to the 6.3.0 milestone Nov 15, 2021
@PaulWessel PaulWessel self-assigned this Nov 15, 2021
@maxrjones
Copy link
Member

Two questions related to this:

  1. Could this be applied for the ctrl-c case?
  2. Is there a way to exit better if there a problem in the mainscript? Here is an example that hangs due to an (intentional) error in the main script:

test.sh:

gmt movie main.sh -Nglobe -T360 -Fgif -C6ix6ix100 -Lf -P -Wtest-dir -Z -Vd

main.sh:

gmt begin
   gmt coast -R0/10/0/-10 -JG${MOVIE_FRAME}/20/${MOVIE_WIDTH} -Gmaroon -Sturquoise -Bg -X0 -Y0
gmt end

@PaulWessel
Copy link
Member Author

Not sure about Ctrl-C yet. As for a bug in main.sh. Perhaps we should ALWAYS run the master frame, either when selected (it is then one of the deliverables) or not (then it is just a run and we delete the output). We do not want to enter the frame loop with a broken script I think.

I do not think we can detect all types of script bugs. Imagine you forget to give a filename to plot and it sits there waiting for stdin or something. Not sure if that is even possible in a system call. We are checking return status of the main scripts and exit if they are nonzero.

@maxrjones
Copy link
Member

Not sure about Ctrl-C yet. As for a bug in main.sh. Perhaps we should ALWAYS run the master frame, either when selected (it is then one of the deliverables) or not (then it is just a run and we delete the output). We do not want to enter the frame loop with a broken script I think.

I do not think we can detect all types of script bugs. Imagine you forget to give a filename to plot and it sits there waiting for stdin or something. Not sure if that is even possible in a system call. We are checking return status of the main scripts and exit if they are nonzero.

OK, these could be addressed separate from this PR. Regarding the changes here, I get this error when running anim01.sh:

movie [INFORMATION]: MP4 movie built: anim01.mp4
bash: movie_cleanup.sh: No such file or directory
movie [ERROR]: Running cleanup script movie_cleanup.sh returned error 32512 - exiting

@PaulWessel
Copy link
Member Author

OK, will fix this after a defense. The current directory changed I think as to where we are.

@maxrjones
Copy link
Member

The test-dir is not removed after the successful completion of this animation:

test-movie.sh:

gmt movie globe.sh -Nglobe -T360 -Fgif -C6ix6ix100 -Lf -P -Wtest-dir -Z -Vd

globe.sh:

gmt begin
   gmt coast -R0/10/0/10 -JG${MOVIE_FRAME}/20/${MOVIE_WIDTH} -Gmaroon -Sturquoise -Bg -X0 -Y0
gmt end

@PaulWessel
Copy link
Member Author

I am still working on this and worry that the changes I have been working on are too extensive to trust at the even of a release. The problems we are trying to fix have to do with user errors etc., which is less critical than bugs.

@maxrjones
Copy link
Member

I am still working on this and worry that the changes I have been working on are too extensive to trust at the even of a release. The problems we are trying to fix have to do with user errors etc., which is less critical than bugs.

I agree, let's hold off until after the release. Holding off for the more extensive changes is better than merging a partial fix because there seem to be some problems with SEGV after Ctrl-C on this branch.

@PaulWessel
Copy link
Member Author

FYI, the latest commit works for your case above. However, there are unresolved issues: I faked an error in the globe.sh script (bad color) and it fails, but the system call does not return an error (always 0). So I cannot capture it. I then realized we set -e in gmt --new-script for this purpose, and when running such a script on the CLI it does return the right error (check with echo $?). Yet, when I tried this in movie it still does not return an error. So I think we need more research her on how to add maybe additional shell commands to force a return code or not.

See if you think the current fix for removing dirs work: I made the cleanup_file now be an absolute path and also place it in the temp dir with a name depending on the PID. It is deleted separately as per Windows limitation and it is now a hidden script not meant for user examination.

Taking an 1hour break to take wife for a short swim.

@maxrjones
Copy link
Member

For me, there are now seg faults if Ctrl-C is used while create-movie.sh is executing, which seems worse than the original problem of being left with the temp-dir directory.

@PaulWessel
Copy link
Member Author

OK, I think movie and batch need to set their own Ctrl-C handling since it differs from what gmt sets.

@PaulWessel
Copy link
Member Author

Can you tell me what exactly create-movie.sh is? When I try Ctrl-C on the globe example it quits cleanly:

gmt movie globe.sh -Nglobe -T360 -Fgif -C6ix6ix100 -Lf -P -Wtest-dir -Z
^C^C

@maxrjones
Copy link
Member

Can you tell me what exactly create-movie.sh is? When I try Ctrl-C on the globe example it quits cleanly:

gmt movie globe.sh -Nglobe -T360 -Fgif -C6ix6ix100 -Lf -P -Wtest-dir -Z
^C^C

The same except running it as a script and with debug output:
create-movie.sh:

gmt movie globe.sh -Nglobe -T360 -Fgif -C6ix6ix100 -Lf -P -Wtest-dir -Z -Vd

globe.sh:

gmt begin
   gmt coast -R0/10/0/10 -JG${MOVIE_FRAME}/20/${MOVIE_WIDTH} -Gmaroon -Sturquoise -Bg -X0 -Y0
gmt end
bash create-movie.sh
^C^C
...
movie [DEBUG]: Launch script for frame 030
^Cmovie [ERROR]: Running script bash movie_frame.sh 030 & returned error 2 - aborting.
gmt [WARNING]: Unable to determine current working directory.
gmt [WARNING]: Unable to determine current working directory.
gmt [WARNING]: Unable to determine current working directory.
gmt [WARNING]: Unable to determine current working directory.
figure [ERROR]: gmt figure: Unable to open movie parameter file ../movie_params_023.sh
gmt [WARNING]: Unable to determine current working directory.
ERROR: Caught signal number 11 (Segmentation fault: 11) at
gmt [WARNING]: Unable to determine current working directory.
ERROR: Caught signal number 11 (Segmentation fault: 11) at
...

@PaulWessel
Copy link
Member Author

The trouble is that the Ctrl-C may hit while a system call is calling Ghostscript, for instance, which it does quote often. Then you kill gs and lots of residual errors show up. Ctrl-C on gmt movie does handle system calls. I get these for instance:

^Cpsconvert [ERROR]: System call [gs -q -dNOPAUSE -dBATCH -dNOSAFER -dSCANCONVERTERTYPE=2 -dMaxBitmap=2147483647 -dUseFastColor=true -dGraphicsAlphaBits=2 -dTextAlphaBits=4 -sDEVICE=png16m -g600x600 -r100 -sOutputFile='../globe_062.png' '/Users/pwessel/.gmt/sessions/gmt_session.65878/psconvert_65912d.eps'] returned error 2.

Other times, no error. Just happens to hit a lull between system calls. Given that we launch shell scripts, these are out of reach once we let them go since we do not wait for return (which is why I cannot get an error back via the system return code - it just tells me I successfully launched a script.

I do have a solution to detecting if there is a bug in the main script: Basically, add -e to the script header and add something like "echo ok > /tmp/ok.number" and if that file exists when the script completes then we know it also got to the end.

I think we will just keep this branch open for now.

@PaulWessel PaulWessel changed the title Improve how movie and batch seek to clean up on premature exits WIP Improve how movie and batch seek to clean up on premature exits Nov 17, 2021
@PaulWessel PaulWessel modified the milestones: 6.3.0, Future release Nov 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improving an existing feature maintenance Boring but important stuff for the core devs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants