[FLINK-10856] Take latest checkpoint to resume from in resume from externalized checkpoint e2e test

Since some empty checkpoint directories can be left behind, we have to take the latest
checkpoint directory in order to resume from an externalized checkpoint. This commit changes
test_resume_externalized_checkpoint.sh to sort the checkpoint directories in descending order and
take the first one.
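
The selection described above can be tried in isolation. The following is a minimal sketch, not part of the commit: `latest_checkpoint` and the temporary demo directory are hypothetical names introduced for illustration, and the sketch assumes a `sort` that supports `-V` (version sort, as in GNU coreutils). Version sort matters here because a plain `sort -r` would rank `chk-9` above `chk-10`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): print the chk-N directory
# with the highest N under the given job directory, or nothing if none exists.
latest_checkpoint() {
  local job_dir="$1"
  # List candidate directories, sort them in descending version order,
  # and keep only the first (newest) entry.
  ls -d "$job_dir"/chk-[1-9]* 2>/dev/null | sort -Vr | head -n1
}

# Demo with made-up checkpoint directories in a temporary location.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/chk-2" "$demo_dir/chk-9" "$demo_dir/chk-10"

latest=$(latest_checkpoint "$demo_dir")
echo "${latest##*/}"   # prints: chk-10

rm -rf "$demo_dir"
```

When no directory matches, the function prints nothing, so a caller can keep using an emptiness check such as `[ -z "$CHECKPOINT_PATH" ]` to detect a missing checkpoint.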
tillrohrmann committed Nov 14, 2018
1 parent 9f1f25d commit 6f15c17
Showing 1 changed file with 7 additions and 7 deletions.
@@ -111,19 +111,14 @@ else
cancel_job $DATASTREAM_JOB
fi

-CHECKPOINT_PATH=$(ls -d $CHECKPOINT_DIR/$DATASTREAM_JOB/chk-[1-9]*)
+# take the latest checkpoint
+CHECKPOINT_PATH=$(ls -d $CHECKPOINT_DIR/$DATASTREAM_JOB/chk-[1-9]* | sort -Vr | head -n1)

if [ -z $CHECKPOINT_PATH ]; then
echo "Expected an externalized checkpoint to be present, but none exists."
exit 1
fi

-NUM_CHECKPOINTS=$(echo $CHECKPOINT_PATH | wc -l | tr -d ' ')
-if (( $NUM_CHECKPOINTS > 1 )); then
-echo "Expected only exactly 1 externalized checkpoint to be present, but $NUM_CHECKPOINTS exists."
-exit 1
-fi

echo "Restoring job with externalized checkpoint at $CHECKPOINT_PATH ..."

BASE_JOB_CMD=`buildBaseJobCmd $NEW_DOP "-s file://${CHECKPOINT_PATH}"`
@@ -141,6 +136,11 @@ fi

DATASTREAM_JOB=$($JOB_CMD | grep "Job has been submitted with JobID" | sed 's/.* //g')

+if [ -z $DATASTREAM_JOB ]; then
+echo "Resuming from externalized checkpoint job could not be started."
+exit 1
+fi

wait_job_running $DATASTREAM_JOB
wait_oper_metric_num_in_records SemanticsCheckMapper.0 200
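
The JobID extraction and the new emptiness guard can be exercised without a running cluster. This is a rough sketch under stated assumptions: `extract_job_id` is a hypothetical wrapper around the same `grep | sed` pipeline used in the script, and the sample client output (including the JobID value) is made up. The variable is quoted in the test so that an empty value is still passed to `-z` as an argument.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): read client output on stdin and
# print the last whitespace-separated token of the submission line.
extract_job_id() {
  grep "Job has been submitted with JobID" | sed 's/.* //g'
}

# Made-up client output for illustration only.
sample_output="Starting execution of program
Job has been submitted with JobID 0123456789abcdef0123456789abcdef"

job_id=$(printf '%s\n' "$sample_output" | extract_job_id)

# Fail fast if no JobID was found, mirroring the guard added in the diff.
if [ -z "$job_id" ]; then
  echo "Resuming from externalized checkpoint job could not be started." >&2
  exit 1
fi

echo "$job_id"   # prints: 0123456789abcdef0123456789abcdef
```

If the submission line is missing, `grep` emits nothing, `job_id` stays empty, and the guard stops the script before it waits on a job that was never started.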
