Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to open region: Empty region directory #4127

Open
evenyag opened this issue Jun 10, 2024 · 2 comments
Open

Failed to open region: Empty region directory #4127

evenyag opened this issue Jun 10, 2024 · 2 comments
Assignees
Labels
C-bug Category Bugs

Comments

@evenyag
Copy link
Contributor

evenyag commented Jun 10, 2024

What type of bug is this?

Crash

What subsystems are affected?

Distributed Cluster

Minimal reproduce step

  1. Create a table with multiple regions.
  2. Drop the table
  3. While dropping the table, restart the cluster

What did you expect to see?

The cluster is restarted

What did you see instead?

Datanode crashes.

What operating system did you use?

Unrelated

What version of GreptimeDB did you use?

0.8.2

Relevant log output and stack trace

2024-06-10T13:01:56+08:00   2024-06-10T05:01:56.804391Z  INFO RegionServerRequestHandler::handle:RegionServer::handle_region_request{region_id="924401516150784(215229, 0)"}:handle_request{request_type="Create"}: datanode::region_server: Region 924401516150784(215229, 0) is registered to engine mito
2024-06-10T13:01:56+08:00   2024-06-10T05:01:56.816112Z  INFO RegionServerRequestHandler::handle:RegionServer::handle_region_request{region_id="924401516150785(215229, 1)"}:handle_request{request_type="Create"}: datanode::region_server: Region 924401516150785(215229, 1) is registered to engine mito
2024-06-10T13:01:56+08:00   2024-06-10T05:01:56.843830Z  INFO RegionServerRequestHandler::handle:RegionServer::handle_region_request{region_id="924401516150787(215229, 3)"}:handle_request{request_type="Create"}: datanode::region_server: Region 924401516150787(215229, 3) is registered to engine mito
2024-06-10T13:01:56+08:00   2024-06-10T05:01:56.866768Z  INFO RegionServerRequestHandler::handle:RegionServer::handle_region_request{region_id="924401516150786(215229, 2)"}:handle_request{request_type="Create"}: datanode::region_server: Region 924401516150786(215229, 2) is registered to engine mito

2024-06-10T13:01:57+08:00   2024-06-10T05:01:57.394926Z  INFO mito2::worker::handle_drop: Try to drop region: 924401516150787(215229, 3)
2024-06-10T13:01:57+08:00   2024-06-10T05:01:57.395009Z  INFO mito2::worker::handle_drop: Try to drop region: 924401516150785(215229, 1)
2024-06-10T13:01:57+08:00   2024-06-10T05:01:57.394999Z  INFO mito2::worker::handle_drop: Try to drop region: 924401516150784(215229, 0)

2024-06-10T13:01:57+08:00   2024-06-10T05:01:57.412498Z  INFO mito2::region: Stopped region manifest manager, region_id: 924401516150784(215229, 0)
2024-06-10T13:01:57+08:00   2024-06-10T05:01:57.412554Z  INFO mito2::worker::handle_drop: Region 924401516150784(215229, 0) is dropped logically, but some files are not deleted yet
2024-06-10T13:01:57+08:00   2024-06-10T05:01:57.412729Z  INFO RegionServerRequestHandler::handle:RegionServer::handle_region_request{region_id="924401516150784(215229, 0)"}:handle_request{request_type="Drop"}: datanode::region_server: Region 924401516150784(215229, 0) is deregistered from engine mito
2024-06-10T13:01:57+08:00   2024-06-10T05:01:57.412822Z  INFO datanode::alive_keeper: Deregister alive countdown for region 924401516150784(215229, 0)
2024-06-10T13:01:57+08:00   2024-06-10T05:01:57.418047Z  INFO mito2::region: Stopped region manifest manager, region_id: 924401516150787(215229, 3)
2024-06-10T13:01:57+08:00   2024-06-10T05:01:57.418183Z  INFO mito2::worker::handle_drop: Region 924401516150787(215229, 3) is dropped logically, but some files are not deleted yet
2024-06-10T13:01:57+08:00   2024-06-10T05:01:57.418302Z  INFO RegionServerRequestHandler::handle:RegionServer::handle_region_request{region_id="924401516150787(215229, 3)"}:handle_request{request_type="Drop"}: datanode::region_server: Region 924401516150787(215229, 3) is deregistered from engine mito
2024-06-10T13:01:57+08:00   2024-06-10T05:01:57.418456Z  INFO datanode::alive_keeper: Deregister alive countdown for region 924401516150787(215229, 3)
2024-06-10T13:03:19+08:00   2024-06-10T05:03:19.440529Z  INFO mito2::worker::handle_open: Region 924401516150785(215229, 1) is dropped, result: Ok(true)

2024-06-10T13:03:19+08:00   2024-06-10T05:03:19.452182Z  INFO mito2::worker::handle_open: Try to open region 924401516150786(215229, 2)
2024-06-10T13:03:41+08:00   2024-06-10T05:03:41.399459Z  INFO mito2::worker::handle_open: Try to open region 924401516150785(215229, 1)
2024-06-10T13:03:41+08:00   1: Failed to handle request for region 924401516150785(215229, 1), at /home/runner/work/greptimedb/greptimedb/greptimedb/src/datanode/src/region_server.rs:497:14


Error: 0: Failed to start datanode, at /home/runner/work/greptimedb/greptimedb/greptimedb/src/cmd/src/datanode.rs:277:14
1: Failed to handle request for region 924401516150785(215229, 1), at /home/runner/work/greptimedb/greptimedb/greptimedb/src/datanode/src/region_server.rs:497:14
2: Failed to open region, at /home/runner/work/greptimedb/greptimedb/greptimedb/src/mito2/src/worker/handle_open.rs:133:54
3: Empty region directory, region_id: 924401516150785(215229, 1), region_dir: data/xxxxxx/public/215229/215229_0000000001/, at /home/runner/work/greptimedb/greptimedb/greptimedb/src/mito2/src/region/opener.rs:256:14
@evenyag evenyag added the C-bug Category Bugs label Jun 10, 2024
@evenyag
Copy link
Contributor Author

evenyag commented Jul 29, 2024

The procedure drops the table and removes some of the region directories. But while restarting the cluster, the procedure may roll back and restore the metadata. Then the datanode will open the dropped directory and unable to start.

@sunng87
Copy link
Member

sunng87 commented Jul 29, 2024

We can log this error and let the server starts. It would be nice to have a background job to cleanup the state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category Bugs
Projects
None yet
Development

No branches or pull requests

3 participants