You can use the Datastore mode managed bulk delete service to delete data from your database. This feature supports deletion against one or more kinds.
This page describes how to delete Firestore in Datastore mode entities in bulk using the managed bulk delete service. The Datastore mode managed bulk delete service is available through the gcloud command-line tool and the Datastore mode API (REST, RPC).
Before you begin
Before you can use the managed bulk delete service, you must complete the following tasks:
- Enable billing for your Google Cloud project. Only Google Cloud projects with billing enabled can use the bulk delete functionality.
- Make sure your account has the necessary permissions for Datastore mode. If you are the project owner, your account has the required permissions. Otherwise, the following roles grant the necessary permissions for bulk delete operations:
  - Datastore mode roles: Owner, Cloud Datastore Owner, or Cloud Datastore Bulk Admin
Set up gcloud for your project
You can initiate bulk delete operations through the Google Cloud console or the gcloud command-line tool. To use gcloud, set up the command-line tool and connect to your project in one of the following ways:
- Access gcloud from the Google Cloud console using Cloud Shell.
Make sure gcloud is configured for the correct project:
gcloud config set project [PROJECT_ID]
Bulk delete data
A bulk delete operation first finds all applicable entities in your database and then deletes them in batches. You can still query or read these entities while the operation runs, but the results may vary depending on how far the operation has progressed. Bulk delete does not delete entities that are added or modified after the operation starts.
Bulk delete specific kinds
To bulk delete specific kinds, use the --collection-ids flag. The flag uses Firestore terminology, but a collection group ID is the same as a kind ID. The operation deletes only the kinds with the given IDs. Each specified kind covers all entities of that kind, including nested entities at any path.
gcloud beta firestore bulk-delete \
  --collection-ids=[COLLECTION_GROUP_ID_1_OR_KIND_1],[COLLECTION_GROUP_ID_2_OR_KIND_2],[SUBCOLLECTION_GROUP_ID_1_OR_KIND_3] \
  --database=[DATABASE]
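When the list of kinds comes from a script rather than being typed by hand, it can help to assemble the command arguments programmatically. The following is a minimal sketch that only builds the argument list for the command shown above; the function name build_bulk_delete_command is hypothetical, and the "(default)" database fallback and the comma check are assumptions of this sketch, not documented behavior.

```python
def build_bulk_delete_command(kinds, database="(default)"):
    """Build the argv list for `gcloud beta firestore bulk-delete`.

    `kinds` is a list of kind IDs (collection group IDs in Firestore terms).
    The "(default)" database fallback is an assumption of this sketch.
    """
    if not kinds:
        # This sketch only covers deletion of specific kinds.
        raise ValueError("pass at least one kind ID")
    for kind in kinds:
        # Defensive assumption: the flag value is comma-separated, so a comma
        # inside a kind ID would be ambiguous.
        if "," in kind:
            raise ValueError(f"kind ID contains a comma: {kind!r}")
    return [
        "gcloud", "beta", "firestore", "bulk-delete",
        "--collection-ids=" + ",".join(kinds),
        f"--database={database}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(build_bulk_delete_command(["Task", "Note"], "my-db")))
```

Printing the command before running it gives you a chance to confirm the kinds and database, which matters because deleted entities cannot be recovered.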
Manage bulk delete operations
After you start a bulk delete operation, Datastore mode assigns the operation a unique name. You can use the operation name to delete, cancel, or check the status of the operation.
Operation names are prefixed with projects/[PROJECT_ID]/databases/[DATABASE_ID]/operations/, for example:
projects/my-project/databases/(default)/operations/ASA1MTAwNDQxNAgadGx1YWZlZAcSeWx0aGdpbi1zYm9qLW5pbWRhEgopEg
However, you can leave out the prefix when specifying an operation name for the describe, cancel, and delete commands.
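The relationship between the full operation name and its short form can be sketched as two small helpers. The function names are hypothetical and the sketch relies only on the prefix format given above.

```python
def full_operation_name(project_id, database_id, operation):
    """Return the fully qualified operation name.

    Accepts either a short operation ID or an already prefixed name.
    """
    prefix = f"projects/{project_id}/databases/{database_id}/operations/"
    if operation.startswith(prefix):
        return operation
    return prefix + operation


def short_operation_id(name):
    """Strip the projects/.../operations/ prefix, if present."""
    return name.rsplit("/operations/", 1)[-1]


if __name__ == "__main__":
    name = full_operation_name("my-project", "(default)", "EXAMPLE_ID")
    print(name)
    print(short_operation_id(name))
```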
List all bulk delete operations
Use the operations list command to see all running and recently completed operations, including bulk delete operations:
gcloud firestore operations list
Check operation status
Use the operations describe command to show the status of a bulk delete operation:
gcloud firestore operations describe [OPERATION_NAME]
Estimate the completion time
A request for the status of a long-running operation returns the metrics workEstimated and workCompleted. Each of these metrics is returned in both number of bytes and number of entities:
- workEstimated shows the estimated total number of bytes and entities an operation will process. Datastore mode might omit this metric if it can't make an estimate.
- workCompleted shows the number of bytes and entities deleted so far. After the operation completes, the value shows the total number of bytes and entities that were actually processed, which might be larger than the value of workEstimated.
Divide workCompleted by workEstimated for a rough progress estimate. This estimate might be inaccurate, because it depends on delayed statistics collection.
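The progress calculation can be sketched as follows. The function name progress_fraction is hypothetical, and the sketch takes the two metric values as plain numbers (either the byte counts or the entity counts); extracting them from the operation status response is not shown. Capping the result at 1.0 is a choice made here because workCompleted can end up larger than workEstimated.

```python
def progress_fraction(work_completed, work_estimated):
    """Rough progress of an operation as a fraction between 0.0 and 1.0.

    Returns None when no estimate is available, since workEstimated
    might be omitted.
    """
    if not work_estimated or work_estimated <= 0:
        return None
    # workCompleted can exceed workEstimated, so cap the ratio at 1.0.
    return min(work_completed / work_estimated, 1.0)


if __name__ == "__main__":
    fraction = progress_fraction(250, 1000)
    if fraction is None:
        print("No estimate available yet")
    else:
        print(f"Roughly {fraction:.0%} complete")
```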
Cancel an operation
Use the operations cancel command to stop an operation in progress:
gcloud firestore operations cancel [OPERATION_NAME]
Cancelling a running operation doesn't undo the work already done. Entities deleted before the cancellation are not recovered.
Delete an operation
Use the gcloud firestore operations delete command to remove a completed operation from the list of recent operations. To stop a running operation, use the operations cancel command described earlier.
gcloud firestore operations delete [OPERATION_NAME]
Billing and pricing for bulk delete operations
You are required to enable billing for your Google Cloud project before you use the managed bulk delete service.
Bulk delete operations are charged for entity reads and deletes at the rates listed in Datastore mode pricing. Bulk delete operations incur one index entry read per entity found and one delete operation per entity deleted. You are charged one read operation for every 1000 index entries read, or fraction thereof. For example, a bulk delete operation that deletes 1500 entities is charged 2 entity reads and 1500 entity deletes.
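The billing arithmetic can be sketched as a small calculation. The function name billable_operations is hypothetical, and rounding a partial batch of index entries up to a full read is an assumption inferred from the 1500-entity example. The sketch counts operations only and does not apply any price.

```python
INDEX_ENTRIES_PER_READ = 1000  # one read operation covers up to 1000 index entries


def billable_operations(entities_found, entities_deleted=None):
    """Return (entity_reads, entity_deletes) for a bulk delete operation.

    `entities_deleted` defaults to `entities_found`; pass a smaller value
    when some entities found were not eventually deleted.
    """
    if entities_deleted is None:
        entities_deleted = entities_found
    # Integer ceiling division: a partial batch still counts as one read
    # (assumption based on the 1500-entity example).
    reads = (entities_found + INDEX_ENTRIES_PER_READ - 1) // INDEX_ENTRIES_PER_READ
    return reads, entities_deleted


if __name__ == "__main__":
    reads, deletes = billable_operations(1500)
    print(f"{reads} entity reads, {deletes} entity deletes")
```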
Datastore mode charges for the work actually done. If the operation is cancelled or fails because of a user error, you are charged for the progress made up to that point. Datastore mode doesn't charge reads or deletes for entities that are not eventually deleted (for example, entities that are modified after the delete operation started). The cost is attributed to the day the operation completes.
Bulk delete operations will not trigger your Google Cloud budget alerts until after completion. Similarly, reads and deletes performed during a bulk delete operation are applied to your free quota/usage after the operation is complete. Bulk delete operations will not affect the usage shown in the usage section of the console.
View bulk delete costs
Bulk delete operations apply the goog-firestoremanaged:bulkdelete label to billed operations. In the Cloud Billing reports page, you can use this label to view costs related to bulk delete operations.
Best practices
After a bulk delete, avoid queries that must skip over large ranges of recently deleted data, because skipping that data can slow the queries down.
Other option
Dataflow, a managed service for developing and running data processing workflows, can also bulk delete data. Dataflow provides a Datastore delete template that you can use to delete entities selected by a GQL query.
For more information, see the Dataflow documentation on executing the Datastore bulk delete template.
If your use case involves data retention, see Manage data retention with TTL policies.