Delete entities in bulk

You can use the Datastore mode managed bulk delete service to delete data from your database. This feature supports deleting entities of one or more kinds.

This page describes how to delete Firestore in Datastore mode entities in bulk using the managed bulk delete service. The Datastore mode managed bulk delete service is available through the gcloud command-line tool and the Datastore mode API (REST, RPC).

Before you begin

Before you can use the managed bulk delete service, you must complete the following tasks:

  1. Enable billing for your Google Cloud project. Only Google Cloud projects with billing enabled can use the bulk delete functionality.
  2. Make sure your account has the necessary permissions for Datastore mode. If you are the project owner, your account has the required permissions. Otherwise, the following roles grant the necessary permissions for bulk delete operations:

    • Datastore mode roles: Owner, Cloud Datastore Owner, or Cloud Datastore Bulk Admin

Set up gcloud for your project

You can initiate bulk delete operations through the Google Cloud console or the gcloud command-line tool. To use gcloud, set up the command-line tool and connect it to your project.

Bulk delete data

A bulk delete operation first finds all applicable entities in your database and then deletes them in batches. You can still query or read these entities while the operation runs, but the results may vary depending on how far the operation has progressed. Bulk delete does not delete any entities added or modified after the operation starts.

Bulk delete specific kinds

gcloud

To bulk delete specific kinds, use the --collection-ids flag. The flag uses Firestore terminology, but a collection group ID is the same as a kind ID. The operation deletes only the kinds with the given IDs. Each kind covers all entities of that kind, including nested entities at any path.

gcloud beta firestore bulk-delete \
--collection-ids=[COLLECTION_GROUP_ID_1_OR_KIND_1],[COLLECTION_GROUP_ID_2_OR_KIND_2],[SUBCOLLECTION_GROUP_ID_1_OR_KIND_3] \
--database=[DATABASE]
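
As an illustration, the following shell sketch assembles the command above from a list of kinds. The kind names Task and Note, the database ID (default), and the helper name build_bulk_delete_cmd are hypothetical; only the --collection-ids and --database flags come from this page. The function prints the command instead of running it, so you can review it before deleting anything.

```shell
#!/bin/sh
# Hypothetical helper: prints a bulk-delete command for review; it does not run it.
# Usage: build_bulk_delete_cmd DATABASE KIND [KIND...]
build_bulk_delete_cmd() {
  database="$1"
  shift
  # Join the remaining arguments (kind IDs) with commas.
  kinds=""
  for kind in "$@"; do
    if [ -z "$kinds" ]; then
      kinds="$kind"
    else
      kinds="$kinds,$kind"
    fi
  done
  # The database ID is single-quoted because "(default)" contains shell metacharacters.
  printf "gcloud beta firestore bulk-delete --collection-ids=%s --database='%s'\n" "$kinds" "$database"
}

# Example with assumed kind names:
build_bulk_delete_cmd "(default)" Task Note
# prints: gcloud beta firestore bulk-delete --collection-ids=Task,Note --database='(default)'
```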

Manage bulk delete operations

After you start a bulk delete operation, Datastore mode assigns the operation a unique name. You can use the operation name to delete, cancel, or check the status of the operation.

Operation names are prefixed with projects/[PROJECT_ID]/databases/[DATABASE_ID]/operations/, for example:

projects/my-project/databases/(default)/operations/ASA1MTAwNDQxNAgadGx1YWZlZAcSeWx0aGdpbi1zYm9qLW5pbWRhEgopEg

However, you can leave out the prefix when specifying an operation name for the describe, cancel, and delete commands.
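
If a script receives the full operation name and you prefer to pass the short form, a small shell function can strip the prefix. This is only a sketch; the function name short_operation_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reduce a full operation name to its short form.
short_operation_name() {
  # Remove everything up to and including "/operations/". A name that is
  # already short has no such prefix and is printed unchanged.
  printf '%s\n' "${1##*/operations/}"
}

short_operation_name "projects/my-project/databases/(default)/operations/ASA1MTAwNDQxNAgadGx1YWZlZAcSeWx0aGdpbi1zYm9qLW5pbWRhEgopEg"
# prints: ASA1MTAwNDQxNAgadGx1YWZlZAcSeWx0aGdpbi1zYm9qLW5pbWRhEgopEg
```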

List all bulk delete operations

gcloud

Use the operations list command to see all running and recently completed operations, including bulk delete operations:

gcloud firestore operations list

Check operation status

gcloud

Use the operations describe command to show the status of a bulk delete operation.

gcloud firestore operations describe [OPERATION_NAME]

Estimate the completion time

A request for the status of a long-running operation returns the metrics workEstimated and workCompleted. Each of these metrics is returned in both number of bytes and number of entities:

  • workEstimated shows the estimated total number of bytes and entities an operation will process. Datastore mode might omit this metric if it can't make an estimate.

  • workCompleted shows the number of bytes and entities deleted so far. After the operation completes, the value shows the total number of bytes and entities that were actually processed, which might be larger than the value of workEstimated.

Divide workCompleted by workEstimated for a rough progress estimate. This estimate might be inaccurate, because it depends on delayed statistics collection.
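
The division can be sketched in shell as follows. The function name progress_percent is hypothetical, and the sketch assumes you have already read the two entity (or byte) counts from the operation status yourself. It returns "unknown" when no estimate is available and caps the result at 100, because workCompleted can exceed workEstimated.

```shell
#!/bin/sh
# Hypothetical helper: rough progress as a whole-number percentage.
# Usage: progress_percent WORK_COMPLETED WORK_ESTIMATED
progress_percent() {
  completed="$1"
  estimated="$2"
  # workEstimated can be missing; treat an empty or zero value as "no estimate".
  if [ -z "$estimated" ] || [ "$estimated" -eq 0 ]; then
    echo "unknown"
    return 0
  fi
  # Integer arithmetic: multiply first so the division does not truncate to 0.
  percent=$(( completed * 100 / estimated ))
  # workCompleted can be larger than workEstimated, so cap the result at 100.
  if [ "$percent" -gt 100 ]; then
    percent=100
  fi
  echo "$percent"
}

progress_percent 750 3000   # prints 25
```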

Cancel an operation

gcloud

Use the operations cancel command to stop an operation in progress:

gcloud firestore operations cancel [OPERATION_NAME]

Cancelling a running operation doesn't undo the operation. Entities that a bulk delete operation deleted before it was cancelled are not restored.

Delete an operation

Use the gcloud firestore operations delete command to remove a completed operation from the list of recent operations. To stop a running operation, use the operations cancel command described earlier.

gcloud firestore operations delete [OPERATION_NAME]

Billing and pricing for bulk delete operations

You are required to enable billing for your Google Cloud project before you use the managed bulk delete service.

Bulk delete operations are charged for entity reads and deletes at the rates listed in Datastore mode pricing. Bulk delete operations incur one index entry read per entity found and one delete operation per entity deleted. You are charged one read operation for up to 1000 index entries read. For example, a bulk delete operation that deletes 1500 entities is charged 2 entity reads and 1500 entity deletes.
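
The arithmetic in this example can be sketched in shell. The function name bulk_delete_charges is hypothetical, and the sketch assumes every entity found is also deleted, as in the example above. It reports operation counts only; actual prices depend on the Datastore mode pricing for your location.

```shell
#!/bin/sh
# Hypothetical helper: billed operation counts for a bulk delete in which
# every entity found is also deleted.
# Usage: bulk_delete_charges ENTITY_COUNT
bulk_delete_charges() {
  entities="$1"
  # One read operation per 1000 index entries (or part thereof), rounded up.
  reads=$(( (entities + 999) / 1000 ))
  # One delete operation per entity deleted.
  deletes="$entities"
  echo "reads=$reads deletes=$deletes"
}

bulk_delete_charges 1500   # prints reads=2 deletes=1500
```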

Note that Datastore mode charges for the actual work done. If the operation is cancelled or fails due to user error, you are charged for the progress made. Datastore mode doesn't charge reads or deletes for entities that are not ultimately deleted (for example, entities that are modified after the delete operation started). The cost is attributed to the day on which the operation completes.

Bulk delete operations will not trigger your Google Cloud budget alerts until after completion. Similarly, reads and deletes performed during a bulk delete operation are applied to your free quota/usage after the operation is complete. Bulk delete operations will not affect the usage shown in the usage section of the console.

View bulk delete costs

Bulk delete operations apply the goog-firestoremanaged:bulkdelete label to billed operations. In the Cloud Billing reports page, you can use this label to view costs related to bulk delete operations.

Best practices

Avoid queries that have to skip over large amounts of recently deleted data, because skipping deleted data can slow queries down.

Other options

Dataflow is a managed service for developing and executing data processing workflows, and it can also bulk delete data. Dataflow provides a Datastore delete template that you can use to delete entities selected by a GQL query.

For more information, see the Dataflow documentation on executing the Datastore bulk delete template.

If your use case involves data retention, see Manage data retention with TTL policies.