Deterministic pagination #4608

Open
dillonstreator opened this issue May 30, 2024 · 0 comments
dillonstreator commented May 30, 2024

Most resources do not have a field to sort on that provides consistent, deterministic paging. For instance, I'm trying to page through CareTeams with a particular tag, and paging does not return the full set of CareTeams that exist in the system. Keep in mind that the resource set behind the base filter is not actively changing, and yet the pagination still does not return all entries.

// Track every CareTeam id seen so far to detect repeats across pages.
const seenCareTeamIds = new Set<string>();
for await (const resources of medplum.searchResourcePages('CareTeam', {
  _tag: '...|...',
  _count: 1000,
  _sort: 'name',
})) {
  for (const resource of resources) {
    if (seenCareTeamIds.has(resource.id!)) {
      // The same resource was already returned on an earlier page.
      console.log('Repeated care team', resource.id);
      continue;
    }
    seenCareTeamIds.add(resource.id!);

    // ...
  }
}

These CareTeams do not have a date set, so that cannot be used for sorting. The names are all the same, so this is understandably a bad field to sort on as well.

After updating the script to sort on patient, the results were MUCH better, but there were still several duplicate records across pages. The fact that there is no field shared across all resources that can be sorted on deterministically feels like a gap in the spec.
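To illustrate why a non-unique sort key can produce both repeats and gaps, here is a toy in-memory model. It is not Medplum code: `Row`, `rows`, `queryPage`, and `pageThrough` are hypothetical names, and the flipping tie order is an assumption standing in for whatever arbitrary order a backend might use for rows that compare equal.

```typescript
// Toy model (not Medplum code): rows that tie on the sort key may come
// back in a different relative order on each query.
interface Row {
  id: string;
  patient: string;
}

const rows: Row[] = [
  { id: 'a', patient: 'p1' },
  { id: 'b', patient: 'p1' },
  { id: 'c', patient: 'p1' },
  { id: 'd', patient: 'p2' },
];

// Simulates one offset-based query sorted by `patient` only.
// `flipTies` reverses the order of rows that share the same patient.
function queryPage(offset: number, count: number, flipTies: boolean): Row[] {
  const sorted = [...rows].sort((x, y) => {
    if (x.patient !== y.patient) {
      return x.patient < y.patient ? -1 : 1;
    }
    const byId = x.id < y.id ? -1 : x.id > y.id ? 1 : 0;
    return flipTies ? -byId : byId;
  });
  return sorted.slice(offset, offset + count);
}

// Pages through the full set and reports repeated and missing ids.
function pageThrough(pageSize: number): { repeated: string[]; missing: string[] } {
  const seen = new Set<string>();
  const repeated: string[] = [];
  for (let page = 0; page * pageSize < rows.length; page++) {
    // Odd-numbered queries return ties in the opposite order.
    for (const row of queryPage(page * pageSize, pageSize, page % 2 === 1)) {
      if (seen.has(row.id)) {
        repeated.push(row.id);
      } else {
        seen.add(row.id);
      }
    }
  }
  const missing = rows.filter((r) => !seen.has(r.id)).map((r) => r.id);
  return { repeated, missing };
}

console.log(pageThrough(2)); // { repeated: [ 'a' ], missing: [ 'c' ] }
```

In this model the underlying data never changes, yet the second page repeats `a` and never returns `c`, which matches the symptom described here.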

I don't see this covered by the FHIR spec, but it would be really valuable to have a shared field across all resources that is either an internal sortable ULID or a 'created at' field. This field could be appended by default to whatever sort the calling client provides, making paging deterministic by default. 'created at' could be backfilled from the first _history entry for the matching resource. I'm not sure whether ULIDs can be easily backfilled, but if a seed date can be provided, the derived 'created at' could be used.
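A minimal sketch of the "append a tiebreaker to the client's sort" idea, assuming the comma-separated `_sort` syntax from the FHIR search spec, where a leading `-` means descending. `withTiebreaker` is a hypothetical helper, not an existing Medplum function, and the default key `_id` is only a placeholder for whichever unique, sortable field the server would actually expose (a ULID or 'created at' field as proposed above). Whether a given server supports sorting on that key is an assumption.

```typescript
// Hypothetical helper: appends a unique tiebreaker key to a FHIR-style
// `_sort` value unless the caller already sorts on that key.
// The default '_id' is a placeholder, not a confirmed server feature.
function withTiebreaker(sort: string | undefined, tiebreaker: string = '_id'): string {
  const keys = (sort ?? '')
    .split(',')
    .map((key) => key.trim())
    .filter((key) => key.length > 0);

  // Ignore a leading '-' (descending) when checking for the key.
  const alreadyPresent = keys.some((key) => key.replace(/^-/, '') === tiebreaker);
  if (!alreadyPresent) {
    keys.push(tiebreaker);
  }
  return keys.join(',');
}

console.log(withTiebreaker('name')); // name,_id
console.log(withTiebreaker(undefined)); // _id
console.log(withTiebreaker('-date,_id')); // -date,_id
```

Because the tiebreaker comes last, it only affects the order of rows that tie on every key the client asked for, so the caller's requested ordering is preserved.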

dillonstreator changed the title from "Deterministic pagination across all resources" to "Deterministic pagination" on May 30, 2024