Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve sync operation to cover all scenarios #2426

Closed
ageryck opened this issue Feb 2, 2024 · 9 comments
Closed

Improve sync operation to cover all scenarios #2426

ageryck opened this issue Feb 2, 2024 · 9 comments
Labels
P1 High priority issue

Comments

@ageryck
Copy link

ageryck commented Feb 2, 2024

Describe the bug
The sync process goes through the following steps

  • Get all records from local change
  • Group them by id and resource type
  • Squash related resource requests (merge create payload with update payload of the same resource)
  • Chunk the squashed records
  • Bundle the records and sync

This is failing for particular scenarios as shared below
A Group created earlier then members added to it later created an error if the changes to the Group resource holding the members are squashed
The create payload will be saved with a top ID in the local change
The patch payload(below) will come much later in the local change table together with the patient create payload

[
	{
		"op": "add",
		"path": "/member",
		"value": [
			{
				"entity": {
					"reference": "Patient/8f5df579-5890-465f-99e3-34a70cdd1b07"
				}
			}
		]
	}
]

The group by processing will pull the patch payload and squash it together with the group create payload leaving the patient create payload down in the queue
When chunking happens the group payload will be in a separate bundle and it's referencing the patient in another bundle

Component
Engine

To Reproduce
As explained above

Expected behavior
Squash changes without interfering with the references

Suggested Solutions for consideration

  1. We could avoid squashing of records
  2. In the squash logic, we can establish ref to original resources and avoid them if they are missing
  3. We only squash if local changes happen ina sequence
  4. First order the create requests first and process them then quash the updates and send later

Would you like to work on the issue?
Please state if this issue should be assigned to you or who you think could help to solve this issue.

@ageryck
Copy link
Author

ageryck commented Feb 2, 2024

Preliminary test on option 1
HAPI is rejecting bundles with multiple requests for the same resource, we may still need to squash as earlier suggested

{
    "resourceType": "OperationOutcome",
    "text": {
        "status": "generated",
        "div": "<div xmlns=\"http:https://www.w3.org/1999/xhtml\"><h1>Operation Outcome</h1><table border=\"0\"><tr><td style=\"font-weight: bold;\">ERROR</td><td>[]</td><td><pre>HAPI-0535: Transaction bundle contains multiple resources with ID: Group/1f21aae5-0801-4d1d-a52f-f65558ed7ae2</pre></td>\n\t\t\t</tr>\n\t\t</table>\n\t</div>"
    },
    "issue": [
        {
            "severity": "error",
            "code": "processing",
            "diagnostics": "HAPI-0535: Transaction bundle contains multiple resources with ID: Group/1f21aae5-0801-4d1d-a52f-f65558ed7ae2"
        }
    ]
}

@jingtang10
Copy link
Collaborator

Preliminary test on option 1 HAPI is rejecting bundles with multiple requests for the same resource, we may still need to squash as earlier suggested

{
    "resourceType": "OperationOutcome",
    "text": {
        "status": "generated",
        "div": "<div xmlns=\"http:https://www.w3.org/1999/xhtml\"><h1>Operation Outcome</h1><table border=\"0\"><tr><td style=\"font-weight: bold;\">ERROR</td><td>[]</td><td><pre>HAPI-0535: Transaction bundle contains multiple resources with ID: Group/1f21aae5-0801-4d1d-a52f-f65558ed7ae2</pre></td>\n\t\t\t</tr>\n\t\t</table>\n\t</div>"
    },
    "issue": [
        {
            "severity": "error",
            "code": "processing",
            "diagnostics": "HAPI-0535: Transaction bundle contains multiple resources with ID: Group/1f21aae5-0801-4d1d-a52f-f65558ed7ae2"
        }
    ]
}

thanks @ageryck - what is a minimum example for this error? does hapi always throw this if you have 2 resources with the same id? or does it only happen when you have mulitple updates? if you have one create and one update does it work?

@ageryck
Copy link
Author

ageryck commented Feb 2, 2024

Preliminary test on option 1 HAPI is rejecting bundles with multiple requests for the same resource, we may still need to squash as earlier suggested

{
    "resourceType": "OperationOutcome",
    "text": {
        "status": "generated",
        "div": "<div xmlns=\"http:https://www.w3.org/1999/xhtml\"><h1>Operation Outcome</h1><table border=\"0\"><tr><td style=\"font-weight: bold;\">ERROR</td><td>[]</td><td><pre>HAPI-0535: Transaction bundle contains multiple resources with ID: Group/1f21aae5-0801-4d1d-a52f-f65558ed7ae2</pre></td>\n\t\t\t</tr>\n\t\t</table>\n\t</div>"
    },
    "issue": [
        {
            "severity": "error",
            "code": "processing",
            "diagnostics": "HAPI-0535: Transaction bundle contains multiple resources with ID: Group/1f21aae5-0801-4d1d-a52f-f65558ed7ae2"
        }
    ]
}

thanks @ageryck - what is a minimum example for this error? does hapi always throw this if you have 2 resources with the same id? or does it only happen when you have mulitple updates? if you have one create and one update does it work?

@jingtang10 this happens when there is more than 1 resource with identical fullUrl, we still need to squash it as a server requirement

@jingtang10 jingtang10 linked a pull request Feb 2, 2024 that will close this issue
7 tasks
@MJ1998 MJ1998 mentioned this issue Feb 2, 2024
7 tasks
@jingtang10
Copy link
Collaborator

linking jajoo's pr which is a prototype - not actually sure it'll fix the issue as per ager's latest comment.

@jingtang10
Copy link
Collaborator

Preliminary test on option 1 HAPI is rejecting bundles with multiple requests for the same resource, we may still need to squash as earlier suggested

{
    "resourceType": "OperationOutcome",
    "text": {
        "status": "generated",
        "div": "<div xmlns=\"http:https://www.w3.org/1999/xhtml\"><h1>Operation Outcome</h1><table border=\"0\"><tr><td style=\"font-weight: bold;\">ERROR</td><td>[]</td><td><pre>HAPI-0535: Transaction bundle contains multiple resources with ID: Group/1f21aae5-0801-4d1d-a52f-f65558ed7ae2</pre></td>\n\t\t\t</tr>\n\t\t</table>\n\t</div>"
    },
    "issue": [
        {
            "severity": "error",
            "code": "processing",
            "diagnostics": "HAPI-0535: Transaction bundle contains multiple resources with ID: Group/1f21aae5-0801-4d1d-a52f-f65558ed7ae2"
        }
    ]
}

thanks @ageryck - what is a minimum example for this error? does hapi always throw this if you have 2 resources with the same id? or does it only happen when you have mulitple updates? if you have one create and one update does it work?

@jingtang10 this happens when there is more than 1 resource with identical fullUrl, we still need to squash it as a server requirement

@aditya-07 mentioned that a create and an update would still work, but just not multiple updates. can you confirm?

@ageryck
Copy link
Author

ageryck commented Feb 2, 2024

Preliminary test on option 1 HAPI is rejecting bundles with multiple requests for the same resource, we may still need to squash as earlier suggested

{
    "resourceType": "OperationOutcome",
    "text": {
        "status": "generated",
        "div": "<div xmlns=\"http:https://www.w3.org/1999/xhtml\"><h1>Operation Outcome</h1><table border=\"0\"><tr><td style=\"font-weight: bold;\">ERROR</td><td>[]</td><td><pre>HAPI-0535: Transaction bundle contains multiple resources with ID: Group/1f21aae5-0801-4d1d-a52f-f65558ed7ae2</pre></td>\n\t\t\t</tr>\n\t\t</table>\n\t</div>"
    },
    "issue": [
        {
            "severity": "error",
            "code": "processing",
            "diagnostics": "HAPI-0535: Transaction bundle contains multiple resources with ID: Group/1f21aae5-0801-4d1d-a52f-f65558ed7ae2"
        }
    ]
}

thanks @ageryck - what is a minimum example for this error? does hapi always throw this if you have 2 resources with the same id? or does it only happen when you have mulitple updates? if you have one create and one update does it work?

@jingtang10 this happens when there is more than 1 resource with identical fullUrl, we still need to squash it as a server requirement

@aditya-07 mentioned that a create and an update would still work, but just not multiple updates. can you confirm?

@aditya-07 the bundle I shared fails for this id "fullUrl": "Group/1f21aae5-0801-4d1d-a52f-f65558ed7ae2" which appears 3 times (1 create and 2 patch resources) when I eliminate 1 patch it still fails but when I eliminate both patches it's processed by HAPI successfully
cc @jingtang10

@jingtang10
Copy link
Collaborator

Discussed with @aditya-07 in a bit more detail today.

@aditya-07 started this thread https://chat.fhir.org/#narrow/stream/179167-hapi/topic/Multiple.20operations.20for.20single.20resource.20in.20a.20FHIR.20Transac.2E.2E.2E and through discussions we discovered that HAPI's behaviour is probably compliant with the spec (see thread for details). Basically, you should not have multiple "writes" (this includes create update and delete) for the same resource in the same transaction.

I think what we can do now, for this particular issue, is to figure out the best way to group squashed changes into requests on a best effort basis. Granted we will not solve the dependency issue fully, but the reality is we will never be able to, because under certain circumstances (i.e. circular dependency plus a small transation page size) there will simply be no way to squash local changes in a way that could result in valid requests. This should be a very rare case and the way to handle it is to throw exceptions and let the application handle it.

but what we can do here is put in some heuristics to make sure in this particular case ager raised, we will have updates on patients before updates on groups due to the direction of the reference.

@ageryck
Copy link
Author

ageryck commented Feb 5, 2024

@jingtang10 after testing the squashing of individual chunks which eliminated the HAPI error, this seems to be the silver bullet in this puzzle, with only one downside of irregular bundle size this solution unblocks and solves the relational dependency by a great margin. We have tested this with sample extracts of prod data and on the first device that had approx 14k records on its local change (translating to approx 28 bundles, only 3 failed and they did for genuine reasons that we are dealing with) another db dump, we had about 2k records on local change, this has synced successfully yet both were failing because of invalid refs

@santosh-pingle santosh-pingle added the P1 High priority issue label Feb 12, 2024
@ageryck
Copy link
Author

ageryck commented Mar 18, 2024

This PerResourcePatchGenerator PR has been merged, we can test against this to close this issue

@ageryck ageryck closed this as completed Jul 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 High priority issue
Projects
Status: Complete
Development

Successfully merging a pull request may close this issue.

3 participants