Parse FHIR JSON in Python? #20

Closed
gitpushdashf opened this issue Aug 25, 2020 · 7 comments

Comments

@gitpushdashf

google.protobuf.json_format's functions don't seem to work with FHIR JSON, even that produced by Synthea.

Would it be possible to add some kind of function that would convert to FHIR JSON and vice versa?

@gitpushdashf
Author

Actually, it sounds like this will be addressed here: #19 (comment)

Is there another option in the meanwhile?

@nickgeorge
Collaborator

For reference, raw JSON-to-proto functionality like google.protobuf.json_format will never work with FHIR JSON, because there is no 1-to-1 mapping between FHIR JSON and the fields on the protos. The easiest example is "dateTime": it is a string in FHIR JSON, but a well-structured, multi-field object in FhirProto. Another example is extensions on primitives like String: in FHIR JSON these are represented as _myStringField, but in FhirProto extensions are fields on primitive "objects", so myStringField.extension.
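To make the mismatch concrete, here is a rough sketch using only the Python standard library and plain dicts. It is not the FhirProto API: the function names, the "Example" resource, and the output field names (value_us, timezone, precision, value) are illustrative assumptions that loosely mirror the idea of a multi-field primitive. It only handles flat scalar fields and full-precision timestamps with an explicit offset.

```python
from datetime import datetime, timezone


def split_datetime(value: str) -> dict:
    """Split a FHIR dateTime string into separate fields (illustrative only).

    Assumes second precision and an explicit "+hh:mm" / "-hh:mm" offset.
    """
    parsed = datetime.fromisoformat(value)
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    delta = parsed - epoch
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    return {"value_us": micros, "timezone": value[-6:], "precision": "SECOND"}


def merge_primitive_extensions(resource: dict) -> dict:
    """Fold "_field" siblings into the primitive they annotate (illustrative only).

    Only flat scalar fields are handled; nested elements are out of scope here.
    """
    merged = {}
    for key, value in resource.items():
        if key.startswith("_") or key == "resourceType":
            continue
        # Each primitive becomes an object with its own "value" field.
        merged[key] = {"value": value}
    for key, value in resource.items():
        if key.startswith("_"):
            # "_myStringField" carries the extensions for "myStringField".
            merged.setdefault(key[1:], {}).update(value)
    return merged


# Hypothetical input, shaped like FHIR JSON with an extension on a primitive.
example = {
    "resourceType": "Example",
    "myStringField": "hello",
    "_myStringField": {
        "extension": [{"url": "http://example.org/ext", "valueString": "note"}]
    },
}

print(split_datetime("2020-08-25T10:30:00+02:00"))
print(merge_primitive_extensions(example))
```

A generic JSON-to-proto mapper has no way to know about either transformation, which is why a FHIR-aware parser is needed.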

As for other options, I think our time scale is short enough - O(weeks) - that I wouldn't recommend trying to hack together alternatives, but the best option I can think of is, if you have a static FHIR JSON dataset you're interested in using, you could write a quick C++ program to parse the JSON to proto, and persist it on disk as proto binary files that you could read in python.

But, again, I don't really recommend this - in addition to just parsing and printing, the upcoming python release will have common library functions like those in e.g., https://github.com/google/fhir/blob/master/cc/google/fhir/codes.h that make working with FHIR in python a lot easier.

@gitpushdashf
Author

Thank you, Nick! I'm really looking forward to the new release.

We have a Java program that can do it for now, and we may integrate it for the time being.

@gitpushdashf
Author

I'm curious, I see some code that looks like it's for JSON and Proto comparison (https://github.com/google/fhir/tree/master/py/google/fhir/testing) and some code that's Cython (https://github.com/google/fhir/blob/master/bazel/cython.bzl).

Is there a way to compile the C++ code as a Python library that we can call?

While we can convert files one-off with Java, it would be great to do it in real time with Python.

@gitpushdashf
Author

Not really sure how to work bazel, but not seeing any c/python targets. Been trying to select just the bazel/cython.BUILD file but it doesn't seem to load any targets no matter what I do.

localhost:~/repos/fhir$ bazel query //... | grep -i ython
Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 0 packages loaded

If we could import the C++ library into Python it would help a ton.

@nickgeorge
Collaborator

It is possible - the specific cython dependency you're seeing there is related to some machine learning code - specifically, the Bundle-to-seqex code is in C++ and is wrapped by the python tensorflow code. Theoretically the same could work for other aspects of the C++ library, e.g., profiling, but we'd rather invest the time and maintenance burden in native implementations. What would be really nice is if there were a pattern that would allow consumers of the fhirproto library to write their own cython wrappers - that way people could leverage the parts they needed, and switch over to native implementations when they become available. I can take a look into what that would look like.

As always, thanks a lot for the feedback - it's really helpful to me to hear what new features are top of mind for y'all!

@Cam2337
Collaborator

Cam2337 commented Oct 7, 2020

Per conversation over in #19, as of 0.6.0 we have native Python3 parsing, printing, and basic resource validation support for FHIR STU3 and FHIR R4 (among other functionality for extensions, codes, etc.). For more information on the available pypi package, see: https://pypi.org/project/google-fhir/.

You can see https://github.com/google/fhir-examples/tree/master/py/google/fhir_examples for some additional Python FHIR examples around json_format and resource validation.

Note that we also have full bazel support, and so if you would rather, you can simply clone the repository, and bazel build/test as needed. We strongly recommend performing this in a venv to help ensure that the execution of the Python interpreter by bazel is hermetically-sealed.

We're continuing to work on adding stronger validation support (akin to what is available in C++ with fhir_path), as well as continuing to profile+test+iterate.

Going to mark this issue as "closed" for now, but feel free to re-open or create a new issue if you have any questions/comments/concerns. We'd love your feedback and are excited to keep working on this with the community! :)

@Cam2337 closed this as completed Oct 7, 2020