Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature/ingestion-models #26

Merged
merged 11 commits into from
Dec 10, 2020
Merged

feature/ingestion-models #26

merged 11 commits into from
Dec 10, 2020

Conversation

evamaxfield
Copy link
Member

Pull request recommendations:

  • Name your pull request your-development-type/short-description. Ex: feature/read-tiff-files
  • Link to any relevant issue in the PR description. Ex: Resolves [Add builds for all platforms #12], adds tiff file format support

Resolves #10

  • Provide context of changes.

Building off of @isaacna's work in #20, I first went through how we are currently accepting data in cdptools and then went through and figured out what we would need to add and where to fully utilize the new db schema.

I also edited the base db models and adjusted primary keys where I realized didn't make sense, set some parameters to required as we can either calc / generate them on the fly during the pipeline etc, and fixed documentation.

Finally, I added a bin script to generate documentation specifically for the EventIngestionModel class using the two examples I laid out in the pipeline.ingestion_models module (EXAMPLE_MINIMAL_EVENT and EXAMPLE_FILLED_EVENT).

To see documentation view of these examples:

git clone https://github.com/CouncilDataProject/cdp-backend.git
cd cdp-backend
git checkout feature/ingestion-models
pip install .[dev]
make docs

and navigate to "Event Data Ingestion Model".

Other random note: I made some comments about "generation of router_string on the person class", I looked up how to remove accent characters and seems very possible for us to create such a generator prior to document upload so should be good to go there. StackOverflow example

  • Provide relevant tests for your feature or bug fix.
  • Provide or update documentation for any feature added by your pull request.

Pinging @nniiicc because I think you will like the auto-genned documentation for the ingestion model. Relates back to Packaging Municipal Legislative Event Data for Reuse & Exchange. Speaking of which, where is that poster? The abstract is in zenodo but not the poster and I remember I really liked it.

Thanks for contributing!

@evamaxfield evamaxfield added documentation Improvements or additions to documentation enhancement New feature or request labels Dec 7, 2020
@evamaxfield evamaxfield self-assigned this Dec 7, 2020
@codecov
Copy link

codecov bot commented Dec 7, 2020

Codecov Report

Merging #26 (3d75dd3) into main (ea646ce) will increase coverage by 0.22%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #26      +/-   ##
==========================================
+ Coverage   98.68%   98.91%   +0.22%     
==========================================
  Files          17       20       +3     
  Lines         458      551      +93     
==========================================
+ Hits          452      545      +93     
  Misses          6        6              
Impacted Files Coverage Δ
cdp_backend/database/validators.py 100.00% <ø> (ø)
cdp_backend/database/models.py 100.00% <100.00%> (ø)
cdp_backend/pipeline/ingestion_models.py 100.00% <100.00%> (ø)
cdp_backend/tests/pipeline/__init__.py 100.00% <100.00%> (ø)
...dp_backend/tests/pipeline/test_ingestion_models.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ea646ce...3d75dd3. Read the comment docs.

Copy link

@pulumi pulumi bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍹 The Update (preview) for stack CouncilDataProjectServiceAccount/cdp-infrastructure/cdp-example was successful.

Resource Changes

    Name                                                                     Type                                   Operation
+   cdp-example-firestore-service                                            gcp:projects/service:Service           create
+   matter_file-matter_ref_ASCENDING_name_ASCENDING                          gcp:firestore/index:Index              create
+   event-body_ref_ASCENDING_event_datetime_ASCENDING                        gcp:firestore/index:Index              create
+   event-body_ref_ASCENDING_event_datetime_DESCENDING                       gcp:firestore/index:Index              create
+   matter_status-matter_ref_ASCENDING_update_datetime_DESCENDING            gcp:firestore/index:Index              create
+   event_minutes_item_file-event_minutes_item_ref_ASCENDING_name_ASCENDING  gcp:firestore/index:Index              create
+   transcript-session_ref_ASCENDING_confidence_DESCENDING                   gcp:firestore/index:Index              create
+   cdp-infrastructure-cdp-example                                           pulumi:pulumi:Stack                    create
+   cdp-example-firestore-app                                                gcp:appengine/application:Application  create
+   event_minutes_item-event_ref_ASCENDING_index_ASCENDING                   gcp:firestore/index:Index              create
+   event_minutes_item-event_ref_ASCENDING_index_DESCENDING                  gcp:firestore/index:Index              create
+   matter_status-matter_ref_ASCENDING_update_datetime_ASCENDING             gcp:firestore/index:Index              create
+   cdp-example                                                              CDPStack                               create
+   cdp-example-speech-service                                               gcp:projects/service:Service           create
+   cdp-example-compute-service                                              gcp:projects/service:Service           create
+   transcript-session_ref_ASCENDING_created_DESCENDING                      gcp:firestore/index:Index              create

@evamaxfield
Copy link
Member Author

Oh and @isaacna you were totally correct. There is really no way this could have been done programmatically top-down or something.

Copy link
Collaborator

@isaacna isaacna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Built the docs and the event ingestion model page looks great! I agree with making the EventMinutesItem optional so we can have a more minimal version of the data. Also agree with the other minor tweaks with the database and ingestion models.

Also just out of curiosity, how did you get code coverage to count the ingestion model lines? Is it working now because the models are being used from the test subdirectory?

@evamaxfield
Copy link
Member Author

Also just out of curiosity, how did you get code coverage to count the ingestion model lines? Is it working now because the models are being used from the test subdirectory?

Yea I think its just because they are being used / initialized somewhere.

@tohuynh
Copy link
Collaborator

tohuynh commented Dec 9, 2020

I wasn't able to generate the docs. I got WARNING: Problem with "mdinclude" directive: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 1701: invalid continuation byte and then WARNING: toctree contains reference to document 'event_data_ingestion_model' that doesn't have a title: no link will be generated. Is this because of some encoding issue in event_data_ingestion_model.md?

@isaacna
Copy link
Collaborator

isaacna commented Dec 10, 2020

Also just out of curiosity, how did you get code coverage to count the ingestion model lines? Is it working now because the models are being used from the test subdirectory?

Yea I think its just because they are being used / initialized somewhere.

Hm I guess it just doesn't working trying to create examples via classmethods

@evamaxfield
Copy link
Member Author

evamaxfield commented Dec 10, 2020

@tohuynh

I wasn't able to generate the docs. I got WARNING: Problem with "mdinclude" directive: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 1701: invalid continuation byte and then WARNING: toctree contains reference to document 'event_data_ingestion_model' that doesn't have a title: no link will be generated. Is this because of some encoding issue in event_data_ingestion_model.md?

@tohuynh I just pushed a very small to change to explicitely in UTF-8, can you retry this branch and if it fails can I get the full stack trace?

Copy link

@pulumi pulumi bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍹 The Update (preview) for stack CouncilDataProjectServiceAccount/cdp-infrastructure/cdp-example was successful.

Resource Changes

    Name                                                                     Type                                   Operation
+   cdp-example-compute-service                                              gcp:projects/service:Service           create
+   event-body_ref_ASCENDING_event_datetime_DESCENDING                       gcp:firestore/index:Index              create
+   matter_status-matter_ref_ASCENDING_update_datetime_DESCENDING            gcp:firestore/index:Index              create
+   matter_status-matter_ref_ASCENDING_update_datetime_ASCENDING             gcp:firestore/index:Index              create
+   transcript-session_ref_ASCENDING_created_DESCENDING                      gcp:firestore/index:Index              create
+   cdp-infrastructure-cdp-example                                           pulumi:pulumi:Stack                    create
+   event-body_ref_ASCENDING_event_datetime_ASCENDING                        gcp:firestore/index:Index              create
+   event_minutes_item-event_ref_ASCENDING_index_ASCENDING                   gcp:firestore/index:Index              create
+   event_minutes_item-event_ref_ASCENDING_index_DESCENDING                  gcp:firestore/index:Index              create
+   transcript-session_ref_ASCENDING_confidence_DESCENDING                   gcp:firestore/index:Index              create
+   cdp-example                                                              CDPStack                               create
+   cdp-example-firestore-app                                                gcp:appengine/application:Application  create
+   event_minutes_item_file-event_minutes_item_ref_ASCENDING_name_ASCENDING  gcp:firestore/index:Index              create
+   matter_file-matter_ref_ASCENDING_name_ASCENDING                          gcp:firestore/index:Index              create
+   cdp-example-speech-service                                               gcp:projects/service:Service           create
+   cdp-example-firestore-service                                            gcp:projects/service:Service           create

Copy link
Collaborator

@tohuynh tohuynh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. And thanks for fixing the make docs error. It works for me now.


"""Pipeline package for cdp_backend."""

from .ingestion_models import EventIngestionModel # noqa: F401
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also import the other ingestion models so they could be used in cookiecutter-cdp-deployment or it is not needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah. I think after reading this comment, I would rather just remove this shortener rather than make all of them shorter.

i.e. users would have to from cdp_backend.pipeline.ingestion_models import EventIngestionModel rather than current (from cdp_backend.pipeline import EventIngestionModel)

Keeping them all standard rather than making Event special makes more sense to me.

Copy link

@pulumi pulumi bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍹 The Update (preview) for stack CouncilDataProjectServiceAccount/cdp-infrastructure/cdp-example was successful.

Resource Changes

    Name                                                                     Type                                   Operation
+   cdp-example-firestore-app                                                gcp:appengine/application:Application  create
+   transcript-session_ref_ASCENDING_created_DESCENDING                      gcp:firestore/index:Index              create
+   cdp-example-firestore-service                                            gcp:projects/service:Service           create
+   cdp-example-compute-service                                              gcp:projects/service:Service           create
+   event_minutes_item_file-event_minutes_item_ref_ASCENDING_name_ASCENDING  gcp:firestore/index:Index              create
+   cdp-infrastructure-cdp-example                                           pulumi:pulumi:Stack                    create
+   cdp-example-speech-service                                               gcp:projects/service:Service           create
+   event-body_ref_ASCENDING_event_datetime_ASCENDING                        gcp:firestore/index:Index              create
+   event-body_ref_ASCENDING_event_datetime_DESCENDING                       gcp:firestore/index:Index              create
+   matter_status-matter_ref_ASCENDING_update_datetime_ASCENDING             gcp:firestore/index:Index              create
+   transcript-session_ref_ASCENDING_confidence_DESCENDING                   gcp:firestore/index:Index              create
+   cdp-example                                                              CDPStack                               create
+   event_minutes_item-event_ref_ASCENDING_index_ASCENDING                   gcp:firestore/index:Index              create
+   event_minutes_item-event_ref_ASCENDING_index_DESCENDING                  gcp:firestore/index:Index              create
+   matter_file-matter_ref_ASCENDING_name_ASCENDING                          gcp:firestore/index:Index              create
+   matter_status-matter_ref_ASCENDING_update_datetime_DESCENDING            gcp:firestore/index:Index              create

@evamaxfield evamaxfield merged commit c4d3058 into main Dec 10, 2020
@evamaxfield evamaxfield deleted the feature/ingestion-models branch December 10, 2020 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate minimal event data and minimal session data over from cdptools
3 participants