Set assertions in/write assertions to human readable file? #130

Open
pschloss opened this issue Dec 28, 2023 · 1 comment

Comments

@pschloss

I am wondering whether you all had previously considered reading in assert/verify statements from a file.

In my explorations of TDD with data analysis, I came across the nifty tdda Python package. It appears to overlap a fair amount with what is doable in assertr. My sense from their white papers is that tdda can algorithmically create constraints by summarizing the columns in a data frame and writing them to a file or some other data structure. Those constraints can then be fine-tuned by hand and re-used in subsequent assertions when new data is considered.
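To make the "summarize columns into constraints" idea concrete, here is a minimal sketch in plain Python. It is not tdda's actual API or output format; the function name `discover_constraints`, the constraint keys (`type`, `min`, `max`, `allow_null`), and the sample data are all hypothetical and chosen only for illustration.

```python
import json


def discover_constraints(rows):
    """Summarize each column of a list-of-dicts table into simple constraints.

    Hypothetical helper: the keys below are illustrative, not tdda's format.
    """
    # Gather the values seen in each column.
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)

    constraints = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        spec = {"allow_null": len(present) < len(values)}
        if present:
            spec["type"] = type(present[0]).__name__
            # Only numeric columns get a min/max range.
            numeric = all(
                isinstance(v, (int, float)) and not isinstance(v, bool)
                for v in present
            )
            if numeric:
                spec["min"] = min(present)
                spec["max"] = max(present)
        constraints[name] = spec
    return constraints


# Made-up example data.
rows = [
    {"sample": "a", "reads": 1200},
    {"sample": "b", "reads": 3400},
    {"sample": "c", "reads": None},
]

constraints = discover_constraints(rows)
# The JSON text is what would be saved to a file and edited by hand.
constraints_json = json.dumps(constraints, indent=2, sort_keys=True)
print(constraints_json)
```

The generated JSON is only a starting point: the observed range for `reads` would probably be widened by hand before it is re-used on new data.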

I think it would be pretty powerful to have a YAML or JSON file that specifies the assertions for a data file, which assertr could load and apply to a data frame within a pipeline. A benefit of a file-based approach is that the file could also serve as a kind of data dictionary that would be more readable than assertr code.
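As a rough illustration of what such a file-driven check could look like, here is a sketch in plain Python. It is not an existing assertr or tdda feature; the spec layout, the `verify` function, and the column names are assumptions made up for this example, and the spec is inlined as a string where a real workflow would read it from a file.

```python
import json

# Hypothetical spec; in practice this text would live in its own JSON file
# and double as a small data dictionary.
SPEC_JSON = """
{
  "sample": {"type": "str", "allow_null": false},
  "reads": {"type": "int", "min": 0, "max": 100000, "allow_null": true}
}
"""

# Map the type names used in the spec to Python types.
TYPES = {"str": str, "int": int, "float": float}


def verify(rows, spec):
    """Return a list of human-readable violations; an empty list means success."""
    failures = []
    for i, row in enumerate(rows):
        for column, rule in spec.items():
            value = row.get(column)
            if value is None:
                if not rule.get("allow_null", False):
                    failures.append(f"row {i}: {column} is missing")
                continue
            if "type" in rule and not isinstance(value, TYPES[rule["type"]]):
                failures.append(f"row {i}: {column} is not {rule['type']}")
                continue
            if "min" in rule and value < rule["min"]:
                failures.append(f"row {i}: {column} is below min {rule['min']}")
            if "max" in rule and value > rule["max"]:
                failures.append(f"row {i}: {column} is above max {rule['max']}")
    return failures


spec = json.loads(SPEC_JSON)

good = [{"sample": "a", "reads": 1200}]
bad = [{"sample": None, "reads": -5}]

good_failures = verify(good, spec)
bad_failures = verify(bad, spec)
print(good_failures)
print(bad_failures)
```

Collecting every violation instead of stopping at the first one is a design choice here; a pipeline could just as well raise an error on the first failure.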

I suspect this might take a fair amount of effort to implement, and I was curious whether it is already on your roadmap or whether you would be interested in contributions along these lines.

@tonyfischetti
Owner

That's so funny you mention that. About a week back I realized I would have to implement something like this for the work I'm doing now.
I'm not sure I want to have assertions written purely in JSON/YAML, or any other markup language.
But maybe an assertr chain can be expressed in a separate file and then included/sourced by the analysis script.

I'll have to do some thinking about the best way to implement it (I want assertr to do the right thing).
I'd love to hear more about your experience with tdda and anything else you were thinking regarding this idea.
