Create estimation validation loop #136

Tanmay-Kulkarni101 · 2020-04-30T10:51:34Z

Change the behavior of the Dummy Outcome Refuter so that comparisons are made while keeping the value of the treatment constant.

We fit an estimator on the data where t = t_i and then use it to predict the outcome for the other values of t. This allows us to isolate the causal effect of t from y during the optimization process.
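A minimal sketch of this idea in plain Python, not the code in this PR. The `(t, x, y)` row layout, the single covariate, the linear fit, and the names `fit_line` and `dummy_outcome_by_treatment` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one covariate)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return mean_y - slope * mean_x, slope


def dummy_outcome_by_treatment(rows, train_treatment):
    """Fit y ~ x on rows with t == train_treatment, then predict y for every other group."""
    groups = defaultdict(list)
    for t, x, y in rows:
        groups[t].append((x, y))

    train = groups[train_treatment]
    intercept, slope = fit_line([x for x, _ in train], [y for _, y in train])

    predictions = {}
    for t, members in groups.items():
        if t == train_treatment:
            continue  # the training group is not scored
        predictions[t] = [intercept + slope * x for x, _ in members]
    return predictions


# Synthetic data: y depends on x and on the treatment t.
rng = random.Random(0)
rows = []
for _ in range(200):
    t = rng.choice([0, 1])
    x = rng.uniform(0, 10)
    rows.append((t, x, 3.0 * x + 5.0 * t + rng.gauss(0, 0.1)))

preds = dummy_outcome_by_treatment(rows, train_treatment=0)
```

Because the model only ever sees rows with t = 0, its predictions for the t = 1 rows follow the covariate and carry none of the treatment shift.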

amit-sharma

Looks good to me so far. Added a suggestion on how to simplify the data_preprocess method.
I guess you are now working on running the dummy outcome generator on the other chunks and then evaluating the effect?

dowhy/causal_refuters/dummy_outcome_refuter.py

Tanmay-Kulkarni101 · 2020-05-01T09:27:51Z

In a nutshell, the plan is to train on one partition of the DataFrame and then use the trained estimator on the remaining part of the original data.

amit-sharma

The logic is correct, but can be simplified in many cases.
Also, there is a larger question of what to do when the pipeline actions contain only zero, noise or permute. In that case it does not make sense to create groups or a validation set. I gave a suggestion in the inline comments. Let me know what you think.
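One way to picture the suggestion, as a sketch under assumptions rather than the refuter's actual code: the pipeline is represented here as a list of hypothetical `(action, func_args)` pairs, and `needs_estimator` and `apply_simple_pipeline` are made-up helper names. When every action is zero, noise or permute, the transformations are applied to the whole outcome column and no groups are formed.

```python
import random

# Assumed names for the actions that need no trained estimator.
SIMPLE_ACTIONS = {"zero", "noise", "permute"}


def needs_estimator(transformations):
    """True if any action in the pipeline is not one of the simple actions."""
    return any(action not in SIMPLE_ACTIONS for action, _ in transformations)


def apply_simple_pipeline(outcome, transformations, rng):
    """Apply zero / noise / permute, in order, to a copy of the full outcome column."""
    result = list(outcome)
    for action, func_args in transformations:
        if action == "zero":
            result = [0.0] * len(result)
        elif action == "noise":
            std_dev = func_args.get("std_dev", 1.0)
            result = [y + rng.gauss(0.0, std_dev) for y in result]
        elif action == "permute":
            rng.shuffle(result)
        else:
            raise ValueError(f"{action!r} needs an estimator")
    return result


outcome = [1.0, 2.0, 3.0, 4.0]
pipeline = [("permute", {}), ("noise", {"std_dev": 0.5})]

if not needs_estimator(pipeline):
    # No grouping and no validation split: transform all rows directly.
    dummy = apply_simple_pipeline(outcome, pipeline, random.Random(0))
```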

dowhy/causal_refuters/dummy_outcome_refuter.py

- move func_args to the end of argument list
- replace X_chunk by X_train

All function calls now pass func_args as a variable-length keyword argument list (**kwargs). Accordingly, all function calls and examples in the documentation have been changed.
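A small sketch of the calling convention described here. The transformation `shift_outcome`, its `offset` option, and the four positional parameters are hypothetical; only `X_train` and `func_args` are named in this PR.

```python
def shift_outcome(X_train, outcome_train, X_validation, outcome_validation, **func_args):
    """Hypothetical transformation: add a constant offset to the validation outcome."""
    offset = func_args.get("offset", 0.0)  # optional settings arrive as keywords
    return [y + offset for y in outcome_validation]


# The options are kept as a dict next to the action and unpacked at the call site,
# so func_args always comes last.
action, func_args = shift_outcome, {"offset": 2.0}
shifted = action(None, None, None, [1.0, 2.0], **func_args)
```

Keeping the keyword arguments last means a transformation with no options can be called with the positional arguments alone.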

dowhy/causal_refuters/dummy_outcome_refuter.py

- Move the no_estimator check out of the simulation loop
- Replace np.zeros with None
- Use a common variable validation_df to refer to the validation data
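The three changes can be sketched roughly as follows. This is an illustration under assumptions, not the code in this commit: plain lists stand in for DataFrames, and `split_rows`, `simulate`, and the `(action, func_args)` pipeline layout are hypothetical.

```python
import random

SIMPLE_ACTIONS = {"zero", "noise", "permute"}  # assumed action names


def split_rows(rows, rng, train_fraction=0.5):
    """Shuffle a copy of the rows and cut it into train and validation parts."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def simulate(rows, transformations, num_simulations, seed=0):
    rng = random.Random(seed)
    # Decided once, before the loop: the pipeline does not change between runs.
    no_estimator = all(action in SIMPLE_ACTIONS for action, _ in transformations)

    runs = []
    for _ in range(num_simulations):
        if no_estimator:
            # Nothing to train on, so None marks the training data as absent
            # instead of a zero-filled placeholder.
            train_df, validation_df = None, list(rows)
        else:
            train_df, validation_df = split_rows(rows, rng)
        # Later steps read validation_df regardless of which branch ran.
        runs.append((train_df, validation_df))
    return runs
```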

amit-sharma · 2020-05-06T06:58:13Z

Thanks @Tanmay-Kulkarni101. This looks ready to merge once you add a few tests.

tests/causal_refuters/test_dummy_outcome_refuter.py
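As an illustration of the kind of property such a test could check (this is not the content of the test file above and does not call DoWhy): once the outcome is permuted, a simple effect estimate should be close to zero. The `difference_in_means` estimator and the synthetic data are assumptions chosen to keep the sketch dependency-free.

```python
import random


def difference_in_means(treatment, outcome):
    """Naive effect estimate: mean outcome of treated minus mean outcome of control."""
    treated = [y for t, y in zip(treatment, outcome) if t == 1]
    control = [y for t, y in zip(treatment, outcome) if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def test_permuted_outcome_has_no_effect():
    rng = random.Random(42)
    treatment = [rng.choice([0, 1]) for _ in range(2000)]
    outcome = [10.0 * t + rng.gauss(0, 1) for t in treatment]

    # The real outcome carries a large treatment effect.
    assert difference_in_means(treatment, outcome) > 9.0

    # Permuting the outcome breaks its link to the treatment.
    dummy = list(outcome)
    rng.shuffle(dummy)
    assert abs(difference_in_means(treatment, dummy)) < 1.0


test_permuted_outcome_has_no_effect()
```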

Tanmay-Kulkarni101 added 4 commits April 30, 2020 16:09

WIP: Create cross validation study

d8f9b35

edit inputs for estimators

b4fd146

add preamble comment

e0a801a

remove redundant variables

ea9de4d

amit-sharma reviewed Apr 30, 2020

rename the preprocess_data to preprocess_data_by_treatment

2b6d68d

process test and validation with parallel operations

3829f80

amit-sharma requested changes May 5, 2020

Tanmay-Kulkarni101 added 8 commits May 5, 2020 18:50

scalable bucket size

d145be1

handle buckets with few data points

ea05b2e

replace new_outcome with outcome

c2c1f2a

standardize the behavior of estimate_dummy_outcome

a29de33

standardize function calls

02e93c1

remove redundant code

6c9b45b

add optimization for no estimator pipelines

b782878

add simulation loop

9a97936

Tanmay-Kulkarni101 requested a review from amit-sharma May 5, 2020 16:30