
training for causal graph generation #1

Closed
arunvellat opened this issue Mar 12, 2019 · 4 comments

arunvellat commented Mar 12, 2019

Hello,
It seems that, in each training epoch, a single training sample is used (consisting of X, the tensor of all time series with the target time-shifted, and Y, the target time series). Am I missing something, or is this how it was intended?
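(For context, here is a minimal sketch of what such a single training sample could look like, assuming one CSV with the N series as columns; the file name, the choice of target column, and the shapes are illustrative assumptions, not TCDF's actual data-loading code.)

```python
import pandas as pd
import torch

# Illustration only (not TCDF's actual code): one training sample per target series.
# "timeseries.csv" and the choice of target column are hypothetical placeholders.
df = pd.read_csv("timeseries.csv")     # all N time series, one column each
target = df.columns[0]                 # pick one series as the prediction target

shifted = df.copy()
shifted[target] = shifted[target].shift(1).fillna(0.0)  # time-shift the target so it cannot copy itself

X = torch.tensor(shifted.values.T, dtype=torch.float32).unsqueeze(0)  # shape (1, N, T): a single sample
Y = torch.tensor(df[target].values, dtype=torch.float32).unsqueeze(0)  # shape (1, T): the target series
```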

M-Nauta (Owner) commented Mar 12, 2019

You are right, TCDF can currently only handle one training sample (consisting of multiple time series). This was intentional, since it was suitable for the datasets we used; Section 5.1 of our paper describes the datasets used in our experiments. However, we can imagine that there are temporal datasets with multiple instances. We might extend TCDF to handle this in the future, but it is currently not supported.

ritchieng commented

You can try a rolling fixed window: divide the long time series into multiple samples and leave small gaps between them as embargoes, specifically for this case. That way you can check whether training on samples up to T - 100 gives similar out-of-sample performance (or only a slight deterioration, not a total breakdown) on the period from T - 100 to T.
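A minimal sketch of that rolling-window idea, assuming one long CSV with the N series as columns; the window length, embargo size, and file names are placeholder choices, not anything prescribed by TCDF:

```python
import pandas as pd

def rolling_windows(df: pd.DataFrame, window: int = 500, embargo: int = 20):
    """Yield consecutive fixed-size windows of df, skipping `embargo` rows between windows."""
    start = 0
    while start + window <= len(df):
        yield df.iloc[start:start + window]
        start += window + embargo  # small gap so adjacent windows do not leak into each other

df = pd.read_csv("timeseries.csv")  # one long sample: N time series as columns
for i, chunk in enumerate(rolling_windows(df)):
    chunk.to_csv(f"window_{i}.csv", index=False)  # each window can then serve as its own training sample
```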

M-Nauta closed this as completed Mar 14, 2019
arunvellat (Author) commented

Suppose I have only one replication of N time series (i.e. a single CSV file with N time series) on which I run the model. For the prediction task (evaluate_prediction.py), how would your approach of using one training sample consisting of the full time series compare with breaking the time series into smaller chunks and learning from several training samples? Also, have you considered the case where the causal dependency graph varies with time?

M-Nauta reopened this Mar 15, 2019
M-Nauta (Owner) commented Mar 15, 2019

We have not yet considered the case where the causal graph changes over time; this is an interesting (and probably challenging) topic for future work. One way to compare graphs over time is to break up all the time series into multiple chunks and apply TCDF to each chunk (consisting of multiple time series), so that the resulting causal graphs can be compared. The chunks should not be too short, though, since TCDF performs better on longer time series. This could give you an indication of changing causal relationships.

Regarding your question on prediction performance: this depends on the data. If the underlying causal relationships in the data are stable (i.e. one static underlying causal graph), I expect that TCDF will predict the time series more accurately when the data is not chunked (thus using the whole dataset), simply because there is then more data to train on. However, if the causal relationships change over time, I expect that TCDF will predict more accurately when the data is divided into smaller chunks, since each chunk then contains the causal relationships of that time period.
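A rough sketch of that chunk-and-compare workflow, assuming a single CSV with the N series as columns. The number of chunks is arbitrary, TCDF itself is only indicated in a comment (run each chunk file through the repository's runTCDF.py or whichever entry point you normally use), and the edge sets at the end are placeholders showing how the per-chunk graphs could be compared:

```python
import pandas as pd

def split_into_chunks(csv_path: str, n_chunks: int = 4) -> list:
    """Split one CSV of N time series into contiguous chunks and write each chunk to its own file."""
    df = pd.read_csv(csv_path)
    size = len(df) // n_chunks
    paths = []
    for i in range(n_chunks):
        chunk = df.iloc[i * size:(i + 1) * size]
        path = f"chunk_{i}.csv"
        chunk.to_csv(path, index=False)
        paths.append(path)
    return paths

chunk_files = split_into_chunks("timeseries.csv")

# Run TCDF on each chunk file separately (for example with the repository's runTCDF.py script),
# then compare the discovered cause -> effect edges per chunk. Placeholder edge sets shown here:
edges_chunk_0 = {("series_a", "series_b"), ("series_c", "series_b")}
edges_chunk_1 = {("series_a", "series_b")}
print("edges that disappeared:", edges_chunk_0 - edges_chunk_1)
print("edges that appeared:", edges_chunk_1 - edges_chunk_0)
```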

M-Nauta closed this as completed Mar 21, 2019