
WT in the paper leaks info #6

Open
JannyKul opened this issue Feb 26, 2019 · 13 comments
Assignees
Labels
priority-1 question Further information is requested

Comments

@JannyKul

Hey Timothy, I added a comment on DeepLearning_Financial about this too, and tried to expand on it here. There's no other way they get to the results they do.

Interested in your thoughts

@timothyyu
Owner

timothyyu commented Mar 1, 2019

I am skeptical about the results from the model as described in the paper - that is why I am attempting to replicate the model and apply it to the raw data.

The deeplearning_financial implementation of the WSAE-LSTM model isn't exactly 1:1 with how it's described in the source journal - one example is that the Haar wavelet is supposed to be the wavelet transform (WT) type used, but the author uses the db4/db6 wavelet type instead:
https://github.com/mlpanda/DeepLearning_Financial/blob/master/models/wavelet.py#L5
[screenshot]
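As a quick illustration of the discrepancy (a sketch, not code from either repo): Haar and db4 are different wavelet families with different filter lengths, so swapping db4 in for haar changes the decomposition even with identical settings.

```python
# Sketch: the paper specifies the Haar wavelet, while the mlpanda
# implementation uses db4. The filter lengths alone differ, so the
# resulting coefficients (and the denoised series) will differ too.
import pywt

haar = pywt.Wavelet("haar")
db4 = pywt.Wavelet("db4")

print(haar.dec_len)  # 2 - Haar uses a 2-tap decomposition filter
print(db4.dec_len)   # 8 - db4 uses an 8-tap decomposition filter
```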

@timothyyu
Owner

timothyyu commented Mar 1, 2019

@JannyKul by applying the wavelet transform separately on each train-validate-test split, no future information should be allowed to leak into the model:

Example applied to train-validate-test split of first csci300 index data segment:
[screenshot]

@timothyyu timothyyu self-assigned this Mar 1, 2019
@timothyyu
Owner

see #7 & 8073c42

the way I implemented the train-validate-test split - fitting the scaling with RobustScaler on the train set, then transforming the validate and test sets with it (per period) - avoids/sidesteps both the wavelet transform and the scaling leaking future info/data
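A minimal sketch of that scaling scheme (array names are illustrative, not from the repo): fit RobustScaler on the train split only, then reuse the fitted scaler on validate and test.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

rng = np.random.default_rng(0)
train = rng.normal(size=(100, 1))
validate = rng.normal(size=(20, 1))
test = rng.normal(size=(20, 1))

scaler = RobustScaler()
train_scaled = scaler.fit_transform(train)    # fit statistics on train only
validate_scaled = scaler.transform(validate)  # reuse train median/IQR
test_scaled = scaler.transform(test)          # no future data leaks in
```

Because RobustScaler centers on the median and scales by the IQR, the validate/test transforms here use only statistics observable at train time.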

@timothyyu timothyyu reopened this Mar 2, 2019
@timothyyu
Owner

issue reopened - this has not been entirely resolved yet, but I am confident I am on the right track

  1. for each index dataset ---> split into 24 intervals [DONE]
  2. split each of those 24 intervals into train-validate-test splits [DONE]
  3. for each train period, scale with fit_transform, and then apply the scaling from the train set on the validate and test sets, respectively
  4. then apply the wavelet transform on the scaled train set, and reuse the threshold/sigma values from the train set on the validate and test sets, respectively - if the wavelet transform is instead applied independently to the validate and test sets, future data may leak into the model
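Steps 1-2 above can be sketched as follows (the 24-interval count comes from the list; the 0.75/0.125/0.125 split proportions are my illustrative assumption, not taken from the paper):

```python
import numpy as np

def split_into_intervals(series, n_intervals=24):
    # step 1: split one index dataset into 24 intervals
    return np.array_split(series, n_intervals)

def train_validate_test(interval, train_frac=0.75, validate_frac=0.125):
    # step 2: split one interval into train/validate/test slices
    n = len(interval)
    i = int(n * train_frac)
    j = int(n * (train_frac + validate_frac))
    return interval[:i], interval[i:j], interval[j:]

series = np.arange(2400, dtype=float)  # stand-in for one index series
splits = [train_validate_test(iv) for iv in split_into_intervals(series)]
```

Each of the 24 interval splits can then be scaled and wavelet-transformed independently, per steps 3-4.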

@timothyyu
Owner

timothyyu commented Mar 3, 2019

  1. for each train period, scale with fit_transform, and then apply the scaling from the train set on the validate and test sets, respectively

Implemented as of v0.1.2 / b715d88
https://github.com/timothyyu/wsae-lstm/releases/tag/v0.1.2

[screenshot]

@asdfqwer2015

Hope to see the amazing results. Please keep refining :)

@timothyyu
Owner

import numpy as np
import pywt
from statsmodels.robust import mad

def waveletSmooth(x, wavelet="haar", level=2, declevel=2):
    # calculate the wavelet coefficients
    coeff = pywt.wavedec(x, wavelet, mode='periodization', level=declevel, axis=0)
    # calculate a threshold: sigma is the median absolute deviation of the
    # detail coefficients at the given level
    sigma = mad(coeff[-level])
    # universal threshold (Donoho-Johnstone)
    uthresh = sigma * np.sqrt(2 * np.log(len(x)))
    # hard-threshold the detail coefficients
    coeff[1:] = (pywt.threshold(i, value=uthresh, mode="hard") for i in coeff[1:])
    # reconstruct the signal using the thresholded coefficients
    y = pywt.waverec(coeff, wavelet, mode='periodization', axis=0)
    return y, sigma, uthresh

@timothyyu
Owner

[screenshot]

@timothyyu
Owner

@JannyKul to prevent/sidestep the issue of the wavelet transform leaking data into the rest of the model, I'm going to see if I can save the sigma and uthresh values from each train set, and then use those values on the validate and test sets, respectively.

While I'm almost certain this will lower the overall accuracy of the model, it is a technically more correct/accurate approach to preventing data from leaking.

That being said, running the wavelet transform independently on each train-validate-test split, when there are clearly defined intervals for each split, sets the groundwork/precedent for an online/constantly trained version of the model, where the parameters for sigma and uthresh are adjusted on the fly between intervals - so it's not entirely technically incorrect, either.
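A sketch of that idea, reusing the train-derived threshold on later splits (`waveletSmoothFixed` is a hypothetical variant of the `waveletSmooth` function above, not code from the repo; the 0.6745 constant is the usual normal-consistency scaling for the MAD):

```python
import numpy as np
import pywt

def waveletSmoothFixed(x, uthresh, wavelet="haar", declevel=2):
    # denoise with a threshold computed elsewhere (e.g. on the train set)
    coeff = pywt.wavedec(x, wavelet, mode="periodization", level=declevel, axis=0)
    coeff[1:] = [pywt.threshold(c, value=uthresh, mode="hard") for c in coeff[1:]]
    return pywt.waverec(coeff, wavelet, mode="periodization", axis=0)

rng = np.random.default_rng(0)
train = rng.normal(size=128)

# derive sigma/uthresh from the train split only (as waveletSmooth does)
coeff = pywt.wavedec(train, "haar", mode="periodization", level=2, axis=0)
detail = coeff[-2]
sigma = np.median(np.abs(detail - np.median(detail))) / 0.6745  # MAD estimate
uthresh = sigma * np.sqrt(2 * np.log(len(train)))

# apply the saved threshold to validate/test - no future data enters
validate = rng.normal(size=64)
validate_denoised = waveletSmoothFixed(validate, uthresh)
```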

@timothyyu
Owner

timothyyu commented Mar 4, 2019

EDIT:
However, if the sigma and uthresh values don't match when reversing the denoising process, then the reconstructed signal is going to be invalid - and since there are clearly defined intervals + prediction periods, applying the transform independently should be fine. I will save the calculated sigma and uthresh values to revisit this aspect of the model later.

@JannyKul
Author

@timothyyu I agree with your thought process, but if we just take a step back for a moment and start with what we know: we can use ML to give us either a momentum signal or a mean-reversion signal. Which we construct depends on the features we extract from the time series. If we smooth out the over-reaction/under-reaction movements using a simple moving average and train with this, we're creating a momentum signal. If we feature-engineer with volatility/range, then we create a mean-reversionary signal.

A WT doesn't actually help us with either; we cut off the outliers, so trying to get our model to find a mean-reversionary signal will fail, and the direction changes are so erratic that a momentum signal will fail too. I suspect your technique of applying WT on train/val/test separately will produce a model that fits very well on the train set but never generalises to the val/test set.

Saving sigma and uthresh is a good next logical step, but I definitely recall the paper not making mention of this, and I found a few older papers applying WT and coming to ridiculously high accuracy rates too, without making mention of this either. I basically came to the conclusion that for these authors, earning an accreditation to further their career was more important than advancing human knowledge. I suspect, much like the Bre-X scandal of the '90s, we are both looking for something that quite simply doesn't exist.

I'd definitely be interested if you make any progress, so please do keep me updated so I can sheepishly retract my comment. Also, I haven't gone through your other notebook re: the crypto order book yet, but intuitively that's a technique that should work - good luck sir!

@timothyyu
Owner

timothyyu commented Mar 15, 2019

@JannyKul
Your comment(s)/criticism of the model design and existing attempts to replicate/implement said model are valid - they do not need retraction.

For a streaming/online model, a dynamic sigma & uthresh that is adjusted at set intervals will probably work, but for a batch fed model/static model, saving the sigma & uthresh values is a critical step that I think is essential toward evaluating this kind of model properly.

In attempting to replicate the results of the paper, I fully intend to go beyond what is described in the original paper (and existing attempts to implement said model) - if there are errors in implementation or design, I will use my best academic/empirical judgement to evaluate said error and address it.

Also, see issue #7 - it is highly relevant to this issue, specifically in the application of scaling/denoising.

@timothyyu
Owner

incomplete/work in progress:
9c796bf#diff-3c3d6d5243e1476c8c1f21078c759772R39

@timothyyu timothyyu added the question Further information is requested label Aug 21, 2019