
"level" parameter in waveletSmooth function #9

Open
danizil opened this issue Jun 3, 2019 · 11 comments
danizil commented Jun 3, 2019

Hi Timothy.
In reviewing your code I ran into issues using the waveletSmooth function (in the directory subrepos/models/wavelet). I think it might be due to a difference in our pywt versions, but the function was doing the wavelet decomposition along the features axis rather than separately for each feature along its time series.
After fixing it, I noticed that the "level" parameter only controls the thresholding of the detail coefficients, using the median of the "level"-th detail band as the threshold for all of them.
I'm hardly a wavelet expert, and have learned it only now for this algorithm, but I changed your code to threshold each band's coefficients according to that band's own median, because that was done in all the denoising sources I have seen.
Could you explain your consideration in choosing one level for all cD thresholding?
cheers!
Danny

@timothyyu

@danizil the waveletSmooth function in the subrepos/models/wavelet directory is from DeepLearning_Financial, a previous attempt to replicate the results of the paper: (https://github.com/mlpanda/DeepLearning_Financial)

I am currently using a modified implementation of that function, seen here:
https://github.com/timothyyu/wsae-lstm/blob/master/wsae_lstm/models/wavelet.py#L27

@timothyyu

[image]

@timothyyu

timothyyu commented Jun 10, 2019

level is 2 as defined in the original paper; as for the axis, I am still looking into how that specific step is tied to the next stage of the model (the stacked autoencoder). I am fairly confident that my implementation is on the right track axis-wise, but I am not infallible. I do recall the feature set seeming incorrectly oriented when using axis=1, but that is something I will have to double-check.
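For reference, this is what a level=2 decomposition returns in pywt on a toy series (made-up data, just to show the structure the thread is discussing):

```python
import numpy as np
import pywt

x = np.sin(np.linspace(0, 8 * np.pi, 256))  # toy series, length 256

# wavedec returns [cA2, cD2, cD1]: one approximation array plus one
# detail array per decomposition level
coeffs = pywt.wavedec(x, "haar", mode="per", level=2)
print([len(c) for c in coeffs])  # → [64, 64, 128]
```

The thresholding step then only touches coeffs[1:], i.e. the detail bands.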

Related/relevant:
#7
#6
https://github.com/timothyyu/wsae-lstm/blob/master/reports/csi300%20index%20data%20tvt%20split%20scale%20denoise%20visual.pdf

@danizil

danizil commented Jun 11, 2019

Hi @timothyyu

  1. What puzzles me is not the decomposition level, which can be toggled, but the fact that we take the threshold from the median of the "level"-th band's coefficients and apply it to all the other bands. That way did work better at reconstruction, though, so I went on without further exploration.

  2. Regarding the axis, the way I understood it is that we're supposed to compress nineteen indicators into ten feature time series (i.e. on the indicator axis and not on the time axis), and then run the compressed features through an LSTM. I imagine that whatever denoising process transforms the dataframe into 19x(DecLvl + 1) is performed on the correct axis (what I mean is that spectral decomposition is only meaningful on the time axis). I'll add my own code underneath; it could be a package version thing.

  3. After exploring for a bit and not being able to get the AE to converge, I think I'll try a new scaling process which compresses the data to the range (0, 1). The main reason is that I wasn't able to figure out how to recreate the scaled original signal with ReLUs or tanhs without a final linear transformation, thereby breaking the symmetry of the AE. Other reasons are that this makes it possible to use sigmoids as in the paper, and I have also seen it used in AE tutorials:
    https://www.youtube.com/watch?v=582irhtQOhw
    at minute 11:39
    https://medium.com/datadriveninvestor/deep-autoencoder-using-keras-b77cd3e8be95
    I think this is the subject for another issue though, and will update on it.
    Here is my code for the wavelet function (besides my comments, which I think explain the process though they are verbose, it contains the two options and a transpose on x):
    Cheers!

```python
import numpy as np
import pywt
from statsmodels.robust import mad  # median absolute deviation

def waveletSmooth(x, wavelet="db4", level=1, DecLvl=2):
    # calculate the wavelet coefficients
    # danny: coeffs is (DecLvl + 1) arrays: one approximation coefficients (cA)
    # array (lowest frequencies) and then DecLvl detail coefficient (cD) arrays;
    # x is (timesteps, features), so transpose to decompose each feature's series
    coeffs = pywt.wavedec(x.T, wavelet, mode="per", level=DecLvl)

    # calculate a threshold
    # danny: mad is the median absolute deviation (not the standard deviation)
    # danny: option 1 (original) - one sigma per feature, taken from the
    # "level"-th detail band only
    sigma = mad(coeffs[-level], axis=1)[:, None]
    # danny: option 2 - give each cD band its own per-feature sigma instead:
    # sigmas = [mad(c, axis=1)[:, None] for c in coeffs[1:]]

    # changing this threshold also changes the behavior,
    # but I have not played with this very much
    # danny: uthresh is the universal threshold - a formula appearing in
    # articles (haven't gone into it)
    uthresh = sigma * np.sqrt(2 * np.log(x.shape[0]))

    # danny: we take [1:] because these are the detail coefficients we denoise;
    # soft thresholding written out with numpy so the per-feature threshold
    # broadcasts over each band
    coeffs[1:] = [np.sign(c) * np.maximum(np.abs(c) - uthresh, 0.0)
                  for c in coeffs[1:]]
    # reconstruct the signal using the thresholded coefficients;
    # trim any sample padded by "per" mode and transpose back
    y = pywt.waverec(coeffs, wavelet, mode="per")
    return y[:, : x.shape[0]].T
```
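To make the two sigma options concrete, here is a toy single-series comparison (the data is made up, and a numpy stand-in replaces statsmodels' mad so the sketch is self-contained):

```python
import numpy as np
import pywt

def mad_sigma(a):
    # median absolute deviation scaled as a Gaussian noise-level estimate
    # (stand-in for statsmodels.robust.mad)
    return np.median(np.abs(a - np.median(a))) / 0.6745

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=256))  # one toy noisy series
coeffs = pywt.wavedec(x, "haar", mode="per", level=2)

# option 1 (original): a single sigma, taken from one detail band,
# applied when thresholding every cD array
sigma_single = mad_sigma(coeffs[-1])
# option 2 (per-level): one sigma per detail band
sigma_per_level = [mad_sigma(c) for c in coeffs[1:]]
print(sigma_single, sigma_per_level)
```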

@timothyyu timothyyu self-assigned this Jul 11, 2019
@timothyyu

related closed issue (duplicate):
#12

@timothyyu

@danizil make sure the wavelet type in your code is haar, not db4. The authors of the WSAE-LSTM paper specifically specify haar; the existing/previous attempt by mlpanda to implement the wavelet stage of this model uses db4, but that is incorrect.

I am still looking into/examining the level-median application/decomposition (#1) and the axis orientation (#2); one of the main issues I'm running into is that the authors of the model were not very specific about particular aspects of its implementation (see #6 and #7 for relevant discussion).

Basically, beyond a certain point, the best academic judgement/practice has to be used to fill in the gaps in the implementation of the model and to correct errors.
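For what it's worth, the practical difference between the two wavelets is visible in the filter lengths pywt reports - haar is the 2-tap Daubechies-1 filter, while db4 uses 8 taps, so it smooths more aggressively across neighboring samples:

```python
import pywt

# decomposition filter lengths of the two candidate wavelets
print(pywt.Wavelet("haar").dec_len)  # → 2
print(pywt.Wavelet("db4").dec_len)   # → 8
```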

@timothyyu

3. After exploring for a bit and not being able to get the AE to converge, I think I'll try a new scaling process which compresses the data to the range (0, 1). The reason is mainly that I wasn't able to figure out how to recreate the scaled original signal with ReLUs or tanhs without using another linear transformation at the end, thereby breaking the symmetry of the AE.

That is one of the fundamental issues that I am looking into - whether the scaling and denoising with the wavelet transform are reversed at some stage before/after the LSTM layer, which is used to make predictions one timestep ahead:
[image]

I can't say I have a definitive answer/solution yet, but I'm going to try more than one method/approach. Unfortunately, the actual journal article does not explicitly detail this component/issue when defining the model and the model pipeline for the price data + technical indicator data.
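As a sketch of the (0, 1) scaling idea (my own toy numbers, not the paper's pipeline): fit the min/max on the training split only, work in scaled space, then invert the transform on the model's output:

```python
import numpy as np

train = np.array([10.0, 20.0, 30.0])   # toy price series (train split only)
lo, hi = train.min(), train.max()

scaled = (train - lo) / (hi - lo)      # compress into (0, 1) for a sigmoid AE
# ... denoise / autoencode / predict in scaled space ...
pred_scaled = 0.5                      # stand-in model output
pred = pred_scaled * (hi - lo) + lo    # invert back to price units → 20.0
```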

@timothyyu

  • Regarding the axis, the way I understood it is that we're supposed to compress nineteen indicators into ten feature time series (i.e. on the indicator axis and not on the time axis), and then run the compressed features through an LSTM. I imagine that whatever denoising process transforms the dataframe into 19x(DecLvl + 1) is performed on the correct axis (spectral decomposition is relevant only on the time axis, is what I mean). I'll add my own code underneath; it could be a package version thing.

Axis decomposition check started; see 707dfb5:
https://github.com/timothyyu/wsae-lstm/blob/master/notebooks/6a_wavelet_axis_decomp_check.ipynb
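A quick shape check along the same lines (toy frame; the 1000x19 shape is made up): axis=0 halves the time dimension per indicator, while axis=1 transforms across the 19 indicators at each timestep and, since 19 is odd, rounds up to 10 columns - which is where an extra column comes from:

```python
import numpy as np
import pywt

data = np.zeros((1000, 19))  # toy frame: 1000 timesteps x 19 indicators

# along the time axis: one transform per indicator column
cA_t, cD_t = pywt.dwt(data, "haar", mode="per", axis=0)
print(cA_t.shape)  # → (500, 19)

# along the feature axis: mixes the indicators at each timestep,
# and ceil(19 / 2) = 10 gives one column more than half
cA_f, cD_f = pywt.dwt(data, "haar", mode="per", axis=1)
print(cA_f.shape)  # → (1000, 10)
```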

[image]
[image]

@timothyyu timothyyu assigned timothyyu and unassigned timothyyu Jul 12, 2019
@timothyyu

the axis=1 wavelet output has an extra column that has to be removed to be accurate feature-wise:
[image]
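That extra column is consistent with how "per" mode pads odd lengths: the reconstruction comes back one sample longer than the input, so it has to be trimmed (toy sketch):

```python
import numpy as np
import pywt

x = np.arange(19.0)  # odd length, like the 19-indicator axis
coeffs = pywt.wavedec(x, "haar", mode="per", level=2)
y = pywt.waverec(coeffs, "haar", mode="per")

print(len(x), len(y))  # → 19 20: periodization padded one sample
y = y[: len(x)]        # chop the padded tail to restore the original length
```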

@timothyyu

timothyyu commented Sep 13, 2019

Will have to double-check, but I believe I was correct initially with axis=0 - it still may be worth running axis=1 in parallel, with the extra column chopped off, as an A/B control or test.

[image]

Regarding the axis, the way I understood it is that we're supposed to compress nineteen indicators into ten feature time series (i.e. on the indicator axis and not on the time axis), and then run the compressed features through an LSTM. I imagine that whatever denoising process transforms the dataframe into 19x(DecLvl + 1) is performed on the correct axis (spectral decomposition is relevant only on the time axis, is what I mean). I'll add my own code underneath; it could be a package version thing.

The authors of the original paper were not explicitly clear or detailed about this aspect of the model - I will also have to take a closer look at/reevaluate how the autoencoder stage is supposed to work on the transformed data (19 x (DecLvl + 1)) before LSTM input.
