Weighted Fourier Transform: Spectral Gating & Main Frequency🙏🏻 This drop has 2 purposes:
1) to inform every1 who'd ever see it that Weighted Fourier Tranform does exist, while being available nowhere online, not even in papers, yet there's nothing incredibly complicated about it, and it can/should be used in certain cases;
2) to show TradingView users how they can use it now in dem endevours, to show em what spectral filtering is, and what can they do with all of it in diy mode.
... so we gonna have 2 sections in the description
Section 1: Weighted Fourier Transform
It's quite easy to include weights in Fourier analysis: you just premultiply each datapoint by its corresponding weight -> feed to direct Fourier Transform, and then divide by weights after inverse Fourier transform. Alternatevely, in direct transform you just multiply contributions of each data point to the real and imaginary parts of the Fourier transform by corresponding weights (in accumulation phase), and in inverse transform you divide by weights instead during the accumulation phase. Everything else stays the same just like in non-weighted version.
If you're from the first target group let's say, you prolly know a thing or deux about how to code & about Fourier Transform, so you can just check lines of code to see the implementation of Weighted Discrete version of Fourier Transform, and port it to to any technology you desire. Pine Script is a developing technology that is incredibly comfortable in use for quant-related tasks and anything involving time series in general. While also using Python for research and C++ for development, every time I can do what I want in Pine Script, I reach for it and never touch matlab, python, R, or anything else.
Weighted version allows you to explicetly include order/time information into the operation, which is essential with every time series, although not widely used in mainstream just as many other obvious and right things. If you think deeply, you'll understand that you can apply a usual non-weighted Fourier to any 2d+ data you can (even if none of these dimensions represent time), because this is a geometric tool in essence. By applying linearly decaying weights inside Fourier transform, you're explicetly saying, "one of these dimensions is Time, and weights represent the order". And obviously you can combine multiple weightings, eg time and another characteristic of each datum, allows you to include another non-spatial dimension in your model.
By doing that, on properly processed (not only stationary but Also centered around zero data), you can get some interesting results that you won't be able to recreate without weights:
^^ A sine wave, centered around zero, period of 16. Gray line made by: DWFT (direct weighted Fourier transform) -> spectral gating -> IWFT (inverse weighted Fourier transform) -> plotting the last value of gated reconstructed data, all applied to expanding window. Look how precisely it follows the original data (the sine wave) with no lag at all. This can't be done by using non-weighted version of Fourier transform.
^^ spectral filtering applied to the whole dataset, calculated on the latest data update
And you should never forget about Fast Fourier Transform, tho it needs recursion...
Section 2: About use cases for quant trading, about this particular implementaion in Pine Script 6 (currently the latest version as of Friday 13, December 2k24).
Given the current state of things, we have certain limits on matrix size on TradingView (and we need big dope matrixes to calculate polynomial regression -> detrend & center our data before Fourier), and recursion is not yet available in Pine Script, so the script works on short datasets only, and requires some time.
A note on detrending. For quality results, Fourier Transform should be applied to not only stationary but also centered around zero data. The rightest way to do detrending of time series
is to fit Cumulative Weighted Moving Polynomial Regression (known as WLSMA in some narrow circles xD) and calculate the deltas between datapoint at time t and this wonderful fit at time t. That's exactly what you see on the main chart of script description: notice the distances between chart and WLSMA, now look lower and see how it matches the distances between zero and purple line in WFT study. Using residuals of one regression fit of the whole dataset makes less sense in time series context, we break some 'time' and order rules in a way, tho not many understand/cares abouit it in mainstream quant industry.
Two ways of using the script:
Spectral Gating aka Spectral filtering. Frequency domain filtering is quite responsive and for a greater computational cost does not introduce a lag the way it works with time-domain filtering. Works this way: direct Fourier transform your data to get frequency & phase info -> compute power spectrum out of it -> zero out all dem freqs that ain't hit your threshold -> inverse Fourier tranform what's left -> repeat at each datapoint plotting the very first value of reconstructed array*. With this you can watch for zero crossings to make appropriate trading decisions.
^^ plot Freq pass to use the script this way, use Level setting to control the intensity of gating. These 3 only available values: -1, 0 and 1, are the general & natural ones.
* if you turn on labels in script's style settings, you see the gray dots perfectly fitting your data. They get recalculated (for the whole dataset) at each update. You call it repainting, this is for analytical & aesthetic purposes. Included for demonstration only.
Finding main/dominant frequency & period. You can use it to set up Length for your other studies, and for analytical purposes simply to understand the periodicity of your data.
^^ plot main frequency/main period to use the script this way. On the screenshot, you can see the script applied to sine wave of period 16, notice how many datapoints it took the algo to figure out the signal's period quite good in expanding window mode
Now what's the next step? You can try applying signal windowing techniques to make it all less data-driven but your ego-driven, make a weighted periodogram or autocorrelogram (check Wiener-Khinchin Theorem ), and maybe whole shiny spectrogram?
... you decide, choice is yours,
The butterfly reflect the doors ...
∞
Spectral
Quick scan for cycles🙏🏻
The followup for
As I told before, ML based algorading is all about detecting any kind of non-randomness & exploiting it (cuz allegedly u cant trade randomness), and cycles are legit patterns that can be leveraged
But bro would u really apply Fourier / Wavelets / 'whatever else heavy' on every update of thousands of datasets, esp in real time on HFT / nearly HFT data? That's why this metric. It works much faster & eats hell of a less electicity, will do initial rough filtering of time series that might contain any kind of cyclic behaviour. And then, only on these filtered datasets u gonna put Periodograms / Autocorrelograms and see what's going there for real. Better to do it 10x times less a day on 10x less datasets, right?
I ended up with 2 methods / formulas, I called em 'type 0' and 'type 1':
- type 0: takes sum of abs deviations from drift line, scales it by max abs deviation from the same drift line;
- type 1: takes sum of abs deviations from drift line, scales it by range of non-abs deviations from the same drift line.
Finnaly I've chosen type 0 , both logically (sum of abs dev divided by max abs dev makes more sense) and experimentally. About that actually, here are both formulas put on sine waves with uniform noise:
^^ generated sine wave with uniform noise
^^ both formulas on that wave
^^ both formulas on real data
As you can see type 0 is less affected by noise and shows higher values on synthetic data, but I decided to put type 1 inside as well, in case my analysis was not complete and on real data type 1 can actually be better since it has a lil higher info gain / info content (still not sure). But I can assure u that out of all other ways I've designed & tested for quite a time I tell you, these 2 are really the only ones who got there.
Now about dem thresholds and how to use it.
Both type 0 and type 1 can be modelled with Beta distribution, and based on it and on some obvious & tho non mainstream statistical modelling techniques, I got these thresholds, so these are not optimized overfitted values, but natural ones. Each type has 3 thresholds (from lowest to highest):
- typical value (turned off by default). aka basis ;
- typical deviation from typical value, aka deviation ;
- maximum modelled deviation from typical value (idk whow to call it properly for now, this is my own R&D), aka extension .
So when the metric is above one of these thresholds (which one is up to you, you'll read about it in a sec), it means that there might be a strong enough periodic signal inside the data, and the data got to be put through proper spectral analysis tools to confirm / deny it.
If you look at the pictures above again, you'll see gray signal, that's uniform noise. Take a look at it and see where does it sit comparing to the thresholds. Now you just undertand that picking up a threshold is all about the amount of false positives you care to withstand.
If you take basis as threshold, you'll get tons of false positives (that's why it's even turned off by default), but you'll almost never miss a true positive. If you take deviation as threshold, it's gonna be kinda balanced approach. If you take extension as threshold, you gonna miss some cycles, and gonna get only the strongest ones.
More true positives -> more false positives, less false positives -> less true positives, can't go around that mane
Just to be clear again, I am not completely sure yet, but I def lean towards type 0 as metric, and deviation as threshold.
Live Long and Prosper
P.S.: That was actually the main R&D of the last month, that script I've released earlier came out as derivative.
P.S.: These 2 are the first R&Ds made completely in " art-space", St. Petersburg. Come and see me, say wassup🤘🏻