Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-implement PWM generator logic #7231

Merged
merged 68 commits into from
Nov 20, 2020
Merged

Conversation

earlephilhower
Copy link
Collaborator

Add special-purpose PWM logic to preserve alignment of PWM signals for
things like RGB LEDs.

Keep a sorted list of GPIO changes in memory. At time 0 of the PWM
cycle, set all pins to high. As time progresses bring down the
additional pins as their duty cycle runs out. This way all PWM signals
are time aligned by construction.

This also reduces the number of PWM interrupts by up to 50%. Before,
both the rising and falling edge of a PWM pin required an interrupt (and
could shift arround accordingly). Now, a single IRQ sets all PWM rising
edges (so 1 no matter how many PWM pins) and individual interrupts
generate the falling edges.

The code favors duty cycle accuracy over PWM period accuracy (since PWM
is simulating an analog voltage it's the %age of time high that's the
critical factor in most apps, not the refresh rate). Measurements give
it about 35% less total error over full range at 20khz than master.

@me-no-dev used something very similar in the original PWM generator.

Add special-purpose PWM logic to preserve alignment of PWM signals for
things like RGB LEDs.

Keep a sorted list of GPIO changes in memory.  At time 0 of the PWM
cycle, set all pins to high.  As time progresses bring down the
additional pins as their duty cycle runs out.  This way all PWM signals
are time aligned by construction.

This also reduces the number of PWM interrupts by up to 50%.  Before,
both the rising and falling edge of a PWM pin required an interrupt (and
could shift arround accordingly).  Now, a single IRQ sets all PWM rising
edges (so 1 no matter how many PWM pins) and individual interrupts
generate the falling edges.

The code favors duty cycle accuracy over PWM period accuracy (since PWM
is simulating an analog voltage it's the %age of time high that's the
critical factor in most apps, not the refresh rate).  Measurements give
it about 35% less total error over full range at 20khz than master.

@me-no-dev used something very similar in the original PWM generator.
Use fixed point math to adjust running PWM channels to the new
frequency.
Copy over full high/low periods only on the falling edge of a cycle,
ensuring phase alignment for Tone and Servo.
@dok-net
Copy link
Contributor

dok-net commented Apr 19, 2020

Here is an unsolicited test, for comparing against PR #7022, #7022 (comment):
PR7231_20200419_231458

@earlephilhower
Copy link
Collaborator Author

Thanks! Can you give test parameters for the above graph?

@earlephilhower
Copy link
Collaborator Author

earlephilhower commented Apr 19, 2020

-edit: Superseded below. This graph had 6 total PWMs running...

I changed the test definition and reran and I can kind of repro that hump, but it's not looking like your output. Smells like its related to the handoff between busy-wait and IRQ handled updates.
Test sketch:
https://gist.github.com/earlephilhower/26f0f5695e2a3f401a4173c00c9926ed

Test Master (yes, it's in PowerShell for now)
https://gist.github.com/earlephilhower/8ab3a22e61aa584a860162922fbb572f

@s-hadinger
Copy link
Contributor

As stated, I confirm there is no more visible flickering with leds in Tasmota.

Any chance this PR is integrated in v2.7?

@earlephilhower
Copy link
Collaborator Author

With multiple (different) PWMs there's the non-linear bumps you see on the graph in #7231 (comment) (kind of a worst-case test where we have the PWM periods going in opposite directions). I wrote it literally during a bout of insomnia yesterday AM and haven't optimized it at all.

There's also #7022 which @dok-net has worked hard on and which he seems to get better performance for his app from. It does show better linearity than this one, on the little logic analyzer I'm using, looked pretty stable, too, but I've not tried 3 channels or other stuff yet.

Net result, it's very complicated. You can always merge this PR into your own master before building if needed, while we try and figure out the best solution...

Ensure both the general purpose waveform generator and the PWM generator
are disabled on a pin used for Tone/digitalWrite.
@earlephilhower
Copy link
Collaborator Author

I actually had 6 PWMs running in the prior graph, not 2 (bug just fixed).

After the latest push, here is what I get with 2 opposing 25khz PWMs going in opposing directions:

image

Much nicer. Error is <1% for everywhere except the hump at 160-230.

@dok-net
Copy link
Contributor

dok-net commented Apr 20, 2020

@earlephilhower I found that aggregating the GPIO changes and writing to the two set and clear registers just once each, had overall worse performance, due to the added code for collecting state, I guess. While it does give perfect phase sync, the performance impact was unacceptable. I've reverted this now in PR #7202.
About your graph, it's unfortunate that it hides the jitter completely...
Here's 40kHz, 80MHz, single GPIO:
PR7231_20200421_004458
And this is 25kHz, 160MHz (which @devyte says not to use for proving anything), two GPIO PWM in opposing increments:
PR7231_2_20200421_005316
I'm posting the same test's results for PR#7022 here: #7022 (comment)

@dok-net
Copy link
Contributor

dok-net commented Apr 20, 2020

@earlephilhower We additionally need performance figures - I take it you were speaking from experience, not hard numbers. And built binary memory sizes. Can you please look into how you'll get well defined 0V and Vmax bands? I derive from the images that those are maintained only for exactly 0 and 1023, if at all.

@earlephilhower
Copy link
Collaborator Author

Thanks for the screenshots!

Jitter's actually recorded and in the CSV/XLS. I just didn't graph it to keep things simple. With the jitter, you can't see the average value which it sawtooth in #7022, too, so I don't think there's a good single answer.

Here's what he HW recorded. I took the std.dev of the high period (i.e. duty cycle jitter) and divided by the average period reported over the ~100 samples it examined.
image

I can measure anything we need if the analyzer can record it and I can teach PS or Python how to look for it, no sweat!

For 0 and 100%, I'm not sure I understand you. Aren't those special-cased as digitalWrite(0/1)? That's what's in master now and in this PR (although, as you noticed, there was a bug and that analogWrite(100%)->digitalWrite(1) mapping was busted...it's fixed with last push)

@earlephilhower
Copy link
Collaborator Author

Oh, and it's very cool that your performance shows the same dip at about the same time (the bump on the 1 pin has a corresponding dip on the other pin, so I guess we're just looking at different pins of the pair)! Independent verification is always best.

@dok-net
Copy link
Contributor

dok-net commented Apr 20, 2020

@earlephilhower I don't know what commit 160435d was exactly, but since I've rebased and squashed a serious bug fix and a commit (GPOS and GPOC use) that was so bad in performance that wanted to get rid of that completely, it just may be that you are still testing that badly badly broken revision!!! Please rerun and probably take down your graphs with that revision... I'm sorry.

@earlephilhower
Copy link
Collaborator Author

@dok-net "106-435d" is "160mhz, commit 435d." It's the last force-push. Do you have unpushed commits, maybe?

@dok-net
Copy link
Contributor

dok-net commented Apr 20, 2020

@earlephilhower 0 and 1023, as digitalWrite(…) etc. in master, is one of the things that got me started on #7022, because that totally loses phase. Let me think, try switching off servos perfectly with your code such that they don't ever sweep into nirvana on detach. My hope is still to use waveform for communications signal generation, like software serial. For that, exact level on exit from a wave defined by precise runtime is a must, such as to finish a byte on stop, which is high, not low. You see, the features I've built in are what I'm fighting for here, your PR wouldn't cut it for those (imaged) uses (just yet). My worry is that this competion, where you keep merging small portions, are going to drive me nuts rebasing or merging my code - probably I'll have to keep forcing my stuff over everything, which ends up badly, too, if that goes on for too long.

@dok-net
Copy link
Contributor

dok-net commented Apr 20, 2020

@earlephilhower OMG, 160 "-" 435d - I'm sorry, your graphics is slightly blurry and small, I thought it was 160435d :-) :-) :-) So, 435d5d8fd750c6c5631244ee0592b42fd24421bb is just right.

@earlephilhower
Copy link
Collaborator Author

Ah, I see. For master and this PR, then analogWrite(max/min) is a no-op.

So I think I get:

@dok-net
Copy link
Contributor

dok-net commented Apr 21, 2020

#7022 maintains phase if the waveform generation is running, but with either zero duty or 100% duty. Also, the runtime parameter exits on the exact level at that time, I am sure anymore, but master at least always switches back to LOW, which is not what I expect.

@dok-net
Copy link
Contributor

dok-net commented Apr 21, 2020

And, you haven't picked that up yet, while I have removed ns-precise phase alignment due to the overall performance issue with that, one can specify phase delay to another previously started wave, so you could generate three waves at 120° phase shift to each other, for instance :-)

@earlephilhower
Copy link
Collaborator Author

The hump is caused by having a PWM falling edge just before the cycle restart, while having the other channel requesting a 1->0 transition just outside the busy-loop window of 10us. So it gets an IRQ for channel B 0->1, then waits 2..8us for the next PWM full cycle 0->1, and ends up returning from interrupt and not scheduling another IRQ for 10us...hence the horizontal leg of the bump...
image

A hump in the dueling PWMs was very prominent in prior pulls.

The hump was caused by having a PWM falling edge just before the cycle
restart, while having the other channel requesting a 1->0 transition
just outside the busy-loop window of 10us. So it gets an IRQ for channel
B 0->1, then waits 2..8us for the next PWM full cycle 0->1, and ends up
returning from interrupt and not scheduling another IRQ for 10us...hence
the horizontal leg of the bump...

Reduce the minimum IRQ latency a little bit to minimize this effect.
There will still be a (significantly smaller) hump when things cross, but
it won't be anywhere near as bad or detectable.
@earlephilhower
Copy link
Collaborator Author

earlephilhower commented Apr 21, 2020

Latest average PWM result with 2 PWMs at 25KHZ running opposite (same tests as all others)

image

@earlephilhower
Copy link
Collaborator Author

Single PWM at 40khz, 160mhz. This is different than the other graphs show here which are 2-pwm at 25khz moving in opposite directions

image

Results in much closer PWM frequency range over any number of PWM pins,
while taking 0 add'l overhead in IRAM or in the IRQ.
When _setPWMFreq was called the initial PWM mask was not set to 0
leading to occasional issues where non-PWM pins would be set to 1
on the nextPWM cycle.  Manifested itself as an overtone at the PWM
frequency +/-.
Borrow a trick from esp8266#7022 to exit the busy loop when the next event is
too far out.  Also reduce the IRQ delta subtraction because it was
initially not NMI so there was much more variation than now.

Keep the PWM state machine active at a higher prio than the standard
tone generation when the next edge is very close (i.e. when we're at
the max or min of the range and have 2 or more near edges).  Adds a
lot of resolution to the response at low and high ranges.

Go from relative to absolute cycle counts in the main IRQ loop so that
we don't mingle delta-cycles when the delta start was significantly
different.
Keep PWM error <2.0% on entire range, from 0-100%, and remove the
hump seen in testC by fixing the min IRQ delay setting.
The IRQ lead time was a tiny bit undersized, causing IRQs to come back
too late for about .25us worth of PWM range.  Adjust the constant
accordingly
Since the adjustment for when a 160mhz compile is running at 80mhz
is giving bad behavior, simply remove it and revert to old behavior.
@earlephilhower
Copy link
Collaborator Author

Updated to new PWMRANGE, but not yet tested with new core/GCC

@BigMike71
Copy link

Hello, with the latest versions of tasmota (lite) 8.4.x I have strong visible flickering again with my LED lamps at low brightness!
are the changes in the current Tasmota already included?

@s-hadinger
Copy link
Contributor

@BigMike71 Tasmota currently uses code from #7022.

@Jason2866
Copy link
Contributor

@BigMike71 Tasmota is back to #7231 we had sometimes execptions when using PWMi with #7022 Using #7231 there is no flicker and no execption.

Includes patches to allow side-by-side existance of the two versions and
a slight change such that the #define ..._PWM is unneeded (i.e. allow
existing makefiles/scripts/etc. to get expected behavior w/o any changes
on their part).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants