Pulse Width Modulation, continued
---------------------------------
from various

The digi article in issue #20 of C=Hacking left a few loose ends, and generated some followups.

First, Otto Jarvinen (sounddemon) emailed to say that the SID detection routine occasionally reported incorrect results for him, and suggested that a workaround was to run the detection several times. YMMV!

Second, a day or two after issue #20 was released, Levente discovered a brilliant way to play 6-bit PWM digis on a stock machine:

--

I couldn't resist, and tried something out (see attachment). It works!!! :-)

In fact, when I wrote the last letter I didn't know that I found something useable, just had some ideas - I felt that I'm at the right place. When I read C=H 20 this morning and read your comment about the Test bit (from the PRG), I knew that it must work. All I had to do is then to put this idea into code.

The whole idea is about starting the pulse by software, and then having the SID turn it back to 0 after a time. Is it possible? ...The keys are the Test bit (the SID wave counter can be reset anytime), the pulse width register, the wave counter and the SID's way of generating pulse wave. (Ie. the pulse wave is high, as long as the wave counter is less than the value in the pulse width register).

Check this algorithm:

- Init: volume at max, voice 1 sustain level max, start attack. Freq is selected well (=$4000), so the wave counter is incremented by 4 every processor clock cycle.

Loop:
- load next sample value, and put it to the pulse width low register ($d402; ensure that $d403 is 0).
- Set test bit, and clear test bit (counter reset).
- Increase sample pointer, some delay, then loop. The delay must be 64 clock cycles + the time while the Test bit is kept set (4 cycles if using STA $d404 : STX $d404 immediately with pre-loaded values).

What will happen? The 8-bit sample value is put directly to the pulse width register (MSBs of the pulse width register are cleared!...).
The wave counter is started (release test bit), and it increases by 4 every CPU cycle (= counts 256 in 64 cycles). After some time, the counter will reach the value in the pulse width register. This happens exactly after (8-bit sample value / 4) cycles, because of the above. In this cycle (or the next?...) the SID turns its pulse output to 0. Voilà!

One must just make sure that the loop length in cycles matches the above conditions, and then it runs like hell... Since it does exactly the same on the SID as the other (bit-banging) way, it just does it with some hardware help, there's also no problem with the 4khz maximum barrier (since the oscillator is reset every loop).

With little enhancement, it's possible to write an about 7.5 bits player for a stock C64 by this method. This is what you find in the attachment... The idea is using all the 3 channels simultaneously. A slightly increased sample value is written to the three pulse width registers, so the oscillators will finish the duty cycle one processor cycle later, when there's a carry between bits(0,1) to the MSBs. The replay freq is the CPU clk / 68 (~15khz). 64 cycles (variable duty cycle) + 4 cycles (constant duty cycle because of the reset time - no problems with that, it doesn't change (just gives a small constant DC...)).

By similar methods, it should be possible to write a sample player with higher PWM freq (with less resolution of course, but eliminating this still audible whistling). (I tried using the filter to reduce it, but it sounded so bad that I left it out. It clicked like hell. The FETs got saturated.) [Richard Atkinson suggested turning down the sustain volumes to avoid this]

See the attachment, and the binary. I think the sample sounds pretty good :-). (The cut is from 'Greece 2000' by Three drives on a vinyl).

(Another idea that popped up in my mind: since the TED sound generator can also be reset, I could probably translate this idea to the Plus/4 :-O ).

Best regards,

Levente

--

The binary is available at

        http://www.ffd2.com/fridge/chacking/

towards the bottom of the page.

Third, I received a very interesting email from an Apple-II guy, which I'd like to pass on:

--

Hi!

I found your page as I was searching for something else 6502-related, and was very interested. Although I have always been aware of the C64, I have never really been a user--I have used Apple II's since 1980.

I was particularly interested in the article on playing "digis" on the C64. I became interested in playing digitized sounds on the Apple II in 1993, after hearing a 3-bit, 11.025 KHz PWM player. At 3 bits, you can imagine how noisy speech samples were, but the overall effect for a 1 MHz machine with a 1-bit speaker "toggle" was amazing.

It made me wonder how far this PWM technique could be pushed on a stock, 1 MHz Apple II (not the somewhat faster, 65816-based IIgs). The short answer is, much farther than I expected!

Robin and Stephen accurately describe the theoretical PWM limit as 6 bit samples at about 16 KHz for a stock 1 MHz machine, but, as they point out, that is not practically realizable for a number of reasons, unless the play loop is completely unrolled!

Furthermore, in the Apple II world, sampled sounds have acquired a few standardized sampling rates--mostly as a result of Mac influence, which was in turn influenced by CD's. The most common rate in the Apple II world is 11.025 KHz, or one-fourth of the audio CD sampling rate. This is commonly considered to be "AM radio quality", with a Nyquist bandwidth of about 5.5 KHz and a practical bandwidth of 4+ KHz, given practical anti-aliasing filters (at the sampling end, not the playback end).

A frequency of 11.025 KHz is, though high, still painfully audible to people whose ears are not zonked--a piercing "squeal" running through every sound.
So even though it is possible to write a practical 6-bit 11.025 KHz PWM player (usually called a SoftDAC in the Apple II world), the resulting listening experience is disappointing.

So I went to work on a way to do 2x oversampling, and built a 5-bit 22.050 KHz PWM player. It was sad to lose a bit, but the absence of any audible "carrier" more than compensated for it!

If you have access to an 8-bit Apple II (preferably with lower case, like a //e), and also preferably with a way of attaching an external speaker or headphones in place of the miserable 2.75" internal speaker, then you can easily give it a try and judge for yourself.

I'm pretty proud of the novel design of the code, which I would characterize as "vectored" unrolled loops, one for every two pulse duty cycles, which I wrote a BASIC program to write for me--much less painful for counting cycles!

The package is available on the web at:

        http://members.aol.com/MJMahon/index.html

and is called Sound Editor v2.2, since I had to "dress up" the player into something fun to play with. ;-)

An earlier version of Sound Editor was published on SoftDisk in 1994, IIRC, but this one is a little more evolved. It also introduced 2:1 ADPCM compression of 8-bit sampled sounds, to save disk space. It is a lossy compression, but not very noticeably. The editor package also includes those routines, in 6502 assembly code.

All of this should be trivially adaptable to the stock, 1 MHz C64, with very good results. By using the filters, you could probably filter out the 11.025 KHz carrier and return to 6-bit accuracy!

I should note that in the Apple world, sampled sounds are usually represented as "excess-128" codes, which means that the sign bit is inverted. This actually simplifies things, since the sample value is within a few shifts of being the pulse width in cycles.

Let me know what you think!

-michael

--

(Always great to hear from Atari and Apple ][ folks!)
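
Going back to Levente's test-bit trick for a moment: the timing claims are easy to check with a toy model. The Python sketch below is only an illustration, not Levente's player and not a cycle-exact SID emulation. It assumes a 24-bit oscillator accumulator that the test bit resets to zero, the frequency register added in once per CPU cycle, and a pulse output that stays high while the top 12 bits of the accumulator are below the pulse width register, as described in the letter. The names high_cycles, FREQ, LOOP_CYCLES and PAL_CLOCK are made up for this example, and the PAL clock value is used only to estimate the replay rate.

```python
# Toy model of the test-bit PWM trick (assumptions, not a SID emulation).
FREQ = 0x4000        # frequency register value quoted in the letter
LOOP_CYCLES = 68     # 64 counting cycles + 4 cycles with the test bit set
PAL_CLOCK = 985248   # PAL C64 CPU clock in Hz (assumed machine)


def high_cycles(sample, freq=FREQ, count_cycles=64):
    """Count the cycles the pulse stays high after the test bit is released.

    sample is the 8-bit value written to the pulse width low register,
    with the high register assumed to be 0.
    """
    acc = 0                       # accumulator starts at 0 after the reset
    high = 0
    for _ in range(count_cycles):
        if (acc >> 12) < sample:  # compare top 12 bits with the pulse width
            high += 1
        acc = (acc + freq) & 0xFFFFFF
    return high


def distinct_widths():
    """Number of different pulse lengths reachable with 8-bit samples."""
    return len({high_cycles(s) for s in range(256)})


def replay_rate(clock=PAL_CLOCK, loop_cycles=LOOP_CYCLES):
    """Samples per second when one sample is played per loop."""
    return clock / loop_cycles


if __name__ == "__main__":
    for s in (0, 1, 4, 5, 128, 255):
        print(s, high_cycles(s))
    print("distinct widths:", distinct_widths())
    print("replay rate (Hz):", round(replay_rate()))
```

In this model the 256 possible sample values collapse onto 65 pulse lengths (0 through 64 cycles), which is where the roughly 6 bits of resolution for a single voice come from, and one sample every 68 cycles works out to about 14.5 KHz on the assumed PAL clock.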

And finally, I have a little mathematical analysis of PWM and how it compares to a "straight" digi.

Basically, I found some of the PWM explanations a little unconvincing in issue #20 (even though I wrote them!). For example, the idea of "average voltage" seems a little funny, since every pair of samples has an "average voltage", as does every group of four, etc., but that set of average voltages would give a different sounding signal than the original (or more dramatically, there is an average voltage over a full second of digi playback, but that's not what you hear!). So I wanted to know how a PWM signal _really_ compares to a straight digi playback.

Another issue is changing the amplitude of a PWM digi, i.e. using two pulse waveforms, with one 1/16 the value of the other, to get higher resolution. If you recall the discussion of digis, the resolution of a PWM digi depends on the number of pulse widths available, not the amplitude. Adding two PWM waveforms together does not change the number of pulse widths available, so I wanted to figure out what changing the amplitude _really_ does to a PWM digi, and if it can really be exploited.

And finally, I wanted to know about the carrier wave (that is so piercing at lower playback frequencies) -- and once again, how it compares with a standard digi (which, after all, is stair-stepping the voltages at the playback rate).

Since the rest of this article is some Fourier analysis that 99% of people will have zero interest in, I'll put the conclusions here. The first is: PWM digis and standard digis are essentially identical except at higher frequencies (except for a phase shift, which doesn't make any difference to your ear). The second is: changing the amplitude of a PWM changes the resolution. More specifically, the amplitude of the pulse multiplies the digi sample value. If two pulses can be synced close enough, it should indeed be possible to use two pulses to get a higher resolution.
Moreover, by modulating the amplitude of a single PWM digi, using the $d418 volume register -- that is, using PWM _and_ $d418 -- it should be possible to get a higher dynamic range, something that should be a little more achievable using SID (but maybe not that useful, so I didn't try it out). And finally, a standard digi has zero amplitude at the carrier frequency.

In other words, after a lot of effort I was able to demonstrate what everyone already knows. The analysis doesn't change anything from the previous articles (except possibly the idea for changing the PWM amplitude to get more dynamic range).

And now, some Fourier analysis.

A standard digi just sets the voltage to the sample value s_j, for a length of time dt (dt = 1/sample rate). The Fourier transform of a single sample s_j (occurring at time t_j) is

        s_j [e^(-iw dt) - 1] * [e^(-iw t_j) / -iw]

where w = angular frequency. Since the above is a little hard to read, I'll say it in words. The first term is the sample value s_j, which scales amplitudes at all frequencies. The second term is due to the finite length of the pulse (evaluating the Fourier integral at the boundaries), and basically changes the phase of the transform. The third term is like sin(w)/w -- a sinusoid with decreasing amplitude as frequency increases.

So: the transform goes like sin(w)/w times the sample value, with some phase effects thrown in (we'll get back to these in a moment).

A PWM digi sets the duty cycle of a pulse to the sample value s_j, giving a Fourier transform of

        [e^(-iw s_j dt) - 1] * [e^(-iw t_j) / -iw]

Compare this with the earlier expression, and you'll see that the sample value s_j has moved up into the exponent of the "phase term" but that they're otherwise the same.

The first thing to do is to show that both expressions, PWM and standard, reduce to the same thing -- that is, that a PWM and a standard digi sound the same! The expressions both decrease as 1/frequency, due to the sin(w)/w term.
This means that at large frequencies the values become negligible. (How large? For example, if the sample frequency is just 1KHz, then sin(w)/w is roughly 1000 times smaller near w=1KHz (i.e. the sample frequency, which is twice the Nyquist limit) than it is near w=0).

So now consider the phase terms for small w. The Taylor expansion for e^x is 1 + x + x^2/2 + ... We can therefore expand the "phase terms" as

        regular: e^(-iw dt) - 1 = (1 - iw*dt - w^2 dt^2/2 + ...) - 1
                                = -iw*dt + O(w^2 dt^2)

        pwm: e^(-iw s_j dt) - 1 = -iw*s_j*dt + O(w^2 dt^2)

where O(w^2 dt^2) is considered very small since w and dt are both small. Substituting the above into the original expressions gives

        s_j*iw*dt [e^(-iw t_j) / iw]

in both cases (the iw factors cancel, leaving s_j*dt*e^(-iw t_j)). That is, we have shown that for "small" frequencies -- more specifically, for frequencies where (w^2*dt^2) is much smaller than (w*dt), which is where w*dt<1, which is frequencies less than the sample frequency, which is all frequencies of interest! -- PWM and standard digis are the same.

The explanation lies in the phase terms. Those "phase terms"

        [e^(-iw dt) - 1] (regular) and [e^(-iw s_j dt) - 1] (PWM)

do more than just change the phase. When they multiply the sin(w)/w signal, they take the sin(w)/w signal, change the phase, and then subtract the sin(w)/w signal again. It's this difference of signals that makes things work out at the frequencies we care about. PWM and standard digis are _not_ the same, but the main differences are at higher frequencies, where the amplitudes are in general much smaller.

But... but... what about the PWM carrier frequency? If we take a constant digi, say with sample values = 1/2, the standard digi gives a constant voltage, whereas a PWM digi gives a square wave at the sample frequency.

The answer comes from the "phase terms" above. The sample frequency is w = 2*pi/dt.
Substituting this into the phase terms gives

        [e^(-i*2*pi) - 1] (regular) and [e^(-i s_j 2*pi) - 1] (PWM)

The regular expression is exactly zero -- there is _nothing_ at the sample frequency of a regular digi. But that's not the case for the PWM term, because of the s_j up in the exponent. PWM digis have a _finite_ amplitude at the carrier frequency. Note that because of the sin(w)/w term it gets smaller as the sample frequency increases -- but it isn't zero.

Finally, the phase term expansions give some insight into what happens when both the pulse width _and_ height are varied. If the pulse width is s_j, and the height is set to h_j, then the Fourier transform becomes

        h_j*s_j *iw*dt [e^(-iw t_j) / iw]

That is, the amplitude multiplies the width. For the case of adding two PWM waves together, then, the amplitude really does effectively scale the sample value, and it should be possible to add one PWM value at 1/16 the amplitude of another to get an effective 8-bit value.

What about _varying_ the amplitude of a single PWM sequence? For a 6-bit PWM digi, say, the sample values s_j can go from 0 to 63. If this is then multiplied by h_j=2 say, then the values become 0 2 4 ... 126 -- a 7-bit number where the lowest bit is always 0. What use is that? Well, we still have the h_j=1 values of 0..63, which do include the lowest bit. So we can effectively change the dynamic range from 0..63 to 0..126 using just two amplitude values. As a practical matter, then, it might be possible to use all 15 $d418 values available to get a big dynamic range, and hence a better sounding digi, using fewer CPU cycles.

Well, ok, we're only _sort of_ changing the dynamic range, so I pretty much doubt the usefulness of it. But maybe someone out there would like to give it a shot.

All right, let's hope this closes the book on pulse width modulation for digi playback!
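
For anyone who would rather check the claims above numerically than follow the algebra, here is a small Python sketch that evaluates the two single-sample transforms directly. It is only an illustration under stated assumptions: the sample value s is treated as a duty fraction between 0 and 1 (so that s*dt is the pulse length), the 8 KHz sample rate in the usage note is arbitrary, and the function names standard_ft and pwm_ft are made up for this example.

```python
import cmath
import math


def standard_ft(s, w, dt, t=0.0):
    """Transform of one standard-digi sample: height s, length dt.

    Implements s [e^(-iw dt) - 1] * [e^(-iw t) / -iw].
    """
    return s * (cmath.exp(-1j * w * dt) - 1) * cmath.exp(-1j * w * t) / (-1j * w)


def pwm_ft(s, w, dt, t=0.0, h=1.0):
    """Transform of one PWM sample: height h, length s*dt.

    Implements h [e^(-iw s dt) - 1] * [e^(-iw t) / -iw].
    s is assumed to be a duty fraction in 0..1.
    """
    return h * (cmath.exp(-1j * w * s * dt) - 1) * cmath.exp(-1j * w * t) / (-1j * w)


def relative_difference(a, b):
    """Size of the difference between two transforms, relative to a."""
    return abs(a - b) / abs(a)


if __name__ == "__main__":
    dt = 1.0 / 8000.0                 # assumed 8 KHz sample rate
    w_low = 2 * math.pi * 100.0       # a low audio frequency
    w_carrier = 2 * math.pi / dt      # the sample (carrier) frequency

    # Low frequency: PWM and standard are nearly the same.
    print(relative_difference(standard_ft(0.5, w_low, dt),
                              pwm_ft(0.5, w_low, dt)))

    # Carrier frequency: standard vanishes, PWM does not.
    print(abs(standard_ft(0.5, w_carrier, dt)))
    print(abs(pwm_ft(0.5, w_carrier, dt)))

    # Height times width: half the width at twice the height.
    print(relative_difference(standard_ft(0.5, w_low, dt),
                              pwm_ft(0.25, w_low, dt, h=2.0)))
```

With these numbers the two transforms differ by only a few percent at 100 Hz, the standard digi is zero (to rounding error) at the sample frequency while the half-duty PWM pulse has magnitude dt/pi there, and a pulse of half the width at twice the height lands close to the full-width one at low frequency.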