[Continued from Part 1]
The last method is the so-called "digital amplifier". This is actually a misleading name and ideally it should be renamed. The full name is Uniformly Sampled PWM. This is the technique used by TacT in their amplifiers, and also by my research group.
The basic principle is to keep the audio in the digital domain. Instead of comparing the reference sawtooth/triangle waveform to the continuous analog signal to produce the PWM, it is now compared to the digital samples. The ramp generator is now a digital counter that resets when it reaches its top value. It should be obvious then that the resolution of the sample being compared should match the resolution (or increments) of the counter. So, in principle, it should be much more accurate than the analog version, plus we are spared from using any DAC or having to convert to analog at any stage. Where it actually becomes analog is a philosophical matter, but the most popular view is that it happens when the signal is converted to PWM.
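As a rough illustration of the counter-and-compare idea above, here is a minimal Python sketch. It is only a software model under simple assumptions (unsigned samples, a sawtooth counter, output high while the counter is below the sample); the function names are made up for this example and it is not how TacT or any real modulator is implemented.

```python
def uspwm_period(sample: int, bits: int) -> list[int]:
    """Model one switching period of counter-based (uniformly sampled) PWM.

    `sample` is an unsigned value with the same resolution as the counter.
    Returns one output level (0 or 1) per counter clock tick.
    """
    top = 1 << bits  # the counter runs 0 .. top-1 and then resets
    if not 0 <= sample < top:
        raise ValueError("sample must fit in the counter resolution")
    # Digital comparator: output is high while the counter is below the sample.
    return [1 if count < sample else 0 for count in range(top)]


def pwm_stream(samples: list[int], bits: int) -> list[int]:
    """Concatenate one PWM period per input sample."""
    out: list[int] = []
    for s in samples:
        out.extend(uspwm_period(s, bits))
    return out
```

The number of high ticks in each period equals the sample value, which is why the counter needs as many steps per period as the sample has levels.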
There is a catch however. One needs to switch at an integer multiple of the sample rate, and a frequency of 384kHz is commonly used nowadays - 2x 192kHz. A sample rate converter is pretty much mandatory, as it would be difficult to adjust the whole scheme according to the incoming sample rate, especially 44.1kHz, since that would need a frequency of 352.8kHz. Converting everything to a single fixed sample rate is much easier. That is not the problem though. The problem is the speed of the clocks needed. If we want to retain 24-bit precision, the counter needs to step through 2^24 increments per switching period. So, for 24-bit counter resolution and a switching frequency of 384kHz, the clock speed of the counter needs to be 384 x 2^24 kHz - that is about 6400GHz, an impossible figure. The fastest practical clock speeds in modern DSPs and FPGAs are around 500MHz, which allows about 10 bits (500MHz / 384kHz is roughly 1300 steps per period, just over 2^10). It should be noted however that none of the current 24-bit DACs can really deliver 24-bit resolution. That would mean an analog Dynamic Range of about 145dB, which has never been achieved. Do not confuse analog DR with digital DR; the two are definitely not the same!
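The clock arithmetic above can be checked with a few lines of Python. This is just a back-of-the-envelope sketch; the function names are invented for the example, and the dynamic range helper uses the plain 20*log10(2^N) rule of thumb (about 6.02dB per bit), which is an assumption on my part about how the roughly 145dB figure is reached.

```python
import math


def required_clock_hz(switching_hz: int, bits: int) -> int:
    """Counter clock needed for 2**bits steps per switching period."""
    return switching_hz * (1 << bits)


def achievable_bits(clock_hz: int, switching_hz: int) -> int:
    """Whole bits of counter resolution available at a given clock."""
    steps = clock_hz // switching_hz  # counter ticks per switching period
    if steps < 1:
        raise ValueError("clock must be at least the switching frequency")
    return steps.bit_length() - 1     # floor(log2(steps))


def ideal_dynamic_range_db(bits: int) -> float:
    """Rule-of-thumb dynamic range of an ideal N-bit quantiser."""
    return 20 * math.log10(2 ** bits)


print(required_clock_hz(384_000, 24) / 1e9)   # counter clock in GHz for 24 bits
print(achievable_bits(500_000_000, 384_000))  # bits available at 500MHz
print(round(ideal_dynamic_range_db(24), 1))   # ideal 24-bit DR in dB
```

Plugging in 44.1kHz sources the same way (switching at 352.8kHz) gives essentially the same answer, so the clock limit and not the choice of sample rate is what caps the resolution.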
If we simply discard the 14 least significant bits and just connect a straight line between the samples (linear interpolation), the sonic result will be terrible, with high baseband distortion. The technique used to address this is polyphase interpolation: attempting to reconstruct the original continuous analog signal and estimate where it would have crossed the sawtooth. The result is called Pseudo-Natural PWM. This is a mathematical nightmare to perform in real time. To make use of the remaining 14 bits, noise shaping is usually used (this is a bit difficult to explain, but the idea is to shift the noise to higher, inaudible frequencies). TacT uses a special DSP from TI to do all this. We use an FPGA to do so in software with much greater flexibility, but also much more work, since everything has to be created from scratch rather than merely accessing registers on the DSP. In theory and simulation this technique is capable of 160dB Dynamic Range, but that would require tolerances and several other factors that are impossible to achieve in practice - even a slight increase in temperature, for example (there are masses of other factors too), can have a profound effect on DR at this level. A more realistic figure is around 110dB. Whether humans can hear this amount of DR is beside the point (they cannot; ears aren't nearly sensitive enough, plus we have to keep in mind that ambient noise levels are easily in the region of 40dB); it is for measurements and AES articles.
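To give a feel for the noise shaping idea, here is a minimal Python sketch of the simplest possible form: first-order error feedback, where the bits thrown away on one sample are added back into the next. This is an assumption chosen for illustration only. Real modulators use higher-order shapers, and this is not a description of what TacT's DSP or our FPGA code actually does; the function name and defaults are hypothetical.

```python
def noise_shape(samples: list[int], in_bits: int = 24, out_bits: int = 10) -> list[int]:
    """Requantise unsigned in_bits samples to out_bits with first-order
    error feedback, so the truncation error is pushed to high frequencies."""
    shift = in_bits - out_bits
    max_out = (1 << out_bits) - 1
    error = 0  # truncation error carried over from the previous sample
    out: list[int] = []
    for s in samples:
        if not 0 <= s < (1 << in_bits):
            raise ValueError("sample out of range")
        v = s + error                      # add back what was discarded last time
        q = min(max_out, v >> shift)       # truncate to out_bits, clamp at full scale
        # Keep the discarded part for the next sample; bound it so the
        # feedback cannot wind up while the output is clamped.
        error = min(v - (q << shift), (1 << shift) - 1)
        out.append(q)
    return out


# An input sitting halfway between two 10-bit levels (512.5 in 10-bit units).
half_step = (512 << 14) + (1 << 13)
print(noise_shape([half_step] * 4))
```

Plain truncation would output 512 every time and lose the half step for good. With error feedback the output toggles between the two neighbouring levels so that its average still carries the low-order information, and the toggling itself sits at a high frequency where the output filter (and the ear) removes it.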
If we look at the concept holistically, both the analog and digital camps have valid arguments and I don't blindly advocate the latter just because it's the method I'm doing my research on. I believe in horses for courses, and for the end application we have in mind, it is a more practical and flexible approach.
Let's start at the CD player. The DAC of the CD player performs most of the techniques that I've described for the digital amp: modulation (usually delta-sigma), interpolation and noise shaping, followed by demodulation using lowpass filters with opamps or whatever at its output. The resulting analog output is very low in noise and distortion. In the analog approach, this analog signal is then fed to the amplifier, be it standard analog or class-D with PWM generation and a switching stage.
For digital we start at the same place. The digital signals are now sent to the DSP or FPGA, which generates PWM as opposed to delta-sigma modulation. If we also added the same opamp lowpass filters to demodulate the PWM into a low-level signal, we would have a normal DAC. So one could argue that the digital amp is basically just a power DAC, and that's also one of the other names for them. Instead of low-level demodulation and analog amplification, we use analog amplification (well, some sort of it) and high-level demodulation. The opamp-based lowpass filters that are used between the DAC and the pre/power amp are now replaced by the LC lowpass filter between the amp and the speaker. So it really is just a different approach and cannot be seen as being "digital".
Another drawback of the digital method is that feedback is very difficult to achieve. It works well without feedback, as TacT also did, but the effects of a non-ideal power supply and the lowpass filter can be a problem. One method to do feedback is with an ADC to convert the analog output back to digital, and that opens up a whole new can of worms. It does work if done correctly, albeit at a much higher complexity level.
Even though I think that a high-end source and well-designed normal class-D amp may have a sonic edge over the digital approach, the latter has a few important advantages:
1) Cost. For analog, the quality of the source, and thus its DAC, has a major influence on the quality of what comes out of the amp. GIGO. With the digital method and decent jitter attenuation techniques, the quality of the source becomes much less critical.
2) Flexibility. Any digital application such as filters, crossovers, EQ, room correction etc. can easily interface with it without having to use ADCs and DACs all over the place. Simpler design and shorter signal path. When using an FPGA instead of a DSP the advantages become even greater: with a DSP you're bound to what it can or cannot do. An FPGA is just a giant testbench with a certain capacity. The code on it can be adapted to implement all the features mentioned without any change to the hardware. The TacT amps' digital crossovers and room EQ hardware are separate DSP boards that have to be bought in addition. With an FPGA the necessary software can merely be loaded. If available logic cells are limited, some of the functions can be removed to make room for others.
3) Application for multiple speakers. In the light of home automation and large-scale professional audio applications, the following makes a lot of sense: digitally active hybrid speakers. Using the speakers' magnet assembly as a heatsink for class-D modules was already discussed at great length at the last AES convention. Each drive unit has its own amp, digital crossover and Wi-Fi receiver. Each speaker is then assigned an IP address from a remote hosting console, and thus every speaker can be controlled remotely while needing only local mains power. This is very useful, for example, in multiroom home automation: apply this technology to all the in-wall and in-ceiling speakers throughout the house (where power cables are abundant and easy to access), and it becomes very simple to control which speaker should do what. Simply bring a new speaker home, give it power, assign an IP and you're ready to go. Compare that to the current method used by Clipsal or Crestron of using a distribution system such as RS485 to merely control basic analog amplifiers feeding all the speakers connected by long runs of speaker cable. That is bulky and inefficient, with cables (signal, speaker and control) running all over the place. Of course this approach could be used with other amps too, but then they would need expensive and space-consuming DACs as well, which would ultimately raise cost and limit performance, since there won't be space for a super-serious DAC per drive unit.
So there is my (admittedly not well formatted; I wrote it in less than two hours in the insomniac night hours) explanation of how PWM audio works, the three variations of class-D amps using it, and the advantages and disadvantages they have to offer. I hope it provides better insight for some of the people here.