9. Wireless Active Speakers

The term “wireless active speakers” can refer to a number of different implementations with different features, so we first need to clarify our focus. The drawings below show the block diagrams for the three categories of wireless speakers we will focus on: those that use Bluetooth, those that use real-time wireless connections, and those that are IP based. Let’s tackle the easiest one first, which is Bluetooth, followed by the real-time wireless solutions. The IP-based solutions are more complex, and seeing some of the challenges of the other solutions first will help in appreciating the tradeoffs that go into the IP-based solution.

Bluetooth Audio

Here is a conceptual diagram for Bluetooth audio:

[Figure: Bluetooth audio block diagram]

Although Bluetooth has some provisions for broadcasting, the communications protocol for audio is strictly point-to-point.  That means that a single source device needs to select a destination device, pair with it, and then transmit the data only to that destination.  So if we have audio on our cell phone or tablet and want to hear it through active speakers, we need to establish that connection and stream to those speakers and only those.  But that’s exactly what a lot of people want to do…so for cell phone or tablet streaming, Bluetooth active speakers are nice to have.

The main limitation of Bluetooth is that there is some audio degradation. There have been many changes in the Bluetooth standard over the last 15 years of Bluetooth existence, but the current audio standard is limited to about 1 Mbit/sec. If we want to send left and right 16-bit audio at a sampling rate of 48 kHz, the total bandwidth would be 2*16*48000, or 1.536 Mbit/sec. Since this exceeds the available bandwidth, we need to use some form of compression. Early standards used fairly aggressive compression algorithms that gave Bluetooth audio a bad reputation, but the currently favored compression algorithm, called aptX, uses 4:1 compression that many people find satisfactory. In addition to being a fairly high quality compression algorithm, it can be implemented in real time with a general-purpose microprocessor, so the overall latency is fairly low, which means there aren’t big delays when you change the source or volume. Overall, this algorithm is usually viewed as “good enough” for playback of MP3s and other audio from tablet or cell phone collections. There is a good comparison of aptX with an earlier Bluetooth standard (SBC), MP3 compression and uncompressed audio at this link: http://www.sereneaudio.com/blog/how-good-is-bluetooth-audio-at-its-best. You aren’t going to impress a true audiophile with this audio quality, but your kids will adore you and you might even score points with your wife if you make them some nice Bluetooth active speakers.
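The bandwidth arithmetic above is easy to check. Here is a minimal sketch in Python; the function name is ours, and the 1 Mbit/sec limit and the 4:1 ratio are simply the round figures quoted in the text rather than values taken from the Bluetooth or aptX specifications.

```python
def pcm_bit_rate(channels, bits_per_sample, sample_rate_hz):
    """Raw (uncompressed) PCM bit rate in bits per second."""
    return channels * bits_per_sample * sample_rate_hz

# Stereo, 16-bit, 48 kHz audio as in the example above.
raw_bps = pcm_bit_rate(channels=2, bits_per_sample=16, sample_rate_hz=48_000)

LINK_LIMIT_BPS = 1_000_000   # "about 1 Mbit/sec", the round figure from the text
COMPRESSION_RATIO = 4        # 4:1, as quoted above for aptX

compressed_bps = raw_bps / COMPRESSION_RATIO

print(f"raw PCM:       {raw_bps / 1e6:.3f} Mbit/s")
print(f"after 4:1:     {compressed_bps / 1e3:.0f} kbit/s")
print(f"raw fits link: {raw_bps <= LINK_LIMIT_BPS}")
print(f"compressed fits link: {compressed_bps <= LINK_LIMIT_BPS}")
```

With these numbers the raw stream is too large for the link, while the compressed stream fits with room to spare for protocol overhead.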

There is a completed high-quality computer speaker with Bluetooth input described in the Projects section, so we won’t address all of the implementation details in this article. But there are some important features that we need to point out. First, all or most of the electronics need to be in one or the other speaker. As we already noted, Bluetooth audio is point-to-point rather than broadcast, and there is no easy way to send the left audio to one speaker and the right audio to the other. One of the speakers needs to extract both left and right audio, and you will need wires to send the signal to the other speaker for stereo. Obviously, that’s not an issue if you have both left and right speakers in a single box, and that is actually a fairly common arrangement for Bluetooth speakers. But if you want good stereo separation, you are going to need wires between the left and right cabinets. Second, the primary design driver for Bluetooth speakers tends to be size rather than quality. The most common Bluetooth speakers right now are portable, with rechargeable batteries and enclosures that you can easily fit in your hands. With size being a high priority, the biggest challenges for Bluetooth speakers are the mechanical design of the enclosure and advanced DSP to squeeze reasonable performance from small drivers. We’ve looked at some of the bass enhancement technology in another article, but when trying to compete with enclosures of less than 0.1 cu ft, some more advanced algorithms with complicated tradeoffs are needed. Also, the small enclosures are best implemented with plastics, and this requires some specialized equipment or else access to high-end 3D printers. Since the entry cost is fairly high and the well-designed commercial products are relatively low cost, the new generation of small Bluetooth speakers is not a very good candidate for a DIY project.

Real-time Wireless Audio

Real-time wireless audio has been around a long time, but most of the early solutions were simply FM transmission, with the analog signal modulating the carrier. Our interest, however, is only in high-resolution, low-noise wireless digital audio. As shown in the block diagram, the simplest implementation is a one-way radio signal with just the digital audio data, without any other control information such as volume level, delay, on/off, etc.

[Figure: Real-time wireless audio block diagram]

The simplest way to send digital audio without wires is to modulate the I2S audio stream with an FCC-approved transmitter and receiver combination. This has been done for many years, but for several reasons, it just hasn’t caught on for wireless active speakers. Amphony is one of the early marketers of this technology, and they have reasonably priced components for interconnecting audio devices. One reason the cost for this technology is relatively low is that sending digital audio is somewhat similar to sending NTSC video, and there are a lot of low-cost chips and modules for sending NTSC video. Baby monitors, video links for radio-controlled vehicles and the old video extenders all use this technology, in either the 2.4 GHz or 5.8 GHz bands. The bandwidth of a typical video transmitter module is usually around 4 MHz, and a 24-bit stereo biphase-encoded SPDIF signal is around 6 MHz, so the SPDIF signal exceeds the total available bandwidth of a low-cost video link. But with some clever design and careful parts selection you can successfully send high-resolution digital audio using this low-cost circuitry.
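The “around 6MHz” figure for SPDIF can be reproduced with a little arithmetic. The sketch below assumes the standard SPDIF framing (two 32-bit subframes per stereo sample, regardless of the audio word length), a 48 kHz sample rate, and biphase-mark coding, which uses two line cells per data bit. The function names are ours and are for illustration only.

```python
SUBFRAME_BITS = 32        # assumed: one SPDIF subframe carries 32 time slots
SUBFRAMES_PER_FRAME = 2   # left and right channels

def spdif_rates(sample_rate_hz):
    """Return (data bit rate, biphase-mark cell rate) for an SPDIF stream."""
    data_bps = SUBFRAME_BITS * SUBFRAMES_PER_FRAME * sample_rate_hz
    cell_rate = 2 * data_bps   # biphase-mark: two cells per data bit
    return data_bps, cell_rate

def biphase_mark_encode(bits, level=0):
    """Encode data bits into line cells using biphase-mark coding.

    The line level toggles at the start of every bit; a 1 bit adds a
    second toggle in the middle of the bit period.
    """
    cells = []
    for bit in bits:
        level ^= 1             # transition at every bit boundary
        cells.append(level)
        if bit:
            level ^= 1         # extra mid-bit transition for a 1
        cells.append(level)
    return cells

data_bps, cell_rate = spdif_rates(48_000)
print(f"data rate: {data_bps / 1e6:.3f} Mbit/s")
print(f"cell rate: {cell_rate / 1e6:.3f} Mcells/s")
print(biphase_mark_encode([1, 0, 1, 1]))
```

The doubling from biphase-mark coding is what pushes the signal beyond the roughly 4 MHz that a typical video module offers.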

But there are some serious limitations of this simplistic wireless “baby monitor” technology. First, it isn’t very robust in the presence of noise, and the 2.4 GHz band in particular is very noisy. Many wireless routers use this frequency, along with some wireless phones, Bluetooth devices and neighbors’ baby monitors. But the worst offender is the microwave oven: most of them will generate so much noise that the wireless audio will be disrupted. Switching to the 5.8 GHz band helps, but this band is also becoming popular for home routers and other wireless devices. Moving human bodies are also a problem, as this wireless link is mostly line of sight, and any object that has lots of water in it will absorb the signal. Unfortunately, the baby monitor “protocol” has no error prevention or recovery, and whenever there is a short dropout, the audio receiver takes a while to recover synchronization with the transmitter. So small occasional dropouts can seriously degrade the baby monitor wireless audio approach. Also, the one-way radio signal typically doesn’t provide any messaging capability between the control devices and the speakers, which can be a serious limitation for some applications.

There are several newer low-cost receivers and transmitters designed to overcome these limitations and still provide real-time digital audio transmission. For example, TI has a series of chips, the CC852x and CC853x, that use a more “robust” waveform that is less susceptible to interference and provides some error correction. And the Wireless Speaker and Audio Association (WiSA) has defined a real-time wireless audio protocol that operates in the 5 GHz radio band. This band requires the use of “smart” radios that can select quiet frequencies, so there is more complexity and higher cost with this solution. Several WiSA devices have shown up in the last two years, and the cost should drop as more vendors adopt this standard. However, the WiSA standards have been evolving a lot more slowly than many industry analysts expected, and it is possible that this technology might prevail for “top tier” audio products but that lower cost products will use other solutions. WiSA-compliant boards and modules are available from Summit Wireless, and Hansong has WiSA components that can be used for making high performance wireless systems.

IP Audio

For the best throughput and reliability, audio needs to get sent the same way other network data is sent:  in packets, using a layered protocol.  Packetized data is buffered into messages that are numbered and kept track of as the data makes its way through the software.  The TCP/IP protocol ensures that data is delivered reliably and that any lost packets are retransmitted and reassembled into the right order.  Packetized audio has been around for many years and early implementations on slow wireless networks had some quality of service issues.  But modern home networks are fast and reliable, and streaming audio usually works very well.  The audio can be sent with high resolution at high sample rates and be reconstructed at the client with no loss of fidelity.  And with the ability to send both audio and data about the audio to the client, IP audio allows great flexibility in routing to multiple rooms with active speakers and it allows sending control information such as volume and muting.
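To make the idea of numbered packets concrete, here is a minimal Python sketch of a receiver that puts out-of-order packets back into sequence before handing the audio on. In practice TCP does this inside the operating system’s network stack; the function name and the packet layout (a sequence number plus a payload) are our own simplification, not part of any real protocol.

```python
def reassemble(packets):
    """Yield payloads in sequence order from packets arriving in any order.

    packets: iterable of (sequence_number, payload) tuples, numbered from 0.
    Out-of-order packets are held until the missing ones arrive.
    """
    pending = {}     # packets received early, keyed by sequence number
    next_seq = 0     # next sequence number the player is waiting for
    for seq, payload in packets:
        if seq < next_seq:
            continue             # duplicate of a packet already delivered
        pending[seq] = payload
        while next_seq in pending:
            yield pending.pop(next_seq)
            next_seq += 1

# Packet 2 arrives before packet 1, and packet 0 is received twice.
arrived = [(0, b"aa"), (2, b"cc"), (0, b"aa"), (1, b"bb"), (3, b"dd")]
audio = b"".join(reassemble(arrived))
print(audio)
```

Notice that packet 2 cannot be played until packet 1 shows up, which is exactly why this kind of reliability costs some delay.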

The “downside” of IP audio is that the reliability comes at the expense of latency, as the data needs large buffers and delays to ensure the packets are all received and properly reassembled.  As a result, it may take a while for audio to start or to change the volume.
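The delay added by a buffer is simply the buffer size divided by the audio data rate. Here is a minimal Python sketch; the buffer sizes are hypothetical values chosen for illustration and are not taken from any particular product.

```python
def buffer_latency_ms(buffer_bytes, channels, bits_per_sample, sample_rate_hz):
    """Playback delay, in milliseconds, added by a full buffer of PCM audio."""
    bytes_per_second = channels * (bits_per_sample // 8) * sample_rate_hz
    return 1000.0 * buffer_bytes / bytes_per_second

# Hypothetical buffer sizes holding 16-bit stereo audio at 48 kHz.
for size in (19_200, 192_000, 384_000):
    delay = buffer_latency_ms(size, channels=2, bits_per_sample=16,
                              sample_rate_hz=48_000)
    print(f"{size:>7} bytes -> {delay:6.0f} ms")
```

A buffer large enough to ride out a second or two of network trouble therefore adds a second or two before a new track or a volume change is heard, unless the control path bypasses the buffer.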

Also, the buffering means you can’t simply send left audio to one speaker and right audio to another, as the audio is not necessarily synchronized.  It is possible to synchronize the audio using local clocks and a network time protocol, but another approach that is more popular right now is to use multicasting for speakers that are known to have a reliable connection between them.  These multicast connections are established using a “network mesh” design.  This network mesh approach requires that each connection have the ability to serve as a network access point (AP).  Fortunately, there are off-the-shelf solutions that the DIY’er can purchase to implement a mesh network and they work quite well.  Follow this link for a good discussion of how a mesh network can be used to synchronize the nodes for audio distribution.  This technology will allow us to build wireless IP speakers with synchronized left and right audio, or even a surround system with synchronized speakers.
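For the first approach mentioned above (local clocks plus a network time protocol), the core calculation is small. The Python sketch below uses the standard four-timestamp offset estimate that NTP-style protocols rely on; the function names and the example timestamps are ours, and a real speaker would average many exchanges rather than trust a single one.

```python
def clock_offset_and_delay(t0, t1, t2, t3):
    """Estimate the reference-minus-local clock offset and round-trip delay.

    t0: request sent      (local speaker clock)
    t1: request received  (reference clock)
    t2: reply sent        (reference clock)
    t3: reply received    (local speaker clock)
    Assumes the network delay is the same in both directions.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay

def local_start_time(reference_start, offset):
    """Convert an agreed start time on the reference clock to the local clock."""
    return reference_start - offset

# Example: the reference clock is 0.5 s ahead of this speaker, each network
# hop takes 10 ms, and the reference takes 2 ms to reply.
offset, delay = clock_offset_and_delay(t0=100.000, t1=100.510,
                                       t2=100.512, t3=100.022)
print(f"offset {offset:.3f} s, round trip {delay * 1000:.0f} ms")
print(f"start playback at local time {local_start_time(105.0, offset):.3f}")
```

If the left and right speakers each do this against the same reference, they can agree to start a given block of samples at the same instant even though their buffers fill at different times.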

It turns out that the buffering delay isn’t a problem for many users, and the world of possibilities that opens up when we use IP-based audio compensates for those limitations. Speakers such as the Sonos Play, the Denon Heos, and the Bose SoundTouch are selling extremely well right now, as they provide a lot of the features that people want:

•  Built-in Internet radio (XM, Pandora, Spotify, Amazon, Google, etc.)
•  Music library playback (from PC's on the network or storage devices)
•  Multi-room:  ability to stream audio to any speaker in any room
•  Cell-phone or tablet control by any family member
•  Easy wireless connection, with optional Bluetooth
•  Auxiliary input for playing a TV or other audio equipment
•  Built-in amplifiers and DSP to provide high-quality audio from a small enclosure

So let’s look at a block diagram of one of these speakers and outline an approach for building a DIY version.

[Figure: IP (WiFi) audio block diagram]

As indicated, this solution requires an embedded computer that we called a “WiFi Audio Adapter”.  This computer connects to Internet music sources such as Pandora, Spotify, XM Radio, iHeartRadio, and others.  It needs to know how to store user passwords for these services, log in, and interact with the menus that get displayed on the cell phone or tablet.  So this computer needs to provide the “Internet Radio” function, using the cell phone as the user interface.

This computer also needs to run the DLNA protocol (Digital Living Network Alliance) to retrieve music from PCs, TVs or other audio network devices. We’ve only shown DLNA in the diagram, but there are other music-sharing protocols such as AirPlay or Chromecast that can retrieve audio from users’ libraries that are on the home network. And this embedded computer needs to retrieve audio from libraries on hard drives or memory sticks. And ideally, we would like to have a high-quality analog input as another audio source, although this isn’t shown on the diagram.

This computer also needs to be able to assign names to speakers such as “Living Room” or “Bathroom” (yes, it has come to that…) and allow the cell phone application to determine which source gets selected for each speaker.  So this is a fairly sophisticated embedded computer, and all of the new multi-room WiFi speakers have some variation of this computer embedded in the speakers.  It’s not something we can easily build ourselves, but we don’t have to, because we can buy an adapter box from Parts Express for $38 that provides a fairly good set of capabilities.  Obviously, we could buy a low-end speaker from Sonos, Heos or Bose and open it up to use their WiFi audio adapter circuitry, but unfortunately, even the lowest cost versions of these other speakers are a lot more expensive than the PE product.

The PE adapter is the Dayton Audio WFA02 (Part # 300-576). It isn’t quite as versatile as the WiFi Audio Adapter in the Sonos or Heos speakers, as it doesn’t allow direct connection to the SiriusXM Internet stream, and it doesn’t include Pandora (that was dropped in the last software revision), but there are still quite a few streaming services to choose from. Also, the network routing isn’t as sophisticated as the Sonos, and the user interface isn’t as easy to use as the Sonos or even the Heos. But the PE adapter supports AirPlay and FLAC audio file streaming with DLNA, and it has some nice features that some of the other adapters don’t provide, such as an analog input for connecting a Bluetooth adapter. And since it’s only 3″ by 3″ by 0.75″, it’s something we can easily add to the back of an active speaker to support wireless connectivity. We’ll show a design using the PE adapter in one of the project pages.

The electronics in the PE product are made by LinkPlay, and if you look at the LinkPlay web pages, you’ll see that there are many products based on these modules. That multi-user business base is important, as it indicates that this technology has become a “de facto standard”. And LinkPlay has already taken the next step for wireless audio: voice control of the wireless audio by integration with the Amazon Alexa product. So the PE module looks like a very good product at a great price, with excellent long-term support. And it supports the modern “Internet of Things” paradigm of using a distributed federation of servers rather than a centralized audio server. It will be our “go-to” wireless audio solution.

IP Audio:  Quality issues

We’ve looked at using a commercial WiFi audio adapter from a “features” perspective, but we haven’t yet addressed the audio quality of the PE device or similar devices. If you are only interested in streaming audio from “Internet radio” sources, the adapter’s audio quality is not a significant issue, as the quality is mostly determined by the source. Normal audio streaming is 96 kbit/s for services like Spotify, but the user can also select “high” (160 kbit/s) or “extreme” (320 kbit/s). These bit rates are fine for background music and some types of serious listening, but they aren’t going to win over the hard-core audiophile crowd.

The most common audiophile solution is using DLNA or AirPlay to stream to a “networking receiver” that supports uncompressed audio and that has a high quality DAC. The PE module allows active speakers to do essentially the same thing, as it takes the place of the networking receiver. But how can we be sure that the wireless audio from the DLNA or AirPlay sources is transferred at a high data rate, with no intervening compression, using a high quality digital to analog converter that introduces minimal noise or distortion?

We can answer this question by opening up the WiFi module and seeing what circuitry is used. LinkPlay is currently offering two modules, the A28 and A31. The WFA02 uses the A31 module, which has analog input/output. The analog input/output module uses the Wolfson (now Cirrus) WM8918 CODEC. This is not the highest quality DAC that money can buy, but it isn’t bad, either. The SNR specification is 96 dB typical, with THD of -86 dB typical. A distortion level of -86 dB corresponds to 0.005%, which is a level that many audiophiles will begrudgingly accept as “good enough”. And the specifications suggest that the device supports up to 24-bit, 192 kHz audio. So the current module with the analog input/output is probably good enough for our purposes. PE has recently added the WFA28 module, which has digital input and output, with a TOSLINK connector for an external DAC. This digital I/O with support for high data rates means there is some hope for the purist looking for a higher quality solution.
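The conversion from a THD figure in dB to a percentage is worth spelling out, since data sheets use both. Here is a minimal Python sketch using the standard amplitude-ratio relationship (20 dB per factor of ten). The second function uses the textbook ideal-quantizer formula to express an SNR as an equivalent number of bits; that comparison is our own addition for context and is not a figure from the data sheet.

```python
def db_to_percent(level_db):
    """Convert an amplitude ratio in dB (e.g. THD) to a percentage."""
    return 100.0 * 10 ** (level_db / 20.0)

def snr_to_equivalent_bits(snr_db):
    """Equivalent resolution of an ideal converter with this SNR.

    Uses the textbook relationship SNR = 6.02 * N + 1.76 dB.
    """
    return (snr_db - 1.76) / 6.02

thd_percent = db_to_percent(-86)
bits = snr_to_equivalent_bits(96)
print(f"THD of -86 dB is {thd_percent:.4f} %")
print(f"SNR of 96 dB is roughly {bits:.1f} bits")
```

So although the device accepts 24-bit data, the analog performance quoted above is in the neighborhood of what a good 16-bit converter delivers.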

IP Audio:  Home Network Issues

Another issue that will be a concern for our wireless speakers is the networking quality: how well these modules “get along” with other devices on the network, whether they maintain a good connection in the presence of noise, and how well they recover from disconnections or the other conditions that affect home networks. These issues need to be considered because the device is limited to the busy 2.4 GHz band and because there is no external antenna on the PE module. The comments on the PE site suggest that there are some issues with at least some home networks, but the large number of good reviews suggests that the product will work well with most 2.4 GHz home networks. If there are problems with the device, you may need to download one of the many free network monitoring tools and determine whether throughput, signal level, interference or other issues are impacting performance. Also, most routers or access points in home networks use code that is released under the GNU General Public License (GPL), and this code undergoes changes periodically that may affect wireless audio performance. The wireless modules use the TCP/IP networking stack library code from the chip vendor, and this code gets updated periodically as well. So you might need to make sure all of the devices on your home network are up to date and that wireless extenders are used if the active speakers are in a location where the signal is weak.