
Amateur Wavelet Modulation

Abstract

The current amateur packet radio interconnects make inefficient use of the RF spectrum they consume. 1200bps AFSK requires the full 2.7kHz bandwidth provided by most radios, producing a spectral efficiency of 0.444 bps/Hz. When used over an NBFM channel, as is often the case with 2m and 6m access, the efficiency drops still further to 0.075 bps/Hz. G3RUH modulation is substantially better: over a 12kHz channel and at 9600bps, we see spectral efficiencies as high as 0.800 bps/Hz.

However, all of these modulation methods have hit their limitations. AFSK cannot be driven to higher bits per symbol because doing so will produce even wider sidebands than what is permitted in an audio-grade channel. James Miller, G3RUH, for example, claims to have reached the maximum possible data transfer rate through any amateur narrow-band FM medium (Miller), so his modulation is unlikely to be improved upon further.

This document attempts to describe a method of modulation which can potentially achieve extremely high bandwidth economies, and therefore, relatively large data throughputs through audio-grade, unmodified amateur radio equipment.

Introduction

The current packet radio interconnects make inefficient use of the RF spectrum they consume. We can get a measure of efficiency by taking the rate at which they allow information to be transferred, and dividing it by the bandwidth they occupy. For example, let's consider the two most popular methods of communicating digitally: 1200bps AFSK and PSK31.

1200 bps / 2700 Hz = 0.444 bps/Hz

31.5 bps / 31.5 Hz = 1.000 bps/Hz

When used over a narrow-band FM channel, as is often the case with 2m and 6m access, the efficiency of 1200bps AFSK drops still further to 0.075 bps/Hz, thanks to the FM channel's 16kHz wide bandwidth. G3RUH modulation is substantially better: over a 12kHz channel and at 9600bps, we see spectral efficiencies as high as 0.800 bps/Hz.
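The efficiency figures quoted so far can be reproduced with a few lines of Python. This is only a convenience sketch; the function name and the labels in the table are chosen here for illustration, while the bit rates and bandwidths are the ones given in the text.

```python
def spectral_efficiency(bit_rate_bps, bandwidth_hz):
    """Return spectral efficiency in bits per second per hertz."""
    return bit_rate_bps / bandwidth_hz

# (label, bit rate in bps, occupied bandwidth in Hz), figures from the text
modes = [
    ("1200bps AFSK, audio channel", 1200, 2700),
    ("PSK31", 31.5, 31.5),
    ("1200bps AFSK over NBFM", 1200, 16000),
    ("9600bps G3RUH", 9600, 12000),
]

for label, rate, bandwidth in modes:
    print(f"{label}: {spectral_efficiency(rate, bandwidth):.3f} bps/Hz")
```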

However, all of these modulation methods have hit their limitations. AFSK cannot be driven to higher bits per symbol because doing so will produce even wider sidebands than what is permitted in an audio-grade channel. Use of MFSK16 at higher speeds will introduce so many sidebands that it will reduce the effective signal to noise ratio substantially. Indeed, it seems that FSK-based modes, such as G3RUH, are necessarily limited by the filtering characteristics of the channel. G3RUH himself claims to have reached the maximum possible data transfer rate through standard, 20kHz-wide amateur RF channels.

The common characteristic of all of the current digital modes, from PSK31 up through G3RUH, is that they work primarily in the frequency domain. That is, they mostly ignore time-domain elements. Therefore, Fourier-based techniques are sufficient to extract useful information from these channels.

However, ignoring the time domain also ignores valuable opportunities for re-using existing bandwidth within the same channel. Hence, wavelets can be employed to recover what is lost in Fourier transforms, and can provide opportunities for substantially higher throughputs.

This document aims to describe an ISO Layer 1 protocol for communications through voice-grade channels (2.7kHz) with 4500 bits/second throughput, without requiring radio modifications or custom hardware, and which is expandable to support higher throughputs in the same channel allocation, provided better channel conditions exist.

In summary, we define in this document a method of modulation that strives to achieve the following goals:

NOTE --- THIS IS A DRAFT STANDARD DOCUMENT; AS SUCH IT IS STILL EVOLVING, AND CAN CHANGE AS EXPERIENCE WITH THE SPECIFICATION IS GAINED.

About the WM4521 Name

The mode's name can be broken down like so:

WM indicates Wavelet Modulation is the modulation technique used in the communications channel.

The Mantissa and Exponent fields indicate the number of wavelet plots per second in the link. In this case, 452 maps to 45 x 10^2, or 4500 wavelet plots per second.

The Bits Per Wavelet Plot indicates how many bits are encoded in each plot on the time/frequency graph of a wavelet-modulated symbol. Since this mode only encodes one bit per plot, we get a total throughput of 4500 bits per second.
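As an illustration of the naming scheme, the sketch below splits a mode name into its fields. It assumes a two-digit mantissa followed by a one-digit exponent, which is how the 452 example reads; the function name parse_wm_name is hypothetical and not part of this specification.

```python
import re

def parse_wm_name(name):
    """Split a name such as 'WM4521' into its fields.

    Assumed layout: 'WM', two mantissa digits, one exponent digit,
    then the bits per wavelet plot.
    Returns (plots per second, bits per plot, bits per second).
    """
    match = re.fullmatch(r"WM(\d{2})(\d)(\d+)", name)
    if match is None:
        raise ValueError(f"not a WM mode name: {name!r}")
    mantissa, exponent, bits_per_plot = (int(g) for g in match.groups())
    plots_per_second = mantissa * 10 ** exponent
    return plots_per_second, bits_per_plot, plots_per_second * bits_per_plot

print(parse_wm_name("WM4521"))
```

Under the same assumption, a name such as WM4522 would decode to 4500 plots per second at 2 bits per plot.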

This naming convention is intended to help people avoid confusion from two incompatible links offering the same level of throughput. For example, there are several ways of offering 9000bps digital throughputs with wavelet modulation:

Symbol Encoding

WM4521 data is organized into symbols. Each symbol consists of 15 plots on the symbol's time/frequency plane, and lasts for a duration of 3 and one third milliseconds. The base symbol frequency is 300Hz.

If we were to transmit only one bit at a time, we would have a 300 bps communications channel. Let's suppose you want to transmit the binary sequence 0101. It'd look something like this:

Note how the square waves more or less map one-to-one onto the bit stream to be transmitted. They are truly bipolar, as required by the Haar wavelet, which guarantees no DC bias along the signal path.

Note that while the signal stays high or low, it more or less represents a period of DC for that length of time; i.e., no real information is being transmitted in that time. Hence, we can use smaller, self-similar waves to transmit additional information. If we zoom in on one of the pulses, we might see something like this:

Notice that this is achieved by adding a higher frequency signal to a lower frequency signal. But there is an element of time too. If we adopt the same rules as for our base frequency, we can see the signal being transmitted during this low-frequency cycle is 10. That is, we packed two more bits during the same cycle as we would normally transmit 1 -- for a total of 3 bits per base cycle. See the figure below to see more graphically how bits 0, 1, and 2 would relate to each other in both the frequency and time domains.

We can repeat this cycle of zooming in and packing additional information as many times as we like, and for as much raw channel bandwidth as we can consume. Since we're limiting our desired channel bandwidth to 2700Hz, it follows that the high-end frequency ought to be less than that. Each zoom doubles the frequency, and 2400Hz (8 x 300Hz) is the highest frequency reached by doubling 300Hz that still falls below 2700Hz.

With frequency increasing to the right, and time increasing downward, the layout of the individual WM4521 plots claimed by data bits in the wavelet time/frequency plane are as follows:

Notice how plot zero maps to the 300Hz subband, and takes the full 300Hz cycle to transmit. Plots 1 and 2 fill in at 600Hz but consume twice the bandwidth, while plots 3, 4, 5, and 6 fill in at 1200Hz and take up four times the bandwidth, and plots 7 through 14 fill in at 2400Hz and eight times the bandwidth. The sum total plot rate is then 1+2+4+8 = 15 plots per symbol. Compare this with our time-domain explanation above (especially plots 0, 1, and 2).
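The mapping from plot number to subband and time slot can be written down compactly. The sketch below assumes that plots within a subband are numbered in time order (plot 1 before plot 2, plot 3 before plot 4, and so on); the function and constant names are illustrative only.

```python
BASE_HZ = 300          # base symbol frequency from the text
PLOTS_PER_SYMBOL = 15  # 1 + 2 + 4 + 8

def plot_position(plot):
    """Return (subband frequency in Hz, time slot index, slots in that subband)."""
    if not 0 <= plot < PLOTS_PER_SYMBOL:
        raise ValueError("plot index must be in 0..14")
    level = (plot + 1).bit_length() - 1   # 0 for plot 0, 1 for plots 1-2, ...
    slots = 1 << level                    # 1, 2, 4 or 8 slots per symbol
    slot = plot + 1 - slots               # position within the subband
    return BASE_HZ * slots, slot, slots

for plot in range(PLOTS_PER_SYMBOL):
    freq, slot, slots = plot_position(plot)
    print(f"plot {plot:X}: {freq}Hz subband, slot {slot} of {slots}")
```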

Note that the full audio waveform for this consumes 3.45kHz of audio frequency bandwidth. However, remember that both upper and lower sidebands are generated when amplitude modulating tones. Therefore, the 2400Hz and 300Hz "carriers" both still retain their relative shapes even after filtering to a 300Hz to 2400Hz bandwidth.

By using multiple bits to control the amplitude of each plot, we can encode a total of 15*n bits per symbol, where n is the number of bits used to control amplitudes. Since WM4521 is on/off keyed, only 1 bit per plot is used, and therefore, 4500bps is the data throughput of this channel.

Finer grained amplitude or even phase modulation is possible for each wavelet slot as well. For example, encoding two bits as four amplitude or phase levels per plot (WM4522) would give a total throughput of 9000bps; 13500bps for 3-bits per plot (WM4523), 18000bps for 4-bits per plot (WM4524), and so forth. Precise standards for these higher speed modes will be forthcoming once more experience is gained with the WM4521 mode, however.

Choice of Wavelets

The Haar wavelet is chosen for its ease of implementation in both software and hardware, and for its compactness and high orthogonality in both the time and frequency domains. The Haar wavelet can be expressed mathematically as follows:


          { +1 iff 0 < t <= 0.5,
   W(t) = { -1 iff 0.5 < t <= 1,
          {  0 for all other values of t
   

Note that +1 and -1 map to relative amplitudes on the channel. t is a relative measure of time, whose absolute duration depends on the scale at which the wavelet is applied.
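The definition above translates directly into code. The following sketch also shows one way to squeeze W(t) into a fraction of the symbol period for the higher subbands; the helper scaled_haar and its parameters slots and slot are hypothetical names introduced here, with t normalized so that one symbol spans 0 to 1.

```python
def haar(t):
    """Haar mother wavelet W(t) as defined above."""
    if 0 < t <= 0.5:
        return 1
    if 0.5 < t <= 1:
        return -1
    return 0

def scaled_haar(t, slots, slot):
    """W(t) squeezed into one of `slots` equal slices of the unit interval.

    slots is 1, 2, 4 or 8; slot counts slices from 0.
    """
    return haar(t * slots - slot)

# Sampling at interval midpoints shows the wavelet carries no DC component.
midpoints = [(i + 0.5) / 32 for i in range(32)]
print(sum(haar(t) for t in midpoints))
```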

The Haar wavelet is used across all four subchannels.

Transmission Format

Because radio squelches are often slow to respond, the waveforms described by the WM4521 specification are designed to be detectable without the need to rely explicitly on a radio's carrier detect signal.

It is anticipated that WM4521 implementations will scan for synchronization using a sliding window of samples, and then subsequently process a fixed quantity of samples isochronously once found. In order to ensure synchronization between the transmitter and all the receivers, all data must be transmitted in frames which consist of a brief synchronization phase, followed by an arbitrary number of data symbols. The synchronization phase, if kept unique from regular payload data, is sufficient to permit receivers to adequately lock onto the transmitter's signal, and to synchronize sample rates as well.

It is left to higher layers to ensure sufficient bit density exists to maintain synchronization once a frame starts, and to ensure a synchronization symbol does not appear within the data payload. Ideally, at least one wavelet plot per symbol must be non-zero, so that synchronization can be maintained. The bit-stuffing required by AX.25 implementations, for example, would work well to achieve this. However, the bit-stuffing rules can be relaxed a little bit, as discussed in section 6.

Synchronization Phase

Because wavelet transforms work on whole blocks of waveform samples on an isochronous basis, it is important to get proper symbol timing at the receiver. Prior to achieving such a lock, however, the receiver may need to employ a sliding window of samples to isolate the start of frame condition. The initial symbol chosen for this purpose is 0x4001, which produces a symbol with plots 0 and E turned on. All other plots are turned off. This produces a pulse train which includes the lowest and highest frequency components.

Transmitters should send at least two synchronization symbols to help ensure proper synchronization of all receivers present on a channel. This will help synchronize the receiver's sample rate with that of the transmitter.

If the receiver detects a synchronization symbol in the middle of a data stream, it must pass it along as raw data. A receiver may, however, optionally use the synchronization symbol to recalibrate its internal timebase to that of the transmitter.

Payload

Once the sample stream from the input has been synchronized, the sliding window is abandoned in favor of isochronous processing of a fixed quantity of sample data.
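To make the fixed-block processing concrete, here is a minimal sketch of modulating and demodulating one already-synchronized symbol. It is not a reference implementation: it assumes 32 samples per symbol (9600 samples per second at 300 symbols per second), unit wavelet amplitude, an ideal noiseless channel, plots ordered in time within each subband, and a simple half-amplitude decision threshold. All function names are hypothetical.

```python
SAMPLES_PER_SYMBOL = 32  # assumption: 9600 samples/s at 300 symbols/s

def plot_wavelet(plot):
    """Sampled Haar wavelet for one plot (0..14) as a list of +1/-1/0."""
    level = (plot + 1).bit_length() - 1
    slots = 1 << level                       # 1, 2, 4 or 8 slots per symbol
    slot = plot + 1 - slots
    width = SAMPLES_PER_SYMBOL // slots      # samples covered by this plot
    start = slot * width
    wave = [0] * SAMPLES_PER_SYMBOL
    for i in range(width):
        wave[start + i] = 1 if i < width // 2 else -1
    return wave

def modulate(bits):
    """On/off key 15 bits onto one symbol's worth of samples."""
    samples = [0] * SAMPLES_PER_SYMBOL
    for plot, bit in enumerate(bits):
        if bit:
            for i, w in enumerate(plot_wavelet(plot)):
                samples[i] += w
    return samples

def demodulate(samples):
    """Recover 15 bits by correlating the block against each plot's wavelet."""
    bits = []
    for plot in range(15):
        wave = plot_wavelet(plot)
        energy = sum(w * w for w in wave)              # wavelet self-correlation
        score = sum(s * w for s, w in zip(samples, wave))
        bits.append(1 if score > energy / 2 else 0)    # half-amplitude threshold
    return bits

sync = [1] + [0] * 13 + [1]          # plots 0 and E on, as in symbol 0x4001
print(demodulate(modulate(sync)) == sync)
```

The correlation step works because Haar wavelets at different scales and positions are mutually orthogonal, so each plot can be read out independently of the others. A practical receiver would also have to cope with channel filtering, gain variation, and noise, none of which is modeled here.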

Refer to section 3 for the time/frequency layout of a symbol. Bits fill in starting at plot 0, and proceeding to plot E. Although all bits are essentially transmitted in parallel (within the context of a symbol), plot 0 is considered to be transmitted first, and plot E last. Hence, like RS-232, WM4521 must be treated as an LSB-first serial data transport.

NOTE --- Although WM4521 delivers a bit-serial transmission service, bits are still delivered in quantities of 15, due to the nature of the wavelet modulated symbol structure. It is up to higher-level communications software to discard unnecessary trailing bits.
NOTE --- Since at least one bit in each symbol must be set to ensure receiver synchronization, and because there are 13 bits between the bits in a synchronization symbol, bit-stuffing should be employed to ensure that there is never a contiguous sequence of 0-bits longer than 12 bits. Since there are 15 bits per symbol, this ensures that at least one bit is set in each symbol for all symbols.
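The LSB-first ordering described in this section can be sketched as follows. The helper names are hypothetical, and padding the final partial symbol with '0' bits is an assumption made here for illustration; the text only says that higher layers discard unnecessary trailing bits.

```python
SYNC_SYMBOL = 0x4001  # plots 0 and E set, per the synchronization phase

def symbol_to_plots(symbol):
    """Return the plot numbers that are turned on in a 15-bit symbol value."""
    return [plot for plot in range(15) if (symbol >> plot) & 1]

def pack_symbols(bits):
    """Pack a bit sequence into 15-bit symbol values, first bit into plot 0.

    A final partial symbol is padded with '0' bits (an assumption).
    """
    symbols = []
    for start in range(0, len(bits), 15):
        value = 0
        for plot, bit in enumerate(bits[start:start + 15]):
            value |= (bit & 1) << plot
        symbols.append(value)
    return symbols

print(symbol_to_plots(SYNC_SYMBOL))
print([hex(s) for s in pack_symbols([1] + [0] * 13 + [1])])
```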

End of Frame

The end of frame can be detected in one of two ways. The first method is to detect a drop in the medium's carrier, if one is provided. For example, when used over a radio communications channel, the radio will often export a signal indicating whether (asserted) or not (negated) a carrier is present. When the carrier detect signal is negated, it is patently clear that no further data is being received; hence, any frame currently being received is by default terminated.

The second way to detect an end of frame is to receive a symbol which has NO time/frequency plot energy at all, ignoring noise. Since all other symbols, by definition, have at least one bit set, the transmitter is free to "just stop transmitting" when it reaches the end of its payload.

Collisions

It is the responsibility of higher networking layers to handle collisions of frames.

Media Types

WM4521 is a baseband signal encoding, and can be modulated over a variety of media, including wireline and wireless technologies. This modulation method is optimized for use in amateur radio bands, however.

Mapping AX.25 to WM4521

The AX.25 Version 2.2 protocol is the recommended OSI Layer 2 protocol for use with WM4521. There are two methods that can be used in adapting AX.25 for use with WM4521.

ISO Layering

Because WM4521 implements a raw bitstream service, it's possible to transmit AX.25 frames, as specified by the AX.25 Version 2.2 specifications or later, directly using WM4521, except as amended below.

AX.25 performs bitstuffing on detecting a contiguous series of '1' bits. This is incompatible with WM4521, because long strings of '0' bits go unnoticed (recall from section 5.3 that a symbol with all zero bits is interpreted as an end of frame indication).

In order to ensure interoperability with WM4521, frame data is serialized to the WM4521 channel with all transmitted bits inverted from their original state. For example, if the following frame data is to be sent:

01111110 10000010 10000100 111110110 ...

They would need to be inverted, as follows:

10000001 01111101 01111011 000001001 ...

Note the last "octet" has a stuffed bit. In this way, what appears as a long stream of '1' bits with the stuffed '0' bits now appears as '0' bits with stuffed '1' bits, which is compatible with WM4521's synchronization rules.

The receiver is of course required to invert the WM4521 data payload for correct interpretation by the AX.25 layer.
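The inversion step is simple enough to show in a few lines. This sketch works on the textual bit strings used in the example above; the function name invert_bits is chosen here for illustration, and a real implementation would more likely operate on the serialized bits themselves.

```python
def invert_bits(bitstring):
    """Invert every '0'/'1' character, leaving spaces and anything else alone."""
    swap = {"0": "1", "1": "0"}
    return "".join(swap.get(ch, ch) for ch in bitstring)

frame = "01111110 10000010 10000100 111110110"
print(invert_bits(frame))   # 10000001 01111101 01111011 000001001
```

Applying the same function a second time restores the original bits, which is all the receiver needs to do.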

Native Adaptation

Adapting AX.25 to WM4521 is another approach to layer 2 frame delivery. This requires a bit more modification to the AX.25 interpretation, but depending on the channel requirements, may provide better operating characteristics. Depending on the hardware with which the terminal node controller (TNC) is implemented, it may also reduce the TNC's processing requirements, as data is transmitted and received 15 bits at a time instead of 1 bit at a time.

Synchronization

AX.25 flag octets are replaced with 15-bit flags of 0x4001. Since the very first flag pattern appears on an even symbol boundary, it is guaranteed to map to the WM4521 synchronization symbol.

Note that the 0x4001 flag is permitted to appear anywhere in the bitstream, and need not appear only on symbol boundaries. Only the first synchronization sequence is required to map to a synchronization symbol. This allows the receiver to effortlessly locate the AX.25 frame check sequence field.

Because bits are delivered in quantities of 15 from a WM4521 demodulator, an AX.25 implementation must be prepared to handle spurious bits after the last flag, wherever it may reside in the received bitstream.

Bit Stuffing

In contrast to the current AX.25 specifications, a '1' bit is stuffed into the bitstream after 12 contiguous '0' bits have been transmitted. Likewise, when receiving data, after 12 contiguous '0' bits have been received, the next bit must be discarded. No stuffing occurs for contiguous sequences of '1' bits.
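The stuffing rule can be sketched as a pair of functions operating on lists of 0/1 integers. This is a minimal illustration of the rule as stated, with hypothetical names; it covers frame data only and does not deal with the 0x4001 flags, which contain a run of 13 '0' bits.

```python
MAX_ZERO_RUN = 12  # stuff a '1' after this many contiguous '0' bits

def stuff(bits):
    """Insert a '1' after every run of 12 contiguous '0' bits."""
    out, run = [], 0
    for bit in bits:
        out.append(bit)
        run = run + 1 if bit == 0 else 0
        if run == MAX_ZERO_RUN:
            out.append(1)   # stuffed bit breaks the run
            run = 0
    return out

def unstuff(bits):
    """Discard the bit that follows every run of 12 contiguous '0' bits."""
    out, run, skip = [], 0, False
    for bit in bits:
        if skip:            # this is the stuffed bit: drop it
            skip = False
            run = 0
            continue
        out.append(bit)
        run = run + 1 if bit == 0 else 0
        if run == MAX_ZERO_RUN:
            skip = True
    return out

payload = [0] * 30 + [1] + [0] * 5
print(unstuff(stuff(payload)) == payload)
```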

Note that the longer permitted runs of bits reduce average packet jitter. Under the current AX.25 specifications, the 5-bit stuffing rule allows stuffed bits to make up as much as 16.7% (1 bit in 6) of a transmitted packet. WM4521's synchronization requirement limits this to no more than 7.7% (1 bit in 13). Therefore, AX.25 packets transmitted over WM4521 channels can be up to 9% smaller in cases where long strings of '0' bits can be expected (e.g., voice over packet, digital image transmission, etc).

NRZI Encoding

Because bits are transmitted in parallel in specially modulated waveforms, timing information is already extracted from the waveform, and because a wavelet modulated signal inherently has no DC bias, NRZI encoding is not performed.

Framing

Note that a single WM4521 frame may contain multiple AX.25 frames, each separated by a flag sequence.

Works Cited

Miller, James. "Shape of Bits to Come." Spread Spectrum Scene Online. Pearce, Jim. 1998 May 28. Pegasus Technologies, Inc. 2004 Oct 28. http://www.sss-mag.com/G3RUH/index1.html#bits