1 <?xml version="1.0" encoding="ISO-8859-1"?>
2 <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
4 <!ENTITY test_tcm_listing SYSTEM "test_tcm.py.xml">
10 <title>Trellis-based algorithms for GNU Radio</title>
12 <firstname>Achilleas</firstname>
13 <surname>Anastasopoulos</surname>
16 <email>anastas@umich.edu</email>
23 <revnumber>v0.0</revnumber>
24 <date>2006-08-03</date>
32 <abstract><para>This document provides a description of the
33 Finite State Machine (FSM) implementation and the related
34 trellis-based encoding and decoding algorithms
43 <!--=====================================================-->
44 <sect1 id="intro"><title>Introduction</title>
49 The basic goal of the implementation is to have a generic way of
50 describing an FSM that is decoupled from whether it describes a
52 code (CC), a trellis code (TC), an inter-symbol interference (ISI)
54 other communication system that can be modeled with an FSM.
55 To achieve this goal, we need to separate the pure FSM descrition from the
56 rest of the model details. For instance, in the case of a rate 2/3 TC,
57 the FSM should not involve details about the modulation used (it can
58 be an 8-ary PAM, or 8-PSK, etc). Similarly, when attempting maximum likelihood
59 sequence detection (MLSD)--using for instance the Viterbi algorithm (VA)--
60 the VA implementation should not be concerned with the channel details
61 (such as modulations, channel type, hard or soft inputs, etc).
62 Clearly, having generality as the primary goal implies some penalty
63 on the code efficiency, as compared to fully custom implementations.
67 We will now describe the implementation of the basic ingedient, the FSM.
73 <!--=====================================================-->
74 <sect1 id="fsm"><title>The FSM class</title>
76 <para>An FSM describes the evolution of a system with inputs
77 x<subscript>k</subscript>, states s<subscript>k</subscript> and outputs y<subscript>k</subscript>. At time k the FSM state is s<subscript>k</subscript>.
78 Upon reception of a new input symbol x<subscript>k</subscript>, it outputs an output symbol
79 y<subscript>k</subscript> which is a function of both x<subscript>k</subscript> and s<subscript>k</subscript>.
80 It will then move to a next state s<subscript>k+1</subscript>.
81 An FSM has a finite number of states, input and output symbols.
82 All these are formally described as follows:
86 <listitem><para>The input alphabet A<subscript>I</subscript>={0,1,2,...,I-1}, with cardinality I, so that x<subscript>k</subscript> takes values in A<subscript>I</subscript>.</para></listitem>
87 <listitem><para>The state alphabet A<subscript>S</subscript>={0,1,2,...,S-1}, with cardinality S, so that s<subscript>k</subscript> takes values in A<subscript>S</subscript>.</para></listitem>
88 <listitem><para>The output alphabet A<subscript>O</subscript>={0,1,2,...,O-1}, with cardinality O, so that y<subscript>k</subscript> takes values in A<subscript>O</subscript></para></listitem>
89 <listitem><para>The "next-state" function NS: A<subscript>S</subscript> x A<subscript>I</subscript> --> A<subscript>S</subscript>,
91 s<subscript>k+1</subscript> = NS(s<subscript>k</subscript>, x<subscript>k</subscript>)</para></listitem>
92 <listitem><para>The "output-symbol" function OS: A<subscript>S</subscript> x A<subscript>I</subscript> --> A<subscript>S</subscript>,
94 y<subscript>k</subscript> = OS(s<subscript>k</subscript>, x<subscript>k</subscript>)</para></listitem>
98 Thus, a complete description of the FSM is given by the
99 the five-tuple (I,S,O,NS,OS).
100 Observe that implementation details are hidden
101 in how the outside world interprets these input and output
103 Here is an example of an FSM describing the (2,1) CC
104 with constraint length 3 and generator polynomial matrix
105 (1+D+D<superscript>2</superscript> , 1+D<superscript>2</superscript>)
106 from Proakis-Salehi pg. 779.
110 <example id="cc_ex"><title>(2,1) CC with generator polynomials (1+D+D<superscript>2</superscript> , 1+D<superscript>2</superscript>)</title>
113 This CC accepts 1 bit at a time, and outputs 2 bits at a time.
114 It has a shift register storing the last two input bits.
116 b<subscript>k</subscript>(0)=x<subscript>k</subscript>+
117 x<subscript>k-1</subscript>+x<subscript>k-2</subscript>, and
118 b<subscript>k</subscript>(1)=x<subscript>k</subscript>+
119 x<subscript>k-2</subscript>, where addition is mod-2.
120 We can represent the state of this system
121 as s<subscript>k</subscript> = (x<subscript>k-1</subscript> x<subscript>k-2</subscript>)<subscript>10</subscript>. In addition we can represent its
122 output symbol as y<subscript>k</subscript> = (b<subscript>k</subscript>(1) b<subscript>k</subscript>(0))<subscript>10</subscript>.
123 Based on the above assumptions, the input alphabet A<subscript>I</subscript>={0,1}, so I=2;
124 the state alphabet A<subscript>S</subscript>={0,1,2,3}, so S=4; and
125 the output alphabet A<subscript>O</subscript>={0,1,2,3}, so O=4.
126 The "next-state" function NS(,) is given by
128 s<subscript>k</subscript> x<subscript>k</subscript> s<subscript>k+1</subscript>
138 The "output-symbol" function OS(,) can be given by
140 s<subscript>k</subscript> x<subscript>k</subscript> y<subscript>k</subscript>
153 Note that although the CC outputs 2 bits per time period, following
154 our approach, there is only one (quaternary) output symbol per
155 time period (for instance, here we use the decimal representation
156 of the 2-bits). Also note that the modulation used is not part of
157 the FSM description: it can be BPSK, OOK, BFSK, QPSK with or without Gray mapping, etc;
158 it is up to the rest of the program to interpret the meaning of
159 the symbol y<subscript>k</subscript>.
166 The C++ implementation of the FSM class keeps private information about
167 I,S,O,NS,OS and public methods to read and write them. The NS
168 and OS matrices are implemented as STL 1-dimensional vectors.
177 std::vector<int> d_NS;
178 std::vector<int> d_OS;
179 std::vector<int> d_PS;
180 std::vector<int> d_PI;
181 std::vector<int> d_TMi;
182 std::vector<int> d_TMl;
183 void generate_PS_PI ();
185 bool find_es(int es);
188 fsm(const fsm &FSM);
189 fsm(int I, int S, int O, const std::vector<int> &NS, const std::vector<int> &OS);
190 fsm(const char *name);
191 fsm(int k, int n, const std::vector<int> &G);
192 fsm(int mod_size, int ch_length);
193 int I () const { return d_I; }
194 int S () const { return d_S; }
195 int O () const { return d_O; }
196 const std::vector<int> & NS () const { return d_NS; }
197 const std::vector<int> & OS () const { return d_OS; }
198 const std::vector<int> & PS () const { return d_PS; }
199 const std::vector<int> & PI () const { return d_PI; }
200 const std::vector<int> & TMi () const { return d_TMi; }
201 const std::vector<int> & TMl () const { return d_TMl; }
206 As can be seen, other than the trivial and the copy constructor,
207 there are three additional
208 ways to construct an FSM.
213 <para>Supplying the parameters I,S,O,NS,OS:</para>
215 fsm(const int I, const int S, const int O, const std::vector<int> &NS, const std::vector<int> &OS);
220 <para>Giving a filename containing all the FSM information:</para>
222 fsm(const char *name);
225 This information has to be in the following format:
229 NS(0,0) NS(0,1) ... NS(0,I-1)
230 NS(1,0) NS(1,1) ... NS(1,I-1)
232 NS(S-1,0) NS(S-1,1) ... NS(S-1,I-1)
234 OS(0,0) OS(0,1) ... OS(0,I-1)
235 OS(1,0) OS(1,1) ... OS(1,I-1)
237 OS(S-1,0) OS(S-1,1) ... OS(S-1,I-1)
241 For instance, the file containing the information for the example mentioned above is of the form:
261 The third way is specific to FSMs representing binary (n,k) conolutional codes. These FSMs are specified by the number of input bits k, the number of output bits n, and the generator matrix, which is a k x n matrix of integers
262 G = [g<subscript>i,j</subscript>]<subscript>i=1:k, j=1:n</subscript>, given as an one-dimensional STL vector.
263 Each integer g<subscript>i,j</subscript> is the decimal representation of the
264 polynomial g<subscript>i,j</subscript>(D) (e.g., g<subscript>i,j</subscript>= 6 = 110<subscript>2</subscript> is interpreted as g<subscript>i,j</subscript>(D)=1+D) describing the connections from the sequence x<subscript>i</subscript> to
265 y<subscript>j</subscript> (e.g., in the above example y<subscript>j</subscript>(k) = x<subscript>i</subscript>(k) + x<subscript>i</subscript>(k-1)).
268 fsm(int k, int n, const std::vector<int> &G);
275 The fourth way is specific to FSMs resulting from shift registers, and the output symbol being the entire transition (ie, current_state and current_input). These FSMs are usefull when describibg ISI channels. In particular the state is comprised of the input symbols x(k-1), x(k-2),...,x(k-L), where L = ch_length-1 and each x(i) belongs to an alphabet of size mod_size. The output is taken to be x(k), x(k-1), x(k-2),...,x(k-L) (in decimal format)
278 fsm(const int mod_size, const int ch_length);
286 As can be seen from the above description, there are
287 two more variables included in the FSM class implementation,
288 the PS and the PI matrices. These are computed internally
289 when an FSM is instantiated and their meaning is as follows.
290 Sometimes (eg, in the traceback operation of the VA) we need
291 to trace the history of the state or the input sequence.
292 To do this we would like to know for a given state s<subscript>k</subscript>, what are the possible previous states s<subscript>k-1</subscript>
293 and what input symbols x<subscript>k-1</subscript> will get us from
294 s<subscript>k-1</subscript> to s<subscript>k</subscript>. This information can be derived from NS; however we want to have it ready in a
296 In the following we assume that for any state,
297 the number of incoming transitions is the same as the number of
298 outgoing transitions, ie, equal to I. All applications of interest
299 have FSMs satisfying this requirement.
301 If we arbitrarily index the incoming transitions to the current state
302 by "i", then as i goes from 0 to I-1, PS(s<subscript>k</subscript>,i)
303 gives all previous states s<subscript>k-1</subscript>,
304 and PI(s<subscript>k</subscript>,i) gives all previous inputs x<subscript>k-1</subscript>.
305 In other words, for any given s<subscript>k</subscript> and any index i=0,1,...I-1, starting from
306 s<subscript>k-1</subscript>=PS(s<subscript>k</subscript>,i)
308 x<subscript>k-1</subscript>=PI(s<subscript>k</subscript>,i)
309 will get us to the state s<subscript>k</subscript>.
310 More formally, for any i=0,1,...I-1 we have
311 s<subscript>k</subscript> = NS(PS(s<subscript>k</subscript>,i),PI(s<subscript>k</subscript>,i)).
318 two more variables included in the FSM class implementation,
319 the TMl and the TMi matrices. These are both S x S matrices (represented as STL vectors) computed internally
320 when an FSM is instantiated and their meaning is as follows.
321 TMl(i,j) is the minimum number of trellis steps required to go from state i to state j.
322 Similarly, TMi(i,j) is the initial input required to get you from state i to state j in the minimum number of steps. As an example, if TMl(1,4)=2, it means that you need 2 steps in the trellis to get from state 1 to state 4. Further,
323 if TMi(1,4)=0 it means that the first such step will be followed if when at state 1 the input is 0. Furthermore, suppose that NS(1,0)=2. Then, TMl(2,4) should be 1 (ie, one more step is needed when starting from state 2 and having state 4 as the final destination). Finally, TMi(2,4) will give us the second input required to complete the path from 1 to 4.
324 These matrices are useful when we want to implement an encoder with proper state termination. For instance, based on these matrices we can evaluate how many
325 additional input symbols (and which particular inputs) are required to be appended at the end of an input sequence so that the final state is always 0.
334 <!--=====================================================-->
335 <sect1 id="blocks"><title>Blocks Using the FSM structure</title>
338 In this section we give a brief description of the basic blocks implemented that make use of the previously described FSM structure.
342 <!-- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -->
343 <sect2 id="encoder"><title>Trellis Encoder</title>
346 The trellis.encoder_XX(FSM, ST) block instantiates an FSM encoder corresponding to the fsm FSM and having initial state ST. The input and output is a sequence of bytes, shorts or integers.
351 <!-- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -->
352 <sect2 id="decoder"><title>Viterbi Decoder</title>
355 The trellis.viterbi_X(FSM, K, S0, SK) block instantiates a Viterbi decoder
356 for an underlying ...
369 <!--=====================================================-->
370 <sect1 id="tcm"><title>TCM: A Complete Example</title>
373 We now discuss through a concrete example how
374 the above FSM model can be used in GNU Radio.
376 The communication system that we want to simulate
377 consists of a source generating the
378 input information in packets, a CC encoding each packet separately,
379 a memoryless modulator,
380 an additive white Gaussian noise (AWGN) channel, and
381 the VA performing MLSD.
382 The program source is as follows.
388 The program is called by
390 ./test_tcm.py fsm_fname Es/No_db repetitions
392 where "fsm_fname" is the file containing the FSM specification of the
393 tested TCM code, "Es/No_db" is the SNR in dB, and "repetitions"
394 are the number of packets to be transmitted and received in order to
395 collect sufficient number of errors for an accurate estimate of the
400 The FSM is first intantiated in "main" by
403 62 f=trellis.fsm(fname) # get the FSM specification from a file (will hopefully be automated in the future...)
413 Each packet has size Kb bits
414 (we choose Kb to be a multiple of 16 so that all bits fit nicely into shorts and can be generated by the lfsr GNU Radio).
415 Assuming that the FSM input has cardinality I, each input symbol consists
416 of bitspersymbol=log<subscript>2</subscript>( I ). The Kb/16 shorts are now
417 unpacked to K=Kb/bitspersymbol input
418 symbols that will drive the FSM encoder.
421 63 Kb=1024*16 # packet size in bits (make it multiple of 16 so it can be packed in a short)
422 64 bitspersymbol = int(round(math.log(f.I())/math.log(2))) # bits per FSM input symbol
423 65 K=Kb/bitspersymbol # packet size in trellis steps
429 The FSM will produce K output symbols (remeber the FSM produces always one output symbol for each input symbol). Each of these symbols needs to be modulated. Since we are simulating the communication system, we need not simulate the actual waveforms. An M-ary, N-dimensional
430 modulation is completely specified by a set of M, N-dimensional real vectors. In "fsm_utils.py" file we give a number of useful modulations with the following format: modulation = (N,constellation), where
431 constellation=[c11,c12,...,c1N,c21,c22,...,c2N,...,cM1,cM2,...cMN].
432 The meaning of the above is that every constellation point c_i
433 is an N-dimnsional vector c_i=(ci1,ci2,...,ciN)
434 For instance, 4-ary PAM is represented as
435 (1,[-3, -1, 1, 3]), while QPSK is represented as
436 (2,[1, 0, 0, 1, 0, -1, -1, 0]). In our example we choose QPSK modulation.
437 Clearly, M should be equal to the cardinality of the FSM output, O.
438 Finally the average symbol energy and noise variance are calculated.
441 66 modulation = fsm_utils.psk4 # see fsm_utlis.py for available predefined modulations
442 67 dimensionality = modulation[0]
443 68 constellation = modulation[1]
444 69 if len(constellation)/dimensionality != f.O():
445 70 sys.stderr.write ('Incompatible FSM output cardinality and modulation size.\n')
447 72 # calculate average symbol energy
449 74 for i in range(len(constellation)):
450 75 Es = Es + constellation[i]**2
451 76 Es = Es / (len(constellation)/dimensionality)
452 77 N0=Es/pow(10.0,esn0_db/10.0); # noise variance
458 Then, "run_test" is called with a different "seed" so that we get
459 different noise realizations.
462 82 (s,e)=run_test(f,Kb,bitspersymbol,K,dimensionality,constellation,N0,-long(666+i)) # run experiment with different seed to get different noise realizations
468 Let us examine now the "run_test" function.
469 First we set up the transmitter blocks.
470 The Kb/16 shorts are first unpacked to
471 symbols consistent with the FSM input alphabet.
472 The FSm encoder requires the FSM specification,
473 and an initial state (which is set to 0 in this example).
477 16 src = gr.lfsr_32k_source_s()
478 17 src_head = gr.head (gr.sizeof_short,Kb/16) # packet size in shorts
479 18 s2fsmi = gr.packed_to_unpacked_ss(bitspersymbol,gr.GR_MSB_FIRST) # unpack shorts to symbols compatible with the FSM input cardinality
480 19 enc = trellis.encoder_ss(f,0) # initial state = 0
487 We now need to modulate the FSM output symbols.
488 The "chunks_to_symbols_sf" is essentially a memoryless mapper which
489 for each input symbol y_k
490 outputs a sequence of N numbers ci1,ci2,...,ciN representing the
491 coordianates of the constellation symbol c_i with i=y_k.
494 20 mod = gr.chunks_to_symbols_sf(constellation,dimensionality)
498 The channel is AWGN with appropriate noise variance.
499 For each transmitted symbol c_k=(ck1,ck2,...,ckN) we receive a noisy version
500 r_k=(rk1,rk2,...,rkN).
505 24 noise = gr.noise_source_f(gr.GR_GAUSSIAN,math.sqrt(N0/2),seed)
511 Part of the design methodology was to decouple the FSM and VA from
512 the details of the modulation, channel, receiver front-end etc.
513 In order for the VA to run, we only need to provide it with
514 a number representing a cost associated with each transition
515 in the trellis. Then the VA will find the sequence with
516 the smallest total cost through the trellis.
517 The cost associated with a transition (s_k,x_k) is only a function
518 of the output y_k = OS(s_k,x_k), and the observation
519 vector r_k. Thus, for each time period, k,
520 we need to label each of the SxI transitions with such a cost.
521 This means that for each time period we need to evaluate
522 O such numbers (one for each possible output symbol y_k).
524 in "metrics_f". In particular, metrics_f is a memoryless device
525 taking N inputs at a time and producing O outputs. The N inputs are
528 are the costs associated with observations rk1,rk2,...,rkN and
529 hypothesized output symbols c_1,c_2,...,c_M. For instance,
530 if we choose to perform soft-input VA, we need to evaluate
531 the Euclidean distance between r_k and each of c_1,c_2,...,c_M,
532 for each of the K transmitted symbols.
533 Other options are available as well; for instance, we can
534 do hard decision demodulation and feed the VA with
535 symbol Hamming distances, or even bit Hamming distances, etc.
536 These are all implemented in "metrics_f".
540 27 metrics = trellis.metrics_f(f.O(),dimensionality,constellation,trellis.TRELLIS_EUCLIDEAN) # data preprocessing to generate metrics for Viterbi
544 Now the VA can run once it is supplied by the initial and final states.
545 The initial state is known to be 0; the final state is usually
546 forced to some value by padding the information sequence appropriately.
547 In this example, we always send the the same info sequence (we only randomize noise) so we can evaluate off line the final state and then provide it to the VA (a value of -1 signifies that there is no fixed initial
548 or final state). The VA outputs the estimates of the symbols x_k which
549 are then packed to shorts and compared with the transmitted sequence.
552 28 va = trellis.viterbi_s(f,K,0,-1) # Put -1 if the Initial/Final states are not set.
553 29 fsmi2s = gr.unpacked_to_packed_ss(bitspersymbol,gr.GR_MSB_FIRST) # pack FSM input symbols to shorts
554 30 dst = gr.check_lfsr_32k_s();
561 The function returns the number of shorts and the number of shorts in error. Observe that this way the estimated error rate refers to
562 16-bit-symbol error rate.
565 48 return (ntotal,ntotal-nright)
573 <!--====================n================================-->
574 <sect1 id="future"><title>Future Work</title>
582 Improve the documentation :-)
588 automate fsm generation from rational functions
595 Optimize the VA code.
602 decoders, eg, sphere decoding, M- and T- algorithms, sequential decoding, etc.
610 Although turbo decoding is rediculously slow in software,
611 we can design it in principle. One question is, whether we should
612 use the encoder, and SISO blocks and connect them
613 through GNU radio or we should implement turbo-decoding
614 as a single block (issues with buffering between blocks).
615 So far the former has been implemented.