
University of Groningen

On the weight adjacency matrix of convolutional codes

Schneider, Hans-Gert

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version: Publisher's PDF, also known as Version of record

Publication date: 2008

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA): Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. [S.l.]: [s.n.].

Copyright: Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

Take-down policy: If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.

Download date: 17-07-2020


Rijksuniversiteit Groningen

On the weight adjacency matrix of convolutional codes

Doctoral thesis

to obtain the doctorate in Mathematics and Natural Sciences at the Rijksuniversiteit Groningen, by authority of the Rector Magnificus, Dr. F. Zwarts, to be defended in public on

Friday 26 September 2008 at 16:15

by

Hans-Gert Schneider

born on 28 April 1979 in Gelsenkirchen (Germany)


Promotor: Prof. Dr. R. F. Curtain

Copromotor: Prof. Dr. H. Gluesing-Luerssen

Beoordelingscommissie:

Prof. Dr. H.-G. Quebbemann
Prof. Dr. J. Rosenthal
Prof. Dr. M. van der Put

ISBN: 978-90-367-3587-2


Als ik kan


Acknowledgements

The most important lesson I learned about scientific work during my years as a PhD student in Groningen is the following: one cannot take for granted that the names that appear on the first pages of a paper or thesis are in fact those of the people responsible for its existence. In my case this is different. Besides me, each of the readers and the (co-)promotors contributed a large share. It is only for formal reasons that Prof. Dr. Heide Gluesing-Luerssen's name appears in second place as co-promotor. In fact, the research summarised in this thesis would never have started without her mathematical instinct to initiate the project I had the pleasure to work on. Whenever necessary she offered her knowledge and persistently pointed out every single one of my innumerable mistakes, which made her invaluable to me. Most importantly, throughout our collaboration she respected my (sometimes very special) way of organising my life and work, for which I am deeply indebted to her. Taking into account that Fai Lung Tsang and I have been her first ever PhD students, she seems to have a natural talent for being a PhD supervisor.

Prof. Dr. Ruth Curtain ideally complemented Prof. Dr. Heide Gluesing-Luerssen. In the first place it was her experience with administrative matters, which pose a real difficulty in acquiring a PhD in Groningen, that helped me to concentrate on the research. So it came as no surprise that, after Prof. Dr. Heide Gluesing-Luerssen's move to the University of Kentucky, I very much appreciated her offer to take over the duties associated with the completion of my PhD.

Besides reading my thesis and giving helpful comments and suggestions, each of the readers made special contributions. Prof. Dr. Heinz-Georg Quebbemann has strongly influenced and shaped my understanding of mathematics since I started to study nine years ago, through his exemplary lectures and his patience in working with his students. Prof. Dr. Joachim Rosenthal offered me (and other young mathematicians) many opportunities to present results to an audience working in the same field of mathematics. Finally, Prof. Dr. Marius van der Put encouraged me to find my own way of writing up the thesis and has been supportive ever since I met him.

Of course, there are many more people who helped me in the last four years in one way or another. There is my network of friends, in particular Andreas Schmachtl, who among other things cared for my menagerie when I was visiting conferences or workshops, which would have been impossible without their help. There is the academic staff in the department of mathematics, whose friendliness made it easier to come to the university. In the first place these were Prof. Dr. Hans Nieuwenhuis, Prof. Dr. Jaap Top and (Tomas) Fai Lung Tsang, with whom I had many inspiring and interesting discussions. Finally, I have to thank my sister Ingrid Schneider, who supported me in her own special way throughout my whole life.


Contents

1 Introduction 6

2 Basic Notions of Coding Theory 9

2.1 Bilinear Forms and Characters . . . . . . . . . . . . . . . . . . . . . . 12

2.2 The MacWilliams Identity for Block Codes . . . . . . . . . . . . . . . 13

2.3 From Block Codes to Convolutional Codes . . . . . . . . . . . . . . . 18

2.4 Duality Notions for Convolutional Codes . . . . . . . . . . . . . . . . 22

3 The Weight Adjacency Matrix of a Convolutional Code 26

3.1 The Complete Weight Adjacency Matrix and its Properties . . . . . . 26

3.2 The Space of State Pairings . . . . . . . . . . . . . . . . . . . . . . . 29

3.3 Further Results on the Weight Adjacency Matrices . . . . . . . . . . 33

4 The MacWilliams Identity for the Weight Adjacency Matrix of a Convolutional Code 40

4.1 The Matrix HΓH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

4.2 The Isomorphisms f1 and f2 . . . . . . . . . . . . . . . . . . . . . . . 51

4.3 MacWilliams Identities for Convolutional Codes . . . . . . . . . . . . 57

4.4 Previous Results and Possible Generalisations . . . . . . . . . . . . . 63

5 On Self-orthogonal and Self-dual Convolutional Codes 67

5.1 Self-orthogonal and Self-dual Block Codes . . . . . . . . . . . . . . . 68

5.2 The Clifford Groups and Gleason’s Theorem for Block Codes . . . . . 69

5.3 Self-orthogonal and Self-dual Convolutional Codes . . . . . . . . . . . 71

5.4 Invariant Theory and Convolutional Codes . . . . . . . . . . . . . . . 74

6 On Concepts of Equivalence for Convolutional Codes 78


1 Introduction

Since its birth, Coding Theory has been a fast-developing subject at the intersection of mathematics, engineering and computer science. Problems and their solutions in this field typically arise from all three disciplines. As a consequence, many theorems may have an impact on only one or two of the disciplines. In general, there is a division between applications-driven research, which is connected to the engineering community and computer scientists, and the theoretical aspects of coding theory, which are more popular with mathematicians. One result that is equally important to, and has admitted deeper insight for, both groups is the MacWilliams Identity for block codes, which was proven as early as 1961. Ever since, it has been an important tool in studying codes and one of their most important properties: the weights of their codewords.

From a mathematical point of view a codeword is nothing but a sequence of information bits, which is encoded from a shorter sequence of message bits; the code is then formed by the set of these codewords. The information bits come from an alphabet, which is a priori free to choose. By virtue of the difference in length between the message and the codeword, extra information is added to the codeword so that the transmission of the codeword via a noisy, that is an error-susceptible, channel becomes more reliable. The idea is simply that the added information may be used to detect and, if necessary, to correct a limited number of errors that occurred during the transmission of the codeword. Of course, in the digital world the most common alphabet is {0, 1}, which is given the structure of the finite field F_2, and codes are therefore subsets of some vector space F_2^n with a prescribed block length n. For error detection a metric is imposed on that vector space, creating a notion of distance between codewords. This metric is in general most powerful if the code itself has the structure of a vector space. The class of these codes is referred to as linear block codes and has from the very beginning been at the center of attention of both mathematicians and engineers. One of the challenges in coding theory has been to find large codes in which every two distinct codewords are separated by a large distance. Obviously, the complexity of determining the minimal distance of a code increases with its dimension. The MacWilliams identity is a tool that relates the distance or weight distribution of all codewords of a given high-dimensional code with that of a low-dimensional code and vice versa. Its use for applications is hence apparent. The impact on the mathematical side is more subtle, but equally widespread.

The second fundamental theorem of MacWilliams is the MacWilliams Equivalence or Extension Theorem, which clarifies when two codes are equally good. Common sense tells us that rescaling and permuting the coordinates of each codeword of a given code leads to a code with similar distance properties as the original code. Due to the MacWilliams Equivalence Theorem this extrinsic notion of code equivalence happens to coincide with the intrinsic notion of having a linear isometry between two codes. The relevance for applications of this mathematically beautiful result is obvious. It gives an important indication when two codes should be identified


because they have identical error-correcting properties.

The two theorems of MacWilliams are exceptional in the sense that they are equally well known in the mathematical and the engineering community. The mathematical part of coding theory soon developed a life of its own, to a large extent uncoupled from its motivation in applications. New fields of application for coding theory, like deep-space communication, mobile phones and, most importantly, the ubiquitous use of personal computers, led to modifications of the original setting on the engineering side, which only hesitantly found their way to mathematicians, partly because the resulting mathematics is not elegant. One of the modifications is the insight that information bits are usually not sent in blocks, but as a continuous stream of blocks at different time instances. This may be achieved by choosing a linear block code and sending a codeword (one block) at each time instance. Thereby, however, one sends blocks that are not interconnected: each block contains no information on the previous and the following block. The idea of interconnecting the blocks of the different time instances leads to convolutional codes, which are no longer linear block codes in the classical sense. Among these codes there are codes with impressive error-correcting capabilities, and they naturally come with an efficient decoding algorithm. Therefore they have been implemented by engineers in many application fields for more than three decades. Despite their widespread use, the mathematical theory of convolutional codes has long been and still is underdeveloped. Many of the known facts are due to engineers with exceptional mathematical capabilities.

Due to their nature, block codes may be seen as a subclass of convolutional codes. Hence it is natural to ask whether the results known for block codes may be generalised to convolutional codes. Both theorems of MacWilliams are natural candidates to start with, but until recently no generalisations to convolutional codes had been known. The mathematical tools in coding theory typically come from the field of algebra. Convolutional codes, however, share the structure of linear systems, which have been extensively studied in systems theory. This likewise young discipline opens a new perspective on convolutional codes and provides a powerful tool to study them, as has, for instance, been demonstrated in [31].

In this thesis I will employ these methods to generalise the classical MacWilliams Identity for linear block codes to convolutional codes. In order to do so I will use a generalisation of the distance distribution of a block code to convolutional codes which has not yet been extensively studied: the weight adjacency matrix of a convolutional code. It contains information not only on the minimal distance of the code, but also on more refined distance parameters. One of its disadvantages is that, at first sight, it is not an invariant of the convolutional code. Whereas this problem has recently been resolved [8], it remains a problem that there is no known way to derive the weight adjacency matrix of a code from the code directly. Only by representing the code in a suitable way may one obtain the weight adjacency matrix using systems-theoretic tools. This problem is reflected in the proof of the MacWilliams identity. Apart from the original proof employed by MacWilliams, it was very soon discovered that, from a mathematical point of view, the MacWilliams identity is most elegantly proven using a discrete Fourier transform. My attempts


to copy this principle and thereby prove the generalisation of the MacWilliams identity for convolutional codes have been fruitless, due to the very nature of the weight adjacency matrix. However, in this thesis a proof is given which verifies the identity with the help of technical means to describe the weight adjacency matrix. Having proved the MacWilliams identity, I survey the possibilities of using it to take some preliminary steps towards a theory of self-dual convolutional codes.

Finally, I briefly demonstrate the problems connected with the generalisation of the MacWilliams Equivalence Theorem to convolutional codes. It appears that the greatest challenge here is to find out when two convolutional codes are really equally good; in other words, how many algebraic and distance parameters two convolutional codes need to share to call them equivalent. Although no final answer is given, I obtain a result that gives a partial solution for the class of convolutional codes which is most important for applications.


2 Basic Notions of Coding Theory

Before discussing convolutional codes it is essential to recall the basic notions of classical coding theory. For a more general introduction to coding theory see for instance [22], [19]. Let F = F_q be the (up to isomorphism) unique finite field with q = p^s elements, called the alphabet, where p ∈ Z is some prime and s ∈ N, s > 0.

Definition 2.1 A linear code of length n and dimension k over F is a subspace C ≤ F^n of dimension k. A matrix G ∈ F^{k×n} such that C = im G is called a generator matrix of the code C.

It is well known that any linear block code has a generator matrix, which is typically non-unique. Block codes are from now on always linear block codes over F. The strength of coding theory stems from imposing a metric on the space F^n, the Hamming metric.

Definition 2.2 Let c = (c_1, . . . , c_n) ∈ F^n. Then wt(c) := #{i | c_i ≠ 0} is the Hamming weight, or weight for short, of c. Let C ⊆ F^n be a block code; then d := min{wt(c) | c ∈ C, c ≠ 0} is the minimal distance of the code C.

It is easy to check that the tuple (F^n, wt) is indeed a metric space. Let c, c′ ∈ F^n; then wt(c − c′) is precisely the number of coordinates in which c and c′ differ. So if c ∈ C is a sent codeword and c′ ∈ F^n is the received word after transmission, then wt(c − c′) is the minimal number of errors that occurred during the transmission. A linear code C with minimal distance d can therefore correct up to ⌊(d − 1)/2⌋ errors. So the minimal distance of a code is obviously one of the most important parameters to measure its quality for practical purposes. Alas, the determination of the minimal distance of a random block code is a hard problem. The most straightforward way to do so is certainly to go through a list of all codewords and calculate their weights. In fact this is mirrored in the following object.

Definition 2.3 Let S ⊆ F^n be a set. Then the weight enumerator of the set is

we(S) := ∑_{c∈S} W^{wt(c)} ∈ C[W].

The complete weight enumerator of the set is

cwe(S) := ∑_{c∈S} ∏_{i=1}^{n} W_{c_i} ∈ C[W_a | a ∈ F].

Another interpretation of the weight enumerator is the following: the coefficient of W^j is the number of vectors of weight j in S. Note that the complete weight enumerator of a subset of F^n is a homogeneous polynomial of degree n. It is immediate from the definition that the weight enumerator is not homogeneous and generally


has a degree smaller than n. If the set S is even a block code, the minimal distance of the code is the degree of the lowest non-constant term of the weight enumerator. Moreover, one can also determine how many codewords of this weight exist, which again is of great interest when it comes to considerations for applications. The weight enumerator of a set is best seen as a symmetrisation of its complete weight enumerator. Taking this point of view, there is an easy way to obtain the weight enumerator of a set from the complete weight enumerator.

Lemma 2.4 Let S ⊆ F^n. Let ι : C[W_a | a ∈ F] → C[W] be the homomorphic extension of

W_a ↦ 1 if a = 0, and W_a ↦ W if a ≠ 0.

Then we(S) = ι(cwe(S)) and ι is a C-algebra homomorphism.

Note that, in general, there is no inverse map to ι; that is, the complete weight enumerator yields more detailed information on the set than the weight enumerator. However, if q = 2, the complete weight enumerator can be obtained from the weight enumerator by homogenisation of the latter. I will illustrate this by a toy example.

Example 2.5 Consider the one-dimensional code C := {(1, 1, 0), (0, 0, 0)} ≤ F_2^3. Then wt(1, 1, 0) = 2 and wt(0, 0, 0) = 0. Therefore we(C) = 1 + W^2. The complete weight enumerator is simply computed to be cwe(C) = W_0^3 + W_0 W_1^2. Identifying W_1 = W it is immediate that cwe(C) is obtained from we(C) by homogenisation with W_0.

Now consider C̃ := {(2, 1, 0), (1, 2, 0), (0, 0, 0)} ≤ F_3^3. Like with the codewords of C it is wt(1, 2, 0) = wt(2, 1, 0) = 2 and wt(0, 0, 0) = 0, hence we(C̃) = 1 + 2W^2, but the complete weight enumerator is easily computed to be cwe(C̃) = W_0^3 + 2 W_0 W_1 W_2. Note that Ĉ := {(1, 1, 0), (2, 2, 0), (0, 0, 0)} ≤ F_3^3 fulfills we(Ĉ) = we(C̃), but cwe(Ĉ) ≠ cwe(C̃). This shows that there cannot be a general inverse map to ι even in this very simple situation.
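The enumerators of the example can be checked mechanically. The sketch below (my own helper names, not from the text) represents we(S) as a map {weight: count} and cwe(S) as a map from exponent tuples (n_0, ..., n_{q-1}) of the monomials W_0^{n_0}...W_{q-1}^{n_{q-1}} to their multiplicities:

```python
from collections import Counter

def we(S):
    """Weight enumerator as {weight j: number of words of weight j}."""
    return Counter(sum(x != 0 for x in c) for c in S)

def cwe(S, q):
    """Complete weight enumerator as {(n_0,...,n_{q-1}): multiplicity},
    where n_a counts the coordinates of a word that equal a."""
    return Counter(tuple(sum(x == a for x in c) for a in range(q)) for c in S)

# Binary code of the example: we = 1 + W^2
C2 = [(0, 0, 0), (1, 1, 0)]
print(we(C2) == {0: 1, 2: 1})        # True

# The two ternary codes: equal weight enumerators ...
Ct = [(0, 0, 0), (2, 1, 0), (1, 2, 0)]
Ch = [(0, 0, 0), (1, 1, 0), (2, 2, 0)]
print(we(Ct) == we(Ch))              # True: both are 1 + 2 W^2
# ... but different complete weight enumerators, so no inverse of iota exists
print(cwe(Ct, 3) == cwe(Ch, 3))      # False
```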

The last two codes in the above example share the same weight parameters but are different. Hence the natural question arises when two codes are equally good. Obviously, isomorphic codes share many algebraic properties, but need not have the same distance properties with respect to the Hamming metric. This gives rise to the notion of isometry, or code equivalence, as it is often referred to.

Definition 2.6 Let C, C′ ⊆ F^n. An isomorphism f : C → C′ is an isometry if

wt(f(c)) = wt(c) for all c ∈ C.

If there is such an isometry f between the codes C and C′, the codes are said to be equivalent.


Note that the last two codes of Example 2.5 are indeed isometric. In fact, any two one-dimensional codes with the same weight enumerator are already isometric. This is not true in higher dimensions. It is well known from linear algebra that any two codes over F with the same dimension are isomorphic. But even if they share the same weight enumerator, that is, there is a weight-preserving bijection of codewords, they need not be equivalent. This is because the bijection need not be a linear map. An example of this fact is given in Chapter 6. This illustrates that the weight enumerator alone is not an appropriate indicator to compare the performance of codes. However, equivalent codes necessarily share the same weight enumerator. Again, Example 2.5 may be used to show that this is not true for the complete weight enumerator, even in this simple case.

Another intuitive equivalence notion is that of monomial equivalence. Obviously a permutation of coordinates does not affect any of the coding properties of a code. The same is true for a rescaling of coordinates, which gives rise to the following definition.

Definition 2.7 Two codes C, C′ ⊆ F^n are monomially equivalent if there is a permutation matrix P ∈ GL_n(F) and a diagonal matrix M ∈ GL_n(F) such that C · PM = C′.

Note that for any such permutation matrix P and diagonal matrix M, there is a permutation matrix P′ and a diagonal matrix M′ such that MP = P′M′. Therefore the definition of monomial equivalence takes into account that for practical purposes the order of rescaling and permuting does not matter.

Monomial equivalence of codes is seemingly stronger than code equivalence, as the existence of a monomial matrix PM induces an isometry not only of the two codes, but also of the whole ambient space F^n the codes are a part of. It is the achievement of MacWilliams [20] to show that both notions indeed coincide, in the famous MacWilliams Equivalence Theorem or MacWilliams Extension Theorem:

Theorem 2.8 Let C, C′ ≤ F^n be two codes. The codes are equivalent if and only if they are monomially equivalent. In other words, any weight-preserving isomorphism f : C → C′ naturally extends to an isometry of F^n.

It has already been mentioned that the information whether two codes are equivalent is valuable for problems arising in applications. The MacWilliams Equivalence Theorem does at least offer a systematic approach to this question: by ranging over all monomial matrices one may check whether two codes are equivalent.

In recent years block codes over finite rings, i.e. submodules of R^n for some finite ring R, have aroused interest in coding theory, and it has been proven that the MacWilliams Equivalence Theorem holds for a large class of rings and weight
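The systematic approach of ranging over all monomial matrices can be sketched directly from Definition 2.7. The sketch below (helper name mine) is feasible only for tiny parameters, since there are n!·(q−1)^n monomial matrices:

```python
from itertools import permutations, product

def monomially_equivalent(C, D, n, q=2):
    """Brute-force check for C*PM = D over all permutation matrices P
    and invertible diagonal matrices M (Definition 2.7)."""
    C, D = set(C), set(D)
    for perm in permutations(range(n)):              # permutation part P
        for diag in product(range(1, q), repeat=n):  # nonzero diagonal of M
            image = {tuple(c[perm[j]] * diag[j] % q for j in range(n)) for c in C}
            if image == D:
                return True
    return False

C = [(0, 0, 0), (1, 1, 0)]
D = [(0, 0, 0), (0, 1, 1)]   # C with its coordinates permuted
print(monomially_equivalent(C, D, 3))  # True
```

By Theorem 2.8, a True answer here is equivalent to the existence of a weight-preserving isomorphism between the two codes.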


functions different from the Hamming weight [37], [13]. A survey of a generalisation to convolutional codes will be given in Chapter 6.

2.1 Bilinear Forms and Characters

Before stating the actual MacWilliams identity, the second influential theorem of MacWilliams, for both the weight enumerator and the complete weight enumerator, the notion of duality of codes has to be defined via bilinear forms. For an introduction to bilinear forms with a special focus on bilinear forms over finite fields, see [33]. Recall that a bilinear form [·,·] : F^n × F^n → F is symmetric if [c, d] = [d, c] and regular if the map F^n → Hom(F^n, F), c ↦ [c, ·] is an isomorphism. It is well known that the standard inner product

[·,·] : F^n × F^n → F, (c, d) ↦ ∑_{i=1}^{n} c_i d_i

is both regular and symmetric. Although the next lemma is true for any regular and symmetric bilinear form, it is sufficient for the purpose of this text to restrict to the standard form from now on.

Lemma 2.9 Let C ⊆ F^n be a code of dimension k. Then

C⊥ = {d ∈ F^n | [c, d] = 0 for all c ∈ C}

is a code of dimension n − k. This uniquely determined code is called the dual code of C with respect to [·,·]. Moreover (C⊥)⊥ = C.

As this is a standard result in the theory of bilinear forms I will omit a complete proof and just point out where symmetry and regularity are needed. The symmetry fixes "the" dual of a code rather than a left or right dual, the regularity gives the assertion on the dimension, and, combined, they finally imply that dualisation of the dual code recovers the original code.

Next, I will introduce the concept of C-valued characters. As a reference use, for instance, [18].
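For small parameters, Lemma 2.9 can be verified by exhaustive search. The sketch below (function name mine) computes the dual of a code directly from the defining condition with the standard inner product:

```python
from itertools import product

def dual_code(C, n, q=2):
    """All d in F_q^n with [c, d] = sum_i c_i d_i = 0 (mod q) for every c in C."""
    return [d for d in product(range(q), repeat=n)
            if all(sum(ci * di for ci, di in zip(c, d)) % q == 0 for c in C)]

# A k = 1 code in F_2^3; its dual has dimension n - k = 2, hence 2^2 = 4 words
C = [(0, 0, 0), (1, 1, 0)]
D = dual_code(C, 3)
print(len(D))                    # 4
print(sorted(dual_code(D, 3)))   # recovers C, illustrating (C^perp)^perp = C
```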

Definition 2.10 Let N ∈ N. A character on F^N is a group homomorphism (F^N, +) → (C∗, ·).

Let τ : F_q → F_p, a ↦ ∑_{i=0}^{s−1} a^{p^i}, be the usual trace form on F and let ζ ∈ C be a primitive p-th root of unity. Using these two ingredients, one can give an explicit representation of all characters on F^N.

Proposition 2.11 (i) For each P ∈ GL_N(F) the map (F^N, +) → Hom(F^N, C∗), a ↦ χ_{aP} := ζ^{τ([aP, ·])} is a group isomorphism;


(ii) The q^N different characters on F^N are linearly independent in the vector space of C-valued functions on F^N;

(iii) Let 0 ≠ d ∈ F^N. Then ∑_{c∈F^N} χ_d(c) = 0 and ∑_{c∈F^N} χ_0(c) = q^N;

(iv) For all X, Y ∈ F^N and all P ∈ GL_N(F) it is χ_X(Y) = χ_Y(X) and χ_{XP}(Y) = χ_X(Y P^t);

(v) For all X, Y, Z_1, Z_2 ∈ F^N one has χ_X(Z_1) χ_Y(Z_2) = χ_{(X,Y)}(Z_1, Z_2), where the latter is defined on F^{2N} analogously, that is, χ_{(X,Y)}(Z_1, Z_2) := ζ^{τ([(X,Y),(Z_1,Z_2)])}, with [·,·] also denoting the canonical bilinear form on F^{2N}.

2.2 The MacWilliams Identity for Block Codes

As mentioned earlier, the computation of the minimal distance and weight enumerators of block codes is a hard problem whose complexity clearly increases rapidly with the dimension of the code. The MacWilliams identity introduced later in this section is an efficient tool in block coding theory to address this problem. Given a code of length n, it limits the complexity of computing its weight enumerator to the complexity of computing the weight enumerator of a code of dimension at most n/2 and the computational effort of applying the transformation formula to it.

Definition 2.12 Let α ∈ F∗ and let H_α : C[W_a | a ∈ F] → C[W_a | a ∈ F] be defined by extending the map

W_a ↦ q^{−1/2} ∑_{b∈F} χ_{aα}(b) W_b

to a C-algebra homomorphism. This map is the complete α-MacWilliams transform. For simplicity I put H_1 = H.

The MacWilliams transform h : C[W]_{≤n} → C[W]_{≤n} is defined by

t(W) ↦ (1 + (q − 1)W)^n · t((1 − W)/(1 + (q − 1)W)).

From the definition it is immediate that deg(H_α(f)) = deg(f) for any polynomial f ∈ C[W_a | a ∈ F] and α ∈ F∗. More importantly, any such f is indeed in the range of H_α, and the image of f is again an ordinary polynomial. Therefore one may apply the complete MacWilliams transforms to the complete weight enumerator of any subset of F^n; that is, they are independent of n and depend only on F. These are important differences from h. From the definition of h one sees that h obviously depends on n. Given a polynomial f ∈ C[W] with deg(f) > n, the image under h is in general no longer a polynomial; therefore h is restricted to C[W]_{≤n}. Moreover, the map h is not degree-preserving even if deg(f) ≤ n.

During the main part of the text only H will be used, for the sake of simplicity. Therefore I will note important properties of this map. To this end I need another


class of maps on C[W_a | a ∈ F]. For any α ∈ F∗ let

m_α : C[W_a | a ∈ F] → C[W_a | a ∈ F], W_a ↦ W_{αa}.

In the sequel it is used at some points how m_α for α ∈ F∗ acts on the complete weight enumerator of a set S ⊆ F^n. One has

m_α(cwe(S)) = ∑_{c∈S} ∏_{i=1}^{n} W_{αc_i} = ∑_{c∈αS} ∏_{i=1}^{n} W_{c_i} = cwe(αS).   (2.1)

In particular, the complete weight enumerator of a set that is closed under multiplication is invariant under m_α for any α ∈ F∗. The maps m_α help to clarify the relations between the different MacWilliams identities.

Lemma 2.13 (i) It is H_α = m_{α^{−1}} ∘ H = H ∘ m_α for any α ∈ F∗;

(ii) the map H is invertible, H^2 = m_{−1} and H^3 = H^{−1};

(iii) restricting H to C[W_a | a ∈ F]_{≤n} (the subspace of polynomials of degree at most n) one has ι ∘ H = q^{−n/2} h ∘ ι.

The following theorem is due to MacWilliams [20], [21] and is therefore generally referred to as the MacWilliams identity.

Theorem 2.14 Let C ≤ F^n be a k-dimensional block code and C⊥ its dual. Then for any α ∈ F∗:

H_α(cwe(C)) = q^{k − n/2} cwe(C⊥) and h(we(C)) = q^k we(C⊥).

It is at first sight surprising that in the case of the complete weight enumerator the right-hand side is independent of α. This irritation is quickly resolved using Lemma 2.13(i) and (2.1). For any α ∈ F∗ and any code C ⊆ F^n one has

H_α(cwe(C)) = H(m_α(cwe(C))) = H(cwe(C)).
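The identity h(we(C)) = q^k we(C⊥) can be verified numerically for the binary toy code of Example 2.5, whose dual is easy to list by hand. The sketch below (helper names mine) represents weight enumerators as coefficient lists and compares both sides of the identity as functions of W at a few sample points:

```python
def we_poly(C):
    """Weight enumerator of a code as a coefficient list [a_0, a_1, ..., a_n]."""
    n = len(C[0])
    a = [0] * (n + 1)
    for c in C:
        a[sum(x != 0 for x in c)] += 1
    return a

def evalp(a, w):
    """Evaluate a coefficient list at the point w."""
    return sum(coef * w**j for j, coef in enumerate(a))

def h(a, n, q, w):
    """The MacWilliams transform h of Definition 2.12, evaluated at w."""
    return (1 + (q - 1) * w) ** n * evalp(a, (1 - w) / (1 + (q - 1) * w))

q, n, k = 2, 3, 1
C     = [(0, 0, 0), (1, 1, 0)]                        # we(C) = 1 + W^2
Cdual = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]  # we = 1 + W + W^2 + W^3
for w in (0.3, 0.7, 1.5):
    lhs = h(we_poly(C), n, q, w)
    rhs = q**k * evalp(we_poly(Cdual), w)
    print(abs(lhs - rhs) < 1e-9)  # True at every sample point
```

Since both sides are polynomials of degree at most n, agreement at more than n points already forces equality of the polynomials themselves.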

Remark 2.15 The standard matrix representation of H_α is (χ_{aα}(b))_{(a,b)∈F^2}. Hence by Proposition 2.11, H is bijective, which proves Lemma 2.13(ii). Note that although H, and hence its matrix representation, depends on the choice of ζ, this has no effect on the MacWilliams identity, i.e. the MacWilliams identity holds independently of the choice of ζ. Given two primitive p-th roots of unity ζ_1 and ζ_2 and fixing an α, one finds that H(ζ_1)_α = H(ζ_2)_{α′} for some α′ ∈ F∗. Therefore all existing MacWilliams transforms are already realised by letting α range over F∗, and one loses nothing by fixing ζ.

The freedom to choose an α ∈ F∗ originates in the particular choice of the isomorphism between F and the character group (see Proposition 2.11(i)), as in the situation considered here N = 1 and GL_1(F) = F∗.


Page 16: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

The MacWilliams identity has been the subject of many generalisations and applications in both mathematical and engineering contexts [1], [25], [36]. From the mathematical point of view, a rather recent result is the generalisation of the MacWilliams identity to block codes over finite rings and modules ([36]), using appropriate notions of bilinear forms, duality and characters for this setting. A crucial step in this process was finding a proof that uses characters in a more elegant way, exploiting the isomorphism of F^n with the character group of F^n. Aesthetic aspects aside, the MacWilliams identity is best understood in this context, and this approach gives more insight into the nature of the MacWilliams identity as a special case of a general principle. It should be mentioned that the proof originally used by MacWilliams [20] was much more elementary, using basically only combinatorics. It is beyond the scope of this text to go into much detail about the finite ring setting, but I wish to clarify why the MacWilliams identity is restricted to one bilinear form in this text. A reader familiar with the recent literature [36], [25] dealing with the MacWilliams identity may wonder why I restrict myself to just one bilinear form. In the case of a finite ring instead of a finite field one usually has many MacWilliams identities, each coming from a certain bilinear form and representation of the ring. In particular, a bilinear form is required to come from a regular bilinear form on the alphabet, i.e. the finite ring (or even module), which is then canonically extended to the code. In the situation where the alphabet is a finite field F, it is easy to see that the only regular bilinear form on F is, up to a scalar factor, (a, b) ↦ ab for a, b ∈ F, and its canonical extension to F^n is [, ].

Example 2.16 Before going on in the text, I will continue Example 2.5 to demonstrate the MacWilliams identities. Therefore I fix ζ2 := -1 and ζ3 := exp(2πi/3) for the cases q = 2 and q = 3 respectively. With this data the matrix representations of H are

1/√2 · (1 1; 1 -1)   and   1/√3 · (1 1 1; 1 ζ3 ζ3²; 1 ζ3² ζ3)

respectively (rows separated by semicolons). Note that in characteristic 2 the matrix representation of H is necessarily real-valued, as -1 is the unique root of unity of order 2. For odd characteristic the matrix representation is always complex-valued.

The dual code of C is C⊥ = im (0 0 1; 1 1 0), and using the MacWilliams transform h for n = 3 one finds, according to Theorem 2.14,

q^{-1} h(we(C)) = (1/2)(1+W)³ (1 + ((1-W)/(1+W))²) = (1/2)((1+W)³ + (1-W)²(1+W))
              = 1 + W + W² + W³ = we(C⊥).
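The binary computation above can be checked by brute force. The following sketch (plain Python; all helper names are mine, not the thesis's) enumerates both codes and applies the transform h in coefficient form, using that h maps a weight enumerator ∑_i A_i W^i to ∑_i A_i (1-W)^i (1+(q-1)W)^{n-i}, which for q = 2 is consistent with the substitution used above.

```python
from itertools import product

q, n, k = 2, 3, 1

def span(gens):
    """All F_q-linear combinations of the generator rows."""
    return {tuple(sum(c * g[i] for c, g in zip(co, gens)) % q for i in range(n))
            for co in product(range(q), repeat=len(gens))}

def we(code):
    """Hamming weight enumerator as a coefficient list A[0], ..., A[n]."""
    A = [0] * (n + 1)
    for w in code:
        A[sum(1 for x in w if x != 0)] += 1
    return A

def poly_mul(p, r):
    out = [0] * (len(p) + len(r) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(r):
            out[i + j] += a * b
    return out

def poly_pow(p, e):
    out = [1]
    for _ in range(e):
        out = poly_mul(out, p)
    return out

def h(A):
    """MacWilliams transform: sum_i A_i (1 - W)^i (1 + (q-1)W)^(n-i)."""
    out = [0] * (n + 1)
    for i, Ai in enumerate(A):
        term = poly_mul(poly_pow([1, -1], i), poly_pow([1, q - 1], n - i))
        for d in range(n + 1):
            out[d] += Ai * term[d]
    return out

C = span([(1, 1, 0)])                  # the binary code C of the example
Cperp = span([(0, 0, 1), (1, 1, 0)])   # its dual
lhs = h(we(C))
rhs = [q**k * a for a in we(Cperp)]
```

Both sides come out as [2, 2, 2, 2], i.e. h(we(C)) = 2·we(C⊥) = 2(1 + W + W² + W³), matching Theorem 2.14.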

In the ternary case the dual code of C is C⊥ = im (0 0 1; 1 2 0). It is of course more involved to apply the complete MacWilliams transform to the complete weight enumerator cwe(C) of this code. I will only demonstrate how the MacWilliams transform acts on the monomial W0W1², which is a summand of cwe(C):

3^{3/2} H(W0W1²) = (W0 + W1 + W2)(ζ3^{1·0} W0 + ζ3^{1·1} W1 + ζ3^{1·2} W2)²
               = (W0 + W1 + W2)(W0 + ζ3 W1 + ζ3² W2)².

Doing this for the other two monomials and summing up the results leads to

cwe(C⊥) = W0³ + 2W0W1W2 + W0²W1 + W0²W2 + 2W1²W2 + 2W1W2².
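This claimed cwe(C⊥), together with the complete MacWilliams identity of Theorem 2.14, can be verified numerically by evaluating both sides at a random point. The sketch below (my code, not from the thesis) takes C = im (1 1 0) over F3, whose dual is the C⊥ above.

```python
import cmath
import random
from collections import Counter
from itertools import product

q, n, k = 3, 3, 1
omega = cmath.exp(2j * cmath.pi / 3)   # primitive third root of unity zeta_3

def span(gens):
    return {tuple(sum(c * g[i] for c, g in zip(co, gens)) % q for i in range(n))
            for co in product(range(q), repeat=len(gens))}

C = span([(1, 1, 0)])
Cperp = span([(0, 0, 1), (1, 2, 0)])

# exponent tuples (e0, e1, e2) of the monomials of cwe(Cperp), with multiplicity
cwe_perp = Counter(tuple(c.count(s) for s in range(q)) for c in Cperp)

def cwe_eval(code, w):
    """Evaluate cwe(code) at the point (w[0], w[1], w[2])."""
    total = 0
    for c in code:
        m = 1
        for ci in c:
            m *= w[ci]
        total += m
    return total

def H_cwe_eval(code, w):
    """Evaluate H(cwe(code)) at w, using H(W_a) = q^(-1/2) sum_b zeta^(ab) W_b."""
    total = 0
    for c in code:
        m = 1
        for ci in c:
            m *= sum(omega ** (b * ci) * w[b] for b in range(q))
        total += m
    return total * q ** (-n / 2)

random.seed(1)
w = [complex(random.random(), random.random()) for _ in range(q)]
lhs = H_cwe_eval(C, w)
rhs = q ** (k - n / 2) * cwe_eval(Cperp, w)
```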

For a finite set of vectors v1, …, vm ∈ F^n let

⟨v1, …, vm⟩ := ∑_{i=1}^{m} F vi. (2.2)

I will make frequent use of this notation in the text, particularly in the next proposition, which is a generalisation of the MacWilliams identity for the complete weight enumerator to affine spaces. The result may not be of independent interest, but it is needed in the sequel. The formulation of the proposition explicitly excludes the classical case of a vector space, as this simplifies the proof; the assertion is true for vector spaces as well and, although it looks more complicated, it reduces, up to a factor q, to the classical MacWilliams identity stated above when read for a vector space.

Proposition 2.17 Let c ∈ F^n, V ≤ F^n and c ∉ V. Put E := ⟨c, V⟩⊥ and choose any a ∈ F^n such that V⊥ = ⟨a, E⟩. Then

H(cwe(c + V)) = q^{dim V - n/2} ∑_{α∈F} χ_{αa}(c) cwe(αa + E).

In the proof of the proposition I make extensive use of the properties of characters collected in Proposition 2.11.
Proof: I start by considering a special affine space, namely a single vector c = (c1, …, cn) ∈ F^n, c ≠ 0. This has already been done in Example 2.16 for a vector of length 3. Applying H to cwe(c) gives:

H(cwe(c)) = ∏_{i=1}^{n} H(W_{c_i}) = q^{-n/2} ∏_{i=1}^{n} ∑_{b∈F} χ_b(c_i) W_b.

By induction and using Proposition 2.11 it is straightforward to see that this is equal to

H(cwe(c)) = q^{-n/2} ∑_{a∈F^n} ∏_{i=1}^{n} χ_{a_i}(c_i) W_{a_i} = q^{-n/2} ∑_{a∈F^n} χ_a(c) cwe(a).

Although this is not yet the result I claim, everything is now prepared to deal with the general case. Choose any F ≤ F^n such that E ⊕ F = F^n and a ∈ F.


Put furthermore F′ ≤ F^n such that ⟨a, F′⟩ = F; then every element of the subspace F can be written uniquely as αa + ã with a scalar α ∈ F and ã ∈ F′.

H(cwe(c + V)) = ∑_{v∈c+V} H(µ_v) = q^{-n/2} ∑_{v∈c+V} ∑_{a∈F^n} χ_a(v) µ_a
             = q^{-n/2} ∑_{v∈c+V} ∑_{a∈F} ∑_{b∈E} χ_v(a + b) µ_{a+b}
             = q^{-n/2} ∑_{v∈c+V} ∑_{a∈F} χ_v(a) ∑_{b∈E} µ_{a+b}
             = q^{-n/2} ∑_{a∈F} ∑_{b∈E} µ_{a+b} ∑_{v∈c+V} χ_a(v)
             = q^{-n/2} ∑_{α∈F} ∑_{ã∈F′} ∑_{b∈E} µ_{αa+ã+b} ∑_{v∈c+V} χ_{αa+ã}(v). (2.3)

Here the third equality uses that χ_v(b) = 1 for all v ∈ c + V ⊆ ⟨c, V⟩ and b ∈ E = ⟨c, V⟩⊥.

If, in the rightmost sum of the last row, ã ≠ 0 one obtains:

∑_{v∈c+V} χ_{αa+ã}(v) = ∑_{v∈V} χ_{αa+ã}(c + v) = ∑_{v∈V} χ_{αa}(v) χ_{αa}(c) χ_{ã}(c) χ_{ã}(v).

Using that by definition [a, v] = 0 and thus χ_{αa}(v) = 1 for any α ∈ F, I may continue

∑_{v∈c+V} χ_{αa+ã}(v) = χ_{αa}(c) χ_{ã}(c) ∑_{v∈V} χ_{ã}(v).

By construction ã ∈ F′ \ {0}, and hence χ_{ã} is a non-trivial character on V. Therefore the sum in the last row, and thus the row itself, is zero for any ã ≠ 0. On the other hand, if ã = 0 one has:

∑_{v∈c+V} χ_{αa}(v) = ∑_{v∈V} χ_{αa}(c + v) = ∑_{v∈V} χ_{αa}(c) = q^{dim V} χ_{αa}(c),

where I used that a ∈ V⊥. Combining both cases finally simplifies (2.3) to

H(cwe(c + V)) = q^{-n/2} ∑_{α∈F} ∑_{b∈E} µ_{αa+b} ∑_{v∈c+V} χ_{αa}(v)
             = q^{dim V - n/2} ∑_{α∈F} χ_{αa}(c) ∑_{b∈E} µ_{αa+b}
             = q^{dim V - n/2} ∑_{α∈F} χ_{αa}(c) cwe(αa + E),

which is precisely what I claimed. □

Note that although the MacWilliams identity for block codes in Theorem 2.14 holds for any Hα, α ∈ F∗, Proposition 2.17 in this form is only true for H1 = H. Similar results may be obtained for α ≠ 1, but these will not be needed in the sequel.
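Proposition 2.17 can also be checked numerically. The sketch below (binary toy data and all names are mine) takes V = ⟨(1,1,0)⟩ and c = (0,0,1) in F2³, so that E = ⟨(1,1,0)⟩ and a = (0,0,1) satisfy the hypotheses, and compares both sides of the identity at a random evaluation point.

```python
import random

q, n = 2, 3
c = (0, 0, 1)
V = [(0, 0, 0), (1, 1, 0)]   # a line in F_2^3 with c not in V
E = [(0, 0, 0), (1, 1, 0)]   # E = <c, V>^perp
a = (0, 0, 1)                # chosen so that V^perp = <a, E>
dimV = 1

def add(x, y):
    return tuple((xi + yi) % q for xi, yi in zip(x, y))

def scale(alpha, x):
    return tuple(alpha * xi % q for xi in x)

def chi(x, y):
    """chi_x(y) = (-1)^[x,y] over F_2."""
    return (-1) ** (sum(xi * yi for xi, yi in zip(x, y)) % q)

def cwe_eval(vectors, w):
    total = 0
    for v in vectors:
        m = 1
        for vi in v:
            m *= w[vi]
        total += m
    return total

def H_cwe_eval(vectors, w):
    """Apply H coordinatewise: H(W_b) = q^(-1/2) sum_b' (-1)^(b b') W_b'."""
    total = 0
    for v in vectors:
        m = 1
        for vi in v:
            m *= sum((-1) ** (b * vi) * w[b] for b in range(q))
        total += m
    return total * q ** (-n / 2)

random.seed(2)
w = [random.random(), random.random()]
lhs = H_cwe_eval([add(c, v) for v in V], w)
rhs = q ** (dimV - n / 2) * sum(
    chi(scale(al, a), c) * cwe_eval([add(scale(al, a), e) for e in E], w)
    for al in range(q))
```

Symbolically both sides equal 2^{-1/2}(w0 - w1)(w0² + w1²) here, so the numeric comparison succeeds for any evaluation point.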


2.3 From Block Codes to Convolutional Codes

I will now introduce the notions of convolutional coding theory as far as they are needed in this text. For a more general survey of convolutional codes see [23]. The definition of convolutional codes is quite straightforward and at first sight mimics the definition of a block code.

Definition 2.18 Let n ∈ N and consider the polynomial ring F[z] in one indeterminate. A convolutional code C ⊆ F[z]^n is a submodule such that there is a submodule C′ with the property C ⊕ C′ = F[z]^n, i.e. C has a direct complement in F[z]^n. I call the (finite) rank k of C the dimension of the code.

Although the definition of a convolutional code closely resembles that of a block code over a finite ring, there is an important difference. A block code over a finite ring R is usually defined as a submodule of R^n without any extra conditions, whereas a convolutional code is required to have a direct complement. It will later be shown that this is not merely a technical condition but has meaningful consequences when it comes to duality notions. Let deg denote the standard degree of a polynomial. The degree of a polynomial vector c ∈ F[z]^n is

deg(c) := max{deg(ci) | 1 ≤ i ≤ n}.

For a polynomial matrix G ∈ F[z]^{k×n} with rows gi ∈ F[z]^n let

δ := max{deg(γ) | γ is a k-minor of G}.

The matrix G is basic if there is a Ḡ ∈ F[z]^{n×k} such that GḠ = I_k, and minimal if it is basic and δ = ∑_{i=1}^{k} deg(gi). The following proposition collects some standard results from the algebraic theory of convolutional codes.

Proposition 2.19 Any convolutional code C ≤ F[z]^n of dimension k admits a minimal encoder G ∈ F[z]^{k×n} such that C = im G. The degree δ of any such encoder, called the complexity, and the list of row degrees [δi := deg(gi) | 1 ≤ i ≤ k], the Forney indices, are invariants of the code C.

It is noteworthy that any convolutional code C of complexity δ = 0 can be regarded as a block code, because the encoder G of such a code is constant. Hence block codes can be seen as a subclass of convolutional codes. Adopting this point of view, it is reasonable to ensure that any notion for convolutional codes reduces to the corresponding notion for block codes when the complexity of the convolutional code is δ = 0. The definition of the weight notion takes this into account. Let v = ∑_{t≥0} vt z^t ∈ C be a polynomial codeword. Applying the Hamming weight wt to each of the vt ∈ F^n, the weight of this polynomial codeword is

wt(v) := ∑_{t≥0} wt(vt). (2.4)


Note that the weight of any polynomial codeword is finite, as from some time instance t0 ≥ 0 on one has vt = 0 for all t ≥ t0. Moreover, if v is a constant codeword then wt(v) is precisely the Hamming weight of v. This is of course not true for arbitrary polynomial vectors, so the weight function defined above for convolutional codes is not the Hamming weight on C if C is considered as a block code over F[z]. In analogy to block coding theory, the minimal distance of a convolutional code C is defined by d := min{wt(v) | 0 ≠ v ∈ C}. Again, if C can be interpreted as a block code, the two definitions of minimal distance coincide. Convolutional codes have in the past also been successfully viewed as linear systems [31] over the finite field F. With a message u = ∑_{t≥0} ut z^t ∈ F[z]^k and a codeword v = ∑_{t≥0} vt z^t ∈ F[z]^n, the ut ∈ F^k and vt ∈ F^n are the inputs and outputs at time instance t, respectively. The weight function obviously respects this interpretation of convolutional codes.

Definition 2.20 Let G ∈ F[z]^{k×n} be a minimal encoder of the convolutional code C with Forney indices δ1, …, δr > 0 = δ_{r+1} = … = δk and degree δ := ∑_{i=1}^{k} δi. Let G have the rows gi = ∑_{j=0}^{δi} g_{i,j} z^j, i = 1, …, k, where g_{i,j} ∈ F^n. For i = 1, …, r define the matrices

Ai ∈ F^{δi×δi}, the matrix with ones on the superdiagonal and zeros elsewhere,
Bi := (1 0 ⋯ 0) ∈ F^{δi},
Ci := (g_{i,1}; …; g_{i,δi}) ∈ F^{δi×n}, with rows g_{i,1}, …, g_{i,δi}.

The controller canonical form of G is defined as the matrix quadruple (A, B, C, D) ∈ F^{δ×δ} × F^{k×δ} × F^{δ×n} × F^{k×n}, where

A = diag(A1, …, Ar),  B = (B̄; 0) with B̄ = diag(B1, …, Br),  C = (C1; …; Cr),  D = (g_{1,0}; …; g_{k,0}) = G(0).

It is easy to see that the matrices A and B contain the information about the Forney indices of the code, whereas the matrices C and D yield the coefficient rows of the encoder in a structured way. The controller canonical form describes the encoding process with respect to the matrix G as follows.

Proposition 2.21 For u = ∑_{t≥0} ut z^t ∈ F[z]^k, v = ∑_{t≥0} vt z^t ∈ F[z]^n and xt ∈ F^δ one has

v = uG  ⟺  x_{t+1} = xt A + ut B and vt = xt C + ut D for all t ≥ 0, where x0 = 0,

and

G(z) = B(z^{-1}I - A)^{-1} C + D. (2.5)

Here the xt are the states, and the complexity δ is the dimension of the minimal state space, which in systems theory is called the McMillan degree of the system. The controller canonical form is a special case of what is called a realisation of C in linear systems theory (see for instance [8] or [11]). Any matrix quadruple (A, B, C, D) that fulfils Equation (2.5) for a minimal encoder G of C is in this context called a minimal canonical realisation. Besides the controller canonical forms of encoders G of C there are further minimal canonical realisations. However, the controller canonical form is particularly simple to work with, and the relations between all minimal canonical realisations are well known (see (3.4) below), so it will be employed as one of the main tools in this text. Before introducing two block codes that are derived from a convolutional code, some properties of the matrices in the controller canonical form are collected. The two index sets

I := {1, 1 + δ1, 1 + δ1 + δ2, …, 1 + ∑_{i=1}^{r-1} δi},  J := {δ1, δ1 + δ2, …, ∑_{j=1}^{r} δj = δ} (2.6)

facilitate the necessary notation considerably. Their meaning is best understood in the context of the matrices A and C: the indices in I correspond to the first row of a block in A and C, whereas the indices in J correspond to the last row, respectively.
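Definition 2.20 and Proposition 2.21 are easy to make concrete. The sketch below (all names are mine) builds the controller canonical form of a sample minimal F3-encoder with Forney indices (2, 0) and checks the state space recursion against direct polynomial encoding.

```python
q, k, n = 3, 2, 3

# rows of a sample encoder over F_3: g1 = (1 + z^2, 2 + z, 0), g2 = (1, 0, 2),
# stored coefficient-wise: g[i][j] is the coefficient vector g_{i,j} of z^j
g = [
    [(1, 2, 0), (0, 1, 0), (1, 0, 0)],   # delta_1 = 2
    [(1, 0, 2)],                         # delta_2 = 0
]
deltas = [len(gi) - 1 for gi in g]
r = sum(1 for d in deltas if d > 0)
delta = sum(deltas)

# controller canonical form (A, B, C, D), built block by block
A = [[0] * delta for _ in range(delta)]
B = [[0] * delta for _ in range(k)]
C = [[0] * n for _ in range(delta)]
D = [list(gi[0]) for gi in g]            # D = G(0)
row = 0
for i in range(r):
    for j in range(deltas[i] - 1):       # shift block A_i (superdiagonal ones)
        A[row + j][row + j + 1] = 1
    B[i][row] = 1                        # B_i = (1, 0, ..., 0)
    for j in range(deltas[i]):           # C_i stacks g_{i,1}, ..., g_{i,delta_i}
        C[row + j] = list(g[i][j + 1])
    row += deltas[i]

def mat_vec(x, M):
    return [sum(xi * Mi[j] for xi, Mi in zip(x, M)) % q for j in range(len(M[0]))]

def vec_add(x, y):
    return [(a + b) % q for a, b in zip(x, y)]

def encode_state_space(u_seq, tail):
    """x_{t+1} = x_t A + u_t B, v_t = x_t C + u_t D, with x_0 = 0;
    `tail` extra zero inputs flush the encoder memory."""
    x = [0] * delta
    out = []
    for ut in u_seq + [[0] * k] * tail:
        out.append(vec_add(mat_vec(x, C), mat_vec(ut, D)))
        x = vec_add(mat_vec(x, A), mat_vec(ut, B))
    return out

def encode_poly(u_seq, tail):
    """Direct polynomial encoding v = uG, coefficient by coefficient."""
    T = len(u_seq) + tail
    out = [[0] * n for _ in range(T)]
    for t, ut in enumerate(u_seq):
        for i in range(k):
            for j, gij in enumerate(g[i]):
                for col in range(n):
                    out[t + j][col] = (out[t + j][col] + ut[i] * gij[col]) % q
    return out

u = [[1, 2], [0, 1]]                     # message u(z) = (1, 2) + (0, 1)z
vs = encode_state_space(u, tail=max(deltas))
vp = encode_poly(u, tail=max(deltas))
```

Both encodings agree, illustrating the equivalence stated in Proposition 2.21.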

Definition 2.22 For C as above let C^const := C ∩ F^n be the block code consisting of the constant codewords in C. Moreover, let C_C := im (C; D) ⊆ F^n, the row space of C and D stacked, and define r̄ ∈ {0, …, n-k} such that dim C_C = k + r̄.

The following identities of the matrices (A, B, C, D) in the controller canonical form are easily checked.

Remark 2.23 One has ABᵗ = 0 and BBᵗ = (I_r 0; 0 0) ∈ F^{k×k}. Furthermore, im B = ⟨e_i | i ∈ I⟩ and ker B = im (0_{(k-r)×r}, I_{k-r}) ⊆ F^k. Finally,

(BᵗB)_{i,j} = 1 if i = j ∈ I and 0 else,  (AᵗA)_{i,j} = 1 if i = j ∉ I and 0 else,  (AAᵗ)_{i,j} = 1 if i = j ∉ J and 0 else.

As a consequence, AᵗA + BᵗB = I_δ. Moreover, ker A ∩ ker C = 0 and (ker A)C ∩ C^const = 0. The last two properties are due to the fact that the encoder matrix G = B(z^{-1}I - A)^{-1}C + D is minimal. Indeed, notice that ker A = ⟨e_j | j ∈ J⟩, where e1, …, e_δ denote the standard basis vectors in F^δ. Using G as in Definition 2.20 one sees that e_j C is the highest coefficient vector of one of the polynomial rows of G. Because for a minimal matrix G the highest coefficient vectors g_{1,δ1}, …, g_{k,δk} are linearly independent, and noticing that C^const = ⟨g_{r+1,δ_{r+1}}, …, g_{k,δk}⟩, one easily derives the desired properties.


Remark 2.24 (1) Suppose the encoder matrix G is as in Definition 2.20. Then C_C = im (C; D) = ⟨g_{i,ν} | i = 1, …, k, ν = 0, …, δi⟩. Recalling that two different encoders of C differ only by a left unimodular transformation, it follows immediately that the block code C_C does not depend on the choice of the encoder G but rather is an invariant of the code C. Since rank D = k it is clear that the dimension of C_C is indeed at least k.

(2) One has dim C^const = k - r and, precisely, with the notation from (1),

C^const = ⟨gi | i = r + 1, …, k⟩ = (ker B)D := {uD | u ∈ ker B}. (2.7)

This also shows C^const ⊆ C_C. Furthermore one has im D = im(BᵗD) ⊕ C^const.

Given any convolutional code C ≤ F[z]^n there is an easy way to construct a convolutional code ρ(C) via time reversal. The Forney indices of ρ(C), as well as the block codes ρ(C)^const and ρ(C)_{ρ(C)}, coincide with those of C, but the code ρ(C) is in general different from C. Let therefore

ρ : F[z]^n → F[z]^n, v(z) ↦ z^{deg(v(z))} v(z^{-1}).

This map indeed reverses the time instances of a codeword's coefficient vectors, as

ρ(∑_{t=0}^{t0} vt z^t) = ∑_{t=0}^{t0} vt z^{t0-t},

and it is a lengthy but easy exercise to show that the following proposition holds.

Proposition 2.25 Let C ≤ F[z]^n be an (n, k, δ)-convolutional code and G ∈ F[z]^{k×n} a minimal encoder of C with Forney indices (δ1, …, δk). Then ρ(C) is an (n, k, δ)-convolutional code and

ρ(G) := diag(z^{δ1}, …, z^{δk}) · G(z^{-1}) ∈ F[z]^{k×n}, (2.8)

where ρ is applied row-wise to G, is a minimal encoder of ρ(C) with the same Forney indices as G. Finally, ρ(ρ(v)) = v for all v ∈ F[z]^n and hence ρ(ρ(C)) = C.

The reversal code is for instance treated in [15] and [30], where it is referred to as the reciprocal code and proves to be a useful tool in the decoding of convolutional codes. Besides the invariants mentioned in the proposition, both codes share additional properties. It is immediate that the codes C^const ⊆ C and C_C are invariant under ρ, so that they are the respective block codes of ρ(C) as well. Moreover, as for any v ∈ C one has wt(v) = wt(ρ(v)), both codes have the same minimal distance. But as the map ρ is not F[z]-linear, they are not isometric under ρ, if one wishes to use the block coding definition of having a linear weight-preserving map between both codes. Indeed, whereas there is a generally accepted notion of isometry for block codes, there is as yet no such concept for convolutional codes. This problem is addressed in Chapter 6.
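In the coefficient-list representation used earlier, ρ simply reverses the list, provided the top coefficient is nonzero so that deg v = t0; a minimal sketch (names of my choosing):

```python
def rho(v):
    """Time reversal rho(v)(z) = z^deg(v) v(1/z), for v given as the list
    of coefficient vectors v_0, ..., v_{t0} with v_{t0} != 0."""
    return list(reversed(v))

def wt(v):
    """Weight (2.4) of a polynomial codeword."""
    return sum(1 for vt in v for x in vt if x != 0)

v = [(0, 2, 1), (1, 1, 2), (1, 0, 0)]   # some v in F_3[z]^3 with v_2 != 0
```

The involution property ρ(ρ(v)) = v and the weight preservation wt(ρ(v)) = wt(v) are then immediate.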

2.4 Duality Notions for Convolutional Codes

In the first section of this chapter I claimed that block codes will be treated as a subclass of convolutional codes and that every notion introduced for convolutional codes can be interpreted as an extension of the respective notion for block codes. It has been demonstrated that the meaningful extension of the Hamming weight from block codes over F to convolutional codes is different from the Hamming weight on C as a block code over F[z]. In fact, this weight notion is generally agreed upon in the literature. However, the situation is different when it comes to duality notions. There is no intrinsic notion of duality in convolutional coding theory. In [6] the authors give an overview of several duality notions for convolutional codes, which justify themselves from different applications. I introduce two concepts of duality for convolutional codes, both of which reduce to the classical notion of duality for codes with δ = 0 and which are most commonly found in the literature, e.g. in [23], [34], [1]. A very natural approach is to define duality with respect to the F[z]-bilinear form

[, ] : F[z]^n × F[z]^n → F[z], (c, d) ↦ ∑_{i=1}^{n} ci di,

as the bilinear form used in block coding theory is clearly the restriction of this form to F^n × F^n. Hence both forms need not be distinguished by symbols. However, as in the sequel block code duality and convolutional code duality will appear in the same context, it is convenient to distinguish the duality notions in symbols as follows.

Definition 2.26 Let C ≤ F[z]^n be a convolutional code. Its dual is

Ĉ := {d ∈ F[z]^n | [c, d] = 0 for all c ∈ C}.

Of course the form [, ] is symmetric and regular as an F[z]-bilinear form as well. Therefore it is not surprising that a generalisation of Lemma 2.9 to convolutional codes is possible.

Proposition 2.27 For any (n, k, δ)-convolutional code the dual is a convolutional code with parameters (n, n - k, δ).


Again the proof will be omitted. Symmetry and regularity are used in the proof in the same way as in the block code case. The result on the complexity has to be attributed to the fact that the bilinear form [, ] "has degree 0". It is well known from linear algebra that any bilinear form, for instance on F^n × F^n, may be represented by a matrix M ∈ F^{n×n} such that [a, b] = aMbᵗ for all a, b ∈ F^n. Symmetry and regularity are reflected in the matrix by the fact that M is itself symmetric and invertible. The same idea may be applied to F[z]-bilinear forms as well. In this situation the matrix has polynomial entries, and a symmetric and regular form may be represented by a symmetric and unimodular matrix. Such a matrix need not be constant but may well have a positive degree. If it has a positive degree, the result on the complexity in Proposition 2.27 is no longer true. For any form with a constant representation matrix, however, it is true. I explained in the section on bilinear forms why I restrict myself to the form [, ], the reason being that only bilinear forms on codes that arise from bilinear forms on the alphabet itself admit a MacWilliams identity. In the convolutional code setting the alphabet is the ring F[z], and likewise the only symmetric and regular bilinear forms on that ring are (a, b) ↦ aαb for a, b ∈ F[z] and α ∈ F∗. It is easy to see that the dual of a code with respect to any of these forms is the same. Therefore it suffices to consider only the form [, ] when looking for a MacWilliams identity for convolutional codes.

As the form [, ] can be interpreted as an F-bilinear form as well, one may derive useful results on the block codes underlying a convolutional code and its dual. The two block codes from Definition 2.22 and the corresponding objects Ĉ_Ĉ and Ĉ^const for the dual code Ĉ behave as follows under duality.

Proposition 2.28 One has (C_C)⊥ = Ĉ^const. As a consequence, Ĉ has exactly n - k - r̄ zero Forney indices and r̄ nonzero Forney indices. Moreover, dim Ĉ_Ĉ = n - k + r.

Proof: Using the notation and statement of Remark 2.24(1) I obtain

c ∈ (C_C)⊥ ⟺ [c, g_{i,ν}] = 0 for all i = 1, …, k, ν = 0, …, δi
          ⟺ [c, gi] = 0 for all i = 1, …, k
          ⟺ c ∈ Ĉ ∩ F^n = Ĉ^const,

where the second equivalence uses the fact that c is a constant vector. The consequences are clear from the definitions of r and r̄. □

Next, I will introduce a different notion of duality, which does not come from an F[z]-bilinear form on F[z]^n × F[z]^n but is deduced from the F-bilinear form [, ], and which in the literature (for instance [6]) is referred to as sequence space duality.

Definition 2.29 Let the F-bilinear form [[, ]] be defined as

F[z]^n × F[z]^n → F, (∑_{t≥0} vt z^t, ∑_{t≥0} wt z^t) ↦ [[∑_{t≥0} vt z^t, ∑_{t≥0} wt z^t]] := ∑_{t≥0} [vt, wt],

and let the sequence space dual of a convolutional code C be

C̄ := {w ∈ F[z]^n | [[v, z^l w]] = 0 for all v ∈ C and l ∈ N0}. (2.9)

This form is extensively treated in the literature [15], [6], and it is well known that the sequence space dual of a convolutional code is again a convolutional code, although this is not immediate from the definition. For convolutional codes of complexity 0 this form also reduces to [, ] on F^n. Moreover, one can show that there is an analogue to Proposition 2.27.

Corollary 2.30 For any (n, k, δ)-convolutional code the dual with respect to [[, ]] is a convolutional code with parameters (n, n - k, δ).

The form [[, ]] obviously takes into account the idea that in applications it is not in the first place a polynomial vector that is transmitted, but a sequence of constant codewords. It has been demonstrated that this viewpoint led to a different notion of the weight function for a convolutional code, which is only loosely connected to the weight function one would use if a convolutional code C were understood as a block code over the alphabet F[z]. So one may ask how the two duals Ĉ and C̄ of a convolutional code are related, if at all. Surprisingly, although both notions of duality look very different, the relation between the dual codes is tight.

Proposition 2.31 Let C be a convolutional code and let Ĉ and C̄ be its duals with respect to [, ] and [[, ]], respectively. Then

C̄ = ρ(Ĉ)  and, equivalently, Ĉ = ρ(C̄). (2.10)

This connection finally allows me to concentrate on one of the two notions of duality; it is, for example, clear that Proposition 2.28 may directly be transferred to duality with respect to [[, ]], as the block codes C_C and C^const are invariant under the transformation to the reciprocal code. There is, however, an important difference when it comes to the concepts of self-orthogonality and self-duality. This difference will be addressed in Chapter 5, as a code may be self-orthogonal, for example, with respect to [, ] but not with respect to [[, ]], or vice versa.

I close this chapter by giving an example to illustrate the notions introduced so far.

Example 2.32 Let F = F3 and

G = (1 + z²  2 + z  0; 1  0  2) ∈ F[z]^{2×3}  and  Ĝ = (z + 2  2 + 2z²  z + 2) ∈ F[z]^{1×3}.

It is easy to see that G and Ĝ are minimal, the Forney indices of C and Ĉ are (2, 0) and (2) respectively, and they satisfy GĜᵗ = 0. Thus the codes C := im G and Ĉ := im Ĝ are mutual duals with respect to [, ]. According to Proposition 2.31,

C̄ = im ρ(Ĝ) with ρ(Ĝ) = (z + 2z²  2 + 2z²  z + 2z²) ∈ F[z]^{1×3}.


The controller canonical forms of the given encoders G and Ĝ are

(A, B, C, D) = ( (0 1; 0 0), (1 0; 0 0), (0 1 0; 1 0 0), (1 2 0; 1 0 2) )

and

(Â, B̂, Ĉ, D̂) = ( (0 1; 0 0), (1 0), (1 0 1; 0 2 0), (2 2 2) ),

respectively. Note in particular that there is no obvious way to derive the controller canonical form of Ĉ from that of C. The block codes from Definition 2.22 are

C^const = im (1 0 2) and C_C = F3³,

Ĉ^const = 0 and Ĉ_Ĉ = im (0 1 0; 1 0 1),

in accordance with Proposition 2.28.
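The assertions of this example can be verified mechanically; the sketch below (coefficient lists, lowest degree first; all names mine) checks GĜᵗ = 0 over F3 and recomputes ρ(Ĝ) via (2.8).

```python
q = 3

def pmul(p, r):
    """Product of two F_3[z] polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(r) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(r):
            out[i + j] = (out[i + j] + a * b) % q
    return out

def padd(p, r):
    m = max(len(p), len(r))
    return [((p[i] if i < len(p) else 0) + (r[i] if i < len(r) else 0)) % q
            for i in range(m)]

# entries of G and G-hat from the example as coefficient lists
G = [[[1, 0, 1], [2, 1], [0]],        # (1 + z^2, 2 + z, 0)
     [[1], [0], [2]]]                 # (1, 0, 2)
Ghat = [[2, 1], [2, 0, 2], [2, 1]]    # (z + 2, 2 + 2z^2, z + 2)

def dot(row, col):
    acc = [0]
    for p, r in zip(row, col):
        acc = padd(acc, pmul(p, r))
    return acc

prods = [dot(row, Ghat) for row in G]   # the two entries of G * Ghat^t

# rho(Ghat) via (2.8): multiply the row by z^2 and substitute z -> 1/z,
# i.e. pad each entry to degree 2 and reverse its coefficients
d1 = 2
rho_Ghat = [list(reversed(p + [0] * (d1 + 1 - len(p)))) for p in Ghat]
```

The products vanish identically and ρ(Ĝ) comes out as (z + 2z², 2 + 2z², z + 2z²), as stated above.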


3 The Weight Adjacency Matrix of a Convolutional Code

Although I defined a weight function on convolutional codes in the last chapter, no analogue of a weight enumerator has been introduced yet. Because any convolutional code has infinitely many polynomial codewords, it is not possible to simply sum up the monomials W^{wt(c)} for all c ∈ C, as has been done for block codes. There are different ways to solve this problem for convolutional codes. One is, for instance, to count only the weights of a class of codewords that are called atomic. Although the resulting object, a formal power series called the weight enumerator of a convolutional code, yields valuable information on the weight distribution of the entire code and is therefore interesting when studying the performance of the code, it was soon found that it does not carry enough data about the code to allow for a MacWilliams identity [34]. One of the reasons might be that the set of all atomic codewords of a convolutional code is neither a group nor has any comparable structure. However, this weight enumerator is a generalisation of the weight enumerator concept for block codes, because in the case of a block code it is identical to the classical weight enumerator.

3.1 The Complete Weight Adjacency Matrix and its Properties

In order to introduce a more refined weight counting object, recall that according to Proposition 2.21 a convolutional code C can be interpreted as a linear system over the field F by introducing the state space F^δ. Using the state space description given by the controller canonical form (A, B, C, D), one can represent the code in a directed graph whose vertices are the states X ∈ F^δ and which has an edge from state X ∈ F^δ to state Y ∈ F^δ if there is an input u ∈ F^k such that Y = XA + uB. This edge is then labelled with all pairs (u, vu) ∈ F^k × F^n, where vu = XC + uD. This graph, the state space graph, is indeed an equivalent representation of the convolutional code, which can be recovered from the graph. A considerable amount of information about the code is of course lost when only the weights of the outputs vu are collected in the labels of the edges. The resulting graph is a weighted directed graph, the weighted state space graph, with the same edges and vertices as the state space graph, but with different labels. Putting U := {u ∈ F^k | Y = XA + uB} and VU := {v ∈ F^n | ∃ u ∈ U : v = XC + uD}, the edge between X and Y is as follows:

X --we(VU)--> Y.

Both objects, the state space graph and the weighted state space graph, have been studied in the past and have proven useful in understanding the weight distribution of convolutional codes. In particular, the class of atomic codewords mentioned above may be derived from the state space graph. An object not studied before is obtained by replacing the labels of the weighted state space graph with the complete weight enumerator of VU:

X --cwe(VU)--> Y.

As may be expected, the amount of information about the code contained in this graph, the complete weighted state space graph, lies between the amounts contained in the other two graphs. From the two weighted state space graphs one can derive weight counting objects for convolutional codes by representing them by their adjacency matrices.

Definition 3.1 Let F := F^δ × F^δ. The complete weight adjacency matrix (cWAM)

Γ(G) = (γ_{X,Y}) ∈ C[Wa | a ∈ F]^{q^δ × q^δ}

is defined to be the matrix indexed by (X, Y) ∈ F with the entries

γ_{X,Y} := cwe({XC + uD | u ∈ F^k : Y = XA + uB}) ∈ C[Wa | a ∈ F]_{≤n}.

Likewise, the weight adjacency matrix (WAM)

Λ(G) = (λ_{X,Y}) ∈ C[W]^{q^δ × q^δ}

is defined to be the matrix indexed by (X, Y) ∈ F with the entries

λ_{X,Y} := we({XC + uD | u ∈ F^k : Y = XA + uB}) ∈ C[W]_{≤n}.

A pair of states (X, Y) ∈ F is called connected if λ_{X,Y} ≠ 0, else it is called disconnected. The set of all connected state pairs is denoted by ∆ ⊆ F.

The WAM has been introduced and studied extensively in [24]. Because the entries of the cWAM are complete weight enumerators of subsets of F^n, they are all either homogeneous polynomials of degree n or 0. This is different for the WAM: its entries are polynomials of degree at most n.

Remark 3.2 It has been shown in Lemma 2.4 that the weight enumerator of a block code can be derived from the complete weight enumerator by applying the map ι. This technique works for the cWAM and the WAM as well, i.e.

Λ(G) = ι(Γ(G)),

where the map ι has to be applied entrywise.
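For a small concrete instance of Definition 3.1, the sketch below (my code) computes the WAM of the binary encoder G = (1 + z, 1), which has δ = 1 and controller canonical form A = (0), B = (1), C = (1 0), D = (1 1); entries are stored as weight enumerator coefficient lists.

```python
from itertools import product

q, n, k, delta = 2, 2, 1, 1
A = [[0]]
B = [[1]]
C = [[1, 0]]
D = [[1, 1]]

def mat_vec(x, M):
    return tuple(sum(xi * Mi[j] for xi, Mi in zip(x, M)) % q
                 for j in range(len(M[0])))

def vec_add(x, y):
    return tuple((a + b) % q for a, b in zip(x, y))

states = list(product(range(q), repeat=delta))

def lam(X, Y):
    """WAM entry we({XC + uD | u : Y = XA + uB}) as coefficients [A_0, ..., A_n]."""
    coeffs = [0] * (n + 1)
    for u in product(range(q), repeat=k):
        if vec_add(mat_vec(X, A), mat_vec(u, B)) == Y:
            v = vec_add(mat_vec(X, C), mat_vec(u, D))
            coeffs[sum(1 for x in v if x != 0)] += 1
    return coeffs

Lam = {(X, Y): lam(X, Y) for X in states for Y in states}
```

This yields Λ(G) = (1 W²; W W), so every state pair of this small code is connected.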


Observe that in the case δ = 0 the matrices A, B, C do not exist, while D = G. As a consequence, Γ = γ_{0,0} = cwe(C) is the ordinary complete weight enumerator of the block code C = {uG | u ∈ F^k} ⊆ F^n, and using Remark 3.2 one has Λ = λ_{0,0} = we(C). It is clear from Definition 3.1 that both the cWAM and the WAM depend on the chosen encoder G. This dependence, however, can be described nicely. Since I will make intensive use of the notation later on, I introduce the following.

Definition 3.3 For any P ∈ GL_δ(F) define P(P) ∈ GL_{q^δ}(C) by P(P)_{X,Y} = 1 if Y = XP and P(P)_{X,Y} = 0 else. Furthermore, let Π := {P(P) | P ∈ GL_δ(F)} denote the subgroup of all such permutation matrices.

By definition, the matrix P(P) corresponds to the permutation of the set F^δ induced by the isomorphism P. Notice that P is an isomorphism of groups and is basically the canonical faithful permutation representation of the group GL_δ(F). As a consequence, for all P, Q ∈ GL_δ(F) one has

P(P)P(Q) = P(PQ) and P(P^{-1}) = P(P)^{-1} = P(P)ᵗ. (3.1)

Obviously, for any Γ ∈ C[Wa | a ∈ F]^{q^δ × q^δ} or Λ ∈ C[W]^{q^δ × q^δ} and any P := P(P) ∈ Π the following identities hold:

(PΓP^{-1})_{X,Y} = Γ_{XP,YP} and (PΛP^{-1})_{X,Y} = Λ_{XP,YP} for all (X, Y) ∈ F. (3.2)

This enables me to collect the following facts about both kinds of adjacency matrices.

Remark 3.4

(a) Using the obvious fact wt(αv) = wt(v) for any α ∈ F∗ and v ∈ Fn one im-mediately has λX,Y = λαX,αY for all α ∈ F∗. Hence Λ(G) is invariant underconjugation with permutation matrices that are induced by scalar multiplica-tion on Fδ, i. e., under conjugation with matrices P(P ) where P = αI for someα ∈ F∗. This is not true for the cWAM.

(b)In [8, Thm. 4.1] it has been shown that if G1, G2 ∈ F[z]k×n are two minimalencoders of C then Λ(G1) = PΛ(G2)P−1 for some P ∈ Π. Hence the equivalenceclass of Λ(G) modulo conjugation by Π, where G is any minimal encoder, formsan invariant of the code. It is called the generalised weight adjacency matrixof C. I will generalise this result for the cWAM after this remark using the samearguments to prove this result used in [11].

(c) Combining (b) and (a) one sees that the equivalence class of Λ(G) is already fully obtained by conjugating Λ(G) with matrices P(P) where P runs through the projective linear group GL_δ(F)/{αI | α ∈ F^*}. This reduces the effort when computing examples.

(d) For the cWAM one has γ_{αX,αY} = m_α(γ_{X,Y}) for all α ∈ F^*, due to (2.1).


Proposition 3.5 Let C ≤ F[z]^n be an (n, k, δ) convolutional code and let (A, B, C, D) as well as (Ā, B̄, C̄, D̄) be canonical minimal realisations of C. There is a P ∈ GL_δ(F) such that the cWAMs Γ and Γ̄ satisfy, for all (X, Y) ∈ F,

Γ̄_{X,Y} = Γ_{XP,YP}. (3.3)

In particular, Γ̄ = P(P)ΓP(P)^{-1} and, factoring out the conjugation of Γ with such permutation matrices, the coset of Γ, the generalised complete weight adjacency matrix, is an invariant of the code.

Proof: Theorem 2.6(ii) in [11] establishes the following connection between the two realisations (A, B, C, D) and (Ā, B̄, C̄, D̄): there are P ∈ GL_δ(F), U ∈ GL_k(F) and M ∈ F^{δ×k} such that

Ā = P^{-1}(A − MB)P, B̄ = UBP, C̄ = P^{-1}(C − MD), D̄ = UD. (3.4)

A proof of this fact is given in Chapter 6, Proposition 6.4. But then one can straightforwardly check that for any (X, Y, u, v) ∈ F^δ × F^δ × F^k × F^n

Y = XA + uB, v = XC + uD

is equivalent to

YP = (XP)Ā + (uU^{-1} + XMU^{-1})B̄, v = (XP)C̄ + (uU^{-1} + XMU^{-1})D̄.

Since for any given X the mapping u ↦ uU^{-1} + XMU^{-1} is bijective on F^k, the assertion is immediate from the definition of the cWAM. □

3.2 The Space of State Pairings

As the entries of each representative of a weight adjacency matrix of a convolutional code's realisation are indexed by a pair of states (X, Y) ∈ F, it is crucial to understand which state pairings lead to which entries in the WAM and the cWAM. From Definition 3.1 it is clear that (X, Y) ∈ F is connected if and only if there is a u ∈ F^k such that (X, Y) = (X, XA + uB). Using rank B = r, I obtain

Proposition 3.6 ∆ = im ( I A ; 0 B ) is an F-vector space of dimension δ + r.
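The dimension count can be confirmed by brute force on a toy instance; the matrices A and B below are hypothetical controller-form data chosen only for illustration, not those of any example in this text:

```python
import itertools
import numpy as np

q, delta, k, r = 3, 2, 1, 1
A = np.array([[0, 1], [0, 0]])        # hypothetical controller-form data
B = np.array([[0, 1]])                # rank B = r = 1
M = np.block([[np.eye(delta, dtype=int), A],
              [np.zeros((k, delta), dtype=int), B]])

# Delta is the row space of M over F_3; enumerating it confirms dim = delta + r
rows = [tuple(row) for row in M]
Delta = {tuple(sum(c * row[i] for c, row in zip(cf, rows)) % q for i in range(2 * delta))
         for cf in itertools.product(range(q), repeat=delta + k)}
assert len(Delta) == q**(delta + r)   # 27 connected state pairings
```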

Later on I will also need the dual of ∆ in F with respect to the form

β : F × F → F, ((X, Y), (X′, Y′)) ↦ Σ_{i=1}^{δ} (X_iX′_i + Y_iY′_i),

i.e. the standard bilinear form on F^{2δ}.

It may seem like abuse of notation to use a different symbol for this form, as it is principally defined in the same way as [·,·], but as both forms will appear side by side in the following, it is necessary to distinguish clearly between them. The form β is defined exclusively on the space of state pairings, whereas the form [·,·] is defined on convolutional and block codes. However, I will use the same symbol ⊥ as for block codes to indicate the dual with respect to β, as it will always be clear from the context which dual is meant. With the help of Remark 2.23 the dual of ∆ in F can easily be calculated and is given as follows.

Lemma 3.7 ∆^⊥ = {(X, −XA) | X = (X₁, …, X_δ) ∈ F^δ such that X_j = 0 for j ∈ J}.

In the next lemma I will show that the nontrivial entries γ_{X,Y} and λ_{X,Y} of the cWAM and the WAM, respectively, can be described as weight enumerators of certain cosets of the block code C^const. More precisely, I will relate them to the F-vector space homomorphism

ϕ : F → F^n, (X, Y) ↦ XC + YB^tD. (3.5)

Recall the notation 〈a, U〉 introduced in (2.2).

Lemma 3.8 For any state pair (X, Y) ∈ ∆ it is

γ_{X,Y} = cwe(ϕ(X, Y) + C^const) and λ_{X,Y} = we(ϕ(X, Y) + C^const).

Moreover, in the case of the simple WAM,

λ_{X,Y} = we(C^const), if ϕ(X, Y) ∈ C^const,
λ_{X,Y} = (1/(q−1)) ( we(⟨ϕ(X, Y), C^const⟩) − we(C^const) ), else.

Proof: First notice that for any (X, Y) ∈ ∆ the set {u ∈ F^k | Y − XA = uB} is non-empty. Right-multiplying the defining equation of this set by B^t, I get upon use of Remark 2.23 that YB^t = uBB^t, which says that the first r entries of u are completely determined by Y. This shows {u ∈ F^k | Y − XA = uB} ⊆ YB^t + im(0, I_{k−r}). From im(0, I_{k−r}) = ker B, see Remark 2.23, I conclude that these two affine subspaces coincide. Hence, using Remark 2.24(2), I obtain

γ_{X,Y} = cwe(XC + (YB^t + ker B)D) = cwe(ϕ(X, Y) + (ker B)D) = cwe(ϕ(X, Y) + C^const).

This shows the first part of the lemma, as the assertion on the entries of the WAM is easily deduced from it. If ϕ(X, Y) ∈ C^const, I immediately conclude λ_{X,Y} = we(C^const). Otherwise I have

λ_{X,Y} = we(ϕ(X, Y) + C^const) = we(α(ϕ(X, Y) + C^const)) = we(αϕ(X, Y) + C^const)

for all α ∈ F^*. Moreover,

⟨ϕ(X, Y), C^const⟩ = ⋃_{α∈F} (αϕ(X, Y) + C^const),

where due to ϕ(X, Y) ∉ C^const this union is disjoint. From this the last assertion can be deduced. □
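The coset formula in the second part of Lemma 3.8 is a statement about block codes only and can be tested directly. In the sketch below the code and the coset representative are chosen by me purely for illustration:

```python
import itertools
from collections import Counter

q, n = 3, 3

def span(gens):
    """All F_3-linear combinations of the generator rows."""
    return {tuple(sum(c * g[i] for c, g in zip(cf, gens)) % q for i in range(n))
            for cf in itertools.product(range(q), repeat=len(gens))}

def we(vecs):
    """Weight enumerator as a coefficient list: we(vecs)[w] = #{v : wt(v) = w}."""
    cnt = Counter(sum(1 for x in v if x) for v in vecs)
    return [cnt.get(w, 0) for w in range(n + 1)]

C = span([(1, 1, 2)])                  # a 1-dimensional block code over F_3
v = (1, 0, 0)                          # coset representative with v not in C
assert v not in C
coset = {tuple((v[i] + c[i]) % q for i in range(n)) for c in C}
# we(v + C) = (we(<v, C>) - we(C)) / (q - 1), the second case of the lemma
lhs = we(coset)
rhs = [(a - b) // (q - 1) for a, b in zip(we(span([(1, 1, 2), v])), we(C))]
assert lhs == rhs
```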

The lemma shows that the mapping ϕ and the block code C^const, along with the knowledge of ∆, fully determine Λ(G). Moreover, to find out how many state pairs (X, Y) ∈ ∆ are mapped to C^const, I will slightly modify the mapping ϕ.

Lemma 3.9 The homomorphism

Φ : ∆ → C_C / C^const, (X, Y) ↦ ϕ(X, Y) + C^const

is well-defined, surjective and satisfies

(a) ker Φ = {(X, Y) ∈ ∆ | ϕ(X, Y) ∈ C^const},
(b) dim ker Φ = δ − r, where r is as in Definition 2.22,
(c) (X, Y) ∈ ker Φ ⟺ there is an edge X → Y with output 0, i.e. with label u|0 for some u ∈ F^k.

Proof: That Φ is well-defined follows simply from im ϕ ⊆ C_C. As for the surjectivity, notice that any row of ( C ; D ) that is not in C^const is a row of the matrix ( C ; BB^tD ), see also Remark 2.24(2). Moreover, by Remark 2.23 I have

im ( C ; BB^tD ) = im ( ( I A ; 0 B ) ( C ; B^tD ) ) = ϕ(∆),

where the latter follows from the definition of the mapping ϕ along with Proposition 3.6. All this implies the surjectivity of Φ. Now part (a) is trivial. The surjectivity together with dim ∆ = δ + r yields (b), since dim C_C = k + r and dim C^const = k − r.
Let (X, Y) ∈ ker Φ. Then λ_{X,Y} = we(C^const). As C^const is a vector space, 0 ∈ C^const, and therefore there is a transition from X to Y with output 0 for some u ∈ F^k. Conversely, if there is a zero-transition for some u ∈ F^k and (X, Y) ∈ F, this implies that the respective coset of C^const is a vector space as it contains 0. According to Lemma 3.8 this implies ϕ(X, Y) ∈ C^const, which is due to part (a) equivalent to (X, Y) ∈ ker Φ. □

As a consequence of Lemma 3.9 one has

ϕ(∆) + C^const = C_C. (3.6)

After these preparations I will clarify some more redundancies in the adjacency matrix of C.

Proposition 3.10 Let ∆* ⊆ ∆ be any subspace such that ∆ = ∆* ⊕ ker Φ. Moreover, define ∆⁻ := ⟨(0, e_i) | i ∉ I⟩ ⊆ F. Then

(a) ∆ ⊕ ∆⁻ = F, hence ∆* ⊕ ker Φ ⊕ ∆⁻ = F.


(b) For (X, Y) ∈ ∆⁻ and (X′, Y′) ∈ ∆ one has γ_{X+X′,Y+Y′} = 0 and λ_{X+X′,Y+Y′} = 0 if and only if (X, Y) ≠ 0.

(c) For (X, Y) ∈ ∆* and (X′, Y′) ∈ ker Φ one has

γ_{X+X′,Y+Y′} = γ_{X,Y} and λ_{X+X′,Y+Y′} = λ_{X,Y}.

Proof: (a) ∆ ∩ ∆⁻ = 0 follows from e_i ∉ im B for i ∉ I. The rest is clear since dim ∆⁻ = δ − r = 2δ − dim ∆. (b) is obvious from the first direct sum in (a) and the definition of ∆. As for (c), notice that by linearity and Lemma 3.9(a) one has ϕ(X, Y) − ϕ(X + X′, Y + Y′) ∈ C^const. Hence ϕ(X, Y) + C^const = ϕ(X + X′, Y + Y′) + C^const and the result follows from Lemma 3.8. □

Concerning Proposition 3.10(c) it is worth mentioning that the converse statement

[γ_{X,Y} = γ_{X′,Y′} (or even λ_{X,Y} = λ_{X′,Y′}) ⟹ (X, Y) − (X′, Y′) ∈ ker Φ]

is in general not true, as different affine sets may well have the same complete weight enumerator. Moreover, notice that the results above are obviously true for any direct complement of ∆ in F. The particular choice of ∆⁻ will play an important role due to the following corollary.

Corollary 3.11 One has ϕ|_{∆⁻} = 0 and C_C = ⋃_{(X,Y)∈∆*} (ϕ(X, Y) + C^const), with the union being disjoint.

Proof: The first part follows directly from the definition of all objects involved. The inclusion “⊇” of the second statement is obvious. For the other inclusion let XC + uD ∈ C_C for some (X, u) ∈ F^{δ+k}. Using that im D = im B^tD + C^const, see Remark 2.24(2), this yields XC + uD = XC + YB^tD + a for some Y ∈ F^δ and a ∈ C^const. Hence XC + uD ∈ ϕ(X, Y) + C^const where (X, Y) ∈ F. Now ϕ|_{∆⁻} = 0 and Lemma 3.9(a) imply that without loss of generality (X, Y) ∈ ∆*. The disjointness of the union follows from ∆* ∩ ker Φ = 0 together with the same lemma. □

Finally, the terminology developed so far may be used to describe the cWAM and the WAM conclusively.

Proposition 3.12 For the cWAM Γ one has

γ_{X,Y} = 0, if (X, Y) ∉ ∆,
γ_{X,Y} = cwe(C^const), if (X, Y) ∈ ker Φ,
γ_{X,Y} = cwe(ϕ(X, Y) + C^const), else,

and for the WAM

λ_{X,Y} = 0, if (X, Y) ∉ ∆,
λ_{X,Y} = we(C^const), if (X, Y) ∈ ker Φ,
λ_{X,Y} = we(ϕ(X, Y) + C^const), else.


The proposition is already proven by the considerations above together with Lemma 3.8.

Remark 3.13 Proposition 3.12 allows the clarification of many of the redundancies involved in a typical (complete) WAM (see Example 3.18). From the first case it is immediate that |F| − |∆| = q^{2δ} − q^{δ+r} = q^δ(q^δ − q^r) entries of the cWAM and WAM are 0. Considering that most convolutional codes considered by engineers are one-dimensional codes, and hence r = 1, with a complexity that is typically greater than 1, this explains why a (complete) WAM of such a code is usually a sparse matrix. Moreover, given any non-zero entry of the (complete) WAM, one can predict that there will be q^{dim ker Φ} − 1 = q^{δ−r} − 1 further entries of the (complete) WAM that are equal to that entry, due to Proposition 3.10(c). For the WAM the redundancy is even higher, due to Remark 3.4(a), i.e. each non-zero entry in the matrix has at least multiplicity q^{δ−r+1}. Exploiting these redundancies allows one to compute the (complete) WAM considerably faster, especially if codes of high complexity and high dimension are considered. Without the information on redundancy in the matrix, the high complexity demands many entries to be computed and the high dimension, in addition, makes the calculation of the (complete) weight enumerators costly.
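For the parameters that will occur in Example 3.18 (q = 3, δ = 2, r = 1), these counts can be spelled out explicitly:

```python
q, delta, r = 3, 2, 1                 # parameters as in Example 3.18
total_entries = q**(2 * delta)        # the (c)WAM is a q^delta x q^delta matrix
connected = q**(delta + r)            # |Delta| = q^(delta+r) connected pairings
assert total_entries - connected == q**delta * (q**delta - q**r) == 54
copies = q**(delta - r)               # size of each coset of ker(Phi) inside Delta
assert copies - 1 == 2                # each non-zero entry recurs at least twice more
```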

3.3 Further Results on the Weight Adjacency Matrices

In this small section I will collect some results that will be used later on but can already be proven at this point. In the proofs the reader may familiarise him- or herself with the notation for the space of state pairings introduced above and see in which way it helps to deal with the entries of the adjacency matrices.

Proposition 3.14 The entries of the adjacency matrix satisfy

Σ_{(X,Y)∈∆*} γ_{X,Y} = cwe(C_C) and Σ_{(X,Y)∈F} γ_{X,Y} = Σ_{(X,Y)∈∆} γ_{X,Y} = q^{δ−r} cwe(C_C),

and hence

Σ_{(X,Y)∈∆*} λ_{X,Y} = we(C_C) and Σ_{(X,Y)∈F} λ_{X,Y} = Σ_{(X,Y)∈∆} λ_{X,Y} = q^{δ−r} we(C_C).

Proof: Using Lemma 3.8 and Corollary 3.11 I obtain

Σ_{(X,Y)∈∆*} γ_{X,Y} = Σ_{(X,Y)∈∆*} cwe(ϕ(X, Y) + C^const) = cwe(C_C).

Next notice that Σ_{(X,Y)∈F} γ_{X,Y} = Σ_{(X,Y)∈∆} γ_{X,Y}, as any disconnected state pair (X, Y) satisfies γ_{X,Y} = 0. Hence with Proposition 3.10(c) and Lemma 3.9(b) I get

Σ_{(X,Y)∈∆} γ_{X,Y} = Σ_{(X′,Y′)∈ker Φ} Σ_{(X,Y)∈∆*} γ_{X+X′,Y+Y′} = Σ_{(X′,Y′)∈ker Φ} Σ_{(X,Y)∈∆*} γ_{X,Y}
= Σ_{(X′,Y′)∈ker Φ} cwe(C_C) = q^{δ−r} cwe(C_C). □

The next two results illustrate how one works with a WAM or cWAM as a representative and how the permutation matrices that connect the different representatives in one class are used to prove that two classes are equal.

Proposition 3.15 Let the data be as in Definition 2.20. Furthermore, let ρ(G) be the reciprocal matrix of G as in Proposition 2.25. Then the controller canonical form of ρ(G) is given by (A, B, C^ρ, D^ρ), where

( C^ρ ; D^ρ ) = ( TA^t TB^t ; BT I − BB^t ) ( C ; D )

and

T = diag(T₁, …, T_r) ∈ GL_δ(F) with T_i ∈ GL_{δ_i}(F) the anti-identity matrix (ones on the anti-diagonal, zeros elsewhere). (3.7)

Moreover, the matrix L := ( TA^t TB^t ; BT I − BB^t ) satisfies LL^t = I_{δ+k}, thus L ∈ GL_{δ+k}(F).

Proof: Since G and ρ(G) are both minimal with the same row degrees δ₁, …, δ_k, it is clear that the controller canonical form of ρ(G) has, just like G, state transition matrix A and input-to-state matrix B. The identities for C^ρ and D^ρ follow straightforwardly from the form of A, B, C, D as in Definition 2.20 along with the simple matrix identities

A^tA + B^tB = I, I − BB^t = ( 0 0 ; 0 I_{k−r} ), T = T^{-1} = T^t, and TA^tT = A (3.8)

as well as the fact that

C^ρ = ( C^ρ_1 ; … ; C^ρ_r ), where C^ρ_i = ( g_{i,δ_i−1} ; … ; g_{i,1} ; g_{i,0} ), and D^ρ = ( g_{1,δ_1} ; … ; g_{r,δ_r} ; g_{r+1,0} ; … ; g_{k,0} ).

The identity LL^t = I_{δ+k} can easily be verified in the same way using (3.8) and Remark 2.23. □
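The matrix T in (3.7) is a block-diagonal reversal matrix, and the properties T = T⁻¹ = Tᵗ used in (3.8) are immediate to confirm numerically; the row degrees below are hypothetical:

```python
import numpy as np

def reversal(deg):
    """Block-diagonal T = diag(T_1,...,T_r): each T_i is the anti-identity of size delta_i."""
    delta = sum(deg)
    T = np.zeros((delta, delta), dtype=int)
    off = 0
    for d in deg:
        T[off:off + d, off:off + d] = np.fliplr(np.eye(d, dtype=int))
        off += d
    return T

T = reversal([2, 1])                           # hypothetical row degrees delta_1 = 2, delta_2 = 1
assert (T @ T == np.eye(3, dtype=int)).all()   # T = T^{-1}
assert (T == T.T).all()                        # T = T^t
```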

Now it is easy to present the cWAM of the reversal code.

Corollary 3.16 Let Γ be the complete weight adjacency matrix of C associated with the controller canonical form (A, B, C, D). Then the complete weight adjacency matrix Γ^ρ of the reversal code ρ(C) associated with the controller canonical form (A, B, C^ρ, D^ρ) given in Proposition 3.15 satisfies

Γ^ρ_{X,Y} = Γ_{YT,XT} for all (X, Y) ∈ F, (3.9)

where T ∈ GL_δ(F) is as in (3.7).


The assertion may be perceived intuitively. The time reversal by which the reversal code is defined corresponds to a reversal of the direction of the edges in the state space graph, i.e. an edge X → Y is transformed into an edge X ← Y by the time reversal. By virtue of the projection map ι a similar result may be obtained for the WAM Λ.

Proof: Because

∆ = {(X, Y) ∈ F | Y = XA + uB for some u ∈ F^k} = {(X, Y) ∈ F | Λ_{X,Y} ≠ 0},

it is Γ^ρ_{X,Y} ≠ 0 ⟺ (X, Y) ∈ ∆ and Γ_{YT,XT} ≠ 0 ⟺ (YT, XT) ∈ ∆. Hence one has first to verify that (X, Y) ∈ ∆ ⟺ (YT, XT) ∈ ∆. Using the description of ∆ in Proposition 3.6 as well as the matrix L from Proposition 3.15 along with the identities in (3.8), the latter equivalence follows directly from

im ( ( I A ; 0 B ) ( 0 T ; T 0 ) ) = im ( AT T ; BT 0 ) = im ( L^{-1} ( I A ; 0 B ) ) = im ( I A ; 0 B ).

Now it remains to show identity (3.9) for (X, Y) ∈ ∆. Thus, fix (X, Y) ∈ ∆. From Lemma 3.8 I know that

Γ^ρ_{X,Y} = cwe(XC^ρ + YB^tD^ρ + ρ(C)^const) and Γ_{YT,XT} = cwe((YT)C + (XT)B^tD + C^const). (3.10)

Since C^const is generated by the constant rows of the minimal encoder G, it follows directly from the definition of the reciprocal matrix in (2.8) that ρ(C)^const = C^const. Furthermore, since (X, Y) ∈ ∆, there exists u ∈ F^k such that Y = XA + uB. Using Proposition 3.15 and (3.8) as well as Remark 2.23 one computes

XC^ρ + YB^tD^ρ = XTA^tC + XTB^tD + YB^tBTC + YB^t(I − BB^t)D
= XATC + XTB^tD + XAB^tBTC + uBB^tBTC
= (XA + uB)TC + (XT)B^tD = (YT)C + (XT)B^tD.

With the aid of (3.10) this proves (3.9) for all (X, Y) ∈ F. □

It should be pointed out that this result reflects the well-known fact that a code and its reversal code share all important weight invariants such as the weight enumerator: the (complete) WAM carries information about these parameters, and the (complete) WAM of a code and that of its reversal differ only by a transposition.
The next result illustrates how the representation of the code as a directed weighted graph is reflected by the (complete) weight adjacency matrix. Given two block codes C ≤ F^n and C′ ≤ F^{n′} of dimension k and k′, the (outer) direct sum C ⊕ C′ ≤ F^{n+n′} is obviously a code of dimension k + k′. The weight enumerators of this code can be derived from the weight enumerators of the original codes in the following manner:

cwe(C ⊕ C′) = cwe(C)cwe(C′) and hence we(C ⊕ C′) = we(C)we(C′).

Note that the code C⊥ from Example 2.16 is indeed an outer direct sum of two smaller codes, namely C⊥ = im(1 1) ⊕ im(1), and it is straightforward to verify the identities above.
These identities hold even for arbitrary sets rather than codes. In particular, the minimal distance of the code C ⊕ C′ is known to be the minimum of the minimal distances of the original codes. Moreover, this code inherits additional structural properties, e.g. self-orthogonality or self-duality. Although for applications this construction is not interesting, it is mathematically useful as a very simple way to construct a code with given length, dimension, minimal distance, weight distribution and structural properties, and it is for instance used in [25].
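For the decomposition C⊥ = im(1 1) ⊕ im(1) just mentioned, the product formula can be verified directly; the coefficient-list encoding of we is my own convention for this sketch:

```python
import itertools
from collections import Counter

q = 2

def code(gens, n):
    """All F_2-linear combinations of the generator rows."""
    return {tuple(sum(c * g[i] for c, g in zip(cf, gens)) % q for i in range(n))
            for cf in itertools.product(range(q), repeat=len(gens))}

def we(vecs, n):
    cnt = Counter(sum(1 for x in v if x) for v in vecs)
    return [cnt.get(w, 0) for w in range(n + 1)]

C  = code([(1, 1)], 2)                 # im(1 1)
Cp = code([(1,)], 1)                   # im(1)
direct_sum = {c + cp for c in C for cp in Cp}

# polynomial product of the two weight enumerators = coefficient convolution
prod = [0] * 4
for i, a in enumerate(we(C, 2)):
    for j, b in enumerate(we(Cp, 1)):
        prod[i + j] += a * b
assert we(direct_sum, 3) == prod
```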

Therefore it is a valid question whether one can give an analogous statement for convolutional codes. Given two convolutional codes C ≤ F[z]^n and C′ ≤ F[z]^{n′} of dimension k and k′ and complexity δ and δ′ with basic and minimal encoders G ∈ F[z]^{k×n} and G′ ∈ F[z]^{k′×n′}, the (outer) direct sum C⊕ := C ⊕ C′ is a convolutional code of length n + n′, dimension k + k′ and complexity δ + δ′. This is immediately derived by looking at the generator matrix

G⊕ := ( G 0 ; 0 G′ )

of C ⊕ C′, which is again basic and minimal. Of course, the minimal distance of this code is again the minimum of the minimal distances of the original codes. Hence it remains to check if and how the weight adjacency matrices of the new code can be derived from those of the original codes. Let the controller canonical forms of the original codes be (A, B, C, D) and (A′, B′, C′, D′) respectively. It is straightforward to see from the encoder G⊕ that

( ( A 0 ; 0 A′ ), ( B 0 ; 0 B′ ), ( C 0 ; 0 C′ ), ( D 0 ; 0 D′ ) )

is the controller canonical form of G⊕. Any state pairing of F ⊕ F′ can be written as ((X, X′), (Y, Y′)) where (X, Y) ∈ F and (X′, Y′) ∈ F′. Using the definition of the space ∆ of connected state pairings one sees that

(X, X′, Y, Y′) ∈ ∆⊕ ⟺ (X, Y) ∈ ∆ and (X′, Y′) ∈ ∆′.

Recalling that the map ϕ can be used to describe the entries of the adjacency matrix, and having clarified what its range is, one checks that

ϕ⊕(X, X′, Y, Y′) = (ϕ(X, Y), ϕ′(X′, Y′)).

Together with C⊕^const = C^const ⊕ C′^const one sees that the label of an edge from state (X, X′) to state (Y, Y′) is

cwe(ϕ⊕(X, X′, Y, Y′) + C⊕^const) = cwe((ϕ(X, Y), ϕ′(X′, Y′)) + (C^const, C′^const))
= cwe((ϕ(X, Y) + C^const, ϕ′(X′, Y′) + C′^const)).

Using the result from block coding theory, it is by virtue of Lemma 3.8 for all (X, X′, Y, Y′) ∈ ∆ ⊕ ∆′

γ⊕_{(X,X′),(Y,Y′)} = cwe((ϕ(X, Y) + C^const, ϕ′(X′, Y′) + C′^const))
= cwe(ϕ(X, Y) + C^const) cwe(ϕ′(X′, Y′) + C′^const) = γ_{X,Y} γ′_{X′,Y′}. (3.11)


Of course, these equations remain true if one replaces the complete weight enumerator cwe with the ordinary weight enumerator we, due to Lemma 2.4.
Summarising these results, the weighted directed state graph of the encoder G⊕ of C⊕ has an edge from (X, X′) to (Y, Y′) if and only if there are edges from X to Y and from X′ to Y′ in the weighted directed state graphs of the encoders G and G′, and the weight of this edge is precisely the product of the weights of the original edges. Hence the weighted directed state graph of G⊕ is just the direct product of the weighted directed graphs of G and G′. This insight allows me to use a standard result from graph theory to show how all weight adjacency matrices involved are related:

Proposition 3.17 With the data used above let Γ, Γ′ and Λ, Λ′ be the cWAMs and the WAMs of the encoders G and G′ of the convolutional codes C and C′. The cWAM and WAM of the encoder G⊕ of the convolutional code C ⊕ C′ are

Γ⊕ = Γ ⊗ Γ′ and Λ⊕ = Λ ⊗ Λ′,

where ⊗ denotes the Kronecker product of matrices.

Proof: The result is due to the fact that the weight adjacency matrix of a direct product of two graphs is just the Kronecker product of the weight adjacency matrices of the original graphs. □
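At bottom, Proposition 3.17 is the adjacency-matrix rule for direct products of graphs. With integers standing in for the polynomial entries (a simplification for this sketch only), the index bookkeeping of the Kronecker product looks as follows:

```python
import numpy as np

# toy 2x2 "WAMs"; the integers are placeholders for the polynomial entries
Lam = np.array([[1, 2], [3, 4]])
Lamp = np.array([[5, 6], [7, 8]])

Lam_sum = np.kron(Lam, Lamp)       # adjacency matrix of the direct product graph

# the entry indexed by ((X,X'),(Y,Y')) is the product of the two original entries:
X, Xp, Y, Yp = 0, 1, 1, 0
assert Lam_sum[X * 2 + Xp, Y * 2 + Yp] == Lam[X, Y] * Lamp[Xp, Yp]
```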

There is, however, an important difference to the classical weight enumerator result. Whereas one can define the weight enumerator of any subset of F^n, it is not possible to define a weight adjacency matrix for an arbitrary subset of F[z]^n, which is due to the fact that one needs to find a representation of the set in terms of a directed weighted graph.

Again, I will close the chapter by showing how the notions defined are used on a concrete example.

Example 3.18 Recall the data of the codes C and C̄ used in Example 2.32. The codes C and C̄ obviously have r = 1 and r̄ = 1. Therefore dim ∆ = dim ∆̄ = 3 and dim ker Φ = dim ker Φ̄ = 1. This is helpful information for the computation of the cWAM and the WAM, the latter of which will be derived from the cWAM by virtue of ι. Using Remark 3.13 and Proposition 3.6 one finds that any state pairing

(X, Y) ∉ ∆ = im ( 1 0 0 1 ; 0 1 0 0 ; 0 0 1 0 )

results in γ_{X,Y} = 0 and λ_{X,Y} = 0. Hence there are only 3³ = 27 entries in the cWAM that need to be computed. Exploiting Proposition 3.10 one can reduce the computation even more. It is easy to see that ker Φ = ⟨((1, 2), (1, 1))⟩ = ker Φ̄. Hence the finding of the 27 entries of the cWAM can be reduced to computing 9 complete weight enumerators of one-dimensional affine sets. Finally it may directly


be seen from the encoder matrix G that dim C^const = 1. Hence any polynomial in the matrix will be the sum of three monomials of degree 3. Doing all this and abbreviating these nine complete weight enumerators one arrives at

θ₁ = W₀³ + 2W₀W₁W₂,  θ₂ = 2W₀W₁W₂ + W₂³,  θ₃ = 2W₀W₁W₂ + W₁³,
θ₄ = 2W₀²W₁ + W₀W₂²,  θ₅ = 2W₀W₂² + W₁²W₂,  θ₆ = W₀²W₁ + 2W₁²W₂,
θ₇ = 2W₀²W₂ + W₀W₁²,  θ₈ = W₀²W₂ + 2W₁W₂²,  θ₉ = 2W₀W₁² + W₁W₂²,

where one may again avoid computing weight enumerators by Remark 3.4(d), which gives θ₃ = m₂(θ₂), θ₄ = m₂(θ₇), θ₅ = m₂(θ₉) and θ₆ = m₂(θ₈).
Using the lexicographic ordering of the states in F₃²,

(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2), (3.12)

the cWAM to the encoder G of C is:

Γ(G) =
θ₁  0   0   θ₂  0   0   θ₃  0   0
θ₄  0   0   θ₅  0   0   θ₆  0   0
θ₇  0   0   θ₈  0   0   θ₉  0   0
0   θ₆  0   0   θ₄  0   0   θ₅  0
0   θ₉  0   0   θ₇  0   0   θ₈  0
0   θ₃  0   0   θ₁  0   0   θ₂  0
0   0   θ₈  0   0   θ₉  0   0   θ₇
0   0   θ₂  0   0   θ₃  0   0   θ₁
0   0   θ₅  0   0   θ₆  0   0   θ₄

The associated WAM of G is easily computed using ι and is given by

Λ(G) =
1+2W²   0       0       2W²+W³  0       0       2W²+W³  0       0
2W+W²   0       0       2W²+W³  0       0       W+2W³   0       0
2W+W²   0       0       W+2W³   0       0       2W²+W³  0       0
0       W+2W³   0       0       2W+W²   0       0       2W²+W³  0
0       2W²+W³  0       0       2W+W²   0       0       W+2W³   0
0       2W²+W³  0       0       1+2W²   0       0       2W²+W³  0
0       0       W+2W³   0       0       2W²+W³  0       0       2W+W²
0       0       2W²+W³  0       0       2W²+W³  0       0       1+2W²
0       0       2W²+W³  0       0       W+2W³   0       0       2W+W²

Likewise the cWAM associated with the state space realisation (Ā, B̄, C̄, D̄) of C̄ can be computed with the help of the same means as used previously. Again only nine weight enumerators have to be calculated. But this time C̄^const = 0, hence any non-zero entry of the cWAM is just a monomial of degree 3. This may be attributed to Proposition 2.27. Hence the cWAM of this code is much


easier computed to be

Γ(Ḡ) =
W₀³    0      0      W₂³    0      0      W₁³    0      0
W₀²W₂  0      0      W₁W₂²  0      0      W₀W₁²  0      0
W₀²W₁  0      0      W₀W₂²  0      0      W₁²W₂  0      0
0      W₀W₁²  0      0      W₀²W₂  0      0      W₁W₂²  0
0      W₁²W₂  0      0      W₀²W₁  0      0      W₀W₂²  0
0      W₁³    0      0      W₀³    0      0      W₂³    0
0      0      W₀W₂²  0      0      W₁²W₂  0      0      W₀²W₁
0      0      W₂³    0      0      W₁³    0      0      W₀³
0      0      W₁W₂²  0      0      W₀W₁²  0      0      W₀²W₂
(3.13)

Let me explicitly compute the entry γ̄_{(0,1),(0,0)}. It is ϕ̄((0, 1), (0, 0)) = (0, 2, 0). Hence γ̄_{(0,1),(0,0)} = cwe((0, 2, 0)) = W₀²W₂. Now according to Proposition 3.10 it is

γ̄_{(0,1),(0,0)} = γ̄_{(1,0),(1,1)} = γ̄_{(2,2),(2,2)} = W₀²W₂

and according to (2.1)

γ̄_{(0,2),(0,0)} = γ̄_{(2,0),(2,2)} = γ̄_{(1,1),(1,1)} = W₀²W₁.

Thus computing one complete weight enumerator gives six entries of the weight adjacency matrix. Using once more ι, the corresponding WAM is

Λ(Ḡ) =
1   0   0   W³  0   0   W³  0   0
W   0   0   W³  0   0   W²  0   0
W   0   0   W²  0   0   W³  0   0
0   W²  0   0   W   0   0   W³  0
0   W³  0   0   W   0   0   W²  0
0   W³  0   0   1   0   0   W³  0
0   0   W²  0   0   W³  0   0   W
0   0   W³  0   0   W³  0   0   1
0   0   W³  0   0   W²  0   0   W
(3.14)

The example demonstrates that the data of the code, that is, the complexity, the dimension and the parameters r and r̄, determine how much computation is involved in finding the (complete) WAM. Moreover, it is an example of a code and its dual where the computational effort to find the (complete) WAM differs greatly between the two codes. In this particular situation the computation of the cWAM of the dual code proved to be easier than computing the cWAM of the original code; fewer weights have to be calculated in order to obtain it. So here and in similar situations it would be nice to have a means to calculate the cWAM of the original code from the cWAM of the dual code in order to save computation time. In the case of block codes this is provided by the MacWilliams identity. For convolutional codes a MacWilliams identity will be given and proven in the next chapter, showing that the (complete) WAM of a code determines the respective object of the dual code via an explicit transformation formula.


4 The MacWilliams Identity for the Weight Adjacency Matrix of a Convolutional Code

The aim of this chapter is to formulate and prove a MacWilliams identity for the WAM and cWAM of convolutional codes in terms of an explicit transformation formula. In order to state this formula I have to introduce a complex-valued matrix which is defined via characters (see Chapter 2.1).

Definition 4.1 Fix a primitive p-th root of unity ζ, choose X ∈ F^δ and let

χ_X = ζ^{τ(β(X,·))} ∈ Hom(F^δ, C^*)

be the associated character. For P ∈ GL_δ(F) define

H(P) := q^{−δ/2} ( χ_{XP}(Y) )_{(X,Y)∈F} ∈ C^{q^δ × q^δ}.

A matrix H(P) for a P ∈ GL_δ(F) is called the P-MacWilliams matrix. For simplicity I also put H := H(I). For δ = 0 one has H = 1.

As indicated by the use of the dimension δ, the MacWilliams matrices correspond to the state space F^δ. Therefore, in the definition of the character χ_X I used the symbol β for the standard bilinear form rather than [·,·], as the latter is reserved for the standard form on F[z]^n and F^n. Later on I will also consider characters on F^{2δ} by deriving them from characters on F^δ using Proposition 2.11 v). Consequently the form used with these characters will also be denoted by β.

Remark 4.2 Notice that the MacWilliams matrices depend on δ. Since this parameter is fixed throughout this text (except for the examples), I will not explicitly denote this dependence. Again the existence of different MacWilliams matrices is due to how precisely the isomorphism in Proposition 2.11 i) is established. Moreover, the matrices depend on the choice of the primitive root ζ, as H_α did in the case of the block coding MacWilliams transform. This dependence, however, can easily be described. Suppose ζ₁ and ζ₂ are two primitive p-th roots of unity and let H₁ and H₂ be the corresponding I-MacWilliams matrices. Then ζ₁^d = ζ₂ for some 0 < d < p and, using the F_p-linearity of τ and the matrices P(P) from Definition 3.3, it is easy to check that H₂ = P(dI)H₁ = H(dI) = H₁P(d^{-1}I). Hence, as with the block coding MacWilliams transform, no MacWilliams matrix is “lost” by fixing ζ.

All MacWilliams matrices are invertible as a result of Proposition 2.11 b). The inverses of these matrices can easily be calculated; this and some additional properties are recorded next.

Lemma 4.3 The MacWilliams matrix H is symmetric and invertible. One has H² = P(−I) and hence H⁴ = I. Furthermore,

H(P) = P(P)H = HP((P^t)^{-1}) for all P ∈ GL_δ(F).

In particular, the inverse of a MacWilliams matrix is again a MacWilliams matrix.


Proof: For the computation of H² fix any pair (X, Y) ∈ F. Then, upon using the rules in Proposition 2.11(ii) and (iii),

(H²)_{X,Y} = q^{−δ} Σ_{Z∈F^δ} χ_X(Z)χ_Z(Y) = q^{−δ} Σ_{Z∈F^δ} χ_{X+Y}(Z) = 1 if Y = −X and 0 else,

that is, (H²)_{X,Y} = P(−I)_{X,Y}. The rest of the lemma can be checked in the same way, using again Proposition 2.11 iii). □
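For a prime field, where τ is the identity, the defining formula for H can be typed in directly. This sketch with p = 3 and δ = 1 confirms the statements of the lemma numerically:

```python
import numpy as np

p, delta = 3, 1                      # prime field F_3, so tau is the identity
q = p
N = q**delta
zeta = np.exp(2j * np.pi / p)        # a fixed primitive p-th root of unity

# chi_X(Y) = zeta^{X*Y}, the standard form beta on F_3
H = np.array([[zeta**((X * Y) % p) for Y in range(N)] for X in range(N)]) / np.sqrt(q**delta)

P_minus_I = np.zeros((N, N))
for X in range(N):
    P_minus_I[X, (-X) % p] = 1       # permutation matrix P(-I): X -> -X

assert np.allclose(H, H.T)                                   # H is symmetric
assert np.allclose(H @ H, P_minus_I)                         # H^2 = P(-I)
assert np.allclose(np.linalg.matrix_power(H, 4), np.eye(N))  # H^4 = I
```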

The matrix interpretation of the complete MacWilliams transform H defined in Definition 2.12 is the MacWilliams matrix H(I₁). However, the complete MacWilliams transform is a map acting on multivariate polynomials, whereas the MacWilliams matrices act on any matrix Ω of the same format by the conjugation Ω ↦ HΩH^{-1}. Indeed, this is precisely how the MacWilliams matrices will be applied. Finally, it is well known that for any δ, δ′ ∈ N one has

H(I_{δ+δ′}) = H(I_δ) ⊗ H(I_{δ′}). (4.1)

I am now prepared to state the MacWilliams identity formula for convolutional codes as it will be proven. Therefore let C ≤ F[z]^n be an (n, k, δ)-convolutional code and Γ its generalised cWAM. It will be shown that q^{n/2−k} H(HΓ^tH^{-1}) is the generalised cWAM of the dual code, where one may plug an arbitrary representative of Γ into the formula and has to apply the map H entrywise to the resulting matrix. An analogous formula for the generalised weight adjacency matrix Λ will be derived in parallel, using ι at an appropriate moment. By Lemma 2.14 one may already see that the formula for the generalised cWAM and WAM will hold for any MacWilliams matrix H(P) once it has been proven for H, due to the nature of these objects. This will be made more precise at the end of this chapter.
Moreover, for given representatives Γ of a realisation (A, B, C, D) of C and Γ̄ of a realisation (Ā, B̄, C̄, D̄) of C̄, the precise matrix P ∈ GL_δ(F) will be stated in dependence on the realisations such that

Γ̄ = q^{n/2−k} H( P(P)HΓ^tH^{-1}P(P)^{-1} ).

Again, the corresponding identity for the WAM will be stated as well. Having established these identities for H by this result, it will be easy to get similar identities for each H(P). Hence from now on I will only consider the MacWilliams matrix H, to simplify the notation.
The MacWilliams identities stated above will be proven using representatives for the cWAM and the WAM of a convolutional code and its dual. Therefore I fix the following data: C is an (n, k, δ)-convolutional code with encoder G, controller canonical form (A, B, C, D), Γ = (γ_{X,Y}) the corresponding cWAM and Λ = (λ_{X,Y}) its WAM. Moreover, by Proposition 2.27, C̄ is an (n, n−k, δ)-convolutional code with encoder Ḡ, controller canonical form (Ā, B̄, C̄, D̄), Γ̄ = (γ̄_{X,Y}) its cWAM and Λ̄ = (λ̄_{X,Y})


its WAM. Finally, as before, r and r̄ denote the number of polynomial rows in G and Ḡ respectively.
Using the symmetry of H and Lemma 4.3 one finds that

HΓH = P(−I)( HΓ^tH^{-1} )^t.

Investigating the matrix HΓH is therefore a good starting point for the quest to prove the MacWilliams identity. In a next step the block coding MacWilliams transform will be applied entrywise to Γ, and a strong relation between the two matrices will be established. Finally, at the end of the next section, it will be shown that there is an isomorphism on F that establishes a bijection between the entries of Γ̄ and those of q^{n/2−k}H(HΓ^tH^{-1}). Analogous results for the WAM will be established whenever necessary. Section 4.2 is then needed to show that there is an isomorphism that has the correct form to prove the MacWilliams identities as stated above.

4.1 The Matrix $H\Gamma H$

A direct description of the entries of the matrix $H\Gamma H$ will be needed. Therefore I abbreviate
$$L = (l_{X,Y})_{(X,Y)\in\mathcal{F}} = H\Gamma H. \tag{4.2}$$
The result of the next theorem is at first rather indigestible, but will be simplified considerably later in this chapter.

Theorem 4.4 Let $(X,Y) \in \mathcal{F}$. Then
$$l_{X,Y} = \begin{cases} 0, & \text{if } (X,Y) \notin (\ker\Phi)^\perp, \\ q^{-r}\,\mathrm{cwe}(C_{\mathcal{C}}), & \text{if } (X,Y) \in \Delta^\perp, \end{cases}$$
and in the remaining case
$$l_{X,Y} = q^{-r}\sum_{\alpha\in\mathbb{F}} \chi_{(X,Y)}(\alpha(\hat{X},\hat{Y})) \sum_{(X_1,Y_1)\in(X,Y)^\perp\cap\Delta^*} \mathrm{cwe}\bigl(\alpha\varphi(\hat{X},\hat{Y}) + \varphi(X_1,Y_1) + C^{\mathrm{const}}\bigr),$$
where $(\hat{X},\hat{Y}) \in \Delta^*$ is any state pairing such that $\mathbb{F}(\hat{X},\hat{Y}) \oplus (X,Y)^\perp = \mathcal{F}$. Furthermore,
$$l_{X+U,\,Y+V} = l_{X,Y} \quad\text{for all } (U,V) \in \Delta^\perp.$$

Proof: Using Proposition 2.11 d) one calculates
$$q^\delta l_{X,Y} = \sum_{(W,Z)\in\mathcal{F}} \chi_X(W)\,\gamma_{W,Z}\,\chi_Z(Y) = \sum_{(W,Z)\in\mathcal{F}} \chi_{(X,Y)}(W,Z)\,\gamma_{W,Z}.$$


For the first case let $(X,Y) \notin (\ker\Phi)^\perp$. This implies $\ker\Phi \nsubseteq (X,Y)^\perp$. Hence one can choose $(\hat{X},\hat{Y}) \in \ker\Phi$ such that $\mathcal{F} = \mathbb{F}(\hat{X},\hat{Y}) \oplus (X,Y)^\perp$. Using that $\gamma_{(X',Y')} = \gamma_{(X',Y')+(X'',Y'')}$ for any $(X',Y') \in \mathcal{F}$ and $(X'',Y'') \in \ker\Phi$, one finds
$$\begin{aligned}
l_{X,Y} &= q^{-\delta}\sum_{(W,Z)\in\mathcal{F}} \chi_{(X,Y)}(W,Z)\,\gamma_{W,Z}\\
&= q^{-\delta}\sum_{\alpha\in\mathbb{F}}\ \sum_{(X_1,Y_1)\in(X,Y)^\perp} \chi_{(X,Y)}\bigl(\alpha(\hat{X},\hat{Y}) + (X_1,Y_1)\bigr)\,\gamma_{\alpha(\hat{X},\hat{Y})+(X_1,Y_1)}\\
&= q^{-\delta}\sum_{\alpha\in\mathbb{F}}\ \sum_{(X_1,Y_1)\in(X,Y)^\perp} \chi_{(X,Y)}(\alpha(\hat{X},\hat{Y}))\,\gamma_{X_1,Y_1}\\
&= q^{-\delta}\sum_{\alpha\in\mathbb{F}} \chi_{(X,Y)}(\alpha(\hat{X},\hat{Y})) \sum_{(X_1,Y_1)\in(X,Y)^\perp} \gamma_{X_1,Y_1} = 0,
\end{aligned}$$
where the last identity is due to Proposition 2.11 b), as $\chi_{(X,Y)}$ is a non-trivial character. This completes the first case.
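The vanishing step above rests on the standard fact (Proposition 2.11 b)) that a non-trivial character sums to zero over the group. A quick numerical sanity check over the prime field $\mathbb{F}_5$, with the additive characters $\chi_t(a) = \zeta^{ta}$ for $\zeta = e^{2\pi i/5}$, is sketched below; the choice $q = 5$ is only for illustration.

```python
import cmath

q = 5
zeta = cmath.exp(2j * cmath.pi / q)   # primitive q-th root of unity

# chi_t(a) = zeta^(t*a) is an additive character of F_q; it is non-trivial iff t != 0.
for t in range(q):
    s = sum(zeta ** (t * a % q) for a in range(q))
    if t == 0:
        assert abs(s - q) < 1e-9      # trivial character sums to q
    else:
        assert abs(s) < 1e-9          # non-trivial characters sum to 0
print("character sums over F_5 behave as expected")
```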

For the second case let now $(X,Y) \in \Delta^\perp$, which implies $\Delta \subseteq (X,Y)^\perp$, and one can choose $(\hat{X},\hat{Y}) \in \Delta^-$ such that $\mathcal{F} = \mathbb{F}(\hat{X},\hat{Y}) \oplus (X,Y)^\perp$. With this data one gets
$$\begin{aligned}
l_{X,Y} &= q^{-\delta}\sum_{(W,Z)\in\mathcal{F}} \chi_{(X,Y)}(W,Z)\,\gamma_{W,Z}\\
&= q^{-\delta}\sum_{\alpha\in\mathbb{F}}\ \sum_{(X_1,Y_1)\in(X,Y)^\perp} \chi_{(X,Y)}\bigl(\alpha(\hat{X},\hat{Y})+(X_1,Y_1)\bigr)\,\gamma_{\alpha(\hat{X},\hat{Y})+(X_1,Y_1)}\\
&= q^{-\delta}\sum_{\alpha\in\mathbb{F}}\ \sum_{(X_1,Y_1)\in(X,Y)^\perp} \chi_{(X,Y)}(\alpha(\hat{X},\hat{Y}))\,\gamma_{\alpha(\hat{X},\hat{Y})+(X_1,Y_1)}\\
&= q^{-\delta}\sum_{(X_1,Y_1)\in(X,Y)^\perp} \gamma_{X_1,Y_1}, \qquad (4.3)
\end{aligned}$$
where the last equality holds because $(\hat{X},\hat{Y}) \in \Delta^-$ implies
$$\bigl[\gamma_{\alpha(\hat{X},\hat{Y})+(X_1,Y_1)} \neq 0 \;\Leftrightarrow\; \alpha = 0\bigr].$$
As the general condition in this case is $\Delta \subseteq (X,Y)^\perp$, one may continue from (4.3) with
$$l_{X,Y} = q^{-\delta}\sum_{(X_1,Y_1)\in\Delta} \gamma_{X_1,Y_1} = q^{-r}\,\mathrm{cwe}(C_{\mathcal{C}})$$
according to Proposition 3.14.

The last case to be considered results from the first two: $(X,Y) \in (\ker\Phi)^\perp$, but $(X,Y) \notin \Delta^\perp$, which implies $\ker\Phi \subseteq (X,Y)^\perp$ and $\Delta \nsubseteq (X,Y)^\perp$. Recalling the decomposition $\Delta = \ker\Phi \oplus \Delta^*$, this implies that one can choose $(0,0) \neq (\hat{X},\hat{Y}) \in \mathcal{F}$ such that $(\hat{X},\hat{Y}) \in \Delta^*$ and $\mathcal{F} = \mathbb{F}(\hat{X},\hat{Y}) \oplus (X,Y)^\perp$. This finally gives


$$\begin{aligned}
l_{X,Y} &= q^{-\delta}\sum_{(W,Z)\in\mathcal{F}} \chi_{(X,Y)}(W,Z)\,\gamma_{W,Z}\\
&= q^{-\delta}\sum_{\alpha\in\mathbb{F}}\ \sum_{(X_1,Y_1)\in(X,Y)^\perp} \chi_{(X,Y)}\bigl(\alpha(\hat{X},\hat{Y})+(X_1,Y_1)\bigr)\,\gamma_{\alpha(\hat{X},\hat{Y})+(X_1,Y_1)}\\
&= q^{-\delta}\sum_{\alpha\in\mathbb{F}} \chi_{(X,Y)}(\alpha(\hat{X},\hat{Y})) \sum_{(X_1,Y_1)\in(X,Y)^\perp} \gamma_{\alpha(\hat{X},\hat{Y})+(X_1,Y_1)}\\
&= q^{-\delta}\sum_{\alpha\in\mathbb{F}} \chi_{(X,Y)}(\alpha(\hat{X},\hat{Y})) \sum_{(X_1,Y_1)\in(X,Y)^\perp\cap\Delta} \mathrm{cwe}\bigl(\alpha\varphi(\hat{X},\hat{Y})+\varphi(X_1,Y_1)+C^{\mathrm{const}}\bigr)\\
&= q^{-r}\sum_{\alpha\in\mathbb{F}} \chi_{(X,Y)}(\alpha(\hat{X},\hat{Y})) \sum_{(X_1,Y_1)\in(X,Y)^\perp\cap\Delta^*} \mathrm{cwe}\bigl(\alpha\varphi(\hat{X},\hat{Y})+\varphi(X_1,Y_1)+C^{\mathrm{const}}\bigr).
\end{aligned}$$

It remains to show $l_{X+U,Y+V} = l_{X,Y}$ for all $(U,V) \in \Delta^\perp$. Since $\Delta^\perp \subseteq (\ker\Phi)^\perp$, the statement is obvious in the first two cases for $l_{X,Y}$. For the remaining case one uses that, for any $(U,V) \in \Delta^\perp$, one has $(X,Y)^\perp \cap \Delta = (X+U,Y+V)^\perp \cap \Delta$. First observe that this allows one to choose the same $(\hat{X},\hat{Y}) \in \Delta^*$ in order for $\mathbb{F}(\hat{X},\hat{Y}) \oplus (X+U,Y+V)^\perp = \mathcal{F}$ to hold. Hence one may keep the choice of $(\hat{X},\hat{Y}) \in \Delta^*$ for any $(U,V) \in \Delta^\perp$. Using the definition of the character $\chi$ and $\Delta^* \subseteq \Delta$, one finds that $\chi_{(X+U,Y+V)}(\alpha(\hat{X},\hat{Y})) = \chi_{(X,Y)}(\alpha(\hat{X},\hat{Y}))$ for all $(U,V) \in \Delta^\perp$ and $\alpha \in \mathbb{F}$. Together with the identity of sets, this finally shows that the last case is left invariant under addition of any $(U,V) \in \Delta^\perp$ as well. □

Formulating the result for $H\Lambda H$ instead of $L$ admits a sharper statement in the last case, using Remark 3.4 a).

Corollary 4.5 Put $\ell_{X,Y} = (H\Lambda H)_{X,Y}$ for all $(X,Y) \in \mathcal{F}$. Then
$$\ell_{X,Y} = \begin{cases}
0, & \text{if } (X,Y) \notin (\ker\Phi)^\perp,\\[2pt]
q^{-r}\,\mathrm{we}(C_{\mathcal{C}}), & \text{if } (X,Y) \in \Delta^\perp,\\[2pt]
\dfrac{1}{q^{\delta}(q-1)}\Bigl(q\displaystyle\sum_{(Z_1,Z_2)\in(X,Y)^\perp} \lambda_{Z_1,Z_2} - q^{\delta-r}\,\mathrm{we}(C_{\mathcal{C}})\Bigr), & \text{else.}
\end{cases}$$
Furthermore, $\ell_{X+U,Y+V} = \ell_{X,Y}$ for all $(U,V) \in \Delta^\perp$.

The proof of this corollary is omitted and may be found in [10]. In a next step I will apply the complete MacWilliams transform $\mathcal{H}$ entrywise to $\bar{\Gamma}$.

Proposition 4.6 For the cWAM Γ one has

H(γX,Y ) =

0, if (X, Y ) /∈ ∆,

qn2−k−rcwe(CC ), if (X, Y ) ∈ ker Φ,

qn2−k−r∑

α∈F χαaX,Y (ϕ(X, Y ))cwe(αaX,Y + EX,Y ), else.

44

Page 46: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

where EX,Y := 〈ϕ(X, Y ), C const〉⊥ and aX,Y ∈ Fn such that

FaX,Y ⊕ EX,Y = (C const)⊥ = CC .

Proof: First read Proposition 3.12 for the cWAM of C. The first case is thenimmediate, as H(0) = 0, the second case results from the classical block codingMacWilliams identity for the complete weight enumerator together with the dimen-sion of C const. Finally, the third case is due to application of the generalised blockcoding MacWilliams identity for the complete weight enumerator of affine spaces asit is found in Proposition 2.17. 2

At this point the resemblance of the matrices L and H(Γ) is already striking. Thematrix L has q2δ − qdim ker Φ⊥ = q2δ − qδ+r entries that are zero, which is as manyzero entries as the matrix H(Γ) has, as dim ∆ = δ + r. Moreover, the number of

entries that are a multiple of cwe(CC ) in each matrix coincides due to dim ker Φ =δ − r = dim ∆⊥. The more complicated looking entries do not only look alike intheir structure, but there are equally many of them in both matrices. After a fewmore preparations it will be shown that there is a strong correspondence betweenthe respective entries of both matrices using the controller canonical forms.Because the codes C and C are dual to each other the encoders satisfy GGt = 0. ByProposition 2.21 it is furthermore

G(z) = B(z−1I − A)−1C +D, G(z) = B(z−1I − A)−1C + D. (4.4)

Since D, D both have full row rank this implies

imD = ker Dt. (4.5)

I will now define a matrix that will play a crucial role in the proof of the MacWilliamsidentity for convolutional codes using the controller canonical forms of C and C

Lemma 4.7 Let

M0 :=

(CCt C(BtD)t

BtDCt 0

)∈ F2δ×2δ.

Then

(a) imM0 ⊆ (ker Φ)⊥.

(b)ker Φ⊕ ∆− ⊆ kerM0.

(c) imM0 ∩∆⊥ = 0.(d)M0 is injective on ∆∗.

(e) rankM0 = r + r and M⊕∆⊥ = (ker Φ)⊥ where

M :=imM0 = (X, Y )M0 | (X, Y ) ∈ ∆∗.

45

Page 47: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

Proof: First notice that by (4.5) M t0 =(CBtD

)(Ct (BtD)t) and therefore

β((X ′, Y ′), (X, Y )M0

)= (X ′, Y ′)M t

0(X, Y )t = [ϕ(X ′, Y ′), ϕ(X, Y )] (4.6)

for all (X, Y ), (X ′, Y ′) ∈ F . Remember also that ϕ(X, Y ) ∈ CC

for all (X, Y ) ∈ F .

(a) follows from (4.6) since for (X ′, Y ′) ∈ ker Φ one has ϕ(X ′, Y ′) ∈ C const = (CC)⊥.

(b) If (X, Y ) ∈ ker Φ ⊕ ∆−, then it is ϕ(X, Y ) ∈ C const by Corollary 3.11 andLemma 3.9(a). Thus ϕ(X, Y ) ∈ (CC )

⊥ while ϕ(X ′, Y ′) ∈ CC for all (X ′, Y ′) ∈ F .Now (4.6) along with the regularity of the bilinear form β shows (X, Y )M0 = (0, 0).

(c) Let (X, Y )M0 ∈ ∆⊥. Then by (4.6) I have ϕ(X, Y ) ∈ ϕ(∆)⊥. Since alsoϕ(X, Y ) ∈ C

C= (C const)

⊥, one obtains from (3.6) and Proposition 2.28 that ϕ(X, Y ) ∈C const. But then (X, Y ) ∈ ker Φ and (b) implies (X, Y )M0 = (0, 0).

(d) Let (X, Y )M0 = 0 for some (X, Y ) ∈ ∆∗. Similarly to (c) one obtains by useof (4.6) and (3.6)

ϕ(X, Y ) ∈ (imϕ)⊥ ∩ CC

= (imϕ)⊥ ∩ (C const)⊥ = (imϕ+ C const)

⊥ = C const.

But this means that (X, Y ) ∈ ker Φ and the assumption (X, Y ) ∈ ∆∗ finally yields(X, Y ) = (0, 0).

(e) The rank assertion follows from (d) and (b) since dim ∆∗ = r+ r and dim(ker Φ⊕∆−) = 2δ − (r + r). The rest is immediate from the above and dim(ker Φ)⊥ −dim ∆⊥ = r + r. 2

By virtue of the matrix M0 I can now concretise the relationship of the matricesL and H(Γ).

Theorem 4.8 With the matrix M0 from Lemma 4.7 one has

H(γX,Y ) = qn2−kl(X,Y )M0 for all (X, Y ) ∈ ∆.

Proof: Recall the decomposition ∆ = ker Φ ⊕ ∆∗. For (X ′, Y ′) ∈ ker Φ and

(X, Y ) ∈ ∆∗ one has γX′+X,Y ′+Y = γX,Y and hence H(γX′+X,Y ′+Y ) = H(γX,Y ) dueto Proposition 3.10(c). Furthermore, l(X′,Y ′)M0+(X,Y )M0 = l(X,Y )M0 by Lemma 4.7(b).

Hence it suffices to show the result for (X, Y ) ∈ ∆∗. For (X, Y ) = (0, 0) the resultis obviously true due to Proposition 4.6 and Theorem 4.4. So let (X, Y ) 6= (0, 0).By Lemma 4.7e) this yields (X, Y )M0 ∈ (ker Φ)⊥\∆⊥. Hence q

n2−kl(X,Y )M0 needs to

be computed according to the last case in Theorem 4.4. As (X, Y ) ∈ ∆∗, one has

46

Page 48: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

to consider the last case in Proposition 4.6 as well, i.e. one has to show that

qn2−k−r

∑α∈F

χαaX,Y (ϕ(X, Y ))cwe(αaX,Y + EX,Y )

= qn2−k−r

∑α∈F

χ((X,Y )M0)(α(X, Y ))∑(X1,Y1)∈((X,Y )M0)⊥∩∆∗

cwe(αϕ(X, Y ) + ϕ(X1, Y1) + C const),

where the objects aX,Y and EX,Y are due to Proposition 4.6 dependent on the statepairing (X, Y ) and the state pairing (X, Y ) is dependent on (X, Y )M0. The precisedependencies will be recalled in due time.The only obvious fact in the equation to prove is that the factor q

n2−k−r is the same

on both sides, so one need not take it into account any longer. As a first step I showthat

EX,Y =⋃

(X1,Y1)∈((X,Y )M0)⊥∩∆∗

(ϕ(X1, Y1) + C const) =: C(X, Y ).

Therefore recall from Proposition 4.6 that EX,Y = 〈ϕ(X, Y ), C const〉⊥ and note that

C(X, Y ) ⊆ CC = C const. I will show show that C(X, Y ) is orthogonal to ϕ(X, Y ) aswell, which will prove C(X, Y ) ⊆ EX,Y . Let a ∈ C(X, Y ) then there is a (X ′, Y ′) ∈((X, Y )M0)⊥ ∩∆∗ and c ∈ C const = C

C⊥ such that a = ϕ(X ′, Y ′) + c and

[ϕ(X, Y ), a] = [ϕ(X, Y ), ϕ(X ′, Y ′) + c]

= [ϕ(X, Y ), ϕ(X ′, Y ′)] = [(X, Y )

(C

BtD

), (X ′, Y ′)

(C

BtD

)]

= (X, Y )(CBtD

)(Ct (BtD)t

)(X ′, Y ′)t

due to the definition of the form [, ]. I may continue

[ϕ(X, Y ), a] = (X, Y )M0(X ′, Y ′)t = 0,

because (X ′, Y ′) ∈ ((X, Y )M0)⊥ ∩ ∆∗. This proves C(X, Y ) ⊆ EX,Y . For the

equality recall that (0, 0) 6= (X, Y ) ∈ ∆∗ and therefore ϕ(X, Y ) /∈ C const. Hence

dimEX,Y = dim(〈ϕ(X, Y ), C const〉⊥

)= k + r − 1. The dimension of C(X, Y ) can

be calculated with the help of Corollary 3.11, which yields that ϕ is injective on ∆∗

and CC =⋃

(X,Y )∈∆∗

(ϕ(X, Y ) + C const). As dim ∆∗ − dim

(((X, Y )M0)⊥ ∩∆∗

)= 1

one therefore finds

dim

⋃(X′,Y ′)∈((X,Y )M0)⊥∩∆∗

(ϕ(X, Y ) + C const)

= dim(C(X, Y )) = dim(CC )− 1.

Now dim(CC ) = k + r implies dim(C(X, Y )) = k + r − 1. As C(X, Y ) ⊆ EX,Y andthe dimensions coincide, the spaces must be equal.

47

Page 49: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

The next step is to show that ϕ(X, Y ) is an admissible choice for aX,Y , i.e. to show

that Fϕ(X, Y )⊕EX,Y = (C const)⊥ = CC . By definition of ϕ one has ϕ(X, Y ) ∈ CC =

(C const)⊥. Now EX,Y = C(X, Y ) has already been shown to be a hyperplane of CC .

By definition (X, Y ) /∈ ((X, Y )M0)⊥∩∆∗ but (X, Y ) ∈ ∆∗ and again the injectivityof ϕ on ∆∗ yields ϕ(X, Y ) /∈ ϕ(((X, Y )M0)⊥ ∩ ∆∗), so ϕ(X, Y ) /∈ C(X, Y ), whichfinally proves that putting aX,Y := ϕ(X, Y ) is an admissible choice.Recapitulating the two small results so far allows me to rewrite the relevant lastcase in Theorem 4.4 to

qrl(X,Y )M0 =∑α∈F

χ(X,Y )M0(α(X, Y ))∑

(X1,Y1)∈((X,Y )M0)⊥∩∆∗

cwe(αϕ(X, Y ) + ϕ(X1, Y1) + C const) =

∑α∈F

χ(X,Y )M0(α(X, Y ))cwe(αaX,Y + EX,Y ).

Recalling the definition of a character χ finally yields by virtue of (4.6)

χ(X,Y )M0(α(X, Y )) = ζτ(β((X,Y )M0,α(X,Y ))) = ζτ([ϕ(X,Y ),αϕ(X,Y )]) = χαaX,Y (ϕ(X, Y )),

which gives

qn2−kl(X,Y )M0 = q

n2−k−r

∑α∈F

χαaX,Y (ϕ(X, Y ))cwe(αaX,Y + EX,Y ).

Comparing this expression with the expression in the last case of Proposition 4.6shows that they are equal, which concludes the proof. 2

Of course, it is desirable to have the transformation in closed form on one side ofthe equation. This is achieved by applying the complete MacWilliams transform Hon both sides. Recall that due to Lemma 2.13 H2 = m−1 and (2.1) H2cwe(S) =cwe(−S) for any set S ⊆ Fn. This gives

Corollary 4.9

γ−X,−Y = qn2−kH(l(X,Y )M0) for all (X, Y ) ∈ ∆.

Proof: By applying the complete MacWilliams transform H on both sides ofthe equation in Theorem 4.8 one gets by virtue of Lemma 2.13a)

H2(γX,Y ) = qn2−kH(l(X,Y )M0) for all (X, Y ) ∈ ∆

⇔ m−1(γX,Y ) = qn2−kH(l(X,Y )M0) for all (X, Y ) ∈ ∆

⇔ m−1(cwe(ϕ(X, Y ) + C const)) = qn2−kH(l(X,Y )M0) for all (X, Y ) ∈ ∆

⇔ cwe(−(ϕ(X, Y ) + C const)) = qn2−kH(l(X,Y )M0) for all (X, Y ) ∈ ∆

⇔ cwe((ϕ(−X,−Y ) + C const)) = qn2−kH(l(X,Y )M0) for all (X, Y ) ∈ ∆

⇔ γ−X,−Y = qn2−kH(l(X,Y )M0) for all (X, Y ) ∈ ∆,

48

Page 50: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

where I used that ϕ is linear and C const is a vector space that is closed under multi-plication. 2

I will now transfer Corollary 4.9 to the entries of the WAM using the C-algebrahomomorphism ι as defined in Lemma 2.4.

Corollary 4.10

λ−X,−Y = q−kh(HΛH)(X,Y )M0 for all (X, Y ) ∈ ∆.

Proof: It is clear from the definition of ι that ι(γ−X,−Y ) = λ−X,−Y . The righthand side requires more attention. Applying ι and recalling Lemma 2.13iii) it reads

ι(qn2−kH(HΓH)(X,Y )M0) = q

n2−kι(H(HΓH)(X,Y )M0)

= q−kh(ι(HΓH)(X,Y )M0). (4.7)

The map ι is a C-algebra-homomorphism, hence ι(HΓH) = Hι(Γ)H = HΛH ac-cording to Remark 3.2, as the entries in HΓH are only C-linear combinations of theentries of Γ. 2

Note in particular that due to Remark 3.4(a) the result holds also in the version

λX,Y = q−kH(HΛH)(X,Y )M0 for all (X, Y ) ∈ ∆,

where the − on the left hand side disappeared. In this version it is identical to The-orem 5.4 in [10]. However, I will need the result in the form stated in the corollary.

For the sequel let G be any direct complement of (ker Φ)⊥ in F . Due to Lemma 4.7(e)I arrive at the following decompositions of F .

∆︷ ︸︸ ︷Ff

= ∆∗

f0

⊕ ker Φ

f1

⊕ ∆−

f2

F = M ⊕ ∆⊥ ⊕ G︸ ︷︷ ︸(ker Φ)⊥

(4.8)

where, due to identical dimensions, there exist isomorphisms in each column. For f0

I choose the isomorphism induced by the matrix −M0 from Lemma 4.7, and thusM = imM0 as before. This picture leads to the following result.

Theorem 4.11 Consider the Diagram (4.8) and let the isomorphism f0 be inducedby the matrix M0 from Lemma 4.7. Fix any isomorphisms f1 and f2 in the diagram.Let f := f0 ⊕ f1 ⊕ f2 be the associated automorphism on F . Then

γ−X,−Y = qn2−kH

((HΓH)f(X,Y )

)for all (X, Y ) ∈ F . (4.9)

49

Page 51: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

As a consequence,

γf−1(Y,−X) = qn2−kH

((HΓtH−1)X,Y

)for all (X, Y ) ∈ F .

In particular, the entries of the matrices Γ and qn2−kH

(HΓtH−1

)coincide up to

reordering.

Proof: Recall from (4.2) that (HΓH)f(X,Y ) = lf(X,Y ). One has to consider threecases.1) If (X, Y ) 6∈ ∆, then f(X, Y ) 6∈ (ker Φ)⊥ and γX,Y = 0 = q

n2−kH(lf(X,Y )) due to

the very definition of ∆ and Proposition 2.14.2) If (X, Y ) ∈ ker Φ then ϕ(X, Y ) ∈ C const and f(X, Y ) ∈ ∆⊥. Now Lemma 3.8 as

well as Proposition 2.14 yield γX,Y = cwe(C const) = q−kH(lf(X,Y )).

3) For the remaining case one has (X, Y ) ∈ ∆\ ker Φ. Writing (X, Y ) = (X1, Y1) +

(X2, Y2) where (X1, Y1) ∈ ∆∗ and (X2, Y2) ∈ ker Φ, Proposition 3.10(c) yields γX,Y =γX1,Y1 while Theorem 4.4 implies lf(X,Y ) = l(X1,Y1)M0 . Now the result follows fromCorollary 4.9.For the second statement put L := HΓtH−1. Notice first that Lemma 4.3 and thedefinition of P(−I) as given in Definition 3.3 yield HΓH = P(−I)Lt. This implies(HΓH)X,Y = LY,−X . Now I obtain from (4.9) γf−1(−X,−Y ) = q

n2−kH(LY,−X) and thus

γf−1(Y,−X) = qn2−kH(LX,Y ). This concludes the proof. 2

An analogous theorem may be proven for the WAM as well using either the chainof arguments employed in the proof of Theorem 4.11 together with Corollary 4.10 orby applying ι on Theorem 4.11. Therefore I will only state the corresponding resultfor the WAM:

Corollary 4.12 Consider the Diagram (4.8) and let the isomorphism f0 be inducedby the matrix M0 from Lemma 4.7. Fix any isomorphisms f1 and f2 in the diagram.Let f := f0 ⊕ f1 ⊕ f2 be the associated automorphism on F . Then

λ−X,−Y = q−kh((HΛH)f(X,Y )

)for all (X, Y ) ∈ F .

As a consequence,

λf−1(Y,−X) = q−kh((HΛtH−1)X,Y

)for all (X, Y ) ∈ F .

In particular, the entries of the matrices Λ and q−kh(HΛtH−1

)coincide up to re-

ordering.

It needs to be stressed that the neither Theorem 4.11 nor Corollary 4.12 provethe MacWilliams identity for the cWAM or the WAM, respectively, since it has notbeen shown that f−1(Y,−X) = (XQ, Y Q) for some suitable Q ∈ GLδ(F) and all(X, Y ) ∈ F . The challenge in proving the MacWilliams identities consists preciselyin finding isomorphisms f0, f1, f2 for Diagram (4.8) such that f−1(Y,−X) has sucha form. This will be accomplished in the next section.

50

Page 52: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

4.2 The Isomorphisms f1 and f2

As Diagram (4.8) implies, it is crucial to find isomorphisms f1 : ker Φ → ∆⊥ and

f2 : ∆− → G. This task is, however, asymmetric in the sense that domain and imageof f1 are fixed spaces, whereas the situation in f2 is very different. Both ∆− and Gare still subject to choice, as these spaces are only required to be direct complementsof ∆ and ker Φ⊥ respectively in F . On the other hand, the knowledge about ker Φ(or ker Φ) is until now very limited as no concrete description of these spaces hasbeen given so far. So the first aim of this section is to gather more information onker Φ.

Proposition 4.13

(a) Let Π1 : F −→ Fδ be the projection onto the first component, thus Π1(X, Y ) =X for all (X, Y ) ∈ F . Then Π1|ker Φ is injective.

(b) rankCDtB = r.

(c) kerCDtB = Π1(ker Φ) = X ∈ Fδ | ∃ u ∈ Fk : (X,XA+ uB) ∈ ker Φ.

Proof: (a) Suppose (X,XA+ uB), (X,XA+ u′B) ∈ ker Φ. Then (0, uB) ∈ ker Φfor u = u − u′. Hence, uBBtD ∈ C const. But then Remark 2.24 along with thefull row rank of D yields uBBtB = 0, thus uB = 0 due to Remark 2.23. As aconsequence, (X,XA+ uB) = (X,XA+ u′B).(b) Again I will employ Remark 2.23 and Remark 2.24 (2). Let X ∈ Fδ such that

XCDtB = 0. Using Remark 2.24 it is, on the one hand, XC ∈ CC = (C const)⊥,

where the last identity is due to Proposition 2.28. On the other hand, XC ∈ker(DtB) = (im BtD)⊥. Making use of Remark 2.24 and its dual version this yields

XC ∈ (C const)⊥ ∩ (im BtD)⊥ = (C const ⊕ im BtD)⊥ = (im D)⊥. But the latter space

is identical to imD, as one can see directly from (4.4) and the full row rank of the

matrices D and D. Thus I conclude that XC ∈ imD = C const⊕ imBtD. Using thatBtBBt = Bt, I obtain the existence of some u = uBt ∈ Fk such that XC+uBBtD ∈C const. Along with the identity ABt = 0 this implies that (X,XA + uB) ∈ ker Φ.

All this shows that kerCDtB ⊆ Π1(ker Φ) and, using (a), one arrives at

dim(kerCDtB) ≤ dim(Π1(ker Φ)) = dim(ker Φ) = δ − r.

Since CDtB ∈ Fδ×δ, this implies r ≤ rankCDtB ≤ rank DtB. Recalling fromProposition 2.28 that dim(C const) = n − k − r, the dual version of Remark 2.24

along with rank D = n − k then tells me that rank DtB = r. This finally provesrankCDtB = r.(c) The inclusion “⊆” has been shown in the proof of (b). Thus equality of the

two spaces follows from Lemma 3.9b) since dim kerCDtB = δ − r = dim ker Φ =dim Π1(ker Φ). 2

Part (a) and (c) of the previous proposition give rise to a crucial map.

51

Page 53: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

Corollary 4.14 Let K := kerCDtB. Then

σ : K −→ Fδ, X 7−→ Y such that (X, Y ) ∈ ker Φ

is a well-defined, linear, and injective map. Furthermore, K does not contain anonzero σ-invariant subset.

Proof: That the map is well-defined follows from Proposition 4.13(a) and (c),whereas the linearity is obvious. As for injectivity, let X ∈ K such that σ(X) = 0.Then (X, 0) ∈ ker Φ, meaning that XC ∈ C const. On the other hand, (X, 0) ∈ker Φ ⊆ ∆ imply that 0 = XA + uB for some u ∈ Fk. Hence XA = −uB ∈imA∩ imB and Remark 2.23 give XA = 0. But then XC ∈ (kerA)C∩C const = 0,where the last identity is due to Remark 2.23. As a consequence, X ∈ kerA∩kerC,and due to the same remark I arrive at X = 0. This proves the injectivity of σ.For the last statement assume that K ′ is a σ-invariant subset of K. That simplymeans that there exists some vector X ∈ K such that σi(X) ∈ K for all i ≥ 0. SinceK ⊆ Fδ is a finite set, this yields that the orbit σi(X) | i ∈ N0 is finite and hencecontains a cycle. In other words, there exists some X ′ ∈ K and some j > 0 suchthat σj(X ′) = X ′. Without loss of generality I may assume X ′ = X. By definitionof the map σ one has

(σi(X), σi+1(X)

)∈ ker Φ for all i ≥ 0. Using Lemma 3.9c),

all this tells that one has a cycle

X−−−→(u00 )

σ(X)−−−→(u10 )

σ2(X)−−−→(u20 )

· · · −−−→(uj−1

0 )

σj(X) = X

of weight zero in the state transition diagram associated with (A,B,C,D). Here

the notation X−−−→(uv )

Y stands for the equations Y = XA + uB, v = XC + uD.It is well-known [19, p. 308] that the basicness of the encoder G implies that sucha cycle is a concatenation of the trivial cycle, that is, X = 0 and ui = 0 for alli = 0, . . . , j − 1. Thus K ′ = 0 and the proof is complete. 2

Let me now introduce the matrices

S0 := BtD and Si := BtBAi−1C for i ≥ 1. (4.10)

Likewise, I define the matrices S0 := BtD and Si := BtBAi−1C, i ≥ 1, associatedwith the dual code. Furthermore, I put

N :=∑m≥2

m−1∑i=1

i−1∑j=0

(At)i−1SjStm−jA

m−(i+1),

N :=∑m≥2

m−1∑i=1

i−1∑j=0

(At)i−1SjStm−jA

m−(i+1).

(4.11)

Using that Ai = 0 = Ai, for i ≥ δ it is easy to see that these sums are indeed finite,since each summand vanishes for m ≥ 2δ. One easily shows that, after two indexchanges,

N t =∑m≥2

m−1∑i=1

m∑j=i+1

(At)i−1SjStm−jA

m−(i+1). (4.12)

52

Page 54: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

The matrices N and N will indeed be the key ingredients for the isomorphisms f1

and f2. Before making this more precise, let me establish some important propertiesof these matrices.

Proposition 4.15

(a)N + N t = −CCt.

(b) CSt0 + S0Ct = NA+ AtN t.

(c)NAAt = N .

Proof: Let me begin the proof with some general remarks on the matrices Si thatultimately define N and N . Since G = D +

∑i≥1BA

i−1Czi it is BtG =∑

i≥0 Sizi

with the matrices Si as given in (4.10). Then BtGGtB = 0 implies

m∑i=0

SiStm−i = 0 for all m ≥ 0. (4.13)

From the controller canonical form one easily derives the identity∑i≥1

(BAi−1)t(BAi−1) = Iδ,

which in turn yields

C =∑i≥1

(At)i−1Si (4.14)

and, consequently,

CCt =∑m≥2

m−1∑i=1

(At)i−1SiStm−iA

m−i−1. (4.15)

Now everything is prepared to begin with the actual proof of the proposition.

(a) From (4.13) one obtains

SiStm−i = −

i−1∑j=0

SjStm−j −

m∑j=i+1

SjStm−j

for i = 1, . . . ,m−1. Using (4.15) and N and N t from (4.11) and (4.12), one thereforecomputes

−CCt = −∑m≥2

m−1∑i=1

(At)i−1SiStm−iA

m−(i+1)

=∑m≥2

m−1∑i=1

(At)i−1( i−1∑j=0

SjStm−j +

m∑j=i+1

SjStm−j

)Am−(i+1) = N + N t,

53

Page 55: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

which is what I wanted.(b) Using again (4.12), one obtains

NA+ AtN t =∑m≥2

(m−1∑i=1

i−1∑j=0

(At)i−1SjStm−jA

m−i +m−1∑i=1

m∑j=i+1

(At)iSjStm−jA

m−(i+1)

)

=∑m≥2

(m−2∑i=0

i∑j=0

(At)iSjStm−jA

m−(i+1) +m−1∑i=1

m∑j=i+1

(At)iSjStm−jA

m−(i+1)

)

=∑m≥2

(m−2∑i=1

(At)i( m∑j=0

SjStm−j

)Am−(i+1) + S0S

tmA

m−1 + (At)m−1SmSt0

).

Due to (4.13) the inner sum over j vanishes, and substituting 0 = S0St1 + S1S

t0,

which is (4.13) for m = 1, I proceed with

NA+ AtN t =∑m≥2

(S0S

tmA

m−1 + (At)m−1SmSt0

)+ S0S

t1 + S1S

t0

=∑m≥1

(S0S

tmA

m−1 + (At)m−1SmSt0

)= S0C

t + CSt0,

where the last identity is a consequence of (4.14). This proves part (b).(c) As before let e1, . . . , eδ be the standard basis vectors of Fδ. Throughout the restof this proof denote, for any matrix M , the i-th column (resp. i-th row) of M by M(i)

(resp.M (i)). Let me assume that the the first r Forney indices are nonzero. Thus, thematrices A and B are as in Definition 2.20. Then I have kerA = 〈ejl | l = 1, . . . , r〉,where jl =

∑la=1 δa. Moreover, AAt is the diagonal matrix with (AAt)jl,jl = 0 for

l = 1, . . . , r and (AAt)i,i = 1 else. Therefore, it suffices to show that the µ-th column

of N is zero for all µ ∈ j1, . . . , jr. Thus, let µ =∑l

a=1 δa for some l = 1, . . . , r. Inorder to prove the desired result I will even show that

(Stm−jAm−i−1)(µ) = 0 for all m ≥ 2 and 1 ≤ i ≤ m− 1 as well as 0 ≤ j ≤ i− 1.

(4.16)This, of course, implies N(µ) = 0 due to (4.11). In order to prove (4.16) notice that

Stm−jAm−i−1 = Ct(At)m−j−1BtBAm−i−1.

Put ν :=∑l−1

a=1 δa + 1. The definition of A and B shows that

(BAm−i−1)(µ) 6= 0⇐⇒ (Am−i−1)ν,µ = 1⇐⇒ δl − 1 = m− i− 1⇐⇒ i = m− δl.

Hence for i 6= m− δl one has (BAm−i−1)(µ) = 0, and it remains to consider the casei = m− δl. Then 0 ≤ j ≤ m− δl − 1 implies δl < m− j. Using that (BtB)(ν) = eν ,the ν-th row of Sm−j is

(Sm−j)(ν) = (BtBAm−j−1C)(ν) = (Am−j−1C)(ν) = (Am−j−1)(ν)C = 0,

54

Page 56: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

where the last identity follows from the simple fact that the l-th diagonal blockof Am−j−1 is zero, as m − j − 1 ≥ δl. Transposing the obtained identity yields(Stm−j)(ν) = 0. Since (Am−i−1)(µ) = (Aδl−1)(µ) = etν , one obtains

(Stm−jAm−i−1)(µ) = Stm−j(A

m−i−1)(µ) = (Stm−j)(ν) = 0.

This completes the proof of (4.16) and thus concludes the proof. 2

Now I am in a position to work on the remaining freedom in Diagram (4.8). Definethe matrices

M1 =

(N −NA0 0

), M2 =

(N t 0

−AtN t 0

). (4.17)

Recalling that S0 = BtD, Proposition 4.15 along with the matrix M0 defined inLemma 4.7 yields

M := M0 +M1 +M2 =

(0 P−P 0

), where P := CSt0 −NA. (4.18)

In the rest of this section I will show that, firstly, the map f induced by M0 +M1 +M2 is an automorphism, that is, the matrix P ∈ Fδ×δ is regular, and, secondly,that f respects the decomposition of F on the right hand side of Diagram (4.8).As a consequence, f defines an automorphism as in Theorem 4.11 that, at the sametime, is of the form as required in the closing remark of section 4.1, which arenecessary conditions for the proof of the MacWilliams identity.

In order to carry out these computations notice that M2 = M t1, i. e., M2 is the

dual version of M t1. From Remark 2.23 and Proposition 4.15(c) one obtains(

N −NA0 0

)(I 0At Bt

)= 0.

Consequently,imM1 ⊆ ∆⊥ and ∆ ⊆ kerM2, (4.19)

where the second containment follows from the first one via duality. The followingresult establishes the regularity of P .

Theorem 4.16 The matrix P = CSt0 −NA is in GLδ(F).

Proof: I need to resort to the dual version of Corollary 4.14. Thus, considerK = ker CDtB with the corresponding map σ. Firstly, one observes that kerP ⊆ K.Indeed, for X ∈ kerP one has XNA = XCSt0 = XCDtB ∈ imA ∩ imB, and fromRemark 2.23 concludes

XNA = XCDtB = 0. (4.20)

55

Page 57: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

Hence kerP ⊆ K. In order to show the regularity of P , let X ∈ kerP . Then X ∈ Kand thus (X, σ(X)) ∈ ker Φ. Recalling that ker Φ ⊆ ∆ I obtain from (4.19) andLemma 4.7

(X, σ(X)) ∈ kerM2 ∩ kerM0. (4.21)

Moreover, (X, σ(X))M1 = (XN,−XNA). But XNA = 0 by (4.20) and thusProposition 4.15(c) yields XN = XNAAt = 0. Hence (X, σ(X)) ∈ kerM1, whichalong with (4.21) implies (X, σ(X)) ∈ kerM . Consequently, σ(X) ∈ kerP . All

this shows that kerP is a σ-invariant subspace of K, and by the dual version ofCorollary 4.14 I may conclude that kerP = 0. This yields the desired result. 2

This theorem shows that the map f induced by M = M0 + M1 + M2 is anautomorphism on F of the form as required. The next step is to show that itrespects the direct decomposition as in diagram (4.8). This is accomplished andsummarised in the next result.

Proposition 4.17 Put ∆∗ := kerM1 ∩ ∆ and G := imM2. Then

(a) kerM0 = ker Φ⊕ ∆−.

(b)kerM1 = ∆∗ ⊕ ∆−.

(c) kerM2 = ∆∗ ⊕ ker Φ = ∆.

(d) imM1 = ∆⊥ and F = ker Φ⊥ ⊕ G.

Proof: (a) has already been given in Lemma 4.7.

(b) It is clear from the definition of ∆∗ and the dual version of Proposition 3.10(a)

that the sum ∆∗ ⊕ ∆− is indeed direct and contained in kerM1. In order to showequality let me first compute the rank of M1. To this end I show that

ker Φ ∩ kerM1 = 0. (4.22)

Due to (4.19) and part (a) I have that ker Φ ⊆ kerM0 ∩ kerM2. Then (X, Y )M1 =

(X, Y )M for (X, Y ) ∈ ker Φ. Now the regularity of the matrix M , see Theorem 4.16,implies (4.22). Using the dual version of Proposition 3.9(b) as well as (4.19), I

conclude δ − r = dim ker Φ ≤ rankM1 ≤ dim ∆⊥. Since dim ∆⊥ = δ − r due toProposition 3.6 and Lemma 2.9, this proves

rankM1 = δ − r (4.23)

andimM1 = ∆⊥. (4.24)

Next I show that∆ = ∆∗ ⊕ ker Φ. (4.25)

The directness of the sum on the right hand side as well as the inclusion “⊇” areobvious, see also (4.22). Furthermore, notice that kerM1 + ∆ = F as ∆− ⊆ kerM1

and ∆− ⊕ ∆ = F . Since kerM1 ∩ ∆ = ∆∗, I obtain with the aid of Proposition 3.6

56

Page 58: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

that dim ∆∗ = dim(kerM1) + dim ∆− dimF = r+ r = dim ∆− dim ker Φ. All thisproves (4.25). Along with the dual version of Proposition 3.10(a) I arrive at

F = ∆∗ ⊕ ker Φ⊕ ∆−, (4.26)

which is exactly the decomposition of F as in the upper row of Diagram (4.8). Now

I compute dim kerM1 = δ + r = 2δ − dim ker Φ = dim(∆∗ ⊕ ∆−), which along with

∆∗ ⊕ ∆− ⊆ kerM1 completes the proof of (b).

(c) Due to (4.25) it only remains to show that ∆ = kerM2. The inclusion “⊆”

has been obtained in (4.19). In order to establish identity recall that M2 = M t1

and therefore dualising (4.23) yields rankM2 = δ − r. But then dim kerM2 =

δ + r = dim ∆. Hence kerM2 = ∆, which concludes the proof of (c). (d) The firstpart has already been proven in (4.24) above. Furthermore, from (c) I know thatdimG = dim imM2 = δ− r. Moreover, dim ker Φ⊥ = δ+ r. Hence the proof of (d) iscomplete if I can show that G∩ker Φ⊥ = 0. To this end assume (X, Y )M2 ∈ ker Φ⊥

for some (X, Y ) ∈ F . By (c) and (4.26) I may assume (X, Y ) ∈ ∆− and therefore(X, Y )M2 = (X, Y )M due to (a) and (b). Furthermore, by Lemma 4.7 I haveker Φ⊥ = imM0 ⊕ ∆⊥ = imM0 ⊕ imM1. As a consequence, the above yields(X, Y )M2 = (X0, Y0)M0 + (X1, Y1)M1 for some (Xi, Yi) ∈ F , i = 1, 2. Using (4.26)

and (a) and (b) I may assume (X0, Y0) ∈ ∆∗ and (X1, Y1) ∈ ker Φ. Using oncemore (a) – (c) I conclude (X, Y )M = (X, Y )M2 = (X0, Y0)M0 + (X1, Y1)M1 =(X0, Y0)M + (X1, Y1)M and regularity of the matrix M implies (X, Y ) = (X0, Y0) +

(X1, Y1) ∈ ∆− ∩ (∆∗ ⊕ ker Φ) = ∆− ∩ ∆. Thanks to Proposition 3.10(a) thisintersection is trivial and I may finally conclude that G ∩ ker Φ⊥ = 0. Thiscompletes the proof. 2

It remains to summarise the last results formally.

Theorem 4.18 The matrix M = M0 + M1 + M2 induces an automorphism on F that respects its decomposition as it is given in Diagram (4.8). Moreover,

M = [ 0   P ]
    [ −P  0 ],

where P ∈ GLδ(F) is as in Theorem 4.16.

4.3 MacWilliams Identities for Convolutional Codes

Using Proposition 3.5, Remark 3.4 b), Theorem 4.11 and Corollary 4.12 I can now state MacWilliams identities for the (generalised) cWAM and WAM of a convolutional code and its dual. After that I will show that MacWilliams identities for these objects may easily be deduced for a convolutional code and its sequence space dual.

Theorem 4.19 With the data used above for C and its dual put Q := −P = −CD^tB + NA, where N is as in (4.11). Then P ∈ GLδ(F) and the adjacency matrices satisfy


the MacWilliams Identity

Γ_{X,Y} = q^{n/2−k} H((HΓ^tH^{−1})_{XQ,YQ}) for all (X, Y) ∈ F.

Translating the linear transformation on the indices to permutation matrices, the identity appears as

Γ = q^{n/2−k} H(P(Q)HΓ^tH^{−1}P(Q)^{−1})

and therefore the generalised adjacency matrices Γ and Γ of the codes C and C satisfy the identity Γ = q^{n/2−k} H(HΓ^tH^{−1}).

Proof: Due to Theorem 4.11 it is known that

γ_{f^{−1}(Y,−X)} = q^{n/2−k} H((HΓ^tH^{−1})_{X,Y}) for all (X, Y) ∈ F, (4.27)

where f is any automorphism that respects the decomposition of F according to Diagram (4.8). Theorem 4.18 establishes that the matrix M induces such an automorphism. Hence let f be the automorphism induced by M. Using that

M = [ 0   P ]
    [ −P  0 ],

it is immediate that f^{−1} is induced by the matrix

M^{−1} = [ 0       −P^{−1} ]
         [ P^{−1}   0      ].

Using this, (4.27) is equivalent to

γ_{(−XP^{−1},−YP^{−1})} = q^{n/2−k} H((HΓ^tH^{−1})_{X,Y}) for all (X, Y) ∈ F,

which may be reformulated to

γ_{(X,Y)} = q^{n/2−k} H((HΓ^tH^{−1})_{−XP,−YP}) for all (X, Y) ∈ F,

which proves the first statement. For the second and third statements one has to use the definition of the matrix P(Q) as it is given in Definition 3.3 and the definition of the generalised cWAM in Proposition 3.5, respectively. □
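The passage from the index transformation (X, Y) ↦ (XQ, YQ) to a permutation matrix can be illustrated concretely. The sketch below is my own illustration, not from the text: the helper names `states`, `perm_matrix` and the lexicographic ordering of the state space are assumptions, since Definition 3.3 is not reproduced here. For an invertible Q over a small field F_q it builds the 0–1 matrix of the bijection X ↦ XQ on F_q^δ and checks the two properties used in the proofs above: P(Q)^t = P(Q)^{−1} and P(Q)P(Q^{−1}) = I.

```python
from itertools import product

q, delta = 3, 2
states = list(product(range(q), repeat=delta))      # fixed ordering of F_q^delta
index = {s: i for i, s in enumerate(states)}

def apply(Q, x):
    # row vector times matrix over F_q
    return tuple(sum(x[i] * Q[i][j] for i in range(delta)) % q
                 for j in range(delta))

def perm_matrix(Q):
    # 0-1 matrix of the bijection X -> XQ on the state space
    P = [[0] * len(states) for _ in states]
    for s in states:
        P[index[s]][index[apply(Q, s)]] = 1
    return P

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

Q = [[1, 1], [0, 1]]       # invertible over F_3
Qinv = [[1, 2], [0, 1]]    # its inverse: Q * Qinv = I (mod 3)

P, Pinv = perm_matrix(Q), perm_matrix(Qinv)
I = [[int(i == j) for j in range(len(states))] for i in range(len(states))]

assert matmul(P, Pinv) == I                 # P(Q)^{-1} = P(Q^{-1})
assert [list(r) for r in zip(*P)] == Pinv   # P(Q)^t = P(Q)^{-1}
```

This also makes plain why a permutation matrix is obtained at all: an invertible Q permutes the finitely many states, and conjugating by P(Q) realises the relabelling of rows and columns of the adjacency matrix.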

Using the arguments from the proof of Theorem 4.19, with Corollary 4.12 instead of Theorem 4.11, one derives the same results for the WAM.

Theorem 4.20 With the data used above for C and its dual put Q := −P = −CD^tB + NA, where N is as in (4.11). Then P ∈ GLδ(F) and the adjacency matrices satisfy the MacWilliams Identity

Λ_{X,Y} = q^{−k} h((HΛ^tH^{−1})_{XQ,YQ}) for all (X, Y) ∈ F.

Translating the linear transformation on the indices to permutation matrices, the identity appears as

Λ = q^{−k} h(P(Q)HΛ^tH^{−1}P(Q)^{−1})

and therefore the generalised adjacency matrices Λ and Λ of the codes C and C satisfy the identity Λ = q^{−k} h(HΛ^tH^{−1}).


Note that, due to Remark 3.4(a), one may use a matrix αQ for α ∈ F∗ instead of Q in Theorem 4.20 to accomplish the permutation on the states. This is, of course, in general false for Theorem 4.19.

It is noteworthy that in both theorems only the identity on the generalised (complete) WAM is a MacWilliams identity in a strict sense. The essence of the block code MacWilliams identity is indeed that it is possible to obtain the weight enumerator of the dual code via a transformation of the weight enumerator of the original code, without the necessity to give any information on the dual code. In fact, this is what the MacWilliams formula for the generalised (complete) WAM achieves, as no information on the dual code is necessary to obtain its generalised (complete) WAM. This is obviously different if one considers the formulas for the representatives. In this case the controller canonical forms of the code and the dual code are needed to calculate the automorphism Q that induces the correct permutation for the equality on the representatives to hold. So the identity on the representatives, although a useful result, is best understood as an important step on the way to proving the MacWilliams identity for the generalised (complete) WAM. Moreover, it is useful when calculating examples.

For practical purposes, however, it is probably not interesting. Recall that the classical block code MacWilliams identity is in the first place a tool to compute the weight enumerator of a high dimensional block code from the weight enumerator of its low dimensional dual code, thereby saving computational effort. Hence the value of the MacWilliams identity for convolutional codes in this respect depends very much on how much computation is involved in the MacWilliams transformation formula. Whereas for both MacWilliams formulas, i.e. the one for the representatives and the one for the generalised (complete) WAMs, one has to compute the controller canonical form of an encoder G of the original code C and the according weight adjacency matrix Γ(G) or Λ(G), the formula on the representatives additionally requires computing the controller canonical form of the dual code. From this the automorphism Q, the permutation P(Q) and its inverse have to be calculated and then applied to HΓ^tH^{−1} or HΛ^tH^{−1}, respectively. These steps already comprise a good deal of computation, as the formula for the matrix N, an ingredient of Q, shows. They may be omitted, however, if one considers the generalised (complete) WAM. In this situation it is only necessary to compute q^{n/2−k} H(HΓ^tH^{−1}) or q^{−k} h(HΛ^tH^{−1}), respectively, and the outcome is one representative of Γ or Λ, although it is uncertain in general whether there is an encoder G and a controller canonical form (A, B, C, D) such that the according (complete) WAM is this representative. Still, all relevant parameters of the code that may be extracted from the (complete) WAM, such as the classical weight distribution, the minimal distance etc., may be derived from this representative [24]. Hence one does not gain any advantage from computing a particular representative of the (complete) WAM, and may therefore omit the additional computational effort involved with the MacWilliams formula for the representatives without losing any of the weight parameters contained in the cWAM and WAM.

The proof of the MacWilliams identity is highly technical and not particularly insightful. It has already been indicated that there is an elegant way of proving


the block coding MacWilliams identity using a discrete Fourier transform. It is tempting to think that it might be possible to prove the MacWilliams identity for convolutional codes by the same means. However, as pointed out earlier, the only strict MacWilliams identity is the formula for the generalised (complete) WAM. The nature of these objects as orbits of polynomial matrices makes it at least inconvenient to try to use this technique on them. Whatever proof may be employed for the MacWilliams formulas for convolutional codes, it is necessary to show that the transformed matrix q^{n/2−k} H(HΓ^tH^{−1}) or q^{−k} h(HΛ^tH^{−1}) is in the orbit of the dual object. Therefore it seems inevitable that one representative in the dual orbit be chosen, and one needs to show that it may be reached with a suitable permutation from either q^{n/2−k} H(HΓ^tH^{−1}) or q^{−k} h(HΛ^tH^{−1}). Consequently, the result on the representatives, if not particularly useful in applications, should be understood as a means to prove the identity on the generalised (complete) WAMs.

Before putting Theorems 4.19 and 4.20 in the context of previous results and giving prospects of possible further generalisations, I will transfer them to a convolutional code and its sequence space dual by virtue of Proposition 2.31.

Theorem 4.21 Let C be a convolutional code and C its sequence space dual with the data used above. Put Q′ := −TQ, where T ∈ GLδ(F) is as in Proposition 3.15 and Q is as in Theorem 4.19. Then the cWAMs Γ = Γ(G) and Γ = Γ(G) satisfy

Γ_{X,Y} = q^{n/2−k} H((HΓH^{−1})_{XQ′,YQ′}) for all (X, Y) ∈ F.

Translating the linear transformation on the indices to permutation matrices, the identity appears as

Γ = q^{n/2−k} H(P(Q′)HΓH^{−1}P(Q′)^{−1})

and therefore the generalised adjacency matrices Γ and Γ of the codes C and C satisfy the identity Γ = q^{n/2−k} H(HΓH^{−1}).

Before giving the proof and stating the twin result for the WAM, let me point out that an important difference between the MacWilliams formulas in Theorem 4.21 and Theorem 4.19 is that the transposition of Γ must not be carried out in the formula for the sequence space dual. From the computational point of view, it will become clear that this may be attributed to Proposition 3.16. Moreover, one should keep in mind that the automorphism Q′ is stated in terms of (A, B, C, D) and (A, B, C, D) rather than (A, B, Cρ, Dρ), the controller canonical form of ρ(G). It is of course possible to employ the controller canonical form of G = ρ(G) rather than that of G, but this would require introducing more notation. For example, the matrix N would need to be redefined, which I would like to avoid. Moreover, I explained that the MacWilliams formulas on the representatives are best understood as a necessary step on the way to the MacWilliams identity of the generalised cWAM, and they are of low interest for applications. The reader interested in seeing the automorphism Q′ in terms of the controller canonical forms of G and G is referred to [12].


Proof: First note that the matrix T from Proposition 3.15 is self-inverse, that is, T = T^{−1}. Moreover, recall that for any permutation matrix P one has P^t = P^{−1}, and hence for any P ∈ GLδ(F) it is P(P^{−1}) = P(P)^{−1} = P(P)^t. Using this, Theorem 4.19 and Proposition 3.16, one finds

Γ = P(T)Γ^tP(T)^{−1} = P(T)(q^{n/2−k}H(P(Q)HΓ^tH^{−1}P(Q^{−1})))^tP(T)^{−1}

  = q^{n/2−k}P(T)(H(P(Q)HΓ^tH^{−1}P(Q^{−1})))^tP(T)^{−1}

  = q^{n/2−k}H(P(T)(P(Q)^{−1})^t(H^{−1})^tΓH^tP(Q)^tP(T)^{−1})

  = q^{n/2−k}H(P(T)P(Q)(H^{−1})^tΓH^tP(Q)^{−1}P(T)^{−1})

  = q^{n/2−k}H(P(TQ)(H^{−1})^tΓH^tP(TQ)^{−1}),

where I additionally used that H is only applied entrywise on the matrix, so its application commutes with the permutation and the transposition. Now recall from Lemma 4.3 that the MacWilliams matrix H and hence its inverse are symmetric, so H^t = H and (H^{−1})^t = H^{−1}. The same lemma gives H^{−1} = P(−Iδ)H = HP(−Iδ). Using these properties of H I may conclude

Γ = q^{n/2−k}H(P(TQ)P(−Iδ)HΓH^{−1}P(−Iδ)P(TQ)^{−1})

  = q^{n/2−k}H(P(−TQ)HΓH^{−1}P(−TQ)^{−1}).

This gives the MacWilliams formula in matrix form, which immediately implies the other two. □

Using the same chain of arguments as in the proof of Theorem 4.21, one may prove the analogous result for the WAM, using Theorem 4.20 instead of Theorem 4.19.

Theorem 4.22 Let C be a convolutional code and C its sequence space dual with the data used in Theorem 4.21. Put Q′ := −TQ, where T ∈ GLδ(F) is as in Proposition 3.15 and Q is as in Theorem 4.19 with the data of G, the encoder of the ordinary dual of C. Then the WAMs Λ = Λ(G) and Λ = Λ(G) satisfy

Λ_{X,Y} = q^{−k} h((HΛH^{−1})_{XQ′,YQ′}) for all (X, Y) ∈ F.

Translating the linear transformation on the indices to permutation matrices, the identity appears as

Λ = q^{−k} h(P(Q′)HΛH^{−1}P(Q′)^{−1})

and therefore the generalised adjacency matrices Λ and Λ of the codes C and C satisfy the identity Λ = q^{−k} h(HΛH^{−1}).

At the end of this section I will discuss the issues of using different MacWilliams matrices H(P), P ∈ GLδ(F), and, in the case of the cWAM, different block coding MacWilliams transforms Hα, α ∈ F∗. Let me first address the MacWilliams matrices. I will do this first for the case of the cWAM and the ordinary duality notion


[, ] in Theorem 4.19. Therefore recall from Lemma 4.3 that H(P ) = P(P )H for anyP ∈ GLδ(F ). Additionally using (3.1) this gives

Γ = q^{n/2−k}H(P(Q)HΓ^tH^{−1}P(Q)^{−1})

  = q^{n/2−k}H(P(Q)P(P)^{−1}P(P)HΓ^tH^{−1}P(P)^{−1}P(P)P(Q)^{−1})

  = q^{n/2−k}H(P(QP^{−1})H(P)Γ^t(P(P)H)^{−1}(P(Q)P(P^{−1}))^{−1})

  = q^{n/2−k}H(P(QP^{−1})H(P)Γ^tH(P)^{−1}P(QP^{−1})^{−1}).

This shows that the effect of using a different MacWilliams matrix is simply that the automorphism that induces the permutation needed for the equation to hold has to be adjusted depending on P. This argument obviously holds for the MacWilliams identities in Theorems 4.20, 4.21 and 4.22 as well. Hence all MacWilliams formulas work for any MacWilliams matrix H(P), P ∈ GLδ(F), instead of H, and consequently the MacWilliams formulas for the generalised (complete) WAM remain unaltered by using a different MacWilliams matrix.

The use of different MacWilliams transforms Hα, α ∈ F∗, may only occur in the situation of the cWAM, i.e. it is relevant for Theorem 4.19 and Theorem 4.21. Recalling Lemma 2.13 i), it is Hα = H ∘ mα. Therefore it is not surprising that the matrix mα(Γ) will be of interest. Fix (X, Y) ∈ ∆; then

mα(γ_{X,Y}) = mα(cwe(ϕ(X, Y) + C^const)) = cwe(α(ϕ(X, Y) + C^const)) = cwe(ϕ(αX, αY) + C^const) = γ_{αX,αY},

where I used (2.1). This implies

mα(Γ) = P(α · id)ΓP(α · id)−1. (4.28)

Because the map mα is a C-algebra homomorphism for any α ∈ F∗, one has for any P ∈ GLδ(F), by (3.1),

mα(P(P)HΓ^tH^{−1}P(P)^{−1}) = P(P)H mα(Γ^t) H^{−1}P(P)^{−1}

  = P(P)HP(α · id)Γ^tP(α · id)^{−1}H^{−1}P(P)^{−1}

  = P(α^{−1}P)HΓ^tH^{−1}P(α^{−1}P)^{−1}. (4.29)

With this result, the second statement in Theorem 4.19 may be reformulated as follows for α ∈ F∗:

q^{n/2−k}Hα(P(αQ)HΓ^tH^{−1}P(αQ)^{−1}) = q^{n/2−k}H(mα(P(αQ)HΓ^tH^{−1}P(αQ)^{−1}))

  = q^{n/2−k}H(P(αQ)P(α^{−1})HΓ^tH^{−1}P(α)P(αQ)^{−1})

  = q^{n/2−k}H(P(Q)HΓ^tH^{−1}P(Q)^{−1}) = Γ. (4.30)

Again, using the same chain of arguments, I derive from the second statement of Theorem 4.21, for any α ∈ F∗,

Γ = q^{n/2−k}Hα(P(αQ′)HΓH^{−1}P(αQ′)^{−1}). (4.31)


As a consequence I obtain for the generalised cWAM and any α ∈ F∗

Γ = q^{n/2−k}Hα(HΓ^tH^{−1}) and Γ = q^{n/2−k}Hα(HΓH^{−1}). (4.32)

Combining this result with the result on the different MacWilliams matrices, I summarise the following identities.

Corollary 4.23 For any α ∈ F∗ and any P ∈ GLδ(F) one has

(i) Γ = q^{n/2−k}Hα(H(P)Γ^tH(P)^{−1}),

(ii) Λ = q^{−k}h(H(P)Λ^tH(P)^{−1}),

(iii) Γ = q^{n/2−k}Hα(H(P)ΓH(P)^{−1}),

(iv) Λ = q^{−k}h(H(P)ΛH(P)^{−1}).

4.4 Previous Results and Possible Generalisations

The general approach to convolutional codes has been that all concepts of convolutional coding theory should be generalisations of the existing and established notions of block coding theory. In the previous sections I showed that both notions of duality and the weight function considered for convolutional codes reduce to the standard ones for block codes if the complexity of the convolutional code is 0. Moreover, it has been pointed out that for any convolutional code C with encoder G and complexity 0 one finds Γ(G) = cwe(C) and Λ(G) = we(C). Therefore it is a valid question whether the MacWilliams formulas proved above reduce to the classical MacWilliams identities known for block codes. To this end note that if δ = 0, the group GLδ(F) is trivial. Therefore no permutation is involved in any of the MacWilliams formulas stated in the previous section. Consequently, the generalised (complete) WAM is no longer an orbit of polynomial matrices, but a single (multivariate) polynomial, namely the (complete) weight enumerator, which is of course invariant under transposition. Finally, one has H = 1 = H^{−1}. Comparing the resulting formulas in this situation with the respective classical MacWilliams identities shows that they indeed coincide. Therefore it is justified to call the results in Section 4.3 a generalisation of the classical MacWilliams identities to convolutional codes.

Next I want to have a closer look at codes with degree δ = 1, also called unit constraint-length codes. The situation now becomes particularly simple since, firstly, GLδ(F) = F∗ and, secondly, the adjacency matrix Λ as well as the corresponding (sequence space) dual objects do not depend on the choice of the encoder matrices G and G. The latter is a consequence of Equation (3.2) along with Remark 3.4(a), (b). Notice also that in Diagram (4.8) the second and third columns are trivial. Using Remark 3.4(a) once more, one finally sees that the statements of Theorem 4.20 reduce to the nice short formula

Λ = q^{−k}h(HΛ^tH^{−1}). (4.33)

This result cannot be achieved for the cWAM, as there is no analogue of Remark 3.4(a).
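For the δ = 0 case discussed above, where the identities collapse to the classical block code MacWilliams identity, a brute-force numerical check is easy to set up. The following sketch is my own illustration (plain Python, binary case; the particular [4,2] generator matrix is an arbitrary small example) and verifies we(C⊥)(x, y) = |C|^{−1} we(C)(x + y, x − y) at a few integer points:

```python
from itertools import product

def span(gen):
    # all F_2-linear combinations of the generator rows
    n = len(gen[0])
    return {tuple(sum(c * g[i] for c, g in zip(coeffs, gen)) % 2 for i in range(n))
            for coeffs in product([0, 1], repeat=len(gen))}

def dual(code, n):
    # brute force: all vectors orthogonal to every codeword
    return {v for v in product([0, 1], repeat=n)
            if all(sum(a * b for a, b in zip(v, c)) % 2 == 0 for c in code)}

def we(code, x, y):
    # homogeneous weight enumerator: sum over codewords of x^(n-wt) y^wt
    n = len(next(iter(code)))
    return sum(x ** (n - sum(c)) * y ** sum(c) for c in code)

C = span([[1, 0, 1, 1], [0, 1, 0, 1]])   # a [4,2] binary code
Cd = dual(C, 4)
# classical MacWilliams identity, evaluated at a few integer points
for x, y in [(1, 2), (3, 5), (2, 7)]:
    assert len(C) * we(Cd, x, y) == we(C, x + y, x - y)
```

Since both sides are polynomials in x and y of bounded degree, agreement on sufficiently many points already implies the polynomial identity.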


In the paper [1] the so-called weight enumerator state diagram has been studied for codes with degree one. They are defined as the state diagram of the encoder where each directed edge is labelled by the weight enumerator of a certain affine code. A type of MacWilliams identity has been derived for these objects [1, Thm. 4] with the usual duality notion [, ]. It consists of a separate transformation formula for each of these labels. After some notational adjustment one can show that the weight enumerator state diagram is in essence identical to the WAM of the code. Furthermore, if stated in the notation of this text, the MacWilliams identity in [1, Thm. 4] reads as

λ_{X,Y} = q^{−k−1} h(λ_{0,0} + (q − 1)(λ_{0,1} + Σ_{Y∈F} λ_{1,Y}))  if (X, Y) = (0, 0),
λ_{X,Y} = q^{−k−1} h(λ_{0,0} + qλ_{X,Y} − λ_{0,1} − Σ_{Y∈F} λ_{1,Y})  else. (4.34)

In the sequel I will briefly sketch that this result coincides with Identity (4.33). In order to do so, use again ℓ_{X,Y} as introduced in Corollary 4.5. Then (4.33) turns into

λ_{X,Y} = q^{−k} h(ℓ_{−Y,X}) for all (X, Y) ∈ F. (4.35)

Now I am in a position to derive (4.34). Consider first the case (X, Y) = (0, 0). Recalling Corollary 4.5, Proposition 3.14, and Remark 3.4(a) one finds

q ℓ_{0,0} = we(CC) = Σ_{(X,Y)∈F} λ_{X,Y} = λ_{0,0} + Σ_{Y∈F∗} λ_{0,Y} + Σ_{X∈F∗} Σ_{Y∈F} λ_{X,Y} = λ_{0,0} + (q − 1)(λ_{0,1} + Σ_{Y∈F} λ_{1,Y}).

Using (4.35) this yields the first case of (4.34). For the second case let (X, Y) ∈ F \ {(0, 0)}. Since ∆⊥ = 0 and (ker Φ)⊥ = F, one observes that in Corollary 4.5 the third case has to be applied. Along with Proposition 3.14 and Remark 3.4(a) this yields

q ℓ_{−Y,X} = (1/(q − 1)) (q Σ_{(Z1,Z2)∈(−Y,X)⊥} λ_{Z1,Z2} − we(CC))

  = (1/(q − 1)) (q Σ_{α∈F} λ_{αX,αY} − we(CC))

  = (1/(q − 1)) (q(q − 1)λ_{X,Y} + qλ_{0,0} − Σ_{(Z1,Z2)∈F} λ_{Z1,Z2})

  = qλ_{X,Y} + λ_{0,0} − (1/(q − 1)) Σ_{(Z1,Z2)∈F\{(0,0)}} λ_{Z1,Z2} = qλ_{X,Y} + λ_{0,0} − λ_{0,1} − Σ_{Y∈F} λ_{1,Y}.

Combining this with (4.35) leads to the second case of (4.34). At this point I want to stress that it was indeed this paper that gave a crucial incentive to find the general MacWilliams formulas for convolutional codes.

After having surveyed the work that has been done in the past, I will now discuss the possibilities to generalise the result even further. It has already been pointed out that it is not possible to consider different bilinear forms, as in the field case


the classical MacWilliams identity is known to work only for the standard bilinear form that has been used in this text so far. However, if one considers block codes over finite rings, there are more bilinear forms that admit a MacWilliams identity. This suggests looking at convolutional codes over finite rings, that is, at submodules of R[z]^n, where R is a finite ring, instead of F[z]^n. Indeed those codes have been studied in the past, for example in [4], [16]. Although some fundamental results have been derived, some key questions remain unsolved. The cWAM and the WAM have been defined using minimal encoders for the code. An essential finding of systems theory, which clarified how the cWAM and WAM of two different encoders are connected, is how two minimal encoders, and hence their controller canonical forms, are related in the field case. Alas, even for the finite rings closest to the field case, commutative finite chain rings, there is no equivalent systems theoretic result yet that would tell how, for example, two minimal encoders of one convolutional code over one of these rings are related. Even worse, it is not even clear whether it is possible to find a well-defined notion of "complexity" for a convolutional code over a finite ring that is reflected by an encoder of the code. As the complexity of the convolutional code determines the format of the cWAM and WAM in the field case, it is obvious that, unless this question is solved, there is little hope one may find a MacWilliams identity for convolutional codes over finite rings that is similar to those in Section 4.3. Still, there is currently work carried out on that subject, so it might be hoped that these problems will be overcome in the future.

Another direction of generalisation that may be considered is to look at different weight enumerators than the two which have been studied here. In the literature many more weight enumerators for block codes over finite fields exist. The classical weight enumerator we, which has been considered here, is the most commonly used and the coarsest, because all elements in F∗ are assigned the same value W and zero is assigned the value 1. Almost on the other extreme is the complete weight enumerator cwe, where each a ∈ F is assigned a variable Wa, and it has been shown that we is the projection of cwe under ι. Using different projections of cwe leads to other notions of weight enumerators, such as the Lee weight enumerator Lwe with a projection ιLee or the homogeneous weight enumerator hwe with ιhom, where

ιhom : C[Wa | a ∈ F] → C[W, W0],  Wa ↦ W0 if a = 0, and Wa ↦ W else.

For all these weight enumerators, MacWilliams identities in the block coding case exist. One may of course define a Lee weight adjacency matrix or a homogeneous weight adjacency matrix, where one puts, for example,

λ^Lee_{X,Y} := Lwe(ϕ(X, Y) + C^const) for (X, Y) ∈ ∆

or

λ^hom_{X,Y} := hwe(ϕ(X, Y) + C^const) for (X, Y) ∈ ∆.

It has been shown that there was one crucial step for conveying the result on Γ to its projection Λ under ι, namely Corollary 4.10, where I used that ιH = q^{−n/2}hι and ι is a C-algebra homomorphism. If the projections used to derive the weight enumerators fulfil similar conditions, the corresponding weight adjacency matrices will allow for a MacWilliams identity in the convolutional code case. In the example of the Lee weight enumerator and the homogeneous weight enumerator this is the case, and each of the theorems in Section 4.3 may be stated and proven for these weight adjacency matrices. However, one has to pay attention to the rescaling factor q^·, which has to be adapted according to the projection.


5 On Self-orthogonal and Self-dual Convolutional Codes

In block coding theory, one of the research fields which has aroused great interest among mathematicians is the theory of self-orthogonal and self-dual block codes. Before giving a formal definition, I will briefly sketch why these codes are interesting and what the main topics in research are.

Two of the best known classical block codes over F2 are the extended Hamming code C8 of length 8 and the extended Golay code C24 of length 24. Both codes have a dimension of half their length and are in a certain sense optimal. Moreover, their construction admits an efficient decoding algorithm. Hence from an applications point of view they are interesting objects. From a mathematical point of view it is striking that both codes are self-dual. This fact led to the search for higher dimensional self-dual codes, and self-dual codes over different fields, with high minimal distance, under the premise that the chance of finding a good self-dual code is higher than that of finding a good random code. In fact, the code used to encode information for compact discs and digital versatile discs is a self-dual code over F128, which indicates that there might be many more good self-dual codes.

There is still ongoing research invested in finding self-dual codes with good distance properties, and a huge machinery has been developed to facilitate this task. It is obvious that an exhaustive amount of computational effort is necessary in high dimensions to find all self-dual codes and to compute their minimal distances. An important tool in this search is the classical (complete) MacWilliams transform, which leaves the (complete) weight enumerator of a self-dual code invariant. Using invariant theory, it has been shown that the (complete) weight enumerator of a self-dual code has a very special structure, which makes it possible to exclude the existence of good self-dual codes with certain parameters. For example, for any self-dual, doubly even (the weight of any codeword is divisible by 4) code C over F2, there is a polynomial ψ ∈ C[Z1, Z2] such that cwe(C) = ψ(cwe(C8), cwe(C24)). That is, any complete weight enumerator of a self-dual code over F2 is algebraic in the complete weight enumerators of C8 and C24, which, besides being self-dual, are doubly even as well.

The situation for convolutional codes is very different. It is a recent result [9] that, with a few exceptions, a convolutional code C is optimal with respect to the relation between its algebraic parameters and various distance parameters if and only if its dual with respect to [, ] is. Moreover, in [17] a convolutional code with good distance properties is presented that is self-dual with respect to sequence space duality. Little more is known about self-dual convolutional codes. In this chapter I will collect some simple facts about self-orthogonal and self-dual convolutional codes.


5.1 Self-orthogonal and Self-dual Block Codes

I start by giving a formal definition of what a self-orthogonal and a self-dual blockcode is.

Definition 5.1 A code 0 6= C ≤ Fn is self-orthogonal, if C ⊆ C⊥ and self-dual, ifC = C⊥. Moreover, a codeword v ∈ C is isotropic if [v, v] = 0. A code C ≤ Fn iscalled totally isotropic if every codeword in C is isotropic.

It is immediate from the definition that any self-dual code is of course self-orthogonal. Moreover, using Lemma 2.9, the dimension of a self-orthogonal code is at most n/2. If a code is self-dual, its dimension necessarily is n/2. Therefore 2 | n is a necessary condition for the existence of self-dual codes in Fn. This is in general not a sufficient condition, for it is easy to check that there is no self-dual code in F_3^2. A more general question is whether a self-orthogonal code is contained in a self-dual code or not. It turns out that the answer to both questions depends on the cardinality of the field. The following assertions are well-known in the theory of bilinear forms.

Proposition 5.2 Let K be a field (not necessarily finite) and V ≤ Kn a self-orthogonal subspace. Then V is contained in a self-dual space, if and only if −1 isa square in K or 4 | n.

In the situation of a finite field, whether −1 is a square depends on the cardinality of the field.

Corollary 5.3 A self-orthogonal code over a finite field F = Fq is contained in a self-dual code if and only if q ≢ 3 (mod 4) or 4 | n.

Regarding the relationship between the notions totally isotropic, self-orthogonal and self-dual, one may summarise the following.

Lemma 5.4 For (iii)–(v) let n be even.

(i) Any self-orthogonal (and hence any self-dual) code is totally isotropic;

(ii) if the characteristic of F is odd, any totally isotropic subspace of Fn is self-orthogonal;

(iii) if the characteristic of F is 2, the dimension of the unique maximal totally isotropic subspace of Fn is n − 1 and it contains 1 := (1, . . . , 1) ∈ Fn;

(iv) a self-dual code is a maximal totally isotropic subspace of Fn if and only if the characteristic of F is odd;

(v) if the characteristic of F is odd, a maximal totally isotropic subspace of Fn is self-dual.


One means to assess the quality of a code is the so-called Singleton bound.

Lemma 5.5 For any block code of length n, dimension k and distance d over afinite field F the following inequality holds

n ≥ k + d− 1.

A code whose parameters reach this bound is called maximum distance separable(MDS).

Example 5.6 The Hamming code of length 8,

C8 = im [ 1 0 0 0 1 1 1 0 ]
        [ 0 1 0 0 0 1 1 1 ]
        [ 0 0 1 0 1 0 1 1 ]
        [ 0 0 0 1 1 1 0 1 ]   ⊆ F_2^8,

is easily checked to be self-dual. In the introduction to this chapter it has already been mentioned that each codeword's weight is divisible by 4, that is, the code is doubly even. In particular, this shows that the minimal distance of the Hamming code is 4. The Hamming code of length 8 is therefore not MDS, but indeed close to being optimal.

5.2 The Clifford Groups and Gleason's Theorem for Block Codes

In this section I will define a group of C-algebra homomorphisms acting on C[Wa | a ∈ F] that leaves the complete weight enumerator of a self-dual block code invariant. To ensure the existence of self-dual codes, let n be even or, if necessary because of Corollary 5.3, divisible by 4. Recall the maps mα, α ∈ F∗, defined after Definition 2.12. According to (2.1) the complete weight enumerator of any code, and hence in particular the complete weight enumerator of a self-dual code, is left invariant under mα for any α ∈ F∗. If C = C⊥ ≤ Fn is self-dual and thus k = n/2, then applying the complete MacWilliams transform H to cwe(C) gives, using Theorem 2.14,

H(cwe(C)) = cwe(C⊥) = cwe(C). (5.1)

Hence the complete weight enumerator of a self-dual code is invariant under H. Moreover, recalling that Lemma 2.13 gives Hα = H ∘ mα for any α ∈ F∗, it is invariant under Hα for any α ∈ F∗. Now consider ds : C[Wa | a ∈ F] → C[Wa | a ∈ F], defined as the C-algebra homomorphism given by Wa ↦ χs(a²)Wa, for any s ∈ F, where the notation from Proposition 2.11 has to be read for the F-vector space F. Using that a character is a group homomorphism, one finds for a given c ∈ Fn and any s ∈ F

ds(cwe(c)) = ds(∏_{i=1}^n W_{c_i}) = ∏_{i=1}^n χs(c_i²)W_{c_i} = χs(∑_{i=1}^n c_i²) ∏_{i=1}^n W_{c_i} = χs([c, c]) cwe(c). (5.2)


Thus, if c ∈ C is isotropic, it is immediate that ds(cwe(c)) = cwe(c) for all s ∈ F. Since, independently of the characteristic, a self-dual code C ≤ Fn is totally isotropic by Lemma 5.4 (i), one arrives at

ds(cwe(C)) = cwe(C) for any s ∈ F and any self-dual code C over F. (5.3)

Finally, the complete weight enumerator of any c ∈ Fn is invariant under −id, where −id(Wa) = −Wa for any a ∈ F, since n is even. Note that all maps used above have C[Wa | a ∈ F] as range, do not depend on n and, in particular, are invertible, which is immediate for mα, α ∈ F∗, and for H from Proposition 2.13, and, by checking d_s^{−1} = d_{−s}, for ds, s ∈ F, as well. Hence the set

K := ⟨ds, mα, H, −id | s ∈ F, α ∈ F∗⟩

is indeed a group, the Clifford group, that only depends on F, and the following assertion is already proven.

Proposition 5.7 The complete weight enumerator of a self-dual code over F is invariant under K.

Note that, due to the independence of K of n, the length of the code in the proposition need not be specified. It is by no means obvious that the Clifford group is finite, but exploiting the special structure of the group its order can even be calculated [26]. In the same book it is established, using invariant theory, group theory and representation theory, that in a very general context the following special case holds, which is commonly referred to as Gleason's Theorem.

Theorem 5.8 Fix a finite field F = Fq. The ring of invariants of K is generated by q complete weight enumerators of self-dual codes over F as a C-algebra. In other words, there are q self-dual codes C1, . . . , Cq such that any ψ ∈ C[Wa | a ∈ F] that is invariant under K is algebraic in cwe(C1), . . . , cwe(Cq).

The theorem has different implications. One is that there is no other C-algebra homomorphism that leaves the complete weight enumerator of an arbitrary self-dual code invariant; that is, K is the largest group of homomorphisms with that property. Moreover, a code whose complete weight enumerator is not in the invariant ring of K cannot be self-dual. Therefore, through the knowledge of the generators of the invariant ring, which may be found with a computer search if necessary, the existence of self-dual codes with certain parameters can be excluded.

Theorem 5.8 in this form has not been proven by Gleason. In fact, the special case for doubly-even, self-dual codes over F2 mentioned in the introduction of this chapter is attributed to him. Quite recently Nebe, Rains and Sloane [25] to [27] embedded this result into a wide context using sophisticated methods.
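To make the invariance statement of Proposition 5.7 concrete, here is a small numerical sketch over F2, not part of the thesis; the helper names cwe_eval and ds_eval are my own. It checks that the complete weight enumerator of the self-dual repetition code {00, 11} is invariant under the MacWilliams transform H and under every ds, by evaluating both sides at sample points.

```python
from itertools import product

def cwe_eval(code, w0, w1):
    """Evaluate the complete weight enumerator of a binary code at (W0, W1)."""
    return sum(w0 ** sum(1 for x in c if x == 0) * w1 ** sum(c) for c in code)

# {00, 11} is a self-dual block code over F2 (n = 2, k = 1 = n/2).
C = [(0, 0), (1, 1)]

# Invariance under H: over F2 the complete MacWilliams transform substitutes
# W0 -> W0 + W1, W1 -> W0 - W1 and divides by |C|.
for w0, w1 in product([0.5, 1.3, 2.0], repeat=2):
    assert abs(cwe_eval(C, w0 + w1, w0 - w1) / len(C) - cwe_eval(C, w0, w1)) < 1e-9

# Invariance under d_s: by (5.2) the monomial of a codeword c picks up the sign
# chi_s([c, c]) = (-1)**(s * sum(c_i^2) mod 2), which is +1 on isotropic words.
def ds_eval(code, s, w0, w1):
    return sum((-1) ** (s * (sum(x * x for x in c) % 2))
               * w0 ** sum(1 for x in c if x == 0) * w1 ** sum(c) for c in code)

for s in (0, 1):
    assert ds_eval(C, s, 1.7, 0.4) == cwe_eval(C, 1.7, 0.4)
print("cwe({00, 11}) is invariant under H and under every d_s")
```

Since two polynomials in W0, W1 of bounded degree agree iff they agree on sufficiently many points, the pointwise checks stand in for a symbolic identity.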


5.3 Self-orthogonal and Self-dual Convolutional Codes

It has been demonstrated that self-dual block codes are of interest for mathematics and applications, so it is reasonable to look at self-dual convolutional codes as well. A convolutional code is called self-dual (or self-orthogonal) with respect to [ , ] if C = C⊥ (or C ⊆ C⊥). For the sequence space duality [[ , ]], a convolutional code that equals (or is contained in) its dual with respect to [[ , ]] is called sequence space self-dual (or sequence space self-orthogonal). As both notions of duality reduce to the bilinear form [ , ] on Fⁿ if the complexity is 0, the two definitions are generalisations of the concept of self-dual block codes. Again, throughout this section n is the length of a convolutional code and k its dimension.

Example 5.9 Consider the code C = im(z + 1  1  z) ≤ F2[z]³. It is easily checked that this code is self-orthogonal with respect to [ , ], and its dual is

C⊥ = im ( z + 1  1  z
            1    1  1 ).

Due to Propositions 2.25 and 2.31 the dual with respect to [[ , ]] is

ρ(C⊥) = im ( 1 + z  z  1
               1    1  1 ).

Hence C is not sequence space self-orthogonal. This shows that the two forms [ , ] and [[ , ]] lead to different notions of self-orthogonality. For example, the notion used in [17] is sequence space self-orthogonality. I will concentrate on self-orthogonality with respect to [ , ].

The codes C_const and C_C have proven to be useful in the analysis of convolutional codes. The next lemma surveys consequences for these block codes if the convolutional code is self-orthogonal or even self-dual.

Lemma 5.10 Let C be a self-orthogonal (self-dual) code with minimal encoder G. Then:

(i) imG(0) is a self-orthogonal (self-dual) block code;

(ii) the code generated by the highest coefficient rows of G is a self-orthogonal (self-dual) block code;

(iii) if C is self-dual and C_const or C_C is self-dual, then δ = 0;

(iv) if the characteristic is even, the code C_C is a totally isotropic block code.

Proof: (i) and (ii) are immediate from GGᵗ = 0. As for (iii), C_const self-dual implies dim(C_const) = n/2, hence any row of the chosen encoder must be constant and C is a block code. If C_C is self-dual, then n/2 = dim(C_C) = dim(C) = k. But from Proposition 2.28 it is known that dim(C_C) = k + r, hence r = 0 and again all rows of the encoder are constant.

(iv) Choose any row gᵢ = ∑_{j=0}^{δᵢ} gᵢ^{(j)} z^j of the encoder G. As C is self-orthogonal, [gᵢ, gᵢ] = 0. From this one finds, for all 0 ≤ m ≤ 2δᵢ, by comparing the coefficients,

0 = ∑_{k+l=m} [gᵢ^{(k)}, gᵢ^{(l)}].


Now let m = 2m′ be even. Using the symmetry of the bilinear form and p = 2, one obtains

0 = ∑_{k+l=m} [gᵢ^{(k)}, gᵢ^{(l)}] = [gᵢ^{(m′)}, gᵢ^{(m′)}] + ∑_{k=0}^{m′−1} 2[gᵢ^{(k)}, gᵢ^{(m−k)}] = [gᵢ^{(m′)}, gᵢ^{(m′)}].

Hence for all 1 ≤ i ≤ k and 0 ≤ m′ ≤ δᵢ one has [gᵢ^{(m′)}, gᵢ^{(m′)}] = 0; that is, the code C_C is generated by isotropic vectors. Now in characteristic 2 one has for all a, b ∈ Fⁿ and α ∈ F

[a, a] = 0 = [b, b] ⇒ [a + αb, a + αb] = [a, a] + α²[b, b] + 2α[a, b] = 0.

Therefore the code C_C itself is totally isotropic. □

For p ≥ 3, (iv) is wrong for self-dual codes with δ ≥ 1, as here, according to Lemma 5.4, a subspace is self-orthogonal if it is totally isotropic. Hence, if C_C is totally isotropic, one arrives at the inequality dim(C_C) = k + r ≤ n/2. In particular, in this situation there is no self-dual code with r > 0, and hence δ > 0, such that C_C is totally isotropic.

In the case p = 2, however, one can derive some more results from Lemma 5.10 (iv) using Lemma 5.4 (iii).

Lemma 5.11 Let C be a self-orthogonal convolutional code over a field of even characteristic. Then dim(C_C) = k + r ≤ n − 1. In particular, 1 ∈ (C⊥)_const ≠ 0. Hence for any self-dual code C it is 1 ∈ C.

Proof: The only non-obvious assertion is that 1 is contained in the dual of any self-orthogonal code C. By Lemma 5.10 (iv), C_C is contained in the uniquely determined maximal totally isotropic subspace V ≤ Fⁿ. Hence (C_C)⊥ = (C⊥)_const ⊇ V⊥. Since dim V = n − 1, V⊥ is generated by one vector. Let v ∈ V; then

0 = [v, v] = ∑_{i=1}^n vᵢ² = (∑_{i=1}^n vᵢ)²,

which implies

0 = ∑_{i=1}^n vᵢ = [v, 1].

Therefore V⊥ = F·1 ⊆ (C⊥)_const. □

As a consequence, the minimal distance of a self-dual convolutional code of characteristic 2 and length n is less than or equal to n. To estimate how good this bound is, consider the generalised Singleton bound for convolutional codes as given in [32].


Proposition 5.12 For any (n, k, δ)-convolutional code C with minimal distance d one has

d ≤ (n − k)(⌊δ/k⌋ + 1) + δ + 1.

Note that this inequality reduces to the block code Singleton bound as soon as δ = 0. As in that situation, a code which achieves this bound is called MDS (maximum distance separable). It has been shown that any self-dual convolutional code of length n and dimension n/2 has a minimal distance less than n + 1. Using that the Forney indices of an MDS code are in {ε, ε + 1} for some ε ∈ N, and that 1 ∈ C for a self-dual code C of characteristic 2, the Forney indices of a self-dual MDS code are in {0, 1}; that is, any polynomial row in an encoder of C has degree 1, which is obviously a strong condition on the code. The same holds true for any convolutional code of odd characteristic, using δ ≤ n/2 − 1.
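The bound of Proposition 5.12 is easily evaluated mechanically; a one-function sketch (the name generalized_singleton is my own, not thesis notation):

```python
def generalized_singleton(n, k, delta):
    """Generalised Singleton bound of Proposition 5.12 on the minimal distance
    of an (n, k, delta) convolutional code; delta = 0 gives d <= n - k + 1."""
    return (n - k) * (delta // k + 1) + delta + 1

# The block code case (delta = 0) recovers the classical Singleton bound:
assert generalized_singleton(6, 3, 0) == 4
# A self-dual code with n = 4, k = 2, delta = 1 satisfies d <= 4 = n,
# matching the bound d <= n obtained above for characteristic 2.
print(generalized_singleton(4, 2, 1))
```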

So far, few self-dual convolutional codes have been given. Therefore it is a validquestion for which parameters self-orthogonal or self-dual convolutional codes exist.

Example 5.13 Fix n = 4 and q = 2. Then

im ( z + 1  1  z  0
       1    1  1  1 )

is a self-dual convolutional code of complexity 1. Using this code it is straightforward to construct self-dual convolutional codes with parameters n ≥ 4, δ ≥ 1. But there is no self-dual convolutional code with n = 2 and δ > 0, as im(1, 1) is the only totally isotropic subspace of F². So in this situation self-dual convolutional codes exist for almost all parameters.
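The self-duality claimed in Example 5.13 can be verified with elementary polynomial arithmetic over F2; a small sketch (coefficient lists encode polynomials, lowest degree first; the helper names are ad hoc):

```python
def polymul_gf2(a, b):
    """Multiply polynomials over F2 given as coefficient lists (low degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] ^= x & y
    return out

def form_is_zero_gf2(g, h):
    """Check [g, h] = sum_i g_i(z) h_i(z) = 0 in F2[z]."""
    acc = [0] * 8  # large enough for the degrees occurring here
    for gi, hi in zip(g, h):
        for m, c in enumerate(polymul_gf2(gi, hi)):
            acc[m] ^= c
    return not any(acc)

# Rows of the encoder from Example 5.13: (z+1, 1, z, 0) and (1, 1, 1, 1).
g1 = [[1, 1], [1], [0, 1], [0]]
g2 = [[1], [1], [1], [1]]

# All pairings vanish, so G G^t = 0 and the code (of dimension 2 = n/2) is
# self-dual; note that the all-one word 1 is a codeword, as Lemma 5.11 predicts.
assert form_is_zero_gf2(g1, g1) and form_is_zero_gf2(g2, g2) and form_is_zero_gf2(g1, g2)
print("self-dual over F2[z]")
```

The check (z+1)² + 1 + z² = 0 for the first row uses exactly the characteristic-2 cancellation 2z = 0 exploited in Lemma 5.10 (iv).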

The case of odd characteristic is already more complicated in the case of block codes. It has been shown that the coefficient space C_C of a self-dual convolutional code of odd characteristic is only totally isotropic if C itself is a block code. A natural question is whether there are any self-dual convolutional codes with δ > 0 and whether there are any constraints on the space C_C. Let me address the first issue.

Proposition 5.14 Let n be even. If q ≢ 3 (mod 4), any self-orthogonal convolutional code C ⊆ F[z]ⁿ is contained in a self-dual convolutional code. Let q ≡ 3 (mod 4). Then any self-orthogonal convolutional code is contained in a self-dual convolutional code iff 4 | n.

Proof: Let C = imG be a self-orthogonal convolutional code. Then G generates an F(z)-vector space V of dimension k with C ⊆ V. Proposition 5.2 gives the condition under which V, and hence C, is contained in a self-dual subspace W ⊆ F(z)ⁿ. Using Corollary 5.3, since F ⊆ F(z), the existence of such a W is guaranteed in the respective cases. Now W ∩ F[z]ⁿ is a free, self-dual F[z]-module with direct complement, hence a convolutional code, which contains C. □

Note that there is no assertion on the complexity of the self-dual code in which C is contained. Although the self-orthogonal code C may have had complexity δ, it is not guaranteed that this is the complexity of any self-dual code in which C is contained.


Example 5.15 Let F = F3 and n = 4, so that any self-orthogonal code is contained in a self-dual code. Then C = im(1 + z  1 + 2z  1  z) is a self-orthogonal code of complexity 1. The construction above guarantees that C is contained in a self-dual code C′. One can easily check that one such code is

C′ = im ( 1  2  0  1
          1  1  1  0 ),

which has of course complexity 0. It is straightforward to see, by a short computer search for instance, that there is no self-dual code of complexity 1 that contains C.

I finish this section by giving an example of a self-orthogonal code over F3 whose extension to a self-dual code C satisfies dim(C_C) = 4, showing that dim(C_C) = n can occur for a self-dual convolutional code. The code

C′ = im(1 + z + z²  1 + 2z + z²  1  z²)

is clearly self-orthogonal. Note in particular that the coefficient row (1, 2, 0, 0) is not isotropic. Assume that there is a self-dual code C containing C′ that fulfils dim C_C = 3. This implies dim C_const = 1, hence there is an isotropic constant row that is orthogonal to C′ and thus to C_{C′}. As (C_{C′})⊥ = 〈(1, 1, 1, 1)〉 and this vector is not isotropic, there is no self-dual convolutional code C with dim C_C = 3. Of course there cannot be such a code with dim C_C ≤ 2, and thus any self-dual code C that includes C′ has dim C_C = 4. One self-dual convolutional code with the smallest possible complexity is

C = im ( 1 + z + z²  1 + 2z + z²  1    z²
             z         2 + z     1  1 + z ),

which has degree 2.
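The computations behind this example are easy to replay over F3; a sketch with ad hoc helpers (polynomials as coefficient lists, lowest degree first):

```python
def polymul_mod(a, b, p):
    """Multiply polynomials over F_p given as coefficient lists (low degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out

def form_mod(g, h, p):
    """[g, h] = sum_i g_i(z) h_i(z) over F_p[z], returned as a coefficient list."""
    acc = [0] * 8
    for gi, hi in zip(g, h):
        for m, c in enumerate(polymul_mod(gi, hi, p)):
            acc[m] = (acc[m] + c) % p
    return acc

p = 3
# Rows of the closing self-dual encoder: (1+z+z^2, 1+2z+z^2, 1, z^2), (z, 2+z, 1, 1+z).
g1 = [[1, 1, 1], [1, 2, 1], [1], [0, 0, 1]]
g2 = [[0, 1], [2, 1], [1], [1, 1]]

assert not any(form_mod(g1, g1, p))  # [g1, g1] = 0
assert not any(form_mod(g2, g2, p))  # [g2, g2] = 0
assert not any(form_mod(g1, g2, p))  # [g1, g2] = 0
# The z-coefficient row (1, 2, 0, 0) of g1 is anisotropic, as stated above:
assert sum(x * x for x in [1, 2, 0, 0]) % p != 0
print("self-dual over F3[z], with an anisotropic coefficient row")
```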

The examples given demonstrate that there is in general no constraint on the space of coefficient rows of a self-dual convolutional code of odd characteristic, nor is it possible to predict the complexity of a self-dual convolutional code that includes a self-orthogonal convolutional code of given complexity.

5.4 Invariant Theory and Convolutional Codes

Gleason's Theorem has proven to be a powerful tool in the analysis of self-dual block codes. The existence of a MacWilliams identity for the generalised cWAM of a convolutional code suggests attempting to generalise Gleason's Theorem to convolutional codes as well. To anticipate the result of this section, I will not be able to do so, but some preliminary steps will be established.

The most important difference in the setting of convolutional codes is that the generalised cWAM is not a polynomial like the complete weight enumerator is. It is not even a polynomial matrix, but an orbit of polynomial matrices. Moreover, the MacWilliams transform for convolutional codes

Γ ↦ q^{n/2−k} H(HΓᵗH⁻¹)

depends on the size of Γ, that is, on the complexity of the convolutional code whose cWAM Γ is. In particular, the size of the MacWilliams matrices H ∈ C[Wa | a ∈ F]^{q^δ × q^δ} =: Aδ depends on δ. Therefore the MacWilliams transform to complexity δ is not even defined as a map on the cWAM of a code of higher or lower complexity.

It has already been mentioned that the only MacWilliams identity in a strict sense is that on the generalised cWAM, which is not an element of a ring. Therefore it is not clear what an invariant in this setting should be. Let Ω ∈ Aδ and let ψ : Aδ → Aδ be a map. The matrix Ω is called invariant under ψ if there is a P ∈ GLδ(F) such that ψ(Ω) = P(P)ΩP(P)⁻¹.

In the block code case it is straightforward to show that the complete weight enumerator of a self-dual block code is an invariant of the Clifford group. Having established a MacWilliams identity for the cWAM of convolutional codes, it is pertinent to ask whether a similar result holds for self-dual convolutional codes as well. Therefore I will address this question now.

Considering the maps mα, α ∈ F∗, it is immediate from 3.4 d) that the cWAM of any convolutional code, and hence of any self-dual convolutional code, is invariant under mα for any α ∈ F∗. The situation for the maps ds, s ∈ F, applied entrywise to a matrix in Aδ, is more involved, even if one specialises to self-dual convolutional codes.

Let the characteristic be even and C = imG be a self-dual convolutional code. Then, according to Lemma 5.10 (iv), the block code C_C is totally isotropic. In particular, every entry of Γ(G) is the complete weight enumerator of a set of isotropic vectors. Hence by (5.2) for any (X, Y) ∈ F and any s ∈ F it is ds(γ_{X,Y}) = γ_{X,Y}. Since ds is applied entrywise on Γ(G), the cWAM of a self-dual convolutional code of characteristic 2 is even a strict invariant under ds, s ∈ F, that is, ds(Γ(G)) = Γ(G) for all s ∈ F.

This is different for odd characteristic. Recalling that there is no general constraint on C_C for a self-dual convolutional code C = imG, weight enumerators of anisotropic vectors may occur as entries of Γ(G). It is easy to see from (5.2) that for any anisotropic vector v ∈ Fⁿ the monomial ds(cwe(v)) has a non-trivial complex coefficient. Thus, the entries of ds(Γ(G)) and Γ(G) are not even in bijection with each other any more, and hence Γ(G) is in general not an invariant under ds for 0 ≠ s ∈ F. In fact, I will show that the only situation in which the cWAM of a self-dual convolutional code of odd characteristic is invariant under ds for all s ∈ F∗ is δ = 0.

Lemma 5.16 Let s ∈ F∗ and consider the group homomorphism μs : Fⁿ → C∗, c ↦ χ_{sc}(c). Then c ∈ ker μs for all s ∈ F if and only if c is isotropic with respect to [ , ].

Proof: If c ∈ Fⁿ is isotropic with respect to [ , ], then clearly it is in the kernel of μs for all s ∈ F∗, due to the definition of χ. Now let c ∈ ker μs for all s ∈ F and assume [c, c] = t; then again the definition of χ gives χ_{sc}(c) = χs(t) = χt(s) = 1 for all s ∈ F. Hence the character χt is trivial, which implies t = 0. □
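Lemma 5.16 is easy to probe numerically for one concrete character; a sketch over F3, fixing the additive character χ(a) = ω^a with ω = e^{2πi/3} (the helper mu is my own):

```python
import cmath
from itertools import product

p = 3
omega = cmath.exp(2j * cmath.pi / p)  # chi(a) = omega**a, a non-trivial character

def mu(s, c):
    """mu_s(c) = chi_{sc}(c) = chi(s * [c, c]) for the standard form on F_p^n."""
    return omega ** ((s * sum(x * x for x in c)) % p)

# Exhaustively over F3^2: c lies in ker(mu_s) for every s iff [c, c] = 0.
for c in product(range(p), repeat=2):
    isotropic = sum(x * x for x in c) % p == 0
    in_all_kernels = all(abs(mu(s, c) - 1) < 1e-9 for s in range(p))
    assert isotropic == in_all_kernels
print("ker(mu_s) for all s consists exactly of the isotropic vectors of F3^2")
```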

Proposition 5.17 Let C be a convolutional code. Let Γ be any representative of the cWAM of C. Then Γ is invariant under ds for all s ∈ F if and only if C_C is totally isotropic.

Proof: As ds(0) = 0 for all s ∈ F, I only have to consider the non-trivial entries of the cWAM. Now let (X, Y) ∈ ∆ and L(X, Y) ⊆ C_C ⊆ Fⁿ such that γ_{X,Y} = cwe(L(X, Y)). If C_C is totally isotropic, then ds(cwe(L(X, Y))) = cwe(L(X, Y)) for all s ∈ F due to (5.2). Therefore ds(Γ) = Γ for all s ∈ F.

Now assume C_C is not totally isotropic. If C_const is not totally isotropic, choose c ∈ C_const such that [c, c] ≠ 0. Hence, according to Lemma 5.16, there is an s ∈ F such that cwe(c) ≠ ds(cwe(c)) ∉ N[Wa | a ∈ F]. This implies ds(cwe(C_const)) ≠ cwe(C_const). As γ_{0,0} = cwe(C_const) according to Proposition 3.12, the matrix Γ is not invariant under ds. In the sequel I may therefore assume that C_C is not totally isotropic, but C_const is. Let c ∈ C_C such that [c, c] ≠ 0. Choose (X, Y) ∈ ∆ such that c ∈ L(X, Y); then γ_{X,Y} = cwe(L(X, Y)) with L(X, Y) ⊇ c + C_const. In particular, the coefficient of the monomial cwe(c) is non-zero, say m ∈ N (more codewords in L(X, Y) may lead to this monomial). As c is not isotropic, there is an s ∈ F such that χ_{sc}(c) ≠ 1, which implies according to (5.2) that ds(m cwe(c)) = αm cwe(c) for some α ∈ C∗ \ {1}. Therefore cwe(c + C_const) ≠ ds(cwe(c + C_const)) ∉ N[Wa | a ∈ F] for the chosen s ∈ F. This implies ds(γ_{X,Y}) ≠ γ_{X,Y} and in particular ds(γ_{X,Y}) ∉ N[Wa | a ∈ F], so the entries of the matrices Γ and ds(Γ) are not in bijection. Hence Γ is not invariant under ds for all s ∈ F. □

Proposition 5.17 together with Lemma 5.10 (iii) now shows that the complexity of a self-dual convolutional code of odd characteristic whose cWAM is invariant under ds for all s ∈ F∗ is indeed δ = 0. Therefore, I cannot see the possibility of establishing a Gleason-Theorem-like result for odd characteristic that reaches beyond the classical situation. So I will concentrate on even characteristic.

It has already been shown that if p = 2, the cWAM of a self-dual convolutional code is invariant under the maps mα, ds for α ∈ F∗ and s ∈ F, independently of the complexity or dimension of the code. At the beginning of the section I pointed out that a key difference to the classical situation is that the MacWilliams transform depends on the complexity. However, it is immediate from the MacWilliams identity that the cWAM of a self-dual convolutional code of complexity δ is left invariant under the MacWilliams transform that uses MacWilliams matrices of size q^δ × q^δ. Hence the cWAM of a self-dual convolutional code of complexity δ is left invariant under the group

Kδ := 〈ds, mα, H(δ) | s ∈ F, α ∈ F∗〉,

where H(δ) denotes the MacWilliams transform for the cWAM of codes of complexity δ. In particular, K0 = K. The invertibility of the maps ds and mα has already been proven, and as they are applied entrywise they are still bijective on each Aδ for all δ ≥ 0. As for the MacWilliams transform, note that, taking into account that in characteristic 2 it is 1 = −1, one easily computes that (H(δ))² = id. Therefore Kδ is indeed a group for every δ ≥ 0. Moreover, as the maps ds and mα are homomorphisms on the C-algebra C[Wa | a ∈ F], the groups are naturally isomorphic by mapping H(δ) ↦ H(δ′) from Kδ to Kδ′. This gives the following result.

Proposition 5.18 The groups Kδ are isomorphic to each other and finite, and may therefore be called the Clifford group of degree δ. The cWAM of a self-dual convolutional code of complexity δ is an invariant of the Clifford group of degree δ.

It remains open whether the Clifford groups as given here are the largest groups with the property of leaving the cWAM of a self-dual convolutional code invariant. If one takes the rather basic proof of Gleason's Theorem as given in [25], a problem in trying to copy this proof for the setting given here is that the maps mα act trivially on any matrix in Aδ for any δ ≥ 0; that is, any element in Aδ is an invariant of any mα due to the very definition of an invariant. Therefore, the important step of Lemma 4.4 in the cited paper cannot be proven. In general the approach used there is difficult to copy, as for any set of block codewords it is always possible to give its complete weight enumerator. In convolutional coding this is very different. The complete weight adjacency matrix is only defined for codes, and it is hardly imaginable how to define a corresponding object for a set of polynomial vectors. Conversely, not all matrices in Aδ correspond to an object in F[z]ⁿ for n ∈ N, which is used at least in the proof given in [26].


6 On Concepts of Equivalence for Convolutional Codes

In Chapter 2 two different notions of isometry for block codes have been introduced, which happen to coincide due to Theorem 2.8. It is widely accepted that these are meaningful concepts and therefore they are well established in the literature. A reason for this is that they really describe when two codes are equivalent, which translates to being "equally good". For convolutional codes it has not yet been clarified when two codes perform "equally well". One reason is that it is unclear how many properties of the codes should be taken into account.

Of course, one could call two codes equivalent if they are isometric; that is, if there is an F[z]-linear weight-preserving bijection between them. Taking F = F2, it is straightforward to check that the codes im(1  z) and im(1  1) are in fact isometric. Necessarily they have the same minimum distance, but the complexity, which is an important parameter when comparing the performance of two codes, is different. This is reflected in the generalised Singleton bound as given in Proposition 5.12. As far as this bound is concerned, the code with the constant encoder matrix is clearly superior. Hence one would not like to declare these two isometric codes equivalent. The toy example and the generalised Singleton bound suggest that the complexity is an important parameter as well and that equivalent codes should share this invariant.

Due to the fact that equivalent block codes are isometric, they share the same weight enumerator. In the toy example given above, the two isometric convolutional codes do not have the same WAM. Even worse, the different complexities result in the WAMs of the two codes not even having the same size. The latter can of course easily be fixed by demanding that the codes have the same complexity in addition to being isometric. Two convolutional codes of complexity 1 that are isometric are given by the minimal encoders (1  1 + z) and (z  z + 1). The second code is easily identified to be the reversal code of the first one. As the WAM of the reversal code can be obtained from the WAM of the original code via transposition according to Corollary 3.16, the two codes share the same weight adjacency matrix if their WAM is symmetric. It is easy to calculate that the WAM of the first encoder is

( 1   W²
  W   W  ).

So they do not share the same WAM. Again, from an applications point of view both codes are indeed not equivalent, as their column distances are different (see [15], p. 110). This shows that even an isometry which preserves the complexity of the codes does not leave all parameters invariant that are important for the code's performance.

As the WAM contains most, if not all, parameters that are important for the performance of a convolutional code, it is an interesting approach to see in which way two codes that share the same WAM are connected. For block codes there is no strong connection. If two codes have the same weight enumerator they need not be isometric; there is only a weight-preserving bijection between them. This is illustrated in the following example taken from [14], Example 1.6.1.

78

Page 80: University of Groningen On the weight adjacency matrix of ...Schneider, H-G. (2008). On the weight adjacency matrix of convolutional codes. s.n. ... On the weight adjacency matrix

Example 6.1 Let F = F2 and consider the codes

C1 := imG1 = im ( 1 1 0 0 0 0
                  0 0 1 1 0 0
                  1 1 1 1 1 1 )     and     C2 := imG2 = im ( 1 1 0 0 0 0
                                                              1 0 1 0 0 0
                                                              1 1 1 1 1 1 ).

One computes the weight enumerator of both codes to be 1 + 3W² + 3W⁴ + W⁶, but they are not isometric. This can quickly be verified by exploiting that the first code is self-orthogonal, which implies G1G1ᵗ = 0, while the second is not, as the first and second rows of its encoder are not orthogonal to each other. If the codes were isometric, the MacWilliams Extension Theorem (Theorem 2.8) would imply that there is a permutation matrix P ∈ GL6(F) and a U ∈ GL3(F) such that G2 = UG1P. But then 0 ≠ G2G2ᵗ = UG1P(UG1P)ᵗ = UG1PPᵗG1ᵗUᵗ = 0, a contradiction, and so the codes are not isometric.
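The claims of Example 6.1 can be verified by brute force; a sketch (helper names are my own):

```python
from itertools import product

def span_gf2(rows):
    """All F2-linear combinations of the generator rows."""
    n = len(rows[0])
    return {tuple(sum(u * r[i] for u, r in zip(coeffs, rows)) % 2 for i in range(n))
            for coeffs in product((0, 1), repeat=len(rows))}

def weight_distribution(code):
    dist = {}
    for c in code:
        dist[sum(c)] = dist.get(sum(c), 0) + 1
    return dist

G1 = [(1, 1, 0, 0, 0, 0), (0, 0, 1, 1, 0, 0), (1, 1, 1, 1, 1, 1)]
G2 = [(1, 1, 0, 0, 0, 0), (1, 0, 1, 0, 0, 0), (1, 1, 1, 1, 1, 1)]

# Both codes have weight enumerator 1 + 3W^2 + 3W^4 + W^6 ...
expected = {0: 1, 2: 3, 4: 3, 6: 1}
assert weight_distribution(span_gf2(G1)) == weight_distribution(span_gf2(G2)) == expected

# ... but only the first is self-orthogonal: G1 G1^t = 0 over F2, G2 G2^t != 0.
def gram_is_zero(G):
    return all(sum(a * b for a, b in zip(r, s)) % 2 == 0 for r in G for s in G)

assert gram_is_zero(G1) and not gram_is_zero(G2)
print("equal weight enumerators, yet the codes are not isometric")
```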

Surprisingly, for a special class of convolutional codes a much stronger result may be derived. To do so I have to recall the concept of monomial equivalence. Two block codes of length n are monomially equivalent if there is a permutation matrix P ∈ Fⁿˣⁿ and an invertible diagonal matrix M ∈ Fⁿˣⁿ such that PM establishes an isomorphism between the codes. This definition may be applied to convolutional codes as well. It may at first sight be surprising that the matrix M is not allowed to have polynomial entries, but this is an implication of the condition that it should be invertible. Moreover, it has already been demonstrated that rescaling by z, for example, can change important invariants of the codes. The following assertion generalises a well-known fact for block codes, is easily verified, and may be found in [8].

Proposition 6.2 If the convolutional codes C and C′ are monomially equivalent, they share the same generalised WAM.

By virtue of this proposition one easily sees that the concepts of isometry and monomial equivalence do not coincide for convolutional codes, as I have given an example of isometric codes that do not share the same WAM. Hence the classical MacWilliams Extension Theorem 2.8 may not simply be transferred to convolutional codes.
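The pair of isometric codes with distinct WAMs used above can be checked mechanically. A sketch over F2 for 1 × 2 encoders with the single Forney index ν = 1 (the helper wam_unit_memory and its controller-canonical conventions are my own simplification):

```python
from itertools import product

def wam_unit_memory(d_row, c_row):
    """WAM of a 1 x n minimal encoder D + Cz with Forney index nu = 1 over F2.
    In the controller canonical form (A, B) = (0, 1); at state x with input u
    the next state is u and the output is x*C + u*D. Entry w encodes W^w."""
    wam = [[0, 0], [0, 0]]
    for x, u in product((0, 1), repeat=2):
        out = [(x * c + u * d) % 2 for c, d in zip(c_row, d_row)]
        wam[x][u] = sum(out)  # next state y = u, entry is the output weight
    return wam

# (1, 1+z): D = (1, 1), C = (0, 1);  its reversal (z, z+1): D = (0, 1), C = (1, 1).
wam1 = wam_unit_memory((1, 1), (0, 1))
wam2 = wam_unit_memory((0, 1), (1, 1))

assert wam1 == [[0, 2], [1, 1]]  # the matrix (1  W^2 ; W  W) computed above
wam2_t = [[wam2[y][x] for y in (0, 1)] for x in (0, 1)]
assert wam1 == wam2_t and wam1 != wam2  # WAMs are transposes, not equal
print("isometric codes of equal complexity with different WAMs")
```

This reproduces both the WAM computed for the first encoder and the transposition behaviour of the reversal code from Corollary 3.16.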

Theorem 6.3 Let C, C′ ⊆ F[z]ⁿ be two codes and assume that all Forney indices of C are positive. Then C and C′ are monomially equivalent if and only if their generalised WAMs coincide.

To prove this Theorem I need some preparation.

Proposition 6.4 Let G, Ĝ ∈ F[z]^{k×n} be minimal encoders with deg(G) = deg(Ĝ) = δ. Let (A, B, C, D) and (Â, B̂, Ĉ, D̂) be the associated controller canonical forms. Then the following are equivalent:


(i) G = WĜ for some W ∈ GLk(F[z]).

(ii) The systems (A, B, C, D) and (Â, B̂, Ĉ, D̂) are equivalent under the full state feedback group, that is, there exist matrices T ∈ GLδ(F), U ∈ GLk(F), M ∈ F^{δ×k} such that

Â = T⁻¹(A − MB)T,  B̂ = UBT,  Ĉ = T⁻¹(C − MD),  D̂ = UD.   (6.1)

Proof: (ii) ⇒ (i): Define the k×k-matrix V := I + B(z⁻¹I − A)⁻¹M. From systems theory it is well known [2, p. 346, Eq. (2.43)] that

B(z⁻¹I − A)⁻¹C + D = V U⁻¹( UB(z⁻¹I − A + MB)⁻¹(C − MD) + UD ),   (6.2)

thus G = V U⁻¹Ĝ. Due to nilpotency of A the matrix V is polynomial. But then W := V U⁻¹ is even unimodular, since G and Ĝ are both basic. This yields (i).

(i) ⇒ (ii): First notice that equivalence under the full state feedback group is indeed an equivalence relation. Assumption (i) implies that G and Ĝ have the same row degrees. Since reordering the rows of G retains the specific requirements of the controller canonical form, I may further assume that G and Ĝ both have row degrees ν1 ≥ … ≥ νk. Then Â = A and B̂ = B, since they are both fully determined by the row degrees. Due to reducedness of G and Ĝ, the ith row of W has degree at most νi for i = 1, …, k, see [5, Main Thm. (4)]. I will show now that

W = (I + B(z⁻¹I − A)⁻¹M) U⁻¹ for some M ∈ F^{δ×k}, U ∈ GLk(F).   (6.3)

I certainly have to put U := W(0)⁻¹ and need to find M such that B(z⁻¹I − A)⁻¹M = WU − I. The latter matrix is of the form WU − I = (∑_{j=1}^{νi} a_{ij} z^j)_{i=1,…,k} for suitable a_{ij} ∈ F^k. Using that B(z⁻¹I − A)⁻¹ = diag( (z  z² ⋯ z^{νi}) )_{i=1,…,k} ∈ F[z]^{k×δ}, one sees that the matrix M = (M1, …, Mk)ᵗ, where Mi = (a_{i1}ᵗ, …, a_{iνi}ᵗ), satisfies (6.3). Notice that if νi = 0 the result is true as well, since in that case the ith block of M is missing and a zero row appears in WU − I and B(z⁻¹I − A)⁻¹. Now I have the identity G = V U⁻¹Ĝ where, again, V = I + B(z⁻¹I − A)⁻¹M. Using (6.2) this reads as

UB(z⁻¹I − A + MB)⁻¹(C − MD) + UD = B̂(z⁻¹I − Â)⁻¹Ĉ + D̂ = Ĝ(z).   (6.4)

Hence (A − MB, UB, C − MD, UD) is a minimal realisation of Ĝ of complexity deg(Ĝ). As a consequence, (6.4) implies that the realisations (A − MB, UB, C − MD, UD) and (Â, B̂, Ĉ, D̂) are similar, and this yields (ii). □

Now I can give the proof of Theorem 6.3.

Proof: The only-if part is Proposition 6.2. Thus let me assume that Λ(C) = Λ(C′). The outline of the proof is as follows. I will consider the controller canonical forms of the two codes and show that the identity Λ(C) = Λ(C′) implies that these realisations are equivalent under the full state feedback group followed by reordering and rescaling of the output coordinates. With the aid of Proposition 6.4


I can then conclude that the two associated encoder matrices satisfy an identity of the form G′ = WGPR for some unimodular matrix W and permutation and rescaling matrices P, R. This implies that the codes are monomially equivalent. I proceed in several steps.

1) I first study the algebraic parameters of the codes and fix suitable realisations. Since the adjacency matrices have the same size, the two codes have the same degree, say δ. Let G, G′ be any minimal encoder matrices of C and C′ and let (A, B, C, D) and (A′, B′, C′, D′) be the corresponding controller canonical forms, respectively. Then the two systems have complexity δ and they form minimal realisations of the codes C and C′. Let Λ and Λ′ be the associated weight adjacency matrices. By assumption there exists some T ∈ GLδ(F) such that

Λ′_{X,Y} = Λ_{XT,YT} for all (X, Y) ∈ F^δ × F^δ.   (6.5)

In [8, Thm. 5.1] it has been proven that codes satisfying (6.5) have the same dimension and the same Forney indices. Thus let k := dim(C) = dim(C′). I may assume that both codes have their Forney indices, which are by assumption positive, in the same ordering. Let me denote them by ν1 ≥ … ≥ νk ≥ 1. Recall that δ = ∑_{i=1}^k νi. Now the controller canonical form implies A′ = A and B′ = B.

2) Next I will show that

A = T(A − MB)T⁻¹ and B = UBT⁻¹ for some matrices M ∈ F^{δ×k}, U ∈ GLk(F).   (6.6)

By definition of the weight adjacency matrix it is, for any pair (X, Y),

Y − XA ∈ imB ⟺ Λ′_{X,Y} ≠ 0 ⟺ Λ_{XT,YT} ≠ 0 ⟺ YT − XTA ∈ imB.

Putting Ā = TAT⁻¹, B̄ = BT⁻¹, I thus get

Y − XA ∈ imB ⟺ Y − XĀ ∈ im B̄.

Using X = 0 this implies im B̄ = imB and hence B = UBT⁻¹ for some U ∈ GLk(F). On the other hand, for each X ∈ F^δ there exist u ∈ F^k and Y ∈ F^δ such that Y − XA = uB; hence there exists ū ∈ F^k such that Y − XĀ = ūB. This implies X(Ā − A) = (u − ū)B. Using for X all standard basis vectors, I obtain the identity Ā = A + M̃B for some matrix M̃ ∈ F^{δ×k}. Hence TAT⁻¹ = A + M̃B = A + M̃UBT⁻¹, so that A = T(A − MB)T⁻¹ with M := T⁻¹M̃U. This in turn yields (6.6).

3) In this step I will prove that (A, B, C′, D′) and (A, B, C, D) are related via the full state feedback group followed by reordering and rescaling of the output coordinates, see (6.8) below. In order to do so I will compare the entries of the weight adjacency matrices. Consider the minimal realisation (Ā, B̄, C̄, D̄) = (TAT⁻¹, BT⁻¹, TC, D) of the code C. It is easy to see [8, Rem. 3.6] that the associated weight adjacency matrix Λ̄ satisfies Λ̄_{X,Y} = Λ_{XT,YT} for all (X, Y), and hence Equation (6.5) implies

Λ̄ = Λ′.


Now I can study the entries of these weight adjacency matrices. Since all Forney indices are positive, the matrix B has full rank k. As a consequence, for each pair of states (X, Y) ∈ F^δ × F^δ the set {XC′ + uD′ | u ∈ F^k : Y = XA + uB} has at most one element. Recalling the definition of the weight adjacency matrix in Definition 3.1 one obtains that the nonzero entries are given by

Λ′_{X,XA+uB} = Λ̄_{X,XA+uB} for all (X, u) ∈ F^δ × F^k,   (6.7)

and these entries have the value Λ′_{X,XA+uB} = W^a where a = wt(XC′ + uD′). On the other hand notice that, due to (6.6), for any (X, u) ∈ F^δ × F^k I have

XA + uB = X(TAT^{-1} − TMBT^{-1}) + uUBT^{-1} = XĀ + ūB̄, where ū = uU − XTM.

Thus Definition 3.1 yields Λ̄_{X,XA+uB} = Λ̄_{X,XĀ+ūB̄} = W^b, where b = wt(XC̄ + ūD̄) = wt(XC̄ + ūD), recalling that D̄ = D. As a consequence, (6.7) implies

wt( (X, u) [C′; D′] ) = wt( XC̄ + (uU − XTM)D ) = wt( (X, u) [C̄ − TMD; UD] )

for all (X, u) ∈ F^δ × F^k, where [C′; D′] denotes the matrix with C′ stacked on top of D′. Now [8, Lemma 5.4], which is basically MacWilliams' Equivalence Theorem for block codes, yields the existence of a permutation matrix P ∈ GL_n(F) and a nonsingular diagonal matrix R ∈ GL_n(F) such that

[C′; D′] = [C̄ − TMD; UD] PR.

With the help of (6.6) one sees that the realisation (A, B, C′, D′) of C′ is of the form

(A, B, C′, D′) = (T(A − MB)T^{-1}, UBT^{-1}, (C̄ − TMD)PR, UDPR)
              = (T(A − MB)T^{-1}, UBT^{-1}, T(C − MD)PR, UDPR).   (6.8)

4) Now I can apply Proposition 6.4 and obtain for the associated encoder matrices

G′ = WGPR for some W ∈ GL_k(F[z]).

Thus C = im G and C′ = im G′ are monomially equivalent. This completes the proof. □

Although this result is restricted to a seemingly small class of convolutional codes, it covers the codes that are of greatest interest for applications. Any code not in this class has constant codewords. The minimal distance of a code with constant codewords is less than or equal to the minimal distance of the block code it contains. Hence its minimal distance cannot be better than that of this block code, which makes such codes not particularly suitable for applications.

The result cannot be generalised further. Example 6.1 already shows that it does not hold for block codes in general. Moreover, using the two block codes from that example one can construct a convolutional code with constant codewords for which it fails.


Example 6.5 Using the rows of the encoders of Example 6.1 in a suitable way one obtains (writing matrices row by row, with rows separated by semicolons)

G = [1 1 z z 0 0; 1 1 1 1 1 1],   Ḡ = [z+1 1 z 0 0 0; 1 1 1 1 1 1] ∈ F_2[z]^{2×6}.

Both matrices are minimal. The WAMs of the associated controller canonical forms coincide and are given by

Λ = [1 + W^6, W^2 + W^4; W^2 + W^4, W^2 + W^4].

But the codes C = im G and C̄ = im Ḡ are not monomially equivalent. This can be seen by computing UG for all U ∈ GL_2(F_2[z]) such that UG is again reduced with indices 1 and 0. The only options are

U ∈ { I_2, [1 1; 0 1], [1 z; 0 1], [1 1+z; 0 1] },

and it is seen by inspection that in none of these cases UG has, up to ordering, the same columns as Ḡ (over F_2 one can disregard rescaling matrices). It is, however, not clear whether the codes are isometric.
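The weight adjacency matrix claimed in Example 6.5 can be checked by brute force over all state/input pairs. The sketch below is my own illustration, not part of the thesis: the matrices A, B, C, D are the controller canonical form of G as I read it off (Forney indices 1 and 0, hence δ = 1 and A = 0; these specific matrices are my assumption), and each WAM entry is stored as a dictionary mapping an exponent a to the coefficient of W^a.

```python
from itertools import product

def vecmat(v, M, q=2):
    """Row vector times matrix over F_q."""
    return tuple(sum(v[i] * M[i][j] for i in range(len(v))) % q
                 for j in range(len(M[0])))

def wam(A, B, C, D, q=2):
    """Weight adjacency matrix of the realisation (A, B, C, D):
    entry (X, Y) is a dict {a: c} representing the polynomial sum of c * W^a,
    where a = wt(XC + uD) ranges over all inputs u with Y = XA + uB."""
    delta, k = len(A), len(B)
    Lam = {}
    for X in product(range(q), repeat=delta):
        for u in product(range(q), repeat=k):
            Y = tuple((s + t) % q for s, t in zip(vecmat(X, A), vecmat(u, B)))
            out = tuple((s + t) % q for s, t in zip(vecmat(X, C), vecmat(u, D)))
            a = sum(out)  # Hamming weight of the output frame
            Lam.setdefault((X, Y), {})
            Lam[(X, Y)][a] = Lam[(X, Y)].get(a, 0) + 1
    return Lam

# Assumed controller canonical form of G (delta = 1, k = 2, n = 6):
A = [[0]]
B = [[1], [0]]
C = [[0, 0, 1, 1, 0, 0]]          # z-coefficients of the first row of G
D = [[1, 1, 0, 0, 0, 0],          # constant coefficients of G
     [1, 1, 1, 1, 1, 1]]

Lam = wam(A, B, C, D)
print(Lam[((0,), (0,))])  # {0: 1, 6: 1}, i.e. 1 + W^6
print(Lam[((0,), (1,))])  # {2: 1, 4: 1}, i.e. W^2 + W^4
```

Running it reproduces the entries 1 + W^6 and W^2 + W^4 stated in Example 6.5; applying the same function to the controller canonical form of Ḡ is how one checks that the two WAMs coincide.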

It remains open whether there is a similarly tight connection between monomial equivalence and a suitable notion of isometry as in the block code case. Moreover, it is completely open whether such a notion of isometry is meaningful for practical considerations of convolutional codes.


Summary

Summary of the thesis "On the weight adjacency matrix of convolutional codes"

by Gert Schneider.

Two of the most important theorems in classical block coding theory are the MacWilliams Extension Theorem and the MacWilliams identity. The first one clarifies when two codes are equivalent, that is, when they share those invariants which are most important for coding purposes. The second theorem provides a means to efficiently compute the weight enumerator of a high-dimensional code from a low-dimensional code.

Since the minimal distance can easily be derived from the weight enumerator, the practical use of the MacWilliams identity is immediate. Moreover, its impact on the mathematical aspects of coding theory is undoubted.
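To make the block-code identity concrete, here is a small self-contained sketch (my own illustration, not part of the thesis): it computes the weight enumerator of the binary [7,4] Hamming code by brute force and recovers the enumerator of its dual, the [7,3] simplex code, via the binary MacWilliams transform with Krawtchouk coefficients.

```python
from itertools import product
from math import comb

def weight_enumerator(gen, n, q=2):
    """Coefficients A_0..A_n of the code spanned by the rows of gen over F_q."""
    k = len(gen)
    A = [0] * (n + 1)
    for msg in product(range(q), repeat=k):
        cw = tuple(sum(msg[i] * gen[i][j] for i in range(k)) % q
                   for j in range(n))
        A[sum(1 for c in cw if c)] += 1
    return A

def macwilliams_dual(A, n, k):
    """Binary MacWilliams transform: B_j = 2^{-k} * sum_i A_i * K_j(i),
    where K_j is the Krawtchouk polynomial for q = 2."""
    B = []
    for j in range(n + 1):
        s = sum(A[i] * sum((-1) ** t * comb(i, t) * comb(n - i, j - t)
                           for t in range(j + 1))
                for i in range(n + 1))
        B.append(s // 2 ** k)
    return B

# Binary [7,4] Hamming code and a generator of its dual (simplex) code
G = [[1,0,0,0,0,1,1],
     [0,1,0,0,1,0,1],
     [0,0,1,0,1,1,0],
     [0,0,0,1,1,1,1]]
H = [[0,0,0,1,1,1,1],
     [0,1,1,0,0,1,1],
     [1,0,1,0,1,0,1]]

A = weight_enumerator(G, 7)          # [1, 0, 0, 7, 7, 0, 0, 1]
print(macwilliams_dual(A, 7, 4) == weight_enumerator(H, 7))  # True
```

Brute-forcing the dual costs 2^{n-k} evaluations, while the transform needs only the 2^k codewords of the primal code; this gap is exactly the practical point made above.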

Convolutional codes may be seen as a generalisation of block codes. Although convolutional codes and their advantages over block codes have been well known for several decades, the mathematical theory of convolutional codes is still underdeveloped. For example, neither of MacWilliams' theorems has been generalised to convolutional codes.

In this text I state and prove a MacWilliams identity for a weight counting object of convolutional codes which incorporates the MacWilliams identity for block codes. This MacWilliams identity is then used to briefly survey the consequences for the theory of self-dual convolutional codes.

Finally, I discuss the problems connected with generalising the MacWilliams Extension Theorem to convolutional codes, and a partial solution is given to the question as to when two convolutional codes should be called equivalent, using the weight counting object that allows for a MacWilliams identity.


Samenvatting (Summary in Dutch)

Summary of the thesis "On the weight adjacency matrix of convolutional codes"

by Gert Schneider.

The two most important results from the classical theory of block codes are the MacWilliams Extension Theorem and the MacWilliams identity. The first is a solution to the problem of equivalence of codes, i.e., codes with the same invariants (these are the most important properties of a code). The second result provides an efficient method to compute the weight enumerator of a high-dimensional code using a low-dimensional code.

The minimal distance of a code can be read off from the weight enumerator, and this is the practical use of the MacWilliams identity. Moreover, the influence of this identity on coding theory is unmistakable.

Convolutional codes can be seen as a generalisation of block codes. Although convolutional codes and their advantages over block codes have been known for several decades, the mathematics of convolutional codes is still insufficiently developed. The two results of MacWilliams, for example, have not yet been generalised to convolutional codes.

In this text I give a formulation and a proof of a MacWilliams identity for convolutional codes and a certain weight counting object. This encompasses the classical result. This MacWilliams identity is then used for a brief survey of self-dual convolutional codes.

Finally, I discuss the problems concerning a generalisation of the MacWilliams Extension Theorem and give a partial answer to the question of when two convolutional codes are equivalent, related to the above-mentioned weight counting object.


References

[1] K. A. S. Abdel-Ghaffar. On unit constraint-length convolutional codes. IEEE Trans. Inform. Theory, IT-38:200-206, 1992.

[2] P. J. Antsaklis and A. N. Michel. Linear Systems. McGraw-Hill, New York, 1997.

[3] H. Q. Dinh and S. R. Lopez-Permouth. On the equivalence of codes over finite rings. Appl. Algebra Engrg. Comm. Comput., 15:37-50, 2004.

[4] F. Fagnani and S. Zampieri. System-theoretic properties of convolutional codes over rings. IEEE Trans. Inform. Theory, IT-47:2256-2274, 2001.

[5] G. D. Forney, Jr. Minimal bases of rational vector spaces, with applications to multivariable linear systems. SIAM J. Contr., 13:493-520, 1975.

[6] G. D. Forney and M. D. Trott. The Dynamics of Group Codes: Dual Abelian Group Codes and Systems. IEEE Trans. Inform. Theory, IT-50:2935-2965, 2004.

[7] G. D. Forney, M. Grassl and S. Guha. Convolutional and Tail-Biting Quantum Error-Correcting Codes. IEEE Trans. Inform. Theory, IT-53:865-880, 2007.

[8] H. Gluesing-Luerssen. On the weight distribution of convolutional codes. Linear Algebra and its Applications, 408:298-326, 2005.

[9] H. Gluesing-Luerssen, J. Rosenthal and R. Smarandache. Strongly-MDS Convolutional Codes. IEEE Trans. Inform. Theory, IT-52:584-598, 2006.

[10] H. Gluesing-Luerssen and G. Schneider. On the MacWilliams identity for convolutional codes. IEEE Trans. Inform. Theory, IT-54:1536-1550, 2008.

[11] H. Gluesing-Luerssen and G. Schneider. State Space Realizations and Monomial Equivalence for Convolutional Codes. Linear Algebra and its Applications, 425:518-533, 2007.

[12] H. Gluesing-Luerssen and G. Schneider. A MacWilliams Identity for Convolutional Codes: The General Case. (submitted) 2008.

[13] M. Greferath and S. E. Schmidt. Finite ring combinatorics and MacWilliams' Equivalence Theorem. J. Combin. Theory Ser. A, 92:17-28, 2000.

[14] W. C. Huffman and V. Pless. Fundamentals of Error-Correcting Codes. Cambridge University Press, Cambridge, 2003.

[15] R. Johannesson and K. S. Zigangirov. Fundamentals of Convolutional Coding. IEEE Press, New York, 1999.


[16] R. Johannesson, Zhe-Xian Wan and E. Wittenmark. Some Structural Properties of Convolutional Codes over Rings. IEEE Trans. Inform. Theory, IT-44:839-845, 1998.

[17] R. Johannesson, Zhe-Xian Wan and E. Wittenmark. A note on Type II convolutional codes. IEEE Trans. Inform. Theory, IT-46:1510-1514, 2000.

[18] R. Lidl and H. Niederreiter. Finite Fields. Cambridge University Press, 1997.

[19] S. Lin and D. J. Costello, Jr. Error Control Coding: Fundamentals and Applications. Prentice Hall, 1983.

[20] F. J. MacWilliams. Combinatorial problems of elementary abelian groups. PhD thesis, Harvard University, 1962.

[21] F. J. MacWilliams. A theorem on the distribution of weights in a systematic code. Bell Syst. Tech. J., 42:79-94, 1963.

[22] F. J. MacWilliams and N. J. A. Sloane. The Theory of Error-Correcting Codes. North-Holland, 1977.

[23] R. J. McEliece. The algebraic theory of convolutional codes. In V. Pless and W. Huffman, editors, Handbook of Coding Theory, Vol. 1, pages 1065-1138. Elsevier, Amsterdam, 1998.

[24] R. J. McEliece. How to compute weight enumerators for convolutional codes. In M. Darnell and B. Honary, editors, Communications and Coding (P. G. Farrell 60th birthday celebration), pages 121-141. Wiley, New York, 1998.

[25] G. Nebe, E. M. Rains and N. J. A. Sloane. The Invariants of the Clifford Groups. Designs, Codes and Cryptography, 24:99-122, 2001.

[26] G. Nebe, E. M. Rains and N. J. A. Sloane. Self-Dual Codes and Invariant Theory. Springer, 2006.

[27] G. Nebe, E. M. Rains and N. J. A. Sloane. Codes and Invariant Theory. Mathematische Nachrichten, 274-275:104-166, 2004.

[28] H.-G. Quebbemann. On even codes. Discrete Math., 98:29-34, 1991.

[29] E. M. Rains and N. J. A. Sloane. Self-Dual Codes. In V. Pless and W. C. Huffman, editors, Handbook of Coding Theory, Vol. 1, pages 177-294. Elsevier, Amsterdam, 1998.

[30] S. Riedel. MAP Decoding of Convolutional Codes Using Reciprocal Convolutional Codes. IEEE Trans. Inform. Theory, IT-44:1176-1187, 1998.

[31] J. Rosenthal. Connections between linear systems and convolutional codes. In B. Marcus and J. Rosenthal, editors, Codes, Systems and Graph Models, pages 39-66. Springer, Berlin, 2001.


[32] J. Rosenthal and R. Smarandache. Maximum Distance Separable Convolutional Codes. Appl. Algebra Engrg. Comm. Comput., 10:15-32, 1999.

[33] W. Scharlau. Quadratic and Hermitian forms. Springer, Berlin, 1985.

[34] J. B. Shearer and R. J. McEliece. There is no MacWilliams identity for convolutional codes. IEEE Trans. Inform. Theory, IT-23:775-776, 1977.

[35] W. A. Wolovich. The use of state feedback for exact model matching. SIAM J. Contr. & Opt., 10:512-523, 1972.

[36] J. A. Wood. Duality for modules over finite rings and applications to coding theory. Amer. J. of Math., 121:555-575, 1999.

[37] J. A. Wood. Extension theorems for linear codes over finite rings. In T. Mora and H. Mattson, editors, Applied Algebra, Algebraic Algorithms and Error-Correcting Codes. Springer, Berlin, 1997.
