Compression of Ultrasonic Files

SVERKER NYSTRÖM

Master of Science Thesis
Stockholm, Sweden 2004-01-03

IR-SB-EX-0501



Compression of Ultrasonic Files

Master of Science thesis work by Sverker Nyström, December 2004 Westinghouse/WesDyne TRC Department of Signals, Sensors and Systems Royal Institute of Technology (KTH) Examiner: Björn Völcker


Contents

1 Abstract
2 Introduction
  2.1 Background
3 Ultrasonic Equipment and Techniques
  3.1 Ultrasonic Instruments
  3.2 Ultrasonic Theory
  3.3 Inspection Sequences
4 Compression Methods
  4.1 Lossless Compression
    4.1.1 Huffman Coding
    4.1.2 Lempel-Ziv Coding
  4.2 Lossy Compression
5 Transform Theory
  5.1 Fourier Transform
  5.2 Short Time Fourier Transform (STFT)
  5.3 Wavelet Transforms
    5.3.1 Scale and translation
  5.4 Standard Wavelets vs. Wavelet Packets
  5.5 Two Dimensional Transforms
  5.6 Transform Compression
  5.7 Mathematical Error
6 File Compression
  6.1 Hardware compression
  6.2 Compression Scheme
  6.3 Lossy File Compression
  6.4 Noise
7 Experimental Result and Evaluation
  7.1 Signal Pre Processing
  7.2 Error Analysis
    7.2.1 Visual Error
    7.2.2 Visual File Evaluation
  7.3 Lossless Compression Results
  7.4 Lossy Compression Evaluation Results
  7.5 Reality Compression Tests
8 Conclusions and Future Work
9 References


Appendices

A1 Defect 4
  A1.1 Mathematical Error Calculation Results
  A1.2 Visual Pulse-echo Evaluation Results
  A1.3 Pulse-echo Plots
  A1.4 Visual TOFD Evaluation Results
  A1.5 TOFD Plots
A2 Defect 5
  A2.1 Mathematical Error Calculation Results
  A2.2 Visual Pulse-echo Evaluation Results
  A2.3 Pulse-echo Plots
  A2.4 Visual TOFD Evaluation Results
  A2.5 TOFD Plots
A3 Defect 8
  A3.1 Mathematical Error Calculation Results
  A3.2 Visual Pulse-echo Evaluation Results
  A3.3 Pulse-echo Plots
  A3.4 Visual TOFD Evaluation Results
  A3.5 TOFD Plots


1 Abstract

This thesis was carried out at WesDyne International in Pittsburgh, USA, and at WesDyne TRC in Täby, Sweden. WesDyne International performs non-destructive testing using inspection techniques such as ultrasonics and eddy current. The inspection objects are mainly welds in nuclear power plants. Normally, data collection and data analysis are done on site, but there has been a wish to have some of the analysis done at headquarters instead. The task of this thesis is to investigate the possibilities of transferring inspection data from the inspection site to headquarters via a slow transmission line. Because many jobs are of short duration, it is not always cost-effective to set up a high-speed transmission line, and the only available transmission channel is then a telephone modem. The relatively large file sizes, around 200 MB, demand a high compression ratio, and the suggested solution is to use a destructive (lossy) transform compression technique based on wavelets in combination with WinZip. The results show that a compression ratio of 1:13 is achievable if a wavelet from the Daubechies family is used, which would reduce a 200 MB file to around 15 MB.


2 Introduction

2.1 Background

WesDyne International is a company that performs non-destructive testing of welds in nuclear power plants worldwide. The inspection techniques used are the ultrasonic technique (UT) and the eddy current technique (ET). An inspection team consists of around 20 persons, each with his own field of expertise. A crew can roughly be divided into three groups: equipment personnel, acquisition personnel and analyst personnel. Inspections are often performed at different sites simultaneously, which sometimes causes a shortage of qualified personnel. This shortage could be reduced if some of the inspection work could be done off site, e.g., from headquarters. Only one of the three groups mentioned above is suited for remote work, namely the analysts of the inspection data. If the collected data could be transferred from the inspection site to headquarters, the analyst personnel could evaluate data from more than one inspection at the same time and thereby reduce the shortage to some extent.

Because many jobs are of short duration, it is not always cost-effective or even possible to set up a high-speed transmission line, and the remaining alternative is to use the standard telephone network. The problem is that if a 200 MB file is transferred over the telephone network via a standard telephone modem with a maximum bit rate of 56600 bits/s, the total transfer time would theoretically be:

$$t = \frac{200\,000\,000 \times 8}{56\,600}\ \text{s} \approx 28\,269\ \text{s} \approx 7 \text{ hours } 51 \text{ minutes} \qquad (1)$$

In reality the transmission time will most likely be twice as long due to network overhead, congestion and shared use. This means that to get a reasonable transmission time of around 60 minutes the file size has to be compressed at least 15 times. A natural first step is to try one of the many commercial compression programs on the market and see how effective it is. As seen later in the report, this simple approach only yields a compression ratio of around 1:2, which is not high enough to meet the original criteria. The idea in this thesis is to investigate whether it is possible to improve this compression ratio by pre-processing the raw ultrasonic data and then, as a final step, using the well-known compression program WinZip.
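The arithmetic behind equation (1) and the "at least 15 times" requirement can be checked with a few lines of Python. This is only a sketch of the calculation in the text; the constant names are illustrative, and the factor 2 is the rough overhead estimate quoted above.

```python
def transfer_time_seconds(file_bytes, bit_rate):
    """Ideal transfer time: total number of bits divided by the line rate."""
    return file_bytes * 8 / bit_rate

FILE_BYTES = 200_000_000    # 200 MB file, as in the text
BIT_RATE = 56_600           # modem bit rate in bits/s, as in the text
OVERHEAD_FACTOR = 2.0       # "twice as long" in reality (rough estimate)
TARGET_SECONDS = 60 * 60    # desired transmission time of about 60 minutes

ideal = transfer_time_seconds(FILE_BYTES, BIT_RATE)
realistic = ideal * OVERHEAD_FACTOR
required_factor = realistic / TARGET_SECONDS

hours, remainder = divmod(round(ideal), 3600)
minutes = remainder // 60
print(f"ideal transfer time: {hours} h {minutes} min")
print(f"required compression factor: {required_factor:.1f}")
```

The required factor comes out just under 16, which agrees with the statement that the file has to be compressed at least 15 times.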


3 Ultrasonic Equipment and Techniques

3.1 Ultrasonic Instruments

The instruments used to generate the signals and collect the data are manufactured by R/D Tech (Tomoscan) and WesDyne (Intraspect). The basic functions of the two models are the same as far as data collection is concerned. What differs is, for example, the signal processing and filtering functions. The data is also stored differently. The Tomoscan uses a TIFF-based file format (RDTIFF) [Benoit95], see figure 3.1, whereas the Intraspect always stores the data the same way, see figure 3.2. To extract the data from a TIFF file, a special reading program that decodes the file structure has to be used. The data analysis is, however, done the same way on both systems.

Figure 3.1: RDTIFF file data structure. [Blocks, top to bottom: FILE HEADER; FILE CONTENTS GROUP; ...; ACQUISITION DATA GROUP; ACQUISITION DATA.]

Figure 3.2: Intraspect file data structure. [Blocks, top to bottom: FILE HEADER; ACQUISITION DATA.]


3.2 Ultrasonic Theory

The transducer is made out of a piece of piezoelectric material. The asymmetric atomic structure of the material gives it certain electromechanical properties that are used to generate sound impulses. When an electric pulse is applied to such a material it contracts or expands, depending on the polarity of the pulse. This emits a sound wave which has a wavelength proportional to the wavelength of the electric impulse. The opposite happens when a sound wave hits the surface of the material: as it contracts and expands, an electric voltage is generated that is proportional to the amplitude of the incoming wave. The crystal is normally attached to a piece of Plexiglas that works as a lens. This makes it possible to focus the sound wave at different depths.

To investigate whether a weld or a piece of metal has any defects, such as internal cracks or material defects, a sound wave is sent into the material. Sound travels at different speeds in different materials, and it is therefore possible to calculate the time t it should take for the sound wave to travel through the material, reflect on the backside and return to the transducer:

$$t = \frac{s}{v}, \qquad s = \text{distance}, \quad v = \text{velocity} \qquad (2)$$

If the transmitted wave returns after a shorter time $t' < t$, something else has reflected the pulse. This could for example be an air cavity or a crack, both of which have a lower density than metal. The frequency of the signal is selected depending on the material to be inspected, since the attenuation in the material is frequency dependent.

The pulse generated by the UT instrument has the form of a negative square wave, half a period long, see figure 3.3. The pulse length $\Delta t$ can be varied between 25 and 500 ns. The pulse amplitude can be set from 0 to 200 volts. Although the pulse from the instrument is a square wave, the actual sound wave coming out of the transducer has a sinusoidal shape due to the stiffness of the piezoelectric material, see figure 3.4. The knowledge of the pulse shape will be used later in this work.

Figure 3.3 and Figure 3.4 [Both plots: voltage U versus time t, with the pulse length Δt marked.]


The ultrasonic instruments also have a number of signal processing features. Different filters can be combined so that band pass, low pass and high pass filtering can be applied. Another form of filtering is the averaging function, which takes the average of a number of samples. If the built-in compression function is activated, a choice can be made to store only the highest peak of 2, 4, 8 or 16 consecutive samples. This can be seen as a form of down sampling which changes the relative bandwidth of the signal. The effect can be seen on the stored pulse, whose shape becomes more saw-tooth-like due to the removal of samples, see figures 3.5 and 3.6.

Two different ultrasonic techniques are used when inspecting a weld.

Pulse-echo: A sound pulse is transmitted and received by either one or two transducers. The most common penetration angles are 0 degrees, 45 degrees and 70 degrees. The aim is to detect any potential indications (cracks, air pockets etc.). When evaluating a file the amplitude of the signal is important, i.e., signal amplitudes higher than a certain level are considered to be indications and need further evaluation. Figure 3.5 shows the pulse-echo principle together with a plot of a so-called A-scan. An A-scan is a picture that shows the echo signal from a transmitted pulse.

Figure 3.5: Pulse-echo principle (top), A-scan (bottom). [Top: weld with crack, UT probe(s) and sound wave. Bottom: amplitude (axis ticks -20 and 20) versus samples.]
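The built-in compression function described above (storing only the highest peak of 2, 4, 8 or 16 consecutive samples) can be sketched as follows. How the instrument defines "highest peak" is not stated in the text; this sketch assumes it means the sample with the largest magnitude in each block, and the function name and the sample values are purely illustrative.

```python
def peak_decimate(samples, factor):
    """Keep one sample per block of `factor` consecutive samples.

    Assumption: "highest peak" is taken to mean the sample with the
    largest absolute value in the block, with its sign preserved.
    """
    if factor not in (2, 4, 8, 16):
        raise ValueError("factor must be 2, 4, 8 or 16")
    return [
        max(samples[i:i + factor], key=abs)
        for i in range(0, len(samples), factor)
    ]

# Made-up A-scan fragment, for illustration only.
a_scan = [0, 3, -7, 2, 5, -1, 0, 4, -9, 6, 1, -2]
reduced = peak_decimate(a_scan, 4)
print(reduced)
```

Because only one sample per block survives, the stored pulse loses its smooth shape, which is the saw-tooth effect mentioned in the text.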


Time of Flight Diffraction (TOFD): This method is used only if any indications have been found with the pulse-echo method and it is used to measure them with respect to depth, length and width. The important information in this method is the phase of the received signal. The amplitude is of less interest. It is important that the above mentioned signal properties are preserved as much as possible by the compression method. Figure 3.6 shows the TOFD principle together with a plot of an A-scan.

Figure 3.6: TOFD principle (top), A-scan (bottom). [Top: weld with crack, transmitting (Tr.) and receiving (Rec.) probes. Bottom: amplitude (axis ticks -10000 and 10000) versus samples.]

3.3 Inspection Sequences

The received echo signal is normally sampled at 60 MHz, and depending on the thickness of the inspected object a different number of samples is stored. One sampled echo signal is called an A-scan, see figures 3.5 and 3.6. The transducer is then moved a certain distance in the scan direction and another A-scan is collected. When a sufficient length has been scanned, the transducer is moved a short step and the scanning continues in the opposite direction, and so on until the surface has been covered, see figure 3.7.

Another way of presenting the data is to make a plot of all A-scans in one scan direction. Such a plot is called a B-scan. A B-scan plot can be seen as a 'slice' that has been cut out of the scanned object. The amplitude levels in each A-scan are either colour coded or grey scale coded, which makes the B-scans look as shown in figure 3.8. This makes it possible to see any irregularities in the material, which can be caused by, e.g., a crack. The B-scans in figure 3.8 come from a pulse-echo file and a TOFD file. Pulse-echo signals are normally colour coded, see figure 3.7, and TOFD signals grey scale coded. The bow-shaped section in the middle of the TOFD picture is what an indication typically looks like.


Figure 3.7: Scan sequence (top), colour coded A-scan (bottom). [Top labels: transducer, scan direction, step direction, A-scan direction, scan step, A-scan, B-scan view. Bottom axes: amplitude versus time.]

Figure 3.8: Pulse-echo B-scan (left), TOFD B-scan (right).


4 Compression Methods

The purpose of compressing a file is to reduce its size, and this can be done in a number of different ways. Generally, compression is achieved by representing the original data more efficiently, which can be seen as reducing the amount of redundancy. The compression ratio is defined as:

$$\text{Compression Ratio} = \frac{\text{File Size After Compression}}{\text{File Size Before Compression}} \qquad (3)$$

With this definition, a ratio of 1:13 means that the compressed file is 1/13 of the original size.

The compression techniques can be separated into two groups, lossy and lossless. In the case of lossless compression the file is completely reproducible and does not lose any information. This compression is achieved by representing the compressed data with fewer bits than the original data. Two lossless compression algorithms will be described in the section below, Huffman and Lempel-Ziv. They are both used in combination in the WinZip program.

4.1 Lossless Compression

4.1.1 Huffman Coding

The basic idea is borrowed from an older and slightly less efficient method called Shannon-Fano coding. Huffman coding belongs to the group of statistical compression algorithms. This means that the probabilities for all the symbols in a message have to be known in advance. The most probable ones are assigned the shortest codewords and so on. The encoding algorithm is as follows [Proakis, Salehi 94]:

1. Sort source outputs in decreasing order of their probabilities.
2. Merge the two least-probable outputs into a single output whose probability is the sum of the corresponding probabilities.
3. If the number of remaining outputs is 2 then go to the next step, otherwise go to step 1.
4. Arbitrarily assign 0 and 1 as code words for the two remaining outputs.
5. If an output is the result of the merger of two outputs in a preceding step, append the current codeword with a 0 and a 1 to obtain the codewords for the preceding outputs and repeat step 5. If no output is preceded by another output in a preceding step then stop.
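The steps above can be sketched in Python. This is a minimal illustration, not the implementation used in WinZip: a heap takes the place of the repeated sorting in steps 1 to 3, and the bits are attached to the codewords at each merge instead of in a separate backward pass (step 5), which gives an equivalent prefix code. The symbol probabilities are made up for the example.

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Return {symbol: codeword} for a dict {symbol: probability}."""
    if len(probabilities) == 1:
        return {symbol: "0" for symbol in probabilities}
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {symbol: ""})
            for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least-probable outputs (step 2).
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        # Distinguish the two merged groups with a leading 0 and 1.
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

# Illustrative source with four symbols (probabilities are assumptions).
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)
average_length = sum(probs[s] * len(code[s]) for s in probs)
print(code, average_length)
```

For this source the most probable symbol gets a one-bit codeword and the two least probable symbols get three-bit codewords, so the average codeword length is below the two bits per symbol of a fixed-length code.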


4.1.2 Lempel-Ziv Coding

Lempel-Ziv is sometimes referred to as a substitutional or dictionary-based encoding algorithm. The algorithm builds a dictionary of the data in an uncompressed data stream. Patterns of data (substrings) are identified in the data stream and matched to entries in the dictionary. If a substring is not present in the dictionary, a code phrase is created based on the data content of the substring and stored in the dictionary. The phrase is then written to the compressed output stream. When a matching substring occurs in the data, the phrase in the dictionary is written to the output instead. Because the phrase has a smaller physical size than the substring it represents, data compression is achieved. The example in figure 4.1 [Proakis, Salehi 94] shows a data sequence encoded with the Lempel-Ziv algorithm. Each codeword consists of the dictionary location of the longest previously seen prefix (0000 if there is none) followed by the one new bit.

Data seq. 0100001100001010000010100000110000010100001001001

Dictionary location   Dictionary contents   Codeword
 1  0001              0                     0000 0
 2  0010              1                     0000 1
 3  0011              00                    0001 0
 4  0100              001                   0011 1
 5  0101              10                    0010 0
 6  0110              000                   0011 0
 7  0111              101                   0101 1
 8  1000              0000                  0110 0
 9  1001              01                    0001 1
10  1010              010                   1001 0
11  1011              00001                 1000 1
12  1100              100                   0101 0
13  1101              0001                  0110 1
14  1110              0100                  1010 0
15  1111              0010                  0100 0
16                                          1110 1

Encoded sequence: 0000 0, 0000 1, 0001 0, 0011 1, 0010 0, 0011 0, 0101 1, 0110 0, 0001 1, 1001 0, 1000 1, 0101 0, 0110 1, 1010 0, 0100 0, 1110 1

Figure 4.1: Lempel-Ziv encoding example.
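The parsing in figure 4.1 can be reproduced with a short Python sketch. The function name and the 4-bit location width are choices made for this illustration; the sketch follows the dictionary scheme of the example (often called LZ78) and is not the variant used inside WinZip.

```python
def lz_encode(bits, index_width=4):
    """Parse `bits` into phrases that have not been seen before.

    Returns (phrases, codewords). Each codeword is the dictionary
    location of the phrase's longest known prefix (0 = empty prefix),
    written with `index_width` bits, followed by the one new bit.
    """
    dictionary = {}          # phrase -> dictionary location (1-based)
    codewords = []
    phrase = ""
    for bit in bits:
        phrase += bit
        if phrase not in dictionary:
            prefix_location = dictionary.get(phrase[:-1], 0)
            codewords.append(f"{prefix_location:0{index_width}b} {bit}")
            dictionary[phrase] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Simplification: a leftover phrase that is already in the
        # dictionary is emitted as its location only.
        codewords.append(f"{dictionary[phrase]:0{index_width}b}")
    return list(dictionary), codewords

data = "0100001100001010000010100000110000010100001001001"
phrases, codewords = lz_encode(data)
for location, (phrase, codeword) in enumerate(zip(phrases, codewords), 1):
    print(location, phrase, codeword)
```

Running the sketch on the data sequence of figure 4.1 yields the same dictionary contents and codewords as the table, ending with 1110 1 for the sixteenth phrase.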


4.2 Lossy Compression

A much higher compression ratio is achievable if a lossy compression method is used. The disadvantage is that information is lost in the process. This loss of information can be seen as a filtering operation where unwanted data is removed. The filtering can be done either in the time domain or in some other domain, e.g., the frequency domain. Fourier, cosine and wavelet transforms are often used in this type of compression, where a signal is transformed to the frequency domain. They all use the same principle, i.e., to represent a signal in the time domain with a set of basis functions in the frequency domain. The type of basis functions used is what differs between the three methods. Filtering a signal in the frequency domain can be done by setting a number of coefficients to zero and then inverse transforming the signal back to the time domain. The principle of lossy compression in the frequency domain is to set as many coefficients as possible to zero without distorting the signal too much after it has been inverse transformed. The zeroed coefficients are then easily compressed further using either of the methods described in section 4.1.

In this report I have chosen to focus on lossy compression since the compression ratio needs to be as high as possible due to the size of the files. A comparison will, however, be made with a lossless technique to show the difference. All calculations and tests will be done on six data files containing pulse-echo and TOFD data. The compression ratio will be calculated after the files have been pre-processed and compressed. The idea of this thesis is to investigate how different preparation of the data in the UT files affects the final compression result using WinZip.

With a lossy compression algorithm it is impossible to reconstruct the original signal exactly, but sometimes it can be an advantage to lose some information. Removing the noise from a noisy signal is an example of a positive loss of information. The signals in this thesis will have various amounts of noise added. The noise is normally a bigger problem when evaluating TOFD data. This is due to a preamplifier that has to be used to amplify the signal. The noise is modelled as additive white Gaussian noise. Another noise source at nuclear power plants is TIG welding equipment, which generates high-frequency noise when in use. This will not be taken into account in this report. The inspected object itself also contributes a certain amount of material noise, which depends on the size of the material grains.


5 Transform Theory

A useful definition in transform theory is the scalar product of two continuous functions, defined as:

$$\langle f, g \rangle = \int_a^b f(x)\,g(x)\,dx \qquad (4)$$

This can also be seen as the projection of the function f(x) onto the basis function g(x). Another interpretation is as a measure of how similar f(x) and g(x) are. If $\langle f, g \rangle = 0$, the two functions are called orthogonal. Figure 5.1 shows a simple projection example in a Cartesian coordinate system.

Figure 5.1: Vector projection example. [Vectors A = (0, 3), B = (4, 0) and V = (3, 1).]

$$V \cdot A = (3 \cdot 0) + (1 \cdot 3) = 3$$
$$V \cdot B = (3 \cdot 4) + (1 \cdot 0) = 12$$
$$A \cdot B = (0 \cdot 4) + (3 \cdot 0) = 0 \;\Rightarrow\; A, B \text{ orthogonal}$$

5.1 Fourier Transform

A well-known and very useful transform is the Fourier transform, developed by the French baron Jean-Baptiste Joseph Fourier in the 19th century. His idea was to represent a periodic signal f(t) with period T as a sum of weighted sine and cosine functions:

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \bigl( a_n \cos(n\omega_T t) + b_n \sin(n\omega_T t) \bigr), \qquad \omega_T = \frac{2\pi}{T} \qquad (5)$$


This sum is called the Fourier series of a signal [Petersson 97]. The basis functions used in the Fourier series are sine and cosine, which have the following properties:

$$\langle \cos(nt), \cos(mt) \rangle = 0, \quad n \neq m \qquad (6)$$

$$\langle \sin(nt), \sin(mt) \rangle = 0, \quad n \neq m \qquad (7)$$

$$\langle \cos(nt), \sin(mt) \rangle = 0, \quad \text{for all } n, m \qquad (8)$$

The coefficients $a_0, a_n, b_n$ can easily be calculated due to the orthogonality of the basis functions. A continuous aperiodic signal cannot be written as a Fourier series, but it can be written as a Fourier integral [Petersson 97]:

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-j\omega t}\,dt \qquad (9)$$

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\,e^{j\omega t}\,d\omega \qquad (10)$$

Equation (9) is the Fourier transform (FT) and equation (10) the inverse Fourier transform (IFT). The function $F(\omega)$ is called the Fourier transform of $f(t)$, and conversely $f(t)$ is called the inverse Fourier transform of $F(\omega)$. It should be noted that a Fourier transformed signal does not contain any information about where the different frequencies occur in time. It only gives the overall spectral content of the signal. This is due to the assumption that the signal to be transformed is stationary. One criterion of stationarity is that the frequency content does not change over time.

In the discrete time domain a discrete version, the discrete Fourier transform (DFT), has to be used. The formulas for the DFT and the inverse DFT (IDFT) are [Proakis, Manolakis 96]:

$$X(k) = \sum_{n=0}^{N-1} x(n)\,e^{-j 2\pi k n / N}, \qquad k = 0, 1, 2, \ldots, N-1 \qquad (11)$$

$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\,e^{j 2\pi k n / N}, \qquad n = 0, 1, 2, \ldots, N-1 \qquad (12)$$
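Equations (11) and (12) translate almost directly into code. The following Python sketch evaluates both sums literally, which costs on the order of N squared operations; practical programs use a fast Fourier transform instead. The four-sample test signal is an arbitrary choice for illustration.

```python
import cmath

def dft(x):
    """Discrete Fourier transform, a direct evaluation of equation (11)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

def idft(X):
    """Inverse discrete Fourier transform, a direct evaluation of equation (12)."""
    N = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]

x = [1.0, 2.0, 0.0, -1.0]
X = dft(x)
x_back = idft(X)
print([round(abs(c), 6) for c in X])
print([round(c.real, 6) for c in x_back])
```

Transforming and then inverse transforming returns the original samples up to rounding error, which illustrates that the transform pair by itself loses no information.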


5.2 Short Time Fourier Transform (STFT)

To overcome the time resolution problem of the Fourier transform, the signal is cut into small slices, each of which is then Fourier transformed. This can be seen as moving a rectangular window along the signal $\Delta t$ time units at a time, as shown in figure 5.2. At each instant the window function w(t) is multiplied with the signal f(t) and the product is then Fourier transformed.

Figure 5.2: Windowing principle. [Plots of f(t), w(t) and f(t)w(t) versus t.]

To avoid large Fourier coefficients due to the sharp edges of the window function in figure 5.2, smoother window functions are normally used. Hanning, Hamming and Bartlett are examples of commonly used windows. Figure 5.3 shows a smoother window function.

Figure 5.3: Smoother window function, w(t). [Plots of f(t), w(t) and f(t)w(t) versus t.]


The resulting local time-frequency analysis procedure is called the Short Time Fourier Transform (STFT) or windowed Fourier transform. The STFT is defined as

$$\mathrm{STFT}(\tau, \omega) = \int f(t)\,w^*(t - \tau)\,e^{-j\omega t}\,dt \qquad (13)$$

where * denotes the complex conjugate. The window function thus controls the time-frequency resolution according to:

Narrow window: good time resolution, poor frequency resolution.
Wide window: good frequency resolution, poor time resolution.

This time-frequency resolution compromise has its roots in the Heisenberg uncertainty principle. Applied to signals, it states that one cannot know the exact time-frequency representation of a signal, i.e., one cannot know what spectral components exist at what instants of time. What can be known is which frequency bands exist in which time intervals.
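A discrete counterpart of equation (13) can be sketched by sliding a window along a sampled signal and applying the DFT to each windowed slice. The sketch below uses a Hanning window; the window length, hop size and test signal are assumptions chosen only to make the effect visible.

```python
import cmath
import math

def dft(x):
    """Direct DFT of a sequence (same definition as equation (11))."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

def hanning(length):
    """Hanning window: a smooth window that tapers to zero at both ends."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def stft(signal, window_length, hop):
    """Return one DFT spectrum per window position (frame)."""
    w = hanning(window_length)
    frames = []
    for start in range(0, len(signal) - window_length + 1, hop):
        segment = [signal[start + n] * w[n] for n in range(window_length)]
        frames.append(dft(segment))
    return frames

def dominant_bin(spectrum):
    """Index of the strongest component among the positive frequencies."""
    half = len(spectrum) // 2
    return max(range(half), key=lambda k: abs(spectrum[k]))

# Test signal whose frequency changes halfway: 4 cycles per window in the
# first half, 12 cycles per window in the second half.
L = 32
signal = ([math.sin(2 * math.pi * 4 * n / L) for n in range(128)]
          + [math.sin(2 * math.pi * 12 * n / L) for n in range(128)])
frames = stft(signal, window_length=L, hop=16)
print([dominant_bin(frame) for frame in frames])
```

A single Fourier transform of the whole signal would show both frequencies but not when they occur; the sequence of frame spectra shows the change from the lower to the higher frequency over time.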


5.3 Wavelet Transforms

Wavelet transforms have become very popular within signal processing during the past decade, especially in the field of image compression (JPEG 2000). The wavelet transform is very similar to the Short Time Fourier Transform, i.e., it uses a window function to solve the time resolution problem, and the analysis is done in a similar way. There is, however, one main difference between the STFT and the wavelet transform: the width of the window function is changed for every single spectral component. This means that the time-frequency resolution is not fixed as it is in the STFT.

The variable time-frequency resolution of the wavelet transform (WT) compared to the FT and the STFT is shown in figure 5.4. The pictures show that the STFT and the wavelet transform, in contrast to the FT, have a time resolution as well as a frequency resolution. It can be seen how the wavelet transform has a varying time-frequency resolution depending on the frequency range of the signal. The high-frequency content of a signal gets a lower frequency resolution but a better time resolution than the low-frequency content.

Figure 5.4: Time-frequency resolutions for a) FT, b) STFT and c) WT. [Three panels, each with frequency versus time axes.]


The formula for the continuous wavelet transform is

$$\gamma(s, \tau) = \int f(t)\,\psi^*_{s,\tau}(t)\,dt \qquad (14)$$

where * denotes the complex conjugate. This shows how a signal f(t) is decomposed into a set of basis functions $\psi_{s,\tau}(t)$, where the two variables s and $\tau$ represent scale and translation. The function $\psi_{1,0}(t)$ is called the mother wavelet, and all wavelets are generated from this single function according to

$$\psi_{s,\tau}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t - \tau}{s}\right) \qquad (15)$$

where $\frac{1}{\sqrt{s}}$ is a normalization factor for the energy at different scales.

The mother wavelet is similar to the windowing function in the STFT. A difference between wavelet transforms and other transforms, e.g., the Fourier transform, is that the basis function $\psi(t)$ can be chosen and designed by the user as long as it satisfies certain conditions. Each wavelet family has a number of subclasses distinguished by the number of coefficients. The wavelets in each family are often classified by the number of vanishing moments. This is an extra set of mathematical relationships that the coefficients must satisfy, and it is directly related to the number of coefficients. A higher number gives a smoother wavelet due to the increased number of coefficients. Figure 5.5 shows the Daubechies wavelet with three different numbers of vanishing moments and the Haar wavelet. The wavelets used in this report are the Daubechies wavelet with 8 vanishing moments and the Haar wavelet.


Figure 5.5: Daubechies and Haar wavelets. [Panels: Daubechies with 2 vanishing moments; Daubechies with 4 vanishing moments; Daubechies with 8 vanishing moments; Haar.]

From a formal point of view the mother wavelet $\psi(t)$ has to satisfy a number of conditions, and the two most important ones are the admissibility and the regularity conditions [Valens 99]. The admissibility condition is defined as

$$\int \frac{|\Psi(\omega)|^2}{|\omega|}\,d\omega < +\infty, \qquad \Psi(\omega) = \text{Fourier transform of } \psi(t) \qquad (16)$$

This implies that the function $\Psi(\omega)$ vanishes at zero frequency, i.e.,

$$|\Psi(\omega)|^2 \,\big|_{\omega = 0} = 0 \qquad (17)$$

This means that the average value in the time domain must be zero as well, i.e., the positive and negative areas under the curve must cancel out:

$$\int \psi(t)\,dt = 0 \qquad (18)$$

A function with these properties is an oscillating function, which means that $\psi(t)$ is a wave. This is where the word wavelet comes from. The regularity condition has to do with the approximation order of the wavelet transform and the decay of the coefficients $\gamma(s, \tau)$: if the wavelet has N vanishing moments, then the approximation order of the wavelet is also N.

5.3.1 Scale and translation

The two parameters s (scale) and $\tau$ (translation) are used when a signal f(t) is transformed. The s parameter can be seen as a zooming tool which dilates and compresses the mother wavelet and thus changes the frequency resolution. A small s value compresses the mother wavelet, whereas a high value dilates it, i.e., stretches it out. The $\tau$ parameter is used to slide the wavelet over the signal at the different scales. At every translation $\tau$ the signal f(t) is multiplied with the wavelet and integrated over all times. This procedure is repeated until the end of the signal is reached. Then the scale is changed and it all repeats again. When the scale has a value that makes the wavelet curve similar to the signal f(t) at a certain time instant, the multiplication and integration give a large value compared to when they do not match very well.

Figures 5.6, 5.7 and 5.8 illustrate how an input signal is affected by a wavelet with 3 different scale values. In this example the input signal f(t) (figure 5.6) is a wavelet called the Mexican hat. Figure 5.7 shows 3 plots of the Daubechies wavelet with 8 vanishing moments and 3 different scale values. The last plot, figure 5.8, shows how the output signal varies in amplitude depending on how well the input signal f(t) matches the wavelet. From the output plot it can be seen that the middle wavelet matches the input signal best, as it gives the highest output of all 3 wavelets.

Figure 5.6: Mexican hat wavelet as input signal f(t). [Axes: amplitude versus time t.]

Figure 5.7: Daubechies wavelets with 3 different scale values. [Axes: amplitude versus time t.]

Figure 5.8: The output signals from the 3 differently scaled wavelets with the Mexican hat as input signal. [Axes: amplitude versus translation τ.]
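The matching behaviour described in section 5.3.1 can be reproduced numerically with equations (14) and (15). The sketch below is not the experiment of figures 5.6 to 5.8: the Daubechies wavelet has no closed-form expression, so the Mexican hat is used here both as the input signal and as the analysing wavelet. The integration limits, step count and scale values are assumptions made for this illustration.

```python
import math

def mexican_hat(t):
    """Mexican hat wavelet (second derivative of a Gaussian), unnormalised."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)

def cwt_coefficient(f, psi, s, tau, t_min=-20.0, t_max=20.0, steps=4000):
    """Approximate gamma(s, tau) from equation (14) with a midpoint sum.

    The scaled and translated wavelet follows equation (15). The wavelet
    is real-valued, so the complex conjugate has no effect.
    """
    dt = (t_max - t_min) / steps
    total = 0.0
    for i in range(steps):
        t = t_min + (i + 0.5) * dt
        total += f(t) * psi((t - tau) / s) / math.sqrt(s)
    return total * dt

scales = [0.5, 1.0, 2.0]
responses = {s: cwt_coefficient(mexican_hat, mexican_hat, s, tau=0.0)
             for s in scales}
best_scale = max(responses, key=responses.get)
print(responses, best_scale)
```

The largest coefficient is obtained at the scale where the wavelet has the same width as the input signal, in the same way as the middle wavelet gives the highest output in figure 5.8.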

5.4 Standard Wavelets vs. Wavelet Packets

Wavelet packets are outside the scope of this report but will be briefly described in this section because of their relation to standard wavelets. A description of the DWT (Discrete Wavelet Transform) makes wavelet packets easier to understand.

A time-scale representation of a digital signal is obtained using digital filtering techniques. The CWT (Continuous Wavelet Transform) is computed by changing the scale of the analysis window, shifting the window in time, multiplying it by the input signal and integrating over all times. In the discrete case, filters with different cut-off frequencies are used to analyze the input signal at different scales. The DWT analyzes the input signal in different frequency bands with different resolutions by decomposing it into a coarse approximation and detail information with the use of two sets of functions, called the scaling function and the wavelet function. The scaling function is associated with low pass filtering and the wavelet function with high pass filtering. The decomposition of the signal is then obtained by successive high pass and low pass filtering of the time domain input signal, see figure 5.9. The filtering operation corresponds to convolution of the signal x[n] with the impulse response of the filter h[n]:

$$y[n] = x[n] * h[n] = \sum_{k=-\infty}^{\infty} x[k]\,h[n - k] \qquad (19)$$


Figure 5.9: Low and high pass filtering of the signal x[n]. [x[n] is fed to l[n] (LP filter) and h[n] (HP filter), giving $y_{low}[n]$ and $y_{high}[n]$.]

If the original signal x[n] has a frequency content of $(0, f_s/2)$ and the HP and LP filters used are half-band filters, the filtered outputs $y_{low}[n]$ and $y_{high}[n]$ will have the frequency bands $(0, f_s/4)$ and $(f_s/4, f_s/2)$ respectively. After the filtering operation, half of the samples can be eliminated according to Nyquist's rule, and the signal is therefore decimated by 2 by discarding every other sample. Figure 5.10 shows the filtering and decimation process.

Figure 5.10 [The filter pair of figure 5.9, with each output followed by decimation by 2.]

A signal that is passed through such a filter is said to be decomposed one level. The level of decomposition is varied by repeatedly filtering the LP part of the signal until only two samples remain, which means that the signal has been fully decomposed. A three level DWT decomposition of the signal x[n] is shown in figure 5.11 as an example.
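The filter-and-decimate scheme can be illustrated with the Haar wavelet, whose low pass and high pass filters have only two coefficients each. The following Python sketch is a simplified illustration with made-up sample values; it assumes that the signal length is divisible by 2 at every level, and it continues until the requested number of levels is reached rather than stopping at two samples.

```python
import math

def haar_step(x):
    """One decomposition level: half-band filtering and decimation by 2."""
    r = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * r for i in range(0, len(x), 2)]  # low pass
    detail = [(x[i] - x[i + 1]) * r for i in range(0, len(x), 2)]  # high pass
    return approx, detail

def haar_dwt(x, levels):
    """Standard DWT: only the low pass branch is decomposed further."""
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

# Eight made-up samples, decomposed three levels as in figure 5.11.
x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_dwt(x, levels=3)
print(approx)
for level, detail in enumerate(details, 1):
    print(level, detail)
```

Each level halves the number of samples in the low pass branch, so the eight input samples end up as 4 + 2 + 1 detail coefficients plus one approximation coefficient, and the total signal energy is preserved.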


Figure 5.11: Three level DWT decomposition of signal x[n]. As can be seen only the LP part of the signal is passed on to the next level of filtering and this is what differs between the wavelet packet and the wavelet decomposition. A wavelet packet decomposition uses both the LP and the HP part of the filtered signal, see figure 5.12, which will increase the possibilities of an efficient representation of the signal.

Figure 5.12: Wavelet packet decomposition scheme. Without going too deep into the theory of wavelet packets an advantage is the possibility to choose the best basis for a given application. This means that the optimal representation of a signal is calculated with the help of a so called cost function [Jensen, la Cour-Harbo 01]. After the signal has been fully decomposed, the cost function is applied to the elements in each level of the decomposition. The best representation of the signal is then found by taking the elements which correspond to the lowest cost values. This means that the user can design the cost function in such way that it chooses the best basis for a specific application. One example is the picture format JPEG 2000 where the cost function has been designed according to the sensitivity of the human eye. The use of different cost functions will change the time- frequency resolution of a signal and it will no longer always look like it does in figure 5.4c. Figure 5.13 shows four example plots of different time- frequency resolutions obtained with different cost functions. In this report we will however stick with the description in figure 5.4c.


Figure 5.13: Examples of time-frequency representations obtained with different cost functions.

5.5 Two Dimensional Transforms

The transforms dealt with so far have all been applied to one dimensional data sequences. As described in section 3.3, the scan pattern forms a two dimensional picture (or three dimensional if the amplitude is taken into account). This makes it possible to apply a two dimensional (2D) transform to the data. If the scanned data is viewed as an M × N matrix, the 2D discrete Fourier transform is defined as:

$$F(u,v) = \frac{1}{MN}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi\left(\frac{ux}{M}+\frac{vy}{N}\right)}, \qquad M = \text{rows},\; N = \text{columns} \qquad (20)$$

When a 2D transform is applied to a data matrix it involves a number of 1D transformations. More precisely, each row is first transformed and replaced by its transform, and then each column is transformed and replaced by its transform. The same principle applies to the 2D wavelet transforms.
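The row and column procedure can be checked numerically. The following is a minimal sketch in plain Python, assuming the 1/(MN) scaling of equation (20); it uses a direct (slow) DFT so that it needs no external library, and the function names are illustrative.

```python
import cmath

def dft(seq):
    """Plain 1D discrete Fourier transform (no normalisation)."""
    n = len(seq)
    return [sum(seq[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n))
            for u in range(n)]

def dft2_separable(f):
    """2D DFT: transform every row, then every column, scale by 1/(MN)."""
    m, n = len(f), len(f[0])
    rows = [dft(row) for row in f]
    # cols[v][u] is the transform along the row index x for column v
    cols = [dft([rows[x][v] for x in range(m)]) for v in range(n)]
    return [[cols[v][u] / (m * n) for v in range(n)] for u in range(m)]

def dft2_direct(f):
    """Direct evaluation of the double sum, for comparison."""
    m, n = len(f), len(f[0])
    return [[sum(f[x][y] * cmath.exp(-2j * cmath.pi * (u * x / m + v * y / n))
                 for x in range(m) for y in range(n)) / (m * n)
             for v in range(n)] for u in range(m)]

f = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]           # M = 2 rows, N = 3 columns
F_sep = dft2_separable(f)
F_dir = dft2_direct(f)
```

Both routes give the same coefficients, which is why a 2D transform can reuse any fast 1D implementation.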


5.6 Transform Compression

The transform operation itself is not lossy. By definition, a transform takes a signal or a function from one domain to another, e.g. from the time domain to the frequency domain, without any loss of information. The reason it is described in this section is that it can be used as a filter, and then it becomes lossy. The following example illustrates the principle of using a Fourier transform to compress a signal by setting the smallest coefficients to zero. As a comparison, the same operation is performed on the signal in the time domain. We wish to get a 1:2 compression ratio by setting the 50% smallest values equal to zero. The signal f(t) is 3000 samples long and consists of two sinusoids with different frequency and amplitude, see figure 5.14. The notation F(f) is used to denote the signal f(t) in the frequency domain.

$$f(t) = \sin(\omega_0 t) - \theta(t-1500)\left(\sin(\omega_0 t) - 0.05\sin(50\,\omega_0 t)\right) \qquad (21)$$

$$\omega_0 = 2\pi f_0, \qquad \theta = \text{Heaviside step function}$$

Figure 5.14: The original sampled signal f(nT).

In the time domain the zeroing operation would result in the signal g(t), which is plotted in figure 5.15.

Figure 5.15: Filtered sampled signal g(nT).

This signal is completely different from the original signal and it contains only one frequency. If the original signal f(t) is Fourier transformed, the result has two peaks of different height corresponding to the two sinusoids, see figure 5.16. The difference in height reflects the different energies of the two sinusoids, which is due to their different amplitudes.

Figure 5.16: F(f) with normalized energy (energy versus frequency, from −f_s/2 to f_s/2).

If the same operation is done on F(f), i.e. the 50% smallest coefficients are set to zero, which gives E(f), and the result is then inverse transformed, the signal will look like figure 5.17.

Figure 5.17: Inverse transform of filtered signal E(f).

This signal is almost identical to the original and it has been reconstructed with only half the number of coefficients. This is the basic principle of how to lossy compress signals with the use of transforms and still achieve a good result. The result does however depend on what kind of signal it is. In this example the signal contained only two sinusoids, and since the Fourier transform has sinusoids as its basis functions a good result should be expected.

A downside of the Fourier transform is that it has no time resolution. It shows which frequencies a signal contains but not where in time they occur. This is no problem if the signal is stationary, i.e. all frequencies exist at all times. The signal f(t) is however non-stationary. To visualize this effect a stationary signal s(t), containing the same two frequencies, is Fourier transformed. The signal is:

$$s(t) = \sin(\omega_0 t) + 0.05\sin(50\,\omega_0 t), \qquad \omega_0 = 2\pi f_0 \qquad (22)$$

In this signal the two frequencies occur at the same time and have the same amplitude and frequency as in the previous signal f(t). Figure 5.18 shows a plot of s(t).
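The zeroing experiment on f(t) described earlier in this section can be reproduced with a short script. The sketch below makes several assumptions that are not taken from the thesis: the record is 1024 samples instead of 3000 (so that a simple radix-2 FFT can be used), the low frequency is chosen as 8 cycles per record, the step occurs in the middle of the record, and the normalised squared error is used to compare the results.

```python
import cmath
import math

def fft(a, inverse=False):
    """Recursive radix-2 FFT; len(a) must be a power of two. No 1/N scaling."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def zero_smallest(values, fraction):
    """Set the given fraction of smallest-magnitude entries to zero."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]))
    out = list(values)
    for i in order[:int(fraction * len(values))]:
        out[i] = 0
    return out

def rel_error(x, y):
    """Squared error normalised by the energy of x."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / sum(a * a for a in x)

N = 1024                                   # assumed record length
w0 = 2 * math.pi * 8 / N                   # assumed low frequency
f = [math.sin(w0 * n) if n < N // 2 else 0.05 * math.sin(50 * w0 * n)
     for n in range(N)]

g = zero_smallest(f, 0.5)                  # zeroing in the time domain
E = zero_smallest(fft(f), 0.5)             # zeroing in the frequency domain
f_rec = [c.real / N for c in fft(E, inverse=True)]

err_time = rel_error(f, g)
err_freq = rel_error(f, f_rec)
```

With these assumptions the frequency domain zeroing gives a much smaller error than the time domain zeroing, which removes almost the whole weak high frequency part of the signal.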


Figure 5.18: The sampled signal s(nT).

To show the similarities between F(f) and S(f) two comparative plots have been made. Figure 5.19 shows F(f) again and figure 5.20 shows S(f).

Figure 5.19: F(f) with normalized energy (energy versus frequency, from −f_s/2 to f_s/2).

Figure 5.20: S(f) with normalized energy.

As can be seen, it is not possible to determine which of the two frequency plots comes from which signal.

5.7 Mathematical Error

The error can be defined in many different ways. The error measure used here is defined as:

$$\text{Error} = \frac{\sum_{i=1}^{m} (x_i - \hat{x}_i)^2}{\sum_{i=1}^{m} x_i^2}, \qquad m = \text{number of samples} \qquad (23)$$

where x is the original signal and x̂ is the reconstructed signal after compression. The squared error from each compression method is thus divided by the sum of the squares of the original samples, and the result is plotted in a diagram. This way of measuring the error is motivated by the compression technique used, i.e. setting small coefficient values to zero. Plots of the calculated errors are found in section 7.2. The error tables are sorted defect wise in the appendices.
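As a small illustration, the error measure of equation (23) can be written as a function. This is a sketch in Python rather than the Matlab code used for the thesis, and the name `compression_error` is illustrative.

```python
def compression_error(original, reconstructed):
    """Squared error normalised by the energy of the original signal, as in (23)."""
    if len(original) != len(reconstructed):
        raise ValueError("signals must have the same length")
    num = sum((x - xh) ** 2 for x, xh in zip(original, reconstructed))
    den = sum(x * x for x in original)
    return num / den

x = [3.0, -4.0, 0.5, 0.0]
err_identical = compression_error(x, x)                # 0.0: nothing was lost
err_all_zero = compression_error(x, [0.0] * len(x))    # 1.0: everything was zeroed
```

A perfect reconstruction gives 0, and a reconstruction that consists only of zeros gives 1, which makes the values comparable between files with different signal levels.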


6 File Compression

The examined files contain three cracks in total and each crack has been scanned with one pulse-echo and one TOFD probe. This gives a total of six files to be compressed. To make sure no false “indications” are created in the compression process, one of the examined cracks is below detection level, which means that it is a crack but not big enough to be reportable. To save computational time the files have been modified so that they only contain the defect areas. This has saved a lot of time as each file has been compressed and decompressed four times using three different compression algorithms.

The software used is Matlab Student Version 6.0 R12 with the Wavelet toolbox and Borland C++ 5.02. All compression work has been done with Matlab, and the data extraction and file modification have been done with Borland C++. To be able to read the headers in the UT files a special RDTIFF reading program, RDTV from R/D Tech, has been used. This program makes it possible to determine the size of the raw UT data and where it is stored in the file.

6.1 Hardware compression

The hardware compression function built into the Tomoscan is used on the pulse-echo files in this report. It has been set to save the largest of 8 consecutive samples. This compression/down sampling operation distorts the original sinusoidal shaped pulse and makes it more saw tooth like. The cutting of the signal by the down sampling process adds a small contribution of high frequency components to its frequency spectrum. A broader frequency spectrum means fewer small coefficients in the frequency domain, so the signal shape starts to be affected at a lower compression ratio. Hardware compressed pulse-echo files are therefore more sensitive to high compression ratios than TOFD files. Figures 6.1 and 6.2 show an A-scan from a TOFD file and from a pulse-echo file, taken from one of the defect files examined in the report. It is clearly visible how the down sampling operation affects the signal. An advantage of this down sampling technique compared to just taking every 8th value is that the amplitude resolution gets higher.
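The difference between the two down sampling methods can be shown with a few lines of Python. This is only a sketch of the principle, not the Tomoscan implementation: it assumes that "the largest" means the largest signed value in each block, and the function names are illustrative.

```python
def peak_downsample(samples, block=8):
    """Keep the largest value of every `block` consecutive samples."""
    return [max(samples[i:i + block]) for i in range(0, len(samples), block)]

def decimate(samples, block=8):
    """Plain down sampling: keep every `block`-th sample."""
    return samples[::block]

a_scan = [0, 1, 3, 7, 9, 6, 2, 0,
          -1, -2, -5, -3, -1, 0, 1, 2]

peaks = peak_downsample(a_scan)   # [9, 2]
every_8th = decimate(a_scan)      # [0, -1]
```

The peak of the first block (9) survives the peak down sampling but is lost completely when only every 8th sample is kept.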


Figure 6.1: A-scan from a TOFD file, no hardware compression (amplitude versus samples).

Figure 6.2: A-scan from a pulse-echo file with hardware compression (amplitude versus samples).

6.2 Compression Scheme

As mentioned in the preface, the objective of this report is to investigate whether a high enough compression ratio is achievable if the UT files are pre processed in some way before the final WinZip compression. A flow chart of the complete compression and decompression process is shown in figure 6.3. The compression described here has nothing to do with the hardware compression in section 6.1; these files have been compressed after the data has been collected and saved.

Figure 6.3: File compression scheme and file decompression scheme.

File compression scheme: UT-file → Extract UT-data → UT-data and File Header → Transform (UT-data) → Lossy Compression (UT-data) → WinZip Compression (UT-data and File Header) → Rebuild Compressed File → Send File.

File decompression scheme: Receive Compressed File → Split File → WinZip Decompression (File Header and UT-data) → Inverse Transform (UT-data) → Rebuild Original File.

6.3 Lossy File Compression

The pre processing of the data in this report is of lossy character only, since the aim is to get as high a compression ratio as possible. Three slightly different transforms have been studied:

• The Fourier transform. This method was selected because it is a very common transform and because it can be implemented using fast algorithms (the Fast Fourier Transform). The basis functions are also very similar to the UT transmitting pulse, which is an advantage.

• The Haar wavelet transform. This is the simpler of the two wavelets used and thus the simplest to implement on a computer. It was chosen out of interest to see how the shape of the basis functions affects the final result.

• The Daubechies wavelet transform. A widely used transform known for its good results in e.g. image compression. A transform with eight vanishing moments is used in this report, which hopefully will give a good result. The downside of this transform is that it requires more calculations than the Haar transform.

The files have been compressed with four different compression ratios and only with the 2D compression method. A comparison with the 1D transform would be interesting but would be too time consuming due to the extra evaluation work required. The compression ratios are 1:5, 1:6.7, 1:20 and 1:100, i.e. 20%, 15%, 5% and 1% saved coefficients. A compression ratio of 1:5 means that one fifth of the transformed coefficients have been saved and that four fifths have been set to zero.

6.4 Noise

The addition of unwanted noise is always a problem because it distorts the signal and thus makes the data more difficult to analyse. The noise has various sources. The most common is thermal noise, which is present all the time. The two methods used (pulse-echo and TOFD) are not equally sensitive to noise because they are evaluated differently. As the purpose of the pulse-echo method is to detect cracks, the only interest is the amplitude of the received signal and normally the noise is not a problem. The TOFD method on the other hand requires the use of a preamplifier to boost the received signal, which means that the noise is amplified as well. When a TOFD signal is evaluated the phase of the received signal is what is important, and the added noise often makes it hard to determine the phase. The UT instruments have a built in noise reducing feature called averaging, which means that the average of 2, 4, 8 or 16 consecutive A-scans is taken. This reduces the noise to some extent. One downside of this function is that the scanning speed has to be reduced when higher averaging is used, e.g. 8 and 16.
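The effect of the averaging function can be illustrated with a simulation. The sketch below is not based on the instrument's implementation: it assumes that every A-scan contains the same echo plus independent Gaussian noise, in which case averaging K scans reduces the noise level by roughly a factor √K. All names and signal parameters are illustrative.

```python
import math
import random

def average_scans(scans):
    """Sample-by-sample mean of a list of equally long A-scans."""
    k = len(scans)
    return [sum(col) / k for col in zip(*scans)]

def rms(values):
    """Root mean square value of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

random.seed(1)
n = 2000
clean = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]

def noisy_scan():
    """The clean signal plus independent Gaussian noise (assumed noise model)."""
    return [c + random.gauss(0.0, 1.0) for c in clean]

single = noisy_scan()
averaged = average_scans([noisy_scan() for _ in range(16)])

noise_single = rms([s - c for s, c in zip(single, clean)])
noise_avg = rms([a - c for a, c in zip(averaged, clean)])
```

With 16 averaged scans the remaining noise in this simulation is about a quarter of the noise in a single scan, at the price of 16 times as many recorded A-scans.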


7 Experimental Result and Evaluation

7.1 Signal Pre Processing

As mentioned in sections 2.1 and 6.2, the UT files have to be pre processed before they are compressed. This is done with a small C program that reads the UT data from the original file, which is a form of TIFF file called RDTIFF. The program extracts all the UT data by reading the address pointers in the file and then stores the data as one long vector instead of being spread out as it is in an RDTIFF file. The raw UT data is then transformed to the frequency domain and compressed with different ratios; in a real application it would then also be WinZipped. During the analysis the file is just inverse transformed after the compression and then evaluated. Before the evaluation the UT data has to be put back into the original UT file by another small C program.

7.2 Error Analysis

The error analysis is divided in two parts. Part one is a mathematical analysis where the error is calculated for different compression ratios and compression techniques. Part two is a visual evaluation of the compressed files based on criteria used in reality. The mathematical error is defined in section 5.7. The result of the error calculations is presented in graphical form below (figures 7.1-7.6). The graphs show the error as a function of the percentage of saved elements for the different transforms. As can be seen, the TOFD files are less affected by high compression ratios than the pulse-echo files, and the Daubechies transform seems to give the smallest error overall.

Figure 7.1: Error plots from the defect 4 pulse-echo file.

Figure 7.2: Error plots from the defect 4 TOFD file.

Figure 7.3: Error plots from the defect 5 pulse-echo file.

Figure 7.4: Error plots from the defect 5 TOFD file.

Figure 7.5: Error plots from the defect 8 pulse-echo file.

Figure 7.6: Error plots from the defect 8 TOFD file.

(Axes in figures 7.1-7.6: error on a logarithmic scale versus % saved elements.)

To show the importance of combining a visual evaluation with a mathematical error calculation, the following example can be studied. It shows an uncompressed A-scan compared with the same A-scan compressed with the FFT and the Haar transform respectively. The A-scan comes from defect 8 and is scanned with a pulse-echo probe. The chosen compression ratio is 1:20, which means that 95% of the coefficients are set to zero. According to table AT3 in appendix A3.1 this compression gives approximately the same mathematical error for both methods, i.e. FFT = 0.435510 and Haar = 0.440680. The difference between the two compressed scans (figures 7.8 and 7.9) and the original scan (figure 7.7) is nevertheless clearly visible.

Figure 7.7: Original A-scan.

Figure 7.8: FFT compressed A-scan.

Figure 7.9: Haar compressed A-scan.

(Axes in figures 7.7-7.9: amplitude versus samples.)

7.2.1 Visual Error

As mentioned earlier there are two different types of UT files, pulse-echo and TOFD. The evaluation is done using different criteria depending on the type. As the pulse-echo method is used to detect cracks, the received signal amplitude is of most interest. When such a file is analyzed the signal is often rectified, and if the peak amplitude at a certain distance exceeds a predefined level it is considered to be an indication. From a compression point of view it is therefore important that the compression affects the signal amplitude as little as possible. The way the compression is achieved in this case, i.e. by setting a number of small coefficients in the frequency domain to zero, should not affect a strong echo signal too much, as it is represented by large coefficients in the frequency domain.

When a TOFD file is analyzed the phase of the signal is most important. When there is a crack or an air pocket in the material, the signal is reflected at the material/air interface due to the different reflection index. This reflection causes a phase change of 180 degrees. So in this case it is important that the compression does not change the phase of the signal. Again, zeroing a number of small coefficients only reduces the frequency content of the original signal and not the phase, so the assumption is that there will not be any significant changes of the signal phase.

Even if the compression does not change the phase notably, it must not change the shape of the ultrasonic signal, in this case the A-scan. It is however inevitable that the signal shape is not preserved if the signal has been transformed with basis functions that do not match the signal in the first place. This phenomenon was seen on the Haar compressed files when they were to be evaluated. The original pulse shape was so distorted by the Haar wavelet basis function, see figure 5.5, that the method was dismissed from the beginning and thus never evaluated. This conclusion would not have been drawn with the mathematical error calculations as the only quality measure, as the Haar and the FFT methods seem to give roughly similar error results. The signal distortion is most apparent on the Haar compressed TOFD files, as they are recorded without the hardware compression and are thus not as saw tooth shaped as the pulse-echo files. Figure 7.10 shows a TOFD A-scan which has been compressed and reconstructed with the Daubechies and the Haar wavelets, figures 7.11 and 7.12. The square wave look of the Haar signal originates from its basis function, and the signal was considered to be too distorted to evaluate.

Figure 7.10: Uncompressed TOFD A-scan.

Figure 7.11: Daubechies compressed TOFD A-scan.

Figure 7.12: Haar compressed TOFD A-scan.

(Axes in figures 7.10-7.12: amplitude versus samples.)

7.2.2 Visual File Evaluation

The compressed files have been evaluated by two persons with a level 2 ultrasonic certificate. This means that they are qualified to collect and evaluate ultrasonic data at nuclear power plants in Sweden and in other countries around the world. The persons were given the original files together with the compressed ones without knowing the compression level of each file. The compressed files are labelled as the example below shows.

pe4_2d_db8_1 = FileType_CompDimension_Transform_FileNumber

FileType: pe = pulse-echo or to = TOFD, followed by the defect number.
CompDimension: Dimension of the compression method, always 2D.
Transform: Transform used: FFT, Db8 or Haar.
FileNumber: Random number between 1 and 4.

The random numbering of the files is done to remove the compression ratio information that could affect the evaluation result. The need to have the original file as a reference does however make the evaluation a bit biased, but this could not be avoided.

The three defects are called defect 4, defect 5 and defect 8. The three files are all a bit different from each other.

Defect 4: This file contains an indication that is below the reportable limit. In this case reportable limit means that the received signal amplitude must reach over a specific level in three consecutive scans. A reportable defect has to be scanned with a TOFD probe to determine its size. This defect was chosen to see if any of the compression methods and ratios made it reportable, i.e. made the signal level reach over the critical level. The TOFD file is evaluated for this report anyway. The signal does not contain a lot of noise.

Defect 5: This file has one reportable indication which is measured with a TOFD probe. The TOFD file is however very noisy. This makes it interesting to see if the compression reduces the noise and thus has a positive effect on the signal.

Defect 8: This file also has one reportable indication, but the TOFD file is much less noisy than that of defect 5.


7.3 Lossless Compression Results

Table 7.1 shows the compression ratios when the 6 UT files are compressed with WinZip without any pre processing, i.e. in original format. It should be noted that this compression is lossless and the files do not lose any information. The letter ‘m’ at the end of each file name means that the file is modified and only contains the defect area, as mentioned in section 6.

| File Name | Original Size (KBytes) | WinZipped Size (KBytes) | Compression Ratio |
|---|---|---|---|
| pe4m | 10972 | 4497 | 1:2.4 |
| to4m | 6406 | 3894 | 1:1.6 |
| pe5m | 13855 | 5490 | 1:2.5 |
| to5m | 54414 | 38432 | 1:1.4 |
| pe8m | 17450 | 7905 | 1:2.2 |
| to8m | 103724 | 69054 | 1:1.5 |

Table 7.1

The compression ratios achieved with this straightforward method are however not sufficient, but they give a good indication of the efficiency of lossless compression applied to ultrasonic files.

7.4 Lossy Compression Evaluation Results

This section gives an overview of the data presented in the appendices. The tables below show conclusions drawn from the visual evaluations and the mathematical calculations in the appendices. As mentioned in section 7.2.1, a visual evaluation is needed as a complement to the mathematical error calculations in order to determine if a compressed signal is good or bad. Based on the data from those two evaluation methods, the Daubechies wavelet compression has proven to be better than the FFT compression. It preserves the signal shape better at higher compression ratios and gives the smallest mathematical error. The tables below therefore only show the results from compressions made with the Daubechies wavelet. The results from the FFT compressions are however presented in the appendices.

The results show that TOFD files can be compressed more than the pulse-echo files studied in this report. This is due to the hardware compression, which damages the original sinusoid pulse shape and makes it more broadband. This means that more coefficients have to be saved to be able to reconstruct the original pulse. The pulse-echo files are on the other hand already compressed to 1/8 of the original size.

Defect 4, Daubechies compressed. One non reportable indication, original signal not noisy.

| % saved elements | Pulse-echo | TOFD |
|---|---|---|
| 20 and 15 | No visual change in A-scan shape. | No visual change in A-scan shape. |
| 5 | Minor change in A-scan shape. | No visual change in A-scan shape. |
| 1 | Change in A-scan shape. | A-scan not too good. |

Table 7.2

From table 7.2 it is seen that saving 15% of the pulse-echo and TOFD samples can be done without losing any significant information. From the Daubechies curve in figure AP1.2 in appendix A1.1 it seems likely that 8-10% of the TOFD samples can be saved without significant changes in pulse shape, as the error only increases marginally.

Defect 5, Daubechies compressed. One reportable indication, original signal noisy.

| % saved elements | Pulse-echo | TOFD |
|---|---|---|
| 20 and 15 | No visual change in A-scan shape and ampl. | No visual change in A-scan shape. |
| 5 | Small change in A-scan shape and ampl. | Minor change in A-scan shape, most noise gone. |
| 1 | Change in A-scan shape. | A-scan shape not good. |

Table 7.3

From table 7.3 it is seen that saving 5% of the TOFD samples and 15% of the pulse-echo samples can be done without losing any significant information. From the Daubechies curve in figure AP2.1 in appendix A2.1 it seems likely that 10% of the pulse-echo samples can be saved without significant changes in pulse shape, as the error only increases marginally.

Defect 8, Daubechies compressed. One reportable indication, original signal not noisy.

| % saved elements | Pulse-echo | TOFD |
|---|---|---|
| 20 and 15 | No visual change in A-scan shape and ampl. | No visual change in A-scan shape. |
| 5 | Small change in A-scan shape and ampl. | Minor change in A-scan shape. |
| 1 | Change in A-scan shape and ampl. | A-scan shape not good. |

Table 7.4

From table 7.4 it is seen that saving 15% of the TOFD and the pulse-echo samples can be done without losing any significant information. From the Daubechies curve in figure AP2.1 in appendix A2.1 it seems likely that 5-10% of the pulse-echo and TOFD samples can be saved without significant changes in pulse shape, as the error only increases marginally.

All background data to these tables can be found in the appendices. Error calculations, error plots and pictures from the visual evaluations are listed defect wise. The Haar transform is only represented in the error calculation table and in the error plots, but not in any of the evaluation tables, due to its poor visual compression results.


7.5 Reality Compression Tests

So far all compression calculations have been on a theoretical level. In this section a complete compression test is made with one of the files to see if the theoretical calculations are accurate. When a practical test is to be done a number of problems occur that do not exist on the theoretical level. The list below describes two of them.

1. The amplitude resolution is different between the TOFD and the pulse-echo files in the time domain. Each TOFD sample is 12 bits long and each pulse-echo sample is 8 bits long in the examined files. After transformation to the frequency domain the coefficients may be larger than 8 or 12 bits.

2. This problem follows from nr 1. When a UT file is transformed to the frequency domain the magnitude of the coefficients varies a lot. This means that some coefficients may need a 32 bit representation while the major part only needs 16 bits. The number of bits has to be defined when data is saved with Matlab, which implies that the file size gets unnecessarily large if all coefficients have to be saved with a 32 bit representation.

The problems described above could in the worst case make the compressed file even bigger than the original one, for example if a file with 8 bit data values is saved with a 32 bit representation. Fortunately WinZip removes most of the effect of this problem, but not all of it, which means that there is still room for some improvement when it comes to effective data storage.

The file in this test is to4, i.e. the TOFD file from defect 4. The whole file is now compressed, in contrast to the modified versions previously used where the defect area was cut out. A TOFD file was chosen because it only contains data from one transducer, whereas the pulse-echo files have data from two different transducers. The coefficient vector has been saved in two parts to overcome the storage problem described above (nr 2). Part one has ~7% of the coefficients and was saved with 32 bits, and part two with the remaining 93% was saved with 16 bits.
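The two part storage can be sketched as follows. The thesis does not describe how the split between the 32 bit and the 16 bit part was chosen, so the sketch below rests on assumptions: the coefficients are rounded to integers, the large coefficients are assumed to sit at the beginning of the vector, and the small file layout with two length fields is hypothetical.

```python
import struct

def pack_split(coeffs):
    """Pack integer coefficients as a 32 bit head followed by a 16 bit tail."""
    split = len(coeffs)
    # Move the split point backwards while the trailing values fit in 16 bits.
    while split > 0 and -32768 <= coeffs[split - 1] <= 32767:
        split -= 1
    head, tail = coeffs[:split], coeffs[split:]
    header = struct.pack("<II", len(head), len(tail))   # hypothetical layout
    body = (struct.pack("<%di" % len(head), *head)
            + struct.pack("<%dh" % len(tail), *tail))
    return header + body

def unpack_split(blob):
    """Inverse of pack_split()."""
    n_head, n_tail = struct.unpack_from("<II", blob, 0)
    head = struct.unpack_from("<%di" % n_head, blob, 8)
    tail = struct.unpack_from("<%dh" % n_tail, blob, 8 + 4 * n_head)
    return list(head) + list(tail)

coeffs = [120000, -70000, 40000, 300, -12, 0, 5, 0]
blob = pack_split(coeffs)          # 8 byte header + 3*4 bytes + 5*2 bytes
restored = unpack_split(blob)
```

For long vectors where only a few percent of the coefficients need 32 bits, this kind of split approaches half the size of a pure 32 bit representation before the final WinZip step.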

The file was pre processed with three different compression ratios, WinZipped and compared with the WinZipped original TOFD file. Table 7.5 shows the result.

to4, size = 6406 KB

| Pre processed compression ratio | WinZipped pre processed file size (KB) | WinZipped original file size (KB) | Reality compression ratio |
|---|---|---|---|
| 1:6.7 (85% zeros) | 1181 | 3903 | 1:5.4 |
| 1:10 (90% zeros) | 854 | 3903 | 1:7.5 |
| 1:20 (95% zeros) | 486 | 3903 | 1:13.2 |

Table 7.5

As can be seen, the calculated compression ratio does not exactly match the compression achieved in reality when the file has been WinZipped. A possible explanation could be the storage problem mentioned before.


8 Conclusions and Future Work

The aim of this report is to investigate if it is possible to compress large ultrasonic data files, around 200 MB, enough to transfer them over a slow transmission line, e.g. a telephone modem, in a reasonable period of time. According to the results in section 7.4 a satisfactory quality of the compressed files is achieved at a compression ratio of 1:10 when the 2 dimensional Daubechies wavelet with 8 vanishing moments is used as a transform. This would reduce the data size to 1/10 of the original size. The 200 MB file above would then be compressed to 20 MB, giving a theoretical transfer time over a 56.6 Kb/s modem of:

$$\frac{20\,000\,000 \times 8}{56\,600} \approx 2827 \text{ s} \approx 47 \text{ minutes} \qquad (24)$$

as opposed to the original 7 hours and 51 minutes. Normally the files are smaller than that and the transfer time decreases accordingly.

The results also show the importance of choosing the right transform depending on the signal shape. In this case the Daubechies wavelet matched the transmitting pulse the best, which could be seen in both the visual and the mathematical plots and tables. A factor that has not been taken into account is the time the compression and decompression process takes, which can be substantial if the files are large. The compression/decompression time can probably be reduced if the algorithms are implemented in C instead of using the wavelet toolbox in Matlab.

The practical test results in section 7.5 show that in order to get a final compression ratio of 1:10 a slightly higher pre processing compression ratio must be used. This result is however based on the way the file was saved with the 32 and 16 bit word lengths. This file splitting was far from optimal and a more effective method would probably result in a better final compression ratio.

Interesting work for the future would be to use wavelet packets together with the best basis concept and see what the improvements in compression ratio and signal quality are like.
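The transfer times can be verified with a few lines of Python. The sketch makes the same simplifying assumptions as equation (24): 1 MB is taken as 10^6 bytes, the modem delivers its nominal 56 600 bit/s, and protocol overhead is ignored. The function name is illustrative.

```python
def transfer_minutes(size_bytes, bits_per_second=56600):
    """Ideal transfer time in minutes, ignoring protocol overhead."""
    return size_bytes * 8 / bits_per_second / 60

original = transfer_minutes(200_000_000)    # about 471 minutes
compressed = transfer_minutes(20_000_000)   # about 47 minutes

hours, minutes = divmod(int(original), 60)  # 7 hours and 51 minutes
```

In practice the effective modem throughput is lower than the nominal rate, so both times should be read as lower bounds.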


9 References

[Proakis, Salehi 94] John G. Proakis, Masoud Salehi, Communication Systems Engineering, Prentice Hall Int. Editions, 1994.
[Proakis, Manolakis 96] John G. Proakis, Dimitris G. Manolakis, Digital Signal Processing, Prentice Hall Int. Editions, 1996.
[Petersson 97] Jan Petersson, Fourieranalys, Rex Offsettryck, 1997.
[Salomon 97] David Salomon, Data Compression, Springer-Verlag, 1998.
[Valens 99] Clemens Valens, http://perso.wanadoo.fr/polyvalens/clemens/clemens.html
[Jensen, la Cour-Harbo 01] Arne Jensen, Anders la Cour-Harbo, Ripples in Mathematics, Springer-Verlag, 2001.

Appendices

Appendix A1.1 Mathematical Error Calculation Results

Defect 4

Table AT.1: Compression error calculations

| % saved elements | Pulse-echo FFT | Pulse-echo Haar | Pulse-echo Db8 | TOFD FFT | TOFD Haar | TOFD Db8 |
|---|---|---|---|---|---|---|
| 20 | 0.317940 | 0.233810 | 0.175670 | 0.004870 | 0.017057 | 0.001660 |
| 15 | 0.406600 | 0.301030 | 0.248250 | 0.010118 | 0.026142 | 0.004193 |
| 5 | 0.687840 | 0.622270 | 0.568250 | 0.086005 | 0.069284 | 0.025588 |
| 1 | 0.811960 | 1.000000 | 0.881240 | 0.152450 | 0.220020 | 0.091936 |


Appendix A1.2 Visual Pulse-echo Evaluation Results

Defect 4

FFT

| File name | % saved elem. | Max ampl. dB | Max scan deg | Max index mm | Soundpath mm/2 | Scan 1 -6dB deg | Scan 2 -6dB deg | Index 1 -6dB mm | Index 2 -6dB mm | Scan length deg. | Length index mm | Comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pe4m | Original | -3,5 | 84,2 | 149 | 14,7 | 79 | 87,8 | 149 | 161 | 8,8 | 12 | Uncompressed file |
| pe4_2d_fft_3 | 20 | -3,4 | 84,6 | 149 | 13,9 | 79 | 86,2 | 149 | 161 | 7,2 | 12 | Distorted A-scan |
| pe4_2d_fft_1 | 15 | -3,3 | 84,6 | 149 | 13,9 | 83,4 | 86,2 | 149 | 149 | 2,8 | 0 | Distorted A-scan |
| pe4_2d_fft_2 | 5 | -4,4 | 84,6 | 149 | 13,9 | 83,4 | 86,6 | 149 | 153 | 3,2 | 4 | Distorted A-scan |
| pe4_2d_fft_4 | 1 | - | - | - | - | - | - | - | - | - | - | Signal amplitude too low |

Db8

| File name | % saved elem. | Max ampl. dB | Max scan deg | Max index mm | Soundpath mm/2 | Scan 1 -6dB deg | Scan 2 -6dB deg | Index 1 -6dB mm | Index 2 -6dB mm | Scan length deg. | Length index mm | Comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pe4_2d_db8_3 | 20 | -3,5 | 84,2 | 149 | 14,7 | 83 | 86,2 | 149 | 149 | 3,2 | 0 | |
| pe4_2d_db8_2 | 15 | -3,5 | 84,2 | 149 | 14,7 | 83 | 86,2 | 149 | 149 | 3,2 | 0 | |
| pe4_2d_db8_1 | 5 | -3,5 | 84,6 | 149 | 13,9 | 79 | 87,4 | 149 | 161 | 8,4 | 12 | |
| pe4_2d_db8_4 | 1 | -3,1 | 84,6 | 149 | 13,9 | 83 | 85,8 | 149 | 149 | 2,8 | 0 | |

Page 57: Compression of Ultrasonic Files Ir-sb-ex-0501

1

Appendix A1.3 Pulse-echo Plots

Defect 4

Plot: pe4m (original)
Plot: pe4_2d_fft_3 (20% saved elements)
Plot: pe4_2d_fft_1 (15% saved elements)
Plot: pe4_2d_fft_2 (5% saved elements)
Plot: pe4_2d_fft_4 (1% saved elements)
Plot: pe4_2d_db8_3 (20% saved elements)
Plot: pe4_2d_db8_2 (15% saved elements)
Plot: pe4_2d_db8_1 (5% saved elements)
Plot: pe4_2d_db8_4 (1% saved elements)

Appendix A1.4 Visual TOFD Evaluation Results

Defect 4

FFT

File name | % saved elements | Max ampl. (dB) | Lateral | Tip 1 | ∆ time | Depth (mm) | Num. of scan lines | Noise / comment
to4m | Original | 24,4 | 7,894 | 9,877 | 1,983 | 12,63 | 11 |
to4_2d_fft_1 | 20 | 24,8 | 7,894 | 9,877 | 1,983 | 12,63 | 11 |
to4_2d_fft_2 | 15 | 24,9 | 7,894 | 9,877 | 1,983 | 12,63 | 11 |
to4_2d_fft_4 | 5 | 21,7 | 7,894 | 9,877 | 1,983 | 12,63 | 11 |
to4_2d_fft_3 | 1 | 17,6 | 7,894 | 9,877 | 1,983 | 12,63 | 11 | Signal shape not too good

Db8

File name | % saved elements | Max ampl. (dB) | Lateral | Tip 1 | ∆ time | Depth (mm) | Num. of scan lines | Noise / comment
to4_2d_db8_3 | 20 | 24,1 | 7,894 | 9,877 | 1,983 | 12,63 | 11 |
to4_2d_db8_1 | 15 | 24,7 | 7,894 | 9,877 | 1,983 | 12,63 | 11 |
to4_2d_db8_2 | 5 | 24,5 | 7,894 | 9,877 | 1,983 | 12,63 | 11 |
to4_2d_db8_4 | 1 | 25,0 | 7,894 | 9,877 | 1,983 | 12,63 | 11 | Signal shape not too good

Appendix A1.5 TOFD Plots

Defect 4

Plot: to4_2d_fft_1 (20% saved elements)
Plot: to4_2d_fft_2 (15% saved elements)
Plot: to4_2d_fft_4 (5% saved elements)
Plot: to4_2d_fft_3 (1% saved elements)
Plot: to4_2d_db8_3 (20% saved elements)
Plot: to4_2d_db8_1 (15% saved elements)
Plot: to4_2d_db8_2 (5% saved elements)
Plot: to4_2d_db8_4 (1% saved elements)

Appendix A2.1 Mathematical Error Calculation Results

Defect 5

Error

% saved elements | Pulse-echo FFT | Pulse-echo Haar | Pulse-echo Db8 | TOFD FFT | TOFD Haar | TOFD Db8
20 | 0.198140 | 0.203560 | 0.167630 | 0.040112 | 0.059895 | 0.020100
15 | 0.287320 | 0.256490 | 0.249140 | 0.089521 | 0.089051 | 0.036558
5 | 0.627750 | 0.549540 | 0.438320 | 0.343180 | 0.227780 | 0.147360
1 | 0.745690 | 0.817920 | 0.821220 | 0.580810 | 0.561480 | 0.374410

Table AT.2: Compression error calculations.

Appendix A2.2 Visual Pulse-echo Evaluation Results

Defect 5

FFT

File name | % saved elem. | Max ampl. (dB) | Max scan (deg) | Max index (mm) | Soundpath (mm/2) | Scan 1 -6dB (deg) | Scan 2 -6dB (deg) | Index 1 -6dB (mm) | Index 2 -6dB (mm) | Scan length (deg) | Length index (mm) | Comment
pe5m | Original | -2,6 | 169,2 | 27 | 25,6 | 165,7 | 173,7 | 21 | 47 | 8 | 26 | Uncompressed file
pe5_2d_fft_3 | 20 | -2 | 169,2 | 27 | 25,6 | 165,7 | 173,7 | 21 | 47 | 8 | 26 |
pe5_2d_fft_2 | 15 | -2 | 169,2 | 27 | 25,6 | 164,2 | 178,7 | 18 | 49 | 14,5 | 31 |
pe5_2d_fft_1 | 5 | -3,4 | 171,2 | 31 | 29,9 | 163,7 | 182,8 | 18 | 49 | 19,1 | 31 |
pe5_2d_fft_4 | 1 | - | - | - | - | - | - | - | - | - | - | No detectable indication

Db8

File name | % saved elem. | Max ampl. (dB) | Max scan (deg) | Max index (mm) | Soundpath (mm/2) | Scan 1 -6dB (deg) | Scan 2 -6dB (deg) | Index 1 -6dB (mm) | Index 2 -6dB (mm) | Scan length (deg) | Length index (mm) | Comment
pe5_2d_db8_2 | 20 | -2,6 | 169,2 | 27 | 25,6 | 165,7 | 173,7 | 18 | 49 | 8 | 31 |
pe5_2d_db8_1 | 15 | -2,2 | 169,2 | 27 | 25,6 | 167,2 | 174,2 | 23 | 47 | 7 | 24 |
pe5_2d_db8_4 | 5 | -1,4 | 169,2 | 27 | 25,6 | 167,2 | 173,7 | 23 | 39 | 6,5 | 16 |
pe5_2d_db8_3 | 1 | -1,9 | 169,2 | 31 | 26,4 | 167,2 | 172,7 | 23 | 39 | 5,5 | 16 |

Appendix A2.3 Pulse-echo Plots

Defect 5

Plot: pe5m (original)
Plot: pe5_2d_fft_3 (20% saved elements)
Plot: pe5_2d_fft_2 (15% saved elements)
Plot: pe5_2d_fft_1 (5% saved elements)
Plot: pe5_2d_fft_4 (1% saved elements)
Plot: pe5_2d_db8_2 (20% saved elements)
Plot: pe5_2d_db8_1 (15% saved elements)
Plot: pe5_2d_db8_4 (5% saved elements)
Plot: pe5_2d_db8_3 (1% saved elements)

Appendix A2.4 Visual TOFD Evaluation Results

Defect 5

FFT

File name | % saved elements | Max ampl. (dB) | Lateral | Tip 1 | ∆ time | Depth (mm) | Num. of scan lines | Noise / comment
to5m | Original | 26,4 | 13,694 | 15,360 | 1,666 | 14,90 | 11 | Noisy signal
to5_2d_fft_3 | 20 | 17,6 | 13,694 | 15,344 | 1,650 | 14,81 | 11 | Noisy signal
to5_2d_fft_1 | 15 | 24,8 | 13,694 | 15,360 | 1,666 | 14,90 | 11 | Noisy signal
to5_2d_fft_2 | 5 | 24,9 | 13,710 | 15,360 | 1,650 | 14,81 | 11 | Noisy signal
to5_2d_fft_4 | 1 | 21,7 | 13,710 | 15,377 | 1,667 | 14,90 | 11 | Noisy signal, signal shape not ok

Db8

File name | % saved elements | Max ampl. (dB) | Lateral | Tip 1 | ∆ time | Depth (mm) | Num. of scan lines | Noise / comment
to5_2d_db8_1 | 20 | 24,7 | 13,694 | 15,344 | 1,650 | 14,81 | 11 | Noisy signal
to5_2d_db8_2 | 15 | 24,5 | 13,710 | 15,344 | 1,634 | 14,73 | 11 | Noisy signal
to5_2d_db8_3 | 5 | 24,1 | 13,694 | 15,360 | 1,666 | 14,90 | 11 | Most noise gone
to5_2d_db8_4 | 1 | 25,0 | 13,694 | 15,344 | 1,650 | 14,81 | 11 | Signal shape not ok

Appendix A2.5 TOFD Plots

Defect 5

Plot: to5_2d_fft_3 (20% saved elements)
Plot: to5_2d_fft_1 (15% saved elements)
Plot: to5_2d_fft_2 (5% saved elements)
Plot: to5_2d_fft_4 (1% saved elements)
Plot: to5_2d_db8_1 (20% saved elements)
Plot: to5_2d_db8_2 (15% saved elements)
Plot: to5_2d_db8_3 (5% saved elements)
Plot: to5_2d_db8_4 (1% saved elements)

Appendix A3.1 Mathematical Error Calculation Results

Defect 8

Error

% saved elements | Pulse-echo FFT | Pulse-echo Haar | Pulse-echo Db8 | TOFD FFT | TOFD Haar | TOFD Db8
20 | 0.143830 | 0.091941 | 0.069915 | 0.005039 | 0.007032 | 0.000722
15 | 0.204240 | 0.144500 | 0.096112 | 0.021851 | 0.012428 | 0.001901
5 | 0.435510 | 0.440680 | 0.272820 | 0.206570 | 0.051234 | 0.015114
1 | 0.689470 | 0.795590 | 0.865840 | 0.383470 | 0.117070 | 0.069977

Table AT.3: Compression error calculations.

Appendix A3.2 Visual Pulse-echo Evaluation Results

Defect 8

FFT

File name | % saved elem. | Max ampl. (dB) | Max scan (deg) | Max index (mm) | Soundpath (mm/2) | Scan 1 -6dB (deg) | Scan 2 -6dB (deg) | Index 1 -6dB (mm) | Index 2 -6dB (mm) | Scan length (deg) | Length index (mm) | Comment
pe8m | Original | -1,1 | 295,9 | 140 | 26,3 | 294,1 | 298,9 | 120 | 140 | 4,8 | 20 | Uncompressed file
pe8_2d_fft_2 | 20 | -1,1 | 295,9 | 140 | 25,5 | 293,8 | 298,9 | 120 | 140 | 5,1 | 20 |
pe8_2d_fft_1 | 15 | -1,1 | 295,9 | 140 | 25,5 | 294,1 | 298,9 | 120 | 140 | 4,8 | 20 |
pe8_2d_fft_4 | 5 | -1,6 | 295,9 | 140 | 25,5 | 294,1 | 298,9 | 120 | 140 | 4,8 | 20 |
pe8_2d_fft_3 | 1 | -3,2 | 296,2 | 140 | 25,5 | 294,1 | 298,9 | 120 | 140 | 4,8 | 20 |

Db8

File name | % saved elem. | Max ampl. (dB) | Max scan (deg) | Max index (mm) | Soundpath (mm/2) | Scan 1 -6dB (deg) | Scan 2 -6dB (deg) | Index 1 -6dB (mm) | Index 2 -6dB (mm) | Scan length (deg) | Length index (mm) | Comment
pe8_2d_db8_2 | 20 | -1,4 | 295,9 | 140 | 26,3 | 293,8 | 298,9 | 120 | 140 | 5,1 | 20 |
pe8_2d_db8_1 | 15 | -1,4 | 295,9 | 140 | 26,3 | 293,8 | 298,9 | 120 | 140 | 5,1 | 20 |
pe8_2d_db8_4 | 5 | -1,4 | 295,9 | 140 | 26,3 | 294,1 | 298,9 | 120 | 140 | 4,8 | 20 |
pe8_2d_db8_3 | 1 | -1,3 | 295,6 | 140 | 27,1 | 294,1 | 298,9 | 120 | 140 | 4,8 | 20 |

Appendix A3.3 Pulse-echo Plots

Defect 8

Plot: pe8m (original)
Plot: pe8_2d_fft_2 (20% saved elements)
Plot: pe8_2d_fft_1 (15% saved elements)
Plot: pe8_2d_fft_4 (5% saved elements)
Plot: pe8_2d_fft_3 (1% saved elements)
Plot: pe8_2d_db8_2 (20% saved elements)
Plot: pe8_2d_db8_1 (15% saved elements)
Plot: pe8_2d_db8_4 (5% saved elements)
Plot: pe8_2d_db8_3 (1% saved elements)

Appendix A3.4 Visual TOFD Evaluation Results

Defect 8

FFT

File name | % saved elements | Max ampl. (dB) | Lateral | Tip 1 | ∆ time | Depth (mm) | Num. of scan lines | Noise / comment
to8m | Original | 24,4 | 7,277 | 8,444 | 1,167 | 8,29 | 18 | Some noise before lateral
to8_2d_fft_3 | 20 | 24,6 | 7,277 | 8,444 | 1,167 | 8,29 | 18 | Some noise before lateral
to8_2d_fft_1 | 15 | 25,6 | 7,277 | 8,444 | 1,167 | 8,29 | 18 | Signal noisier
to8_2d_fft_2 | 5 | 17,8 | 7,277 | 8,460 | 1,183 | 8,35 | 18 | Signal noisier
to8_2d_fft_4 | 1 | - | - | - | - | - | - | No measurable signal

Db8

File name | % saved elements | Max ampl. (dB) | Lateral | Tip 1 | ∆ time | Depth (mm) | Num. of scan lines | Noise / comment
to8_2d_db8_3 | 20 | 24,9 | 7,277 | 8,444 | 1,167 | 8,29 | 18 | No noise
to8_2d_db8_1 | 15 | 24,4 | 7,277 | 8,444 | 1,167 | 8,29 | 18 | No noise
to8_2d_db8_2 | 5 | 20,7 | 7,244 | 8,427 | 1,183 | 8,35 | 18 | No noise
to8_2d_db8_4 | 1 | 12,5 | 7,277 | 8,427 | 1,150 | 8,21 | 18 | Bad signal shape

Appendix A3.5 TOFD Plots

Defect 8

Plot: to8_2d_fft_3 (20% saved elements)
Plot: to8_2d_fft_1 (15% saved elements)
Plot: to8_2d_fft_2 (5% saved elements)
Plot: to8_2d_fft_4 (1% saved elements)
Plot: to8_2d_db8_3 (20% saved elements)
Plot: to8_2d_db8_1 (15% saved elements)
Plot: to8_2d_db8_2 (5% saved elements)
Plot: to8_2d_db8_4 (1% saved elements)