
Automatic Fault Diagnosis of Rolling Element Bearings Using Wavelet Based Pursuit Features

Hongyu Yang

Bachelor of Engineering (DUT)*

Master of Engineering (DUT)

* Dalian University of Technology, China

Thesis submitted in total fulfilment of the requirements of the degree of

Doctor of Philosophy

School of Mechanical, Manufacturing, and Medical Engineering

Faculty of Built Environmental Engineering

Queensland University of Technology

8th October 2004

STATEMENT OF ORIGINALITY


This thesis contains no material which has been accepted for the award of any other

degree or diploma in any university, and to the best of my knowledge it contains no

material previously published or written by another person, except where due

reference is made in the text of the thesis.

Signed :_____________ Date:____________

Hongyu Yang

School of Mechanical, Manufacturing, and

Medical Engineering

Queensland University of Technology;

Gardens Point, Brisbane, Queensland,

4001

Australia

ACKNOWLEDGEMENTS


To my loving parents, Yang Yutian, and Jiang Guiyun.


I wish to express my sincere appreciation to the following people for their support

and contributions:

My family for their emotional support and encouragement.

My supervisors, Prof. Joseph Mathew and Dr. Lin Ma, for their technical

supervision, and general support and advice.

Dr Vladis Kosse for providing access to the bearing test rig located in the School of

Mechanical, Manufacturing and Medical Engineering at Queensland University of

Technology.

My friends for their encouragement.

Hongyu Yang

School of Mechanical, Manufacturing, and

Medical Engineering

Queensland University of Technology;

Gardens Point, Brisbane, Queensland,

4001

Australia

TABLE OF CONTENTS


STATEMENT OF ORIGINALITY
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
NOMENCLATURE
ABSTRACT

CHAPTER 1. GENERAL INTRODUCTION
1.1 INTRODUCTION
1.2 OBJECTIVES
1.3 SIGNIFICANCE
1.4 SCOPE OF RESEARCH
1.5 ORIGINALITY OF RESEARCH
1.6 ORGANISATION OF THESIS

CHAPTER 2. LITERATURE REVIEW
2.1 INTRODUCTION
2.2 BACKGROUND
2.2.1 CONDITION MONITORING AND FAULT DIAGNOSIS
2.2.2 ARTIFICIAL INTELLIGENCE
2.3 LITERATURE ON FAULT DIAGNOSIS OF ROTATING MACHINERY
2.3.1 FEATURE EXTRACTION FOR FAULT DIAGNOSIS OF ROTATING MACHINERY
2.3.2 ARTIFICIAL INTELLIGENCE TECHNIQUES IN FAULT IDENTIFICATION
2.3.3 ARTIFICIAL INTELLIGENCE & WAVELET TRANSFORM FOR FAULT DIAGNOSIS
2.4 CONCLUSION OF REVIEW

CHAPTER 3. ADVANCED FEATURE EXTRACTION TECHNIQUES


3.1 INTRODUCTION
3.2 BEST BASIS DISCRETE WAVELET PACKET ANALYSIS
3.2.1 INTRODUCTION TO DIFFERENT WAVELET FUNCTIONS
3.2.2 SELECTION OF BEST BASIS
3.3 ADAPTIVE APPROXIMATION WITH PURSUIT
3.3.1 FUNDAMENTALS OF ADAPTIVE APPROXIMATION WITH PURSUIT
3.3.2 MATCHING PURSUIT
3.3.3 BASIS PURSUIT
3.4 SUMMARY

CHAPTER 4. AUTOMATIC DIAGNOSIS SCHEMA
4.1 INTRODUCTION
4.2 AUTOMATIC FAULT DIAGNOSIS USING SPECTRUM ANALYSIS
4.3 AUTOMATIC FAULT DIAGNOSIS BASED ON DWPA
4.4 AUTOMATIC FAULT DIAGNOSIS USING PURSUIT
4.5 DESIGN OF THE FEED FORWARD NEURAL NETWORK CLASSIFIER
4.5.1 FEED FORWARD NEURAL NETWORKS
4.5.2 DESIGN OF THE STRUCTURE OF THE FEED FORWARD NEURAL NETWORK CLASSIFIERS
4.6 SUMMARY

CHAPTER 5. SIMULATION AND EXPERIMENT
5.1 INTRODUCTION
5.2 SIMULATED SIGNALS
5.3 TEST RIG AND EXPERIMENT PROCEDURE
5.4 FAULT SIMULATION
5.5 SUMMARY

CHAPTER 6. RESULTS AND DISCUSSION
6.1 INTRODUCTION


6.2 ANALYSIS USING DWPA, MATCHING PURSUIT, AND BASIS PURSUIT
6.2.1 TIME-FREQUENCY ANALYSIS OF SIMULATED SIGNALS
6.2.2 TIME-FREQUENCY ANALYSIS OF THE SIGNALS OF BEARINGS WITH MARGINAL FAULTS
6.2.3 BASIS PURSUIT AND BEST BASIS DWPA AND MATCHING PURSUIT - A COMPARISON
6.2.4 SEVERITY OF BEARING FAULTS ANALYSED USING BASIS PURSUIT
6.2.5 BASIS PURSUIT DENOISING
6.3 AUTOMATIC FAULT DIAGNOSIS
6.3.1 SPECTRUM BASED AUTOMATIC FAULT DIAGNOSIS
6.3.2 DWPA FEATURE BASED AUTOMATIC FAULT DIAGNOSIS
6.3.3 MATCHING PURSUIT FEATURE BASED AUTOMATIC FAULT DIAGNOSIS
6.3.4 BASIS PURSUIT FEATURE BASED AUTOMATIC FAULT DIAGNOSIS
6.4 DISCUSSION AND CONCLUSION
6.5 SUMMARY

CHAPTER 7. CONCLUSION
7.1 INTRODUCTION
7.2 THE IMPROVED DWPA, MATCHING PURSUIT, AND BASIS PURSUIT
7.3 AUTOMATIC DIAGNOSIS USING SPECTRUM
7.4 AUTOMATIC DIAGNOSIS USING DWPA
7.5 AUTOMATIC DIAGNOSIS USING MATCHING PURSUIT
7.6 AUTOMATIC DIAGNOSIS USING BASIS PURSUIT

CHAPTER 8. FUTURE RESEARCH
8.1 SIGNAL PROCESSING TECHNIQUES FOR FEATURE EXTRACTION
8.2 ARTIFICIAL INTELLIGENCE FOR DIAGNOSIS
8.3 INCIPIENT FAULT DETECTION USING TIME-FREQUENCY ANALYSIS TECHNIQUES
8.4 AUTOMATED DIAGNOSIS OF PROCESS MONITORING AND MATERIAL DEGRADATION


8.5 AUTOMATIC DIAGNOSIS OF TRANSMISSION SYSTEMS
8.6 COMMERCIALIZING AN INTEGRAL INTELLIGENT DIAGNOSTIC TOOLBOX

REFERENCES
PUBLICATIONS
GLOSSARY
APPENDIX

LIST OF FIGURES


Figure 2.1: Machine life bathtub curve
Figure 2.2: Overall levels of a bearing of continuing phases of failure
Figure 2.3: Spectral characteristics of different stage bearing faults
Figure 2.4: Conceptual representation of a pattern recognition problem
Figure 2.5: Architecture of a Neural Network
Figure 2.6: Fault diagnosis - an overview
Figure 2.7: An overview of feature extraction techniques
Figure 2.8: An overview of time domain feature extraction techniques
Figure 2.9: An overview of fault detection and identification techniques
Figure 2.10: A Recurrent Neural Network
Figure 2.11: One Dimensional Self Organising Map
Figure 2.12: Two dimensional Self Organising Map
Figure 2.13: A Zero-Order Sugeno Fuzzy Model
Figure 3.1: Filter bank representation of DWT and DWPA [149]
Figure 3.2: An example tree of wavelet packet decomposition [91]
Figure 3.3: Haar wavelet
Figure 3.4: Meyer wavelet function
Figure 3.5: Coiflets wavelet function
Figure 3.6: Daubechies function
Figure 3.7: Symlet function
Figure 3.8: An example of the best tree DWPA decomposition of a signal (with depth position index)
Figure 3.9: An example of the best tree DWPA decomposition of a signal (with Entropy value index)
Figure 3.10: An example of a time-frequency atom plot [151]
Figure 4.1: Automatic fault diagnosis procedure using spectrum, spectrogram with NN


Figure 4.2: Automatic fault diagnosis procedure using DWPA and NN
Figure 4.3: Symlet8 function with a few scales and locations
Figure 4.4: Automatic fault diagnosis procedure using Basis Pursuit (or Matching Pursuit) and NN
Figure 4.5: Sigmoid function [152]
Figure 4.6: A single output FFNN
Figure 4.7: A multi output FFNN
Figure 5.1: Experimental apparatus
Figure 5.2: Test rig with V-belt load
Figure 5.3: Test rig without load
Figure 5.4: Experimental apparatus
Figure 5.5: Fault simulation of SKF 6205
Figure 5.6: Fault simulation of KOYO 6201 RS
Figure 6.1: Simulated impulse signal $y_1$
Figure 6.2: Basis Pursuit TF plane of $y_1$
Figure 6.3: Best basis DWPA TF plane of $y_1$
Figure 6.4: Matching Pursuit TF plane of $y_1$
Figure 6.5: Simulated impulse signal $y_1$ with noise
Figure 6.6: Basis Pursuit TF plane of $y_1$ with noise
Figure 6.7: DWPA plane of $y_1$ with noise
Figure 6.8: Matching Pursuit plane of $y_1$ with noise
Figure 6.9: Simulated impulse signal $y_2$
Figure 6.10: Basis Pursuit TF plane of $y_2$
Figure 6.11: DWPA TF plane of $y_2$
Figure 6.12: Matching Pursuit TF plane of $y_2$
Figure 6.13: Simulated impulse signal $y_2$ with noise
Figure 6.14: Basis Pursuit TF plane of signal $y_2$ with noise
Figure 6.15: DWPA TF plane of signal $y_2$ with noise


Figure 6.16: Matching Pursuit TF plane of signal $y_2$ with noise
Figure 6.17: Vibration signal of a bearing with normal condition
Figure 6.18: Basis Pursuit TF plane
Figure 6.19: Best basis DWPA TF plane
Figure 6.20: Matching Pursuit TF plane
Figure 6.21: Vibration signal of a bearing with IRF
Figure 6.24: Matching Pursuit TF plane
Figure 6.25: Vibration signal of a bearing with ORF
Figure 6.28: Matching Pursuit TF plane
Figure 6.29: The time waveform and its Basis Pursuit TF plane obtained from the bearing with 0.07 inch EDM IRF
[...] bearing with a crack in inner race
Figure 6.34: The time waveform and its Basis Pursuit TF plane of the bearing with ORF: 0.07 inch
[...] ORF: 0.14 inch
[...] ORF: 0.21 inch



[...] ORF: crack
Figure 6.38: The time waveforms and the Basis Pursuit denoised signals: (a), (b) normal, (c), (d) with IRF, (e), (f) with ORF
Figure 6.39: Spectrum of the signal of a normal bearing
Figure 6.40: Spectrum of the signal of a bearing with IRF
Figure 6.41: Spectrum of the signal of a bearing with ORF
Figure 6.42: Spectrum of the signal of a bearing with REF
Figure 6.43: Feature vectors based on Spectrum of the signal of a normal bearing
Figure 6.44: Feature vector based on Spectrum of the signal of a bearing with IRF
Figure 6.45: Feature vector based on spectrum of the signal of a bearing with ORF
Figure 6.46: Feature vector based on Spectrum of the signal of a bearing with REF
Figure 6.47: Spectrogram of the signal of a normal bearing
Figure 6.48: Spectrogram of the signal of a bearing with IRF
Figure 6.49: Spectrogram of the signal of a bearing with ORF
Figure 6.50: Spectrogram of the signal of a bearing with REF
Figure 6.51: Feature vector based on spectrogram of the signal of a normal bearing
Figure 6.52: Spectrogram feature of the signal of a bearing with IRF
Figure 6.53: Spectrogram feature of the signal of a bearing with ORF
Figure 6.54: Spectrogram feature of the signal of a bearing with REF
Figure 6.55: The DWPA of the signal of a normal bearing
Figure 6.56: The DWPA of the signal of a bearing with IRF
Figure 6.57: The DWPA of the signal of a bearing with ORF
Figure 6.58: The DWPA of the signal of a bearing with REF
Figure 6.59: Features based on wavelet packets: Mean Value
Figure 6.60: Features based on wavelet packets: Variance
Figure 6.61: Features based on wavelet packets: Skewness
Figure 6.62: Features based on wavelet packets: Kurtosis
Figure 6.63: Features based on wavelet packets: Energy


Figure 6.64: Features based on wavelet packets: Root Mean Square
Figure 6.65: Features based on wavelet packets: Crest Factor
Figure 6.66: Features based on wavelet packets: Matched Filter
Figure 6.67: The Matching Pursuit of the vibration signal of a bearing under condition: Normal
[...] condition: ORF
Figure 6.69: The Matching Pursuit of the vibration signal of bearing under condition: IRF
[...] condition: REF
Figure 6.71: The Matching Pursuit (MP) coefficients of vibration signals of bearings under conditions: (a) Normal (b) ORF (c) IRF (d) REF
Figure 6.72: The Matching Pursuit Features of Vibration Signals of bearings under conditions: (a) Normal (b) ORF (c) IRF (d) REF
Figure 6.73: The Basis Pursuit of the vibration signals of bearings under condition: Normal
[...] ORF
[...] IRF
[...] REF
Figure 6.77: The Basis Pursuit coefficients of vibration signals of bearings under [...]
Figure 6.78: The Basis Pursuit features of vibration signals of bearings under [...]


LIST OF TABLES


Table 2.1: An overview of frequency techniques and time-frequency techniques
Table 2.2: Neural Networks applied in diagnosing rotating machinery faults
Table 3.1: Summary of Wavelet Families and Associated Properties (Manual of wavelet toolbox in MATLAB)
Table 5.1: Fault specification for the bearings KOYO 6201 RS
Table 5.2: Drive end bearing KOYO 6201 (Size in mm)
Table 5.3: Drive end bearing: 6204 SKF, deep groove ball bearing
Table 5.4: Drive end bearing: 6205-2RS JEM SKF, deep groove ball bearing (Size in inches)
Table 5.5: Fault Specifications for 6204-2RS JEM SKF (All dimensions in inches)
Table 6.1: Fault severity specifications in Figures 6.29-37
Table 6.2: Signal to Noise Ratio of the Original and the BP Denoised signals
Table 6.3: Classification performance of automatic diagnosis based on Spectrum and Spectrogram
Table 6.4: Performance of the single output FFNN using DWPA features
Table 6.5: Performance of the multi output FFNN using DWPA features
Table 6.6: Classification performance of different procedures using Matching Pursuit
Table 6.7: Classification performance of different procedures using Basis Pursuit

NOMENCLATURE


AI Artificial Intelligence

ANN Artificial Neural Network

x(t) A time signal

$F(\omega)$ Fast Fourier Transform

STFT Short Time Fourier Transform

$SPEC_x(t, f)$ Spectrogram

µ Mean value

$N_s$ Sampling length

$x_i$ The $i$-th sample of the series of vibration data

RMS Root Mean Square

σ Variance

E Energy

ψ Wavelet

cf Crest Factor

Mfrms Matched Filter

$A_i$(first) Amplitude of the $i$-th frequency component of the first data set

S A scale parameter

U Translation

g(x) Activation function

P Vector

R Dimension

CWT Continuous Wavelet Transform


DWPA Discrete Wavelet Packet Analysis

j Different levels of wavelets

k Number of wavelets in each level

E(s) Entropy

$\phi_\gamma$ Elements

$\Gamma$ An over complete dictionary

$\gamma$ Index of a set $\Gamma$

$\alpha_\gamma$ Coefficient of the element $\phi_\gamma$

$m$ Order of decomposition

$R^{(m)}$ A residual

$\psi_\gamma(t)$ Wavelet atoms

$\hat{\psi}_\gamma(\omega)$ Fourier transform of $\psi_\gamma(t)$

$\xi$ Demodulation

$\xi_n$ Frequency parameter

$\xi_0$ Constant

$\gamma = (j, p, k)$ Index of wavelet packet dictionary

Φ Dictionary

O(n) Complexity

MP Matching Pursuit

BP Basis Pursuit

BPFF Back Propagation for Feed Forward Networks

FFNN Feed Forward Neural Network

RNN Recurrent Neural Network

RBF Radial Basis Function


MLP Multi Layer Perceptron

SOM Self Organising Maps

LVQ Learning Vector Quantization

SVM Support Vector Machines

$d_r$ Diameter of rolling elements

$d_c$ Diameter of the cage

$d_{out}$ Diameter of the outer race

$d_{in}$ Diameter of the inner race

α Contact angle between the rolling elements and rolling surfaces

BPFI Ball-pass frequency on the inner race

BSF Rotational frequency of the rolling elements

BPFO Ball-pass frequency on the outer race

IRF Inner Race Fault

ORF Outer Race Fault

REF Rolling Element Fault

ABSTRACT


Today’s industry uses increasingly complex machines, some with extremely

demanding performance criteria. Failed machines can lead to economic loss and

safety problems due to unexpected production stoppages. Fault diagnosis in the

condition monitoring of these machines is crucial for increasing machinery

availability and reliability.

Fault diagnosis of machinery is often a difficult and daunting task. To be truly

effective, the process needs to be automated to reduce the reliance on manual data

interpretation. It is the aim of this research to automate this process using data from

machinery vibrations. This thesis focuses on the design, development, and

application of an automatic diagnosis procedure for rolling element bearing faults.

Rolling element bearings are representative elements in most industrial rotating machinery, and they can be tested economically in the laboratory using relatively simple test rigs.

Novel signal processing methods were applied to vibration signals collected from rolling element bearings tested to destruction. These included three advanced time-frequency signal processing techniques: best basis Discrete Wavelet Packet Analysis (DWPA), Matching Pursuit (MP), and Basis Pursuit (BP). This research presents the first application of Basis Pursuit to the successful diagnosis of rolling element faults. Best basis DWPA and Matching Pursuit were also benchmarked against Basis Pursuit, and further extended with some novel ideas, particularly for the extraction of defect-related features.

The DWPA was researched in two aspects: i) selecting a suitable wavelet, and ii)

choosing a best basis. To choose the most appropriate wavelet function and

decomposition tree of best basis in bearing fault diagnostics, several different

wavelets and decomposition trees for best basis determination were applied and

comparisons made. The Matching Pursuit and Basis Pursuit techniques were implemented by choosing a powerful wavelet packet dictionary. These algorithms were also studied for their ability to extract precise features and for their speed in achieving a result. The advantages and disadvantages of these techniques for feature extraction of bearing faults were further evaluated.
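As a rough illustration of the greedy idea behind Matching Pursuit, the following minimal sketch repeatedly selects the dictionary atom most correlated with the current residual and subtracts its contribution. It is not the implementation used in this thesis: the function name `matching_pursuit` and the toy dictionary (unit spikes plus normalised cosines) are assumptions chosen only to keep the example self-contained, and they stand in for the wavelet packet dictionary. Basis Pursuit, which instead solves a global minimisation over the coefficients, is not shown.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy Matching Pursuit; dictionary columns are assumed to be unit-norm atoms."""
    residual = np.asarray(signal, dtype=float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        # Correlate the current residual with every atom.
        inner = dictionary.T @ residual
        best = int(np.argmax(np.abs(inner)))
        # Accumulate the coefficient of the best atom and remove its contribution.
        coeffs[best] += inner[best]
        residual -= inner[best] * dictionary[:, best]
    return coeffs, residual

# Hypothetical toy dictionary (an assumption, not the thesis's wavelet packet
# dictionary): n unit spikes followed by n unit-norm cosine atoms.
n = 64
t = np.arange(n)
cosines = np.stack([np.cos(np.pi * (t + 0.5) * k / n) for k in range(n)], axis=1)
cosines /= np.linalg.norm(cosines, axis=0)
dictionary = np.hstack([np.eye(n), cosines])

# Synthetic signal: one impulse plus one oscillating atom.
signal = 2.0 * dictionary[:, 10] + 1.5 * dictionary[:, n + 5]
coeffs, residual = matching_pursuit(signal, dictionary, n_atoms=10)
```

In this toy case the two largest coefficients fall on the impulse atom and the cosine atom that were used to build the signal, which mirrors how isolated spikes in a pursuit decomposition can serve as features.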


An additional contribution of this thesis is the automation of fault diagnosis using Artificial Neural Networks (ANNs). Most of the work presented in the current literature has been concerned with the use of a standard pre-processing technique, the spectrum. This research employed additional pre-processing techniques such as the spectrogram and DWPA-based Kurtosis, as well as MP and BP features, which were subsequently incorporated into ANN classifiers. Features were extracted from the Discrete Wavelet Packets and Spectra by calculating the RMS (root mean square), Crest Factor, Variance, Skewness, Kurtosis, and Matched Filter. Certain spikes in the Matching Pursuit and Basis Pursuit analyses were also used as features. These alternative pre-processing methods for feature extraction were tested and evaluated using the classification performance of the Neural Networks as the criterion.
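The statistical features named above can be sketched as follows. This is a minimal example using common textbook definitions (population variance, non-excess kurtosis), which may differ in normalisation from the definitions adopted later in the thesis. The function name `statistical_features` is hypothetical, and the Matched Filter feature is omitted because its definition is not given in this section.

```python
import numpy as np

def statistical_features(x):
    """Compute simple statistical features of a 1-D sequence of coefficients."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    centred = x - mean
    variance = np.mean(centred ** 2)   # population variance (assumed convention)
    std = np.sqrt(variance)
    rms = np.sqrt(np.mean(x ** 2))
    return {
        "mean": mean,
        "variance": variance,
        "skewness": np.mean(centred ** 3) / std ** 3,
        "kurtosis": np.mean(centred ** 4) / std ** 4,  # non-excess kurtosis
        "rms": rms,
        "crest_factor": np.max(np.abs(x)) / rms,       # peak value over RMS
        "energy": np.sum(x ** 2),
    }

# A square-wave-like sequence: every sample has magnitude 1.
features = statistical_features([1.0, -1.0, 1.0, -1.0])
```

Applied to the coefficients of each wavelet packet node or to a spectrum, a dictionary of this kind could be flattened into the feature vector passed to a classifier.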

Numerous experimental tests were conducted to simulate the real world environment.

The data were obtained from a variety of bearings with a series of fault severities.

The mechanism of bearing fault development was analysed and further modelled to

evaluate the performance of this research methodology.

The results of the researched methodology are presented, discussed, and evaluated in

the results and discussion chapter of this thesis. The Basis Pursuit technique proved

to be effective in diagnostic tasks. The applied Neural Network classifiers were

designed as multi layer Feed Forward Neural Networks. Using these Neural

Networks, automatic diagnosis methods based on spectrum analysis, DWPA,

Matching Pursuit, and Basis Pursuit proved to be effective in diagnosing different

conditions such as normal bearings, bearings with inner race and outer race faults,

and rolling element faults, with high accuracy.

Future research topics are proposed in the final chapter of the thesis to provide

perspectives and suggestions for advancing research into fault diagnosis and

condition monitoring.

Keywords: Rolling element bearing, Fault diagnosis, Feature extraction, Discrete

Wavelet Packet Analysis (DWPA), Matching Pursuit, Basis Pursuit, Neural Network.

CHAPTER 1. GENERAL INTRODUCTION



1.1 Introduction

Today’s industry uses increasingly complex rotating machines, some with extremely

demanding performance criteria. Attempting to diagnose faults in these systems is

often a difficult and daunting task for operators and plant maintainers. Machine

failure can lead to economic loss and safety problems due to unexpected and sudden

production stoppages.

Rotating machinery is a common class of machinery in industry. The root cause of

faults in rotating machinery is often faulty rolling element bearings. One way to

increase operational reliability and thereby increase machine availability is to

monitor faults in these bearings.

Fault diagnosis techniques are crucial for monitoring conditions in bearings. Current

fault diagnosis techniques have a variety of limitations. Methods that are more

effective need to be researched and developed for industrial machinery diagnostic

activities.

The work presented in the following chapters of this thesis is a thorough

investigation into the application of selected wavelet based automatic fault diagnostic

techniques to non-stationary signals collected from bearings.

This chapter describes the objectives, significance, and scope of

this research. The originality of this work and its contribution to the overall field of

fault diagnosis are also presented.

1.2 Objectives

The main aim of this research was to develop novel signal processing methods to

enable automated diagnosis of rotating machinery. The work program comprised

synthesis of techniques from the fields of pattern recognition and Neural Networks,

with application to condition monitoring. Further detailed objectives of the research

were:-

(1) Novel techniques in the field of signal processing were developed and applied to

feature extraction for machinery fault diagnosis, enabling features to be

extracted effectively and clearly and reducing the dependency of fault diagnosis on

well-trained technicians. In particular, wavelet packet based methods were investigated,

improved, and applied from the perspective of:

• attaining better time-frequency resolution, which assists in avoiding missing

information and increasing information accuracy;

• selecting wavelets that are suitable for vibration analysis. These wavelets were chosen on

the basis of their resemblance to the real characteristics of vibration signals of

rotating machinery.

Transient characteristics of vibration data of rolling element bearings are usually

represented more comprehensively using wavelet based methods than conventional

time or frequency analysis techniques. In this study, the features extracted and

selected from the signals were often impulsive components. These

impulsive components were generated by defects on the inner and outer races, and

rolling elements of bearings. The wavelet based methods were targeted to extract

these components from an overall signal containing noisy components. The

resonance and defect related frequencies of bearings could then be distinguished for rotating

machinery fault diagnosis.
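As a rough illustration of how a wavelet packet decomposition splits a signal into sub-bands, the sketch below uses the Haar wavelet purely for brevity. This is an assumption: the objective above calls for wavelets that resemble real vibration signals, and the function names and test signal here are hypothetical.

```python
import math

def haar_split(x):
    """One Haar analysis step: return (approximation, detail) at half length."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def wavelet_packet(x, levels):
    """Full wavelet packet tree: every node is split at every level.

    Returns the 2**levels terminal nodes in filter-bank order;
    len(x) must be divisible by 2**levels.
    """
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            # Unlike the plain wavelet transform, detail nodes are split too.
            next_nodes.extend(haar_split(node))
        nodes = next_nodes
    return nodes

def energy(v):
    """Sum of squared coefficients."""
    return sum(c * c for c in v)

# Hypothetical signal: a slow low-amplitude oscillation plus one sharp impulse.
signal = [0.2 * math.sin(2 * math.pi * k / 32) for k in range(64)]
signal[40] += 1.0
packets = wavelet_packet(signal, levels=3)
node_energies = [energy(p) for p in packets]
```

Because each Haar step is orthonormal, the node energies sum to the energy of the original signal, so they can be read as a partition of signal energy across sub-bands. Statistics such as Kurtosis can then be computed per node.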

(2) Automated fault diagnosis techniques were developed to reduce the

dependency on human interpretation for the task of fault diagnosis. Broadly, Neural

Networks were used to classify faults in bearings. Algorithms derived from

Objective (1) were subsequently combined with Neural Networks in the development

of the automatic diagnosis procedure.

1.3 Significance

Activities for condition monitoring and fault diagnosis such as observation and

periodic maintenance are labour-intensive and often unreliable. Some faults, for

instance, lack a flexible, easy-to-use, and efficient apparatus for fault detection and

can only be distinguished by experts through observation or listening. This makes

maintenance costly and increases the possibility of undetected faults. More seriously,

a run-to-failure program and periodic maintenance often unnecessarily interrupt

production, thus dramatically reducing industry profits. Machines need to be



monitored during the production process to improve machine operation reliability

and reduce unavailability. Therefore, conducting effective condition monitoring

brings significant benefits to industry [1, 2]. The field of condition monitoring has

attracted the increasing attention of engineers in the quest to improve the economic

efficiency and safety of machinery operation. However, condition monitoring

requires effective fault diagnosis, which is a labour-oriented practice to this day.

Without effective and efficient diagnosis, one is unable to make reliable prediction of

lead-time to failure.

Automating this labour-oriented diagnosis process by implementing

intelligent diagnosis strategies is a natural step that relieves experts and

technicians of this relatively expensive task. This research has the

following significance:

(1) The diagnosis of machine faults conventionally requires human interpretation of

the large amounts of data collected from an operating machine. This research

automated and facilitated the procedure of machine fault diagnosis. Humans can

therefore be relieved from laborious work. Consequently, the productivity and life of

diagnosed machines can be improved, and profits for industry increased.

(2) Automated diagnosis supports safe and reliable asset management. Labour-intensive

diagnosis and unpredictable faults are difficulties that confront asset managers. Automated

diagnosis requires less labour and makes faults predictable, thus reducing the effort

required to assign staff to monitor machines. In addition, automated diagnosis can

greatly assist the task of prognosing condition, which is a necessary step towards

residual life estimation.

1.4 Scope of Research

This thesis mainly presents research into novel methods for vibration based condition

monitoring and fault diagnosis of rotating machinery. It is acknowledged that other

techniques for condition monitoring and fault diagnosis, such as lubrication,

temperature, and operating performance analysis, are also necessary in condition

monitoring, but they lie outside the scope of this research.



This work focused on rotating machinery faults, and on rolling element

bearings in particular.

The new methods in this research were developed to represent non-stationary

vibration signals in the time-frequency domain. The time-frequency analysis was

mainly studied in terms of making the most use of wavelets in the application to fault

diagnosis of rolling element bearings. The methods included improved Discrete

Wavelet Packet Analysis (DWPA), Matching Pursuit, and Basis Pursuit - a novel

time-frequency analysis. Initial evaluation of these techniques was conducted using

simulated signals.
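To make the Matching Pursuit idea concrete, the following is a minimal sketch of the standard greedy algorithm on a simulated signal. The dictionary (unit spikes plus unit-norm cosines), the signal, and all names are hypothetical choices for illustration, not the dictionary or refinements used in this research.

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy Matching Pursuit over unit-norm atoms.

    Returns a list of (atom_index, coefficient) pairs and the final residual.
    """
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        # Choose the atom most correlated with the current residual.
        corr = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(corr[i]))
        coef = corr[best]
        picks.append((best, coef))
        # Subtract that atom's contribution from the residual.
        residual = [r - coef * a for r, a in zip(residual, dictionary[best])]
    return picks, residual

n = 16
# Hypothetical dictionary: n unit spikes followed by unit-norm cosines.
spikes = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
cosines = []
for f in range(1, n // 2):
    atom = [math.cos(2 * math.pi * f * i / n) for i in range(n)]
    norm = math.sqrt(dot(atom, atom))
    cosines.append([a / norm for a in atom])
dictionary = spikes + cosines

# Simulated signal: one impulse (spike 5) plus one cosine (f = 2, index n + 1).
signal = [3.0 * s + 2.0 * c for s, c in zip(spikes[5], cosines[1])]
picks, residual = matching_pursuit(signal, dictionary, n_iter=2)
```

In this example the two selected atoms are the spike and the cosine that built the signal, but the coefficients differ from the generating values of 3 and 2 because the atoms are not mutually orthogonal. This greedy behaviour is one motivation for comparing Matching Pursuit with Basis Pursuit.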

The time-frequency analysis methods were used to extract fault related features.

These features were presented in a distinguishable manner for the purpose of direct

interpretation of rolling element bearing faults. Other factors, which were irrelevant

to bearing faults, were not pursued.

Noise removal was also investigated using Basis Pursuit. The new and

improved methods were tested using both simulated signals and experimental

data.
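Noise removal with Basis Pursuit is commonly posed as minimising 0.5*||s - Phi a||^2 + lam*||a||_1 over the coefficients a, where Phi is the dictionary. The sketch below solves this with iterative soft thresholding (ISTA), chosen only because it is short; it is not claimed to be the solver used in this research. The dictionary, test signal, and value of `lam` are hypothetical.

```python
import math

def soft(v, t):
    """Soft-thresholding operator: shrink v towards zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def synthesize(atoms, coefs):
    """Signal formed as the coefficient-weighted sum of atoms."""
    n = len(atoms[0])
    return [sum(c * a[i] for c, a in zip(coefs, atoms)) for i in range(n)]

def objective(signal, atoms, coefs, lam):
    """0.5 * squared residual norm plus lam times the l1 norm of coefs."""
    r = [s - y for s, y in zip(signal, synthesize(atoms, coefs))]
    return 0.5 * sum(v * v for v in r) + lam * sum(abs(c) for c in coefs)

def bpdn_ista(signal, atoms, lam, n_iter=500):
    """Basis Pursuit denoising by iterative soft thresholding."""
    # The squared Frobenius norm bounds the largest eigenvalue of Phi^T Phi,
    # so its reciprocal is a safe, if conservative, step size.
    step = 1.0 / sum(v * v for a in atoms for v in a)
    coefs = [0.0] * len(atoms)
    for _ in range(n_iter):
        r = [s - y for s, y in zip(signal, synthesize(atoms, coefs))]
        grad = [-sum(a[i] * r[i] for i in range(len(r))) for a in atoms]
        coefs = [soft(c - step * g, step * lam) for c, g in zip(coefs, grad)]
    return coefs

n = 8
# Hypothetical dictionary: 8 unit spikes plus 3 unit-norm cosines.
spikes = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
cosines = [[math.cos(2 * math.pi * f * i / n) / 2.0 for i in range(n)]
           for f in (1, 2, 3)]
atoms = spikes + cosines

# One impulse of height 2 buried in small alternating "noise".
noisy = [0.05 * (-1) ** i for i in range(n)]
noisy[3] += 2.0
lam = 0.1
coefs = bpdn_ista(noisy, atoms, lam)
denoised = synthesize(atoms, coefs)
```

The l1 penalty drives the small noise-related coefficients to zero while retaining the impulse, at the cost of shrinking its amplitude slightly. The choice of `lam` trades noise suppression against that shrinkage.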

Neural Network techniques were applied to automatically detect and diagnose

bearing faults including Outer Race Fault (ORF), Inner Race Fault (IRF), and

Rolling Element Fault (REF).

Among the various types of Neural Networks, Feed Forward Neural Networks (FFNN) were

designed and tested to classify the bearing faults. Other types of Neural

Networks, such as radial basis function and Self Organising Map Neural Networks,

were not attempted in this research. The reasons for this selection are described in

Section 4.5.

1.5 Originality of Research

The originality and contribution of this research are as follows:

(1) The advancement of time-frequency analysis, as applied to the vibration analysis

of rolling element bearings. Specifically:-

• This thesis presents the first application of Basis Pursuit to fault diagnosis;



• The DWPA procedure was improved and benchmarked with Basis Pursuit;

• The Matching Pursuit technique was refined and benchmarked with Basis

Pursuit.

This research work resulted in the following publications:

• “Time Frequency Techniques for Fault Diagnosis of Rolling Element Bearings”,

in the Proceedings of the 10th Asia-Pacific Vibration Conference (APVC, 2003): pp.

789-794. ISBN: 06464 42853;

• “Feature extraction of faulty bearings vibration via Basis Pursuit”, in the

Proceedings of the 10th Asia-Pacific Vibration Conference (APVC, 2003): pp. 795-800.

ISBN: 06464 42853;

• “Fault diagnosis of rolling element bearings using basis pursuit”, to be published

in the Journal of Mechanical Systems and Signal Processing, Volume 19 (2),

2005: pp. 341-356.

(2) Development of additional features for time-frequency analysis:-

• Statistical parameters derived from wavelet packets such as Mean, Variance,

Root Mean Square (RMS), Skewness, Kurtosis, and Crest Factor;

• Wavelet packet based Matched Filter parameter;

• Selected components derived from the coefficients of Matching Pursuit and Basis Pursuit.

(3) The design and evaluation of Neural Network classifiers and their performance in

automatic diagnosis. These include:

• The design of two architectures for Feed Forward Neural Networks;

• An evaluation of the performance of the Neural Networks on a variety of features

derived from DWPA, Matching Pursuit and Basis Pursuit. All of this work has

been published in:

o “Bearing fault classification using wavelet features”, in the Proceedings

of Intelligence Maintenance System 2004 International Conference.

Arles, France: Section 1-D;



o “Matching Pursuit features based Neural Network pattern recognition of

rolling bearing faults”, in the Proceedings of the International Conference

of Maintenance Societies, 2004. Sydney, Australia: paper 74:pp. 1-8;

o “Basis Pursuit feature based pattern recognition of rolling bearing faults”,

in the Proceedings of Intelligence Maintenance System 2004 International

Conference, 2004. Arles, France: Section 2-C.

(4) The presentation of an integrated application of the various wavelet based techniques.

1.6 Organisation of Thesis

This thesis starts with a brief introduction to fault diagnosis of rotating machinery.

The objectives, significance, and originality of this research are covered in Chapter 1.

Chapter 2 provides a comprehensive literature review on vibration feature extraction

and AI techniques for condition monitoring and fault diagnosis of rotating

machinery. Vibration feature extraction is categorized as:

• time domain feature extraction techniques,

• frequency domain feature extraction techniques, and

• time-frequency feature extraction techniques.

The advantages and disadvantages of these techniques are presented. The review also

covers the application of a variety of AI techniques such as Neural Networks,

fuzzy logic, expert systems, and hybrid AI techniques.

Chapter 3 provides the analysis and interpretation of vibration analysis techniques

for better representation of vibration signals in terms of diagnostic results. It covers

the Basis Pursuit, the best basis DWPA and the Matching Pursuit techniques.

In Chapter 4, the automated diagnosis procedure based on spectrum and time-

frequency analysis techniques is presented. The design and evaluation of Neural

Network classifiers for the automatic diagnostic scheme are also presented.

Chapter 5 presents the simulation and experimental phases of this thesis.

In Chapter 6, the results of the application of these techniques to the simulated and

experimental data are presented and discussed.



Chapter 7 presents the conclusions of this work. Some suggestions to extend the

work contained in this thesis are presented in Chapter 8.

CHAPTER 2. LITERATURE REVIEW



2.1 Introduction

The background of fault diagnosis of rotating machinery is introduced in this chapter.

The literature on techniques for vibration based fault diagnosis is reviewed, including

research work published in books,

conference articles, journal papers, and reports. The methods used are

discussed and analysed with critical comments. Based on the overall review of

techniques for diagnosing machinery faults, some conclusions are drawn from the

literature.

2.2 Background

2.2.1 Condition Monitoring and Fault Diagnosis

Machine availability is a major concern in industry. In general, machine life can be

described using the bathtub curve (as shown in Figure 2.1). The first phase of a

machine's life is the preparation phase, that is, the period a new machine experiences until it

commences normal operation. The second phase is the normal operation phase,

which is followed by the failure phase. The failure phase starts with incipient faults

and ends in the breakdown of the machine. It is necessary to monitor the condition of

a machine throughout its life. Condition monitoring is a field of

technical activity in which selected parameters associated with machinery operation

are observed to determine integrity [3]. Condition monitoring is essential for

maintenance management in industry and usually involves five distinct phases:

fault detection, fault diagnosis, prognosis of fault progression,

prescription of treatment for a problem, and post mortem.

Fault diagnosis is a critical component of condition monitoring and mainly

ascertains the location, cause, and severity of a machine fault.



Figure 2.1: Machine life bathtub curve

Maintenance strategies include breakdown maintenance, routine maintenance,

condition-based maintenance, and pro-active maintenance [2]. To date, various

condition-monitoring techniques have been developed based on theory from areas

such as dynamics, industrial noise and vibration, tribology, and non-destructive

techniques of moving structures and rotating machinery [1, 2]. A recent ISO

working party (ISO 1991) identified the main techniques for machine

condition monitoring as:

• Vibration measurements;

• Tribological measurements;

• Electrical measurements;

• Process and performance measurements; and

• Non-destructive testing.

These measurements are analysed as wear debris, vibrations, temperature,

performance, and expired life [4]. As a general principle, the degree of deterioration

is detected by the ‘level’ of a measurement and its change with time, while the cause

is generally indicated by its ‘shape’.

These five basic techniques can be used as a guideline for planning the condition

monitoring of industrial plant or machines, since they provide a check-list to ensure

that all possible techniques have been considered.

All these techniques require some form of transduction and statistical parameter or

advanced signal processing for diagnosis and life estimation of systems. Among the

condition monitoring techniques, vibration condition monitoring is popular for its


versatility and its effectiveness. There are many mechanical problems associated

with vibrations. Common problems are: (a) imbalance of rotating parts; (b) eccentric

components; (c) misalignment of couplings and bearings; (d) bent shafts; (e)

component looseness; (f) worn or damaged gears; (g) worn drive belts and drive

chains; (h) defective anti-friction bearings; (i) torque variations; (j) electromagnetic

forces; (k) aerodynamic forces; (l) hydraulic forces; (m) resonance; and (n) rubbing

[5]. All of the above problems can cause vibration. Meanwhile, vibration in machines

causes periodic stresses in machine parts, which lead to fatigue failure. If the motion

due to vibration is severe enough, it can cause machine parts to come into unwanted

contact, causing wear or damage [6]. Machine vibration is a parameter that

often indirectly represents the health of machines, and vibration monitoring can generally

detect more kinds of machine faults than the other techniques.

Vibration monitoring also has advantages as a non-destructive, clean, relatively

simple and cost effective technique [7].

Vibration monitoring of rolling element bearings is typically conducted using a case

mounted transducer: an accelerometer, a velocity pickup, or sometimes a

displacement sensor. Acceleration signals, obtained from case mounted sensors,

emphasize high frequency sources, while displacement signals emphasize lower

frequency sources, with velocity signals falling between the extremes.
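The frequency emphasis described above follows from differentiation: for a harmonic displacement x(t) = X sin(2*pi*f*t), the peak velocity is 2*pi*f*X and the peak acceleration is (2*pi*f)^2 * X. The short sketch below illustrates this with hypothetical amplitudes and frequencies.

```python
import math

def harmonic_amplitudes(displacement_amplitude, frequency_hz):
    """Peak velocity and acceleration of x(t) = X * sin(2*pi*f*t)."""
    omega = 2.0 * math.pi * frequency_hz
    velocity = omega * displacement_amplitude
    acceleration = omega ** 2 * displacement_amplitude
    return velocity, acceleration

# Same 1 micrometre displacement amplitude at a low and a high frequency
# (hypothetical values chosen only to show the scaling).
low = harmonic_amplitudes(1e-6, 10.0)
high = harmonic_amplitudes(1e-6, 10000.0)
velocity_gain = high[0] / low[0]          # proportional to f
acceleration_gain = high[1] / low[1]      # proportional to f squared
```

For equal displacement, a thousandfold increase in frequency raises the velocity amplitude a thousandfold and the acceleration amplitude a millionfold, which is why accelerometers favour high frequency sources.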

Figure 2.2: Overall levels of a bearing through continuing phases of failure

(Axes: amplitude versus time, Phase 0 to Phase IV. Curves: 1 Filtered High Frequency, 2 Acceleration, 3 Velocity, 4 Displacement.)



Figure 2.2 [8] depicts the overall amplitude levels obtained from a bearing as it

progresses through continuing phases of failure.

Figure 2.3 illustrates the spectral phenomena observed when bearings have faults of different

severities [8]. Generally, bearing fault progression can be described in four stages from

normal condition to the most serious damage. The spectra of bearings with faults at

different stages contain harmonics of the rotating frequency, defect frequencies,

natural resonance frequencies, and high frequencies in the ultrasonic region.

Stage I exhibits very high frequency content in the Spike Energy region. This zone

lies in the ultrasonic region and requires a sensor specifically designed to detect these

components. Physical inspection of the bearing at this stage may not show any

identifiable defects.

Stage II begins to generate signals associated with natural resonance frequencies of

the bearing parts as bearing defects begin to "ring" the bearing components. A

notable increase in zones 3 and 4 is normally recorded at this stage. Visual inspection

will show defects at this stage.

The Stage III condition shows the fundamental bearing defect frequencies present.

Harmonics of defect frequencies may be present depending upon the quantity of

defects and their dispersal around the bearing races. The harmonic frequencies will

be modulated, or side banded, by the shaft speed. Zone 4 signals continue to grow

throughout this stage.

Stage IV is the last condition before catastrophic failure of the bearing. This stage is

associated with numerous modulated fundamental frequencies and harmonics

indicating that the defects are distributed around the bearing races. The internal

clearances are greater and allow the shaft to vibrate more freely with associated

increases in the shaft frequencies associated with balance or mis-alignment due to the

increased degradation of the bearing. During the later phases of stage IV, the bearing

fundamental frequencies will decline and be replaced with a random noise floor or

"hay stack" at higher frequencies. Zone 4 signal levels will actually decrease, followed by a

significant increase just prior to failure.
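The fundamental bearing defect frequencies referred to in the stage descriptions can be estimated from bearing geometry. The sketch below uses the standard kinematic formulas, assuming a stationary outer race, a rotating inner race, and no slip. The geometry values and the function name are hypothetical and are not taken from the test rig used in this research.

```python
import math

def bearing_defect_frequencies(shaft_hz, n_balls, ball_d, pitch_d,
                               contact_angle_deg=0.0):
    """Kinematic defect frequencies in Hz (outer race fixed, no slip assumed)."""
    ratio = (ball_d / pitch_d) * math.cos(math.radians(contact_angle_deg))
    return {
        # Fundamental train (cage) frequency
        "FTF": 0.5 * shaft_hz * (1.0 - ratio),
        # Ball pass frequency, outer race
        "BPFO": 0.5 * n_balls * shaft_hz * (1.0 - ratio),
        # Ball pass frequency, inner race
        "BPFI": 0.5 * n_balls * shaft_hz * (1.0 + ratio),
        # Ball spin frequency
        "BSF": 0.5 * (pitch_d / ball_d) * shaft_hz * (1.0 - ratio ** 2),
    }

# Hypothetical bearing: 9 balls, 7.94 mm ball diameter, 39.04 mm pitch
# diameter, shaft turning at 25 Hz (1500 RPM).
shaft_hz = 25.0
freqs = bearing_defect_frequencies(shaft_hz, n_balls=9, ball_d=7.94,
                                   pitch_d=39.04)
# Sidebands spaced at shaft speed around the inner race defect frequency.
inner_race_sidebands = (freqs["BPFI"] - shaft_hz, freqs["BPFI"] + shaft_hz)
```

In practice measured defect frequencies deviate slightly from these values because of slip, so the results are best treated as search regions in the spectrum rather than exact lines.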



Figure 2.3: Spectral characteristics of different stage bearing faults

2.2.2 Artificial Intelligence

In an era when information technology and inter-disciplinary approaches need to be

employed to assess the integrity of systems and machinery, condition monitoring has

begun to incorporate these technologies to assist with the interpretation and

diagnostics of signals particularly with integration of artificial intelligence

techniques. Artificial intelligence (AI) can be defined as a branch of computer

science that is concerned with automation of intelligent behaviour [9]. The goal of

Artificial Intelligence is the development of paradigms or algorithms that require

machines to perform cognitive tasks, at which humans are currently better. An AI

system must be capable of doing three things: (1) store knowledge; (2) apply the

knowledge stored to solve problems; and (3) acquire new knowledge through

experience. An AI system has three key components: representation, reasoning, and

learning [10]. AI techniques, including expert systems, fuzzy logic, and Neural Networks,

have been widely applied in engineering.

(Figure 2.3 diagram labels: panels Normal, Stage I, Stage II, Stage III, and Stage IV, each marked at 1xRPM, 2xRPM, and 3xRPM; Zone 1; Zone 2, defect frequencies; Zone 3, natural resonance frequency; Zone 4, high frequency.)





Expert systems (knowledge based systems) are a branch of AI techniques that

originated in 1965. An expert system generally consists of four essential

components: a knowledge base, an inference engine, a knowledge-acquisition

module, and an explanatory interface [11]. Expert systems can capture and retain the

expertise of a skilled person. This is often necessary, particularly in the condition

monitoring and fault diagnosis field, because, for example, that expert is about to

retire or change jobs. Expert systems for condition monitoring and fault diagnosis

have been built to address problems such as lack of staff, lack of correct skill

levels, lack of time to perform tasks, and inconsistent performance of tasks [5].

Fuzzy logic was first introduced by Zadeh [12] and is a kind of logic suited to

the imprecise nature of reality. Traditional logic allows only for “true” and “false”

states. However, knowledge expressed in such two-valued logic will be partial

because imprecision is ubiquitous in reality. Values between “true” and “false” should also be

permitted. A fuzzy set is a set of elements within an

interval and with a membership function on the interval [13, 14]. A fuzzy set is the

cornerstone of a non-additive uncertainty theory, namely possibility theory, and is a

versatile tool for both linguistic and numerical modelling: known as fuzzy rule-based

systems. Several works now combine concepts on fuzzy sets with other scientific

disciplines. In the field of machine condition monitoring, fuzzy logic can be used to

represent machine faults more intuitively than is possible for conventional (precise)

diagnosis. It has the potential to represent the imprecise nature of machine diagnosis.

For example, a machine fault can be represented as seriously or lightly damaged

rather than damaged or not.
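The idea of grading a fault as lightly or seriously damaged can be sketched with simple membership functions. The breakpoints, the normalised severity scale, and the function names below are hypothetical and serve only to show how a single measurement can belong to several fuzzy sets to different degrees.

```python
def ramp_up(x, a, b):
    """Membership that is 0 below a, 1 above b, and linear in between."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def damage_memberships(level):
    """Degrees of membership for a normalised severity level in [0, 1].

    The breakpoints 0.1, 0.3, 0.4, and 0.8 are illustrative assumptions.
    """
    serious = ramp_up(level, 0.4, 0.8)
    healthy = 1.0 - ramp_up(level, 0.1, 0.3)
    light = min(ramp_up(level, 0.1, 0.3), 1.0 - serious)
    return {
        "healthy": healthy,
        "lightly damaged": light,
        "seriously damaged": serious,
    }

borderline = damage_memberships(0.6)
```

A severity of 0.6 is then partly "lightly damaged" and partly "seriously damaged" rather than being forced into one crisp class.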

Pattern recognition is the method of assigning data into one of a number of pre-

specified classes based on the extraction and processing of significant features [15].

A conceptual representation of a pattern recognition problem is shown in Figure 2.4

[16]. It is a key component in identifying failure models which are induced from the

monitored system. The three approaches to pattern recognition are:

• Statistical pattern recognition (decision-theoretic);

• Syntactic (linguistic or structural); and

• Neural Network (black box).



Figure 2.4: Conceptual representation of a pattern recognition problem

Neural Networks are an important technique in the field of artificial intelligence,

which has been motivated right from its inception by the recognition that the human

brain computes in an entirely different way from conventional digital computers

[10]. The brain is a highly complex, nonlinear, and parallel computer (information-

processing system). It has the capability to organize its structural constituents, known

as neurons, so as to perform certain computations many times faster than the fastest

digital computer in existence today. Neural Networks are widely adopted for their

learning ability, which could be usefully applied in machine diagnosis. For example,

a suitable Neural Network can record new classes of faults so that they can be

recognized when the same fault happens again. Without this learning ability, a fault

diagnosis system may classify a new class of faults into an existing class thus leading

to a wrong judgement.

Jang [17] depicted the architecture of a Neural Network (as shown in Figure 2.5),

which was typically organized in layers. Layers were made up of a number of

interconnected “nodes” which contained an “activation function”. Patterns were

presented to the network via the “input layer”, which communicated to one or more

“hidden layers” where the actual processing was done via a system of weighted

“connections”. The hidden layers were then linked to an “output layer” where the

answer was provided [17, 18].
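The layered structure described above can be sketched as a forward pass through a small feed forward network. The layer sizes, the sigmoid activation function, the random untrained weights, and the input feature values are all assumptions made for illustration; they do not reproduce the classifiers designed in this research.

```python
import math
import random

def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per node, then activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(features, network):
    """Propagate a feature vector through every (weights, biases) layer."""
    activations = features
    for weights, biases in network:
        activations = layer(activations, weights, biases)
    return activations

def random_layer(n_in, n_out, rng):
    """Untrained layer with weights and biases drawn from [-1, 1]."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases

rng = random.Random(0)
# Hypothetical sizes: 6 input features, 5 hidden nodes, 4 output classes.
network = [random_layer(6, 5, rng), random_layer(5, 4, rng)]
outputs = forward([0.7, 1.4, 0.5, 0.0, 1.5, 0.1], network)
predicted_class = max(range(len(outputs)), key=lambda i: outputs[i])
```

With trained weights, the index of the largest output would indicate the diagnosed condition. Here the weights are random, so the predicted class carries no diagnostic meaning.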

Figure 2.5: Architecture of a Neural Network

(Figure 2.4 diagram labels: Physical variables; Data acquisition; Data pre-processing; Decision classification; Phase I; Phase II; Phase III.)



2.3 Literature on fault diagnosis of rotating machinery

Generally, vibration based fault diagnosis can be conducted conventionally in the

following phases: data collection, feature extraction, and fault detection and

identification. The intelligent diagnosis procedure is shown in Figure 2.6, which begins

with the act of data collection (obtaining signals using transducers from machinery to

be diagnosed) followed by feature extraction (extracting characteristics whose values

quantitatively represent faults). As an example of extracted features, in vibration

analysis, impact pulses due to defects are usually regarded as an important

characteristic (feature) for machinery fault diagnosis.

Figure 2.6: Fault diagnosis - an overview

Various mechanical rotating parts, e.g. rolling bearings, gears, pumps, chains, belts,

and electric rotators are very important diagnostic objects in industry. Among these

machinery parts, rolling bearings are the most often used. There are

many problems which can be related to bearings and their vibrations. Installation

problems are relatively common and are often caused by improperly forcing the

bearing onto the shaft or in the housing. Misalignment of bearings is a common

result of defective bearing installation. There are four types of misalignment:

out-of-line, shaft deflection, cocked or tilted outer race, and cocked or

tilted inner race [19]. These problems can cause bearing failure. Usually the failure is

accelerated by overloading, over speeding, or starving the bearings of lubricants.

Bearing defects may be categorized as ‘distributed’ or ‘local’. Distributed defects

include surface roughness, waviness, misaligned races and off-size rolling elements.

Distributed defects are caused by manufacturing error, improper installation or

abrasive wear. The variation in contact area between rolling elements and raceways

due to distributed defects results in an increased vibration level. Localized defects

include cracks, pits and spalls in the rolling surfaces. The presence of a defect causes


a significant increase in the vibration levels, which are used as the features to be

sought for diagnosing different machinery faults.

Feature extraction techniques for diagnosing rotating machinery faults are

widespread, range from statistical to model based techniques, and comprise a

variety of signal processing algorithms, including wavelet transforms. Existing

techniques for extracting vibration features, which are critical and significant for

reliable fault diagnosis, are reviewed in section 2.3.1. Fault detection and identification

is a subsequent step and has recently incorporated artificial intelligence techniques.

Conventionally, faults are detected and identified by using experts’ interpretation of

extracted features, and sometimes even direct inspection of running machines. The

advances of fault identification techniques enhance efficacy and reliability of fault

diagnosis. Techniques for fault detection and identification using artificial

intelligence techniques are reviewed in section 2.3.1. With the advancement of

modern fault diagnosis techniques, researchers combine techniques to

achieve effective fault diagnosis. Several feature extraction techniques, artificial

intelligence techniques, or their combinations may be used together for one fault

diagnostic task. Diagnosis approaches using both feature extraction and artificial

intelligence techniques are reviewed in section 2.3.2.

2.3.1 Feature Extraction for Fault Diagnosis of Rotating Machinery

A variety of features have been sought in the literature to assist fault diagnosis. There are

many potential advantages obtained when performing feature extraction in the

application of Condition Monitoring and Fault Diagnosis (CMFD). From the point of

view of computing complexity, modern machine CMFD can have a number of inputs

obtained from collected digital signals, which often creates an information overload

for the operators. The information overload makes it difficult to gain an overview of

the machine, and determine exactly what is happening. Using feature selection in

conjunction with classifiers allows the classifiers to act as a data generalization layer,

interpreting the raw data to provide information that can be digested easily, and used

to make meaningful assessments of the condition of a machine. Effective

features can allow machine condition to be ascertained more reliably than by human

operators, and allow further high-level integration with artificial intelligence



techniques such as expert systems, which can provide wide-scale monitoring of the

machine as a whole, rather than the sub-assemblies that the classifiers monitor.

The feature extraction phase is critical in fault diagnosis practice. The purpose of

feature extraction is twofold; firstly, feature extraction is an attempt to reduce the

dimensionality of the data presented to the classifier or human inspection, without

diminishing the content presented in the data. Secondly, feature extraction is utilized

to turn raw data into the information that the classifier can use. The vibration data as

sampled will consist of several hundreds, or even thousands of data points.

Training a classifier on raw data typically requires a very long training time,

makes the network extremely complex, and sometimes leads to failure of the

classification. It also makes the generalisation of the network fairly poor, as so many

input factors will make it difficult for the classifier to determine useful relationships

between inputs, and consequently, to generalize effectively. This effect, known as the

curse of dimensionality, can be dealt with using feature extraction, which is simple

and effective.

To extract features from raw data, numerous vibration analysis techniques have been

applied to fault diagnosis of rotating machinery. Mathew and Alfredson [20] gave a

review of vibration monitoring techniques in time and frequency domains and their

results on rolling element bearings. McFadden, Smith [21-23] and Kim [24-27]

included classical non-parametric spectral analysis; principal component analysis;

joint time-frequency analysis; the discrete wavelet transform; and a change detection

algorithm based on residual generation. Lebold and McClintic [28] reviewed

statistical methods for extracting vibration features when diagnosing gearboxes. They

made an attempt to define and unify statistical technique terms, establish the

preprocessing needed for each feature, and provide the details needed to produce

consistent results. They categorized features into five different groups based on their

preprocessing needs when diagnosing gearboxes. They were: 1) Raw signals (RAW)

which includes Root Mean Square (RMS), Kurtosis, Delta RMS, Crest Factor,

enveloping and demodulation, 2) Time synchronous averaged signal (TSA) which

includes FM0 and the Comblet, 3) Residual signal (RES) which includes NA4 and NA4*, 4)

Difference signal (DIF) which includes FM4, M6A, and M8A, and 5) Band-pass

mesh signal (BPM) which includes NB4.



Tandon and Choudhury presented a review of vibration and acoustic measurement

techniques for the detection of defects in rolling element bearings [29]. In the time

domain, RMS, Kurtosis and a shock pulse method have been analysed. In the

frequency domain, how to apply Fast Fourier Transform (FFT) has been explained.

The power cepstrum is defined as the power spectrum of the logarithm of the power spectrum. The Adaptive Noise

Cancelling (ANC) technique and envelope detection, also known as the high-frequency resonance

technique (HFRT), are important signal processing techniques. Chow [30] provided a

brief review of model-based approaches and signal processing approaches on motor

fault detection and diagnosis.

The review of feature extraction techniques is updated here, grouped by signals in the

time domain, frequency domain, and the combination of time and frequency domain

(as shown in Figure 2.7). These categories are treated and reviewed separately in the

following sections.

Figure 2.7: An overview of feature extraction techniques

2.3.1.1 Time domain Feature Extraction Techniques

Vibration signals are initially obtained as a series of digital values representing

proximity, velocity, or acceleration in the time domain. The time waveforms can be

processed to achieve diagnostic objectives. Certain features such as statistical

parameters can be extracted using time domain vibration analysis techniques. The

machine faults can be distinguished using the quantitative representation of time

domain features. This section includes research appearing in literature, and reviews

vibration techniques in the time domain for various types of rotating machinery and

categorises these techniques into the following groups (as shown in Figure 2.8).

[Figure 2.7 diagram: Feature extraction branches into Time domain, Frequency domain, and Time and frequency domain.]



Figure 2.8: An overview of time domain feature extraction techniques

(1) Statistical parameters, which include Root Mean Square (RMS), Mean, Variance,

Skewness, Kurtosis, and Crest Factor

Statistics is an area that can provide many ideas for vibration analysis in fault

diagnosis of machinery. Statistical analyses of vibration signals have proved to be

useful in detecting machinery faults. Tandon [29] showed that the probability density

function is correlated with bearing defects. The probability density of acceleration of

a bearing in good condition has a Gaussian distribution, whereas a damaged bearing

results in non-Gaussian distribution with dominant tails, because of a relative

increase in the number of high levels of acceleration. Andrade [31] proposed a

comparison of the Cumulative Distribution Function (CDF) of a target distribution with

the CDF of a reference distribution and used the likelihood to successfully detect

gear tooth fatigue cracks. Mathew and Alfredson [20] also reported obtaining a

near-Gaussian distribution for some damaged bearings. Instead of studying the probability

density curves, it is often more informative to examine the statistical moments of the

data, defined as

\[ M_n = \int_{-\infty}^{+\infty} x^n P(x)\,dx, \qquad n = 1, 2, 3, \ldots, m \tag{2.1} \]

[Figure 2.8 diagram, labels recovered from the figure: Time domain; Raw signals; Statistical parameters (Root Mean Square (RMS), Mean, Variance, Skewness, Kurtosis, and Crest Factor); Time Synchronous Averaged Signal (TSA) based methods (Time synchronous averaged (TSA) signal, Residual signal (RES), and Difference signal); Filter based methods (Demodulation, Prony model, and Adaptive noise cancelling); Stochastic methods and other advanced methods (Chaos, Blind deconvolution, Thresholding, and Autoregressive model based method).]



In Equation (2.1), P(x) is the probability density function of instantaneous amplitude x. The first

moment is the mean value, and the second moment taken about the mean is the Variance.

The third moment normalized with respect to the cube of standard

deviation is known as the coefficient of ‘Skewness’. Kurtosis is defined as the fourth

moment of the distribution normalized with respect to the fourth power of the standard deviation, and measures the relative peakedness or flatness of a

distribution as compared to a normal distribution. Kurtosis provides a measure of the

size of the tails of distribution and is used as an indicator of major peaks in a set of

data. As rotating machinery faults present themselves, Kurtosis should signal an error

due to the increased level of vibration. Kurtosis has been applied to diagnosing

bearing and gearbox faults [32].
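As a minimal sketch of how these statistical parameters could be estimated from sampled data, the following replaces the integral over P(x) with sample averages taken about the sample mean. The function names and the two test signals (a clean sine and the same sine with sparse added impacts) are assumptions made for illustration and do not come from the cited works.

```python
import math


def central_moment(x, n):
    """n-th sample moment of x taken about the sample mean."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** n for v in x) / len(x)


def skewness(x):
    """Third central moment divided by the cube of the standard deviation."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5


def kurtosis(x):
    """Fourth central moment divided by the squared variance (3 for Gaussian data)."""
    return central_moment(x, 4) / central_moment(x, 2) ** 2


# Hypothetical signals: a smooth sine, and the same sine with ten added impacts.
smooth = [math.sin(2 * math.pi * k / 50) for k in range(1000)]
impacted = list(smooth)
for k in range(0, 1000, 100):
    impacted[k] += 5.0

kurtosis_smooth = kurtosis(smooth)      # a pure sine gives 1.5
kurtosis_impacted = kurtosis(impacted)  # impacts lengthen the tails
```

The impacted signal yields a much larger Kurtosis than the smooth one, which is the behaviour the text attributes to a damaged bearing.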

RMS and Delta RMS

The root mean square (RMS) value of a vibration signal is a time analysis feature,

which is the measure of the power content in the vibration signature. This feature is

good for tracking the overall noise level, but it will not provide any information on

which component is failing. It can be very effective in detecting a major

imbalance in rotating systems.

Delta RMS is the difference between the current RMS value and the previous one. The

RMS value and Crest Factor have been applied in diagnosing bearings and gears

[29]. This feature can be very effective when detecting an imbalance in rotating

machinery. The most basic approach to measuring defects in the time domain is to

use the RMS approach, which is often not sensitive enough to detect incipient faults

in particular.
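The two features above can be sketched as follows, assuming that each time record is a plain list of samples and that Delta RMS is taken between consecutive records. The function names and the example records are illustrative assumptions.

```python
import math


def rms(x):
    """Root mean square of one time record."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def delta_rms(records):
    """Difference between the RMS of each record and that of the previous record."""
    levels = [rms(r) for r in records]
    return [cur - prev for prev, cur in zip(levels, levels[1:])]


# Hypothetical sequence of three records: the level doubles, then stays constant.
records = [
    [1.0, -1.0, 1.0, -1.0],
    [2.0, -2.0, 2.0, -2.0],
    [2.0, -2.0, 2.0, -2.0],
]
trend = delta_rms(records)
```

A sustained positive Delta RMS indicates a rising overall level, but, as noted above, it does not identify which component is failing.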

Kurtosis

Kurtosis is defined as the fourth moment of the distribution and measures the relative

peakedness or flatness of a distribution as compared to a normal distribution.

Kurtosis provides a measure of the size of the tails of distribution and is used as an

indicator of major peaks in a set of data. As a gear wears, this feature should signal an

error due to the increased level of vibration.



Crest Factor

The RMS level may not show appreciable changes in the early stages of gear and

bearing damage. A better measure is to use “Crest Factor” which is defined as the

ratio of the peak level of the input signal to the RMS level. Therefore, peaks in the

time series signal will result in an increase in the Crest Factor value. Crest Factor

typically lies between 2 and 6 in normal operation. A value above 6 is usually

associated with machinery problems. This feature is used to detect changes in the

signal pattern due to impulsive vibration sources such as tooth breakage on a gear or

a defect on the outer race of a bearing.
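A short sketch of the Crest Factor calculation is given below, taking the peak level as the largest absolute sample value. That choice, the function name and the test signals are assumptions for illustration.

```python
import math


def crest_factor(x):
    """Ratio of the peak absolute level to the RMS level of a time record."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    peak = max(abs(v) for v in x)
    return peak / rms


# Hypothetical signals: a clean sine and the same sine with one large impulse.
sine = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
spiky = list(sine)
spiky[25] = 10.0

cf_sine = crest_factor(sine)    # sqrt(2) for a pure sine
cf_spiky = crest_factor(spiky)  # a single impulse raises the ratio sharply
```

Because a single impulse changes the peak far more than the RMS, the ratio responds to impulsive faults earlier than the RMS level alone.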

The RMS, peak value, Kurtosis and Crest Factor have been combined with a high

frequency resonance technique and an adaptive line enhancer to detect and localize

the damage in rolling bearings [33].

(2) Time synchronous averaging based methods, which include Time Synchronous

Averaged (TSA) signal, residual signal (RES), and difference signal (DIF)

The TSA signals are the signals obtained by time synchronous averaging of the

initial data and reducing redundant noise. The repetitive signals after TSA can

indicate the information related to the faults, which need to be diagnosed. The time

synchronous averaged signal (TSA) includes FM0 and the Comblet; the Residual signal

(RES) includes NA4 and NA4*; the Difference signal (DIF) includes FM4, M6A, and

M8A; and the Band-pass mesh signal (BPM) includes NB4. The TSA including FM0 and

Comblet [28] requires knowing the repetitive frequency of the desired signal such as

defect frequencies of rolling bearings, gears, and shafts. Synchronous averaged

signals were utilized to diagnose faults in rolling bearings and gears successfully [34-

36].

Residual signals (RES) [28] were used for diagnosing gear faults. RES consists of the

time synchronous averaged signal with primary meshing and shaft components along

with their harmonics removed. RES may be system dependent. Difference signals

(DIF) were calculated by removing the regular meshing components from the time



synchronous averaged signal. The DIF signals were used to diagnose gearbox faults

effectively [37].

FM0 is a relatively simple method used to detect major changes in the meshing

pattern. Major tooth faults typically result in an increase of the peak-to-peak signal

levels, but do not change the amplitude at the meshing frequency. FM0 is defined as the peak-to-peak

level of the TSA signal divided by the sum of the amplitude at the gear-mesh

frequency and its corresponding harmonics. For heavy wear, the peak-to-peak level remains

constant while the amplitude at the meshing frequency decreases, causing the FM0 parameter

to jump up. Both situations result in a large increase in the FM0 parameter.

However, FM0 is not a good indicator for minor tooth damage. The equation for FM0

is:

\[ FM0 = \frac{PPA}{\sum_{i=1}^{n} A(f_i)} \tag{2.1} \]

where PPA is the peak-to-peak amplitude of the time synchronous averaged

waveform and $A(f_i)$ is the amplitude of the gear-mesh fundamental and harmonics

in the frequency domain.
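The FM0 ratio can be sketched with NumPy as below. The sketch assumes that the TSA record spans a whole number of mesh periods, so that each harmonic falls on a single FFT bin, and it reads the single-sided amplitude at the bin nearest each harmonic. The function name `fm0`, its parameters and the synthetic signals are illustrative assumptions.

```python
import numpy as np


def fm0(tsa, fs, mesh_freq, n_harmonics=3):
    """Peak-to-peak level of the TSA signal over the summed mesh-harmonic amplitudes."""
    tsa = np.asarray(tsa, dtype=float)
    ppa = tsa.max() - tsa.min()
    # Single-sided amplitude spectrum (valid for the non-DC, non-Nyquist bins used here).
    amplitude = np.abs(np.fft.rfft(tsa)) * 2.0 / len(tsa)
    freqs = np.fft.rfftfreq(len(tsa), d=1.0 / fs)
    total = 0.0
    for k in range(1, n_harmonics + 1):
        nearest_bin = np.argmin(np.abs(freqs - k * mesh_freq))
        total += amplitude[nearest_bin]
    return ppa / total


# Hypothetical TSA records: a pure 50 Hz mesh tone, then the same with one tooth impact.
fs = 1000.0
t = np.arange(1000) / fs
healthy = np.sin(2 * np.pi * 50.0 * t)
damaged = healthy.copy()
damaged[123] += 3.0

fm0_healthy = fm0(healthy, fs, 50.0, n_harmonics=1)
fm0_damaged = fm0(damaged, fs, 50.0, n_harmonics=1)
```

The impact raises the peak-to-peak level while leaving the mesh amplitude almost unchanged, so FM0 increases.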

NA4

NA4 was developed to detect the onset of damage and to continue to react to this

damage as it spreads and increases in magnitude. NA4 is determined by dividing the

fourth statistical moment of the residual signal by the current run time averaged

Variance of the residual signal, raised to the second power. The equation for NA4 is

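\[ NA4 = \frac{N \sum_{i=1}^{N} (r_i - \bar{r})^4}{\left\{ \frac{1}{m} \sum_{j=1}^{m} \left[ \sum_{i=1}^{N} (r_{ij} - \bar{r}_j)^2 \right] \right\}^2} \tag{2.2} \]

where r is the residual signal, $\bar{r}$ is the mean value of the residual signal, N is the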

total number of data points in the time record, and m is the current time record

number in the run ensemble.
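A minimal sketch of NA4 follows, assuming the run ensemble is supplied as a list of equal-length residual records with the current record last. The numerator is scaled by N so that the ratio equals the fourth statistical moment divided by the squared run-averaged Variance, as the verbal definition above requires. The function name and the example data are assumptions.

```python
def na4(records):
    """NA4 for the last record of a run ensemble of residual signals."""
    m = len(records)
    current = records[-1]
    n = len(current)
    mean_current = sum(current) / n
    numerator = n * sum((v - mean_current) ** 4 for v in current)
    # Sum of squared deviations of every record so far, each about its own mean.
    run_sum = 0.0
    for record in records:
        mean_j = sum(record) / len(record)
        run_sum += sum((v - mean_j) ** 2 for v in record)
    return numerator / (run_sum / m) ** 2


# Hypothetical ensemble: one healthy record, then a record with doubled amplitude.
healthy = [1.0, -1.0, 1.0, -1.0]
worn = [2.0, -2.0, 2.0, -2.0]

na4_first = na4([healthy])         # 16 / 4**2
na4_second = na4([healthy, worn])  # 256 / 10**2
```

Because the denominator averages over the whole run, NA4 keeps responding as damage grows rather than settling back once the current Variance has also risen.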



NA4*

NA4* (or ENA4) was developed as an enhanced version of NA4, and was expected to

be more robust when progressive damage occurs. This added robustness is

incorporated into NA4* by normalizing the fourth statistical moment with the

residual signal Variance for a gearbox in good condition instead of the running

Variance, which is used for NA4. The equation for NA4* follows:

\[ NA4^{*} = \frac{\frac{1}{N} \sum_{i=1}^{N} (r_i - \bar{r})^4}{\left( \tilde{M}_2 \right)^2} \tag{2.3} \]

where r is the residual signal, $\bar{r}$ is the mean value of the residual signal, N is the total

number of data points in the time record, and $\tilde{M}_2$ is the Variance of the residual signal

for a gearbox in good condition.
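The NA4* normalization can be sketched as below. The sketch assumes that the fourth statistical moment is the mean of the fourth powers of the deviations and that the healthy-gearbox Variance has been measured beforehand and is passed in as a number. The names are hypothetical.

```python
def na4_star(residual, healthy_variance):
    """Fourth moment of the residual over the squared healthy-condition Variance."""
    n = len(residual)
    mean = sum(residual) / n
    fourth_moment = sum((v - mean) ** 4 for v in residual) / n
    return fourth_moment / healthy_variance ** 2


# Hypothetical data: the healthy Variance is fixed at 1.0 for the whole run.
value_healthy = na4_star([1.0, -1.0, 1.0, -1.0], healthy_variance=1.0)
value_damaged = na4_star([2.0, -2.0, 2.0, -2.0], healthy_variance=1.0)
```

Since the denominator no longer grows with the damage, the feature keeps rising as the fault progresses.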

FM4

FM4 was developed to detect changes in the vibration pattern resulting from damage

on a limited number of gear teeth. FM4 is calculated by applying the fourth

normalized statistical moment to this difference signal as given in the equation:

\[ FM4 = \frac{N \sum_{i=1}^{N} (d_i - \bar{d})^4}{\left[ \sum_{i=1}^{N} (d_i - \bar{d})^2 \right]^2} \tag{2.4} \]

where d is the difference signal, $\bar{d}$ is the mean value of the difference signal, and N is the

total number of data points in the time record. A difference signal from a gear in

good condition will be primarily Gaussian noise therefore resulting in a normalized

Kurtosis value of 3. As a defect develops in a tooth, peaks will grow in the difference

signal, causing the Kurtosis value to increase beyond 3.



M6A and M8A

M6A and M8A were proposed to detect surface damage on machinery components.

Both of these features are applied to the difference signal. The theory behind M6A

and M8A is the same as that for FM4, except that M6A and M8A are expected to be

more sensitive to peaks in the difference signal. The equations for M6A and M8A are

as follows:

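\[ M6A = \frac{N^2 \sum_{i=1}^{N} (d_i - \bar{d})^6}{\left[ \sum_{i=1}^{N} (d_i - \bar{d})^2 \right]^3} \tag{2.5} \]

\[ M8A = \frac{N^3 \sum_{i=1}^{N} (d_i - \bar{d})^8}{\left[ \sum_{i=1}^{N} (d_i - \bar{d})^2 \right]^4} \tag{2.6} \]

where d is the difference signal, $\bar{d}$ is the mean value of the difference signal, and N is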

the total number of data points in the time record.
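FM4, M6A and M8A share one structure: an even-order moment of the difference signal divided by the Variance raised to half that order. The sketch below uses that common form, which is algebraically the same as the sum-based expressions with their powers of N. The function names and the sample difference signal are assumptions for illustration.

```python
def normalized_moment(d, order):
    """Moment of the given even order about the mean, divided by Variance**(order/2)."""
    n = len(d)
    mean = sum(d) / n
    variance = sum((v - mean) ** 2 for v in d) / n
    moment = sum((v - mean) ** order for v in d) / n
    return moment / variance ** (order // 2)


def fm4(d):
    return normalized_moment(d, 4)


def m6a(d):
    return normalized_moment(d, 6)


def m8a(d):
    return normalized_moment(d, 8)


# Hypothetical difference signal with two larger excursions among small ones.
d = [1.0, -1.0] * 4 + [3.0, -3.0]
features = (fm4(d), m6a(d), m8a(d))
```

The higher-order features weight the two large excursions more heavily, which is why M6A and M8A are expected to be more sensitive to peaks than FM4.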

NB4

NB4 is similar to NA4 except that instead of using the residual signal, NB4 uses the

envelope of a band-passed segment of the time synchronous averaged signal. The

idea behind this method is that a few damaged gear teeth will cause transient load

fluctuations that are different from the normal tooth load fluctuations. The theory

suggests that these fluctuations will be manifested in the envelope of a signal which

is band-pass filtered about the dominant meshing frequency. The dominant meshing

frequency is either the primary meshing frequency or one of its harmonics whichever

appears to give the most robust group of sidebands. Some researchers suggest that the

width of the band-pass filter depends on the location of the meshing frequency relative to

other meshing frequency harmonics, while others suggest using a bandwidth giving

the maximum amount of sidebands even if the sidebands interfere with those from

other harmonics.



The reasoning of the latter method is to assume that the interference from other

sidebands is negligible and to include as many of the primary modulating sidebands as

possible. The envelope of the band-passed signal is the magnitude of the complex

(i.e., analytic) signal obtained by applying the Hilbert transform to the band-passed

signal:

\[ E(t) = \sqrt{[A(t)]^2 + \{H[A(t)]\}^2} \tag{2.7} \]

where E(t) is the envelope of the band-passed signal, A(t) is the band-passed signal,

and H[A(t)] is the Hilbert transform of the band-passed signal. The analytic signal is

A(t)+iH[A(t)].
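The envelope calculation can be sketched with NumPy by forming the analytic signal in the frequency domain: negative-frequency bins are zeroed and positive-frequency bins doubled, which is the usual discrete construction of A(t) + iH[A(t)]. The function names and the amplitude-modulated test tone are assumptions, and the record is assumed to be periodic over its length.

```python
import numpy as np


def analytic_signal(a):
    """Discrete analytic signal A + iH[A] built with the FFT."""
    a = np.asarray(a, dtype=float)
    n = len(a)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(a) * weights)


def envelope(a):
    """Magnitude of the analytic signal, i.e. E(t) for a band-passed signal A(t)."""
    return np.abs(analytic_signal(a))


# Hypothetical band-passed signal: a 100 Hz carrier modulated at 5 Hz.
fs = 1000.0
t = np.arange(1000) / fs
modulation = 1.0 + 0.5 * np.cos(2 * np.pi * 5.0 * t)
band_passed = modulation * np.cos(2 * np.pi * 100.0 * t)
recovered = envelope(band_passed)
```

For this narrow-band test signal the recovered envelope matches the imposed modulation, so the slow fluctuation becomes available for the statistics used by NB4.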

NB4 is then determined by dividing the fourth statistical moment of this envelope

signal by the current run time averaged Variance of the envelope signal, raised to the

second power, with the equation following:

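\[ NB4 = \frac{N \sum_{i=1}^{N} (E_i - \bar{E})^4}{\left\{ \frac{1}{m} \sum_{j=1}^{m} \left[ \sum_{i=1}^{N} (E_{ij} - \bar{E}_j)^2 \right] \right\}^2} \tag{2.8} \]

where E is the envelope of the band-passed signal, $\bar{E}$ is the mean value of the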

envelope signal, N is the total number of data points in the time record, and m is the

current time record number in the run ensemble.

(3) Filter based methods including demodulation, Prony model, and adaptive noise

cancelling (ANC)

Filters are widely used in feature extraction techniques for removing noise and

isolating signals. Generally, all these methods are referred to as filter based methods.

Filter based methods include demodulation, the Prony model, and adaptive noise

cancelling (ANC).

Demodulation, including phase and amplitude demodulation, is an important signal

processing technique. Amplitude demodulation is also known as envelope analysis,



resonance demodulation, or high frequency resonance demodulation techniques [38].

The amplitude demodulation separates low-level, low-frequency signals from

background noise, enabling them to be easily measured. In gear

fault detection, the amplitude demodulation focused on the fault-induced high-order

modulation sidebands around the dominant gear meshing harmonic [39]. It has also

been successfully applied to diagnose bearing faults [40]. The phase demodulation

emphasised the band associated with the structural resonance excited by the fault-

induced impacts [38].

Generally the demodulation procedure starts with using conventional Infinite

Impulse Response (IIR) Filters such as Butterworth, Chebyshev, Bessel, and Elliptic

in band-pass or band-stop form. Prony's model was used as an algorithm for finding an IIR

filter with a prescribed time domain impulse response.

Enveloping

Enveloping is used to monitor the high frequency response of the mechanical system

to periodic impacts such as gear or bearing faults. An impulse is produced each time

a loaded rolling element makes contact with a defect on another surface in the

bearing or as a faulty gear tooth makes contact with another tooth. This impulse has

an extremely short duration compared to the interval between the pulses. The energy

from the defect pulse will be distributed at a very low level over a wide range of

frequencies. This wide distribution of energy makes bearing defects difficult to

detect by conventional spectrum analysis when they are in the presence of vibrations

from gears and other machine components. The impact usually excites a resonance in

the system at a much higher frequency than the vibration generated by the other

components. This structural energy is usually concentrated into a narrow band that is

easier to detect than the widely distributed energy of the bearing defect frequencies.

With tooth wear and breakage, the sideband activity near critical frequencies such as

the output shaft frequency is expected to increase. The entire spectrum contains very

high periodic signals associated with the gear mesh frequencies.

The envelope or high frequency technique focuses on the structural resonance to

determine the health of a gear or the type of failure in a bearing. This technique



consists of processing structural resonance energy with an envelope detector. The

structural resonance is obtained by band-pass filtering the data around the structural

resonance frequency. The band-pass filtered signal is then processed by an envelope

detector, which consists of a half-wave (or full-wave) rectifier and a peak-hold and

smoothing section.

The centre frequency of the band-pass filter should be selected to coincide with the

structure resonance frequency being studied. The bandwidth of the filter should be at

least double the highest characteristic defect frequency. This will ensure that the

filter will pass the carrier frequency and at least one pair of modulation sidebands. In

practice, the bandwidth should be somewhat greater to accommodate the first two

pairs of modulation sidebands around the carrier frequency.

The rectifier in the envelope detector turns the bipolar filtered signal into a unipolar

waveform. The peak-hold smoothing section will then remove the carrier frequency

by smoothing/filtering the fast transitions in the signal. The remaining signal will

then consist of the defect frequencies.
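The band-pass, rectify and smooth chain described above can be sketched as follows. Two substitutions are made and should be read as assumptions: an ideal FFT-mask band-pass stands in for an analogue or IIR filter, and a moving average stands in for the peak-hold smoothing section. The function names, filter settings and synthetic signal are likewise illustrative.

```python
import numpy as np


def envelope_detector(x, fs, f_low, f_high, smooth_len):
    """Band-pass around the resonance, full-wave rectify, then smooth."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.fft.rfft(x)
    spectrum[(freqs < f_low) | (freqs > f_high)] = 0.0  # ideal band-pass
    band = np.fft.irfft(spectrum, n)
    rectified = np.abs(band)                            # full-wave rectifier
    kernel = np.ones(smooth_len) / smooth_len           # moving-average smoother
    return np.convolve(rectified, kernel, mode="same")


def peak_frequency(env, fs):
    """Frequency of the largest peak in the power spectrum of the envelope."""
    env = env - env.mean()
    power = np.abs(np.fft.rfft(env)) ** 2
    freqs = np.fft.rfftfreq(len(env), d=1.0 / fs)
    return freqs[np.argmax(power)]


# Hypothetical signal: a 200 Hz resonance modulated at a 10 Hz defect rate,
# buried under a stronger 30 Hz component from another machine element.
fs = 2000.0
t = np.arange(2000) / fs
resonance = (1.0 + np.cos(2 * np.pi * 10.0 * t)) * np.sin(2 * np.pi * 200.0 * t)
x = resonance + 2.0 * np.sin(2 * np.pi * 30.0 * t)

env = envelope_detector(x, fs, f_low=150.0, f_high=250.0, smooth_len=10)
defect_frequency = peak_frequency(env, fs)
```

The smoothing length of 10 samples equals one carrier period at this sampling rate, so the carrier and its rectification harmonics are suppressed while the 10 Hz defect rate is retained.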

This feature produces several values of merit for analysis use. The primary value of

merit is the peak frequency and amplitude in the power spectral density of the

enveloped data. Other values of merit include the RMS and Kurtosis values of the

filtering section and the standard deviation of the output from the rectification and

smoothing block.

The envelope technique has been widely used in numerous applications and has

shown successful results in the early detection of bearing faults. Besides early

detection, this process can help distinguish the actual cause of bearing failure by

inspecting the actual bearing defect frequencies.

Envelope detection or high-frequency resonance technique (HFRT) has been shown

to be effective in fault diagnosis [23, 34, 40].

Amplitude demodulation used in gear fault diagnosis

During a normal gear roll, one tooth essentially pushes the next without sliding.

When teeth wear, sliding occurs. The energy that went into pushing before will now

go into pushing and sliding, thus resulting in a change of amplitude or amplitude



modulation of the vibrations at the gear mesh frequency (GMF) and its harmonics.

Demodulation identifies periodicity in modulation of the carrier. The carriers used in

this processing were the GMF and 2*GMF. Demodulation techniques detect the

amplitude modulation components induced by gear wear in the region of a single

frequency, in this case the GMF or 2*GMF. This differs from enveloping which

detects the combined effects over a range of frequencies. To implement the

demodulation technique, the raw data is high-pass filtered at 85% of GMF and then

low-pass filtered at 115% of GMF. The power spectral density of the filtered signal

is searched to obtain the actual carrier frequency (GMF). The actual carrier is used to

amplitude demodulate the filtered carrier signal. The power spectral density of the

resulting signal is searched within +/- five percent of the output shaft frequency. The

values of merit extracted for this technique are the frequency of the peak and the

magnitude squared amplitude.
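The procedure above can be sketched with NumPy as follows. Several choices are assumptions rather than details from the source: the 85% to 115% band is realised with an ideal FFT mask, the amplitude demodulation is done by mixing the filtered signal down with a complex exponential at the detected carrier and low-pass filtering, and the names and test values are hypothetical.

```python
import numpy as np


def demodulate_gmf(x, fs, gmf_nominal, shaft_freq):
    """Return (actual carrier, peak frequency near shaft_freq, its power)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.fft.rfft(x)

    # Keep only 85%..115% of the nominal gear mesh frequency.
    in_band = (freqs >= 0.85 * gmf_nominal) & (freqs <= 1.15 * gmf_nominal)
    band_spectrum = np.where(in_band, spectrum, 0.0)
    carrier = freqs[np.argmax(np.abs(band_spectrum) ** 2)]
    filtered = np.fft.irfft(band_spectrum, n)

    # Mix down with the detected carrier and low-pass to obtain the amplitude.
    t = np.arange(n) / fs
    baseband = np.fft.fft(filtered * np.exp(-2j * np.pi * carrier * t))
    full_freqs = np.fft.fftfreq(n, d=1.0 / fs)
    baseband[np.abs(full_freqs) > 0.15 * gmf_nominal] = 0.0
    amplitude = 2.0 * np.abs(np.fft.ifft(baseband))

    # Search the spectrum of the amplitude within +/- 5% of the shaft frequency.
    power = np.abs(np.fft.rfft(amplitude - amplitude.mean())) ** 2 / n
    candidates = np.flatnonzero(
        (freqs >= 0.95 * shaft_freq) & (freqs <= 1.05 * shaft_freq)
    )
    best = candidates[np.argmax(power[candidates])]
    return carrier, freqs[best], power[best]


# Hypothetical gear signal: actual GMF of 505 Hz modulated at a 20 Hz shaft rate.
fs = 4000.0
t = np.arange(4000) / fs
x = (1.0 + 0.3 * np.cos(2 * np.pi * 20.0 * t)) * np.cos(2 * np.pi * 505.0 * t)
carrier, peak_freq, peak_power = demodulate_gmf(x, fs, gmf_nominal=500.0, shaft_freq=20.0)
```

The returned peak frequency and power correspond to the two values of merit named in the text.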

Resonance demodulation [39], [41] is similar to the commonly used narrow-band

demodulation. The former emphasizes the band associated with the structural

resonance excited by the fault-induced impacts, whereas the latter focuses on the

fault-induced high-order modulation sidebands around the dominant gear meshing

harmonic.

A Prony model based method [42] was applied to bearing fault diagnosis. The

method shows potential for analysing transient vibration signals created from faulty

low speed rolling element bearings. Spectral plots can be generated by applying the

procedure to very short data samples, as well as trending parameters based on these

spectral estimations and Prony parameters. An equation was derived to quantitatively

determine the fault status. It is shown that application of the Prony model based

method has the potential to be an effective as well as efficient machine condition

monitoring and diagnostic tool where short duration transient vibration signals are

being generated.

A recently developed adaptive filter was embedded into an adaptive noise

cancelling (ANC) system and showed promise in diagnosing bearing faults [43].

Adaptive noise cancelling is an approach to reduce noise based on reference signals.

In conventional adaptive noise cancelling systems, the primary input signal is a

combined signal and noise c(n)=s(n)+r0(n), and the reference signal is a noise signal



r1(n) through another channel from the same noise source. Asynchronous adaptive

noise cancelling technology was employed to detect self-aligning roller bearing

faults successfully [44]. Wang [38] detected gear faults using phase and amplitude

demodulation.
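The conventional arrangement described above, with primary input c(n) = s(n) + r0(n) and reference r1(n), can be sketched with a least-mean-squares (LMS) adaptive filter. LMS is chosen here only as a common textbook adaptation rule; it is not claimed to be the filter used in [43] or [44]. The tap count, step size and synthetic signals are assumptions.

```python
import math
import random


def lms_noise_canceller(primary, reference, n_taps=4, mu=0.01):
    """Subtract an adaptively filtered reference from the primary input."""
    weights = [0.0] * n_taps
    output = []
    for n in range(len(primary)):
        # Most recent reference samples, newest first, zero-padded at the start.
        taps = [reference[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        noise_estimate = sum(w * u for w, u in zip(weights, taps))
        error = primary[n] - noise_estimate  # error doubles as the signal estimate
        weights = [w + 2.0 * mu * error * u for w, u in zip(weights, taps)]
        output.append(error)
    return output


# Hypothetical data: s is the signal of interest, r1 the reference noise,
# and r0 a delayed, scaled copy of r1 that corrupts the primary channel.
rng = random.Random(0)
n_samples = 5000
s = [0.5 * math.sin(2 * math.pi * n / 50) for n in range(n_samples)]
r1 = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
r0 = [0.0] + [0.8 * v for v in r1[:-1]]
c = [a + b for a, b in zip(s, r0)]

recovered = lms_noise_canceller(c, r1)
```

Once the weights have converged, the output approximates s(n) because only the part of c(n) that is correlated with the reference is removed.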

(4) Stochastic methods (including chaos) and others (blind deconvolution, blind

source separation, thresholding, and autoregressive model based method)

Advanced methods such as stochastic parameters have been used to analyse

vibrations in the time domain. Chaos analysis, whose computed parameter is known as

the correlation dimension, has been used to characterise several induced faults of varying

severity in a rolling element bearing [45],[46]. The correlation dimension can

provide some intrinsic information of an underlying dynamical system, and can be

used to classify different faults intelligently [47].

D. Logan [45] applied the new field of chaos theory to mechanical systems. His research

proposed the computation of chaotic parameters, known as the correlation

dimension, and used this to characterise several induced faults of varying severity in

a rolling element bearing. Further detailed investigations were then made into the

parameters governing the multi-stage process of determining the correlation

dimension. The correlation dimension was obtained from a computationally

straightforward algorithm and appeared as a single scalar index.

The key to chaos theory was being able to extract the nature of a strange attractor

(presumably assuming one existed) from, in their case, turbulent fluid flow. Strange

attractors were normally characterised by fractal dimensionality d, where d is less

than the number of degrees of freedom of the system F, d<F. The correlation

dimension could provide some intrinsic information of an underlying dynamical

system, and could be used to classify different faults intelligently [47].
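One common way to estimate a correlation dimension is the Grassberger-Procaccia correlation sum on a delay-embedded signal. The sketch below uses that approach as an assumption, since the multi-stage procedure of [45] is not reproduced here. The embedding dimension, delay, radii and test signal are all illustrative choices.

```python
import math


def delay_embed(x, dim, delay):
    """Delay-coordinate vectors [x(i), x(i + delay), ...] of length dim."""
    n_vectors = len(x) - (dim - 1) * delay
    return [[x[i + k * delay] for k in range(dim)] for i in range(n_vectors)]


def correlation_sum(vectors, r):
    """Fraction of vector pairs closer than r (maximum norm)."""
    m = len(vectors)
    count = 0
    for i in range(m):
        for j in range(i + 1, m):
            dist = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
            if dist < r:
                count += 1
    return 2.0 * count / (m * (m - 1))


def correlation_dimension(x, dim, delay, r_small, r_large):
    """Slope of log C(r) against log r between two radii."""
    vectors = delay_embed(x, dim, delay)
    c_small = correlation_sum(vectors, r_small)
    c_large = correlation_sum(vectors, r_large)
    return (math.log(c_large) - math.log(c_small)) / (
        math.log(r_large) - math.log(r_small)
    )


# Hypothetical periodic signal: its embedded trajectory is a closed curve,
# so the estimate should be close to 1.
x = [math.sin(0.05 * k) for k in range(600)]
d2 = correlation_dimension(x, dim=2, delay=31, r_small=0.1, r_large=0.4)
```

The result is a single scalar index, in line with the description above. In practice the slope would be fitted over a range of radii and checked against the embedding dimension.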

Nirbito [48] proposed and tested the feasibility of blind deconvolution for the

enhancement of bearing signals corrupted by noise. Blind deconvolution

(equalization) is a technique used to recover the desired signals from a single



received channel without any prior knowledge about the unknown channel. The

technique has been widely used in network communication. A major advantage of

blind deconvolution is that it does not require a training stage, which is essential in

conventional equalization.

Serviere [49] applied Blind Source Separation (BSS) to rotating machinery

diagnosis. BSS consisted of recovering signals from different physical sources from

several observed combinations independent of the propagation medium. BSS was

used as a promising tool for non-destructive machine condition monitoring by

vibration analysis, as it was intended to retrieve the signature of a single rotating

machine from combinations of several working machines. In this way, BSS could be

seen as a pre-processing step that improves the diagnosis. BSS methods generally

assumed observations that were either noise-free or corrupted with spatially distinct

white noises. In the latter case, principal component analysis (PCA) was applied as a

first step to filter out the noise and whiten the observations. The efficiency of the

whole separation procedure depends on the accuracy of the first step PCA. However,

in the real world, signals of rotating machine vibration might be severely corrupted

with spatially correlated noises and therefore the signal subspace would not be

correctly estimated with PCA. A ‘robust-to-noise’ technique was proposed for the

separation of rotating machine signals. The sources were assumed to be periodic and

could be modelled as the sum of sinusoids of harmonic frequencies. A new estimator

of the signal subspace and the whitening matrix was introduced which exploited the

model of sinusoidal sources and used spectral matrices of delayed observations to

eliminate the influence of the noise. After whitening, the second step of source

separation remained unchanged. Finally, performance of the algorithm was

investigated with artificial data and experimental rotating machine vibration data.

The pseudo-phase portrait was sensitive to some rotating machinery faults [46].

Threshold denoising (including hard threshold and soft threshold) was often used to

denoise vibration signals. The threshold denoising methods were usually combined

with envelope analysis or other methods when diagnosing machinery faults. A

soft-thresholding method and hard thresholding method have also been used in

diagnosing machine faults [50]. An autoregressive model-based method has also

been successfully applied in fault diagnosis [41].
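The two thresholding rules can be sketched as follows, using their standard definitions: hard thresholding zeroes values at or below the threshold and keeps the rest unchanged, while soft thresholding also shrinks the surviving values toward zero by the threshold. The function names and sample values are assumptions.

```python
import math


def hard_threshold(x, threshold):
    """Keep samples whose magnitude exceeds the threshold, zero the rest."""
    return [v if abs(v) > threshold else 0.0 for v in x]


def soft_threshold(x, threshold):
    """Zero small samples and shrink the others toward zero by the threshold."""
    return [
        math.copysign(abs(v) - threshold, v) if abs(v) > threshold else 0.0
        for v in x
    ]


# Hypothetical coefficients: one small noise value and two larger ones.
coefficients = [0.2, -1.5, 3.0]
kept_hard = hard_threshold(coefficients, 1.0)
kept_soft = soft_threshold(coefficients, 1.0)
```

Hard thresholding preserves the amplitude of retained peaks, whereas soft thresholding gives a smoother result at the cost of reducing them.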



Andrade [31] compared the Cumulative Distribution Function (CDF) of a target distribution

with the CDF of a reference distribution and used the likelihood to detect gear tooth

fatigue cracks. The statistical distance between two CDFs was converted into a

similarity probability using the Kolmogorov-Smirnov (KS) probability distribution

function $Q_{ks}$ defined as

\[ Q_{ks}(\lambda) = 2 \sum_{j=1}^{\infty} (-1)^{j-1} e^{-2 j^2 \lambda^2} \tag{2.9} \]
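The KS function can be evaluated by truncating the alternating series, and the distance between two empirical CDFs can be computed directly from sorted samples. Both functions below are a sketch: the truncation rule, the value returned at λ = 0 and the function names are assumptions, and the scaling from the CDF distance to λ is not shown because it is not given in the text.

```python
import math
from bisect import bisect_right


def q_ks(lam, max_terms=100, tol=1e-12):
    """Truncated series for the Kolmogorov-Smirnov function Q_ks(lambda)."""
    if lam <= 0.0:
        return 1.0
    total = 0.0
    for j in range(1, max_terms + 1):
        term = (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        total += term
        if abs(term) < tol:
            break
    # Clamp, since truncation at very small lambda can leave the sum outside [0, 1].
    return min(max(2.0 * total, 0.0), 1.0)


def ks_distance(sample_a, sample_b):
    """Largest absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    distance = 0.0
    for point in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, point) / len(a)
        cdf_b = bisect_right(b, point) / len(b)
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance


q_at_one = q_ks(1.0)
identical = ks_distance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
disjoint = ks_distance([1.0, 2.0], [3.0, 4.0])
```

A value of the function near 1 indicates that the two distributions are similar, and a value near 0 indicates that they differ.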

A synchronous averaging signal was utilized to diagnose faults in rolling bearings

and gears [34, 51]. Calculation of a synchronous averaging signal $y(t)$ of a time

signal $x(t)$ using a trigger signal having a frequency $f_t$ is equivalent to the

convolution $y(t) = c(t) * x(t)$, where $c(t)$ is a train of N impulses of amplitude 1/N,

spaced at intervals $T_t = 1/f_t$, given by

\[ c(t) = \frac{1}{N} \sum_{n=0}^{N-1} \delta(t + nT_t) \]

In the frequency domain, this is equivalent to the multiplication of the Fourier

transform X(f) of the signal by C(f), represented by

\[ Y(f) = C(f) \cdot X(f) \tag{2.10} \]

where C(f), the Fourier transform of c(t), is a comb filter function of the form

\[ C(f) = \frac{1}{N} \frac{\sin(\pi N T_t f)}{\sin(\pi T_t f)} \tag{2.11} \]

Increasing the number of averages N narrows the teeth of the comb, and reduces the

amplitude of the side lobes between the teeth. For very large N, only frequencies at

exact multiples of the trigger frequency $f_t$ are passed. Thus, synchronous averaging

can be viewed in the frequency domain, for large numbers of averages, as the

complete removal of all components except those that occur at integer multiples of

the frequency $f_t$.
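Both views of synchronous averaging can be sketched briefly: the time-domain average over N trigger periods, and the magnitude of the comb filter response. The sketch assumes an integer number of samples per trigger period, and the function names and test signal are illustrative.

```python
import math


def synchronous_average(x, period, n_averages):
    """Average n_averages consecutive segments of `period` samples each."""
    return [
        sum(x[j * period + k] for j in range(n_averages)) / n_averages
        for k in range(period)
    ]


def comb_gain(f, n_averages, trigger_freq):
    """Magnitude of the comb filter response at frequency f."""
    angle = math.pi * f / trigger_freq
    if abs(math.sin(angle)) < 1e-12:
        return 1.0  # limit at exact multiples of the trigger frequency
    return abs(math.sin(n_averages * angle) / (n_averages * math.sin(angle)))


# Hypothetical signal: a component synchronous with a 20-sample trigger period
# plus a component at 2.5 cycles per period, which is not synchronous.
period, n_averages = 20, 10
synchronous = [math.sin(2 * math.pi * k / period) for k in range(period * n_averages)]
interference = [math.sin(2 * math.pi * k / 8) for k in range(period * n_averages)]
x = [a + b for a, b in zip(synchronous, interference)]

averaged = synchronous_average(x, period, n_averages)
gain_on_tooth = comb_gain(50.0, n_averages, trigger_freq=50.0)
gain_between = comb_gain(125.0, n_averages, trigger_freq=50.0)
```

The interference falls on a null of the comb for this N, so the average returns one clean period of the synchronous component.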

The autoregressive model-based method has been applied in fault diagnosis [1, 41,

52-55].



Statistical methods have been combined with AI techniques such as Neural Networks

in order to diagnose faults more efficiently [56]. An 8×12×1 artificial Neural

Network has been used successfully for on-line monitoring of ball bearing

conditions. Peak amplitude in the frequency domain, peak RMS, and the power

spectrum of vibration signals have been used as inputs of the Neural Network while

the outputs indicate the bearing states.

An empirical model-based fault diagnosis system was developed for induction

motors using recurrent dynamic Neural Networks and multiresolution signal

processing methods [57]. It was pointed out that, in practice, it is desirable to

perform multi step (MS) predictions recursively, by relating current estimated output

with previous estimated output and previous input. The recursive relation between

inputs and outputs in MS prediction was expressed using a Feed Forward Neural

Network (FFNN). IIR type feedback (GF) and the lack GF in IIR type network (TF)

were used as learning algorithms to train the Neural Network. Air-gap eccentricity

and broken rotor bars in motors were successfully detected using the

model-based diagnosis system.

Other methods such as shock pulse, Matched Filter root mean square method and

threshold denoising (including hard threshold and soft threshold) show promise.

The shock pulse method [24] is a signal processing technique used to measure metal

impact and rolling noise such as those found in rolling element bearings and gears. In

this method, an accelerometer is used that is excited to resonance by shock

impacts from bearing defects. The impact components of signals can be effectively

focused and localised. The peak value of the recorded shock pulse is measured in

order to obtain an indication of the condition of a bearing. The disadvantage of

tuning the transducer resonance to the resonant frequency of the structure is that the

absolute amplitude of the vibration is not known. To surmount this problem the

shock pulse value is normalised by subtracting the shock pulse value of an

undamaged bearing. A similar method is spike energy measurement [24], which can

be obtained using spike energy meters. Spike energy meters detect bursts of vibration

at very high frequencies. A high-pass filter is used to filter out vibration components

below the accelerometer resonant frequency.



The adaptive filter [58] and Matched Filter root mean square method [20] are also

popular techniques for fault diagnosis. The ANC technique [43] was used to

diagnose bearing faults. The soft-thresholding denoising method [50] and the hard-

thresholding denoising method are widely used to denoise signals when diagnosing

machine faults.
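
The two thresholding rules mentioned above can be illustrated with a minimal sketch. It assumes the thresholds are applied to a vector of transform coefficients (for example wavelet coefficients); the function names, the threshold value and the sample coefficients are illustrative assumptions and are not taken from the cited works.

```python
import numpy as np

def hard_threshold(coeffs, thr):
    """Keep coefficients whose magnitude exceeds thr and zero the rest."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.where(np.abs(coeffs) > thr, coeffs, 0.0)

def soft_threshold(coeffs, thr):
    """Zero small coefficients and shrink the remaining ones towards zero by thr."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - thr, 0.0)

# Illustrative coefficients and threshold (assumed values).
c = np.array([-3.0, -0.5, 0.2, 1.5, 4.0])
hard = hard_threshold(c, 1.0)
soft = soft_threshold(c, 1.0)
```

Hard thresholding leaves the surviving coefficients unchanged, whereas soft thresholding also reduces their magnitude by the threshold, which avoids a jump at the threshold value.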

2.3.1.2 Frequency and Time-frequency Feature Extraction Techniques

Tandon [29] presented a review of vibration and acoustic measurement methods for

the detection of defects in rolling element bearings. He considered the detection of

both localized and distributed categories of defect. Vibration measurement in both

time and frequency domains along with signal processing techniques such as the

high-frequency resonance technique have been covered. Other acoustic measurement

techniques such as sound pressure, sound intensity and acoustic emission have been

reviewed. Recent trends in research on the detection of defects in bearings, such as

the wavelet transform method and automated data processing, have also been

included. In the time domain, RMS, Kurtosis and shock pulse methods have been

analysed. In the frequency domain, the application of the Fast Fourier Transform (FFT) has

been explained. The power cepstrum is defined as the inverse Fourier transform of the

logarithm of the power spectrum. The ANC technique and envelope detection, also known as

the high-frequency resonance technique (HFRT), are important signal processing techniques.

This section starts from the advent of the modern Fast Fourier Transform and then

emphasizes time-frequency representations. It reviews most of the frequency and

time-frequency methods used for diagnosing machinery faults. The literature shows

that both frequency and time-frequency analysis techniques attempt to extract

more effective coefficients by increasing the order of the transformation. For

instance, the power spectrum, a second order spectrum, has been applied

successfully even though the first order spectrum was already widely used.

Various high order parameters have already shown their capability of

magnifying vibration features in fault diagnosis research. In theory, a

higher order transformation increases the magnitude of the characteristic

frequency or time-frequency parameters.



Table 2.1 gives an overview of the high order frequency and time-frequency

techniques that have been developed. Detailed definitions and explanations of those

parameters are given in the text that follows Table 2.1:

Table 2.1: An overview of frequency techniques and time-frequency techniques

First order Second order Third order Fourth order

Power spectrum Spectrum

Instantaneous Power Spectrum(IPS)

Bicoherence spectrum

Correlation of spectrum, signal averaging

Cyclostationarity Bilinearity

Spectrogram Short Time Fourier Transform (STFT) Wigner distribution Wigner bi-spectra Wigner tri-spectra

Continuous Wavelet Transform(CWT)

Scalogram

Discrete Wavelet Packet Analysis (DWPA)

Matching Pursuit

Time-Averaged Wavelet Spectrum (TAWS)

Time-Frequency-Scale domain (TFS)

As is well known, frequency-domain or spectral analysis of the vibration signal is

perhaps the most widely used approach to bearing defect detection. The Discrete

Fourier Transform (DFT) and FFT are the most conventional diagnosis techniques

and have been widely used [29, 59-63]. The DFT can be applied to any

sampled signal and returns a complex frequency spectrum that can in turn be used to

calculate both the phase and magnitude spectra of the signal. Each of the frequency

bins created as a result of this process has a width of fs/N, where fs is the

sampling frequency and N is the number of samples. Applying a DFT directly to a time

series generates spurious spectral components (spectral leakage); this is caused by the

assumption that the signal is periodic, so that any discontinuity between the end and

the start of the record must be compensated for by additional spectral content. In

order to alleviate this problem, a window is applied to the data, which tapers the

beginning and end of the time sequence towards zero. The DFT of a sequence x(n) of

length N is

$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N} \tag{2.12}$$

Calculating the DFT directly is fairly computationally intensive; however, the FFT

algorithm made simple spectral analysis practical. This method and its variants are

still commonly in use today.
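
The bin width fs/N and the effect of windowing described above can be checked numerically. The following is a minimal sketch using NumPy's FFT routines; the sampling frequency, record length, test tone and the choice of a Hann window are all illustrative assumptions rather than values used in this thesis.

```python
import numpy as np

fs = 1024.0                          # assumed sampling frequency in Hz
N = 1024                             # assumed record length, so fs / N = 1 Hz per bin
t = np.arange(N) / fs
x = np.sin(2 * np.pi * 100.5 * t)    # tone lying between two bins (illustrative)

def magnitude_spectrum(x, window=None):
    """One-sided DFT magnitude of x, optionally tapered by a window."""
    w = np.ones(len(x)) if window is None else window
    return np.abs(np.fft.rfft(x * w))

def leakage_ratio(mag, k_far=300):
    """Magnitude at a bin far from the tone, relative to the spectral peak."""
    return mag[k_far] / mag.max()

freqs = np.fft.rfftfreq(N, d=1.0 / fs)
bin_width = freqs[1] - freqs[0]

rect = magnitude_spectrum(x)                   # no window (rectangular)
hann = magnitude_spectrum(x, np.hanning(N))    # Hann window tapers both ends to zero
rect_leak = leakage_ratio(rect)
hann_leak = leakage_ratio(hann)
peak_hz = freqs[np.argmax(hann)]
```

In this sketch the windowed spectrum carries far less energy at bins remote from the tone, at the cost of a slightly wider main peak.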

Several methods are derived from the spectrum, such as the singular spectrum, the

envelope spectrum, and the Power Spectrum. Singular spectrum analysis can reveal the

complexity of a signal, and can also be used to reduce the

noise of a signal [46].

The procedures for obtaining the spectrum of the envelope signal by a high-

frequency resonance technique are well established and applied in bearing fault

detection [64].

Du [65] proposed a method for the fault diagnosis of roller bearings based on the

empirical mode decomposition (EMD) method and the Hilbert spectrum. The local

Hilbert spectrum and local Hilbert marginal spectrum were introduced. The

orthogonal wavelet bases were used to translate vibration signals of a roller bearing

into time-scale representation. Then, an envelope signal could be obtained by

envelope spectrum analysis of wavelet coefficients of high scales. By applying the

EMD method and Hilbert transform to the envelope signal, they derived the local

Hilbert marginal spectrum from which the faults in a roller bearing could be

diagnosed and fault patterns identified. Practical vibration signals measured from

roller bearings with outer-race faults or inner-race faults were analysed by the

proposed method. The results showed that the proposed method was superior to the

traditional envelope spectrum method in extracting the fault characteristics of roller

bearings.

The Power Spectrum, whose amplitude is the square of the amplitude of the spectrum, is

also an effective method for diagnosing machinery faults [61, 66]. The power spectrum is a

development of the DFT. The DFT returns a complex answer to a real valued

problem, and by examining the individual components it is hard to make sense of what

is actually happening within a signal. The Power Spectrum calculates the average of

the squared magnitude of the DFT, and is defined as:

$$S_{xx}(k) = E\{X(k)\, X^{*}(k)\} \tag{2.13}$$

where $X^{*}(k)$ is the complex conjugate of X(k). This is probably the most common

spectral technique used. By examining the peaks, harmonics can be spotted in the

spectrum, and from this the frequencies of interest can be calculated, and then related

back to the faulty components. However, in many cases it can be difficult to

distinguish harmonic peaks within a spectrum, dependent upon the degree of noise

present in the raw data, and the number of other vibration sources in the location

where the data was sampled.
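
The expectation in Equation (2.13) is commonly approximated by averaging over several records. The sketch below is one such estimate under stated assumptions: the signal is split into non-overlapping Hann-windowed segments, and the sampling frequency, segment length, tone frequency and noise level are illustrative values only.

```python
import numpy as np

def averaged_power_spectrum(x, seg_len):
    """Estimate S_xx(k) = E{X(k) X*(k)} by averaging over non-overlapping segments."""
    n_seg = len(x) // seg_len
    segs = np.reshape(x[:n_seg * seg_len], (n_seg, seg_len))
    X = np.fft.rfft(segs * np.hanning(seg_len), axis=1)
    return np.mean((X * np.conj(X)).real, axis=0)

rng = np.random.default_rng(0)
fs, seg_len = 1000.0, 500                 # assumed sampling frequency and segment length
t = np.arange(20 * seg_len) / fs
# Illustrative signal: a 120 Hz component buried in stronger broadband noise.
x = np.sin(2 * np.pi * 120.0 * t) + rng.normal(scale=2.0, size=t.size)

S = averaged_power_spectrum(x, seg_len)
freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
peak_hz = freqs[np.argmax(S)]
```

Averaging reduces the variance of the noise floor, which makes a peak such as the assumed 120 Hz component easier to distinguish than in a single raw spectrum.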

The Instantaneous Power Spectrum (IPS) is a time-frequency technique, which has

been successfully applied in the detection of the local faults in helical gears [67]. The

IPS was combined with Neural Networks to effectively diagnose machinery faults

[68].

The Power Cepstrum, which is the inverse Fourier transform of the logarithm of the

power spectrum, has been applied to the diagnosis of bearing faults [29].

Cepstrum analysis is a nonlinear signal processing technique with a variety of

applications in areas such as speech and image processing. The real Cepstrum of a

signal x, sometimes called simply the Cepstrum, is calculated by determining the

natural logarithm of magnitude of the Fourier transform of x, then obtaining the

inverse Fourier transform of the resulting sequence. This process creates another

spectrum which, when read, gives an indication of the presence of harmonics in the

power spectrum. Peaks in the Cepstrum correspond to the existence of harmonics in

the power spectrum, with the quefrency of a peak giving the period, that is, the

reciprocal of the frequency separation between the harmonics. The peaks on the Cepstrum

plot (known as rahmonics) indicate the existence of several different harmonics which are

multiples of the frequency of interest.



The Cepstrum is used in vibration monitoring because it is often the case that the

power spectra taken from a machine can be extremely noisy. This is due to vibration

coming from external influences, such as other machines operating in the same area.

Thus, it can be very difficult to determine what parts of the signal constitute

harmonics and what is noise in the power spectrum, and Cepstrum analysis provides

a relatively efficient way to isolate the harmonic frequencies present in the signal.
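
The real Cepstrum described above can be sketched in a few lines. The example assumes an idealised train of periodic impacts as the input signal; the sampling frequency, impact spacing and the small constant added before the logarithm (to avoid taking the logarithm of zero) are illustrative assumptions.

```python
import numpy as np

def real_cepstrum(x, eps=1e-6):
    """Inverse FFT of the natural log of the magnitude spectrum of x."""
    log_mag = np.log(np.abs(np.fft.fft(x)) + eps)   # eps avoids log(0)
    return np.fft.ifft(log_mag).real

fs = 2000.0                    # assumed sampling frequency in Hz
period = 40                    # impact spacing in samples (50 Hz repetition rate)
x = np.zeros(4000)
x[::period] = 1.0              # idealised periodic impacts

c = real_cepstrum(x)
search = np.arange(20, 60)     # quefrency range (in samples) to search for a peak
peak_quefrency = search[np.argmax(c[search])]
harmonic_spacing_hz = fs / peak_quefrency
```

The spectrum of this signal contains harmonics spaced 50 Hz apart, and the Cepstrum concentrates them into a peak at a quefrency of 40 samples, the reciprocal of that spacing.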

One higher order spectrum is the third-order spectrum, called the Bispectrum, which is defined by

$$B(f_x, f_y) = E\left[ S(f_x)\, S(f_y)\, S^{*}(f_x + f_y) \right] \tag{2.14}$$

It can be seen that the Bispectrum is complex and that the Bispectral values depend

on two frequencies $f_x$ and $f_y$. Rewriting the above equation in terms of amplitude

and phase quantities gives

$$B(f_x, f_y) = \left| S(f_x) \right| \left| S(f_y) \right| \left| S(f_x + f_y) \right| e^{j \theta_\beta(f_x, f_y)} \tag{2.15}$$

where $\theta_\beta(f_x, f_y) = \theta_\beta(f_x) + \theta_\beta(f_y) - \theta_\beta(f_x + f_y)$ is called the biphase, with $\theta_\beta(f)$ denoting the phase of $S(f)$. The biphase has been

applied to fault diagnosis of motor bearings [68].

Bicoherence

The Bicoherence Spectrum is a third-order spectrum used to measure the phase

coherence among three spectral components due to nonlinear wave coupling.

Bicoherence has been used to monitor bearing condition [69].
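
A direct estimate of the Bispectrum of Equation (2.14) and of a squared bicoherence can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the cited works: the expectation is replaced by an average over segments, the bicoherence uses one common normalisation, and the test signal, bin indices and helper names are hypothetical.

```python
import numpy as np

def bispectrum_and_bicoherence(segments, kx, ky):
    """Estimate B(fx, fy) and a squared bicoherence at FFT bins kx and ky."""
    S = np.fft.fft(segments, axis=1)
    triple = S[:, kx] * S[:, ky] * np.conj(S[:, kx + ky])
    B = triple.mean()                                   # average replaces E[.]
    denom = (np.mean(np.abs(S[:, kx] * S[:, ky]) ** 2)
             * np.mean(np.abs(S[:, kx + ky]) ** 2))
    return B, np.abs(B) ** 2 / denom

def make_segments(coupled, n_seg=64, n=256, kx=20, ky=35, seed=0):
    """Three sinusoids at bins kx, ky and kx + ky with random phases per segment."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    segs = np.empty((n_seg, n))
    for i in range(n_seg):
        p1, p2, p3 = rng.uniform(0, 2 * np.pi, 3)
        if coupled:
            p3 = p1 + p2                                # quadratic phase coupling
        segs[i] = (np.cos(2 * np.pi * kx * t / n + p1)
                   + np.cos(2 * np.pi * ky * t / n + p2)
                   + np.cos(2 * np.pi * (kx + ky) * t / n + p3))
    return segs

B_coupled, bic_coupled = bispectrum_and_bicoherence(make_segments(True), 20, 35)
B_random, bic_random = bispectrum_and_bicoherence(make_segments(False), 20, 35)
biphase_coupled = np.angle(B_coupled)
```

When the phase of the third component equals the sum of the other two, the triple products add coherently and the bicoherence approaches one; with independent phases they largely cancel.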

Cyclostationarity

Signal averaging is a well-known synchronised averaging method, which is the

expression of Cyclostationarity of the first order. The spectral correlation function

issued from the second-order Cyclostationarity is an efficient parameter for the early

diagnosis of faults in rotating machines. Moreover, it is shown that vibration signals

measured on gear systems display second-order Cyclostationarity. Application in

early diagnosis of spalling in gear teeth demonstrates the power of this new

parameter [70]. A comparison between Cyclostationarity and Bilinearity has been

presented in an application to early diagnosis for helicopter gearboxes [71].



Generally, frequency domain parameters were more consistent in the detection of

damage than time domain parameters. However, sufficient evidence is produced to

show that it would be unreliable to depend exclusively on any one technique to

detect bearing damage [72].

In recent years, the time-frequency signal analysis has been studied and applied in

machinery fault diagnosis due to its capability of representing signals in both time

and frequency domains. This characteristic of time-frequency analysis technique

meets the requirements for analysing vibration signals, which are non stationary

signals. Preliminary time-frequency analysis methods, the Windowed Fourier Transform [73] and

the Short Time Fourier Transform (STFT) [74], were used to monitor the condition of

machinery.

The Wigner Distribution [73, 75-77] and the Spectrogram [36, 78, 79] are the most

well-known quadratic time-frequency representations belonging to the Cohen class

which have been applied in diagnosing gear faults.

The Spectrogram is defined as

$$SPEC_x(t, f) = \left| STFT_x(t, f) \right|^2 = \left| \int_{-\infty}^{+\infty} x(\tau)\, w(\tau - t)\, e^{-j 2\pi f \tau}\, d\tau \right|^2 \tag{2.16}$$
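
A discrete counterpart of the Spectrogram in Equation (2.16) can be sketched by sliding a window along the signal and taking the squared magnitude of the DFT of each frame. The window type, window length, hop size and the two-tone test signal below are illustrative assumptions.

```python
import numpy as np

def spectrogram(x, win_len, hop):
    """Squared magnitude of a Hann-windowed STFT; rows are time frames."""
    w = np.hanning(win_len)
    starts = range(0, len(x) - win_len + 1, hop)
    frames = np.array([x[s:s + win_len] * w for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

fs = 1000.0                                  # assumed sampling frequency in Hz
t = np.arange(2000) / fs
# Non-stationary test signal: 50 Hz in the first second, 200 Hz in the second.
x = np.where(t < 1.0, np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 200 * t))

spec = spectrogram(x, win_len=200, hop=100)
freqs = np.fft.rfftfreq(200, d=1.0 / fs)
first_peak_hz = freqs[np.argmax(spec[0])]    # dominant frequency in the first frame
last_peak_hz = freqs[np.argmax(spec[-1])]    # dominant frequency in the last frame
```

A single DFT of the whole record would show both tones without indicating when each occurs, whereas the frames of the Spectrogram localise them in time.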

The Wigner Distribution (WD) is defined as

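$$WD_x(t, f) = \int_{-\infty}^{+\infty} x\!\left(t + \frac{\tau}{2}\right) x^{*}\!\left(t - \frac{\tau}{2}\right) e^{-j 2\pi f \tau}\, d\tau \tag{2.17}$$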

The Wigner-Ville distribution has been used in the diagnosis of faults of rotating

machines and particularly in the analysis of gearbox faults. However, it is

computationally expensive to perform. The basic nature of such signals causes

significant interfering cross-terms, which do not permit a straightforward

interpretation of the energy distribution. To counter this, a modified version was

proposed by Choi and Williams, which has become more widely used. It reduces the

effect of the artefacts by the use of a kernel to minimise the cross terms. The Choi-

Williams distribution is defined:



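$$P_{CW}(t, \omega) = \frac{1}{4\pi^{3/2}} \iint \frac{1}{\sqrt{\tau^2/\sigma}} \exp\!\left[ -\frac{(u - t)^2}{4\tau^2/\sigma} - j\tau\omega \right] x\!\left(u + \frac{\tau}{2}\right) x^{*}\!\left(u - \frac{\tau}{2}\right) du\, d\tau \tag{2.18}$$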

where σ is constant.

A new signal processing technique, the Directional Choi-Williams Distribution

(dCWD), is proposed to account for complex-valued time-varying signals, which

represent the planar motion of rotating machinery at each instant of time [80].

Directional Wigner Distributions (DWDs) defined for the forward and backward

pass analytic signals [81] have been applied in the order analysis of rotating machines.

Higher-order statistics have been used for fault detection within mechanical systems,

based on the observation that impulsive signals tend to increase the Kurtosis values. The

third- and fourth-order Wigner moment spectra, called the Wigner bi- and

tri-spectra respectively, have been used for this purpose [82].

The sliced Wigner fourth-order moment spectra for multiple signals had problems

with its application which were due to the existence of non-oscillating cross-terms

not smoothed by conventional methods. A new method was developed to smooth

the non-oscillating cross-terms. The techniques developed were applied to the diagnosis of

valve system faults in an engine [83].

The Continuous Wavelet Transform (CWT) has been used in time-frequency

analysis of rotating machinery fault diagnosis [35, 38, 84, 85].

Continuous Wavelet Transforms (CWTs) [86] were widely recognized as effective

tools for vibration-based machine fault diagnosis, as CWTs could detect both

stationary and transitory signals. However, due to the problem of overlapping, a large

amount of redundant information existed in the results that were generated by CWTs.

The appearance of overlapping could smear the spectral features and make the results

very difficult to interpret for machine operators. Misinterpretation of results might

lead to false alarms or failures to detect anomalous signals. Moreover, as

conventional CWTs only used a single mother wavelet to generate daughter

wavelets, the distortion of the time waveform in the resultant coefficients was

inevitable. Obviously, this would significantly affect the accuracy in anomalous

signal detection. To minimize the effect of overlapping and to enhance the accuracy



of fault detection, a novel wavelet transform, which was named exact wavelet

analysis, had been designed for use in vibration-based machine fault diagnosis. The

design of exact wavelet analysis was based on genetic algorithms. At each selected

time frame, the algorithms would generate an adaptive daughter wavelet to match the

inspected signal as exactly as possible. The optimization process of exact wavelet

analysis was different from other adaptive wavelets as it considered both the

optimization of wavelet coefficients and the satisfaction of the admissibility

conditions of wavelets. The results obtained from simulated and practical

experiments proved that exact wavelet analysis not only minimized the undesirable

effect of overlapping, but also helped operators to detect faults and distinguish the

causes of faults. With the help from exact wavelet analysis, sudden shutdowns of

production and services due to the fatal breakdown of machines could be avoided.

The scalogram, the squared modulus of the CWT [41], was applied in diagnosing

gears. Vibration signatures were processed with a harmonic wavelet transform algorithm

and the mean square wavelet map [87] when diagnosing a rotorcraft planetary

geartrain system.

The Discrete Wavelet Transform (DWT) was also applied in [27, 88, 89]. It was

demonstrated that the Discrete Wavelet Packet Analysis (DWPA) [2], wavelet packet

transform [90-93], discrete wavelet analysis [94] all showed potential in fault

diagnosis. Wavelet transform analysis was extended using the Gabor [95-97]

dictionary in Matching Pursuit [84, 96, 97], and using both continuous and discrete

Morlet wavelets [50, 98].

An approach to diagnose gear faults based on continuous wavelet transform was

presented [99]. Continuous wavelet transform can provide a finer scale resolution

than the orthogonal wavelet transform. It is more suitable for extracting mechanical

fault information. The concept of time-averaged wavelet spectrum (TAWS) based on

Morlet continuous wavelet transform was proposed. Two fault diagnosis methods,

the Spectrum Comparison Method (SCM) and Feature Energy Method (FEM) based

on TAWS were established. The results of the application to gearbox gear fault

diagnosis showed that TAWS could effectively extract gear fault information. The

feature energy of the TAWS identifies gear fault advancement very well and is

conically proportional to gear fault advancement.



An adaptive wavelet filter based on the Morlet wavelet was introduced by Lin [100].

The parameters in the Morlet wavelet function were optimised based on the Kurtosis

maximisation principle. The wavelet used was adaptive because the parameters were

not fixed. The adaptive wavelet filter was found to be very effective in detecting

symptoms from vibration signals of a gearbox with early fatigue tooth crack. Two

types of discrete wavelet transform (DWT), the decimated with DB4 wavelet and the

undecimated with harmonic wavelet, were also used to analyse the same signals for

comparison. No periodic impulses appeared on any scale in either DWT

decomposition.

Non orthogonal wavelets are more like vibration impulses [50]. Lin denied that non

orthogonal wavelets can be used in DWT. However, in practice, CWT has been

approximated by DWT [101]. Orthogonality is often not crucial in the post-

processing of signal coefficients. One may thus further enlarge the freedom of choice

by approximating the signal with non-orthogonal vectors, chosen from a large and

redundant dictionary [102].

Most applications of wavelet bases exploit their ability to efficiently approximate

particular classes of functions with few non-zero wavelet coefficients. This is true

not only for data compression but also for noise removal and fast calculations. In

signal processing, orthogonal bases are of interest because they can efficiently

approximate certain types of signals with just a few vectors. Two examples of such

applications are image compression and the estimation of noisy signals.

From families of wavelet packet bases and local cosine bases, a fast dynamical

programming algorithm is used to select the “best” basis that minimizes a Schur

concave cost function. Pursuit algorithms generalize these adaptive approximations

by selecting the approximation vectors from redundant dictionaries of time-

frequency atoms, with no orthogonality constraint.
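
The greedy selection performed by a pursuit algorithm can be sketched as follows. This is a minimal illustration of matching pursuit over a small redundant dictionary, not the implementation developed in this thesis; the real-valued Gabor-like atoms, the parameter grid and the function names are assumptions made for the example.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy pursuit; dictionary rows are unit-norm atoms and may be redundant."""
    residual = np.array(signal, dtype=float)
    picks = []
    for _ in range(n_atoms):
        corr = dictionary @ residual                 # inner product with every atom
        best = int(np.argmax(np.abs(corr)))          # best-matching atom
        picks.append((best, corr[best]))
        residual = residual - corr[best] * dictionary[best]
    return picks, residual

def gabor_atom(n, centre, width, freq):
    """Real Gabor-like atom: Gaussian envelope times a cosine, scaled to unit norm."""
    t = np.arange(n)
    g = np.exp(-0.5 * ((t - centre) / width) ** 2) * np.cos(2 * np.pi * freq * (t - centre))
    return g / np.linalg.norm(g)

n = 256
# Hypothetical parameter grid: centre (samples), width (samples), frequency (cycles/sample).
params = [(c, w, f) for c in range(16, 256, 16) for w in (4.0, 16.0) for f in (0.05, 0.1, 0.2)]
D = np.array([gabor_atom(n, *p) for p in params])

true_index = params.index((128, 4.0, 0.2))
x = 3.0 * D[true_index]                              # signal made of a single atom
picks, residual = matching_pursuit(x, D, n_atoms=1)
```

Because the atoms are not required to be orthogonal, each iteration subtracts only the projection onto the selected atom and the residual is searched again in the next iteration.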

A time-frequency-scale domain (TFS) technique has been developed to diagnose

machine faults [103]. The method was proposed to detect transients in mechanical

systems by matching wavelets with associated signals, leading to a development of

joint time-frequency-scale distribution. The three variables, the time, frequency and

scale, maximised the chance for finding similar signal segments from a system under



inspection. The sensitivity was shown to be very high due to closer matching and

better choice of wavelet shapes, which was essential for early fault detection and

failure prevention. Fundamental types of wavelets were introduced based on the

shapes of widely encountered system responses. A method of processing the three-

dimensional image was suggested for interpreting the time-frequency-scale wavelet

map, where the properties of the object patterns uncovered the features of a signal

source, so as to understand the defect and to indicate the condition of a diagnosed

system. The joint distribution was demonstrated to be useful in detecting transients

from different mechanical systems.

2.3.1.3 The Application of Vibration Techniques

In conclusion, statistical parameters have been applied to analysing time domain

signals. The FFT and high order spectrum analyses of signals have been applied in

the frequency domain. Wavelet transforms can show features of signals in the

frequency as well as the time domains. The resolution of frequency and time can be

adjusted. The wavelet transform nevertheless has limitations. One drawback of the wavelet

transform is that it has overlapping, or interference terms, between neighbouring frequency

and time regions. The Wigner distribution has been proposed as an algorithm that uses

differentiation to reduce the overlapping [104]. In the discrete wavelet transform and wavelet

packet analysis, the best basis has been investigated to detect the

component information required for specific applications. In fault diagnosis,

characteristic frequencies lie in low as well as high frequency ranges. This makes the dyadic

Discrete Wavelet Transform unreliable for fault diagnosis of rotating machinery due

to its low frequency focus. However, Discrete Wavelet Packet Analysis (DWPA)

determines packet coefficients both in low frequencies and in high frequencies,

which makes it useful in fault diagnosis of rotating machinery.

2.3.2 Artificial Intelligence Techniques in Fault Identification

The call for automatic diagnosis of faults in machinery systems is gaining more and

more importance, due to the increasing complexity and riskiness of modern

machinery systems, and the growing demands for quality, cost efficiency,

availability, reliability and safety. Artificial intelligence applied in fault diagnosis

brings two benefits: firstly, it reduces the negative effects of conventional human



interpretation such as time consumption; secondly, it improves the reliability of the

fault diagnosis because of the removal of human subjectivity during the diagnostic

procedure.

To achieve the goal of automatic diagnosis, fault detection and identification

sometimes employ artificial intelligence (AI) approaches for pattern classification.

Numerous attempts have been made to improve the accuracy and efficiency of fault

diagnosis of rotating machinery by employing AI techniques. Few have attempted to

summarise these techniques comprehensively. Zhong [105] introduced new

developments in the theory and application of intelligent condition monitoring and

diagnostics in China. He concluded that the trends in intelligent diagnosis are NN-

based fault classifiers, NN-based expert systems, NN-based prognosis, behaviour-

based intelligent diagnosis, remote distributed intelligent diagnosis networks and

intelligent multi-agent architecture for fault diagnosis. He provided a good overview

of intelligent fault diagnosis of machinery but was somewhat general.

Pham [106] theoretically analysed the applicability of artificial intelligence in

engineering problems and predominantly looked at knowledge-based systems, fuzzy

logic, inductive learning, Neural Networks and genetic algorithms in different

branches of engineering but not in machinery fault diagnosis. Tandon [29] mentioned

that automatic diagnosis was a trend in the fault diagnosis of rolling elements. Gao

[107] gave an up-to-date review on recent progress of soft computing methods-based

motor fault diagnosis systems. He summarized several motor fault diagnosis

techniques using Neural Networks, fuzzy logic, neural-fuzzy, and genetic algorithms

(GAs) and compared them with conventional techniques such as direct inspection.

Chow [30] also gave a brief review, which listed 14 papers from experts in the area

of motor fault detection and diagnosis. He grouped those papers into five main

categories: survey papers, model-based approaches, signal processing approaches,

emerging technology approaches, and experimentation.

This section presents a review of the application of Artificial Intelligence (AI)

techniques in fault diagnosis of rotating machinery. In the literature, it appears that

diagnostics techniques for different rotating machinery parts can have much in

common. Previous research has been conducted predominantly on specific

components and faults in rotating machinery. This section provides a review of



techniques concerned with rotating elements and shows the potential of generic

application of techniques over a range of rotating elements.

The second part of this section presents a summary of AI based intelligent fault

diagnosis techniques. These are grouped into the following categories: Neural

Networks, fuzzy sets, expert systems, and hybrid approaches (as shown in Figure

2.9). The limitations and strengths of these different techniques are addressed

respectively.

Figure 2.9: An overview of fault detection and identification techniques

2.3.2.1 Neural Networks Based Fault Diagnosis

In general, intelligent diagnosis is carried out when known inputs are fed into black

boxes, which subsequently produce outputs in accordance with machine faults.

Neural Networks (NN) are suitable for these tasks and have been widely researched

as an artificial intelligence tool for machinery fault diagnosis [1]. By employing such

a tool, maintenance personnel need not understand or operate the internal

mechanisms of a Neural Network. He or she will only be responsible for inputting

the appropriate data to a Neural Network. The Neural Network will then be trained

on this data so that it can diagnose faults.

Neural Networks have become widely adopted in rotating machinery for fault

diagnosis due to their learning ability. The learning ability of NN makes Neural

Networks capable of tackling a new problem by making use of existing information.

In addition, Neural Networks also have the capacity to model complex nonlinear

problems, which may approximate real world fault diagnosis.

In rotating machinery fault diagnosis, a Neural Network is usually trained on features

extracted from obtained signals. The Neural Network is usually viewed as a fault

classifier. To date, several types of Neural Networks [17] have been applied in

machinery fault diagnosis including Back Propagation for Feed Forward Networks


(BPFF), Multi Layer Perceptrons (MLP), Back Propagation Multilayer Perceptrons

(BPMP), Radial Basis Function networks (RBF), Self Organised Maps (SOM), and

Principal Component Analysis (PCA).

A Feed Forward Neural Network consists of several successive layers of sigmoid

neurones, where every neuron in a layer receives as inputs the outputs of all the

neurones of the preceding layer. The input to the network is the n inputs to the

neurones in the first layer. The outputs of the m neurones in the last layer are the

network outputs.
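
The layer-by-layer structure described above can be sketched as a forward pass. The layer sizes, the random weights and the zero biases are illustrative assumptions; no trained network from the literature is reproduced here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(x, layers):
    """Propagate x through successive fully connected sigmoid layers."""
    a = np.asarray(x, dtype=float)
    for W, b in layers:
        a = sigmoid(W @ a + b)   # each neuron receives all outputs of the preceding layer
    return a

rng = np.random.default_rng(0)
n_inputs, n_hidden, m_outputs = 4, 6, 3          # illustrative sizes
layers = [(rng.normal(size=(n_hidden, n_inputs)), np.zeros(n_hidden)),
          (rng.normal(size=(m_outputs, n_hidden)), np.zeros(m_outputs))]
y = feed_forward(rng.normal(size=n_inputs), layers)
```

The n inputs enter the first layer and the m values returned by the last layer are the network outputs, each lying between 0 and 1 because of the sigmoid.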

The most common application of Feed Forward neural nets is for finding a mapping

that best interpolates from a given set of sample points of a multivariate function. For

this purpose the neural net designer has a choice of two variants: the multi-layer

perceptron (MLP) and Radial basis functions.

Multi-layer perceptrons are usually used for pattern recognition. Each perceptron has

the function of outputting either “0” or “1”. This makes it suitable for classification.

Radial basis function networks use Gaussian transfer functions for their nodes. Also

the inputs are not aggregated by an affine transform but by the Euclidean distance

from a centre. Each neuron has a different centre. Therefore the output of the neuron

decreases exponentially as the input point moves away from the centre.
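
The response of a layer of radial basis function neurons can be sketched as below. The centres, the common width parameter and the input point are illustrative assumptions.

```python
import numpy as np

def rbf_layer(x, centres, width):
    """Gaussian response of each neuron to the Euclidean distance from its centre."""
    d2 = np.sum((centres - x) ** 2, axis=1)      # squared distance to every centre
    return np.exp(-d2 / (2.0 * width ** 2))

centres = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]])   # one centre per neuron
out = rbf_layer(np.array([0.0, 0.0]), centres, width=1.0)
```

The neuron whose centre coincides with the input responds with 1, and the responses of the other neurons fall off with their distance from the input.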

A principal component is a variable that defines a projection capturing

the maximum amount of variation in a dataset while being orthogonal (and therefore

uncorrelated) to the previous principal components of the same dataset. Principal

component analysis (PCA) is commonly used as a cluster analysis tool. It is designed

to capture the variance in a dataset in terms of principal components. In effect, one is

trying to reduce the dimensionality of the data to summarise the most important (i.e.

defining) parts whilst simultaneously filtering out noise. Normalisation, however,

can sometimes remove this noise and reduce the variance of the data, which could affect

the ability of PCA to capture data structure.
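
Principal components can be computed from the singular value decomposition of the mean-centred data. The sketch below assumes synthetic two-dimensional data with most of its variance along one direction; the data, the function name and the choice of the SVD route are illustrative.

```python
import numpy as np

def pca(data, n_components):
    """Principal directions and explained variances of the rows of data."""
    centred = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    variances = s ** 2 / (len(data) - 1)
    return vt[:n_components], variances[:n_components]

rng = np.random.default_rng(0)
# Illustrative data: large spread along (1, 1), small spread across it.
latent = rng.normal(size=(500, 2)) * np.array([5.0, 0.5])
rotation = np.array([[1.0, 1.0], [-1.0, 1.0]]) / np.sqrt(2.0)
data = latent @ rotation

components, variances = pca(data, 2)
```

The first component aligns with the direction of greatest variance and the second is orthogonal to it, so keeping only the first reduces the dimensionality while retaining most of the variation.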

Back propagation is a commonly used algorithm for training a Neural

Network. All back propagation methods apply the chain rule for ordered partial

derivatives, which is used to calculate the sensitivity that a cost function has

with respect to the internal states and weights of a network. In other words, the term



back propagation is used to imply a backward pass of error to each internal node

within the network, which is then used to calculate weight gradients for that node.

Learning progresses by alternately propagating forward the activations and

propagating backward the instantaneous errors.
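
The backward pass of error can be illustrated on a small network. This sketch assumes a two-layer sigmoid network without biases and a squared-error cost, both chosen only for brevity; the gradient obtained by the chain rule is compared with a finite-difference estimate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grads(W1, W2, x, target):
    """Squared-error loss of a two-layer sigmoid network and its weight gradients."""
    h = sigmoid(W1 @ x)                        # forward pass: hidden activations
    y = sigmoid(W2 @ h)                        # forward pass: network outputs
    err = y - target
    loss = 0.5 * np.sum(err ** 2)
    delta2 = err * y * (1.0 - y)               # error at the output nodes
    delta1 = (W2.T @ delta2) * h * (1.0 - h)   # error passed back to the hidden nodes
    return loss, np.outer(delta1, x), np.outer(delta2, h)

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(5, 3)), rng.normal(size=(2, 5))   # illustrative weights
x, target = rng.normal(size=3), np.array([0.0, 1.0])
loss, gW1, gW2 = loss_and_grads(W1, W2, x, target)

# Finite-difference check of one weight gradient.
eps = 1e-6
W1_shift = W1.copy()
W1_shift[0, 0] += eps
numeric = (loss_and_grads(W1_shift, W2, x, target)[0] - loss) / eps
```

A training step would then move each weight a small distance against its gradient, alternating forward and backward passes as described above.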

Recurrent Networks have the same characteristics as the standard Feed Forward

networks, but with feedback connections (as shown in Figure 2.10). Because of these

feedback connections, cycles are present in the network. Therefore, training is

sometimes iterated for a long period of time before a response is produced.

Figure 2.10: A Recurrent Neural Network

Recurrent networks tend to be more difficult to train than feed forward networks, as a

result of the cycles, though a fair number of recurrent architectures are still

frequently used, including Jordan, Elman and time delay networks. Some problems are

particularly suited to the use of recurrent networks in favour of Feed Forward

networks, such as time series prediction.

A Self Organising Map (SOM) Neural Network defines a mapping from input signal

of arbitrary dimension to a one or two dimensional array of nodes. This array of

nodes corresponds to a discrete map. The biological basis of a SOM is that sensory

inputs, such as motor, visual, auditory etc., are mapped onto corresponding areas of

the cerebral cortex in an orderly fashion. The neurons are close together and interact


via short synapse connections. Figure 2.11 shows a one dimensional SOM structure.

Figure 2.12 displays an SOM in two dimensions.

Figure 2.11: One Dimensional Self Organising Map

As can be seen from Figure 2.11, the SOM is essentially a two layer Neural Network

with full connections, but with additional connections between neurons in the output layer.

Methods for learning in such an architecture are discussed in the following

section.

Figure 2.12: Two dimensional Self Organising Map


SOMs represent a topology differently because the connections exist between the

input layer and output layer as well as between neurons in the output layer.

Connections do not exist between neurons in the output layer for other network

types. The Kohonen SOM can have the following attributes:

Competition: Neurons in the network compete to see which neuron is closest to an

input pattern.

Cooperation: The winning neuron determines a neighbourhood - a group of

neurons close by - providing the basis for the neighbouring neurons to cooperate,

which is to adjust their weights together.

Adaptation: The excited neurons steadily adapt their values to be closer to that of

the input pattern through adjustments to their weights.
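
The three attributes above can be sketched as a single training step of a one dimensional SOM. This is a minimal illustration rather than the algorithm of any cited work: the function name `som_step`, the rectangular neighbourhood, the learning rate and the example weights are all assumptions made for the sketch.

```python
import numpy as np

def som_step(weights, x, learning_rate=0.5, radius=1):
    """One training step of a one-dimensional SOM (illustrative sketch).

    weights: (n_nodes, n_features) array, one weight vector per output node.
    x:       (n_features,) input pattern.
    """
    # Competition: the node whose weight vector is closest to x wins.
    distances = np.linalg.norm(weights - x, axis=1)
    winner = int(np.argmin(distances))

    # Cooperation: nodes within `radius` grid positions of the winner
    # form its neighbourhood (a simple rectangular neighbourhood, assumed here).
    positions = np.arange(weights.shape[0])
    in_neighbourhood = np.abs(positions - winner) <= radius

    # Adaptation: move the weights of the neighbourhood towards the input.
    updated = weights.copy()
    updated[in_neighbourhood] += learning_rate * (x - weights[in_neighbourhood])
    return updated, winner

# Hypothetical example: four output nodes, two-dimensional input.
weights = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [2.0, 2.0]])
x = np.array([1.0, 0.8])
new_weights, winner = som_step(weights, x)
```

In this example the third node wins, and it and its two neighbours move towards the input while the first node is left unchanged. In practice the learning rate and neighbourhood radius are usually reduced as training proceeds.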

Neural Networks, which have been successfully applied in rotating machinery fault

detection and identification, are listed in Table 2.2. The table provides a preliminary

summary of the type of NN used and its application in diagnosing faults in common

machine components.

Table 2.2 Neural Networks applied in diagnosing rotating machinery faults

Components: pumps | rolling bearings | gears | gearboxes | shafts | fans | motor

RNN: [108] [108]
RBF: [1]
BPFF: [75, 109] [1] [110] [109] [111] [108]
MLP: [63, 112] [63, 112] [112] [63, 113] [114] [112]
Kohonen SOM: [115]
LVQ: [63] [63] [63]

where, FFNN represents a Feed Forward Neural Network, and RNN represents a

Recurrent Neural Network.

Lu [66] processed the vibration signals of rolling bearings in motors using the fast

Fourier transform and auto power spectrum estimation. The natural frequencies

of the rolling bearing were obtained and used for diagnosing faults through



the training and identification phases of a back-propagation

feed forward multilayer Neural Network. Lu [66] obtained positive results from

rolling element bearings in electrical machines, which showed the feasibility and

effectiveness of his technique. It was clearly shown that the frequency signature was

more sensitive for damage detection than other signatures.

Baillie and Mathew’s results [1] in diagnosing faults of rolling element bearings

indicated that back propagation Neural Networks generally outperformed the radial

basis functions.

Baillie and Mathew introduced the concept of fault diagnosis

using an observer bank of autoregressive time series models. The concept was

applied experimentally to diagnose a number of induced faults in a rolling element

bearing using the measured time series vibration signal. Three distinct techniques of

autoregressive modelling were compared for their performance and reliability under

conditions of various signal lengths. The results indicated that back propagation

Neural Networks generally outperformed the radial basis functions and the

traditional linear autoregressive models. This model-based technique for fault

diagnosis was found to require much shorter lengths of vibration data than traditional

pattern classification techniques used in the field of machine condition monitoring.

The three techniques of autoregressive modelling were the Box-Jenkins linear

autoregressive models, back propagation Neural Networks, and radial basis function

networks.

A back-propagation Neural Network is a multi-layer feed-forward architecture

Neural Network. In Baillie and Mathew’s work, the inclusion of a hidden layer

allowed the network to perform a non-linear mapping of vectors from input to

output. The output layer had only one node. A disadvantage was that a systematic method

for selecting the most appropriate parameters did not exist. Thus the building and

training of Artificial Neural Networks (ANN) typically required a trial and error

approach (and some experience). When the data length was as long as 500 vector

representations, the classification of faults was 100% correct. It needed half the

length of data for accurate classification of the faults.



Radial Basis Functions were characterised by the localised response of hidden layer

nodes, making some applications highly suited to this approach. Another important

feature of radial basis functions was their extremely rapid training times compared to

back propagation networks. Researchers had repeatedly observed training times of an

order of magnitude quicker.

Radial basis functions generally failed to satisfactorily classify vibration signals,

compared to the performance of the linear autoregressive models and back

propagation Neural Network models. It was also observed that the operation of the

radial basis functions was much slower than either the linear autoregressive models

or the back propagation models. This was attributable to the larger number of hidden

layer nodes inside the network, requiring many floating point calculations to be

performed.

McCormick and Nandi [113] used multi-layer perceptron and radial basis function

Neural Networks to classify the condition of a small rotating machine. It was found

that both networks achieved similar success in fault classification. However, Radial

Basis Function (RBF) networks could be trained in a significantly shorter length of

time. Multi-layer perceptrons (MLP) required fewer neurons and were faster in recall

operation.

Meesad and Yen [63] applied MLP and Learning Vector Quantization (LVQ)

classifiers to diagnosing faults in gears, bearings, and shafts. It was shown that these

two Neural Networks were successful, although they required off-line training and

iterative presentation of the training data. They were cumbersome when applied to

pattern classification problems that needed fast, on-line, real-time, incremental

learning.

A diagnostic algorithm using Kohonen’s network proposed by Tanaka [115-120]

could classify data obtained from a motor in a plant. The proposed diagnostic

algorithm was compared with Back Propagation networks for both fault detection

and fault identification. In Tanaka’s application, the performance

of the Kohonen Self Organising Map (SOM) was found to be better than that of Back Propagation

networks. It was also concluded that the map size of the Kohonen network was a

factor that affected its performance.



A Kohonen Self-Organizing Map (KN) was developed for pattern recognition, and

had been extended to fault classification. However, the KN could not be applied to

classify faults from the system output if the output contained other factors, such as system state

and sensor mounting errors. To overcome this problem, a constrained KN (CKN)

[121] was proposed. To eliminate the effect of the system state and the mounting

errors, it was proposed that the weight vectors of the CKN were constrained in the

parity space. The training algorithm of the CKN was derived, and its convergence

discussed. Application of the CKN to fault classification was presented, and its

performance was illustrated by an example involving a redundant sensor system with

six sensors.

The performance of Neural Networks is sometimes subjective. The selection of the

network structure may affect the network performance to a significant extent. Some

Neural Networks can have very slow convergence speeds. Training these Neural

Networks takes a substantial length of time. Even worse, training can fall victim to the

law of diminishing returns. In particular, the slow operating speed of some Neural

Networks cannot meet the requirement that machinery be diagnosed on line or

within certain timeframes. Furthermore, a Neural Network can converge to a local optimum,

and so does not guarantee an optimal solution. This disadvantage of Neural Networks

can lead directly to errors in diagnosis. Yet another limitation of NNs is the lack of

semantics [93].

2.3.2.2 Fuzzy Sets Based Fault Diagnosis

Fuzzy logic-based fault diagnosis methods have the advantages of embedded

linguistic knowledge and approximate reasoning capability. The Fuzzy logic

proposed by Zadeh [12] performs well at qualitative description of knowledge.

However, the design of such a system depends heavily on the intuitive experience

acquired from practicing operators thus resulting in subjectivity of diagnosed faults.

The fuzzy membership function and fuzzy rules cannot be guaranteed to be optimal

in any case. Furthermore, fuzzy logic systems lack the ability of self-learning, which

is essential in some highly demanding real-time fault diagnosis cases [107].

Rough set based intelligence diagnostic systems have been constructed and used in



diagnosing valves in three-cylinder reciprocating pumps [90] and turbogenerators

[122, 123].

Miguel [124] described a failure detection and diagnosis method for rotating

machinery which combines fuzzy logic with vibration analysis techniques. Vibration

analysis provided diagnosis information that contained uncertainties from several

sources such as measurement conditions and disturbances. However, this information

was qualitatively meaningful and redundant enough so as to obtain a reliable fault

detection and diagnosis by using fuzzy knowledge processing. The fault sensitivity of the

different symptoms was the key factor in establishing a fault diagnosis by fuzzy

processing of those symptoms.

2.3.2.3 Expert Systems Based Fault Diagnosis

Expert systems are computer programs embodying knowledge about a narrow

domain for solving problems related to that domain. An expert system usually

comprises two main elements, a knowledge base and an inference mechanism. The

knowledge base contains domain knowledge which may be expressed as any

combination of ‘IF-THEN’ rules, factual statements, frames, objects, procedures and

cases [106].

A fault diagnosis expert system is a system with a knowledge base storing the

accumulated experience of fault diagnostics experts. However, expert systems can

only diagnose faults stored in the knowledge base and are thus unable to tackle new

problems which have not been classified, i.e., a fault diagnosis expert system cannot

diagnose new machinery faults when compared to real human expertise.

Furthermore, building knowledge bases is labour intensive and time consuming. This

often makes other AI techniques preferable due to their flexibility and

efficiency in diagnosing rotating machinery faults.

Available expert systems for rotating machinery fault diagnosis are limited.

Amethyst [125] is an expert system to assist with vibration based condition

monitoring of rotating machinery. Vibration patterns are collected from machinery

such as pumps, fans, motors and generators during normal operating conditions. The

Amethyst software performs automatically the same analysis of these vibration

patterns as an experienced machinery expert. The result of this analysis is a



diagnosis as to whether the machine is working properly and what the problems are.

Amethyst has been used by approximately 400 companies world-wide, making it one

of the most widely used expert systems in the manufacturing industry.

Another expert system [126] developed by El Adawi performs preventive

maintenance tasks and detects faults/failure during standard operating cycles. The

expert system was a combination of an intelligent inference engine matched with a

database of information. That system enabled the operator to spot instantaneously the

parameters of interest. Predictive maintenance enabled the operator to minimize the

shut down time of faulty equipment and hence increased productivity. Furthermore,

the system minimized probable human faults and reduced production costs. The

design of this system was based upon the data from measuring equipment linked with

the information system for a Syngas compressor and other rotating machinery such

as pumps, motors, and turbines, etc.

Georgin [127] carried out research based on two of EDF's diagnostic expert systems,

DIVA (for turbine-generators) and DIAPO (for primary cooling pumps), which

interact with users who have different levels of knowledge.

Shao [128] proposed a new concept on the degree of credibility of parameter value

variations (DCPV factor) and developed an expert system for on-line monitoring and

diagnostics of rolling element bearings.

2.3.2.4 Other Artificial Intelligence Techniques Based Fault Diagnosis

More and more hybrid artificial intelligence techniques are applied to fault diagnosis

of rotating machinery. Neurofuzzy computing recently emerged as an alternative

technique to diagnosing rotating machinery faults [63, 118, 129-139]. Neurofuzzy

techniques work in the following way: A fuzzy set interpretation is incorporated into

the network design to handle imprecise information. Neural Network architecture is

used to automatically deduce fuzzy if-then rules based on a hybrid supervised

learning scheme. For instance, Altmann and Mathew successfully applied an

adaptive neural fuzzy inference system (ANFIS), using a zero-order Sugeno fuzzy

model (shown in Figure 2.13), to rolling element bearing fault diagnosis [139].


[Figure 2.13 diagram: inputs x and y; input membership functions A1, A2, B1 and B2; rule outputs W1Z1 and W2Z2; normalisation node; weighted sum output (W1Z1 + W2Z2) / (W1 + W2)]

Figure 2.13: A Zero-Order Sugeno Fuzzy Model
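
The weighted sum output of a zero-order Sugeno model, (W1Z1 + W2Z2) / (W1 + W2), can be sketched numerically as follows. The Gaussian membership shape, the product rule for combining the two inputs, and all centres, widths and constants are assumptions chosen for illustration; they are not taken from the cited ANFIS work.

```python
import math

def gaussian_mf(value, centre, width):
    """Gaussian membership function (an assumed shape for illustration)."""
    return math.exp(-0.5 * ((value - centre) / width) ** 2)

def sugeno_zero_order(x, y, rules):
    """Evaluate a zero-order Sugeno model.

    Each rule is (mf_x, mf_y, z): 'IF x is A AND y is B THEN output = z',
    where z is a constant. Firing strengths W use the product of the two
    membership values (an assumption; the minimum is another common choice).
    """
    strengths = [mf_x(x) * mf_y(y) for mf_x, mf_y, _ in rules]
    weighted = sum(w * z for w, (_, _, z) in zip(strengths, rules))
    return weighted / sum(strengths)   # (W1*Z1 + W2*Z2) / (W1 + W2)

# Two hypothetical rules: (A1, B1) -> Z1 = 0.0 and (A2, B2) -> Z2 = 1.0.
rules = [
    (lambda v: gaussian_mf(v, 0.0, 0.5), lambda v: gaussian_mf(v, 0.0, 0.5), 0.0),
    (lambda v: gaussian_mf(v, 1.0, 0.5), lambda v: gaussian_mf(v, 1.0, 0.5), 1.0),
]

output_midway = sugeno_zero_order(0.5, 0.5, rules)    # both rules fire equally
output_near_rule1 = sugeno_zero_order(0.0, 0.0, rules)
```

An input midway between the two rule centres fires both rules equally, so the output is the average of Z1 and Z2; an input at the centre of the first rule gives an output close to Z1.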

A fuzzy expert system [140, 141] for real-time process condition monitoring and

incident prevention was developed. Its reasoning strategy was based on dynamic

membership functions of fuzzy systems. The fuzzy expert system could codify the

expertise knowledge to handle incidents, perform process condition monitoring, and

provide operation support with a multimedia user interface. The prototype of this

system was successfully used in a chemical pulp mill for process condition

monitoring and incident prevention.

Artificial Neural Network (ANN) technology and fuzzy logic [118] were applied to

the field of on-line machine condition monitoring (CM) for complex electro-

mechanical systems. A method utilising a combination of human knowledge,

encoded using techniques borrowed from fuzzy logic, Kohonen Neural Networks and

statistical K-means clustering was constructed. The methodology was discussed by

means of a direct comparison between this new approach and a purely neural

approach. An analysis of other situations where this approach would be applicable

was also presented, together with a discussion of other current research work in the area of

hybrid AI technologies that should further help to alleviate the

problems under consideration.

In mechanical equipment monitoring tasks, fuzzy logic theory was applied to

situations where accurate mathematical models were unavailable or too complex to



be established, but some obscure, subjective and empirical knowledge about the

problem under investigation may still exist [132]. Such knowledge is usually

formalized as a set of fuzzy relationships (rules) on which the entire fuzzy system is

based. Sometimes, the fuzzy rules provided by human experts were only partial

and rarely complete; while a set of system input/output data were available. Under

such situations, it was desirable to extract fuzzy relationships from system data and

combine human knowledge and experience to form a complete and relevant set of

fuzzy rules. The application of the B-spline Neural Network to monitor centrifugal

pumps was described. A neuro-fuzzy approach was established for extracting a set of

fuzzy relationships from observation data, where a B-spline Neural Network was

employed to learn the internal mapping relations from a set of features/conditions of

the pump. A general procedure was set up using the basic structure and learning

mechanism of the network and finally, the network performance and results were

discussed.

Turbo machinery diagnosis [142] used fuzzy logic and certainty factor techniques for

uncertainty maintenance. For each observation of symptom provided by the user, the

system would apply a set of diagnostic rules stored in the knowledge base to

postulate currently possible causes. This system could deal with all the 88 possible

symptoms typically observed in the field.

A probabilistic Neural Network for automatic diagnosis was able to deal with the

ambiguous diagnosis problems. Kawabe [143] proposed the "partially-linearized

Neural Network (PNN)" by which failure types could be quickly distinguished on the

basis of the probability distributions of symptom parameters. The knowledge

acquisition method for PNN learning using rough sets was also proposed. These

methods were applied to the failure diagnosis of centrifugal pumps, which were used

in a chemical plant. The results of the failure diagnosis verified that the methods

were effective. The methods could also be applied to other diagnosis or pattern

recognition problems.

A more recent development, Support Vector Machines (SVMs), has been used in

diagnosing rolling bearings and compared with ANNs [144]. The features

used in these two strategies include statistical features and spectral features. The

performance of the full feature sets as inputs is compared with that of feature subsets selected



using genetic algorithms. The SVMs have a high success rate in this work. However,

training the SVMs took longer than training an equivalent ANN. Additionally,

dependent upon the number of support vectors used in the trained solution, the SVM

may also be significantly slower to operate than the ANN.

Jack combined ANN and genetic algorithms in the fault detection of rotating

machinery [145]. In any given scenario, there were many different possible features

that may be used as inputs for the ANN. The main problem in the use of ANNs was

the selection of the best inputs to the ANN, allowing the creation of compact, highly

accurate networks that require comparatively little preprocessing. Jack proposed a

genetic algorithm to tackle the problem in the application of ANN to detect faults in

rotating machinery and examined the use of a genetic algorithm (GA) to select the

most significant input features from a large set of possible features in machine

condition monitoring contexts. Using the GA, a subset of six input features was

selected from a set of 66, giving a classification accuracy of 99.8%, compared with

an accuracy of 87.2% using an ANN without feature selection and all 66 inputs.

From a larger set of 156 different features, the GA was able to select a set of six

features to give 100% recognition accuracy.
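
The idea of using a genetic algorithm to choose a compact subset of input features can be sketched as below. This is only an illustration under stated assumptions: the data are synthetic, a nearest-centroid classifier stands in for the ANN, and the population size, mutation rate, per-feature penalty and all function names are hypothetical rather than those used in [145].

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data: only features 0 and 1 carry class information.
n_per_class, n_features = 100, 10
X = rng.normal(size=(2 * n_per_class, n_features))
y = np.repeat([0, 1], n_per_class)
X[y == 1, :2] += 3.0

# Interleaved split into training and validation halves.
train, valid = np.arange(0, 200, 2), np.arange(1, 200, 2)

def fitness(mask):
    """Validation accuracy of a nearest-centroid classifier on the selected
    features, minus a small penalty per feature (a stand-in for the ANN)."""
    if not mask.any():
        return 0.0
    Xt, Xv = X[train][:, mask], X[valid][:, mask]
    centroids = np.stack([Xt[y[train] == c].mean(axis=0) for c in (0, 1)])
    dists = np.linalg.norm(Xv[:, None, :] - centroids[None, :, :], axis=2)
    accuracy = np.mean(dists.argmin(axis=1) == y[valid])
    return accuracy - 0.01 * mask.sum()

def select_features(pop_size=20, generations=30, mutation_rate=0.05):
    # Random bit masks; one all-ones mask is included so the search
    # starts no worse than using every feature.
    population = rng.random((pop_size, n_features)) < 0.5
    population[0] = True
    for _ in range(generations):
        scores = np.array([fitness(m) for m in population])
        children = [population[scores.argmax()].copy()]   # elitism
        while len(children) < pop_size:
            # Tournament selection of two parents.
            a, b = rng.integers(pop_size, size=2)
            p1 = population[a] if scores[a] >= scores[b] else population[b]
            c, d = rng.integers(pop_size, size=2)
            p2 = population[c] if scores[c] >= scores[d] else population[d]
            # Uniform crossover followed by bit-flip mutation.
            child = np.where(rng.random(n_features) < 0.5, p1, p2)
            child ^= rng.random(n_features) < mutation_rate
            children.append(child)
        population = np.array(children)
    scores = np.array([fitness(m) for m in population])
    return population[scores.argmax()]

best_mask = select_features()
```

Because the best individual is always carried into the next generation, the selected subset never scores worse than the full feature set under this fitness function. The per-feature penalty is what pushes the search towards compact subsets.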

In pump condition monitoring, ANNs, mostly with a Feed-Forward Multiple Layered

(FFML) network structure and the Back-Propagation (BP) training method, were seen to

be an effective strategy. Wang [146] proposed a systematic approach for constructing

FFML network structure based on the related concepts of Genetic Algorithms (GA).

However, when applying this strategy, a potential risk was the selection of the

network structure, which was poorly defined and affected the network performance

to a large extent.

2.3.3 Artificial Intelligence & Wavelet Transform for Fault Diagnosis

Several researchers have used wavelet transforms and Neural Networks with success

in fault diagnosis. Engin [110] outlined a wavelet transform (WT) based artificial

Neural Network (ANN) input data pre-processing scheme and presented the results

of localized gear tooth defect recognition tests by employing this proposed

methodology. The methodology consisted of calculating Daubechies' 20-order

(DAUB-20) mean-square dilation WT of the data, and then selecting predominant



wavelet coefficients distributed to certain levels of these WT as inputs to ANN for

pattern recognition. The test results showed that a fairly small sized backpropagation

network trained with a reasonably small number of training sets could detect and

classify various types or degrees of failures occurring on a spur gear pair

successfully.

Lopez [85] presented successful initial results applying continuous wavelet

transforms coupled with conventional Neural Networks to the development of a real-time

fault detection and classification system. The development of real-time

fault detection and identification technologies allowed a migration, in the respective

theatre of operation, from expensive scheduled based maintenance to the more

efficient, less costly alternative of condition based maintenance. The approach

resulted in a general methodology which was shown to work equally well on

fault-seeded helicopter gearbox data and operational data from Navy shipboard

pumps. The family of wavelet basis functions was specifically engineered to allow

for real-time implementation. The wavelet basis functions had a time-scale

decomposition mathematically inspired from biological systems and provided a

clustering in feature space, which allowed for the development of simplified Neural

Network classifiers. Application to various classes of fault data (helicopter and

shipboard pump data) resulted in perfect detection and no false alarms, with only modest

deferral rates.

Paya [112] stated that the purpose of condition monitoring and fault diagnostics was

to detect and distinguish faults occurring in machinery, in order to provide a

significant improvement in plant economy, reduce operational and maintenance costs

and improve the level of safety. The condition of a model drive-line, consisting of

various interconnected rotating parts, including an actual vehicle gearbox, two

bearing housings, and an electric motor, all connected via flexible couplings and

loaded by a disc brake, was investigated. This model drive-line was run in its normal

condition, and then single and multiple faults were introduced intentionally to the

gearbox, and to one of the bearing housings. These single and multiple faults

studied on the drive-line were typical bearing and gear faults which may develop

during normal and continuous operation of this kind of rotating machinery. He

presented the investigation carried out in order to study both bearing and gear faults



introduced first separately as a single fault and then together as multiple faults to the

drive-line. The real-time domain vibration signals obtained for the drive-line were

preprocessed by wavelet transforms for the Neural Network to perform fault

detection and identify the exact kinds of fault occurring in the model drive-line. It was

shown that, by applying multi-layer artificial Neural Networks to the data

preprocessed by wavelet transforms, single and multiple faults were successfully

detected and classified into distinct groups.

Classical signal processing techniques when combined with pattern classification

analysis could provide an automated fault detection procedure for machinery

diagnostics [147]. Artificial Neural Networks had been established as a powerful

method of pattern recognition. The Neural Network-based fault detection approach

usually required pre-processing algorithms, which enhance the fault features,

reducing their number at the same time. Various time-invariant and time-variant

signal pre-processing algorithms were studied here. These included spectral analysis,

time domain averaging, envelope detection, Wigner-Ville distributions and wavelet

transforms. A Neural Network pattern classifier with pre-processing algorithms was

applied to experimental data in the form of vibration records taken from a controlled

tooth fault in a pair of meshing spur gears. The results showed that faults could be

detected and classified without errors.

Wavelet transforms were combined with fuzzy sets in the application to fault

diagnosis. Lou [148] dealt with a new scheme for the diagnosis of localised defects

in ball bearings based on the wavelet transform and neuro-fuzzy classification.

Vibration signals for normal bearings, bearings with inner race faults and ball faults

were acquired from a motor-driven experimental system. The wavelet transform was

used to process the accelerometer signals and to generate feature vectors. An

adaptive neural-fuzzy inference system (ANFIS) was trained and used as a diagnostic

classifier. For comparison purposes, the Euclidean vector distance method and

the vector correlation coefficient method were also investigated. The results

demonstrated that the developed diagnostic method could reliably separate different

fault conditions under the presence of load variations.

Altmann [2] proposed the diagnosis of low speed rolling element bearings using best basis

wavelet transformation and ANFIS. First, he analysed the mathematical model of



rolling element vibration. This model provided an understanding of the

fundamental cause and behaviour of the vibrations. It also implied that it was

necessary to explore the capabilities and limitations of current signal processing

techniques and to facilitate research into new signal processing techniques. A

traditional technique such as the FFT requires knowledge of characteristic frequencies to

diagnose faults, thus requiring maintenance personnel to know the geometry of the rolling

elements. Such a requirement increases the complexity of using the traditional

techniques. Altmann extracted features using Spectral Peak Ratio (SPR) and Kurtosis

to overcome this disadvantage of traditional techniques. After removing outliers,

the Kurtosis and SPR were used as the input to ANFIS (adaptive neuro fuzzy

inference system) that was based on the Sugeno network. The output of ANFIS was

the membership value of different classification and the severity of different faults.
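
As a small illustration of why Kurtosis is a useful input feature, the sketch below compares the kurtosis of a noise-only signal with that of the same signal carrying periodic impacts. The signal model (unit-variance Gaussian noise with impulses of amplitude 10 every 256 samples) is a hypothetical stand-in for bearing vibration and is not taken from [2]; the Spectral Peak Ratio is not covered here.

```python
import numpy as np

def kurtosis(signal):
    """Normalised fourth central moment (equal to 3 for a Gaussian signal)."""
    centred = signal - np.mean(signal)
    return np.mean(centred ** 4) / np.mean(centred ** 2) ** 2

rng = np.random.default_rng(0)
healthy = rng.normal(size=4096)      # broadband noise only
faulty = healthy.copy()
faulty[::256] += 10.0                # periodic impacts (assumed fault model)

k_healthy = kurtosis(healthy)
k_faulty = kurtosis(faulty)
```

The impulsive signal has a much higher kurtosis than the noise-only signal, and no knowledge of the bearing geometry or of characteristic defect frequencies is needed to compute it.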

Although Altmann successfully tested the efficiency of applying DWPA and ANFIS

into fault diagnosis of rolling element bearings, there were still several aspects which

needed to be further investigated. Firstly, the Shannon Entropy was chosen as the

criterion (cost function) to select time-frequency localisation. Some other alternatives

such as the Gauss-Markov “log energy”, Lp norm, threshold entropy, and the

“SURE” entropy could also be used to select the basis of the tree of frequency

localisation. It had not yet been determined which of these could produce the optimal

basis tree for rotating machinery. Secondly, Daubechies 16 was used as the mother

wavelet to represent the signal collected from rotating bearings. In the field of signal

processing, there were other choices such as the Morlet, Coiflet, and Meyer wavelets.

Attempts to use the Morlet wavelet and Matching Pursuit occurred previously [84,

96, 97]. Comparison between these mother wavelets needed to be conducted so that

different mother wavelets may be applied for different diagnosis tasks.
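
Three of the cost functions named above can be written down in a few lines. The sketch below uses the common non-normalised definitions (Shannon entropy as the negative sum of c² log c², the Lp norm cost as the sum of |c|^p, and the threshold cost as a count of coefficients above a threshold); these definitions and the example coefficient vectors are assumptions for illustration, not formulas quoted from the works reviewed.

```python
import numpy as np

def shannon_cost(c):
    """Non-normalised Shannon entropy: -sum(c_i^2 * log(c_i^2)), with 0*log(0) := 0."""
    e = np.asarray(c, dtype=float) ** 2
    e = e[e > 0]
    return float(-np.sum(e * np.log(e)))

def lp_norm_cost(c, p=1.0):
    """Sum of |c_i|^p; for p < 2 it is smaller when energy is concentrated."""
    return float(np.sum(np.abs(c) ** p))

def threshold_cost(c, threshold):
    """Number of coefficients whose magnitude exceeds the threshold."""
    return int(np.sum(np.abs(c) > threshold))

# Two hypothetical coefficient vectors with the same (unit) energy.
concentrated = np.array([1.0, 0.0, 0.0, 0.0])
spread = np.array([0.5, 0.5, 0.5, 0.5])
```

For these two vectors every cost is lower for the concentrated representation, which is the property a best basis search exploits: the basis with the lowest total cost packs the signal energy into the fewest coefficients.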

2.4 Conclusion of Review

The literature review shows that the following topics need to be developed further.

(1) Development of more in-depth feature extraction tools and methodologies with

correlations to experts’ interpretation of these features to achieve accurate fault

diagnosis.



(2) Development of advanced artificial intelligence techniques (instead of human

expert interpretation) for diagnosis that uses advanced or traditional extracted

features.

(3) Development of advanced integrated automated diagnosis techniques using a

combination of artificial intelligence techniques and advanced feature extraction

techniques.

In particular, the application of AI techniques to fault diagnosis of rotating

machinery has become prominent due to its potential to reduce

dependency on human expertise.

CHAPTER 3. ADVANCED FEATURE EXTRACTION TECHNIQUES

3.1 Introduction

This chapter introduces the theoretical foundation of the methodology for feature

extraction investigated in this research, predominantly based on advanced time-

frequency analysis techniques. The representation of time-frequency planes of

signals was further developed to better present features of signals. Further studies of

time-frequency analysis techniques were conducted in this research to better extract

defect related impulsive components or represent time-frequency planes of signals

with better resolution.

DWPA has been applied, with some success, to fault diagnosis of rotating machinery.

However, the investigations to date have been inadequate. For instance, the choice of

wavelet functions and best basis has been limited. In this chapter, an improved

DWPA is presented from the aspects of best basis selection as well as wavelet

function selection. A variety of basis selection criteria are presented with an analysis

and a final decision. Waveforms and properties of different wavelet functions are

presented.

Matching Pursuit, a recent technique, has been applied to fault diagnosis of rolling

bearings. It is one of the adaptive approximation techniques with pursuit, where a

variety of dictionaries are available for approximation. However, not all dictionaries

are suitable for the application of rolling bearing fault diagnosis. In this chapter, the

principles of Matching Pursuit are elaborated. The available dictionaries are

introduced and then the dictionary suitable for vibration analysis is selected and

presented in detail.

Another adaptive approximation technique, Basis Pursuit, is introduced; this is its

first application to rolling bearing fault diagnosis. The selection of the wavelet packet

dictionary and optimisation algorithms of Basis Pursuit are presented. A Basis

Pursuit denoising method is employed to analyse vibration of rolling bearings under

different fault conditions.



The computational complexity of Basis Pursuit is discussed and analysed in the

chapter. A comparison of these three time-frequency analysis techniques is also

presented in terms of computational complexity and time-frequency component

resolution.

3.2 Best Basis Discrete Wavelet Packet Analysis

Two problems are often confronted when utilising classical methods such as

demodulation. Firstly, the most appropriate frequency band to demodulate needs to

be chosen. If the chosen frequency band does not contain a resonance excited by

bearing damage, demodulation analysis can fail to detect the defect. Impulse

characteristics can change as the bearing damage becomes more severe. The impulses

are narrow and prominent when defects are incipient but broaden and

usually become embedded in the signal as defects develop in the bearing. The other

difficulty is capturing the transient nature of defect-generated impulses. The

frequency components can change with time. However, demodulation and

spectrum analysis can only show certain frequency components of the signals [2]. In order to

overcome these problems, wavelet transform techniques are utilised to present

signals in the time-frequency domain.

In this work, the best basis DWPA for the collected signal was calculated in the joint

time-frequency domain. Wavelet coefficients, which were mainly affected by the

bearing faults, were determined and extracted. The resonance frequency need not be

known when using DWPA. Furthermore, a basis that is well matched to the

characteristics of the signal was chosen to improve the representation of bearing

impacts. Features from the frequencies of interest were extracted, while contributions

from noise and other components outside the frequencies of interest were

substantially reduced.

An improved method for extracting bearing fault impulses from vibration signals is presented, in which the DWPA best basis is used to extract the wavelet correlation coefficients at an optimal time-frequency resolution. The DWPA accomplishes this by forming a well-balanced binary tree from which the best basis is selected.

To demonstrate the improved best basis DWPA, the concepts of the wavelet transform are introduced first, followed by the fundamentals of best basis discrete wavelet packet analysis. The discrete wavelet transform is also compared with discrete wavelet packet analysis to further illustrate the DWPA.

Concepts and fundamentals of DWPA [2]

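A wavelet $\psi$ is a function of zero average:

\[ \int_{-\infty}^{+\infty} \psi(t)\,dt = 0 \tag{3.1} \]

which is dilated with a scale parameter $s$ and translated by $u$:

\[ \psi_{u,s}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t-u}{s}\right) \tag{3.2} \]

The wavelet transform of $f$ at the scale $s$ and position $u$ is computed by correlating $f$ with a wavelet $\psi$:

\[ Wf(u,s) = \int_{-\infty}^{+\infty} f(t)\,\frac{1}{\sqrt{s}}\,\psi^{*}\!\left(\frac{t-u}{s}\right)dt \tag{3.3} \]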
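As a numerical illustration of the zero-average condition (3.1) and the correlation (3.3), the following sketch approximates the wavelet transform by a Riemann sum. The Mexican hat wavelet, the sampling grid, the test transient and the function names are assumptions made for this example only; they are not the wavelets or signals used later in this work.

```python
import numpy as np

def mexican_hat(t):
    """Real, zero-average wavelet used only for illustration (assumption)."""
    return (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def wavelet_transform(f, t, u, s, psi=mexican_hat):
    """Riemann-sum approximation of Wf(u, s) in (3.3) for a real wavelet."""
    dt = t[1] - t[0]
    atom = psi((t - u) / s) / np.sqrt(s)   # dilated, translated wavelet (3.2)
    return np.sum(f * atom) * dt

t = np.linspace(-20.0, 20.0, 4001)
dt = t[1] - t[0]

# Zero-average condition (3.1), checked numerically.
zero_mean = np.sum(mexican_hat(t)) * dt

# A short transient centred at t = 3; the largest |Wf(u, s)| is expected there.
f = np.exp(-(t - 3.0) ** 2)
positions = np.linspace(-10.0, 10.0, 201)
coeffs = np.array([wavelet_transform(f, t, u, 1.0) for u in positions])
u_peak = positions[np.argmax(np.abs(coeffs))]
print(zero_mean, u_peak)
```

The position of the largest coefficient locates the transient in time, which is the property exploited when impulses caused by bearing defects are sought.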

A wavelet orthonormal basis decomposes the frequency axis into dyadic intervals whose sizes grow exponentially. Wavelet packets generalise this fixed dyadic construction by decomposing the frequency axis into intervals whose bandwidths may vary. Each frequency interval is covered by the time-frequency boxes of wavelet packet functions that are uniformly translated in time in order to cover the whole plane.

Although the wavelet transform has been widely researched during the last decade,

the founding principles behind wavelets can be traced back as far as 1909 when

Alfred Haar discovered another orthonormal system of functions, such that for any

continuous function f(x), the series

\[ f(x) = \sum_{j=0}^{\infty} \sum_{k=0}^{2^{j}-1} \alpha_{2^{j}+k}\,\psi(2^{j}x - k), \qquad \text{for } 0 \le x < 1 \tag{3.4} \]

converges to $f(x)$ uniformly over the interval $0 \le x < 1$. Haar’s research led to the

simplest of the orthogonal wavelets, a set of rectangular basis functions. The Haar basis function is of limited use because it is discontinuous. This makes it inefficient at modelling smooth signals, as many levels need to be included to obtain an accurate representation. Major advances in the development of wavelets have been made since then; these are introduced in a following section, “wavelet selection”.

The wavelet transform is made up of a scaling function (father wavelet) and an

analysing function (mother wavelet). The scaling function is a solution of the dilation

equation:

\[ \phi(x) = \sum_{k} c_{k}\,\phi(2x - k) \tag{3.5} \]

where the constants $c_{k}$ satisfy the condition:

\[ \sum_{k} c_{k} = 2 \tag{3.6} \]

The scaling function generates the family of functions:

\[ \phi_{j,k}(x) = 2^{-j/2}\,\phi(2^{-j}x - k) \tag{3.7} \]

where $j, k \in \mathbb{Z}$.

The sequence $\{\phi_{j,k}(x) : j, k \in \mathbb{Z}\}$ is an orthonormal family in $L^{2}(\mathbb{R})$.

Equations for the mother wavelet can be derived from the scaling function:

\[ \psi(x) = \sum_{k} (-1)^{k}\,c_{k+1}\,\phi(2x + k) \tag{3.8} \]

The mother wavelet generates the family of wavelets:

\[ \psi_{j,k}(x) = 2^{-j/2}\,\psi(2^{-j}x - k) \tag{3.9} \]

where $j, k \in \mathbb{Z}$.

As with the scaling function, the sequence $\{\psi_{j,k}(x) : j, k \in \mathbb{Z}\}$ is an orthonormal family in $L^{2}(\mathbb{R})$.

Fundamental concepts of Discrete Wavelet Packet Analysis

Discrete wavelet packets form a redundant dictionary of bases, from which the best basis to represent a given signal can be selected. The dictionary is composed of elementary functions called wavelet packets, $\{2^{j/2}\,\omega_{n}(2^{j}x - k) : j, k, n \in \mathbb{Z}\}$, where $j$, $k$ and $n$ represent the index of scale, position and degree of oscillation, respectively. Hubbard loosely described wavelet packets as the product of a wavelet, a wiggle, and an oscillating function. The wavelet reacts to abrupt fluctuations in the signal, while smooth oscillations are catered for by the wiggle function. It is this combination of the wavelet and wiggle functions that enables the size, frequency and position of time-frequency atoms all to be varied independently.

As with the wavelet transform, wavelet packets can be represented by a filter bank constructed from quadrature mirror filters. A wavelet packet basis allows any dyadic tree structure: at each node in the tree there is the option of sending the signal through the low-pass/high-pass filter bank, or not. Wavelet packets are generated by the following iterations:

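\[
\begin{aligned}
\omega_{2n}(2^{j-1}t - k) &= \sqrt{2}\,\sum_{m} h_{m-2k}\,\omega_{n}(2^{j}t - m) \\
\omega_{2n+1}(2^{j-1}t - k) &= \sqrt{2}\,\sum_{m} g_{m-2k}\,\omega_{n}(2^{j}t - m)
\end{aligned}
\tag{3.10}
\]

where $h$ and $g$ represent the low-pass and high-pass quadrature mirror filters, respectively, and $\omega_{0}$ and $\omega_{1}$ correspond to the father wavelet (scaling function) and mother wavelet (analysing function).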

For a given signal the wavelet packet coefficients can be iteratively computed by the

following equations.

\[
\begin{aligned}
c^{\,j-1}_{2n,l} &= \sum_{k} h_{k-2l}\,c^{\,j}_{n,k} \\
c^{\,j-1}_{2n+1,l} &= \sum_{k} g_{k-2l}\,c^{\,j}_{n,k}
\end{aligned}
\tag{3.11}
\]
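The coefficient recursion (3.11) can be sketched in a few lines of Python. The sketch below assumes the two-tap Haar filters and a periodic signal extension, and it keys each node by (depth, position), so increasing depth corresponds to decreasing $j$ in (3.11). The function names and the test signal are hypothetical; the analysis in this work uses symlet8 filters rather than Haar.

```python
import numpy as np

# Haar quadrature mirror filters (an assumption for this sketch).
H = np.array([1.0, 1.0]) / np.sqrt(2.0)    # low-pass h
G = np.array([1.0, -1.0]) / np.sqrt(2.0)   # high-pass g

def split_node(c, h=H, g=G):
    """One step of (3.11): (sum_k h[k-2l] c[k], sum_k g[k-2l] c[k])."""
    n = len(c)
    low = np.zeros(n // 2)
    high = np.zeros(n // 2)
    for l in range(n // 2):
        for i in range(len(h)):
            k = (2 * l + i) % n            # periodic boundary (assumption)
            low[l] += h[i] * c[k]
            high[l] += g[i] * c[k]
    return low, high

def wavelet_packet_tree(x, depth, h=H, g=G):
    """Full tree of coefficients keyed by (depth, position)."""
    tree = {(0, 0): np.asarray(x, dtype=float)}
    for d in range(depth):
        for p in range(2 ** d):
            low, high = split_node(tree[(d, p)], h, g)
            tree[(d + 1, 2 * p)] = low          # child 2n
            tree[(d + 1, 2 * p + 1)] = high     # child 2n + 1
    return tree

x = np.arange(8.0)
tree = wavelet_packet_tree(x, depth=3)
# With orthonormal filters the signal energy is preserved at every depth.
energies = [sum(np.sum(tree[(d, p)] ** 2) for p in range(2 ** d))
            for d in range(4)]
print(energies)
```

Because both the low-pass and the high-pass outputs are split again at every level, the tree holds $2^{d}$ nodes at depth $d$, in contrast to the DWT, which re-splits only the low-pass branch.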

DWPA can also be explained in comparison with the DWT. The DWT decomposes the signal into orthonormal “wavelets”, scaled and shifted versions of the “mother wavelet” $\psi$. A function $f(t)$ can be expressed by its wavelet expansion,

\[ f(t) = \alpha_{0} + \sum_{j=0}^{\infty} \sum_{k=0}^{2^{j}-1} \alpha_{2^{j}+k}\,\psi(2^{j}t - k), \qquad \text{for } 0 \le t < 1 \tag{3.12} \]


The integer $j$ indexes the levels of the wavelets, and $k$ indexes the wavelets within each level. Whereas the DWT re-splits only the low frequency branch at each level, DWPA decomposes signals into both low frequency and high frequency components, in the manner shown in Figure 3.1.

For instance, an example tree of the wavelet packet decomposition of a signal sampled at 16 kHz is shown in Figure 3.2. It can be seen that the child nodes of the best basis tree are in an order which differs from the numerical order of the frequency.

Figure 3.1: Filter bank representation of DWT and DWPA [149]

3.2.1 Introduction to Different Wavelet Functions

Any discussion of wavelets begins with the Haar wavelet, the first and simplest. The Haar wavelet (3.13) is discontinuous (as shown in Figure 3.3) and resembles a step function. Daubechies (dbN), Biorthogonal (biorNr.Nd), Coiflet (coifN), Symlet (symN), Morlet (morl), Mexican Hat (mexh), Meyer and Battle-Lemarié wavelets are some of the other wavelets that can be chosen [150].

\[ \psi_{H}(x) = \begin{cases} 1, & 0 \le x < 1/2 \\ -1, & 1/2 \le x < 1 \\ 0, & \text{otherwise} \end{cases} \tag{3.13} \]


This figure is not available online. Please consult the hardcopy thesis available from the QUT Library



Figure 3.2: An example tree of wavelet packet decomposition [91]

Figure 3.3: Haar wavelet

To introduce the Shannon and Meyer wavelets, the following notation is presented first.


Notations

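Let $\phi$ be a scaling function and $h$ the corresponding conjugate mirror filter. Let $\psi$ be the function whose Fourier transform is

\[ \hat{\psi}(\omega) = \frac{1}{\sqrt{2}}\,\hat{g}\!\left(\frac{\omega}{2}\right)\hat{\phi}\!\left(\frac{\omega}{2}\right), \tag{3.14} \]

with

\[ \hat{g}(\omega) = e^{-i\omega}\,\hat{h}^{*}(\omega + \pi). \tag{3.15} \]

Let us denote

\[ \psi_{j,n}(t) = \frac{1}{\sqrt{2^{j}}}\,\psi\!\left(\frac{t - 2^{j}n}{2^{j}}\right). \tag{3.16} \]

For any scale $2^{j}$, $\{\psi_{j,n}\}_{n \in \mathbb{Z}}$ is an orthonormal basis of $W_{j}$. For all scales, $\{\psi_{j,n}\}_{(j,n) \in \mathbb{Z}^{2}}$ is an orthonormal basis of $L^{2}(\mathbb{R})$.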

Shannon wavelet

The Shannon wavelet is constructed from the Shannon multiresolution approximation, which approximates functions by their restriction to low frequency intervals. It corresponds to $\hat{\phi} = \mathbf{1}_{[-\pi,\pi]}$ and $\hat{h}(\omega) = \sqrt{2}\,\mathbf{1}_{[-\pi/2,\pi/2]}(\omega)$ for $\omega \in [-\pi,\pi]$. We derive from (3.14) that

\[ \hat{\psi}(\omega) = \begin{cases} e^{-i\omega/2}, & \text{if } \omega \in [-2\pi,-\pi] \cup [\pi,2\pi] \\ 0, & \text{otherwise} \end{cases} \tag{3.17} \]

and hence

\[ \psi(t) = \frac{\sin 2\pi(t - 1/2)}{2\pi(t - 1/2)} - \frac{\sin \pi(t - 1/2)}{\pi(t - 1/2)}. \tag{3.18} \]

This wavelet is $C^{\infty}$ but has slow asymptotic time decay. Since $\hat{\psi}(\omega)$ is zero in the neighbourhood of $\omega = 0$, all its derivatives are zero at $\omega = 0$.



Meyer Wavelets

A Meyer wavelet is a frequency band-limited function whose Fourier transform is smooth, unlike the Fourier transform of the Shannon wavelet. This smoothness provides a much faster asymptotic decay in time. These wavelets are constructed with conjugate mirror filters $\hat{h}(\omega)$ that are $C^{n}$ and satisfy

\[ \hat{h}(\omega) = \begin{cases} \sqrt{2}, & \text{if } \omega \in [-\pi/3,\ \pi/3] \\ 0, & \text{if } \omega \in [-\pi,\ -2\pi/3] \cup [2\pi/3,\ \pi] \end{cases} \tag{3.19} \]

The only degree of freedom is the behaviour of $\hat{h}(\omega)$ in the transition bands. It must satisfy the quadrature condition

\[ |\hat{h}(\omega)|^{2} + |\hat{h}(\omega + \pi)|^{2} = 2, \tag{3.20} \]

and to obtain $C^{n}$ junctions at $|\omega| = \pi/3$ and $|\omega| = 2\pi/3$, the first $n$ derivatives must vanish at these abscissae. One can construct such functions that are $C^{\infty}$. The waveform of a Meyer wavelet is shown in Figure 3.4. The waveforms of another family, the Coiflets, are shown in Figure 3.5.

Figure 3.4: Meyer wavelet function

Figure 3.5: Coiflets wavelet function: (a) Coiflets1, (b) Coiflets2, (c) Coiflets3, (d) Coiflets4

Battle-Lemarié Wavelets

Polynomial spline wavelets introduced by Battle and Lemarié are computed from spline multiresolution approximations. For splines of degree $m$, $\hat{h}(\omega)$ and its first $m$ derivatives are zero at $\omega = \pi$.

This wavelet has an exponential decay. Since it is a polynomial spline of degree $m$, it is $m-1$ times continuously differentiable. Polynomial spline wavelets are less regular than Meyer wavelets but have faster asymptotic decay in time. For $m$ odd, $\psi$ is symmetric about 1/2. For $m$ even, it is antisymmetric about 1/2.

Daubechies Wavelets

Daubechies wavelets (see Figure 3.6) have a support of minimum size for any given

number p of vanishing moments.

Figure 3.6: Daubechies function: (a) Daubechies2, (b) Daubechies4, (c) Daubechies8, (d) Daubechies10

Symmlets

Daubechies wavelets are very asymmetric because they are constructed by selecting the minimum phase square root of $Q(e^{-i\omega})$. Filters corresponding to a minimum phase square root have their energy optimally concentrated near the starting point of their support. They are highly non-symmetric, which yields very asymmetric wavelets.

To obtain a symmetric or antisymmetric wavelet, the filter $h$ must be symmetric or antisymmetric with respect to the centre of its support. The Symmlet filters of Daubechies (shown in Figure 3.7) are obtained by optimizing the choice of the square root $R(e^{-i\omega})$ of $Q(e^{-i\omega})$ to obtain an almost linear phase.

Figure 3.7: Symlet function: (a) symlet2, (b) symlet4, (c) symlet8, (d) symlet10

There is a trade-off between the symmetry and the computational complexity of a wavelet function. Among the orthogonal wavelets, the Haar wavelet has computational advantages due to its symmetry. However, the Haar wavelet can hardly be used in the analysis of vibration because its rectangular waveform has nothing in common with vibration waveforms.

The properties of the wavelet families are summarized in Table 3.1. It can be concluded that the symlet and coiflet families combine the most desirable properties, such as compact support, near symmetry and orthogonal analysis, so these two families are preferable. In particular, the symlet functions match vibration characteristics well and are therefore suitable for the time-frequency analysis of bearing vibrations. In this vibration analysis, symlet functions are considered for reasons such as orthogonality and the similarity of their waveforms to vibration. The symlet functions considered are symlet2, symlet4, symlet8 and symlet10. Among these, symlet8 most closely resembles vibration waveforms, so it is the one chosen in the feature extraction procedure.


Table 3.1 Summary of Wavelet Families and Associated Properties (Manual of wavelet toolbox in Matlab)

Property                              | morl | mexh | meyr | haar | dbN | symN | coifN | biorNr.Nd
Crude                                 |  •   |  •   |      |      |     |      |       |
Infinitely regular                    |  •   |  •   |  •   |      |     |      |       |
Arbitrary regularity                  |      |      |      |      |  •  |  •   |   •   |     •
Compactly supported orthogonal        |      |      |      |  •   |  •  |  •   |   •   |
Compactly supported biorthogonal      |      |      |      |      |     |      |       |     •
Symmetry                              |  •   |  •   |  •   |  •   |     |      |       |     •
Asymmetry                             |      |      |      |      |  •  |      |       |
Near symmetry                         |      |      |      |      |     |  •   |   •   |
Arbitrary number of vanishing moments |      |      |      |      |  •  |  •   |   •   |     •
Vanishing moments for φ               |      |      |      |      |     |      |   •   |
Existence of φ                        |      |      |  •   |  •   |  •  |  •   |   •   |     •
Orthogonal analysis                   |      |      |  •   |  •   |  •  |  •   |   •   |
Biorthogonal analysis                 |      |      |  •   |  •   |  •  |  •   |   •   |     •
Exact reconstruction                  |      |  •   |  •   |  •   |  •  |  •   |   •   |     •
FIR filters                           |      |      |      |  •   |  •  |  •   |   •   |     •
Continuous transform                  |  •   |  •   |  •   |  •   |  •  |  •   |   •   |     •
Discrete transform                    |      |      |  •   |  •   |  •  |  •   |   •   |     •
Fast algorithm                        |      |      |      |  •   |  •  |  •   |   •   |     •
Explicit expression                   |  •   |  •   |      |  •   |     |      |       |

3.2.2 Selection of Best Basis

When analysing a signal using DWPA, the information of interest might lie at any position in the decomposition binary tree. A variety of selection criteria exist for choosing the best basis in different applications. Additive cost functions such as entropy-based criteria are well suited to efficient searching of binary tree structures, and provide an accurate measure of the information cost of a given signal. For the cost function $E$ to be additive it must satisfy $E(0) = 0$ and $E(s) = \sum_{i} E(s_{i})$.

The following is a list of additive cost functions that could be used:

The Shannon entropy

\[ E_{\text{add}}^{\text{shannon}}(s) = -\sum_{i} s_{i}^{2}\,\ln\!\left(s_{i}^{2}\right) \tag{3.21} \]

The Gauss-Markov “log energy” entropy

\[ E_{\text{add}}^{\text{Gauss-Markov}}(s) = \sum_{i} \ln\!\left(s_{i}^{2}\right) \tag{3.22} \]

The concentration in $l^{p}$ norm

\[ E_{\text{add}}^{\text{norm}}(s) = \sum_{i} |s_{i}|^{p}, \qquad 1 \le p \le 2 \tag{3.23} \]

The threshold entropy

\[ E_{\text{add}}^{\text{threshold}}(s_{i}) = \begin{cases} 1, & |s_{i}| > p \\ 0, & \text{elsewhere} \end{cases} \tag{3.24} \]

The “SURE” entropy

\[ E_{\text{add}}^{\text{sure}}(s) = n - \#\{\,i \text{ such that } |s_{i}| \le p\,\} + \sum_{i} \min\!\left(s_{i}^{2},\, p^{2}\right) \tag{3.25} \]
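The additive cost functions listed above can be written compactly as follows. This is a minimal sketch: the function names are hypothetical, the convention $0 \ln 0 = 0$ is assumed for the entropies so that $E(0) = 0$, and the norm cost is taken as the plain sum of $|s_{i}|^{p}$. The last lines check the additivity property on two arbitrary coefficient vectors.

```python
import numpy as np

def shannon_entropy(s):
    """Shannon entropy (3.21), with the convention 0 * ln(0) = 0."""
    s2 = np.asarray(s, dtype=float) ** 2
    s2 = s2[s2 > 0]
    return -np.sum(s2 * np.log(s2))

def log_energy_entropy(s):
    """Log energy entropy (3.22); zero coefficients are skipped (assumption)."""
    s2 = np.asarray(s, dtype=float) ** 2
    s2 = s2[s2 > 0]
    return np.sum(np.log(s2))

def norm_entropy(s, p=1.0):
    """Concentration in l^p norm (3.23), 1 <= p <= 2."""
    return np.sum(np.abs(np.asarray(s, dtype=float)) ** p)

def threshold_entropy(s, p):
    """Threshold entropy (3.24): number of coefficients with |s_i| > p."""
    return int(np.sum(np.abs(np.asarray(s, dtype=float)) > p))

def sure_entropy(s, p):
    """SURE entropy (3.25)."""
    s = np.asarray(s, dtype=float)
    return s.size - np.sum(np.abs(s) <= p) + np.sum(np.minimum(s ** 2, p ** 2))

# Additivity: the cost of a concatenation equals the sum of the costs.
a = np.array([0.5, -0.2, 0.0])
b = np.array([1.5, 0.1])
ab = np.concatenate([a, b])
additive = np.isclose(shannon_entropy(ab), shannon_entropy(a) + shannon_entropy(b))
print(additive, threshold_entropy(ab, 0.3))
```

Additivity is what allows the cost of a candidate basis to be obtained by summing the costs of its nodes, which the tree searches rely on.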

As an alternative to additive cost functions, non-additive cost functions can be used to determine a near-best basis for the tree. A non-additive cost function cannot guarantee the selection of a best basis, so the basis selected is referred to as a near-best basis. A non-additive cost function is defined as any function that can be used to compare bases but does not satisfy the additivity requirements. One such cost function is the Shannon entropy of a finite scheme, where the probabilistic events are identified with the normalised energies.

Shannon entropy of a finite scheme

\[ E_{\text{non}}(s) = -\sum_{i} \frac{s_{i}^{2}}{\|s\|_{2}^{2}}\,\ln\frac{s_{i}^{2}}{\|s\|_{2}^{2}} \tag{3.26} \]

Best tree searching algorithms

There are two primary approaches to searching a tree: the top-down and the bottom-up search algorithms. The top-down search algorithm calculates a near-best or sub-optimal basis from a library of bases. An optimal basis cannot be guaranteed with this method, as the entire library is not searched. However, the technique offers substantial computational savings, as the basis can be calculated while the tree is being constructed. It is not necessary to calculate the transform coefficients for the entire tree, because the tree is pruned on the fly: the search along a branch terminates as soon as the sum of the children’s cost functions is greater than the parent’s. This reduces the overall calculation and storage requirements of the optimisation process. The following steps are used to locate the sub-optimal tree structure:

If $E(\text{parent}) \le \sum_{\text{child}} E_{\text{opt}}(\text{child})$, and the parent is not the root node, then set $E_{\text{opt}}(\text{node}) = E(\text{parent})$ and terminate the branch.

If $E_{\text{opt}}(\text{node}) > \sum_{\text{child}} E_{\text{opt}}(\text{child})$, then split and set $E_{\text{opt}}(\text{node}) = \sum_{\text{child}} E_{\text{opt}}(\text{child})$.

Determination of the optimal or best basis can be achieved through a bottom-up search algorithm using an additive cost function. Unlike the top-down algorithm, the bottom-up approach searches through the entire family of bases to ensure that it finds the best basis. To do this it starts at the bottom of the tree and compares the children’s cost functions with that of their parent. If the sum of the children’s cost functions is not less than the parent’s, the children are pruned from the tree and the parent becomes a terminal node. This procedure can be written as:

If $E(\text{parent}) \le \sum_{\text{child}} E_{\text{opt}}(\text{child})$, and the parent is not the root node, then prune the tree and set $E_{\text{opt}}(\text{node}) = E(\text{parent})$.

If $E_{\text{opt}}(\text{node}) > \sum_{\text{child}} E_{\text{opt}}(\text{child})$, then set $E_{\text{opt}}(\text{node}) = \sum_{\text{child}} E_{\text{opt}}(\text{child})$.
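The bottom-up rules can be sketched as follows. The sketch assumes a Haar wavelet packet tree, the additive Shannon cost, a synthetic test signal and hypothetical function names; none of these is the configuration used for the results in this work. Following the rules as stated, the root node is never chosen as a terminal node.

```python
import numpy as np

def haar_split(c):
    """Low- and high-pass children of a node, using the Haar filters."""
    even, odd = c[0::2], c[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def packet_tree(x, depth):
    """Full wavelet packet tree keyed by (depth, position)."""
    tree = {(0, 0): np.asarray(x, dtype=float)}
    for d in range(depth):
        for p in range(2 ** d):
            low, high = haar_split(tree[(d, p)])
            tree[(d + 1, 2 * p)], tree[(d + 1, 2 * p + 1)] = low, high
    return tree

def shannon_cost(c):
    """Additive Shannon entropy with the convention 0 * ln(0) = 0."""
    c2 = c[c != 0] ** 2
    return -np.sum(c2 * np.log(c2))

def best_basis(tree, depth, cost=shannon_cost):
    """Bottom-up search; returns (terminal nodes, total cost)."""
    best_cost, best_nodes = {}, {}
    for p in range(2 ** depth):                 # deepest level: all terminal
        best_cost[(depth, p)] = cost(tree[(depth, p)])
        best_nodes[(depth, p)] = [(depth, p)]
    for d in range(depth - 1, -1, -1):
        for p in range(2 ** d):
            left, right = (d + 1, 2 * p), (d + 1, 2 * p + 1)
            children = best_cost[left] + best_cost[right]
            own = cost(tree[(d, p)])
            if d > 0 and own <= children:       # prune (never at the root)
                best_cost[(d, p)] = own
                best_nodes[(d, p)] = [(d, p)]
            else:                               # keep the split
                best_cost[(d, p)] = children
                best_nodes[(d, p)] = best_nodes[left] + best_nodes[right]
    return best_nodes[(0, 0)], best_cost[(0, 0)]

# Synthetic signal: a sinusoid plus one impulse (illustrative only).
t = np.arange(64)
x = np.sin(2 * np.pi * 5 * t / 64)
x[20] += 4.0
depth = 3
tree = packet_tree(x, depth)
leaves, total = best_basis(tree, depth)
print(leaves, total)
```

The terminal nodes returned always tile the whole frequency axis, so the selected coefficients form a complete representation of the signal.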

An example of the best tree DWPA decomposition of a signal is shown in Figures 3.8 and 3.9. The example shows that the top-down search saves the computation of the coefficients and entropy of many nodes.


[Figures 3.8 and 3.9: best tree plots. Recoverable labels: one tree annotated with (depth, position) node indices (0,0); (1,0), (1,1); (2,0), (2,1), (2,2), (2,3); (3,2), (3,3), (3,6), (3,7); (4,14), (4,15); (5,30), (5,31); one tree annotated with node entropy values 120.09; 62.624, 64.56; 27.934, 29.436; 4.9837, 13.292; 3.0661, 39.037; 18.051, 14.827; 2.8831, 10.86; 4.8559, 5.1133; 17.875, 12.115, 2.6565, 0.6131; 1.8115, 2.9245]

Discussion on the selection of best basis

The best basis DWPA was applied to signals from the same bearing. It was seen that different data segments of the same signal often have different decomposition trees, even when the same criterion is used to select the best basis. It is usually difficult to find a unique best basis for signals collected under the same conditions. Therefore, without further processing, the best tree is mostly suitable for the analysis of individual signal segments and generally needs human interpretation. Where generalisation is needed, best basis selection is not suitable unless a unique basis is found before the data are processed. In particular, a Neural Network requires a large number of data sets for training and a consistent input vector (i.e. a single basis), which cannot be obtained with this principle of best basis selection. The selection of best basis therefore cannot be directly applied in automatic diagnosis.

3.3 Adaptive Approximation with Pursuit

3.3.1 Fundamentals of Adaptive Approximation with Pursuit

3.3.1.1 Concepts of Adaptive Approximation

With the advance of time-frequency signal processing techniques, adaptive approximation techniques [102] have recently become popular for obtaining economical and precise representations of signals. In adaptive approximation, the goal is to find the representation of a signal $x$ as a weighted sum of elements $\phi_{\gamma}$ from an overcomplete dictionary $\Gamma$:

\[ x = \sum_{\gamma \in \Gamma} \alpha_{\gamma}\,\phi_{\gamma} \tag{3.27} \]

or an approximate decomposition

\[ x = \sum_{i=1}^{m} \alpha_{\gamma_{i}}\,\phi_{\gamma_{i}} + R^{(m)} \tag{3.28} \]

where $\gamma$ is an index in the set $\Gamma$; $\alpha_{\gamma}$ is the coefficient of the element $\phi_{\gamma}$; $m$ is the order of the decomposition; and $R^{(m)}$ is a residual.



Adaptive approximation represents signals accurately mainly because it draws on a set of waveforms called a dictionary. A variety of dictionaries have been developed. They are introduced in the subsequent section, where the Discrete Wavelet Packet Dictionary is described in detail and chosen to analyse vibration signals in fault diagnosis of rolling bearings. The reasons why the discrete wavelet packet dictionary is suitable for this study are also investigated. The principles of two adaptive approximation methods, Matching Pursuit and Basis Pursuit, are then introduced; both have been applied using this Discrete Wavelet Packet dictionary.

3.3.1.2 Concepts and Principles of Different Dictionaries

Atoms and dictionaries

A considerable focus of activity in the recent signal processing literature has been the development of bases, built from elements such as atoms or wavelets, for signal representation. The terminology introduced by Mallat and Zhang [25] is used here. A dictionary $\Gamma$ is defined as a collection of parameterized waveforms $\phi_{\gamma}$, $\{\phi_{\gamma} \mid \gamma \in \Gamma\}$. The waveforms $\phi_{\gamma}$ are discrete-time signals of length $n$ called atoms. Depending on the dictionary, the parameter $\gamma$ can have the interpretation of indexing frequency, in which case the dictionary is a frequency or Fourier dictionary; of indexing time and scale jointly, in which case the dictionary is a time-scale dictionary; or of indexing time and frequency jointly, in which case the dictionary is a time-frequency dictionary. Dictionaries are usually complete or overcomplete, in which case they contain exactly $n$ atoms or more than $n$ atoms, respectively. There are also continuum dictionaries containing an infinity of atoms, and undercomplete dictionaries containing fewer than $n$ atoms.

Interesting dictionaries have been proposed over the last few years, including trivial dictionaries, frequency dictionaries, time-scale dictionaries, and many time-frequency dictionaries, which are introduced as follows.

The Dirac dictionary

The Dirac dictionary is simply the collection of waveforms that are zero except at one point: $\gamma \in \{0, 1, \ldots, n-1\}$ and $\phi_{\gamma}(t) = \mathbf{1}_{\{t = \gamma\}}$. This is an orthogonal basis of $\mathbb{R}^{n}$, the standard basis.

The Heaviside dictionary

The Heaviside dictionary is the collection of waveforms that jump at one particular point: $\gamma \in \{0, 1, \ldots, n-1\}$; $\phi_{\gamma}(t) = \mathbf{1}_{\{t \ge \gamma\}}$. Atoms in this dictionary are not orthogonal, but every signal $s$ has a representation

\[ s = s_{0}\,\phi_{0} + \sum_{\gamma=1}^{n-1} (s_{\gamma} - s_{\gamma-1})\,\phi_{\gamma} \tag{3.29} \]
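The representation (3.29) is easy to verify numerically. In the sketch below the signal values and variable names are arbitrary choices made for illustration; the dictionary matrix holds one Heaviside atom per column.

```python
import numpy as np

n = 8
t = np.arange(n)
# Column gamma holds the atom phi_gamma(t) = 1 for t >= gamma, else 0.
Phi = (t[:, None] >= t[None, :]).astype(float)

s = np.array([3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, 6.0])   # arbitrary signal
alpha = np.empty(n)
alpha[0] = s[0]                     # coefficient of phi_0
alpha[1:] = s[1:] - s[:-1]          # coefficients s_gamma - s_(gamma-1)

reconstruction = Phi @ alpha        # right-hand side of (3.29)
gram = Phi.T @ Phi                  # non-zero off-diagonal: not orthogonal
print(np.allclose(reconstruction, s), gram[0, 1])
```

The coefficients are simply the first value and the successive differences of the signal, so a piecewise-constant signal with few jumps has a sparse representation in this dictionary.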

Frequency dictionaries

A Fourier dictionary is a collection of sinusoidal waveforms $\phi_{\gamma}$ indexed by $\gamma = (\omega, \nu)$, where $\omega \in [0, 2\pi)$ is an angular frequency variable and $\nu \in \{0, 1\}$ indicates phase type: sine or cosine. In detail,

\[ \phi_{(\omega,0)} = \cos(\omega t), \qquad \phi_{(\omega,1)} = \sin(\omega t). \tag{3.30} \]

For the standard Fourier dictionary, $\gamma$ runs through the set of all cosines with Fourier frequencies $\omega_{k} = 2\pi k/n$, $k = 0, \ldots, n/2$, and all sines with Fourier frequencies $\omega_{k}$, $k = 1, \ldots, n/2 - 1$. This dictionary consists of $n$ waveforms; it is in fact a basis, and a very simple one: the atoms are all mutually orthogonal. An overcomplete Fourier dictionary is obtained by sampling the frequencies more finely. Let $l$ be a whole number $> 1$ and let $\Gamma_{l}$ be the collection of all cosines with $\omega_{k} = 2\pi k/(ln)$, $k = 0, \ldots, ln/2$, and all sines with frequencies $\omega_{k}$, $k = 1, \ldots, ln/2 - 1$. This is an $l$-fold overcomplete system. Both complete and overcomplete dictionaries based on discrete cosine transforms and sine transforms are also used.
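The standard and the $l$-fold overcomplete Fourier dictionaries can be built explicitly for a small $n$ as a check on the atom counts and on orthogonality. The function name and the choice $n = 16$, $l = 4$ are assumptions for this sketch; the sines are taken from $k = 1$ so that no atom is identically zero, and the atoms are left unnormalised.

```python
import numpy as np

def fourier_dictionary(n, l=1):
    """Columns: cosines (k = 0..l*n/2), then sines (k = 1..l*n/2 - 1); n even."""
    t = np.arange(n)
    m = l * n
    cos_k = np.arange(0, m // 2 + 1)
    sin_k = np.arange(1, m // 2)
    cosines = np.cos(2 * np.pi * np.outer(t, cos_k) / m)
    sines = np.sin(2 * np.pi * np.outer(t, sin_k) / m)
    return np.hstack([cosines, sines])

Phi = fourier_dictionary(16)              # standard dictionary: n atoms
gram = Phi.T @ Phi
off_diagonal = gram - np.diag(np.diag(gram))

Phi4 = fourier_dictionary(16, l=4)        # 4-fold overcomplete dictionary
print(Phi.shape, np.max(np.abs(off_diagonal)), Phi4.shape)
```

The standard dictionary has a diagonal Gram matrix, whereas the overcomplete one has four times as many atoms as samples, so a signal no longer has a unique representation in it.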

Time-scale dictionaries

Several types of wavelet dictionaries are available. The Haar dictionary is considered first, with “father wavelet” $\varphi = \mathbf{1}_{(0,1]}$ and “mother wavelet” $\psi = \mathbf{1}_{(1/2,1]} - \mathbf{1}_{[0,1/2)}$. The dictionary is a collection of translations and dilations of the basic mother wavelet, together with translations of a father wavelet. It is indexed by $\gamma = (a, b, v)$, where $a \in (0, \infty)$ is a scale variable, $b \in [0, n]$ indicates location, and $v \in \{0, 1\}$ indicates gender. In detail,

\[ \phi_{(a,b,1)} = \psi\big(a(t - b)\big)\cdot\sqrt{a}, \qquad \phi_{(a,b,0)} = \varphi\big(a(t - b)\big)\cdot\sqrt{a}. \tag{3.31} \]

For the standard Haar dictionary, $\gamma$ runs through the discrete collection of mother wavelets with dyadic scales $a_{j} = 2^{j}/n$, $j = j_{0}, \ldots, \log_{2}(n) - 1$, and locations that are integer multiples of the scale, $b_{j,k} = k \cdot a_{j}$, $k = 0, \ldots, 2^{j} - 1$, and the collection of father wavelets at the coarse scale $j_{0}$. This dictionary consists of $n$ waveforms; it is an orthonormal basis. An overcomplete wavelet dictionary is obtained by sampling the locations more finely: one location per sample point. This gives the so-called stationary Haar dictionary, consisting of $O(n \log_{2}(n))$ waveforms.

A variety of other wavelet bases are possible. The most important variations are smooth wavelet bases, using splines or using wavelets defined recursively from two-scale filtering relations. Although the rules of construction are more complicated (boundary conditions [28], orthogonality versus biorthogonality [10], etc.), these have the same indexing structure as the standard Haar dictionary.

Time-frequency dictionaries

Much recent activity in the wavelet community has focused on the study of time-frequency phenomena. The standard example, the Gabor dictionary, is due to Gabor (1946). In notation, $\gamma = (\omega, \tau, \theta, \delta t)$, where $\omega \in [0, \pi)$ is a frequency, $\tau$ is a location, $\theta$ is a phase and $\delta t$ is the duration, and the atoms considered are

\[ \phi_{\gamma}(t) = \exp\{-(t - \tau)^{2}/(\delta t)^{2}\} \cdot \cos\big(\omega(t - \tau) + \theta\big). \]

Such atoms consist of frequencies near $\omega$ and essentially vanish far away from $\tau$.

For fixed $\delta t$, discrete dictionaries can be built from time-frequency lattices, $\omega_{k} = k\,\Delta\omega$ and $\tau_{l} = l\,\Delta\tau$, with $\theta \in \{0, \pi/2\}$; with $\Delta\tau$ and $\Delta\omega$ chosen sufficiently fine, these dictionaries are complete.

The use of a time-frequency dictionary makes the representation of vibration signals adaptive. Depending on the dictionary, the parameter $\gamma$ of $\phi_{\gamma}$ can have different interpretations. For the analysis of rolling bearing vibrations, the dictionary employed is a time-frequency dictionary, in which $\gamma$ indexes time and frequency jointly. In a time-frequency dictionary, the waveforms may be chosen from different types of wavelet functions or cosine functions over a wide range of scales. For instance, the wavelet functions can be a group of Daubechies functions or a group of Morlet functions. Atoms are produced by scaling the same function to particular time and frequency dimensions.

The wavelets dictionary is only the best known of the available alternative dictionaries, or collections of parameterized waveforms. Wavelets, steerable wavelets, segmented wavelets, Gabor dictionaries, multiscale Gabor dictionaries, wavelet packets, cosine packets, chirplets, warplets, and a wide range of other dictionaries are now available.

Time-frequency atoms

Mallat and Zhang [97] proposed that a general family of time-frequency atoms (as seen in Figure 3.10) can be generated by scaling, translating and modulating a single window function. Wavelet atoms $\psi_{\gamma}(t)$ can thus be regarded as wavelets that are dilated with a scale parameter $s$, translated by $u$, and modulated by $\xi$. With the notation $\gamma = (s, u, \xi)$:

\[ \psi_{\gamma}(t) = \psi_{(s,u,\xi)}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t - u}{s}\right)e^{i\xi t} \tag{3.32} \]

The index $\gamma$ is an element of the set $\Gamma = \mathbb{R}^{+} \times \mathbb{R}^{2}$ [97]. The factor $1/\sqrt{s}$ normalizes the norm of $\psi_{\gamma}(t)$ to 1. The energy of $\psi_{\gamma}(t)$ is mostly concentrated in a neighbourhood of $u$, whose size is proportional to $s$. Let $\hat{\psi}(\omega)$ be the Fourier transform of $\psi(t)$; the Fourier transform of $\psi_{\gamma}(t)$ is

\[ \hat{\psi}_{\gamma}(\omega) = \sqrt{s}\,\hat{\psi}\big(s(\omega - \xi)\big)\,e^{-i(\omega - \xi)u} \tag{3.33} \]

The energy of $\hat{\psi}_{\gamma}(\omega)$ is concentrated in a neighbourhood of $\xi$, whose size is proportional to $1/s$. The time-frequency representation of a wavelet family is built by relating the frequency parameter $\xi_{n}$ to the scale $s_{n}$ with $\xi_{n} = \xi_{0}/s_{n}$, where $\xi_{0}$ is a constant. The wavelet atom is capable of capturing high-energy components that are located in narrow time and frequency intervals.

Figure 3.10: An example of a time-frequency atom plot [151]

Wavelet packet dictionary

Recently, Coifman and Meyer [97] developed the wavelet packet especially to meet the computational demands of discrete-time signal processing. A wavelet packet dictionary includes, as special cases, a standard orthogonal wavelet dictionary, the Dirac dictionary, and a collection of oscillating waveforms spanning a range of frequencies and durations. This dictionary has wavelet functions (oscillating waveforms), which are advantageous when diagnosing rotating machines (as discussed in the section introducing the wavelet functions). The orthogonality of this dictionary makes it possible to obtain a unique basis for a signal. The dyadic structure of the dictionary allows an approximation of a vibration signal to be computed efficiently. The scaling of wavelet packet atoms is more flexible than in the wavelet transform and discrete wavelet packet analysis.

For signals of $N$ samples, each vector of a wavelet packet dictionary (a wavelet packet atom $\psi_{\gamma}$) is indexed by $\gamma = (j, p, k)$, with $0 \le j \le \log_{2}(N)$, $0 \le p \le 2^{-j}N$, $0 \le k \le 2^{j}$. Such an atom has similar time-frequency localization properties to a discrete window function dilated by $2^{j}$, centred at $2^{j}(p + \tfrac{1}{2})$, and modulated by a sinusoidal wave of frequency $2\pi\,2^{-j}(k + \tfrac{1}{2})$.

A wavelet packet dictionary with a symlet8 wavelet function is suitable for vibration analysis of rolling bearings for the following reasons:

(1) It is a time-frequency dictionary and is able to capture time-frequency atoms with a high energy level. This is desirable for extracting high-energy time-frequency vibration features when diagnosing rolling bearing faults.

(2) The wavelet packet dictionary has wavelet functions (oscillating waveforms) that match vibration waveforms, so their coefficients represent the vibration level well. The more similar the wavelet functions are to the vibrations, the fewer coefficients are needed to represent the signal. This also helps improve feature extraction performance.

(3) The orthogonality of this dictionary makes it possible to obtain a unique basis for a signal.

(4) The dyadic structure of the dictionary allows an approximation of a vibration signal to be computed efficiently.

3.3.1.3 Computational Complexity of Dictionaries, $\Phi$ and $\Phi^{T}$

Different dictionaries can impose drastically different computational burdens. The nominal cost of storing an arbitrary $n$-by-$p$ matrix and applying it to a $p$-vector is a constant times $np$. Certain dictionaries have fast implicit algorithms: the products $\Phi\alpha$ and $\Phi^{T}s$ can be computed, for arbitrary vectors $\alpha$ and $s$, (a) without ever storing the matrices $\Phi$ and $\Phi^{T}$ and (b) using special properties of the matrices to accelerate computations.

Wavelets give a dictionary with a fast implicit algorithm; if the S8 symmlet is used, both $\Phi$ and $\Phi^{T}$ may be applied in $O(n)$ time. For the stationary wavelet dictionary, $O(n \log(n))$ time is required. Cosine packets and wavelet packets also have fast implicit algorithms. Here both $\Phi$ and $\Phi^{T}$ can be applied in order $O(n \log(n))$ time and order $O(n \log(n))$ space, much better than the nominal $np = n^{2}\log_{2}(n)$ one would expect from naive use of the matrix definition.
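The idea of a fast implicit algorithm can be illustrated with the orthonormal Haar wavelet basis, for which $\Phi^{T}s$ and $\Phi\alpha$ are obtained in $O(n)$ operations by a pyramid of sums and differences. The sketch assumes a signal length that is a power of two; the function names are hypothetical, and the explicit matrix is formed only to check the implicit operators against the matrix definition.

```python
import numpy as np

def haar_analysis(x):
    """Apply Phi^T (orthonormal Haar analysis) in O(n) without forming Phi."""
    out = np.asarray(x, dtype=float).copy()
    length = out.size                       # must be a power of two
    while length > 1:
        even = out[0:length:2].copy()
        odd = out[1:length:2].copy()
        half = length // 2
        out[:half] = (even + odd) / np.sqrt(2.0)          # approximation
        out[half:length] = (even - odd) / np.sqrt(2.0)    # detail
        length = half
    return out

def haar_synthesis(c):
    """Apply Phi (the inverse of haar_analysis) in O(n)."""
    out = np.asarray(c, dtype=float).copy()
    length = 1
    while length < out.size:
        approx = out[:length].copy()
        detail = out[length:2 * length].copy()
        out[0:2 * length:2] = (approx + detail) / np.sqrt(2.0)
        out[1:2 * length:2] = (approx - detail) / np.sqrt(2.0)
        length *= 2
    return out

n = 16
# Explicit n-by-n matrix, built only for comparison with the implicit form.
Phi = np.column_stack([haar_synthesis(e) for e in np.eye(n)])
x = np.cos(np.arange(n) / 3.0)
coeffs = haar_analysis(x)
print(np.allclose(Phi.T @ x, coeffs), np.allclose(haar_synthesis(coeffs), x))
```

Each pass halves the working length, so the total work is proportional to $n + n/2 + n/4 + \ldots < 2n$, compared with $n^{2}$ multiplications for the explicit matrix product.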

3.3.2 Matching Pursuit

3.3.2.1 Concepts and Principles of Matching Pursuit

Notations

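The space $L^{2}(\mathbb{R})$ [25] is the Hilbert space of complex valued functions such that

\[ \|f\|^{2} = \int_{-\infty}^{+\infty} |f(t)|^{2}\,dt < +\infty \tag{3.34} \]

The inner product of $(f, g) \in L^{2}(\mathbb{R})^{2}$ is defined by

\[ \langle f, g \rangle = \int_{-\infty}^{+\infty} f(t)\,\bar{g}(t)\,dt \tag{3.35} \]

where $\bar{g}(t)$ is the complex conjugate of $g(t)$. The Fourier transform of $f(t) \in L^{2}(\mathbb{R})$ is written $\hat{f}(\omega)$ and defined by

\[ \hat{f}(\omega) = \int_{-\infty}^{+\infty} f(t)\,e^{-i\omega t}\,dt \tag{3.36} \]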

Matching Pursuit uses a specific criterion to search for and select the atoms and their coefficients in the adaptive approximation. Matching Pursuit (MP) initiates the approximation with $R^{(0)} = x$ and builds up a sequence of sparse approximations stepwise.

At the first step, Matching Pursuit selects the wavelet atom $\psi_{\gamma_1}$ for which $|\langle x, \psi_{\gamma_1} \rangle|$ is maximum over the whole dictionary. The signal can then be written as

$x = \langle x, \psi_{\gamma_1} \rangle \psi_{\gamma_1} + R^{(1)}$. (3.37)



At stage $k$, Matching Pursuit identifies the dictionary atom that best correlates with the residual and then adds to the current approximation a scalar multiple of that atom, so that $R^{(k-1)} = \langle R^{(k-1)}, \psi_{\gamma_k} \rangle \psi_{\gamma_k} + R^{(k)}$. That is, $\psi_{\gamma_k}$ is the atom in the dictionary that maximizes $|\langle R^{(k-1)}, \psi_{\gamma} \rangle|$.

After $k$ iterations, the signal can be represented as

$x = \sum_{i=1}^{k} \langle R^{(i-1)}, \psi_{\gamma_i} \rangle \psi_{\gamma_i} + R^{(k)}$, (3.38)

where $\langle R^{(i-1)}, \psi_{\gamma_i} \rangle$ is the coefficient of the atom $\psi_{\gamma_i}$. The decomposition stops when a residual requirement, set according to the application of Matching Pursuit, is fulfilled.
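The iteration described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in this research: it assumes a dictionary stored explicitly as a matrix whose columns are unit-norm atoms (a random dictionary here, not a wavelet packet dictionary), and the names `matching_pursuit`, `max_iter`, and `tol` are hypothetical.

```python
import numpy as np


def matching_pursuit(x, dictionary, max_iter=50, tol=1e-6):
    """Greedy Matching Pursuit over a dictionary with unit-norm columns.

    Returns the coefficient vector and the final residual R^(k).
    """
    residual = np.asarray(x, dtype=float).copy()   # R^(0) = x
    coeffs = np.zeros(dictionary.shape[1])
    x_norm = np.linalg.norm(residual)
    for _ in range(max_iter):
        correlations = dictionary.T @ residual     # <R^(k-1), psi_gamma> for every atom
        best = int(np.argmax(np.abs(correlations)))
        coeffs[best] += correlations[best]
        residual -= correlations[best] * dictionary[:, best]
        if np.linalg.norm(residual) <= tol * x_norm:   # residual requirement
            break
    return coeffs, residual


rng = np.random.default_rng(0)
atoms = rng.standard_normal((64, 256))
atoms /= np.linalg.norm(atoms, axis=0)             # normalise every atom (column)
x = 3.0 * atoms[:, 10] - 2.0 * atoms[:, 100]       # synthetic two-atom signal
coeffs, residual = matching_pursuit(x, atoms, max_iter=30)
```

By construction the selected atoms and the residual always add up to the original signal, which mirrors Equation (3.38). With a fast implicit dictionary the product `dictionary.T @ residual` would be replaced by the fast analysis operator.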

3.3.2.2 Complexity

Each Matching Pursuit iteration requires $O(N \log_2 N)$ operations.

3.3.3 Basis Pursuit

3.3.3.1 Concepts and Principles of Basis Pursuit

Basis Pursuit [151] represents signals in overcomplete dictionaries by convex optimization. It obtains the decomposition that minimizes the $\ell^1$ norm of the coefficients occurring in the representation:

$\min \|\alpha\|_1$ (3.39)

The powerful overcomplete dictionaries and optimization rules of Basis Pursuit, if properly applied to analysing vibration signals of machines, can lead to effective machinery fault diagnosis.

A vibration signal of an operating bearing, $x = (x_t : 0 \le t < n)$ (viewed as a vector in $\mathbb{R}^n$), is a discrete-time signal. It is sampled at equal time intervals (set by the sampling frequency) over a sampling length n. It is important to find a precise reconstruction of the signal as a superposition of certain elementary waveforms $\phi_\gamma$. This is the main objective of the Basis Pursuit algorithm, which yields an adaptive time-frequency transform for extracting vibration features. In particular, using Basis Pursuit, impulsive components at defect frequencies can be represented in the time-frequency domain with fine resolution and sparsity.

An adaptive approximation that satisfies a given residual requirement is reached after a certain number of iterations. In these iterations, the best matched wavelet packet atoms are selected from the wavelet packet dictionary according to the optimization principle of Basis Pursuit.

Basis Pursuit can be used with noisy data by solving an optimization problem that trades off a quadratic misfit measure against the $\ell^1$ norm of the coefficients. Basis Pursuit can stably suppress noise while preserving structure that is well expressed in the dictionary under consideration.

Basis Pursuit is closely connected with linear programming. Recent advances in

large-scale linear programming associated with interior-point methods can be applied

to Basis Pursuit and can make it possible, with certain dictionaries, to nearly solve

the BP optimization problem in nearly linear time.
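The link to linear programming can be written out explicitly. A common way to do so, sketched here with auxiliary vectors $u$ and $v$ that are introduced only for illustration, splits the coefficient vector into its positive and negative parts, $\alpha = u - v$ with $u, v \ge 0$. At the optimum $\|\alpha\|_1 = \mathbf{1}^T(u + v)$, and the problem takes the standard LP form with $s$ denoting the signal to be decomposed:

```latex
\begin{aligned}
&\min_{x}\; c^{T}x \quad \text{subject to} \quad Ax = b,\; x \ge 0, \\
&\text{with}\quad
x = \begin{pmatrix} u \\ v \end{pmatrix},\quad
c = \begin{pmatrix} \mathbf{1} \\ \mathbf{1} \end{pmatrix},\quad
A = \begin{pmatrix} \Phi & -\Phi \end{pmatrix},\quad
b = s,\quad
\alpha = u - v .
\end{aligned}
```

Under this assumption, any LP solver that handles equality constraints and nonnegative variables can in principle be applied to the Basis Pursuit problem.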

The optimization principle of Basis Pursuit leads to decompositions that can have very different properties from those of the wavelet transform, DWPA, and Matching Pursuit. In particular, they can be much sparser. Vibration analysis using Basis Pursuit can therefore extract features that are sparser than those obtained with DWPA and Matching Pursuit. As a result, unnecessary features are suppressed and can be disregarded when interpreting the features. Furthermore, because Basis Pursuit is based on convex optimization, it searches for solutions globally. A signal can be stably super-resolved, i.e., the time and frequency of features can be sufficiently localised.

3.3.3.2 Algorithms used in solving Basis Pursuit problems

Basis Pursuit is realised using an optimization principle. The optimization is a complicated linear programming problem, which can be solved successfully thanks to the significant amount of work done on the solution of linear programs in the past. The algorithms for solving linear programs are the simplex and interior-point algorithms, both of which have been applied to Basis Pursuit. When applied to Basis Pursuit problems, these two algorithms are called Basis Pursuit - simplex and Basis Pursuit - interior, and are introduced as follows:

Basis Pursuit - simplex

In standard implementations of the simplex method for LP, one first finds an initial basis $B$ consisting of $n$ linearly independent columns of $A$ for which the corresponding solution $B^{-1}b$ is feasible (nonnegative). Then one iteratively improves the current basis by swapping, at each step, one term in the basis for one term not in the basis, using the swap that best improves the objective function. There always exists a swap that improves or maintains the objective value, except at the optimal solution. Moreover, LP researchers have shown how one can select terms to swap in such a way as to guarantee convergence to an optimal solution (anticycling rules). Hence the simplex algorithm is explicitly a process of Basis Pursuit: iterative improvement of a basis until no improvement is possible, at which point the solution is achieved.

Translating this LP algorithm into BP terminology, one starts from any linearly

independent collection of n atoms from the dictionary. This is the current

decomposition. The current decomposition is iteratively improved by swapping

atoms in the current decomposition for new atoms, with the goal of improving the

objective function. By application of anticycling rules, there is a way to select swaps

that guarantees convergence to an optimal solution (assuming exact arithmetic).

BP-interior

The collection of feasible points $\{x : Ax = b,\ x \ge 0\}$ is a convex polyhedron in $\mathbb{R}^m$ (a “simplex”). The simplex method, viewed geometrically, works by walking around the boundary of this simplex, jumping from one vertex (extreme point) of the polyhedron to an adjacent vertex at which the objective is better. Interior point methods instead start from a point $x^{(0)}$ well inside the interior of the simplex ($x^{(0)} \gg 0$) and go “through the interior” of the simplex. Since the solution of an LP is always at an extreme point of the simplex, as the interior-point method converges, the current iterate $x^{(k)}$ approaches the boundary. One may abandon the basic interior point iteration and invoke a “crossover” procedure that uses simplex iterations to find the optimizing extreme point.



Translating this LP algorithm into Basis Pursuit terminology, one starts from a solution to the overcomplete representation problem $\Phi\alpha^{(0)} = s$ with $\alpha^{(0)} > 0$. One iteratively modifies the coefficients, maintaining feasibility $\Phi\alpha^{(k)} = s$, and applying a transformation that effectively sparsifies the vector $\alpha^{(k)}$. At some iteration, the vector has $\le n$ significantly nonzero entries, and it “becomes clear” that those correspond to the atoms appearing in the final solution. One forces all the other coefficients to zero and “jumps” to the decomposition in terms of the $\le n$ selected atoms. (More general interior-point algorithms start with $\alpha^{(0)} > 0$ but don't require the feasibility $\Phi\alpha^{(k)} = s$ throughout; they achieve feasibility eventually.)

3.3.3.3 Denoising

For noisy data, Basis Pursuit can be adapted by assuming data of the form

$y = s + \sigma z$, (3.40)

where $(z_i)$ is a standard white Gaussian noise, $\sigma > 0$ is the noise level, and $s$ is the clean signal. In this setting, $s$ is unknown, while $y$ is known. Usually an exact decomposition of $y$ is not required. Instead decompositions like (3-28) become relevant.

Basis Pursuit denoising (BPDN) refers to the solution of

$\min_{\alpha} \frac{1}{2} \|y - \Phi\alpha\|_2^2 + \lambda \|\alpha\|_1$ (3.41)

Details of the denoising algorithm can be found in [151]. Basis Pursuit denoising of the vibration signals was conducted in this research to highlight the fault features.
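To make the BPDN objective concrete, the following sketch minimizes it with iterative soft thresholding (ISTA). This is a deliberately simpler technique than the primal-dual log-barrier interior-point method used in this research, chosen only because it fits in a few lines. The dictionary is assumed to be an explicit matrix `Phi`, and the function names and the parameter `n_iter` are hypothetical.

```python
import numpy as np


def soft_threshold(v, t):
    """Shrink every entry of v towards zero by t (proximal map of t * l1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def bpdn_objective(y, Phi, alpha, lam):
    """Value of 0.5 * ||y - Phi alpha||_2^2 + lam * ||alpha||_1."""
    return 0.5 * np.sum((y - Phi @ alpha) ** 2) + lam * np.sum(np.abs(alpha))


def bpdn_ista(y, Phi, lam, n_iter=500):
    """Approximate the BPDN minimizer by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2     # 1 / (largest singular value)^2
    alpha = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        gradient = Phi.T @ (Phi @ alpha - y)     # gradient of the quadratic misfit
        alpha = soft_threshold(alpha - step * gradient, step * lam)
    return alpha


rng = np.random.default_rng(0)
Phi = rng.standard_normal((32, 96))
Phi /= np.linalg.norm(Phi, axis=0)
clean = 2.0 * Phi[:, 5] - 1.5 * Phi[:, 40]
y = clean + 0.05 * rng.standard_normal(32)       # y = s + sigma * z
alpha_hat = bpdn_ista(y, Phi, lam=0.1)
```

The parameter `lam` plays the role of $\lambda$ in Equation (3.41): larger values give sparser coefficient vectors at the cost of a larger misfit.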

3.3.3.4 Factors affecting the application of Basis Pursuit

When applying Basis Pursuit to analyse vibration signals, problems such as lengthy calculation times or poor convergence can be experienced. These often lead to an inadequate or failed Basis Pursuit analysis of the vibration signals. The performance of Basis Pursuit analysis of the vibration signals is determined by the size of the problem, the parameter selection, and the complexity of the signal. These factors are vital for the success of the Basis Pursuit analysis of signals and are explained as follows.

(1). The size of problem

The complexity of the Basis Pursuit analysis increases as the size of the problem

grows. The innermost computational step (a conjugate-gradient iteration) has a

complexity that scales with problem size like O(n) or O(n log(n)) depending on the

type of dictionary used. The choice of the right dictionary is therefore vital for the successful application of Basis Pursuit analysis.

(2). Selection of parameters in the Basis Pursuit analysis

The complexity of the primal-dual logarithmic barrier interior-point implementation

depends on both the accuracy of the solution and the accuracy of the conjugate-

gradient solver. The accuracy of the solution is determined by the two parameters

FeaTol, PDGapTol [151] controlling the number of barrier iterations, and the

parameter CGAccuracy, which decides the accuracy of the conjugate-gradient solver

and consequently the number of conjugate-gradient iterations. To meet the accuracy required for accurate fault diagnosis, FeaTol, PDGapTol, and CGAccuracy are set to $10^{-2}$ for superresolution.

(3). The complexity of the signal

When the vibration signal is straightforward with only a few tonal components, Basis

Pursuit can achieve sparse representation with the algorithm converging quickly.

3.4 Summary

This chapter has presented the basic theory and comments on the suitability of the

time-frequency analysis techniques for fault diagnosis of rolling bearings. According

to the analysis of the above sections, symmlet 8 wavelet functions were chosen for

DWPA. A discrete wavelet packet dictionary with symmlet 8 functions was used in

Matching Pursuit and Basis Pursuit analysis. The wavelet packet dictionary was

chosen for its properties as follows:

(1) It is a time-frequency dictionary,

(2) It has wavelet functions,



(3) It is orthogonal,

(4) It is a dyadic dictionary.

A primal-dual log-barrier interior-point algorithm was chosen for the optimization step of Basis Pursuit.

For highly non-stationary signals, the entropy minimization in DWPA produces a mismatch between the “best” orthonormal basis and many local signal components. In contrast, Matching Pursuit is a greedy algorithm that locally optimizes the choice of the wavelet packet function for each signal residue. Matching Pursuit analysis can thus adapt to varying structures. This greedy strategy requires more computation than the best basis DWPA, whose total complexity is $O(N \log N)$. The global optimization of Basis Pursuit requires more computation but yields good results for the decomposition of vibration signals.

Signals were decomposed with the same length and number of decomposition levels

using these three time-frequency analysis techniques in the application to diagnose

rolling element bearing faults.

CHAPTER 4. AUTOMATIC DIAGNOSIS SCHEMA



4.1 Introduction

This chapter presents automatic fault diagnosis methods. Neural Networks were applied to automate the interpretation of the features extracted using Fourier spectrum analysis, time-frequency spectrogram analysis, or the advanced time-frequency analysis techniques (improved DWPA, Matching Pursuit, and Basis Pursuit) described in Chapter 3. Conventionally, human experts distinguish features by inspection. In spectrum analysis, specialists look for specific frequency components to assess the condition of rolling bearings. With the spectrogram, diagnostic personnel assess the condition of a rolling bearing by inspecting the time-frequency map of its vibration signal.

In DWPA, certain wavelet packets of bearing signals are filtered to signify the status of the bearing. Although the above-mentioned techniques can sometimes be used with limited success in diagnosing rolling bearing faults, the resulting features may not be easily distinguished by human experts. To overcome this reliance on experts, this chapter presents some common rules for feature extraction. The resulting feature patterns were combined with Neural Network classifiers to form an automatic pattern recognition strategy.

Neural Networks were chosen as pattern classifiers for the reasons mentioned in

Chapter 2. Of the various types of Neural Networks, Feed Forward Neural Networks

are the most well developed and generally applied with success in bearing fault

classification.

In each section of this chapter, an automatic diagnostic schema is presented, with a flowchart first, followed by a detailed description. The schemas follow the process of feature extraction and classification of rolling bearing conditions. The strategy for feature selection is elaborated first in each section. The principle of Neural Networks is then presented in detail, and two architectures are designed for pattern recognition.



4.2 Automatic Fault Diagnosis Using Spectrum Analysis

The automatic diagnosis procedure used in this study is shown in Figure 4.1, where

frequency features are extracted first. Frequency features were extracted using

Fourier spectra and the time frequency spectrogram separately. The spectrum or time

series signals of each frequency band after spectrogram were processed to reduce the

dimension of feature vectors. A Feed Forward Neural Network (FFNN) was

designed and trained on these features to classify bearing conditions.

Figure 4.1: Automatic fault diagnosis procedure using spectrum, spectrogram

with NN

A signal x(t) is a series of sampled digital values which represents the vibration of a bearing to be diagnosed.

Fourier Spectrum (FFT)

The FFT is the most common technique for converting a signal from the time domain into the frequency domain.

$F(\omega) = \int_{-\infty}^{+\infty} x(t) \, e^{-j\omega t} \, dt$ (4.1)

[Figure 4.1 flowchart: collected data → feature extraction (Fourier spectrum or spectrogram, then averaging) → fault classification (FFNN) → diagnosis results: ORF, IRF, REF, HB]

The FFT spectrum of each signal is then reduced by averaging it over equal frequency intervals. The frequency range of each data set was divided into sixteen intervals, and the spectrum was averaged within each interval to form a vector used as the input to an FFNN.
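The band averaging can be sketched as follows. This is an illustration under stated assumptions rather than the code used in this study: the one-sided magnitude spectrum is used, the sixteen intervals are made as equal as the number of frequency bins allows, and the function name `fft_band_features` and the 1000 Hz test tone are hypothetical.

```python
import numpy as np


def fft_band_features(x, n_bands=16):
    """Average the one-sided magnitude spectrum over n_bands equal intervals."""
    spectrum = np.abs(np.fft.rfft(x))             # magnitude spectrum, 0 .. fs/2
    bands = np.array_split(spectrum, n_bands)     # near-equal frequency intervals
    return np.array([band.mean() for band in bands])


fs = 5000.0                                       # sampling frequency in Hz (assumed)
t = np.arange(1024) / fs
x = np.sin(2.0 * np.pi * 1000.0 * t)              # a single 1000 Hz tone
features = fft_band_features(x)                   # 16-element input vector for an FFNN
```

Each element of `features` summarises one sixteenth of the frequency range, so a 1024-sample signal is reduced to a 16-element vector.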

Spectrogram

The spectrogram is defined as the squared magnitude of the STFT.

$SPEC_x(t, f) = |STFT_x(t, f)|^2 = \left| \int_{-\infty}^{+\infty} x(\tau) \, w(\tau - t) \, e^{-j 2\pi f \tau} \, d\tau \right|^2$ (4.2)

The spectrogram is arguably the most useful presentation of time-frequency information at this time [149]. In this study, the vibration signals were divided into 16 frequency bands using the spectrogram. The time series in each band were averaged and formed into a feature vector.
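A possible form of this spectrogram feature is sketched below. The Hann window, the frame length of 128 samples, the hop of 64 samples, and the choice to average each band over both time and frequency are all assumptions made for illustration; the function name `spectrogram_band_features` is hypothetical.

```python
import numpy as np


def spectrogram_band_features(x, n_bands=16, frame_len=128, hop=64):
    """Average the spectrogram |STFT|^2 within n_bands frequency bands."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)                       # analysis window w
    starts = range(0, len(x) - frame_len + 1, hop)
    frames = np.array([x[s:s + frame_len] * window for s in starts])
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # shape (n_frames, n_freqs)
    bands = np.array_split(spec, n_bands, axis=1)        # split along frequency
    return np.array([band.mean() for band in bands])     # mean over time and frequency


fs = 5000.0                                              # sampling frequency in Hz (assumed)
t = np.arange(1024) / fs
x = np.sin(2.0 * np.pi * 1000.0 * t)
spec_features = spectrogram_band_features(x)
```

Unlike the plain FFT feature, the intermediate array `spec` keeps the time evolution of each band, so other per-band statistics could be taken from it if needed.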

4.3 Automatic Fault Diagnosis Based on DWPA

The automatic diagnosis procedure adopted in this investigation is shown in Figure

4.2. Features were extracted from input signals and subsequently classified to assess

bearing condition. The feature extraction process utilized the DWPA as a pre-

processor. A variety of parameters such as Mean, Root Mean Square (RMS),

Variance, Energy value, Skewness, Kurtosis, Crest Factor, and Matched Filter were

derived from the DWPA based wavelet packets and further used as features. Feed

Forward Neural Networks (FFNN) were used to classify bearing conditions based on

features obtained. The training of the Neural Network was conducted by feeding in

features of a number of signals collected from bearings with known conditions

including Normal (N) (or Healthy Bearing (HB)), Inner Race Fault (IRF), Outer Race
Fault (ORF), and Rolling Element Fault (REF).
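The statistical parameters listed above can be computed from a wavelet packet as in the following sketch. It uses common textbook definitions (population variance, non-excess kurtosis, crest factor as peak over RMS), which are assumed here and may differ in detail from those used in this study. The Matched Filter feature is omitted, and the function name `packet_features` is hypothetical.

```python
import numpy as np


def packet_features(packet):
    """Statistical features of one wavelet packet (1-D array of coefficients)."""
    p = np.asarray(packet, dtype=float)
    mean = p.mean()
    centred = p - mean
    variance = np.mean(centred ** 2)
    std = np.sqrt(variance)                      # zero for a constant packet
    rms = np.sqrt(np.mean(p ** 2))
    return {
        "mean": mean,
        "rms": rms,
        "variance": variance,
        "energy": np.sum(p ** 2),
        "skewness": np.mean(centred ** 3) / std ** 3,
        "kurtosis": np.mean(centred ** 4) / std ** 4,
        "crest_factor": np.max(np.abs(p)) / rms,
    }


t = np.arange(1000) / 1000.0
sine = np.sin(2.0 * np.pi * 10.0 * t)            # ten full periods of a sine wave
sine_features = packet_features(sine)
```

For a pure sine wave these definitions give an RMS of about 0.707, a crest factor of about 1.414, and a kurtosis of 1.5; impulsive fault signals typically give much larger crest factor and kurtosis values.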

Choosing wavelet functions

In the procedure outlined above, symlet8 functions were used in feature extraction

for reasons mentioned in the previous chapter. Figure 4.3 shows the symlet8 function

with certain scale and location where the left value in the brackets indicates the

decomposition level and the right value identifies the location of bands (packets) in

the decomposition level.



Figure 4.2: Automatic fault diagnosis procedure using DWPA and NN

[Flowchart: reference signals (from bearings with known faults) and test signals (from a bearing to be diagnosed) each pass through feature extraction (DWPA with wavelet selection and tree level decision, giving signals of wavelet packets, then Mean, Variance, Skewness, Kurtosis, Crest Factor, Energy, or Matched Filter) to form the training sets and testing sets; bearing condition classification is performed by a Feed Forward Neural Network with a single output or multiple outputs.]

Best level decision

A signal is decomposed into wavelet packets by DWPA. The selection of wavelet packets is critical for the success of the Neural Network classifier. The best basis algorithm does not yield the same tree for different data segments of a signal and is therefore not suitable for a Neural Network classifier, which needs input vectors with a consistent structure. To select wavelet packets in a consistent way, it is recommended to simply choose the decomposition level and keep all the nodes up to that level. In this case, no information is lost, and the input feature vectors for a Neural Network classifier are generated in a uniform way.

A high decomposition level leads to signals filtered in narrow frequency bands and accurate frequency localisation of high energy vibrations. However, it requires significant computation time and thus cannot fulfil the requirement for efficient feature extraction in an automatic diagnostic procedure. There is a trade-off between the precision of the filter bands and computation time. It is important to select an appropriate decomposition level to analyse signals in accurate frequency bands while conducting the DWPA in reasonable time.
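The trade-off can be quantified with simple arithmetic: each additional level doubles the number of packets and halves their nominal bandwidth. The sketch below assumes ideal band splitting and a sampling frequency of 5000 Hz chosen for illustration; the function name `packet_bands` is hypothetical.

```python
def packet_bands(fs_hz, level):
    """Number of wavelet packets at a level and their nominal bandwidth in Hz."""
    n_packets = 2 ** level                    # packets double with every level
    bandwidth = (fs_hz / 2.0) / n_packets     # equal share of the band 0 .. fs/2
    return n_packets, bandwidth


for level in range(1, 9):
    n_packets, bandwidth = packet_bands(5000.0, level)
    print(level, n_packets, bandwidth)
```

At level 4, for example, the band from 0 to 2500 Hz is split into 16 packets of 156.25 Hz each, while level 8 gives 256 packets and therefore sixteen times as many feature values to compute per statistic.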

[Plot: “Some S8 Symmlets at various scales and locations”, horizontal axis from 0 to 1; wavelet packets shown at (level, location) = (4, 3), (4, 8), (4, 11), (6, 12), (6, 26), (6, 34), (6, 42), (7, 51), (7, 77), (7, 101), (8, 31), (8, 81), (8, 102), (8, 166), (8, 202)]

Figure 4.3: Symlet8 function with a few scales and locations

Features based on wavelet packets

Parameters are then calculated from the wavelet packets obtained by the DWPA of the signals. This procedure can dramatically reduce the dimension of the inputs to the Feed Forward Neural Network (FFNN) classifier. Identifying a reasonable dimension is critical for the performance of the FFNN in automatically interpreting the features and can avoid the curse of dimensionality.



4.4 Automatic Fault Diagnosis Using Pursuit

The Pursuit based automatic diagnosis procedure is shown in Figure 4.4. Feature extraction uses Matching Pursuit or Basis Pursuit, which requires the dictionary and the number of decomposition iterations to be chosen. Feed Forward Neural Networks with multiple outputs were used to classify bearing conditions based on the features.

Figure 4.4: Automatic fault diagnosis procedure using Basis Pursuit (or

Matching Pursuit) and NN

Dictionary selection

In this feature extraction, a Discrete Wavelet Packet Dictionary with symlet8 functions was adopted.

[Figure 4.4 flowchart: reference signals (from bearings with known faults) and test signals (from a bearing to be diagnosed) each pass through Basis Pursuit or Matching Pursuit analysis (dictionary selection, decomposition level decision) and feature selection to form the training sets and testing sets, which are fed to a Neural Network for bearing condition classification.]

Iteration decision

The number of iterations in the calculation of Basis Pursuit (Matching Pursuit) needs to be decided according to the residual requirement and the calculation efficiency requirements. In these analyses, the decomposition was conducted from level 1 to level 4.

Feature selection

The feature pattern is then formed from the Basis Pursuit coefficients of the signals by selecting high energy components, or a certain number of the largest coefficient values. In this application, the number of selected values for the feature pattern is a power of two: 16, 32, 64, or 128. A value under 16 is considered insufficient to represent a signal. A value over 128 is considered burdensome for calculation.
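Selecting a fixed number of the largest coefficients can be sketched as below. Ranking by absolute value and returning the values in descending order of magnitude are assumptions made for this illustration, and the function name `select_top_coefficients` is hypothetical.

```python
import numpy as np


def select_top_coefficients(coeffs, k=16):
    """Return the k largest-magnitude coefficients and their indices."""
    coeffs = np.asarray(coeffs, dtype=float)
    order = np.argsort(np.abs(coeffs))[::-1][:k]   # indices sorted by decreasing magnitude
    return coeffs[order], order


values, indices = select_top_coefficients([0.1, -5.0, 3.0, 0.2, -0.05, 4.0], k=3)
```

In this small example the three selected values are -5.0, 4.0, and 3.0. Because the number of selected values is fixed, every signal yields a feature vector of the same length, which is what the Neural Network input layer requires.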

4.5 Design of the Feed Forward Neural Network classifier

4.5.1 Feed Forward Neural Networks

Of the variety of activation functions and learning rules, the ones used in this thesis

are presented in the following sections. The back propagation rule will be also

presented.

Activation Function

Most neurons in Neural Networks convert their inputs using a scalar-to-scalar

function called an activation function, yielding a neuron output. The most commonly

used activation functions are linear functions, threshold functions, sigmoid functions,

and bipolar sigmoid functions (refer to APPENDIX). Among these activation

functions, the sigmoid function was selected in the Neural Network classifiers used

in this study.

Sigmoid function (Figure 4.5)

$g(x) = \frac{1}{1 + e^{-x}}$ (4.3)

The sigmoid function is suitable for this application for the following reasons:

(1) This function has particular advantages for use in back propagation Neural

Networks because it is easy to differentiate, and thus can dramatically reduce the

computational burden for training.

(2) It is suitable for applications whose desired output values are between 0 and 1.

This makes it suitable for pattern recognition applications, and in this case,

particularly for bearing fault classification.

Figure 4.5: Sigmoid function [152]

(3) Activation functions should be chosen to suit the distribution of target values for output neurons. The sigmoid function is well suited for binary target values (0 or 1). It also has advantages for continuous-valued targets with a bounded range, provided that either the outputs or the targets are scaled to the range of the output activation function.
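The ease of differentiation mentioned in reason (1) follows from the identity $g'(x) = g(x)(1 - g(x))$, so the derivative needed by back propagation is obtained from the neuron output itself. A minimal sketch, with hypothetical function names:

```python
import math


def sigmoid(x):
    """Sigmoid activation g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative g'(x) = g(x) * (1 - g(x)), reusing the activation value."""
    g = sigmoid(x)
    return g * (1.0 - g)


h = 1e-5
numerical = (sigmoid(1.0 + h) - sigmoid(1.0 - h)) / (2.0 * h)   # finite-difference check
analytical = sigmoid_derivative(1.0)
```

The output always lies between 0 and 1, and the derivative peaks at 0.25 for an input of zero.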

Back propagation was devised by generalizing the Widrow-Hoff learning rule to

multiple-layer networks and nonlinear differentiable transfer functions. Input vectors

and the corresponding target vectors are used to train a network until it can

approximate a function, associate input vectors with specific output vectors, or

classify input vectors in an appropriate way. Networks with biases, a sigmoid layer,

and a linear output layer are capable of approximating any function with a finite

number of discontinuities.

Back propagation is a gradient descent algorithm, as is the Widrow-Hoff learning

rule, in which the network weights are moved along the negative of the gradient of

the performance function. The term backpropagation refers to the manner in which

the gradient is computed for nonlinear multilayer networks.


A few aspects need to be considered when designing the structure of Neural

Network classifiers:

(1) Number of neural nodes

A larger number of neurons enables a Neural Network to approximate functions of greater complexity. The shortcoming, however, is that while more neurons (and, consequently, more parameters) can fit the training data better, the network also has greater degrees of freedom in between the training points. This might cause the network to fit the test data poorly. With too few parameters relative to the training data distribution, even the training set would not be adequately matched. The gradient descent algorithm may also have a tendency to get trapped in local minima of the error surface.

(2) Layers of Neural Networks

Single-layer networks are capable of representing only linearly separable functions or linearly separable decision domains. Networks with two hidden layers can represent an arbitrary decision boundary to arbitrary accuracy and, with sigmoid activation functions, can approximate any smooth mapping to any accuracy.

Many hidden layers lead to a dramatic increase in the number of local minima. Gradient-based optimization algorithms often find only local minima and may miss the global minimum. Even if the training algorithm is able to find the global minimum, it is quite likely to become stuck in a local minimum after many time-consuming iterations. The training then has to be stopped and started over.

In general, one hidden layer is the first choice for any practical feed-forward network design. If a Neural Network using a single hidden layer with a large number of hidden neurons does not perform well, a second hidden layer with fewer processing neurons should then be considered.

Considering the limitation and strength of different numbers of hidden layers in

Neural Networks, only the single and double hidden layered networks were designed

and tested in this study.



4.5.2 Design of the Structure of the Feed Forward Neural Network Classifiers

In the automatic diagnosis procedure, the features of signals extracted using the time-frequency analysis methods are fed into an FFNN to train the network and to classify the condition of the bearing under test.

In this research, two FFNNs were designed for the classification of bearing conditions. These two FFNNs have different architectures, with different numbers of hidden layers and different numbers of output nodes. The first NN (as shown in Figure 4.6) has eight input nodes, connected to one hidden layer with ten nodes. The hidden layer feeds a single output node. The decision space of the output node is divided into four intervals, each of which corresponds to one diagnostic result.

The second FFNN (as shown in Figure 4.7) has eight input nodes, two hidden layers with ten nodes and four output nodes. The eight nodes of the input layer are connected to the two hidden layers. The output layer of the NN comprises four nodes, which represent the classes of the rolling bearing conditions: Normal, IRF, ORF, and REF respectively.
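The two output conventions can be sketched as follows. The equal-width division of the interval from 0 to 1, the order in which the classes are assigned to the intervals and output nodes, and the function names are all assumptions made for illustration; the text above does not fix them.

```python
CLASSES = ("Normal", "IRF", "ORF", "REF")    # ordering is an assumption


def decode_single_output(y, classes=CLASSES):
    """Map a scalar output in [0, 1] to one of len(classes) equal-width intervals."""
    index = int(y * len(classes))
    index = max(0, min(index, len(classes) - 1))   # clamp, so that y = 1.0 is valid
    return classes[index]


def decode_multi_output(outputs, classes=CLASSES):
    """Pick the class whose output node has the largest activation."""
    best = max(range(len(classes)), key=lambda i: outputs[i])
    return classes[best]


single = decode_single_output(0.30)
multi = decode_multi_output([0.1, 0.2, 0.9, 0.3])
```

With the single-output design the classes are forced onto one ordered axis, whereas the multi-output design gives each class its own node and so makes no assumption about ordering.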

Figure 4.6: A single output FFNN

[Diagram labels: input layer, hidden layer, output layer; diagnostic results ORF, IRF, REF, Normal]

Figure 4.7: A multi output FFNN

The Neural Networks are trained using a back propagation algorithm. Training stops when either the Mean Square Error (MSE) or the number of training epochs reaches a set value. In our application, a target Mean Square Error of $10^{-5}$ and a maximum number of iterations (epochs) of 300 were set. The training process stops if either of these conditions is met. The initial weights and biases of the network were generated automatically by the program. During our training processes, the epoch limit of 300 was generally the one reached first. The Mean Square Error (MSE) at that point is used as a criterion for appraising the training performance of the Neural Network, and the classification rate as the criterion for appraising each diagnosis procedure.
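The training loop and its two stopping criteria can be illustrated with a small sketch. It is not the program used in this research: it assumes a network with one hidden layer of ten sigmoid nodes and sigmoid output nodes, plain batch gradient descent with a learning rate of 0.5, random normal initial weights, and synthetic data. All names (`train_ffnn`, `mse_goal`, `max_epochs`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_ffnn(X, T, n_hidden=10, lr=0.5, mse_goal=1e-5, max_epochs=300, seed=0):
    """Batch back propagation that stops at the MSE goal or the epoch limit."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, T.shape[1]))
    b2 = np.zeros(T.shape[1])
    history = []
    for _ in range(max_epochs):
        H = sigmoid(X @ W1 + b1)                 # hidden layer outputs
        Y = sigmoid(H @ W2 + b2)                 # network outputs
        error = Y - T
        mse = float(np.mean(error ** 2))
        history.append(mse)
        if mse <= mse_goal:                      # first stopping criterion
            break
        # Back-propagate the error; constant factors are absorbed into lr.
        delta_out = error * Y * (1.0 - Y)
        delta_hid = (delta_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * H.T @ delta_out / len(X)
        b2 -= lr * delta_out.mean(axis=0)
        W1 -= lr * X.T @ delta_hid / len(X)
        b1 -= lr * delta_hid.mean(axis=0)
    return (W1, b1, W2, b2), history             # loop end is the second criterion


rng = np.random.default_rng(1)
labels = np.repeat(np.arange(4), 10)             # four bearing conditions
centres = rng.normal(size=(4, 8))                # eight synthetic features per class
X = centres[labels] + 0.1 * rng.normal(size=(40, 8))
T = np.eye(4)[labels]                            # one output node per class
weights, history = train_ffnn(X, T)
```

The list `history` records the MSE at each epoch, so its last entry corresponds to the value used above to appraise training performance when the epoch limit is reached first.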

4.6 Summary

This chapter presented automatic fault diagnosis procedures. The feature extraction techniques employed ranged from conventional Fourier spectrum analysis and the time-frequency spectrogram, through improved DWPA and improved Matching Pursuit, to a first application of Basis Pursuit. The feature pattern based on spectrum analysis was formed by selecting a certain number of averaged values of the spectrum. The feature pattern from the spectrogram was formed by averaging signals in different frequency bands. The DWPA pattern consisted of Mean, Variance, Skewness, Kurtosis, RMS, Energy Value, Crest Factor, and Matched Filter derived from the wavelet packets. The feature pattern for Matching Pursuit was formed by selecting the maximum values of the Matching Pursuit coefficients, while that of Basis Pursuit was formed by selecting the maximum Basis Pursuit coefficients.

Details of Feed Forward Neural Networks were also introduced. In addition, two Neural Networks, one with a single hidden layer and a single output and one with two hidden layers and multiple outputs, were designed for diagnosing faults in rolling bearings.

CHAPTER 5. SIMULATION AND EXPERIMENT



5.1 Introduction

This chapter presents vibration signals simulated using mathematical models. Impulsive signals, whose waveforms resemble those of bearing vibrations, were simulated to test the performance of the studied techniques. The results of analysing these simulated signals with the proposed methods were compared with the known theoretical characteristics of the designed models. The comparison shows the performance of the proposed feature extraction techniques.

Experimental work was conducted to collect vibration signals to test the performance

of the proposed methods for experimental diagnostic tasks.

The framework for preparing test data from rolling bearings is briefly presented in

this chapter and includes test rigs and electronic instruments.

Various fault severities were artificially introduced. The dimensions of these bearing

faults are presented in the third section, where the characteristic frequencies of the

bearings are also provided.

The fifth section analyses failure development of different severity of faults in rolling

bearings.

5.2 Simulated Signals

Two signals were simulated as presented in Equations 5.1 and 5.2. The simulated

signals are impulsive signals with digitisation at 5000 Hz. These two simulated

signals consist of four sinusoidal components with exponential amplitude.

( ) ( )[ ] ( ) ( )[ ] ( )( )[ ] ( ) ( )[ ] ( )7.45000cos3200800exp75000cos2700600exp

4.55000cos3000400exp65000cos2400200exp22

221

tttt

ttttty

ππππ

−−+−−+

−−+−−=,

(5.1)

( ) ( )[ ] ( ) ( )[ ] ( )( )[ ] ( ) ( )[ ] ( )165000cos60800exp145000cos90600exp

185000cos70400exp155000cos80200exp22

222

tttt

ttttty

ππππ

−−+−−+

−−+−−=

(5.2)



The former has an outline similar to a sinusoidal signal, while the latter has an outline that appears more impulsive. These two signals can successfully represent vibration signals of faulty bearings. In these models, the amplitude and frequency values of the main components are clearly distinguished. The first model has primary components with frequencies of 833 Hz, 463 Hz, 357 Hz, and 531 Hz. These frequency components have periods of 0.0012 s, 0.00216 s, 0.0028 s, and 0.00188 s. The second model has primary components with frequencies of 167 Hz, 139 Hz, 178.5 Hz, and 156 Hz, with associated periods of 0.006 s, 0.0072 s, 0.0056 s, and 0.0064 s.

Gaussian noise was added to these two simulated signals at a signal-to-noise ratio (SNR) of 15 dB. The noisy signals more closely simulate actual vibration signals from bearings, which are often contaminated by environmental noise or other extraneous vibration sources.
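The simulation and noise step can be illustrated with a minimal sketch. It is not the code used in this work: it assumes each component has the form exp(-a t) cos(2 pi f t), takes the four frequencies quoted above for the first model, and pairs them with decay rates 200, 400, 600, and 800 in that order (an assumption). The function names `simulate`, `add_gaussian_noise`, and `snr_db` are hypothetical.

```python
import math
import random

FS = 5000.0  # sampling rate in Hz, as stated in the text


def simulate(freqs_hz, decays, duration_s=0.2, fs=FS):
    """Sum of exponentially decaying cosines (assumed component form)."""
    n = int(round(duration_s * fs))
    t = [k / fs for k in range(n)]
    y = [
        sum(math.exp(-a * tk) * math.cos(2.0 * math.pi * f * tk)
            for f, a in zip(freqs_hz, decays))
        for tk in t
    ]
    return t, y


def mean_power(x):
    """Average of the squared samples."""
    return sum(v * v for v in x) / len(x)


def add_gaussian_noise(y, snr_db_target, rng):
    """Add zero-mean white Gaussian noise scaled to the requested SNR."""
    noise_power = mean_power(y) / (10.0 ** (snr_db_target / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in y]


def snr_db(clean, noisy):
    """Realised SNR in dB between a clean signal and its noisy version."""
    noise = [b - a for a, b in zip(clean, noisy)]
    return 10.0 * math.log10(mean_power(clean) / mean_power(noise))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Frequencies from the text; the decay pairing is an assumption.
t, y1 = simulate([833.0, 463.0, 357.0, 531.0], [200.0, 400.0, 600.0, 800.0])
y1_noisy = add_gaussian_noise(y1, 15.0, rng)
print(round(snr_db(y1, y1_noisy), 1))
```

The realised SNR differs slightly from the 15 dB target because the noise power is estimated from a finite number of samples.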

5.3 Test Rig and Experiment Procedure

The experimental procedure adopted in the thesis work program is presented in

Figure 5.1. Accelerometers were attached to the test rigs. Signals were then

transmitted to an amplifier and low pass filtered.

Figure 5.1: Experimental apparatus

Experiments were conducted using two bearing test rigs. The first one comprised an

AC motor and a shaft supported by two rolling element bearings one of which was

the test bearing (see Figure 5.2). Load was applied using a V-belt configuration.

[Figure 5.1 block labels: Test rig, Acceleration transducer, Amplifier, Low pass filter, A/D converter, Data recorder]

Figure 5.2: Test rig with V-belt load

The other test rig had an AC motor and a DC motor connected via a geared shaft, which was supported by bearings (as shown in Figure 5.3).

Figure 5.3: Test rig without load

For the first test rig, an accelerometer located on the plummer block of the faulty bearing was used to measure the vibrations. The accelerometer, an ENDEVCO Model 256HX, is a lightweight, low impedance (constant current) piezoelectric accelerometer with integral electronics, designed specifically for vibration measurements on small structures. Its frequency response spans 15 Hz to 20 kHz. Signals were amplified using a PCB Model 482A20 amplifier and low pass filtered using a Krohn-Hite Model 3202 filter for both noise reduction and anti-aliasing.

A DAQP-308 16-bit A/D PCMCIA data acquisition card, controlled using the DaqEZ professional data acquisition software, was used to convert the analogue signals to digital signals.

For the other test rig, the whole experimental procedure is shown in Figure 5.4. An NI BNC 2120 A/D card was used in this experiment, and the LabVIEW software was used to record the signals.

Figure 5.4: Experimental apparatus

5.4 Fault Simulation

Faults were simulated by using a wire cutting technique to induce a “crack” in the races of the two SKF 6204 bearings to be tested. Figure 5.5 illustrates the simulated faults: one crack in the inner race of one ball bearing and one crack in the outer race of the second ball bearing. The inner race crack was 0.593 mm wide and 0.858 mm deep. The outer race crack was 0.660 mm wide and 2.098 mm deep. The ball passing Inner Race Frequency (IRF) was calculated to be approximately five times the shaft rotational frequency. The ball passing Outer Race Frequency (ORF) was approximately three times the shaft rotational frequency. These bearings were installed in the first test rig, and data were collected using the DAQ acquisition system.


Figure 5.5: Fault simulation of SKF 6204

A series of bearing faults was also simulated in the second test rig. KOYO 6201 RS bearings were used (shown in Figure 5.6). Laser processing was used to create cracks 0.5 mm deep in the inner races and outer races, and a spot in a ball of the bearing. The widths of the outer race and inner race cracks and the diameters of the rolling element spots are specified in Table 5.1. The dimensions and characteristic frequencies of these bearings are shown in Table 5.2. The dimensions and characteristic frequencies of the SKF 6204 bearings are listed in Table 5.3.

Figure 5.6: Fault simulation of KOYO 6201 RS


Table 5.1 Fault specification for the bearings KOYO 6201 RS.

Fault location     Outer race (width)   Inner race (width)   Ball (diameter)
Fault type         Crack                Crack                Spot
Fault size (mm)    0.1                  0.1                  0.1
                   0.2                  0.2                  0.2
                   0.5                  0.5                  0.5

Table 5.2 Drive end bearing KOYO 6201 (Size in mm)

Inside Diameter   Outside Diameter   Thickness   Ball Diameter   Pitch Diameter
12                32                 10          6               22

The number of balls: 7.

Defect frequencies (multiple of running speed in Hz):

Inner Ring   Outer Ring   Cage Train   Rolling Element
4.45         2.55         0.36         3.39
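The defect frequency multipliers in Table 5.2 are consistent with the standard kinematic relations for a rolling element bearing with a stationary outer race. The following sketch evaluates those relations from the tabulated geometry. It assumes a contact angle of zero, which is not stated in the table, and the function name `defect_multipliers` is hypothetical.

```python
import math


def defect_multipliers(n_balls, ball_d, pitch_d, contact_angle_deg=0.0):
    """Defect frequencies as multiples of shaft speed (outer race fixed)."""
    # Ratio of ball diameter to pitch diameter, projected by the contact angle.
    ratio = (ball_d / pitch_d) * math.cos(math.radians(contact_angle_deg))
    return {
        "inner_race": 0.5 * n_balls * (1.0 + ratio),
        "outer_race": 0.5 * n_balls * (1.0 - ratio),
        "cage": 0.5 * (1.0 - ratio),
        # Twice the ball spin frequency: a ball defect strikes both races
        # once per ball revolution.
        "rolling_element": (pitch_d / ball_d) * (1.0 - ratio ** 2),
    }


# KOYO 6201 geometry from Table 5.2 (mm); zero contact angle is assumed.
koyo_6201 = defect_multipliers(n_balls=7, ball_d=6.0, pitch_d=22.0)
for name, value in koyo_6201.items():
    print(f"{name}: {value:.2f}")
```

Multiplying each value by the shaft rotational frequency in Hz gives the corresponding defect frequency in Hz.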

Table 5.3 Drive end bearing: 6204 SKF, deep groove ball bearing

Defect frequencies (multiple of running speed in Hz):

Inner Race   Outer Race   Cage Train   Rolling Element
4.95         3.05         0.38         3.98

Ball Diameter (mm)   Pitch Diameter (mm)
7.938                33.5

Data with marginal faults were also obtained from the Case Western Reserve University (CWRU) website. Their bearing faults were simulated using Electro-Discharge Machining (EDM). The CWRU signals were collected from SKF 6205 bearings (Table 5.4 and Table 5.5) with different severities of inner race and outer race faults. The ball passing inner race and outer race frequencies for these bearings are approximately 5.4 and 3.6 times the shaft rotational frequency respectively (Table 5.4).


Table 5.4 Drive end bearing: 6205-2RS JEM SKF, deep groove ball bearing (Size in inches)

Inside Diameter   Outside Diameter   Thickness   Ball Diameter   Pitch Diameter
0.9843            2.0472             0.5906      0.3126          1.537

Defect frequencies (multiple of running speed in Hz):

Inner Ring   Outer Ring   Cage Train   Rolling Element
5.4152       3.5848       0.39828      4.7135

Table 5.5 Fault Specifications for 6205-2RS JEM SKF (All dimensions in inches)

Bearing     Fault Location   Diameter   Diameter   Diameter
Drive End   Inner Race       0.007      0.014      0.021
Drive End   Outer Race       0.007      0.014      0.021
Drive End   Ball             0.007      0.014      0.021

Defect period (s; 1/Hz):

Inner Race   Outer Race   Cage Train   Rolling Element
0.005624     0.008496     0.07647      0.006462

5.5 Summary

The structure of the test rigs and the details of the instruments used are described in this chapter. Bearing faults were simulated and measurements made to identify the fault characteristics of the bearings. The simulated signals were designed with impulses similar to the vibration waveforms generated by bearing defects. Noise was also added to the simulated signals to make the simulation more realistic.

Data collected from these experiments were used to test the feature extraction performance of the techniques proposed in this project, namely Discrete Wavelet Packet Analysis (DWPA), Matching Pursuit, and Basis Pursuit. Furthermore, a significant amount of the data collected in this experimental phase was used in the training and testing of the automatic diagnostic techniques.

CHAPTER 6. RESULTS AND DISCUSSION



6.1 Introduction

To evaluate the performance of the proposed techniques in practice, the following

aspects need to be considered:

• How can these methods be applied to real diagnostic tasks?

• What would the time-frequency maps look like?

• How would one interpret the time-frequency maps?

• How have the automatic procedures performed?

The time-frequency analysis methods (the improved DWPA, Matching Pursuit, and Basis Pursuit, presented in the third chapter) and the automatic diagnosis using these three methods (presented in the fourth chapter) were applied to analyse the data collected in the experiments.

A mixture of software codes was utilised in generating the results in this thesis. The Matlab Wavelet toolbox, the Matlab neural network toolbox, and the Matlab code for Basis Pursuit [151] were utilised to extract features from the vibration signals and to classify the bearing condition.

Section 6.2 presents the analysis of simulated data and experimental data using these

three time-frequency analysis techniques. The features extracted from the simulated

signals using the time-frequency analysis techniques were compared with the known

characteristics in the simulation to check the accuracy of the time-frequency analysis

techniques. The experimental data were analysed to extract defect related time-

frequency features, which were interpreted as bearing faults.

Section 6.3 presents the results of applying the automatic diagnostic techniques, with emphasis on the preparation of feature inputs to the Neural Networks. These features were derived from spectra, spectrograms, DWPA, Matching Pursuit, and Basis Pursuit. Each feature is embedded into a diagnostic procedure. The Neural Networks were then trained and tested on real data from faulty rolling element bearings.



6.2 Analysis using DWPA, Matching Pursuit, and Basis Pursuit

Simulated signals and data from the experiments were used to evaluate the Basis Pursuit technique by comparison with best basis discrete wavelet packet analysis (DWPA) and Matching Pursuit, particularly with regard to their performance in time-frequency feature extraction. Basis Pursuit was also applied to detect different fault severities. Moreover, features extracted from the signals of faulty bearings using Basis Pursuit were reconstructed into a time series with clear defect impacts and little noise contamination. The results are presented in the following sections.

6.2.1 Time-Frequency Analysis of Simulated Signals

DWPA, Matching Pursuit, and Basis Pursuit were tested using simulated signals as shown in Figures 6.1- 6.4. Figure 6.2 shows four main frequency components in the analysed signal, with time periods of around 0.0012 s, 0.00216 s, 0.0028 s, and 0.00188 s. These detected components match the frequency components of the theoretical signal. Figure 6.3 and Figure 6.4 can also generally be used to determine the frequency components. However, as seen in the figures, when decomposed to the same level, Matching Pursuit and DWPA present the time-frequency components only roughly. The resolution of the time-frequency components from the Basis Pursuit analysis is better than that from the DWPA or Matching Pursuit analysis.


Figure 6.1: Simulated impulse signal y1 [axes: Time (s), Amplitude]

Figure 6.2: Basis Pursuit TF plane of y1 [axes: Time (s), Frequency (Hz)]

Figure 6.3: Best basis DWPA TF plane of y1 [axes: Time (s), Frequency (Hz)]

Figure 6.4: Matching Pursuit TF plane of y1 [axes: Time (s), Frequency (Hz)]

To make the simulation more realistic, Gaussian noise was added to the pure signal (as shown in Figure 6.5). Figure 6.6 shows that, in the Basis Pursuit TF plane, the influence of the noise is diminished significantly: the plane remains a clean, sparse map in which the four main time-frequency components are clearly distinguishable with very good resolution. In the DWPA (Figure 6.7) and Matching Pursuit (Figure 6.8) TF planes, redundant time-frequency components caused by the noise appear and hinder precise interpretation of the planes. Of the techniques investigated, Basis Pursuit therefore removed most of the introduced noise and also attained the best resolution.

Figure 6.5: Simulated impulse signal y1 with noise [axes: Time (s), Amplitude]

Figure 6.6: Basis Pursuit TF plane of y1 with noise [axes: Time (s), Frequency (Hz)]

Figure 6.7: DWPA plane of y1 with noise [axes: Time (s), Frequency (Hz)]

Figure 6.8: Matching Pursuit Plane of y1 with noise [axes: Time (s), Frequency (Hz)]

High amplitude impulsive vibrations were also simulated, as shown in Figures 6.9- 6.12. These impulses are meant to resemble bearing defect signals. In the Basis Pursuit time-frequency plane of this signal, the significant components clearly lead to the interpretation of four main frequencies. The four main frequency components have time periods of around 0.006 s, 0.0072 s, 0.0056 s, and 0.0064 s, which match the theoretical components of the simulated signal. These time periods were obtained by measuring the time intervals between neighbouring components. As seen from these figures, the time-frequency components show up better in the Basis Pursuit plane than in the DWPA and Matching Pursuit planes.


Figure 6.9: Simulated impulse signal y2 [axes: Time (s), Amplitude]

Figure 6.10: Basis Pursuit TF plane of y2 [axes: Time (s), Frequency (Hz)]

Figure 6.11: DWPA TF plane of y2 [axes: Time (s), Frequency (Hz)]

Figure 6.12: Matching Pursuit TF plane of y2 [axes: Time (s), Frequency (Hz)]

The simulated impulsive signal was also corrupted with Gaussian noise, as seen in Figure 6.13. Further analysis was conducted using the proposed DWPA, Matching Pursuit, and Basis Pursuit. Figure 6.14, Figure 6.15, and Figure 6.16 present the TF planes of the Basis Pursuit, DWPA, and Matching Pursuit analyses of this signal respectively. In Figure 6.15 and Figure 6.16, the predominant time-frequency components are shown but are mixed with noise elements in both the DWPA and Matching Pursuit results. In the Matching Pursuit case, it is difficult to distinguish pertinent time-frequency components from unrelated components because the amplitudes (shown in grey) of both types of components are similar. The Basis Pursuit plane, however, diminishes the unrelated components and clearly shows the components related to the impulses. On the evidence so far, Basis Pursuit is the most accurate of these techniques.

Figure 6.13: Simulated impulse signal y2 with noise [axes: Time (s), Amplitude]

Figure 6.14: Basis Pursuit TF Plane of signal y2 with noise [axes: Time (s), Frequency (Hz)]

Figure 6.15: DWPA TF plane of signal y2 with noise [axes: Time (s), Frequency (Hz)]

Figure 6.16: Matching Pursuit TF plane of signal y2 with noise [axes: Time (s), Frequency (Hz)]

6.2.2 Time-Frequency Analysis of the Signals of Bearings with Marginal Faults

Figures 6.17- 6.20 show the time waveform of a normal bearing and its Basis Pursuit, best basis DWPA, and Matching Pursuit planes. The time-frequency planes of the DWPA and Basis Pursuit analyses show one predominant component at about 1000 Hz, presumably caused by a resonance. Figures 6.21- 6.24 show the time waveform of a bearing with an inner race fault and its Basis Pursuit, best basis DWPA, and Matching Pursuit planes. The DWPA and Matching Pursuit planes do indicate that a fault is present, but they are imprecise compared to the Basis Pursuit plane. Figures 6.25- 6.28 show the time waveform of a bearing with an outer race fault and its Basis Pursuit, best basis DWPA, and Matching Pursuit planes.


Figure 6.17: Vibration signal of a bearing with normal condition [axes: Time (s), Amplitude (m/s2)]

Figure 6.18: Basis Pursuit TF plane [axes: Time (s), Frequency (Hz)]

Figure 6.19: Best basis DWPA TF plane [axes: Time (s), Frequency (Hz)]

Figure 6.20: Matching Pursuit TF Plane [axes: Time (s), Frequency (Hz)]

Figure 6.21: Vibration signal of a bearing with IRF [axes: Time (s), Amplitude (m/s2)]

Figure 6.22: Basis Pursuit TF plane [axes: Time (s), Frequency (Hz); atoms marked A, B, C, D and E, F]

Figure 6.23: Best basis DWPA TF plane [axes: Time (s), Frequency (Hz)]

Figure 6.24: Matching Pursuit TF Plane [axes: Time (s), Frequency (Hz)]

Figure 6.25: Vibration signal of a bearing with ORF [axes: Time (s), Amplitude (m/s2)]

Figure 6.26: Basis Pursuit TF plane [axes: Time (s), Frequency (Hz); atoms marked A, B, C, D and E, F]

Figure 6.27: Best basis DWPA TF plane [axes: Time (s), Frequency (Hz)]

Figure 6.28: Matching Pursuit TF plane [axes: Time (s), Frequency (Hz)]

6.2.3 Basis Pursuit, Best Basis DWPA, and Matching Pursuit: a Comparison

A comparison was made between the application of these three techniques to

diagnosing bearing faults regarding their feature extraction performance, sparsity,

superresolution, and computational complexity.

6.2.3.1 Feature extraction performance of Basis Pursuit, Matching Pursuit, and best basis DWPA

Generally, marginal faults in rolling bearings can go undetected by vibration techniques such as best basis DWPA and Matching Pursuit. In this study, however, the Basis Pursuit technique proved to be effective in detecting marginal faults. The application of DWPA, Matching Pursuit, and Basis Pursuit is shown in Figures 6.17- 6.28. The signals were obtained from a normal bearing, a bearing with an inner race fault, and a bearing with an outer race fault respectively. In the time-frequency (TF) planes, the darkness indicates the amplitude of the coefficients of the wavelet atoms. The results generated using best basis DWPA and Matching Pursuit contained a large number of features (wavelet atoms with high amplitude), which were extracted with relatively rough resolution and were often irrelevant to the bearing faults. Such features may cause confusion in a routine diagnostic survey.

The DWPA is mostly considered as a multi-band pass filter, and certain bands of interest can be selected from its TF plane. Matching Pursuit provides more discernible features in the TF plane, but these features are not as clearly separated and identifiable as those extracted by Basis Pursuit. The features in the Basis Pursuit TF plane appear at frequencies that are related to the bearing fault. Figures 6.17- 6.20 illustrate the analysis of the original vibration signal of a normal bearing, including the DWPA, Matching Pursuit, and Basis Pursuit TF planes. They again show that the Basis Pursuit TF plane provides more effective features, which can be interpreted more easily to diagnose faults, than the DWPA and Matching Pursuit TF planes.

In Figures 6.21- 6.24, these three methods are used to diagnose a marginal inner race fault in a rolling bearing. The IRF of the bearing to be diagnosed is 162 Hz. In the Basis Pursuit TF plane, the time differences between neighbouring significant atoms such as A, B, C, and D (marked) are approximately 0.0062 s, which corresponds to the period of the IRF (1/162 Hz). The difference between two frequency components such as E and F also matches the IRF of 162 Hz. Therefore, using Basis Pursuit, the features of the inner race fault can be effectively extracted and the IRF of the bearing can be clearly identified. The outer race fault detection was more straightforward, as seen in Figures 6.25- 6.28, where the DWPA and Matching Pursuit TF planes show promise in displaying the features. However, Basis Pursuit provides the best presentation of features among these three time-frequency methods. The atoms such as A, B, C, and D are distributed with time intervals of around 0.0093 s, which matches the ORF of 107 Hz. The difference between two frequency components also clearly matches the ORF of 107 Hz.
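The check described above, comparing the spacing of significant atoms with the period of a defect frequency, can be sketched as follows. The atom times are hypothetical values chosen to be about 0.0062 s apart, the 5% tolerance is an assumption, and `spacing_matches` is an illustrative helper, not part of the software used in this work.

```python
from statistics import median


def spacing_matches(atom_times_s, defect_freq_hz, rel_tol=0.05):
    """Compare the typical spacing of atoms with a defect period."""
    # Intervals between neighbouring atoms along the time axis.
    gaps = [b - a for a, b in zip(atom_times_s, atom_times_s[1:])]
    typical_gap = median(gaps)
    expected = 1.0 / defect_freq_hz  # defect period in seconds
    matches = abs(typical_gap - expected) <= rel_tol * expected
    return matches, typical_gap


# Hypothetical atom times (s), spaced about 0.0062 s apart.
atoms = [0.0100, 0.0162, 0.0224, 0.0286]
ok_irf, gap = spacing_matches(atoms, 162.0)  # IRF quoted in the text
ok_orf, _ = spacing_matches(atoms, 107.0)    # ORF quoted in the text
print(ok_irf, ok_orf)
```

With these values the spacing is consistent with the 162 Hz inner race frequency and not with the 107 Hz outer race frequency.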

6.2.3.2 Sparsity

The objective of finding an effective feature extraction technique is to obtain the

sparsest and most representative features possible. Basis Pursuit provides the fewest

significant atoms and has the best sparsity. Matching Pursuit and DWPA produce

more irrelevant atoms for the decomposition of vibration signals from rolling

bearings.

6.2.3.3 Computational complexity

DWPA is the fastest of these three algorithms. Matching Pursuit is faster than Basis Pursuit but slower than DWPA. Basis Pursuit requires around five times the computation time of Matching Pursuit to analyse the same length of data. This disadvantage of Basis Pursuit is offset by the short length of data needed in its application.
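To indicate where the cost of Matching Pursuit arises, the following is a minimal sketch of the generic greedy iteration: every iteration correlates the residual with all atoms in the dictionary and removes the best match. The small dictionary here (four unit impulses and one constant atom) is purely illustrative and is not the wavelet packet dictionary used in this work. Atoms are assumed to have unit norm.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, dictionary, n_iter):
    """Greedy decomposition over unit-norm atoms."""
    residual = list(signal)
    picks = []  # (atom index, coefficient) in order of selection
    for _ in range(n_iter):
        # Correlate the residual with every atom: the dominant cost.
        corr = [dot(residual, atom) for atom in dictionary]
        k = max(range(len(dictionary)), key=lambda i: abs(corr[i]))
        picks.append((k, corr[k]))
        # Remove the selected atom's contribution from the residual.
        residual = [r - corr[k] * a for r, a in zip(residual, dictionary[k])]
    return picks, residual


# Illustrative dictionary: four unit impulses plus one constant atom.
dictionary = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]
signal = [2.0, 2.0, 5.0, 2.0]
picks, residual = matching_pursuit(signal, dictionary, n_iter=2)
print(picks)
```

Basis Pursuit replaces this greedy selection with a global optimisation over all coefficients, which is consistent with its longer computation time.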

6.2.3.4 Superresolution

Matching Pursuit and Basis Pursuit achieve better resolution than best basis DWPA. The improvement in the performance of Matching Pursuit and Basis Pursuit in diagnosing bearing faults is due to employing a wavelet packet dictionary, which brings in more wavelet atoms. There are up to (N + 1) log2(N) atoms to be chosen for the decomposition of a signal, where N is the length of the signal. The optimisation rule in Basis Pursuit makes the Basis Pursuit analysis more accurate.

6.2.4 Severity of Bearing Faults Analysed Using Basis Pursuit

The Basis Pursuit technique was evaluated using faults of varying severity in the test bearings. The effect of fault severity on the time-frequency characteristics of the vibration of faulty bearings is presented in Figures 6.29- 6.37. The signals in Figures 6.29- 6.32 were collected from bearings with different severities of Electro-Discharge Machining (EDM) inner race faults (shown in Table 6.1). In Figure 6.33, the signal was collected from an SKF 6204 bearing with a crack in its inner race. In Figures 6.34- 6.36, the signals were collected from bearings with different severities of outer race faults (shown in Table 6.1). In Figure 6.37, the signal was collected from a bearing with a crack in its outer race. Each figure presents the original vibration signal of the faulty bearing and its Basis Pursuit TF features. For each fault type, the severity of the bearing fault increases through the sequence of figures.

The ORF and IRF of bearings SKF 6205 are 107 Hz and 162 Hz respectively. The

ORF and IRF of SKF6204 bearings are approximately 60 Hz and 100 Hz given the

different rotational shaft speed. These defect frequencies can be identified either

from time intervals or frequency intervals of features.

Generally, the time intervals of features matched the defect frequencies when the

fault was marginal. As faults developed, the frequency intervals of features (which

match defect frequencies) became more discernible. Both time intervals and

frequency intervals are shown clearly (Figures 6.29-6.37) when faults grew more

severe.


Figure 6.29: The time waveform and its Basis Pursuit TF plane obtained from the bearing with 0.007 inch EDM IRF [axes: Time (s), Acceleration (m/s2); Time (s), Frequency (Hz)]

Figure 6.30: The time waveform and its Basis Pursuit TF plane obtained from the bearing with 0.014 inch EDM IRF

Figure 6.31: The time waveform and its Basis Pursuit TF plane obtained from the bearing with 0.021 inch EDM IRF

Figure 6.32: The time waveform and its Basis Pursuit TF plane obtained from the bearing with 0.028 inch EDM IRF

Figure 6.33: The time waveform and its Basis Pursuit TF plane obtained from the bearing with a crack in inner race

Figure 6.34: The time waveform and its Basis Pursuit TF plane of the bearing with ORF: 0.007 inch

Figure 6.35: The time waveform and its Basis Pursuit TF plane of the bearing with ORF: 0.014 inch

Figure 6.36: The time waveform and its Basis Pursuit TF plane of the bearing with ORF: 0.021 inch

Figure 6.37: The time waveform and its Basis Pursuit TF plane of the bearing with ORF: crack

6.2.5 Basis Pursuit Denoising

The preceding sections have shown that Basis Pursuit can be used to extract vibration

features of a faulty bearing efficiently. The atoms in a Basis Pursuit TF plane can be

used to reconstruct the signal for a more accurate representation of the fault as shown

in Figure 6.38. Figure 6.38 (a), (c), and (e) show the time waveforms from a normal

bearing, a bearing with inner race fault, a bearing with outer race fault. Figure 6.38

(b), (d), and (f) show signals whose noise has been significantly removed by using

Basis Pursuit. It appears after the denoising application that:

(1) there are hardly any impacts detectable in Figure 6.38 (b), the reconstructed

signals of normal bearings;

(2) regular impacts are clearly shown such as spikes A and B in Figure 6.38 (d) (the

reconstructed signals of the inner race faulty bearing), and spikes A and B in Figure

6.38 (f) (the reconstructed signals of the outer race faulty bearing) ;

(3) the measured impacts due to inner race faults that were merged in noise are now

visible.


These differences are quantified as signal to noise ratios (SNR) for both the original and the denoised signals, as shown in Table 6.2. The ratios were calculated in both the time and frequency domains and are expressed in decibels. In general, the Basis Pursuit denoised signals achieved SNR improvements of about 40 dB, which corresponds to a factor of about 10,000 in the signal to noise power ratio.

Table 6.1 Fault severity specifications in Figures 6.29- 6.37

Fault   0.1778 mm      0.3556 mm      0.5334 mm      0.7112 mm      Wire-cut crack
        (0.007 inch)   (0.014 inch)   (0.021 inch)   (0.028 inch)
IRF     Figure 6.29    Figure 6.30    Figure 6.31    Figure 6.32    Figure 6.33
ORF     Figure 6.34    Figure 6.35    Figure 6.36    -              Figure 6.37

Figure 6.38: The time waveforms and the Basis Pursuit denoised signals: (a), (b) normal; (c), (d) with IRF; (e), (f) with ORF [axes: Time (s), Amplitude; spikes A and B marked in (d) and (f)]

Table 6.2 Signal to Noise Ratio of the Original and the BP Denoised signals

Signals of bearings   Domain      SNR of Time       SNR of BP denoised
with faults                       Waveforms (dB)    signals (dB)
IRF                   Time        21.5              61.7
IRF                   Frequency   29.0              69.8
ORF                   Time        24.8              66.0
ORF                   Frequency   29.6              72.1
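The improvement of about 40 dB can be read directly from Table 6.2 as the difference between the two SNR columns. The short sketch below performs this arithmetic on the tabulated values only; how the SNR itself was estimated from the signals is not reproduced here, and the variable names are illustrative.

```python
# SNR values from Table 6.2: (original, Basis Pursuit denoised), in dB.
table_6_2 = {
    ("IRF", "time"): (21.5, 61.7),
    ("IRF", "frequency"): (29.0, 69.8),
    ("ORF", "time"): (24.8, 66.0),
    ("ORF", "frequency"): (29.6, 72.1),
}

# Improvement in dB is the difference of the two logarithmic ratios.
improvement_db = {
    key: round(denoised - original, 1)
    for key, (original, denoised) in table_6_2.items()
}

for key, gain in improvement_db.items():
    # Equivalent factor on the signal to noise power ratio.
    power_factor = 10.0 ** (gain / 10.0)
    print(key, gain, f"{power_factor:.0f}x")
```

The four improvements lie between 40.2 dB and 42.5 dB.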

6.3 Automatic fault diagnosis

The results of the evaluation of the various automatic fault diagnosis schemes are presented in the following sections.

6.3.1 Spectrum Based Automatic Fault Diagnosis

Data collected from a healthy bearing and from bearings with IRF, ORF, and REF were analysed using the spectrum and the spectrogram. An averaging procedure was used to reduce the dimension of the features and so mitigate the curse of dimensionality. Feature vectors with 16 elements were used for ANN classification.
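One simple way to obtain a 16-element feature vector from a spectrum is to average the spectral magnitudes over 16 equal-width frequency bands. The sketch below illustrates that idea only; the exact averaging procedure used in this work may differ, the input spectrum is hypothetical, and `band_average` is an illustrative name.

```python
def band_average(spectrum, n_bands=16):
    """Average spectral magnitudes over equal-width frequency bands."""
    n = len(spectrum)
    if n < n_bands:
        raise ValueError("spectrum is shorter than the number of bands")
    features = []
    for b in range(n_bands):
        # Integer band edges so that every bin belongs to exactly one band.
        start = b * n // n_bands
        stop = (b + 1) * n // n_bands
        band = spectrum[start:stop]
        features.append(sum(band) / len(band))
    return features


# Hypothetical magnitude spectrum with 256 bins.
spectrum = [float(k % 8) for k in range(256)]
features = band_average(spectrum)
print(len(features), features[0])
```

The resulting vector has a fixed length regardless of the number of spectral bins, which keeps the number of network inputs small.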

Four data sets obtained from the bearings under different conditions were analysed using the spectrum and the spectrogram and then averaged; the results are shown in Figures 6.43- 6.51. The feature vectors of bearings under different conditions can be distinguished from each other using the averaged spectrum and spectrogram.

The NNs were trained and tested based on the above features. Training procedures were stopped after 300 epochs. The classification rate and estimation error are shown in Table 6.3. The classification performance of the NNs based on spectrum averaged features and spectrogram averaged features is satisfactory. This is consistent with the distinguishable features extracted using these two techniques (as shown in Figures 6.43- 6.51). The classification rate is low when features are extracted by averaging wavelet packets, which is consistent with the observation that these feature vectors are similar for bearings in different conditions and are therefore difficult to distinguish from each other.

Frequency features and time-frequency features can be successfully combined with

Neural Networks to classify different bearing conditions.


Figure 6.39: Spectrum of the signal of a normal bearing

Figure 6.40: Spectrum of the signal of a bearing with IRF

Figure 6.41: Spectrum of the signal of a bearing with ORF

Figure 6.42: Spectrum of the signal of a bearing with REF

(Figures 6.39-6.42: horizontal axis Frequency (Hz), vertical axis Acceleration (m/s2).)

Figure 6.43: Feature vectors based on Spectrum of the signal of a normal bearing

Figure 6.44: Feature vector based on Spectrum of the signal of a bearing with IRF

Figure 6.45: Feature vector based on spectrum of the signal of a bearing with ORF

Figure 6.46: Feature vector based on Spectrum of the signal of a bearing with REF

(Figures 6.43-6.46: horizontal axis Frequency (Hz), vertical axis Acceleration (m/s2).)

Figure 6.47: Spectrogram of the signal of a normal bearing

Figure 6.48: Spectrogram of the signal of a bearing with IRF

Figure 6.49: Spectrogram of the signal of a bearing with ORF

Figure 6.50: Spectrogram of the signal of a bearing with REF

(Figures 6.47-6.50: axes Time (s) and Frequency (Hz).)

Figure 6.51: Feature vector based on spectrogram of the signal of a normal bearing

Figure 6.52: Spectrogram feature of the signal of a bearing with IRF

Figure 6.53: Spectrogram feature of the signal of a bearing with ORF

Figure 6.54: Spectrogram feature of the signal of a bearing with REF

(Figures 6.51-6.54: horizontal axis The No. of a frequency band, vertical axis Acceleration (m/s2).)

Table 6.3 Classification performance of automatic diagnosis based on Spectra and Spectrogram

Feature name                Fault type   Training sets   Test sets   Correct classification   Misclassification   Classification rate   MSE
Mean based on Spectrum      IRF          80              40          40                       0                   100%                  0.00012
Mean based on Spectrum      ORF          80              40          40                       0                   100%                  0.00040
Mean based on Spectrum      REF          80              40          40                       0                   100%                  0.00025
Mean based on Spectrum      Normal       80              40          40                       0                   100%                  0.00018
Mean based on spectrogram   IRF          80              40          40                       0                   100%                  0.00023
Mean based on spectrogram   ORF          80              40          40                       0                   100%                  0.00037
Mean based on spectrogram   REF          80              40          40                       0                   100%                  0.00014
Mean based on spectrogram   Normal       80              40          40                       0                   100%                  0.00023

6.3.2 DWPA Feature Based Automatic Fault Diagnosis

An example of the time waveforms collected and analysed using DWPA to level three with the symlet function is shown in Figures 6.55-6.58. The vertical axis represents the decomposition level of DWPA, which ranges from one to six. The horizontal axis includes the wavelet packets in each decomposition level. Each packet is framed by vertical dashed lines and horizontal lines and presented as a time series signal (filtered within the frequency band of the wavelet packet). The figures show that the four signals were filtered and that, after DWPA, the energy of each signal was localised in narrow frequency bands. These high energy frequency bands contain defect information, so parameters derived from them can be used as features. For the signal obtained from a normal bearing, the energy is concentrated in the frequency bands of the first and second wavelet packets (see Figure 6.55). For the signals of the bearings with IRF, ORF, and REF, the energy is concentrated in the frequency bands of the fourth and fifth wavelet packets, with high vibration levels (see Figures 6.56-6.58).

Features such as Mean Value, Variance, Energy, Skewness, Kurtosis, Root Mean

Square (RMS), Crest factor, and Matched filter were further derived from the

reconstructed time series signals of the DWPA wavelet packets. These features were

used to represent the time frequency characteristic of the vibration signals.
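As an illustration, the statistical parameters listed above can be computed for one reconstructed wavelet packet signal as follows. This is a sketch under stated assumptions: it uses the textbook population definitions (non-excess kurtosis, energy as the sum of squares), which may differ in normalisation from those used in the thesis, and it omits the Matched Filter because its definition is not given in this section. The name `packet_features` is hypothetical.

```python
import math

def packet_features(x):
    """Statistical features of one reconstructed wavelet packet signal.

    Assumed textbook definitions; normalisation may differ from the thesis.
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n        # population variance
    energy = sum(v * v for v in x)                   # sum of squares
    rms = math.sqrt(energy / n)
    peak = max(abs(v) for v in x)
    std = math.sqrt(var)
    skew = (sum((v - mean) ** 3 for v in x) / n) / std ** 3 if std > 0 else 0.0
    kurt = (sum((v - mean) ** 4 for v in x) / n) / var ** 2 if var > 0 else 0.0
    crest = peak / rms if rms > 0 else 0.0
    return {"mean": mean, "variance": var, "energy": energy, "skewness": skew,
            "kurtosis": kurt, "rms": rms, "crest_factor": crest}

# Hypothetical packet signal: a symmetric square wave.
f = packet_features([1.0, -1.0, 1.0, -1.0])
print(f["rms"], f["kurtosis"], f["crest_factor"])  # 1.0 1.0 1.0
```

Applying such a function to each of the eight level-three packets would give one eight-element feature vector per parameter, matching the horizontal axes of Figures 6.59-6.66.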

The features extracted from the data sets of the bearings are shown in Figures 6.59-6.66. As Figure 6.59 shows, the mean value derived from wavelet packets provides little value as a predictor for fault classification.



As shown in Figures 6.60-6.64, the parameters Variance, Energy value, and RMS are successful in picking up high energy vibration frequency bands, which are often induced by defects in rolling element bearings. The defects led to significant increases of these parameters in these graphs. The feature graph of each signal can be easily distinguished, and even the feature trends of the bearings are clearly identifiable. These features provide accurate classification rates, as presented in Table 6.4 and Table 6.5.

Skewness and Kurtosis are high order statistical parameters, which are accepted as efficient parameters in analysing time series of vibration signals. The Crest Factor also has the potential to be an effective feature in time series classification. Figures 6.61-6.66 show the trends of these three parameters. It can be observed that the feature graphs of the bearing signals are distinct from each other. Kurtosis is shown to be effective in the automatic classification of the bearing conditions (as shown in Table 6.4 and Table 6.5). Skewness can be used to reach acceptable classification rates provided the Neural Network classifiers are well designed.

The Matched Filter is a parameter which extracts frequency features. The Matched Filter derived from wavelet packets aims at capturing the frequency characteristics in each wavelet packet frequency band. As shown in Figure 6.66, the Matched Filter is also able to capture defect-induced high energy vibrations, which are presented as high amplitude peaks. The Matched Filter feature graphs of bearings under the four conditions are almost as distinctive as those obtained from the RMS, Energy Value, and Variance parameters.

The FFNN classifiers were evaluated by the input of the above mentioned features.

Training procedures were halted after 300 epochs. The classification rate and

estimation error of the single output Neural Network and the multi output Neural

Network are shown in Table 6.4 and Table 6.5, respectively. The correct

classification rate of the single output Neural Network classifier using Mean Value

as the feature and input ranged from 37.5% for REF to 72.5% for Healthy bearings.

A hundred percent classification rate was achieved using RMS as feature-inputs to

the multi-output Neural Network for all the bearing conditions.
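The two performance measures reported in Tables 6.4 and 6.5 can be expressed compactly. The sketch below assumes that the classification rate is the percentage of correctly classified test sets and that the MSE is the mean squared difference between target and network outputs; the function names are hypothetical.

```python
def classification_rate(correct, misclassified):
    """Percentage of test sets classified correctly."""
    total = correct + misclassified
    if total == 0:
        raise ValueError("no test sets")
    return 100.0 * correct / total

def mean_square_error(targets, outputs):
    """Mean squared difference between target and network outputs."""
    if len(targets) != len(outputs):
        raise ValueError("length mismatch")
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)

# 27 correct and 13 misclassified out of 40 test sets.
print(classification_rate(27, 13))  # 67.5
```

With 40 test sets per condition, each misclassified set changes the rate by 2.5 percentage points.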


Figure 6.55: The DWPA of the signal of a normal bearing

Figure 6.56: The DWPA of the signal of a bearing with IRF

Figure 6.57: The DWPA of the signal of a bearing with ORF

Figure 6.58: The DWPA of the signal of a bearing with REF

(Figures 6.55-6.58: horizontal axis Frequency[Time], vertical axis Split Level.)

Figure 6.59: Features based on wavelet packets: Mean Value

Figure 6.60: Features based on wavelet packets: Variance

Figure 6.61: Features based on wavelet packets: Skewness

Figure 6.62: Features based on wavelet packets: Kurtosis

Figure 6.63: Features based on wavelet packets: Energy

Figure 6.64: Features based on wavelet packets: Root Mean Square

Figure 6.65: Features based on wavelet packets: Crest Factor

Figure 6.66: Features based on wavelet packets: Matched Filter

(Figures 6.59-6.66: horizontal axis The No. of a Wavelet Packet (1 to 8), vertical axis the named feature; legend REF, IRF, ORF, Normal.)

Table 6.4 Performance of the single output FFNN using DWPA features

Feature name     Fault type   Training sets   Test sets   Correct classification   Misclassification   Classification rate   MSE
Mean             IRF          80              40          27                       13                  67.5%                 0.04254
Mean             ORF          80              40          17                       23                  37.5%                 0.04186
Mean             REF          80              40          25                       15                  62.5%                 0.08220
Mean             Normal       80              40          30                       10                  72.5%                 0.06997
Variance         IRF          80              40          40                       0                   100%                  0.00027
Variance         ORF          80              40          39                       1                   97.5%                 0.00013
Variance         REF          80              40          40                       0                   100%                  0.00007
Variance         Normal       80              40          40                       0                   100%                  0.00002
Energy           IRF          80              40          39                       0                   100%                  0.00002
Energy           ORF          80              40          40                       0                   97.5%                 0.00001
Energy           REF          80              40          36                       4                   90%                   0.00004
Energy           Normal       80              40          40                       0                   100%                  0.00001
Skewness         IRF          80              40          25                       15                  62.5%                 0.03795
Skewness         ORF          80              40          27                       13                  67.5%                 0.04283
Skewness         REF          80              40          17                       23                  42.5%                 0.04376
Skewness         Normal       80              40          40                       14                  65%                   0.33935
Kurtosis         IRF          80              40          38                       2                   95%                   0.02693
Kurtosis         ORF          80              40          36                       4                   90%                   0.03392
Kurtosis         REF          80              40          35                       5                   87.5%                 0.03249
Kurtosis         Normal       80              40          39                       1                   97.5%                 0.00188
RMS              IRF          80              40          39                       1                   97.5%                 0.00001
RMS              ORF          80              40          39                       1                   97.5%                 0.00009
RMS              REF          80              40          38                       2                   95%                   0.00008
RMS              Normal       80              40          40                       0                   100%                  0.00009
Crest Factor     IRF          80              40          36                       4                   90%                   0.09938
Crest Factor     ORF          80              40          38                       2                   95%                   0.15510
Crest Factor     REF          80              40          21                       19                  52.5%                 0.08706
Crest Factor     Normal       80              40          39                       1                   97.5%                 0.06382
Matched Filter   IRF          80              40          29                       11                  72.5%                 0.03544
Matched Filter   ORF          80              40          39                       1                   97.5%                 0.03311
Matched Filter   REF          80              40          39                       1                   97.5%                 0.01245
Matched Filter   Normal       80              40          40                       0                   100%                  0.02567



Table 6.5 Performance of the multi output FFNN using DWPA features

Feature name     Fault type   Training sets   Test sets   Correct classification   Misclassification   Classification rate   MSE
Mean             IRF          80              40          31                       9                   77.5%                 0.03071
Mean             ORF          80              40          40                       0                   100%                  0.02867
Mean             REF          80              40          24                       16                  60%                   0.03621
Mean             Normal       80              40          40                       0                   85%                   0.02355
Variance         IRF          80              40          40                       0                   100%                  0.00029
Variance         ORF          80              40          39                       1                   97.5%                 0.00020
Variance         REF          80              40          38                       2                   95%                   0.00035
Variance         Normal       80              40          40                       0                   100%                  0.00033
Energy           IRF          80              40          40                       0                   100%                  0.00022
Energy           ORF          80              40          40                       0                   100%                  0.00013
Energy           REF          80              40          40                       0                   100%                  0.00010
Energy           Normal       80              40          40                       0                   100%                  0.92000
Skewness         IRF          80              40          25                       15                  62.5%                 0.03121
Skewness         ORF          80              40          30                       10                  75%                   0.05035
Skewness         REF          80              40          33                       7                   82.5%                 0.03737
Skewness         Normal       80              40          37                       3                   92.5%                 0.06933
Kurtosis         IRF          80              40          34                       6                   85%                   0.00996
Kurtosis         ORF          80              40          40                       0                   100%                  0.01326
Kurtosis         REF          80              40          35                       5                   87.5%                 0.00531
Kurtosis         Normal       80              40          39                       1                   97.5%                 0.00738
RMS              IRF          80              40          40                       0                   100%                  0.00025
RMS              ORF          80              40          40                       0                   100%                  0.00014
RMS              REF          80              40          40                       0                   100%                  0.00012
RMS              Normal       80              40          40                       0                   100%                  0.00028
Crest Factor     IRF          80              40          39                       1                   97.5%                 0.02227
Crest Factor     ORF          80              40          40                       0                   100%                  0.01573
Crest Factor     REF          80              40          33                       7                   82.5%                 0.02476
Crest Factor     Normal       80              40          35                       5                   87.5%                 0.02019
Matched Filter   IRF          80              40          35                       5                   87.5%                 0.01746
Matched Filter   ORF          80              40          39                       1                   97.5%                 0.01708
Matched Filter   REF          80              40          39                       1                   97.5%                 0.02501
Matched Filter   Normal       80              40          40                       0                   100%                  0.07307

6.3.3 Matching Pursuit Feature Based Automatic Fault Diagnosis

Data collected from a Healthy bearing and from bearings with IRF, ORF, and REF were analysed using Matching Pursuit with four iterations. The calculation to this iteration level was efficient and precise enough for accurate FFNN classification. Figures 6.67-6.70 show typical data that were collected from bearings in Healthy condition and from bearings with IRF, ORF, and REF, respectively.
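For orientation, the greedy Matching Pursuit iteration can be sketched as below. This is a generic illustration, not the implementation used in this work: it assumes a dictionary supplied as a list of unit-norm atoms (here a trivial three-atom dictionary rather than a symlet wavelet packet dictionary), and the name `matching_pursuit` is hypothetical.

```python
def matching_pursuit(signal, dictionary, n_iter=4):
    """Greedy Matching Pursuit over a list of unit-norm atoms.

    At each iteration the atom with the largest absolute inner product
    with the residual is selected and its contribution is subtracted.
    """
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    for _ in range(n_iter):
        products = [sum(r * a for r, a in zip(residual, atom)) for atom in dictionary]
        k = max(range(len(products)), key=lambda i: abs(products[i]))
        coeffs[k] += products[k]
        residual = [r - products[k] * a for r, a in zip(residual, dictionary[k])]
    return coeffs, residual

# Hypothetical toy dictionary: the standard basis of a 3-sample signal.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coeffs, residual = matching_pursuit([3.0, 0.0, -1.0], atoms, n_iter=4)
print(coeffs)    # [3.0, 0.0, -1.0]
print(residual)  # [0.0, 0.0, 0.0]
```

With an overcomplete dictionary the residual generally does not vanish after four iterations, which is consistent with the coarse, "blocky" time frequency maps described in this section.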



The time series signal of a bearing under normal condition appears similar to the signal of a bearing under REF condition. It is also difficult to distinguish the ORF condition from the IRF condition by observing the time series signals. The four signals in Figures 6.67-6.70 were further analysed using Matching Pursuit with four iterations and a wavelet packet dictionary with a symlet function, as shown in Figure 6.71. The signals were decomposed to the fourth iteration and then presented as time frequency maps with relatively coarse resolution. The high energy time frequency components appear "blocky". Although these decompositions were not accurate enough for direct interpretation, the features were identifiable from the time frequency analysis of bearing signals under Healthy, IRF, ORF, and REF conditions. Note that the data were filtered by the wavelet function in Matching Pursuit analysis, with the energy being localised in narrow frequency bands. In Figure 6.71, the high energy of the signals was concentrated primarily in certain frequency bands for the different conditions:

• 0-2 kHz band for the Healthy condition (see Figure 6.67),

• 1-5 kHz band for the Inner Race Fault (see Figure 6.68),

• 3-4 kHz band for the Outer Race Fault (see Figure 6.69),

• Under 2 kHz, and 2-4 kHz for the Rolling Element Fault (see Figure 6.70).

An added observation in the above mentioned four figures is the variations in time

intervals for the different frequency bands of the different signals. These time

variations further enable one to differentiate the faults.

The Matching Pursuit coefficients of the signals of bearings under different conditions are shown in Figure 6.71 (a)-(d). These graphs correspond to the time frequency maps in Figures 6.67-6.70, respectively. It can be seen that the Matching Pursuit coefficients of signals for the different conditions are clearly distinguishable, with the most activity in Figure 6.71 (c), the signal for the inner race fault.

Feature vectors were further derived from the above Matching Pursuit coefficients and were formed by selecting the maximum values among these coefficients (as shown in Figure 6.72 (a)-(d)). These feature vectors for the different conditions appear:

• Flat for the Healthy condition (see Figure 6.72 (a)),

• With several peaks for the Inner Race Fault, Outer Race Fault, and Rolling Element Fault (see Figure 6.72 (b)-(d)).

These derived feature vectors can be used to classify the different bearing conditions. The FFNN was tested using the above derived features. In total, 120 data sets of each bearing condition were analysed in the training and testing of the proposed methodology. Training ceased after 300 epochs. The classification rate and estimation error are shown in Table 6.6. The number of maximum values used as the dimension of the input feature vectors was 16, 32, 64, and 128, respectively. The classification rate ranged from 70% (outer race fault condition) to 97.5% (inner race fault condition) when using the 16 maximum values of the Matching Pursuit coefficients as the inputs. It ranged from 55% (rolling element fault condition) to 97.5% (inner race fault condition) when using the 128 maximum values as the inputs. Using a larger number of maximum values did not appear to increase the classification accuracy.
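The construction of a fixed-length feature vector from the pursuit coefficients can be sketched as follows. The sketch assumes that "maximum values" means the K coefficients of largest magnitude, returned in descending order; whether the thesis ranks by magnitude or by signed value, and in which order the values are presented to the network, is not stated in this section. The name `top_k_features` is hypothetical.

```python
def top_k_features(coefficients, k=16):
    """Return the k largest coefficient magnitudes in descending order.

    Assumption: ranking by absolute value; the thesis may rank differently.
    """
    if len(coefficients) < k:
        raise ValueError("fewer coefficients than requested features")
    magnitudes = sorted((abs(c) for c in coefficients), reverse=True)
    return magnitudes[:k]

# Hypothetical coefficients; keep the two largest magnitudes.
print(top_k_features([0.1, -5.0, 3.0, 0.2], k=2))  # [5.0, 3.0]
```

Under this reading, a flat feature vector indicates that no coefficient dominates, whereas a few large leading entries indicate concentrated high energy components, as seen for the faulty bearings.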

Figure 6.67: The Matching Pursuit of the vibration signal of bearing under condition: Normal

Figure 6.68: The Matching Pursuit of the vibration signal of bearing under condition: ORF

Figure 6.69: The Matching Pursuit of the vibration signal of bearing under condition: IRF

Figure 6.70: The Matching Pursuit of the vibration signal of bearing under condition: REF

(Figures 6.67-6.70: horizontal axis Time (s), vertical axis Frequency (Hz).)

Figure 6.71: The Matching Pursuit (MP) coefficients of vibration signals of bearings under conditions: (a) Normal (b) ORF (c) IRF (d) REF

(Figure 6.71: horizontal axis The No. of MP coefficients, vertical axis Amplitude (m/s2).)

Figure 6.72: The Matching Pursuit Features of Vibration Signals of bearings under conditions: (a) Normal (b) ORF (c) IRF (d) REF

(Figure 6.72: horizontal axis The No. of feature vector, vertical axis Amplitude (m/s2).)

Table 6.6 Classification performance of different procedures using Matching Pursuit

Features            Fault type   Training sets   Test sets   Correct classification   Misclassification   Classification rate   MSE
Maximum 16 value    IRF          80              40          39                       1                   97.5%                 0.01657
Maximum 16 value    ORF          80              40          28                       12                  70%                   0.01626
Maximum 16 value    REF          80              40          33                       7                   82.5%                 0.01835
Maximum 16 value    Normal       80              40          31                       9                   77.5%                 0.02855
Maximum 32 value    IRF          80              40          24                       16                  60%                   0.00962
Maximum 32 value    ORF          80              40          36                       4                   90%                   0.00887
Maximum 32 value    REF          80              40          26                       14                  65%                   0.01043
Maximum 32 value    Normal       80              40          35                       5                   87.5%                 0.00981
Maximum 64 value    IRF          80              40          24                       16                  60%                   0.00012
Maximum 64 value    ORF          80              40          35                       5                   87.5%                 0.00219
Maximum 64 value    REF          80              40          32                       8                   80%                   0.00033
Maximum 64 value    Normal       80              40          35                       5                   87.5%                 0.00224
Maximum 128 value   IRF          80              40          39                       1                   97.5%                 0.00012
Maximum 128 value   ORF          80              40          39                       1                   97.5%                 0.00013
Maximum 128 value   REF          80              40          22                       18                  55%                   0.00065
Maximum 128 value   Normal       80              40          35                       5                   87.5%                 0.00019


6.3.4 Basis Pursuit Feature Based Automatic Fault Diagnosis

Case studies were conducted using data collected from healthy bearings, as well as

bearings with IRF, ORF and REF to test the proposed intelligent diagnostic

technique. The data were analysed using Basis Pursuit for FFNN classification.

Figures 6.73-6.76 illustrate an example of four data sets, which were analysed using Basis Pursuit with a symlet wavelet packet dictionary but at a low level of decomposition. In this study, decomposition to level 4 was found to be sufficient for accurate feature extraction, as discussed in Chapter 4, Section 4.4. In the time-frequency maps, the colours range from white to black, with shades of grey representing the amplitude values of the time-frequency components. The signals were decomposed to the fourth iteration and then presented as time-frequency maps with relatively coarse resolution. The high energy time-frequency components appear clearly in the time-frequency maps. Although these decompositions were not accurate enough for direct interpretation, the time-frequency features of bearing signals under Healthy, IRF, ORF and REF conditions were clearly identifiable. It can be noted that the data were filtered, and the resultant energy was localised in narrow frequency bands after Basis Pursuit analysis. Referring to these four figures, it can be seen that the high energy of the signals was concentrated primarily in certain frequency bands, which vary with the condition:

• 0-3 kHz band for Healthy condition (see Figure 6.73),

• 2-5 kHz band for Inner Race Fault (see Figure 6.74),

• 3-4 kHz band for Outer Race Fault (Figure 6.75),

• 3-4 kHz for Rolling Element Fault (Figure 6.76).

The above mentioned four figures provide additional information relating to the

difference of the amplitude values of high energy time-frequency components and

the variations in time intervals for the different frequency bands of the different

signals. These time variations further enable one to differentiate the faults.

Compared with Matching Pursuit analysis (as shown in Figures 6.67-6.70), Basis Pursuit provides more accurate information on fault related features. The high energy components from Basis Pursuit analysis in Figures 6.73-6.76 are both more focused and more distinguishable than those from Matching Pursuit analysis in Figure 6.71.

Figure 6.77 shows the Basis Pursuit coefficients corresponding to the time-frequency maps in Figure 6.73, Figure 6.74, Figure 6.75, and Figure 6.76, respectively. It can be seen that the Basis Pursuit coefficients of signals under different conditions are clearly distinguishable, with the most activity in Figure 6.77 (c), the signal for the inner race fault.

Feature vectors were further derived from the above-mentioned Basis Pursuit

coefficients and formed by selecting the maximum values among these coefficients

(as shown in Figure 6.78 (a)-(d)). Results indicate that the feature vectors of the

bearing signals under different conditions:

• appear flat in the Healthy condition (refer to Figure 6.78 (a)); and

• have some peak values for Inner Race Fault, Outer Race Fault and Rolling

Element Fault (see Figure 6.78 (b) - (d)).

These derived feature vectors appear distinctive enough for the classification of the bearing conditions. The FFNN was tested based on the above derived features. In total, 120 data sets of each bearing condition were analysed to test the proposed methodology. Training ceased after 300 epochs. The resultant classification rate and estimation error are shown in Table 6.7. The number of maximum values used as the dimension of the input feature vectors was 16, 32, 64 and 128, respectively. The classification rate ranged from 85% for the Rolling Element Fault condition to 100% for the other conditions when using the 16 maximum values of the Basis Pursuit coefficients as the inputs. It ranged from 55% for the Rolling Element Fault condition to 97.5% for the Outer Race Fault condition when using the 128 maximum values as the inputs. These features performed poorly for the classification of bearings under the Rolling Element Fault condition.


Figure 6.73: The Basis Pursuit of the vibration signals of bearings under condition: Normal

Figure 6.74: The Basis Pursuit of the vibration signals of bearings under condition: ORF

Figure 6.75: The Basis Pursuit of the vibration signals of bearings under condition: IRF

Figure 6.76: The Basis Pursuit of the vibration signals of bearings under condition: REF

(Figures 6.73-6.76: horizontal axis Time (s), vertical axis Frequency (Hz).)

Figure 6.77: The Basis Pursuit coefficients of vibration signals of bearings under conditions: (a) Normal (b) ORF (c) IRF (d) REF

(Figure 6.77: horizontal axis The No. of coefficients, vertical axis Amplitude (m/s2).)

Figure 6.78: The Basis Pursuit features of vibration signals of bearings under conditions: (a) Normal (b) ORF (c) IRF (d) REF

(Figure 6.78: horizontal axis The No. of feature vector, vertical axis Amplitude (m/s2).)

Table 6.7 Classification performance of different procedures using Basis Pursuit

Features            Fault type   Training sets   Test sets   Correct classification   Misclassification   Classification rate   MSE
Maximum 16 value    IRF          80              40          40                       0                   100%                  0.00014
Maximum 16 value    ORF          80              40          40                       0                   100%                  0.00043
Maximum 16 value    REF          80              40          34                       6                   85%                   0.00028
Maximum 16 value    Normal       80              40          40                       0                   100%                  0.00005
Maximum 32 value    IRF          80              40          39                       1                   97.5%                 0.00001
Maximum 32 value    ORF          80              40          38                       2                   95%                   0.00001
Maximum 32 value    REF          80              40          37                       3                   92.5%                 0.00007
Maximum 32 value    Normal       80              40          40                       0                   100%                  0.00015
Maximum 64 value    IRF          80              40          28                       12                  70%                   0.00371
Maximum 64 value    ORF          80              40          31                       9                   77.5%                 0.00222
Maximum 64 value    REF          80              40          35                       5                   87.5%                 0.00346
Maximum 64 value    Normal       80              40          40                       0                   100%                  0.00224
Maximum 128 value   IRF          80              40          25                       15                  62.5%                 0.00021
Maximum 128 value   ORF          80              40          37                       3                   97.5%                 0.00037
Maximum 128 value   REF          80              40          25                       15                  55%                   0.00088
Maximum 128 value   Normal       80              40          38                       2                   95%                   0.00007

6.4 Discussion and Conclusion

The study on TF plane representation of vibration signals has shown that the Basis Pursuit technique is able to represent vibration signals with sparsely separated features. Best basis DWPA, on the other hand, produces more irrelevant atoms, which do not contribute to fault features and can lead to inaccurate diagnosis. The Matching Pursuit technique performs better than the best basis DWPA, with improved resolution. However, its performance is judged to be less accurate than that of Basis Pursuit in fault diagnosis. Matching Pursuit and Basis Pursuit outperform DWPA in fault diagnosis due to the use of the wavelet packet dictionary. This dictionary introduces additional wavelet atoms to the analysis. In general, a maximum of $(N+1)\log_2(N)$ atoms are chosen for decomposition of a signal, where $N$ is the length of the signal. This procedure provides more flexibility and accuracy for representing the vibration characteristics of rolling bearings. Furthermore, the global optimisation rule in Basis Pursuit makes it more accurate than Matching Pursuit. Basis Pursuit resolves vibration signals more finely, with very sparse significant atoms during decomposition. As a result, defect frequency component features are efficiently extracted and displayed in the time-frequency plane. An added feature of Basis Pursuit is that the method needs relatively short lengths of data for effective analysis, while traditional diagnostic techniques require larger amounts of vibration time series data.
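For reference, the difference between the greedy rule of Matching Pursuit and the global optimisation rule of Basis Pursuit can be written in the standard form below. The symbols are assumptions introduced here for illustration and may differ from the notation of the earlier chapters: $s$ is the signal of length $N$, $\Phi$ is the dictionary matrix whose columns $\phi_k$ are unit-norm atoms, $\alpha$ is the coefficient vector, $r$ is the Matching Pursuit residual, and $\lambda$ is a regularisation weight.

```latex
% Assumed notation: s signal, \Phi dictionary, \alpha coefficients,
% r residual, \lambda regularisation weight.
\begin{align}
\text{Matching Pursuit (one greedy step):}\quad
  & k^{\ast} = \arg\max_{k}\,\bigl|\langle r, \phi_k\rangle\bigr|,
    \qquad r \leftarrow r - \langle r, \phi_{k^{\ast}}\rangle\,\phi_{k^{\ast}} \\
\text{Basis Pursuit:}\quad
  & \min_{\alpha}\ \|\alpha\|_{1}
    \quad \text{subject to} \quad \Phi\alpha = s \\
\text{Basis Pursuit denoising:}\quad
  & \min_{\alpha}\ \tfrac{1}{2}\,\|s - \Phi\alpha\|_{2}^{2} + \lambda\,\|\alpha\|_{1}
\end{align}
```

Matching Pursuit commits to one atom at a time, whereas Basis Pursuit selects all coefficients jointly by minimising the $\ell_1$ norm, which is the source of both its sparser representation and its longer computation time.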

Basis Pursuit also increases the signal to noise ratio in identifying fault features. It

removes high frequency noise thereby accentuating the defect frequency

components. The limitation of Basis Pursuit is that its computation tends to take

longer in comparison with DWPA and Matching Pursuit using the same length of

data. However, this disadvantage is offset by the fact that Basis Pursuit can use

shorter data lengths than the other techniques.

The results of the NN classifiers using different features also show that the proposed techniques for feature extraction are mostly effective. The features derived from the spectrum, spectrogram, DWPA, Matching Pursuit, and Basis Pursuit allow the rolling element bearing conditions to be identified.

6.5 Summary

This chapter provides a discussion of the research findings obtained from

investigation into the application of three time-frequency analysis techniques – the

improved DWPA, improved Matching Pursuit, and the novel Basis Pursuit.

Comparisons were made regarding a variety of characteristics of these three analysis

techniques. Advantages and disadvantages of these three techniques were described.

This chapter concluded with results obtained from evaluation of the classification

rates of the automatic diagnostic techniques incorporating these three time-frequency

analysis techniques.

CHAPTER 7. CONCLUSION



7.1 Introduction

This thesis has presented novel time-frequency analysis techniques and ANN

classifiers, which were validated on vibration signals of bearings under different

conditions.

This research contributes new knowledge to the feature extraction of vibration

signals of rolling element bearings and the intelligent interpretation of the extracted

features. The major objectives of this research were to improve the reliability of fault

diagnostic techniques and reduce post processing tasks normally conducted by

human experts.

This research focused on wavelet based feature extraction and automation of fault identification using these features. The development of wavelet based vibration analysis improves the time-frequency resolution of defect-related features and helps to identify them.

An overall conclusion of the conducted work is provided in the following sections.

7.2 The Improved DWPA, Matching Pursuit, and Basis Pursuit

The literature demonstrated clearly that research into the further development of the DWPA and Matching Pursuit techniques for fault diagnosis was both needed and justified. In this work, it was shown that these wavelet based methods are capable of accurate feature extraction. Basis Pursuit had not previously found application in the diagnosis of bearing defects. This research has addressed the matter and has presented the first successful application of this technique.

The best basis DWPA can be used to filter high energy frequency bands without

choosing frequency bands. Best basis selection helps DWPA capture relevant

features as well as optimising computation effort.

The Matching Pursuit and Basis Pursuit can produce time frequency maps which

often distinguish defect related components clearly. In particular, Basis Pursuit can

generate better resolution time-frequency maps with the same number of

decomposition levels when compared with the Matching Pursuit and the DWPA.



Basis Pursuit denoising improves the signal to noise ratio of vibration signals

significantly and enables defects to be more discernible in time waveforms.

These three techniques still have limitations when applied to vibration analysis. The wavelet filters do not always identify the useful components in a time frequency map. Some of these components can still present problems to diagnostics personnel in their interpretation. In practice, these techniques can sometimes have limitations in highlighting low frequency components, such as those generated by shaft misalignment or unbalance, because of the high sampling frequencies often used in standard condition monitoring programs.

7.3 Automatic Diagnosis Using Spectrum

Conventionally, the Fourier spectrum is used as the basis for most automatic fault
diagnostics tasks. This research used Fourier spectra initially and subsequently
used the time-frequency spectrogram in developing an automatic fault diagnosis
technique. Frequency and time-frequency feature extraction can be successfully
combined with Neural Networks to classify different bearing conditions.

The spectrum based features and spectrogram features are distinguishable for
vibrations of bearings under different conditions. The designed Feed Forward
Neural Network (FFNN) performed well in classifying faults. Proper design of the
FFNN is important for successful diagnosis: an FFNN with two hidden layers and
four output nodes classified bearing conditions accurately with a low Mean Square
Error (MSE).
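The shape of such a network can be illustrated with a minimal forward-pass sketch. Everything below is an assumption made for illustration: the layer sizes, the tanh and sigmoid activations, the random untrained weights, and the one-hot target are hypothetical and are not the configuration used in this thesis. Only the two hidden layers, the four output nodes and the MSE measure come from the text.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(x, layers):
    """Propagate x through tanh hidden layers and a sigmoid output layer."""
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        is_output = idx == len(layers) - 1
        x = [1.0 / (1.0 + math.exp(-s)) if is_output else math.tanh(s) for s in z]
    return x

def mse(output, target):
    """Mean square error between network output and target vector."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(target)

rng = random.Random(0)
# 8 input features, two hidden layers (10 and 6 nodes, hypothetical), 4 output nodes
sizes = [8, 10, 6, 4]
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

features = [0.1 * i for i in range(8)]   # placeholder feature vector
target = [1.0, 0.0, 0.0, 0.0]            # hypothetical one-hot target, one node per class
output = forward(features, layers)
error = mse(output, target)
```

Training would adjust the weights to reduce `error` over a labelled data set; the sketch only shows how a feature vector maps to four output activations and how the MSE is evaluated.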

7.4 Automatic Diagnosis Using DWPA

The statistical parameters derived from DWPA wavelet packets can be used as

features and fed into Neural Network classifiers to distinguish healthy and faulty

bearings.

The parameters Variance, Skewness, Kurtosis, Energy, RMS, Crest Factor and
Matched Filter generally produced reliable classification using the multi output
FFNN. In particular, Variance, RMS, Energy and Matched Filter provided the best
classification results. The proposed DWPA features for NN classifiers worked well
in diagnosing bearing faults and could be applied to a wide range of both rotating
and reciprocating machinery.
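As a rough sketch of how such parameters might be computed from the coefficients of a single wavelet packet, the following uses the common textbook definitions (population variance, standardised third and fourth moments, sum of squares for energy, peak over RMS for crest factor). These definitions and the function name `packet_features` are assumptions; the exact formulas used in the thesis may differ, and the Matched Filter parameter is omitted because its definition is not given in this section.

```python
import math

def packet_features(coeffs):
    """Statistical features of one sequence of wavelet packet coefficients."""
    n = len(coeffs)
    if n == 0:
        raise ValueError("coeffs must not be empty")
    mean = sum(coeffs) / n
    variance = sum((c - mean) ** 2 for c in coeffs) / n   # population variance
    std = math.sqrt(variance)
    energy = sum(c * c for c in coeffs)                   # sum of squared coefficients
    rms = math.sqrt(energy / n)
    peak = max(abs(c) for c in coeffs)
    if std > 0:
        skewness = sum((c - mean) ** 3 for c in coeffs) / (n * std ** 3)
        kurtosis = sum((c - mean) ** 4 for c in coeffs) / (n * std ** 4)
    else:
        # Standardised moments are undefined for a constant sequence
        skewness = kurtosis = float("nan")
    crest_factor = peak / rms if rms > 0 else float("nan")
    return {
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "energy": energy,
        "rms": rms,
        "crest_factor": crest_factor,
    }
```

Applying the function to each packet of the decomposition and concatenating the results gives one feature vector per vibration record, which could then be passed to a classifier.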

Compared with the spectrum and spectrogram analysis based automatic diagnosis,
features using the mean value derived from DWPA produced a poor classification
rate for bearing faults.

7.5 Automatic Diagnosis Using Matching Pursuit

Matching Pursuit was proposed as a feature extraction technique in the automatic

diagnosis of rolling element bearing faults. Matching Pursuit can effectively extract

features which are subsequently fed to a Feed Forward Neural Network to accurately

classify bearing conditions. In particular, accurate classification was obtained with

16 and 32 maximum values of Matching Pursuit coefficients as features.
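The idea of taking the largest Matching Pursuit coefficients as features can be sketched as follows. This is a generic greedy Matching Pursuit over a small dictionary of unit-norm atoms, not the wavelet packet implementation used in the thesis; the toy dictionary, the function names, and the choice to rank coefficients by absolute value are all assumptions made for illustration.

```python
def matching_pursuit(signal, atoms, n_iter):
    """Greedy Matching Pursuit; atoms are assumed to have unit Euclidean norm."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Inner product of the current residual with every atom
        products = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(products[i]))
        c = products[best]
        if abs(c) < 1e-12:
            break  # residual is orthogonal to the whole dictionary
        coeffs[best] += c
        residual = [r - c * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual

def top_k_features(coeffs, k):
    """The k largest coefficient magnitudes, in descending order."""
    return sorted((abs(c) for c in coeffs), reverse=True)[:k]

# Toy overcomplete dictionary: four unit impulses plus one constant atom
atoms = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]
signal = [2.0, 2.0, 2.0, 5.0]
coeffs, residual = matching_pursuit(signal, atoms, n_iter=3)
features = top_k_features(coeffs, k=2)
```

With k set to 16 or 32 and a wavelet packet dictionary in place of the toy atoms, `features` would play the role of the classifier input described above.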

7.6 Automatic Diagnosis Using Basis Pursuit

Basis Pursuit was shown to be more effective than Matching Pursuit at extracting
features and provided better classification when used with the FFNN classifier for
automatic fault diagnosis. In this study, the FFNN classifier accurately
categorised bearing conditions as Healthy, IRF, ORF or REF. The 16, 32, and 64
maximum values of the Basis Pursuit coefficients, used as features, produced
accurate classification.

CHAPTER 8. FUTURE RESEARCH



8.1 Signal Processing Techniques for Feature Extraction

The time-frequency analysis techniques developed and applied during this
candidature can be further refined for feature extraction from vibration signals
with a view to:

• Reducing computation time;

• Producing time-frequency maps with better resolution; and

• Reconstructing time waveforms with less noise.

Matching Pursuit and Basis Pursuit analyses can be computationally time
consuming. Further work on the algorithm for selecting atoms could improve both
the computation time and the resolution of the extracted features. The algorithm
for solving the linear programming problems could also be made more economical by
drawing on advances in computation theory.
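The link between Basis Pursuit and linear programming can be made explicit with the standard reformulation from the literature. The symbols are assumed notation rather than the notation of this thesis: s is the signal, Φ the dictionary matrix, α the coefficient vector, and u, v the non-negative parts of α.

```latex
\min_{\alpha}\ \lVert \alpha \rVert_1
\quad \text{subject to} \quad \Phi\alpha = s
\qquad\Longleftrightarrow\qquad
\min_{u,\,v}\ \mathbf{1}^{\mathsf{T}}(u + v)
\quad \text{subject to} \quad
\begin{bmatrix} \Phi & -\Phi \end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix} = s,
\quad u \ge 0,\ v \ge 0,
\qquad \alpha = u - v
```

The linear program has twice as many unknowns as there are atoms in the dictionary, which is one reason the cost of the solver dominates the computation for large overcomplete dictionaries.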

The wavelet packet dictionary used in this study contributed to effective bearing
fault detection. The dictionary can be developed further so that its atoms
correlate better with vibration signals in Matching Pursuit and Basis Pursuit
analyses.

8.2 Artificial Intelligence for Diagnosis

Neural Networks were employed in this study of automatic fault diagnosis. This
work can be extended by employing other artificial intelligence techniques for
fault diagnosis. Evolutionary algorithms, genetic algorithms and fuzzy logic,
combined with expert systems, may be applied to a variety of condition monitoring
and fault diagnosis tasks. Evolutionary algorithms can assist in determining the
best features during feature extraction. Fuzzy logic can assist with human
language input and output. Support vector machines, developed in the area of
computational intelligence, may allow more accurate and faster classification of
bearing conditions. Features selected using Matching Pursuit and Basis Pursuit
analysis can be used as inputs to support vector machines, evolutionary
algorithms, genetic algorithms, or fuzzy logic for automatic fault diagnosis.



A variety of Neural Networks can be developed and applied in intelligent fault
diagnosis. Candidates include Self-Organising Maps, Recurrent Neural Networks,
and Probabilistic Neural Networks, which can be used as classifiers based on the
features extracted using DWPA, Matching Pursuit, and Basis Pursuit analysis.
Feature selection based on DWPA, Matching Pursuit, and Basis Pursuit can be
investigated further and incorporated into these Neural Networks to achieve fault
diagnosis of rolling element bearings.

8.3 Incipient Fault Detection using Time-Frequency Analysis Techniques

This work has been concerned with the development of the overall diagnostics
technique and has not focussed specifically on incipient faults. Further work may
investigate the sensitivity of the technique in detecting marginal faults.

This study has dealt with faults in rolling element bearings, which are a
fundamental component of rotating machinery. The work can be extended to other
rotating machinery such as pumps, fans and motors. The techniques used in this
study may also be validated under various operating regimes such as high and low
speed conditions.

8.4 Automated Diagnosis of Process Monitoring and Material Degradation

The time-frequency techniques and automated diagnosis procedure evaluated in this
study can also be used for diagnostics in other fields, such as process monitoring
and material degradation studies, particularly for structural health monitoring.

8.5 Automatic diagnosis of transmission systems

Power transmission systems such as geartrains have similarities to the operation of

rolling element bearings given their rotational and periodic behaviour. The feature

extraction technique proposed in this thesis can be directly applied to these classes of

machines. For example, the characteristic fault frequencies of a transmission system

can be extracted from the time-frequency plane derived from DWPA, Matching

Pursuit, and Basis Pursuit and analysed in exactly the same manner as that adopted in

this thesis.



8.6 Commercializing an integral intelligent diagnostic toolbox

This research provides a solid and successful study of automatic fault diagnosis
of rolling bearings using time-frequency analysis and neural networks. The
Matching Pursuit and Basis Pursuit techniques can be incorporated into a
comprehensive fault diagnosis toolbox, which would need to be further developed
and commercialized to improve the practice of condition monitoring and
diagnostics.

REFERENCES


1. Baillie, D. and J. Mathew, Diagnosing Rolling Element Bearing Faults with

Artificial Neural Networks. Acoustics Australia, 22(3): pp. 79-84. (1994).

2. Altmann, J., Application of Discrete Wavelet Packet Analysis for the

Detection and Diagnosis of Low Speed Rolling-Element Bearing Faults.

Monash University, Melbourne. (PhD thesis,1999).

3. Mathew, J., Standards in condition monitoring, in Proceedings of Condition

Monitoring 97, (keynote Paper), Xian, P R China: pp. 1-9. (1997).

4. Yardley, E.D., Condition Monitoring-Engineering the Practice, Professional

Engineering Publishing, Bury St Edmunds and London, UK: pp. 25-27.

(2002).

5. Rao, B.K.N., Handbook of condition monitoring. Oxford, UK, Elsevier

Advanced Technology: pp. 49-80. (1996).

6. Bently, D.E. and C.T. Hatch, Fundamentals of Rotating Machinery

Diagnostics. Bob Grissom ed. Canada, Bently pressurized bearing press: pp.

11-12. (2002).

7. Hale, V. and J. Mathew, High and low speed bearings, in Condition

Monitoring Frontiers, the second (CM)2 Forum, Melbourne, Australia: pp.

21-30. (1995).

8. Rolling Element Bearings. http://www.cmcpweb.com/appnotes/reb.htm.

9. Russell, S. and P. Norvig, Artificial intelligence: a modern approach,

Englewood Cliffs, N.J, Prentice Hall: pp. 23-24. (1995).


10. Haykin, S., Neural networks: a comprehensive foundation. New York;

Macmillan; Toronto; Maxwell Macmillan Canada; New York: Maxwell

Macmillan International: pp. 16-18. (1994).

11. Forsyth, R., Expert systems: principles and case studies. London, Chapman

and Hall: pp. 8-9. (1984).

12. Zadeh, L.A., Fuzzy sets. Information and Control, 8: pp. 338-353. (1965).

13. Chen, C.H., Fuzzy logic and neural network handbook. Computer

engineering series. New York, McGraw-Hill: pp: (various pagings). vol. 1.

(1996).

14. Kruse, R., J. Gebhardt, and F. Klawonn, Foundations of fuzzy systems.

Chichester, West Sussex, England New York, Wiley & Sons: pp xii, 265.

(1994).

15. Baillie, D.C., Applications of Artificial Neural Networks for Bearing Fault

Diagnosis. Monash University, Melbourne, Australia. (PhD thesis, 1996).

16. Bow, Pattern recognition. Marcel Dekker, Inc., New York: pp. 55-56. (1984).

17. Jang, J.-S.R., C.-T. Sun, and E. Mizutani, Neuro-fuzzy and soft computing : a

computational approach to learning and machine intelligence. MATLAB

curriculum series. Upper Saddle River, NJ, Prentice Hall: pp. xxvi, 614.

(1997).

18. Browne, A., Neural network analysis, architectures, and applications.

Philadelphia, Pa., Institute of Physics Pub.: pp. 7-9. (1997).


19. El Hachemi Benbouzid, M., A review of induction motors signature analysis

as a medium for faults detection. Industrial Electronics, IEEE Transactions,

47(5): pp. 984-993. (2000).

20. Alfredson, R.J. and J. Mathew, Time domain methods for monitoring the

condition of rolling element bearings. Mechanical Engineering Transactions -

Institution of Engineers, Australia, ME 10(2): pp. 102-107. (1985).

21. McFadden, P.D. and J.D. Smith, Information from the vibration of rolling

bearings. In: Condition Monitoring '84, Proc. Int. Conf. on Condition

Monitoring, (Swansea, U.K.: Apr. 10-13, 1984), M.H. Jones (ed.), Swansea,

U.K., Pineridge Press, 1984, Section 2, Paper 8, pp.178-190. (ISBN 0-

906674-32-8). (1984).

22. McFadden, P.D. and J.D. Smith, Model for the vibration produced by a single

point defect in a rolling element bearing. Journal of Sound and Vibration,

96(1): pp. 69-82. (1984).

23. McFadden, P.D. and J.D. Smith, The vibration produced by multiple point

defects in a rolling element bearing. Journal of Sound and Vibration, 98(2):

pp. 263-273. (1985).

24. Kim, P.Y., A review of rolling element bearing health monitoring (II)

preliminary test results on current technologies. In: Proc. on Machinery

Vibration Monitoring & Analysis Meeting: pp. 127-137. (1984).

25. Kim, P.Y., Review of rolling element bearing health (III): preliminary test

results on eddy current proximity transducer technique. I Mech E Conference

Publications (Institution of Mechanical Engineers): pp. 119-125. (1984).


26. Kim, P.Y. and I.R.G. Lowe, Review of rolling element bearing health

monitoring. Proceedings - Machinery Vibration Monitoring and Analysis

Seminar and Meeting: pp. 145-154. (1983).

27. Kim, Y.W., et al., Analysis and processing of shaft angular velocity signals in

rotating machinery for diagnostic applications, in Acoustics, Speech, and

Signal Processing, 1995. ICASSP-95., 1995 International Conference on,

Dept. of Mech. Eng., Ohio State Univ., Columbus, OH, USA, Theoretical or

Mathematical: pp. 2971-2974. vol.5. (1995).

28. Lebold, M., et al., Review of vibration analysis methods for gearbox

diagnostics and prognostics, in the 54th meeting of the Society for Machinery

Failure Prevention Technology, Virginia Beach,VA: pp. 623-634. (2000).

29. Tandon, N. and A. Choudhury, A review of vibration and acoustic

measurement methods for the detection of defects in rolling element bearings.

Tribology International, 32(8): pp. 469-480. (1999).

30. Chow, M.-Y., Guest editorial special section on motor fault detection and

diagnosis. Industrial Electronics, IEEE Transactions on, 47(5): pp. 982-983.

(2000).

31. Andrade, F.A., I. Esat, and M.N.M. Badi, A new approach to time-domain

vibration condition monitoring: gear tooth fatigue crack detection and

identification by the Kolmogorov-Smirnov test. Journal of Sound and

Vibration, 240(5): pp. 909-919. (2001).

32. Altmann, J. and J. Mathew, High Frequency Transient Analysis for the

Detection and Diagnosis of Faults in Low speed Rolling Element Bearings, in

the Asia Pacific Vibration Conference, Kyongju, Korea: pp. 730-735. (1997).


33. Williams, T., et al., Rolling element bearing diagnostics in run-to-failure

lifetime testing. Mechanical Systems and Signal Processing, 15(5): pp. 979-

993. (2001).

34. McFadden, P.D. and M.M. Toozhy, Application of synchronous averaging to

vibration monitoring of rolling element bearings. Mechanical Systems and

Signal Processing, 14(6): pp. 891-906. (2000).

35. Samuel, P.D. and D.J. Pines, Vibration separation methodology for planetary

gear health monitoring. Proceedings of the SPIE, the International Society for

Optical Engineering, 3985: pp. 250-60. (2000).

36. Wang, W.J. and P.D. McFadden, Early detection of gear failure by vibration

analysis--ii. interpretation of the time-frequency distribution using image

processing techniques. Mechanical Systems and Signal Processing, 7(3): pp.

205-215. (1993).

37. McClintic, K., et al., Residual and difference feature analysis with transitional

gearbox data, in the 54th meeting of the society for machinery failure

prevention technology, Virginia Beach,VA: pp. 635-645. (2000).

38. Wang, W.Q., F. Ismail, and M.F. Golnaraghi, Assessment of gear damage

monitoring techniques using vibration measurements. Mechanical Systems

and Signal Processing, 15(5): pp. 905-22. (2001).

39. Wang, W., Early detection of gear tooth cracking using the resonance

demodulation technique. Mechanical Systems and Signal Processing, 15(5):

pp. 887-903. (2001).

40. Brown, D.N., Envelope analysis detects bearing faults before major damage

occurs. Pulp & Paper, 63(13): pp. 113-117. (1989).


41. Wang, W. and A.K. Wong, Some new signal processing approaches for gear

fault diagnosis, in Signal Processing and Its Applications, 1999. ISSPA '99.

Proceedings of the Fifth International Symposium on, Aeronaut. & Maritime

Res. Lab., Defence Sci. & Technol. Organ., Melbourne, Vic., Australia, New

Development, Practical: pp. 587-590. vol.2. (1999).

42. Chen, Z. and C.K. Mechefske, Diagnosis of machinery fault status using

transient vibration signal parameters. JVC/Journal of Vibration and Control,

8(3): pp. 321-335.

43. Li, Z. and Y. Fu, Adaptive noise cancelling technique and bearing fault

diagnosis. Journal of Aerospace Power/Hangkong Dongli Xuebao, 5(3): pp.

199-203. (1990).

44. Shao, Y. and K. Nezu, Detection of self-aligning roller bearing fault by

asynchronous adaptive noise cancelling technology. JSME International

Journal, Series C, 42(1): pp. 33-43. (1999).

45. Logan, D., Using the correlation method for the detection of faults in rolling

element bearings. Monash University, Melbourne. (PhD thesis, 1996).

46. Wang, W.J., et al., The application of some non-linear methods in rotating

machinery fault diagnosis. Mechanical Systems and Signal Processing, 15(4):

pp. 697-705. (2001).

47. Mevel, L., L. Hermans, and H. Van Der Auweraer, Application of a

subspace-based fault detection method to industrial structures. Mechanical

Systems and Signal Processing, 13(6): pp. 823-838. (1999).

48. Nirbito, W., C.C.Tan, and J. Mathew, The enhancement of bearing signals

corrupted by noise using blind deconvolution-a feasibility study, in


Proceedings of the 2nd Asia-Pacific Conference on Systems Integrating and

Maintenance, Nanjing, China: pp. 309-314. (2000).

49. Serviere, C. and P. Fabry, Blind source separation of noisy harmonic signals

for rotating machine diagnosis. Journal of Sound and Vibration, 272(1-2): pp.

317-339. (2004).

50. Lin, J. and L. Qu, Feature extraction based on Morlet wavelet and its

application for mechanical fault diagnosis. Journal of Sound and Vibration,

234(1): pp. 135-148. (2000).

51. McFadden, P.D., Detection of gear faults by decomposition of matched

differences of vibration signals. Mechanical Systems and Signal Processing,

14(5): pp. 805-817. (2000).

52. Hambaba, A. and E. Huff, Multiresolution error detection on early fatigue

cracks in gears, in 2000 IEEE aerospace Conference Proceedings, Big Sky,

MT, USA, NASA Ames Res. Center Moffett Field CA USA, 2000 IEEE

Aerospace Conference.: pp. 566-573. (2000).

53. Li, X., et al., Fault prognosis for large rotating machinery using neural

network, Dept. of Mech. Eng. Xi'an Jiaotong Univ. China,Applications of

Artificial Intelligence in Engineering IX. Proceedings of the Ninth

International Conference. Comput. Mech. Publications Southampton UK: pp.

618-625. (1994).

54. Salami, M.J.E., A. Gani, and T. Pervez, Machine condition monitoring and

fault diagnosis using spectral analysis techniques, Fac. of Eng. Int. Islamic

Univ. Malaysia,First International Conference on Mechatronics.

Mechatronics - An Integrated Engineering for the New Millennium.

Conference Proceedings. Int. Islamic Univ. Malaysia Kuala Lumpur

Malaysia: pp. 2 vol. xiv+743. (2001).


55. Zhengjia, H., et al., Wavelet Transform in Tandem with Autoregressive

Technique for Monitoring and Diagnosis of Machinery, in Condition

monitoring and diagnostic engineering management, Proceedings of

COMADEM 94. (September 26-29, 1994).

56. Liu, T.I. and J.M. Mengel, Detection of ball bearing conditions by an A.I.

approach. Sensors, Controls, and Quality Issues in Manufacturing; American

Society of Mechanical Engineers, Production Engineering Division

(Publication) PED, 55: pp. 13-21. (1991).

57. Kim, K. and A.G. Parlos, Model-based fault diagnosis of induction motors

using non-stationary signal segmentation. Mechanical Systems and Signal

Processing, 16(2-3): pp. 223-253. (2002).

58. Haykin, S.S., Adaptive filter theory. 3rd ed. Prentice-Hall information and

system sciences series. Upper Saddle River, N.J, Prentice Hall: pp. 8-10.

(1996).

59. Igarashi, T. and H. Hamada, Studies on the vibration and sound of defective

rolling bearings. (First report: vibration of ball bearings with one defect).

Bull. JSME, 25(204): pp. 994-1001. (1982).

60. Burgess, P.F.J., Antifriction bearing fault detection using envelope detection.

Transactions of the Institution of Professional Engineers New Zealand,

Electrical/Mechanical Chemical Engineering Section, 15(2/EMCh): pp. 77-

82. (1988).

61. Hong, S.Y., Forecasting minor machine failure by DDS spectrum analysis,

Dept. of Mech. Eng. Wright State Univ. Dayton OH USA, Intelligent

Manufacturing Systems 1994 (IMS`94). A Postprint Volume from the IFAC

Workshop. Pergamon Oxford UK: pp. ix+534. (1994).


62. Xu, F., Z. Wang, and S. Fu, Research of a data acquisition system of

vibration signal for large-scale rotating machine. Qinghua Daxue

Xuebao/Journal of Tsinghua University, 38(6): pp. 111-114. (1998).

63. Meesad, P. and G.G. Yen, Pattern classification by a neurofuzzy network:

application to vibration monitoring. ISA Transactions, 39(3): pp. 293-308.

(2000).

64. McFadden, P.D. and J.D. Smith, Vibration monitoring of rolling element

bearings by the high-frequency resonance technique-a review. Tribology

International, 17(1): pp. 3-10. (1984).

65. Yu, D., J. Cheng, and Y. Yang, Application of EMD method and Hilbert

spectrum to the fault diagnosis of roller bearings. Mechanical Systems and


66. Lu, Q. and D. Li, Neural network method for diagnosing faults rolling

bearing in electrical machines with frequency signatures. Qinghua Daxue

Xuebao/Journal of Tsinghua University, 38(4): pp. 94-97. (1998).

67. Baydar, N. and A. Ball, Detection of gear deterioration under varying load

conditions by using the instantaneous power spectrum. Mechanical Systems


68. Yang, D.-M., et al., Third-order spectral techniques for the diagnosis of motor

bearing condition using artificial neural networks. Mechanical Systems and

Signal Processing, 16(2-3): pp. 391-411. (2002).

69. Li, C.J., J. Ma, and B. Hwang, Bearing condition monitoring by pattern

recognition based on bicoherence analysis of vibrations. Proceedings of the


Institution of Mechanical Engineers, Part C: Journal of Mechanical

Engineering Science, 210(3): pp. 277-285. (1996).

70. Capdessus, C., M. Sidahmed, and J.L. Lacoume, Cyclostationary processes:

application in gear faults early diagnosis. Mechanical Systems and Signal

Processing, 14(3): pp. 371-385. (2000).

71. Bouillaut, L. and M. Sidahmed, Cyclostationary approach and bilinear

approach: Comparison, applications to early diagnosis for helicopter gearbox

and classification method based on hocs. Mechanical Systems and Signal

Processing, 15(5): pp. 923-943. (2001).

72. Mathew, J. and R.J. Alfredson, Condition monitoring of rolling element

bearings using vibration analysis. Journal of Vibration, Acoustics, Stress, and

Reliability in Design, 106(3): pp. 447-453. (1984).

73. Pan, M.-C., P. Sas, and H. van Brussel, Nonstationary time-frequency

analysis for machine condition monitoring, in Time-Frequency and Time-

Scale Analysis, 1996., Proceedings of the IEEE-SP International Symposium

on, Dept. of Mech. Eng., Katholieke Univ., Leuven, Belgium, Theoretical or

Mathematical,Experimental: pp. 477-480. (1996).

74. Klein, R., D. Ingman, and S. Braun, Non-stationary signals: phase-energy

approach-theory and simulations. Mechanical Systems and Signal Processing,

15(6): pp. 1061-1089. (2001).

75. In Soo, K. and K. Whan Woo, The development of reactor coolant pump

vibration monitoring and a diagnostic system in the nuclear power plant. ISA

Transactions, 39(3): pp. 309-16. (2000).


76. Lee, C.K., et al., A study on RCP vibration monitoring and diagnostics in

NPP,Proceedings of 3rd Asia-Pacific Conference on Control and

Measurement (APCCM'98), Dunhuang, China, Korea Atomic Energy Res.

Inst. Taejon South Korea,Proceedings of the 3rd Asia-Pacific Conference on

Control and Measurement. China Aviation Ind. Press Beijing China: pp.

vi+425. (1998).

77. Baydar, N. and A. Ball, A comparative study of acoustic and vibration signals

in detection of gear failures using Wigner-Ville distribution. Mechanical

Systems and Signal Processing, 15(6): pp. 1091-1107. (2001).

78. Wang, W.J. and P.D. McFadden, Early detection of gear failure by vibration

analysis i. calculation of the time-frequency distribution. Mechanical Systems


79. Wen Yi, W. and M.J. Harrap, Condition monitoring of rolling element

bearings by using cone kernel time-frequency distribution. Proceedings of the

SPIE The International Society for Optical Engineering, 2101(1): pp. 290-8.

(1993).

80. Lee, S.U., D. Robb, and C. Besant, The directional Choi-Williams distribution

for the analysis of rotor-vibration signals. Mechanical Systems and Signal

Processing, 15(4): pp. 789-811. (2001).

81. Han, Y.-S. and C.-W. Lee, Directional Wigner distribution for order analysis

in rotating/reciprocating machines. Mechanical Systems and Signal

Processing, 13(5): pp. 723-738. (1999).

82. Lee, S.K. and P.R. White, Higher-order time-frequency analysis and its

application to fault detection in rotating machinery. Mechanical Systems &



83. Chin-Hsing Chen, Jiann-Der Lee, and Ming-Chi Lin, Classification of

underwater signals using wavelet transforms and neural networks.

Mathematical and Computer Modelling, 27(2): pp. 47-60. (1998).

84. Liu, B., S.F. Ling, and R. Gribonval, Bearing failure detection using

matching pursuit. NDT & E International, 35(4): pp. 255-262. (2002).

85. Lopez, J.E., R.R. Tenney, and J.C. Deckert, Fault detection and identification

using real-time wavelet feature extraction, in Proceedings of IEEE-SP

International Symposium on Time- Frequency and Time-Scale Analysis,

Philadelphia, PA, USA, IEEE New York NY USA: pp. 217-221. (1994).

86. Tse, P.W., W.-x. Yang, and H.Y. Tam, Machine fault diagnosis through an

effective exact wavelet analysis. Journal of Sound and Vibration, 277(4-5): pp.

1005-1024. (2004).

87. Samuel, P. and D. Pines, Health monitoring/damage detection of a rotorcraft

planetary geartrain system using piezoelectric sensors. Proceedings of the

SPIE The International Society for Optical Engineering, 3041: pp. 44-53.

(1997).

88. Mori, K., et al., Prediction of spalling on a ball bearing by applying the

discrete wavelet transform to vibration signals. Wear, 195(1-2): pp. 162-168.

(1996).

89. Wickerhauser, M.V., Adapted wavelet analysis from theory to software.

Wellesley, MA, A.K. Peters: p. xii, 486. (1994).

90. Liu, S. and W. Shi, Rough set based intelligence diagnostic system for valves

in reciprocating pumps, in Systems, Man, and Cybernetics, 2001 IEEE


International Conference on, Dept. of Mech. Eng., Daqing Pet. Inst.,

Heilongjiang, China, Practical: pp. 353-358. vol.1. (2001).

91. Nikolaou, N.G. and I.A. Antoniadis, Rolling element bearing fault diagnosis

using wavelet packets. NDT & E International, 35(3): pp. 197-205. (2002).

92. Shi, W., R. Wang, and W. Huang, Application of rough set theory to fault

diagnosis of check valves in reciprocating pumps,Canadian Conference on

Electrical and Computer Engineering 2001, Toronto, Ont., Canada, Dept. of

Astronaut. & Mech. Harbin Inst. of Technol. China,Canadian Conference on

Electrical and Computer Engineering 2001. Conference Proceedings (Cat.

No.01TH8555). IEEE Piscataway NJ USA: pp. 2 vol.1414. (2001).

93. Shulin, L. and S. Wengang, Rough set based intelligence diagnostic system

for valves in reciprocating pumps,Proceedings of IEEE International

Conference on Systems, Man & Cybernetics, Tucson, AZ, USA, Dept. of

Mech. Eng. Daqing Pet. Inst. Heilongjiang China,2001 IEEE International

Conference on Systems Man and Cybernetics. e-Systems and e-Man for

Cybernetics in Cyberspace (Cat.No.01CH37236). IEEE Piscataway NJ USA:

pp. 3494. vol. 5. (2001).

94. Goumas, S., et al., Intelligent on-line quality control of washing machines

using discrete wavelet analysis features and likelihood classification.

Engineering Applications of Artificial Intelligence, 14(5): pp. 655-666.

(2001).

95. Akan, A. and L.F. Chaparro, Evolutionary chirp representation of non-

stationary signals via Gabor transform. Signal Processing, 81(11): pp. 2429-

2436. (2001).

96. Ferrando, S.E., et al., Probabilistic matching pursuit with Gabor dictionaries.

Signal Processing, 80. (2000).


97. Mallat, S.G. and Z. Zhang, Matching pursuits with time-frequency

dictionaries. IEEE Transactions on Signal Processing, 41(12): pp. 3397-3415.

(1993).

98. Lin, J., Feature extraction of machine sound using wavelet and its application

in fault diagnosis. NDT & E International, 34(1): pp. 25-30. (2001).

99. Zheng, H., Z. Li, and X. Chen, Gear fault diagnosis based on continuous

wavelet transform. Mechanical Systems and Signal Processing, 16(2-3): pp.

447-457. (2002).

100. Lin, J. and M.J. Zuo, Gearbox fault diagnosis using adaptive wavelet filter.

Mechanical Systems and Signal Processing, 17(6): pp. 1259-1269. (2003).

101. Zhang, J. and Z. Bao, Initialization of orthogonal discrete wavelet transforms.

IEEE Transactions on Signal Processing, 48(5): pp. 1474-1477. (2000).

102. Mallat, S., A wavelet tour of signal processing. Paris, Courant Institute, New

York University: pp. 242-244. (1999).

103. Wang, W.J., Wavelets for detecting mechanical faults with high sensitivity.

Mechanical Systems and Signal Processing, 15(4): pp. 685-696. (2001).

104. Tse, P.W. and W.X. Yang, Shortcoming of Wavelet Transforms in Machine

fault diagnosis and the proposed solution, in the Proceedings of the 3rd Asia-

Pacific Conference on System Integrity and Maintenance (ACSIM, 2002): pp.

357-362. ISBN: 1 86435 589 1.


105. Zhong, B., Developments in intelligent condition monitoring and diagnostics,

in System Integrity and Maintenance , the 2nd Asia-Pacific

Conference(ACSIM2000), Brisbane, Australia: pp. 1-7. (2000).

106. Pham, D.T. and P.T.N. Pham., Artificial intelligence in engineering.

International Journal of Machine Tools & Manufacture, (39): pp. 937-949.

(1999).

107. Gao, X.Z. and S.J. Ovaska, Soft computing methods in motor fault diagnosis.

Applied Soft Computing, 1(1): pp. 73-81. (2001).

108. Parlos, A.G., S.K. Menon, and A.F. Atiya, Adaptive state estimation using

dynamic recurrent neural networks, in Neural Networks, 1999. IJCNN '99.

International Joint Conference on, Dept. of Nucl. Eng., Texas A&M Univ.,

College Station, TX, USA, Application Theoretical or Mathematical: pp.

3361-3364 vol.5. (1999).

109. Priddy, K.L., M.D. Lothers, and R.E. Saeks, Neural networks and fault

diagnosis in rotating machinery, in Systems, Man and Cybernetics, 1993.

'Systems Engineering in the Service of Humans', Conference Proceedings.,

International Conference on, Accurate Autom. Corp., Chattanooga, TN, USA,

Practical: pp. 640-644 vol.2. (1993).

110. Engin, S.N. and K. Gulez, A wavelet transform-artificial neural networks

(WT-ANN) based rotating machinery fault diagnostics

methodology,Proceedings of the IEEE-EURASIP Workshop on Nonlinear

Signal and Image Processing (NSIP'99), Antalya, Turkey, Dept. of Electr.

Eng. Yildiz Tech. Univ. Istanbul Turkey,Proceedings of the IEEE-EURASIP

Workshop on Nonlinear Signal and Image Processing (NSIP'99). Bogazici

Univ Istanbul Turkey: pp. 2 vol. xxiii+894. (1999).


111. Zhao, L. and Z. Sheng, Combination of discrete cosine transform with neural

network in fault diagnosis for rotating machinery. Proceedings of the IEEE

International Conference on Industrial Technology: pp. 450-454. (1996).

112. Paya, B.A., I.I. Esat, and M.N.M. Badi, Artificial neural network based fault

diagnostics of rotating machinery using wavelet transforms as a preprocessor.

Mechanical Systems & Signal Processing, 11(5): pp. 751-765. (1997).

113. McCormick, A.C. and A.K. Nandi, A comparison of artificial neural

networks and other statistical methods for rotating machine condition

classification, IEE Colloquium on Modelling and Signal Processing for Fault

Diagnosis (Ref, Leicester, UK, Dept. of Electron. & Electr. Eng. Strathclyde

Univ. Glasgow UK, IEE Colloquium on Modelling and Signal Processing for

Fault Diagnosis (Ref. No.1996/260). IEE London UK: p. 100. (1996).

114. Lowes, S. and J.M. Shippen, A diagnostic system for industrial fans.

Measurement and Control, 30(1): pp. 9-13. (1997).

115. Tanaka, M., et al., Application of Kohonen's self-organizing network to the

diagnosis system for rotating machinery, Fac. of Eng. Hiroshima Univ.

Japan, 1995 IEEE International Conference on Systems Man and Cybernetics.

Intelligent Systems for the 21st Century (Cat. No.95CH3576-7). IEEE New

York NY USA: pp. 5 vol. 4711. (1995).

116. Deschenes, C.J. and J. Noonan, Fuzzy Kohonen Network for the

Classification of Transients Using the Wavelet Transform for Feature

Extraction. Information Sciences, 87(4): pp. 247-266. (1995).

117. Hoffman, A.J. and N.T. van der Merwe, The application of neural networks

to vibrational diagnostics for multiple fault conditions. Computer Standards

& Interfaces, 24(2): pp. 139-149. (2002).

118. Melvin, D.G. and J. Penman, Fusing human knowledge with neural networks

in machine condition monitoring systems. Proceedings of the SPIE The

International Society for Optical Engineering, 2492(1): pp. 276-83. (1995).

119. Suzuki, M., et al., Application of neural network to failure diagnosis.

Research Reports of Kogakuin University. no., 81: pp. 33-8. (1996).

120. Taylor, O. and J. Macintyre, Modified Kohonen network for data fusion and

novelty detection within condition monitoring, Proceedings of EuroFusion

98., Great Malvern, UK, Sch. of Comput. & Inf. Syst. Sunderland Univ. UK,

Proceedings of EuroFusion 98. International Data Fusion Conference. DERA

Malvern UK.: pp. vi+228. (1998).

121. Chan, C. W., et al., Fault detection of systems with redundant sensors using

constrained Kohonen networks. Automatica, 37(10): pp. 1671-1676. (2001).

122. Hu, T., B.C. Lu, and G.J. Chen, A Rotary Machinery Fault Diagnosis

Approach Based on Rough Set Theory, in the 3rd World Congress on

Intelligent Control and Automation, Hefei, China: pp. 589-685. (2000).

123. Jeffries, M., et al., A fuzzy approach to the condition monitoring of a

packaging plant. Journal of Materials Processing Technology, 109: pp. 83-89.

(2001).

124. De Miguel, L.J., J. Fernandez, and J.R. Peran, Applying fuzzy logic to

rotating machinery diagnosis, Dept. of Syst. Eng. & Control Valladolid Univ.

Spain. Methodologies for the Conception Design and Application of

Intelligent Systems. Proceedings of the 4th International Conference on Soft

Computing. World Scientific Singapore: pp. 2 vol. xlii+974. (1996).

125. Milne, R., Amethyst, Intelligent Applications Ltd. Kirkton Bus. Centre

Livingston UK, IEE Colloquium on 'Intelligent Fault Diagnosis - Part 1:

Classification-Based Techniques' (Digest No.045). IEE London UK: p. 32.

(1992).

126. El Adawi, S., et al., Computer based expert system for rotating machinery

(preventive and predictive maintenance), Dept. of Mech. Power Zagazig

Univ. Egypt, Proceedings of the Second IASTED International Conference.

Computer Applications in Industry. ACTA Press Zurich Switzerland: pp. 2

vol.vii+585. (1992).

127. Georgin, E., et al., The importance of cases and domain models in

explanation, Proceedings of ISAP '94, Montpellier, France, Centre for Electr.

Power Eng. Strathclyde Univ. Glasgow UK, ISAP '94. International

Conference on Intelligent System Application to Power Systems. EC2

Nanterre Cedex France: p. 2 vol. 894. (1994).

128. Shao, Y. and K. Nezu, An online monitoring and diagnostic method of rolling

element bearing with AI. Transactions of the Society of Instrument and

Control Engineers, 32(8): pp. 1287-93. (1996).

129. Vilim, R.B., H.E. Garcia, and F.W. Chen, Machine condition monitoring

using neural networks and the likelihood function. Intelligent Engineering

Systems Through Artificial Neural Networks, 7: pp. 653-659. (1997).

130. Emmanouilidis, C., J. MacIntyre, and C. Cox, An integrated, soft computing

approach for machine condition diagnosis, Sch. of Comput. & Inf. Syst.

Sunderland Univ. UK, 6th European Congress on Intelligent Techniques and

Soft Computing. EUFIT '98. Verlag Mainz Aachen Germany: pp. 3 vol.

xxvi+2010. (1998).

131. Hao, L. and X. Xu, The application of rough set neural network system in

fault diagnosis. Control Theory & Applications, 18(5): pp. 681-5. (2001).

132. Kesheng, W. and L. Bing, Using B-spline neural network to extract fuzzy

rules for a centrifugal pump monitoring. Journal of Intelligent Manufacturing,

12(1): pp. 5-11. (2001).

133. Satoh, S., M.S. Shaikh, and Y. Dote, Fast fuzzy neural network for fault

diagnosis of rotational machine parts using general parameter learning and

adaptation, SMCia/01, Blacksburg, VA, USA, Dept. of Comput. Sci. & Syst.

Eng. Muroran Inst. of Technol. Japan, SMCia/01. Proceedings of the 2001

IEEE Mountain Workshop on Soft Computing in Industrial Applications

(Cat. No.01EX504). IEEE Piscataway NJ USA: pp. xv+134. (2001).

134. Satoh, S., M.S. Shaikh, and Y. Dote, Fault diagnosis for dynamical systems

using soft computing, Proceedings of IEEE International Conference on

Systems, Man & Cybernetics, Tucson, AZ, USA, Dept. of Comput. Sci. &

Syst. Eng. Muroran Inst. of Technol. Japan: pp. 3494. vol. 5. (2001).

135. Yakuwa, F., et al., Fault diagnosis for dynamical systems using soft

computing, in Fuzzy Systems, 2002. FUZZ-IEEE'02. Proceedings of the 2002

IEEE International Conference on, Muroran Institute of Technology: pp. 261-

266. (2002).

136. Taniguchi, S., D. Akhmetov, and Y. Dote, Fault detection of rotating machine

parts using novel fuzzy neural network. Proceedings of the IEEE

International Conference on Systems, Man and Cybernetics, 1: pp. I-365 - I-

369. (1999).

137. Taniguchi, S., et al., Nonlinear modeling and fault detection using fuzzy-

neural network, Dept. of Comput. Sci. & Syst. Eng. Muroran Inst. of

Technol. Japan, Proceedings of the ISCA 9th International Conference

Intelligent Systems. Int. Soc. Comput. & Their Appl. - ISCA Cary NC USA:

pp. iv+158. (2000).

138. Hsiao, I.-L., et al., On-line fault diagnosis of rotor vibration by using signal-

based feature generation and neural fuzzy inference. Journal of the Chinese

Society of Mechanical Engineers, Transactions of the Chinese Institute of

Engineers, Series C/Chung-Kuo Chi Hsueh Kung Ch'eng Hsuebo Pao, 20(4):

pp. 345-352. (1999).

139. Altmann, J. and J. Mathew, Multiple Band-Pass Based Automatic

Classification of Low Speed Rolling-element Bearing Faults, in ACSIM.,

Nanjing China, Queensland University of Technology, Brisbane, Australia.

(2000).

140. Feng, E., H. Yang, and M. Rao, Fuzzy expert system for real-time process

condition monitoring and incident prevention. Expert Systems with

Applications, 15(3-4): pp. 383-390. (1998).

141. Liu, T., J. Shigonahalli, and N. Iyer, Detection of Roller Bearing Defects

using Expert systems and Fuzzy logic. Mechanical Systems and Signal

Processing, 10(5): pp. 595-614. (1996).

142. Siu, C., Q. Shen, and R. Milne, A fuzzy expert system for vibration cause

identification in rotating machines. IEEE. Vol (1): pp. 555-560. (1997).

143. Kawabe, Y., et al., Diagnosis method of centrifugal pumps by rough sets and

partially-linearized neural network. 1997 IEEE International Conference on

Intelligent Processing Systems, Beijing, China, Mitsubishi Chem. Corp.

Kitakyushu Japan. 1997 IEEE International Conference on Intelligent

Processing Systems (Cat. No.97TH8335). IEEE New York NY USA: p. 2

vol. xxviii+1893. (1997).

144. Jack, L.B. and A.K. Nandi, Fault detection using support vector machines and

artificial neural networks, augmented by genetic algorithm. Mechanical

Systems and Signal Processing, 16(2-3): pp. 373-390. (2002).

145. Jack, L.B. and A.K. Nandi, Genetic algorithms for feature selection in

machine condition monitoring with vibration signals. IEE Proceedings:

Vision, Image and Signal Processing, 147(3): pp. 205-212. (2000).

146. Wang, K. and B. Lei, Genetic algorithms for constructing feed forward

multiple layered neural network in a centrifugal pump condition monitoring.

Intelligent Engineering Systems Through Artificial Neural Networks, 8: pp.

303-310. (1998).

147. Staszewski, W.J. and K. Worden, Classification of faults in gearboxes. Pre-

processing algorithms and neural networks. Neural Computing &

Applications, 5(3): pp. 160-83. (1997).

148. Lou, X. and K.A. Loparo, Bearing fault diagnosis based on wavelet transform

and fuzzy inference. Mechanical Systems and Signal Processing, 18(5): pp.

1077-1095. (2004).

149. Altmann, J. and J. Mathew, DWPA best basis demodulation for the detection

and diagnosis of faults in rolling element bearings, in Proceedings of the 1st

Australian Conference on Systems Integrity and Maintenance, Surfers

Paradise: pp. 331-339. (1997).

150. Changsheng, H., et al., System analysis and design based on MATLAB-

Wavelet Analysis. Xian, Xian Electronic Science & Technology

University: pp. 124-135. (1999).

151. Chen, S.S., D.L. Donoho, and M.A. Saunders, Atomic Decomposition by

Basis Pursuit. SIAM Review, 43(1): pp. 129-159. (2001).

152. http://www.gc.ssr.upm.es/inves/neural/ann1/concepts/basis.htm

PUBLICATIONS

Accepted for Publication

1. Yang, H., J. Mathew, and L. Ma. Intelligent diagnosis of rotating machinery

faults - a review, Proceedings of the 3rd Asia-Pacific Conference on System Integrity

and Maintenance (ACSIM): pp. 385-392. ISBN: 1 86435 589 1. (2002).

2. Yang, H., J. Mathew, and L. Ma. Vibration Feature Extraction for Diagnosis of

Rotating Machinery Faults - A Literature Survey, Proceedings of the 10th Asia-Pacific

Vibration Conference (APVC): pp. 801-807. ISBN: 06464 42853. (2003).

3. Yang, H., J. Mathew, and L. Ma. Time Frequency Techniques for Fault Diagnosis

of Rolling Element Bearings, Proceedings of the 10th Asia-Pacific Vibration Conference

(APVC): pp. 789-794. ISBN: 06464 42853. (2003).

4. Yang, H., J. Mathew, and L. Ma. Feature extraction of faulty bearings vibration

via basis pursuit, Proceedings of the 10th Asia-Pacific Vibration Conference

(APVC): pp. 789-794. ISBN: 06464 42853. (2003).

5. Yang, H., J. Mathew, and L. Ma. Bearing fault classification using wavelet

features, Proceedings of the Intelligent Maintenance Systems (IMS) International

Conference. Arles, France: Section 1-D. (2004).

6. Yang, H., J. Mathew, L. Ma, and V. Kosse. Matching Pursuit features based

Neural Network pattern recognition of rolling bearing faults, Proceedings of the

International Conference of Maintenance Societies. Sydney, Australia: Paper 74:

pp. 1-8. (2004).

7. Yang, H., J. Mathew, and L. Ma. Basis Pursuit feature based pattern recognition of

rolling bearing faults, Proceedings of the Intelligent Maintenance Systems (IMS)

International Conference. Arles, France: Section 2-C. (2004).

8. Yang, H., J. Mathew, and L. Ma. Fault diagnosis of rolling element

bearings using basis pursuit, The Journal of Mechanical Systems and Signal

Processing, Volume 19 (2): pp. 341-356. (2005).

GLOSSARY

Discrete Wavelet Transformation

The Discrete Wavelet Transform (DWT), in particular, decomposes the signal into

ortho-normal “wavelets”, scaled and shifted versions of the “mother wavelet”, ψ. A

function f(t) can be expressed by its wavelet expansion, defined as follows.

$$ f(t) = a_0 + \sum_{j=0}^{\infty} \sum_{k=0}^{2^j - 1} \alpha_{2^j + k}\, \psi(2^j t - k), \qquad \text{for } 0 \le t < 1 $$

The integer j describes the different levels of wavelets, and k covers the number of

wavelets in each level.

Discrete Wavelet Packet Analysis (DWPA)

DWPA can decompose signals into both low frequency components and high

frequency components.

Power spectrum

The power spectrum is the squared amplitude of the spectrum.

Cepstrum

The cepstrum is the inverse Fourier transform of the logarithm of the power spectrum.
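The two definitions above can be illustrated with a minimal sketch. It assumes the common definition of the power cepstrum as the inverse Fourier transform of the log power spectrum, and it uses a direct O(n^2) discrete Fourier transform so that it needs only the standard library. The function names, the small offset `eps` (added to avoid taking the logarithm of zero) and the test signal are illustrative assumptions, not part of the thesis.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct discrete Fourier transform (inverse is scaled by 1/n)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def power_spectrum(x):
    """Squared amplitude of each spectral line."""
    return [abs(c) ** 2 for c in dft(x)]


def power_cepstrum(x, eps=1e-12):
    """Inverse DFT of the log power spectrum (eps guards against log(0))."""
    log_power = [math.log(p + eps) for p in power_spectrum(x)]
    return [c.real for c in dft(log_power, inverse=True)]


# Illustrative signal: one cosine cycle sampled at 8 points.
x = [math.cos(2 * math.pi * t / 8) for t in range(8)]
ps = power_spectrum(x)
ceps = power_cepstrum(x)
```

For this signal the power is concentrated in bins 1 and 7, the positive and negative frequency lines of the cosine.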

Wigner distribution

$$ WD_x(t, f) = \int_{-\infty}^{+\infty} x\left(t + \frac{\tau}{2}\right) x^{*}\left(t - \frac{\tau}{2}\right) e^{-j 2 \pi f \tau} \, d\tau $$

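The Choi-Williams distribution is defined as:

$$ P_{CW}(t, \omega) = \frac{1}{4 \pi^{3/2}} \iint \frac{1}{\sqrt{\tau^2 / \sigma}} \exp\left[ -\frac{(u - t)^2}{4 \tau^2 / \sigma} - j \tau \omega \right] x\left(u + \frac{\tau}{2}\right) x^{*}\left(u - \frac{\tau}{2}\right) du \, d\tau $$

where σ is a constant.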

Scalogram – the squared modulus of the CWT.

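Spectrogram is defined as the squared modulus of the STFT (it has been applied in radar, but has not yet been found in fault diagnosis):

$$ SPEC_x(t, f) = \left| STFT_x(t, f) \right|^2 = \left| \int_{-\infty}^{+\infty} x(\tau)\, w(\tau - t)\, e^{-j 2 \pi f \tau} \, d\tau \right|^2 $$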

Directional Choi-Williams distribution (dCWD) is a form of the Choi-Williams

distribution designed to account for complex-valued time-varying signals.

Recurrent Networks have the same characteristics as standard feed-forward

networks, but with feedback connections.

A Self Organising Map (SOM) Neural Network defines a mapping from an input

signal of arbitrary dimension to a one- or two-dimensional array of nodes.

Adaptive approximation

The goal of adaptive approximation is to find the representation of a signal $x$ as a weighted sum of elements $\phi_\gamma$ from an overcomplete dictionary $\Gamma$:

$$ x = \sum_{\gamma \in \Gamma} \alpha_\gamma \phi_\gamma $$

or as an approximate decomposition

$$ x = \sum_{i=1}^{m} \alpha_{\gamma_i} \phi_{\gamma_i} + R^{(m)} $$

where $\gamma$ is an index into the set $\Gamma$; $\alpha_\gamma$ is the coefficient of the element $\phi_\gamma$; $m$ is the order of the decomposition; and $R^{(m)}$ is a residual.

A dictionary $\Gamma$ is defined as a collection of parameterized waveforms $\phi_\gamma$, $\{\phi_\gamma \mid \gamma \in \Gamma\}$. The waveforms $\phi_\gamma$ are discrete-time signals of length $n$ called atoms.

Matching Pursuit uses a greedy criterion to select the atoms and their coefficients in the adaptive approximation. At each step, Matching Pursuit identifies the dictionary atom that best correlates with the residual and then adds a scalar multiple of that atom to the current approximation.
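The iteration described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the thesis: it assumes every atom has unit norm, so that the inner product with the residual is also the coefficient to subtract, and the three-atom dictionary in R^2 and the signal `x` are hypothetical values chosen only to make the steps easy to follow.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(x, atoms, n_iter):
    """Greedy decomposition of x over unit-norm atoms.

    Returns (selected, residual), where selected is a list of
    (atom_index, coefficient) pairs in the order they were chosen.
    """
    residual = list(x)
    selected = []
    for _ in range(n_iter):
        # Correlate every atom with the current residual.
        correlations = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(correlations[i]))
        coeff = correlations[best]
        if coeff == 0.0:
            break  # nothing left that any atom can explain
        selected.append((best, coeff))
        # Subtract the chosen atom's contribution from the residual.
        residual = [r - coeff * a for r, a in zip(residual, atoms[best])]
    return selected, residual


# Hypothetical overcomplete dictionary in R^2: three unit-norm atoms.
s = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0], [0.0, 1.0], [s, s]]
x = [3.0, 1.0]
selected, residual = matching_pursuit(x, atoms, n_iter=2)
```

In this example the first atom has the largest correlation with `x` and is picked first; the second iteration picks the second atom, after which the residual is zero.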

Basis Pursuit represents signals in overcomplete dictionaries by convex optimization. It obtains the decomposition that minimizes the $\ell^1$ norm of the coefficients occurring in the representation.
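Written with the symbols of the adaptive approximation entry, the Basis Pursuit principle can be stated as the following optimization problem. This is the standard formulation from the Basis Pursuit literature, given here as a reading aid rather than as a quotation from the thesis body.

```latex
\min_{\alpha} \; \|\alpha\|_1
\quad \text{subject to} \quad
x = \sum_{\gamma \in \Gamma} \alpha_\gamma \phi_\gamma,
\qquad
\|\alpha\|_1 = \sum_{\gamma \in \Gamma} |\alpha_\gamma|
```

Because the objective is convex and the constraint is linear, the problem can be recast as a linear program, in contrast to the greedy atom-by-atom search of Matching Pursuit.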

APPENDIX

Activation functions

(1) Linear function (see Figure 1)

$$ g(x) = kx \qquad (1) $$

where $k$ is a constant multiplied by the input $x$ to form a linear function.

Figure 1: Identity function (g(x) plotted against x)

(2) Threshold or Heaviside function (see Figure 2)

A threshold function, or Heaviside function, is limited to one of two values:

$$ g(x) = \begin{cases} 1 & \text{if } x \ge \theta \\ 0 & \text{if } x < \theta \end{cases} \qquad (2) $$

This kind of function is often used in single-layer networks. Figure 2 illustrates the function for the case $\theta = 1$.

Figure 2: Threshold function (g(x) plotted against x)

(3) Bipolar sigmoid function (see Figure 3)

$$ g(x) = \frac{1 - e^{-x}}{1 + e^{-x}} \qquad (3) $$

This function has properties similar to those of the sigmoid function. It works well for applications that yield output values in the range [-1, 1].

Figure 3: Bipolar sigmoid function (g(x) plotted against x)
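The three activation functions above can be written as a short sketch. The function names and default parameter values are illustrative assumptions. The bipolar sigmoid is evaluated as tanh(x/2), which is algebraically identical to equation (3) and avoids overflow of the exponential for large negative inputs.

```python
import math


def linear(x, k=1.0):
    """Equation (1): g(x) = k * x (k = 1 gives the identity function)."""
    return k * x


def threshold(x, theta=0.0):
    """Equation (2): 1 if x >= theta, otherwise 0."""
    return 1.0 if x >= theta else 0.0


def bipolar_sigmoid(x):
    """Equation (3): (1 - e^-x) / (1 + e^-x), computed as tanh(x / 2)."""
    return math.tanh(x / 2.0)


# Evaluate each function at a few sample inputs.
samples = [-6.0, -1.0, 0.0, 1.0, 6.0]
linear_out = [linear(v) for v in samples]
threshold_out = [threshold(v, theta=1.0) for v in samples]
sigmoid_out = [bipolar_sigmoid(v) for v in samples]
```

With `theta=1.0` the threshold function switches from 0 to 1 at x = 1, matching the case shown in Figure 2.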

Characteristics of Flaw Induced Bearing Vibration

Depending on which rolling surface carries the irregularities, the bearing will excite vibration at the following well-known defect frequencies:

Cage rotating frequency (FTF):

$$ FTF = \frac{RPM}{2} \left( 1 - \frac{d_r}{d_c} \cos\alpha \right) $$

where $d_r$ is the diameter of the rolling elements,

$d_c = \frac{1}{2}(d_{out} + d_{in})$ is the diameter of the cage,

$d_{out}$ is the diameter of the outer race,

$d_{in}$ is the diameter of the inner race,

$\alpha$ is the contact angle between the rolling elements and the rolling surfaces, and RPM/60 is the shaft rotating frequency (expressed in Hz).

Rotational frequency of the rolling elements (BSF):

$$ BSF = \frac{RPM}{2} \, \frac{d_c}{d_r} \left[ 1 - \left( \frac{d_r}{d_c} \cos\alpha \right)^2 \right] $$

Ball-pass frequency on the outer race (BPFO):

$$ BPFO = \frac{z}{2} \, RPM \left( 1 - \frac{d_r}{d_c} \cos\alpha \right) $$

where $z$ is the number of rolling elements.

Ball-pass frequency on the inner race (BPFI):

$$ BPFI = \frac{z}{2} \, RPM \left( 1 + \frac{d_r}{d_c} \cos\alpha \right) $$

Very often, especially when the load is variable, vibrations at other frequencies are excited in the bearing. These frequencies are the harmonics and the sum and difference combinations of the preceding frequencies.
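The four defect frequencies can be evaluated with a short sketch. It follows the formulas above but replaces RPM with the shaft frequency RPM/60, so the results are in Hz. The function name, argument names and bearing dimensions are hypothetical values chosen for illustration and do not describe the test bearing used in the thesis.

```python
import math


def bearing_defect_frequencies(rpm, d_r, d_out, d_in, alpha_deg, z):
    """Return FTF, BSF, BPFO and BPFI in Hz.

    rpm       shaft speed in revolutions per minute
    d_r       rolling element diameter
    d_out     outer race diameter (same length unit as d_r)
    d_in      inner race diameter (same length unit as d_r)
    alpha_deg contact angle in degrees
    z         number of rolling elements
    """
    f_shaft = rpm / 60.0                     # shaft frequency in Hz
    d_c = 0.5 * (d_out + d_in)               # cage diameter
    ratio = (d_r / d_c) * math.cos(math.radians(alpha_deg))
    ftf = 0.5 * f_shaft * (1.0 - ratio)
    bsf = 0.5 * f_shaft * (d_c / d_r) * (1.0 - ratio ** 2)
    bpfo = 0.5 * z * f_shaft * (1.0 - ratio)
    bpfi = 0.5 * z * f_shaft * (1.0 + ratio)
    return {"FTF": ftf, "BSF": bsf, "BPFO": bpfo, "BPFI": bpfi}


# Hypothetical bearing: 9 rolling elements of diameter 8 mm, raceway
# diameters 48 mm and 32 mm, zero contact angle, shaft at 1800 RPM.
freqs = bearing_defect_frequencies(
    rpm=1800.0, d_r=8.0, d_out=48.0, d_in=32.0, alpha_deg=0.0, z=9
)
```

A useful consistency check is that BPFO and BPFI always sum to z times the shaft frequency, since the cosine terms cancel.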
