Pakistan Research Repository

Title of Thesis

Muhammad Afzal Saleemi
Institute/University/Department Details
Department of Computer Science/ University of Karachi
Computer Science
Number of Pages
Keywords (Extracted from title, table of contents and abstract of thesis)
temporal data, information technology, filtering of temporal data, temporal structure, time series, temporal databases, data mining, wavelet based fuzzy clustering

The main objective of filtering temporal data is to obtain a smooth form that can be used for further analysis. In temporal data, the observations have an ordered structure with respect to time. The most common form of temporal data is the time series, an ordered set of equally spaced, time-stamped measurements. The main goal of this thesis is to explore new approaches to filtering and smoothing temporal data. We explore wavelets, a recently developed technique, for smoothing temporal data. We have further developed our approaches using different features of the wavelet transform for dimensionality reduction and tested them on various decision-making processes in statistics, business and computer science. We have applied our smoothing techniques to forecasting, optimization of neuro-fuzzy weight sets, query processing of similar sequences in huge data volumes, and clustering of those sequences on the basis of membership value. For forecasting, we have shown that our proposed model gives better forecasts than simple regression models. We have also proposed a wavelet-based compressed model for the adjustment of neuro-fuzzy weight sets. We anticipate that, for each neuron in the neural network, a reduced model can be estimated from the wavelet-based compressed model to form wavelet-based quasi fuzzy weight sets (WBQFWS). We have shown that such weight sets give a good initial solution for training fuzzified neural networks, reducing the search space for fast and efficient learning. In this thesis, we have also proposed two new approaches for smoothing large temporal databases using wavelet filtering theory, for approximating query procedures in decision support systems (DSS). For these smoothed signals, we have developed novel query processing algorithms that combine wavelet-based features with the time warping distance metric, called wavelet based features time warping (WBFTW).
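The wavelet smoothing idea described above can be illustrated with a minimal sketch. It assumes the Haar wavelet and a simple rule that discards the finest detail coefficients; the abstract does not say which wavelet family or thresholding rule the thesis uses, so the function names (haar_step, haar_inverse_step, haar_smooth) and the rule itself are illustrative assumptions, not the author's method.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail); the input length must be even.
    """
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar level, interleaving the reconstructed samples."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def haar_smooth(signal, levels=1):
    """Smooth by zeroing the detail coefficients of the finest `levels` scales.

    Assumption for this sketch: details are dropped entirely rather than
    thresholded, which is the crudest possible smoothing rule.
    """
    n = len(signal)
    if n == 0 or n % (2 ** levels) != 0:
        raise ValueError("length must be a positive multiple of 2**levels")
    approx = list(signal)
    for _ in range(levels):
        approx, _discarded = haar_step(approx)
    # Reconstruct with every discarded detail coefficient set to zero.
    for _ in range(levels):
        approx = haar_inverse_step(approx, [0.0] * len(approx))
    return approx
```

With one level, each pair of neighbouring samples is replaced by its mean, so haar_smooth([1, 3, 5, 7]) gives approximately [2, 2, 6, 6]. The intermediate approximation coefficients are also a compressed representation of the series, at half the length per level.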
Time warping is a technique for finding patterns in time series, even series of different lengths. For the first model (Dtw-wlt comp), we use features selected on the basis of the minima, maxima and average of the wavelet-based compressed signals. For the second model (Dtw-wlt coeff.), we use local features of the wavelet transform: the average of the approximation coefficients at the coarsest scale, and the maximum of the maxima and the minimum of the minima of the detail coefficients at all scales. Both models support index-based time warping distance.
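The two ingredients named above, a time warping distance and a small set of summary features, can be sketched as follows. This is the textbook dynamic time warping recurrence with an absolute-difference cost, plus a (min, max, mean) summary of a signal. How the thesis combines such features with an index is not stated in the abstract, so dtw_distance and summary_features are hypothetical names for illustration only.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    The sequences may have different lengths. Uses |x - y| as the local
    cost and keeps only two rows of the cost table in memory.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # Extend the cheapest of: insertion, deletion, match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def summary_features(signal):
    """Return (min, max, mean) of a signal.

    Assumption: this mirrors the 'minima, maxima and average' features in
    spirit only; the thesis computes them on wavelet-compressed signals.
    """
    if not signal:
        raise ValueError("signal must be non-empty")
    return (min(signal), max(signal), sum(signal) / len(signal))
```

For example, dtw_distance([1, 2, 3], [1, 2, 2, 3]) is 0 because the repeated 2 can be warped onto a single sample, even though the sequences differ in length. Because each feature tuple has only three values, comparing features is far cheaper than running the quadratic recurrence on the raw series, which is the general motivation for feature-based filtering before an exact comparison.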

Through extensive experiments with synthetic as well as real-life databases, using different wavelet families, we have shown that our proposed methods are very effective: they ensure no false dismissals and minimal false alarms with the least compromise on accuracy. Our methods give extremely fast response times because the approximate query performs most of its processing over a synoptic set of wavelet features. Our approaches give results comparable to Dtw-lh, so far considered the best technique, and much better results than the original time warping distance model, Dtw. The retrieved similar time series do not convey how similar they are to a given query, which can sometimes lead to false conclusions in decision-making processes. We have exploited this uncertainty and fuzziness in the retrieved similar time series and proposed a new approach that clusters them on the basis of their degree of similarity, using the concept of fuzzy logic. We introduce a new measure of degree of similarity for clustering, based on the time warping distance, which gives better control over the number of clusters. We have developed our techniques for static data that are already fully collected; however, they may be extended and applied to streaming data in the future.
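To make the idea of a degree of similarity concrete, the sketch below turns a list of time warping distances into values in [0, 1] and groups the retrieved series into a chosen number of bands. The abstract does not define the thesis's actual measure, so the linear rescaling (1 - d / d_max), the equal-width bands, and the names similarity_degrees and band_by_degree are all assumptions made purely for illustration.

```python
def similarity_degrees(distances):
    """Map distances to degrees of similarity in [0, 1].

    Assumed rule (not from the thesis): degree = 1 - d / d_max, so the
    closest possible match scores 1 and the farthest retrieved series 0.
    """
    if not distances:
        return []
    if min(distances) < 0:
        raise ValueError("distances must be non-negative")
    d_max = max(distances)
    if d_max == 0:
        return [1.0] * len(distances)
    return [1.0 - d / d_max for d in distances]


def band_by_degree(degrees, n_bands):
    """Group item indices into `n_bands` equal-width bands of similarity.

    Band 0 holds the most similar items. The caller picks n_bands, which
    is how this sketch exposes control over the number of groups.
    """
    if n_bands < 1:
        raise ValueError("n_bands must be at least 1")
    bands = [[] for _ in range(n_bands)]
    for idx, mu in enumerate(degrees):
        k = min(int((1.0 - mu) * n_bands), n_bands - 1)
        bands[k].append(idx)
    return bands
```

With distances [0, 1, 4], the degrees are [1.0, 0.75, 0.0], and two bands give [[0, 1], [2]]: the first two series are grouped as highly similar to the query and the third is set apart.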

Download Full Thesis
3402.59 KB
S. No. Chapter Title of the Chapters Page Size (KB)
1 0 Contents
241.64 KB
2 1 Introduction 1
257.56 KB
  1.1 Filtering And Smoothing Of Temporal Data 3
  1.2 Similarity And Its Measures 4
  1.3 Indexing Of Time Series Data 5
  1.4 Thesis Objectives 7
  1.5 Original Approach 8
  1.6 Thesis Outline 8
3 2 Smoothing And Similarity Search In Temporal Databases 11
509.14 KB
  2.1 Temporal Data 11
  2.2 Data Mining And Knowledge Discovery In Temporal Databases 13
  2.3 Filtering And Smoothing Of Time Series 13
  2.4 Similarity Search In Time Series Databases 15
  2.5 Similarity Measures 18
  2.6 Index Based Similarity Search 24
  2.7 Gemini Framework For Index Based DTW 26
  2.8 Fuzzy Clustering Of Similar Time Series 29
4 3 Wavelets Based Smoothing In Time Series Analysis 30
1123.96 KB
  3.1 Introduction To Wavelet 31
  3.2 Wavelet Based Smoothing 36
  3.3 The Discrete Wavelet Transform (DWT) 43
  3.4 Multiresolution Analysis (MRA) 45
  3.5 The Wavelet And Scaling Filters 50
  3.6 The Partial Discrete Wavelet Transform (PDWT) 55
  3.7 Inverse Wavelet Transform 59
  3.8 Choice Of Wavelet Families 64
  3.9 Wavelet Applications 64
  3.10 A New Approach To Forecasting Based On Wavelets 65
  3.11 A New Approach To Wavelet Based Quasi Fuzzy Weight Sets (WBQFWS) 70
  3.12 Wavelets In Similarity Search 80
5 4 Similar Patterns Retrieval Using Wavelet Based Features Time Warping (WBFTW) 82
513.38 KB
  4.1 Dimensionality Reduction Using Wavelets 83
  4.2 Data Compression 86
  4.3 New Approaches To Wavelet Based Features Time Warping (WBFTW) 88
  4.4 A New Approach To Wavelet Based Fuzzy Clustering 103
6 5 Experimental Methods And Evaluation Of Results 106
498.19 KB
  5.1 Artificially Generated Time Series 107
  5.2 The Real Time Data 107
  5.3 Experimental Settings 108
  5.4 Analysis Of Proposed Algorithms And Evaluation Of Results 110
  5.5 Conclusions
7 6 Conclusions 123
80.16 KB
  6.1 Summary 123
  6.2 Our Contributions 123
  6.3 Future Work 125
8 7 References 126
361.46 KB