A neuro fuzzy image fusion using block based feature level method

ABSTRACT


INTRODUCTION
A wide variety of data acquisition devices is now available, and image fusion has therefore become an important subarea of image processing. Many sensors cannot render all objects at different distances with equal clarity, so several images of a scene are captured, each focused on a different part of it [1]. With the availability of multi-sensor data in fields such as remote sensing, medical imaging, machine vision and military applications, sensor fusion has emerged as a new and promising research area. The current definition of sensor fusion is very broad, and fusion can take place at the signal, pixel, feature and symbol levels. The goal of image fusion is to create new images that are more suitable for human visual perception, object detection and target recognition. In this paper we address the problem of pixel-level fusion [2,3], the so-called image fusion problem. To obtain an image in which all objects are in focus, image fusion is performed either in the spatial domain or in a transform domain. Spatial-domain techniques operate directly on the pixel values. Transform-domain techniques, typically multi-scale or multi-resolution approaches, first transform the source images; after certain operations are applied to the transformed images, the fused image is created by taking the inverse transform.
Image fusion is generally performed at three levels of information representation: pixel level, feature level and decision level. In pixel-level image fusion, simple mathematical operations such as maximum or average are applied to the pixel values of the sources to generate the fused image [4,5]. However, these techniques usually smooth sharp edges or leave blurring effects in the fused image. In feature-level multi-focus image fusion, the source images are first segmented into regions, and the feature values of these regions are calculated. The regions used to generate the fused image are then selected by a fusion rule. In decision-level image fusion, the objects in the source images are first detected, and the fused image is then generated by a suitable fusion algorithm [6,7]. In this paper, a new method for multi-focus image fusion is proposed.

FUZZY BASED IMAGE FUSION
Fuzzy image processing is not a single theory. It is the collection of all approaches that understand, represent and process images, their segments and their features as fuzzy sets. The representation and processing depend on the selected fuzzy technique and on the problem to be solved. Fuzzy image processing has three main stages:
− Image fuzzification (using membership functions to graphically describe a situation)
− Modification of membership values (application of fuzzy rules)
− Image defuzzification (obtaining the crisp, or actual, results)
The coding of image data (fuzzification) and the decoding of the results (defuzzification) are the steps that make it possible to process images with fuzzy techniques. The main power of fuzzy image processing lies in the middle step (modification of membership values). After the image data are transformed from the gray-level plane to the membership plane (fuzzification), appropriate fuzzy techniques modify the membership values.
Multisensor data fusion can be performed at four different processing levels, according to the stage at which the fusion takes place: signal level, pixel level, feature level and decision level. Figure 1 illustrates the concept of the four fusion levels [8][9][10].
(1) Signal level fusion. In signal-based fusion, signals from different sensors are combined to create a new signal with a better signal-to-noise ratio than the original signals.
(2) Pixel level fusion. Pixel-based fusion is performed on a pixel-by-pixel basis. It generates a fused image in which the information associated with each pixel is determined from a set of pixels in the source images, in order to improve the performance of image processing tasks such as segmentation.
(3) Feature level fusion. Feature-based fusion requires the extraction of objects recognized in the various data sources. It requires the extraction of salient features that depend on their environment, such as pixel intensities, edges or textures. Similar features from the input images are then fused.
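The three stages of fuzzy image processing described above can be illustrated with a minimal sketch. The membership function (gray level divided by 255) and the modification step (a contrast intensification operator) are assumptions chosen only for illustration; the paper does not specify them, and the function names are hypothetical.

```python
def fuzzify(gray, max_level=255):
    """Map a gray level in [0, max_level] to a membership value in [0, 1]."""
    return gray / max_level


def intensify(mu):
    """Assumed membership modification: push values away from 0.5."""
    if mu <= 0.5:
        return 2.0 * mu * mu
    return 1.0 - 2.0 * (1.0 - mu) ** 2


def defuzzify(mu, max_level=255):
    """Map a membership value back to the nearest integer gray level."""
    return int(round(mu * max_level))


def process(image):
    """Run fuzzification, modification and defuzzification on every pixel."""
    return [[defuzzify(intensify(fuzzify(g))) for g in row] for row in image]
```

In this sketch dark pixels become darker and bright pixels brighter, while the extreme gray levels 0 and 255 are left unchanged.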

Steps in fuzzy image fusion
The original image in the gray-level plane is subjected to fuzzification, and the modification of membership functions is carried out in the membership plane. The output image is obtained after the defuzzification process. The algorithm for pixel-level image fusion using fuzzy logic is given as follows [11].
a. Read the first image into variable I1 and find its size (rows: r1, columns: c1).
b. Read the second image into variable I2 and find its size (rows: r2, columns: c2).
c. Variables I1 and I2 are images in matrix form, where each pixel gray-level value is in the range 0 to 255.
d. Compare the rows and columns of both input images. If the two images are not of the same size, select the portions that are of the same size.
e. Convert the images to column form, which has C = r1×c1 entries.
f. Make a fuzzy inference system file that has two input images.
g. Decide the number and type of membership functions for both input images by tuning the membership functions.
h. Input images in the antecedent are resolved to a degree of membership ranging from 0 to 255.
i. Make fuzzy if-then rules for the input images, which resolve the two antecedents to a single number from 0 to 255.
In the proposed method, a set of 10 different images is used to train the neural network. Every image is first divided into a number of blocks, and features are calculated as shown in Figure 2. The block size plays an important role in distinguishing the blurred and unblurred regions from each other. After the images are divided into blocks, the feature values of every block of all the images are calculated and a features file is created. A sufficient number of feature vectors is used to train the neural network. The trained neural network is then used to fuse any set of multi-focus images.
In the field of artificial intelligence, neuro-fuzzy refers to combinations of artificial neural networks and fuzzy logic. The neuro-fuzzy combination results in a hybrid intelligent system that synergizes the two techniques by combining the human-like reasoning style of fuzzy systems with the learning and connectionist structure of neural networks. Neuro-fuzzy hybridization is widely termed a Fuzzy Neural Network (FNN) or Neuro-Fuzzy System (NFS) in the literature. A neuro-fuzzy system incorporates the human-like reasoning style of fuzzy systems through the use of fuzzy sets and a linguistic model consisting of a set of IF-THEN fuzzy rules. Neuro-fuzzy systems must balance two contradictory requirements in fuzzy modeling: interpretability and accuracy. In practice, one of the two properties prevails. Neuro-fuzzy research in fuzzy modeling is accordingly divided into two areas: linguistic fuzzy modeling, which focuses on interpretability (mainly the Mamdani model), and precise fuzzy modeling, which focuses on accuracy (mainly the Takagi-Sugeno-Kang (TSK) model).
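The block division step used to build the features file can be sketched as follows. This is a minimal illustration, assuming the image is a list of rows of gray levels and that blocks do not overlap; the function name and the handling of partial edge blocks are assumptions, not details given in the paper.

```python
def split_into_blocks(image, block_h, block_w):
    """Split a 2-D image (list of rows) into non-overlapping blocks.

    Returns a list of ((top, left), block) pairs. Blocks on the right or
    bottom edge are smaller when the image size is not a multiple of the
    block size.
    """
    blocks = []
    for top in range(0, len(image), block_h):
        for left in range(0, len(image[0]), block_w):
            block = [row[left:left + block_w]
                     for row in image[top:top + block_h]]
            blocks.append(((top, left), block))
    return blocks
```

Each block returned here would then be passed to the feature functions, and the resulting feature values collected into one feature vector per block.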

ALGORITHM FOR NEURO FUZZY BASED IMAGE FUSION
The algorithm for pixel-level image fusion using neuro-fuzzy logic is given as follows.
− Read the first image into variable I1 and find its size (rows: z1, columns: s1).
− Read the second image into variable I2 and find its size (rows: z2, columns: s2).
− Variables I1 and I2 are images in matrix form, where each pixel value is in the range 0 to 255. Use the gray colormap.
− Compare the rows and columns of both input images. If the two images are not of the same size, select the portions that are of the same size.
− Convert the images to column form, which has C = z1×s1 entries.
− Form the training data, which is a matrix with three columns whose entries in each column run from 0 to 255 in steps of 1.
− Form the check data, which is a matrix of the pixels of the two input images in column format.
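The data preparation steps of this algorithm (size comparison, conversion to column form, and construction of the check data) can be sketched as follows. This is only an illustration under stated assumptions: images are lists of rows, the common portion is taken from the top-left corner, and pixels are flattened in row-major order, none of which the paper specifies. All function names are hypothetical.

```python
def crop_to_common(img1, img2):
    """Keep the top-left portion that both images have in common."""
    rows = min(len(img1), len(img2))
    cols = min(len(img1[0]), len(img2[0]))

    def crop(img):
        return [row[:cols] for row in img[:rows]]

    return crop(img1), crop(img2)


def to_column(img):
    """Flatten an image into a single column (row-major order assumed)."""
    return [pixel for row in img for pixel in row]


def make_check_data(img1, img2):
    """Build a two-column matrix: one row per pixel position."""
    a, b = crop_to_common(img1, img2)
    return [list(pair) for pair in zip(to_column(a), to_column(b))]
```

The flattening order does not matter as long as the same order is used for both inputs and for reshaping the fused column back into an image.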
In feature-level image fusion, the selection of features is an important task. In multi-focus images, some of the objects are clear (in focus) and some are blurred (out of focus). Five different features are used to characterize the information level contained in a specific portion of the image: variance, energy of gradient, contrast visibility, spatial frequency and Canny edge information.

Features selection
Contrast Visibility: It calculates the deviation of a block of pixels from the block's mean value. Therefore, it relates to the clearness level of the block. The visibility of the image block is obtained using (1).
$$VI = \frac{1}{m \times n} \sum_{(i,j) \in B_k} \frac{\left| I(i,j) - \mu_k \right|}{\mu_k} \tag{1}$$
where VI is the contrast visibility, $\mu_k$ is the mean of block $B_k$, m×n is the size of the block, and I(i, j) is the pixel value at row i and column j of the image.
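A minimal sketch of the contrast visibility of one block, assuming the commonly used definition in which the mean absolute deviation from the block mean is normalized by that mean. Returning 0 for an all-black block (mean of zero) is an assumption made here to avoid division by zero.

```python
def contrast_visibility(block):
    """Mean absolute deviation from the block mean, divided by the mean."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    if mean == 0:
        # Assumed convention: an all-black block has no visibility.
        return 0.0
    return sum(abs(p - mean) for p in pixels) / (len(pixels) * mean)
```

A uniform block yields 0, and the value grows as pixel values spread further from the block mean.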
Variance: Variance is used to measure the extent of focus in an image block. It is the mathematical expectation of the squared deviations from the mean. A pseudo center-weighted local variance in the neighborhood of an image pixel determines the amplification factor that multiplies the difference between the image pixel and its blurred counterpart before it is combined with the original image. It is calculated using (2).
$$V = \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} \left( I(i,j) - \mu \right)^2 \tag{2}$$
where V is the variance, $\mu$ is the mean value of the image block, I(i, j) is the pixel value at row i and column j, and m×n is the block size. A high value of variance indicates a greater extent of focus in the image block.
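A short sketch of the block variance as described above (mean of the squared deviations from the block mean), assuming the block is a list of rows of gray levels.

```python
def block_variance(block):
    """Population variance of the pixel values in a block."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)
```

For two corresponding blocks of a multi-focus pair, the block with the larger value would be treated as the better focused one under this feature alone.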
Spatial Frequency: Spatial frequency measures the activity level in an image. It is used to calculate the frequency changes along the rows and columns of the image. In vision science, spatial frequency refers to the number of pairs of light and dark bars imaged within a given distance on the retina. One-third of a millimetre is a convenient unit of retinal distance because an image of this size is said to subtend one degree of visual angle on the retina. For example, the index fingernail casts an image of this size when viewed at arm's length; the entire width of a typical human thumb, not just the nail, casts an image about twice as big, or two degrees of visual angle. The size of the retinal image cast by an object depends on the distance of that object from the eye: as the distance between the eye and an object decreases, the object's image subtends a greater visual angle. The unit used to express spatial frequency is the number of cycles that fall within one degree of visual angle. Spatial frequency is measured using (3).
$$SF = \sqrt{RF^2 + CF^2}, \quad RF = \sqrt{\frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=2}^{n} \left[ I(i,j) - I(i,j-1) \right]^2}, \quad CF = \sqrt{\frac{1}{m \times n} \sum_{j=1}^{n} \sum_{i=2}^{m} \left[ I(i,j) - I(i-1,j) \right]^2} \tag{3}$$
where SF is the spatial frequency, RF is the row frequency, CF is the column frequency, m×n is the size of the image, and I(i, j) is the pixel value at row i and column j.
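A minimal sketch of the spatial frequency of a block, assuming the usual definition in which the row frequency RF and column frequency CF are root-mean-square differences between horizontally and vertically adjacent pixels, and SF is the square root of RF squared plus CF squared.

```python
import math


def spatial_frequency(block):
    """Combine row and column frequencies into one activity measure."""
    m, n = len(block), len(block[0])
    # Squared differences between horizontal neighbours (along each row).
    row_sq = sum((block[i][j] - block[i][j - 1]) ** 2
                 for i in range(m) for j in range(1, n))
    # Squared differences between vertical neighbours (along each column).
    col_sq = sum((block[i][j] - block[i - 1][j]) ** 2
                 for i in range(1, m) for j in range(n))
    rf = math.sqrt(row_sq / (m * n))
    cf = math.sqrt(col_sq / (m * n))
    return math.sqrt(rf ** 2 + cf ** 2)
```

A blurred block has small differences between neighbouring pixels and therefore a low spatial frequency.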
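Energy of Gradient (EOG): It is also used to measure the amount of focus in an image. It is calculated using (4).
$$EOG = \sum_{i=1}^{m-1} \sum_{j=1}^{n-1} \left( f_i^2 + f_j^2 \right), \quad f_i = I(i+1,j) - I(i,j), \quad f_j = I(i,j+1) - I(i,j) \tag{4}$$
where EOG is the energy of gradient, $f_i$ is the gradient along the rows, $f_j$ is the gradient along the columns, and m×n is the size of the image.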
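A minimal sketch of the energy of gradient, assuming the common definition as the sum of squared forward differences between each pixel and its neighbours in the next row and the next column.

```python
def energy_of_gradient(block):
    """Sum of squared forward differences in both directions."""
    m, n = len(block), len(block[0])
    total = 0
    for i in range(m - 1):
        for j in range(n - 1):
            f_i = block[i + 1][j] - block[i][j]  # difference to next row
            f_j = block[i][j + 1] - block[i][j]  # difference to next column
            total += f_i * f_i + f_j * f_j
    return total
```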
Edge Information: The edge pixels in the image block can be found using the Canny edge detector, which returns 1 if the current pixel belongs to an edge in the image and 0 otherwise. The edge feature is simply the number of edge pixels contained within the image block. Edge detection is a fundamental tool in image processing, image analysis, pattern recognition and computer vision, particularly in the areas of feature detection and feature extraction, which aim to identify points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities. The purpose of detecting sharp changes in image brightness is to capture important events and changes in the properties of the world. Edges extracted from non-trivial images are often hampered by fragmentation (the edge curves are not connected), by missing edge segments, and by false edges that do not correspond to interesting phenomena in the image, all of which complicate the subsequent task of interpreting the image data.

Artificial neural network
Artificial neural networks (ANNs) have proven to be a more powerful and self-adaptive method of pattern recognition than traditional linear and simple nonlinear analyses. The ANN-based method employs a nonlinear response function that iterates many times in a special network structure in order to learn the complex functional relationship between input and output training data. The input layer has several neurons, which represent the feature factors extracted and normalized from image A and image B. The hidden layer has several neurons, and the output layer has one or more neurons. The ith neuron of the input layer connects to the jth neuron of the hidden layer with weight Wij, and the weight between the jth neuron of the hidden layer and the tth neuron of the output layer is Vjt (in this case t = 1). The weighting function is used to simulate and recognize the response relationship between the features of the fused image and the corresponding features of the original images (image A and image B). As the first step of ANN-based data fusion, the two registered images are decomposed into several blocks of size M × N. Then, the features of the corresponding blocks in the two original images are extracted, and the normalized feature vector fed to the neural network is constructed. The features used here to evaluate the fusion effect are normally spatial frequency, visibility and edges. The next step is to select some vector samples to train the neural network. An ANN is a universal function approximator that directly adapts to any nonlinear function defined by a representative set of training data. Once trained, the ANN model retains the functional relationship and can be used for further calculations. For these reasons, the ANN concept has been adopted to develop strongly nonlinear models for multi-sensor data fusion.

Feed forward neural network
A neuron can have any number of inputs from one to n, where n is the total number of inputs. The inputs may therefore be represented as x1, x2, x3, …, xn, and the corresponding weights as w1, w2, w3, …, wn. The activation a is the sum of the inputs multiplied by their weights: a = x1w1 + x2w2 + x3w3 + … + xnwn. Assuming the arrays of inputs and weights are already initialized, the output is 1 if the activation is greater than the threshold and 0 if the activation is less than the threshold.
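The weighted sum and threshold rule just described can be sketched as follows. The text does not say what happens when the activation equals the threshold; this sketch assumes an output of 0 in that case.

```python
def neuron_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of inputs exceeds the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    activation = sum(x * w for x, w in zip(inputs, weights))
    # Assumption: activation equal to the threshold gives 0.
    return 1 if activation > threshold else 0
```

For example, inputs (1, 0, 1) with weights (0.5, 0.5, 0.5) give an activation of 1.0, so the neuron fires for a threshold of 0.8.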
One way of connecting neurons is to organise them into a design called a feed forward network. It gets its name from the way the neurons in each layer feed their output forward to the next layer until the final output of the neural network is obtained. A feed forward neural network is first trained with the block features of ten pairs of multi-focus images. A feature set comprising spatial frequency, contrast visibility, edges, variance and energy of gradient is used to define the clarity of the image block. The block size is determined adaptively for each image. The trained neural network is then used to fuse any pair of multi-focus images.
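The block-wise fusion step can be sketched as below. The trained feed forward network is not reproduced here; instead, the decision of which block is clearer is delegated to a function passed in as `is_a_sharper`, and a simple variance comparison is used as a stand-in. Both names, the fixed square block size and the stand-in rule are assumptions for illustration only.

```python
def variance_rule(blk_a, blk_b):
    """Stand-in for the trained network: prefer the higher-variance block."""
    def variance(block):
        pixels = [p for row in block for p in row]
        mean = sum(pixels) / len(pixels)
        return sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return variance(blk_a) >= variance(blk_b)


def fuse_blocks(img_a, img_b, block_size, is_a_sharper=variance_rule):
    """Build the fused image by copying the clearer block at each position."""
    rows, cols = len(img_a), len(img_a[0])
    fused = [[0] * cols for _ in range(rows)]
    for top in range(0, rows, block_size):
        for left in range(0, cols, block_size):
            blk_a = [r[left:left + block_size]
                     for r in img_a[top:top + block_size]]
            blk_b = [r[left:left + block_size]
                     for r in img_b[top:top + block_size]]
            chosen = blk_a if is_a_sharper(blk_a, blk_b) else blk_b
            for offset, row in enumerate(chosen):
                fused[top + offset][left:left + len(row)] = row
    return fused
```

In the method described in the text, the decision function would instead evaluate the trained network on the five-feature vectors of the two blocks.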

Quantitative measures
Different quantitative measures are used to evaluate the performance of the fusion techniques. We used three measures: root mean square error (RMSE), peak signal-to-noise ratio (PSNR) and entropy (He).

Root mean square error
The analytical performance studies aimed to assess image fusion performance quantitatively in a straightforward manner. The root mean square error (RMSE), defined by the deviations between the reference image pixel value R(i, j) and the fused image pixel value F(i, j), is computed as
$$RMSE = \sqrt{\frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} \left[ R(i,j) - F(i,j) \right]^2}$$
where m × n is the input image size. A value of 0 corresponds to complete reconstruction of the m × n image: the fused multi-focus image then matches the reference image exactly.
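A minimal sketch of the RMSE between a reference image and a fused image, assuming both are lists of rows of equal size.

```python
import math


def rmse(reference, fused):
    """Root mean square error between two equally sized images."""
    m, n = len(reference), len(reference[0])
    if len(fused) != m or any(len(row) != n for row in fused):
        raise ValueError("images must have the same size")
    total = sum((reference[i][j] - fused[i][j]) ** 2
                for i in range(m) for j in range(n))
    return math.sqrt(total / (m * n))
```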

Entropy
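Entropy is a measure of the amount of uncertainty about the image. It is given by
$$H_e = -\sum_{i=0}^{L-1} p(i) \log_2 p(i)$$
where L is the number of gray levels and p(i) is the probability (normalized histogram count) of gray level i.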
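A minimal sketch of image entropy computed from the gray-level histogram, assuming integer pixel values in the range 0 to L−1 and a base-2 logarithm (so the result is in bits).

```python
import math


def image_entropy(image, levels=256):
    """Shannon entropy (in bits) of the gray-level histogram."""
    counts = [0] * levels
    for row in image:
        for pixel in row:
            counts[pixel] += 1
    total = sum(counts)
    # Empty histogram bins contribute nothing to the sum.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
```

A constant image has zero entropy, and an image whose pixels are spread evenly over k gray levels has entropy log2(k).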

PERFORMANCE EVALUATION
The input image is divided into blocks, and the five features are extracted for use with the feed forward neural network. The performance of the existing Average, Maximum, Minimum and PCA based techniques is compared with the results of the proposed technique. The simulation is carried out using MATLAB 7.5; the simulation window is shown in Figure 3 (h). Figure 3 (a, b, c, d, e and f) shows the fused images produced by the various fusion techniques, namely averaging, minimum, maximum, PCA and the block-based feature level method. Table 1 shows the entropy of the various methods and Figure 5 shows the same results graphically, from which it can be seen that the proposed technique gives better results than the previous methods. Evaluation measures are used to assess the quality of the fused image. The fused images are evaluated using the following parameters.

Image quality index
Image quality index (IQI) measures the similarity between two images (I1 and I2), and its value ranges from -1 to 1. IQI is equal to 1 if both images are identical.
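The text gives no formula for the IQI. One widely used index with exactly this range is the universal image quality index of Wang and Bovik, Q = 4·σxy·x̄·ȳ / ((σx² + σy²)(x̄² + ȳ²)); the sketch below assumes that definition and computes it once over the whole image rather than over sliding windows, which is a simplification.

```python
def image_quality_index(img1, img2):
    """Global universal image quality index (assumed definition)."""
    x = [p for row in img1 for p in row]
    y = [p for row in img2 for p in row]
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("images must have the same size (at least 2 pixels)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    denom = (var_x + var_y) * (mean_x ** 2 + mean_y ** 2)
    if denom == 0:
        raise ValueError("index undefined for constant or all-zero images")
    return 4 * cov_xy * mean_x * mean_y / denom
```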

Mutual information measure
Mutual information measure (MIM) gives the amount of information that one image contains about another. This provides a guideline for selecting the best fusion method. Given two images M(i, j) and N(i, j), the MIM is defined as
$$I_{MN} = \sum_{x} \sum_{y} P_{MN}(x,y) \log \frac{P_{MN}(x,y)}{P_M(x) \, P_N(y)}$$
where $P_M(x)$ and $P_N(y)$ are the probability density functions of the individual images, and $P_{MN}(x,y)$ is the joint probability density function.
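A minimal sketch of the mutual information between two images, estimated from the marginal and joint gray-level histograms. The base-2 logarithm is an assumption, since the text does not state the base; it only changes the unit of the result.

```python
import math
from collections import Counter


def mutual_information(img_m, img_n):
    """Mutual information (in bits) from joint and marginal histograms."""
    xs = [p for row in img_m for p in row]
    ys = [p for row in img_n for p in row]
    if len(xs) != len(ys):
        raise ValueError("images must have the same size")
    total = len(xs)
    joint = Counter(zip(xs, ys))   # counts of co-occurring gray levels
    count_x = Counter(xs)          # marginal counts for the first image
    count_y = Counter(ys)          # marginal counts for the second image
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / total
        p_x = count_x[x] / total
        p_y = count_y[y] / total
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi
```

The mutual information of an image with itself equals its entropy, and it is zero when the gray levels of the two images are statistically independent.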

Fusion factor
Given two images A and B and their fused image F, the fusion factor (FF) is defined as
$$FF = I_{AF} + I_{BF}$$
where $I_{AF}$ and $I_{BF}$ are the MIM values between each input image and the fused image. A higher value of FF indicates that the fused image contains a reasonably good amount of the information present in both images. However, a high value of FF does not imply that the information from both images is symmetrically fused.

Fusion symmetry
Fusion symmetry (FS) is an indication of the degree of symmetry in the information content from both images. The quality of a fusion technique depends on the degree of fusion symmetry. Since FS is the symmetry factor, when the sensors are of good quality FS should be as low as possible, so that the fused image derives features from both input images [11][12][13][14]. If either sensor is of low quality, it is better to maximize FS than to minimize it.
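The fusion factor and fusion symmetry can both be computed from the two mutual information values. The sketch below assumes the definitions commonly used in the fusion literature, FF = I_AF + I_BF and FS = |I_AF / (I_AF + I_BF) − 0.5|; the FS formula in particular is not given in the text and is an assumption here.

```python
def fusion_factor(i_af, i_bf):
    """Total information the fused image shares with both inputs."""
    return i_af + i_bf


def fusion_symmetry(i_af, i_bf):
    """Distance from a perfectly balanced contribution (0 = symmetric)."""
    total = i_af + i_bf
    if total == 0:
        raise ValueError("fusion symmetry is undefined when both values are 0")
    return abs(i_af / total - 0.5)
```

With this definition FS lies between 0 (both inputs contribute equally) and 0.5 (the fused image draws on only one input).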

Fusion index
The study proposes a parameter called the fusion index, derived from the fusion symmetry and the fusion factor. The fusion index (FI) is defined as
$$FI = \frac{I_{AF}}{I_{BF}}$$
where $I_{AF}$ is the mutual information index between the multispectral image and the fused image, and $I_{BF}$ is the mutual information index between the panchromatic image and the fused image. The quality of a fusion technique depends on the degree of fusion index. In the entropy computation, p contains the histogram counts returned from imhist.

RESULTS AND DISCUSSIONS
There are many typical applications for image fusion. Modern spectral scanners gather up to several hundred spectral bands, which can be visualized and processed individually or fused into a single image, depending on the image analysis task. In this section, input images are fused using fuzzy logic-based approaches [15][16][17]. Our experimental results show that the neuro-fuzzy image fusion approach performs better than fuzzy-based image fusion for the first two test cases. The image quality index (IQI), which measures the similarity between the reference and fused images, is 0.9999, 0.9829 and 0.3182 for the three test cases with the neuro-fuzzy approach, compared with 0.9758, 0.9824 and 0.8871 with the fuzzy-based fusion technique. The neuro-fuzzy values are therefore higher for the first two cases, whereas fuzzy-based image fusion performs better for the third test case. Further investigation is needed to resolve this issue. Figure 4 shows the medical images (CT and MRI brain) fused by different image fusion techniques and by the proposed method. Panels (a) and (b) are the input CT and MRI images respectively. Table 2 shows the results of the quantitative measures for the medical (brain) images. The values of each quality assessment parameter for all of the fusion approaches mentioned are given in Table 3.

CONCLUSION
In this paper, a block-based feature-level multi-focus image fusion technique is proposed for fusing images in which different regions are out of focus. A feed forward neural network is first trained with the block features of a pair of multi-focus images. A feature set comprising spatial frequency, contrast visibility, edges, variance and energy of gradient is used to define the clarity of the image block. The block size is determined adaptively for each image. The trained neural network is then used to fuse any pair of multi-focus images. The performance of the existing Average, Maximum, Minimum and PCA based techniques is compared with the results of the proposed technique. The experimental results show that the proposed technique performs better than the existing techniques. They also show that the proposed image fusion using fuzzy logic gives a considerable improvement in the quality of the fusion system, and that neuro-fuzzy based image fusion preserves more texture information.