Karan Sikka

Research Projects

This page gives a detailed account of the research projects I have undertaken, in reverse chronological order.

1. Grozi (ongoing): This work focuses on designing assistive technologies for the blind, specifically object recognition techniques to assist them in grocery shopping. The central idea is to let the user build a list of products (using assistive technologies), for which training images are collected from the internet (the in-vitro images). Once training is complete, the blind user sweeps a camera across the grocery shelves, and the models trained on the in-vitro images are used to identify the products in the store images (the in-situ images). The problem may look like a conventional one that state-of-the-art vision algorithms can solve. However, several factors make it interesting: (1) there is a wide disparity between in-situ and in-vitro data, (2) grocery products may change over time, (3) the field of view (FOV) contains multiple objects whose scale is unknown, (4) one can expect heavy clutter and occlusion, and (5) the system must return very few false positives to remain practically viable for the user. There has been some prior work on this problem in the UCSD vision lab (http://vision.ucsd.edu/project/assistive-technology-visually-impaired); however, it has not yet achieved practical recognition rates.

2. Content Based Image Retrieval, Dr Kannan Karthik, Assistant Professor, ECE Dept., IIT Guwahati (Undergraduate thesis): As part of our bachelor's thesis project, we designed an image retrieval system that takes into account both low-level visual features and semantic (subjective) information. To improve the speed and accuracy of retrieval and to give the user more options, the system works both automatically and semi-automatically. It employs an automatic segmentation algorithm, since a region-based query is closer to human visual perception. The proposed system follows a two-step approach. The first step is the fuzzy association of a query (or its regions) with multiple semantic labels. This is based on mapping different concepts onto the visual feature space using Gaussian mixture models (GMMs) trained by maximum likelihood (ML). The fuzzy annotation is used to extract the top results from the image database. The second step takes this subset of images and applies a host of region-based features, such as colour and texture, to estimate a refined similarity ranking of the database images.
The current formulation has several advantages. Firstly, the semantic modelling takes into account the fact that similar objects can have different visual features and vice versa. Secondly, including context is of paramount importance in image retrieval, since it reduces the ambiguity caused by correlated image features across different objects. Combining this formulation with region-based queries further improves the results.
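
The first step, the fuzzy association of a region with semantic labels, can be illustrated with a minimal sketch. It assumes each concept already has a trained diagonal-covariance Gaussian mixture (the ML training itself is omitted), and the concept names, feature dimensions, and parameter values are hypothetical, chosen only for illustration rather than taken from the thesis.

```python
import math

def diag_gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian at feature vector x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * math.log(2 * math.pi * vi) - (xi - mi) ** 2 / (2 * vi)
    return math.exp(log_p)

def gmm_likelihood(x, components):
    """Mixture likelihood; components is a list of (weight, mean, var)."""
    return sum(w * diag_gaussian_pdf(x, m, v) for w, m, v in components)

def fuzzy_labels(x, concept_models):
    """Normalised likelihoods, read as soft memberships over concepts."""
    scores = {name: gmm_likelihood(x, comps)
              for name, comps in concept_models.items()}
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}

# Hypothetical, hand-set models over a 2-D (hue, texture-energy) feature.
concept_models = {
    "sky": [(1.0, [0.6, 0.1], [0.01, 0.01])],
    "grass": [(0.5, [0.3, 0.5], [0.01, 0.02]),
              (0.5, [0.35, 0.7], [0.01, 0.02])],
}
memberships = fuzzy_labels([0.58, 0.12], concept_models)
```

The memberships sum to one, so a region can be associated with several labels at once, and the highest-scoring labels can then be used to shortlist database images for the second, feature-based ranking step.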

Report: (BTP thesis)

3. Evaluation of clustering algorithms for partitioning abdominal ultrasound images, Dr. Thomas M Deserno, Professor, Dept. of Medical Informatics, RWTH Aachen, Germany (May, 2009 - July, 2009) - Ultrasound images are among the most difficult to segment. This project focussed on the application of two clustering algorithms to these images, namely spectral clustering and k-means. Extensive experiments revealed that the choice of feature spaces, and the weight assigned to each space, had a considerable effect on the segmentation results. Moreover, the evaluation of different segmentation algorithms on ultrasound images suffers from problems such as (i) non-availability of ground truth, (ii) human bias in visual quality assessment, and (iii) partial evaluation on a local region of interest (ROI). To address these problems, standard measures were introduced, together with a novel approach for comparing segmentation maps without ground truth. The two algorithms mentioned above were used for a detailed analysis of the metrics on different maps.
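
The sensitivity to feature-space weights can be seen even on a toy example. The sketch below is not the project's code: it runs plain k-means on per-pixel (intensity, row, column) vectors, and the weights, the toy image, and the choice of initial centres are all assumptions made for illustration.

```python
def weighted_features(image, w_intensity, w_position):
    """One (intensity, row, col) vector per pixel, scaled by per-space weights."""
    rows, cols = len(image), len(image[0])
    return [
        (w_intensity * image[r][c], w_position * r / rows, w_position * c / cols)
        for r in range(rows) for c in range(cols)
    ]

def kmeans(points, centres, iterations=20):
    """Plain Lloyd iterations from the given initial centres."""
    centres = list(centres)
    for _ in range(iterations):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [
            min(range(len(centres)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centres[k])))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        for k in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centre if a cluster empties
                centres[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels

def segment(image, w_intensity, w_position):
    pts = weighted_features(image, w_intensity, w_position)
    return kmeans(pts, [pts[0], pts[-1]])  # hypothetical init: first and last pixel

# Toy image: dark left half, bright right half.
image = [[0.1, 0.1, 0.9, 0.9] for _ in range(4)]
intensity_labels = segment(image, 1.0, 0.0)
position_labels = segment(image, 0.0, 1.0)
```

With all the weight on intensity the two halves are separated cleanly, whereas with all the weight on position the same image is split purely by pixel location, which is the kind of dependence on weights described above.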

Publication: (Link)

4. Segmentation of SAR Images through texture (via Wavelets) and intensity information - Our team got the idea for this project when a researcher from ISRO (Indian Space Research Organisation) contacted us with SAR images of flooded areas in the state of Bihar, India. These images (as can be seen in the slide show) require both texture and intensity information for effective segmentation. We therefore devised an algorithm that segments SAR images by combining the texture and intensity features of a satellite image. Texture information was extracted by computing the variance of the wavelet transform of the image. Moreover, this work introduced the novel concept of computing the variance over a dynamic window rather than a static one, with the window size selected from an estimate of the average texel size. Finally, the algorithm was tested on a number of SAR images of flooded areas. The primary aim of this project was to help the authorities send aid to flood-affected regions by making flooded areas easy to identify. This approach was presented in our second journal publication, in the International Journal of Remote Sensing (Taylor & Francis). The paper can be consulted for more information.
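
A rough sketch of the texture feature is given below, under several assumptions: a one-level Haar transform stands in for the wavelet actually used, the window size is passed in directly rather than derived from a texel-size estimate, and the toy image is invented for illustration. It is not the published algorithm.

```python
def haar_details(image):
    """One-level 2-D Haar transform; returns the three detail subbands."""
    lh, hl, hh = [], [], []
    for r in range(0, len(image) - 1, 2):
        row_lh, row_hl, row_hh = [], [], []
        for c in range(0, len(image[0]) - 1, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            row_lh.append((a - b + d - e) / 2)
            row_hl.append((a + b - d - e) / 2)
            row_hh.append((a - b - d + e) / 2)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return lh, hl, hh

def local_variance(band, window):
    """Variance of coefficients in a window x window neighbourhood,
    clipped at the borders of the subband."""
    rows, cols, half = len(band), len(band[0]), window // 2
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            vals = [band[i][j]
                    for i in range(max(0, r - half), min(rows, r + half + 1))
                    for j in range(max(0, c - half), min(cols, c + half + 1))]
            mean = sum(vals) / len(vals)
            out_row.append(sum((v - mean) ** 2 for v in vals) / len(vals))
        out.append(out_row)
    return out

def texture_map(image, window):
    """Sum of local variances over the three detail subbands."""
    maps = [local_variance(b, window) for b in haar_details(image)]
    return [[sum(m[r][c] for m in maps) for c in range(len(maps[0][0]))]
            for r in range(len(maps[0]))]

# Toy image: flat left half, irregular pattern on the right half.
image = [[0.5 if c < 4 else ((r * c) % 3) / 2 for c in range(8)]
         for r in range(8)]
tex = texture_map(image, window=3)
```

The texture map is zero over the flat region and positive over the patterned one. In the dynamic-window variant described above, the `window` argument would change with the estimated average texel size instead of being fixed.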

Publication : (Link)

5. Segmentation of brain MRI images based on modified fuzzy framework - Automated brain MRI segmentation is a complex problem, especially owing to the presence of intensity inhomogeneity in different brain regions. This work focussed on segmentation using the fuzzy c-means (FCM) clustering algorithm. Two innovative approaches were introduced. Firstly, the problem of choosing the initial clusters fed to FCM was tackled by introducing a local peak merger that estimates the positions of the clusters. Secondly, an entropy-driven homomorphic filter was employed to remove the inhomogeneity in different regions. The performance of the proposed algorithm was validated alongside the FMRIB Software Library (FSL, Oxford) using the ground truths of the BrainWeb database of simulated images. The efficacy of the algorithm was also shown on some real MRI images (both diseased and non-diseased). This project led to my first journal publication (along with two students and my professor) in Magnetic Resonance Imaging, Elsevier. The paper can be referred to for more details.
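
The idea of seeding FCM from histogram peaks can be sketched as follows. The FCM update is the standard one; the peak-picking routine is only a simple stand-in for the local peak merger of the paper, and its bin count, merging distance, and the sample intensities are assumptions.

```python
def histogram_peaks(values, bins=10, min_gap=2):
    """Local maxima of the intensity histogram (values in [0, 1]);
    peaks closer than min_gap bins are merged, keeping the taller one."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    padded = [0] + counts + [0]  # padded[i] is the left neighbour of bin i
    peaks = [i for i in range(bins)
             if counts[i] > 0 and padded[i] <= counts[i] >= padded[i + 2]]
    kept = []
    for i in sorted(peaks, key=lambda i: -counts[i]):
        if all(abs(i - j) >= min_gap for j in kept):
            kept.append(i)
    return sorted((i + 0.5) / bins for i in kept)

def fcm(values, centres, m=2.0, iterations=50):
    """Standard fuzzy c-means on scalar intensities."""
    centres = list(centres)
    for _ in range(iterations):
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        memberships = []
        for x in values:
            inv = [max(abs(x - c), 1e-12) ** (-2.0 / (m - 1.0)) for c in centres]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Centre update: mean weighted by u_ik^m.
        for k in range(len(centres)):
            weights = [u[k] ** m for u in memberships]
            centres[k] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centres, memberships  # memberships come from the final iteration

# Hypothetical intensities drawn from three tissue classes.
values = [0.10, 0.12, 0.11, 0.14, 0.50, 0.52, 0.49, 0.90, 0.88, 0.91]
initial = histogram_peaks(values)
centres, memberships = fcm(values, initial)
```

Starting from the histogram peaks fixes both the number of clusters and their rough positions, so FCM does not depend on a random initialisation.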

Publication : (Link)

6. PSoC based ECG signal acquisition and processing, Dr Amit Kumar Mishra, Assistant Professor, ECE Dept, IIT Guwahati (Design Project) (Jan, 2009 - April, 2009) - This project focussed on designing a handheld device for the acquisition, display and processing of human ECG signals. A small chip called PSoC, developed by Cypress, was the core of the device. A PSoC is a programmable system-on-chip that combines configurable analog and digital blocks with the features of a general-purpose microcontroller. In spite of these advanced features, it is known to operate on very little power. The ECG signal was processed on the chip itself by designing a low-pass averaging filter and a high-pass filter to remove the bias and the 50 Hz noise present in the signal. The chip was interfaced with a 128×64 LCD, refreshed at an appropriate rate to display a flicker-free ECG wave. The heart rate of the patient was also displayed on the LCD.
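
The filtering stage can be sketched offline as below. This is an illustration, not the on-chip implementation: the 500 Hz sampling rate, the one-second baseline window, and the synthetic test signal are assumptions. An averaging filter spanning exactly one mains period has a null at 50 Hz, and subtracting a slow running mean acts as a simple high-pass filter that removes the bias.

```python
import math

def moving_average(signal, width):
    """Causal moving average; early samples average over what is available."""
    out, running = [], 0.0
    for i, x in enumerate(signal):
        running += x
        if i >= width:
            running -= signal[i - width]
        out.append(running / min(i + 1, width))
    return out

def clean_ecg(signal, fs, mains_hz=50, baseline_seconds=1.0):
    """Low-pass averaging to suppress mains hum, then baseline removal."""
    # Averaging over one mains period places a null at the mains frequency.
    smooth = moving_average(signal, round(fs / mains_hz))
    # A slow running mean tracks the bias; subtracting it is a high-pass step.
    baseline = moving_average(smooth, int(fs * baseline_seconds))
    return [s - b for s, b in zip(smooth, baseline)]

# Synthetic check: a constant bias plus 50 Hz interference, no heartbeat.
fs = 500  # assumed sampling rate in Hz
noisy = [1.0 + 0.5 * math.sin(2 * math.pi * 50 * n / fs) for n in range(2 * fs)]
cleaned = clean_ecg(noisy, fs)
```

Once both filters have settled, the bias and the 50 Hz component are removed from this test signal. A real ECG would also be smoothed slightly by the averaging window, which is the usual trade-off for such a cheap filter on a small chip.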

Karan Sikka

University of California San Diego

Electrical and Computer Engineering