User:Iub5cse598a

Sublinear algorithms is a fast emerging subfield of algorithms which lets us quickly analyze given data and infer useful information very fast. It has a variety of application areas including streaming algorithms and property testing. One such area which has a large practical significance is using such methods to analyze visual properties of images. In today's world there is a extremely large number of images which are uploaded and distributed via the internet, surveilance cameras and other sources. Using sublinear algorithms to analyze them gain useful information quickly can be a very useful tool. This subfield of sublinear algorithms for visual property testing is relatively new and thus there is a large scope for future research in the area.

Sublinear algorithms[edit]

Sublinear algorithms is a relatively new field in algorithms motivated by large datasets. Traditionally algorithms used time and space which were polynomial in the input size. However, with the large data sets which are present today, a need arises to quickly process the data and to be able to infer something useful from it in time which is either independent or has a poly logarithmic dependence on the input size rather than linear or worse.

There are a variety of applications of sublinear algorithms such as to social network graphs, continental scale road networks or data mining. A common problem tackled in the field is property testing. This involves given some data such as a graph, text or image. We have to answer whether it satisfies a given property. This could be as simple as given a graph answer if it is connected. Or something far more complicated such as given an image answer if it matches a given template.

Often given the constraints that we have to process the data in under linear time and space we are not able to provide exact answers but rather provide certificates such as with high probability the data satisfies the given property or is 'far' from satisfying the given property.

Visual property testing[edit]

Visual data be it images or videos provide a rich source of information to be mined. With the growth of the internet and user generated content such as photos on facebook or picasa and with the grown of surveillance technology the need to effectively mine this data set for information is of great practical significance. There are a variety of visual properties one can test. They can range from simple properties such as does the image have a half plane to far more complicated ones such as detection of objects and their trajectories in videos.

The speed with which this data can be processed is very important. Even a simple iPhone camera has a 3 mega pixel camera. Thus processing the data in time and space which is polynomial in the data size is not feasible given the scale involved. Given the practical importance of this field there has already been a lot of work done in the area. However given the scale, real world and noisy nature of the area a lot of the work involves heuristic and ad-hoc approaches. Earlier clear cut provable algorithms would take too much time or simply not provide useful results. However given the advent of sublinear algorithms we have an opportunity to try and apply and adapt some of the techniques developed to create algorithms which are useful as well as provably efficient.

Model for testing visual properties using sublinear algorithms[edit]

While testing for visual properties using sublinear algorithms. We use the following model. We have a property we want to test for. We accept it with high probability if it matches the property. Secondly, if by modifying a $\epsilon$ fraction of the pixels or fewer it can be made to match the property we can either accept or reject. Finally if we need to modify more than $\epsilon$ fraction of pixels we reject with high probability.

To illustrate this consider the following image with 9 pixels. Assume we want to test if has the left 3 pixels as black. And we are willing to accept an error of 1 pixel i.e. 1/9 fraction of the pixels can be wrong. Thus we accept the first with high probability. We can either accept or reject the second image and the third image has to be rejected with high probability.

Image property testing under the sublinear model

In this model we call the done care image as 1/9 far from the required property and the rejected image 2/9 far from the required property as we need to modify 1 and 2 pixels respectively to get it to match the required image. Our $\epsilon$ is 1/9.

Some visual properties have already been studied under this model.

Testing if an image is a half plane[edit]

In their paper Raskhodnikova et al. ^[1] provide a sublinear algorithm for determining if a given image is a half plane or not. The algorithm works by querying the four corners of the image. If all corners are the same colour it outputs yes or 1 or 3 of them are different colour it output no. If 2 of them are different colours it does a binary search to determine points sufficiently close on the sides of the square where the colour changes. After which it looks at the regions formed by joining the respective white and black points and queries further within that region if needed.

Testing if an image has a convex set of dark pixels[edit]

They also provided a method for checking if the set of dark pixels in the image forms a convex set or not. The method proceeds by forming a grid of points within the image and queries the grid points. Thereafter it finds a convex hull of the grid points which are black. After this it queries a few pixels within the convex hull to check if it contains white points and rejects if it finds any. After this from the squares formed around the remaining grid points which do not intersect with the convex hull of black points it queries a few pixels to see if any black points exists and rejects if it finds any. Otherwise it accepts.

Testing if an image has a connected set of dark pixels[edit]

Also known from them is a test for checking if the set of dark pixels in the graph forms a connected set or not. The algorithm works by inspecting a few points at random and using a breath first search to finds all the black pixels from that point upto a certain number of black pixels. After this if no more black pixels are found we have a small connected component. If no such component is found then it rejects and if a such a component is found it queries the pixels in a small range around this component to see if any black pixels which are not connected to the component. If this is the case it rejects. Otherwise it accepts.

Testing if an image matches a certain template[edit]

A more complex property of an image can be whether it matches a certain template. For example we may want to determine if the image is a photo of the horizon or of the sun. In each such case we have a very particular patter we want the image to be. In the case of a horizon it should be an upper part which is blue (sky) and a lower part which is say green (ground). In the case of a photo of the sun we want an image which is all blue (sky) and with a central part being yellow (sun). Such matching helps us easily classify images and is very useful.

Progress towards achieving this using sublinear algorithms have been made by Kleiner et al. ^[2] . In the research the authors assume a black and white image and a template consisting of 0s and 1s which indicate the required pattern and they want to test for. To illustrate this concept consider the following image. The template is given two images which match it are shown and also one which does not match it.

Example of a template and matching and non matching images.

It is important to note that the matching is for the pattern only the size of the black part in the middle does not matter. All that matters is that it matches the template which is a single black area surrounded by white. For the sun example discussed earlier the required template to match it can be seen as follows.

For this problem, the researchers found a method which depends only on the size of the template, the area of the minimal partition block allowed, and a measure of correspondence to the template (for close or far it is from being a correct or incorrect match). The method is independent of the size of the input image. Furthermore, their method can be extended to 3D data and grayscale images. Thus they provide a powerful framework which can be used in more specialized detection.

Facial localization[edit]

Facial localization is the method that that determines the human faces in an image and identifies them with a known database of human faces. This is a technique which is very useful for social media sites such as facebook and also in law enforcement. For example when users upload pictures of themselves to facebook, facebook automatically detects the faces and suggests which of yours friends are in that picture so you can tag them. Law enforcement used the same technique during the superbowl to scan the stadium and to try and detect people with criminal records so as to better avoid any law and order violations during the game. Given the practical significance and the huge amount of data that needs to be processed quickly using sublinear methods in facial detection would be an useful tool.

The process of localizing faces is actually a two step process. The first part is face detection where the algorithm detects potential faces in the image. The aim here to find all the faces in the image and ignore everything else. The second facial recognition is when the algorithm takes the faces it found in an image and attempts to match them with a known database of faces.

There are a variety of methods to perform facial detection and earlier approached restricted themselves to finding faces in images with controlled background such as in passport photographs. This provided a very simple pattern to mine for because the background and face position are strictly controlled.

Subsequently people started using skin colour to try and extra faces however as skin colour, angle and lighting can vary this is a more complex problem. If we use video sometimes finding faces by motion can be used. Common facial animations such as blinking can be detected and used to locate faces.^[3]

Applying Template matching to facial detection[edit]

Attempting to apply the template matching work done is promising for face detection. However there are certain challenges which have to be overcome.

Scaling must be taken into account depending on the distance to the face but this can result in mismatches because the scaling as it exists can be done in a number of ways which does not maintain the relative distances.

Different angles and rotations of the face must be taken into account. However this is a problem which is known to be harder to tackle.

First of all faces can be under different lighting and this can lead to differing templates which are hard to predict even for images which are taken from the same angle and distance.

lighting can create two different templates

The rotation issue can be handled to some extent by using a variety of templates and trying to match for many angles however the scaling issue needs to have some more research done into and finally the lighting issue is much harder to tackle however possibly the fact that the method is tolerant to noise can be leveraged.

Possible applications in facial recognition[edit]

One method for matching faces is called Elastic Bunch Graph Matching. All human faces share a similar topological structure. Faces are represented as graphs, with nodes positioned at certain key points such as eyes, nose etc. Each node contains a set of 40 coefficients characterizing it called "jets". Also there are edges between nodes which correspond to distances between the points. Recognition is based on matching these labeled graphs.^[4]

Such representation could possibly be used while trying to match using sublinear algorithms.

References[edit]

^ Raskhonikova, S. (2003). "Approximate testing of visual properties". RANDOM-APPROX.
^ I. Kleiner , D. Keren, I. Newman, O. Ben-Zwi (2011). "Applying Property Testing to an Image Partitioning Problem". IEEE Trans. Pattern Anal. Mach. Intell. 2 (33).{{cite journal}}: CS1 maint: multiple names: authors list (link)
^ "Face Detection".
^ Wiskott; et al. (1999). Intelligent Biometric Techniques in Fingerprint and Face Recognition. chapter 11: CRC Press. {{cite book}}: Explicit use of et al. in: |last= (help)CS1 maint: location (link)

[1] Raskhonikova, S. (2003). "Approximate testing of visual properties". RANDOM-APPROX.

[2] I. Kleiner , D. Keren, I. Newman, O. Ben-Zwi (2011). "Applying Property Testing to an Image Partitioning Problem". IEEE Trans. Pattern Anal. Mach. Intell. 2 (33).{{cite journal}}: CS1 maint: multiple names: authors list (link)

[3] "Face Detection".

[4] Wiskott; et al. (1999). Intelligent Biometric Techniques in Fingerprint and Face Recognition. chapter 11: CRC Press. {{cite book}}: Explicit use of et al. in: |last= (help)CS1 maint: location (link)

[1]

[2]

[3]

[4]