Draft:Selective search: Difference between revisions

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license.
Revision as of 01:23, 15 June 2024 by Sirgeorge The 1ST. Edit summary: Initial contribution. Obviously nowhere near complete. Needs EXTENSIVE additions and restructuring, but the page needed creation.

Selective Search is a region proposal method for object detection. It takes an image and encloses the objects within it in bounding boxes, each defined by an origin point, a width, and a height. These boxes then act as Regions of Interest (ROI), which an image classifier uses to determine what object is present in each box.
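As a minimal sketch of the bounding box representation described above, the following Python snippet computes the tightest box around a region's pixels. The names BoundingBox and bounding_box are hypothetical, and the choice of the top-left corner as the origin point is an assumption made for illustration.

```python
from typing import Iterable, NamedTuple, Tuple


class BoundingBox(NamedTuple):
    x: int       # column of the origin (assumed here to be the top-left corner)
    y: int       # row of the origin
    width: int
    height: int


def bounding_box(pixels: Iterable[Tuple[int, int]]) -> BoundingBox:
    """Return the smallest axis-aligned box containing every (x, y) pixel."""
    xs, ys = zip(*pixels)
    x0, y0 = min(xs), min(ys)
    return BoundingBox(x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1)


# A small, irregularly shaped region given as (x, y) pixel coordinates.
region = [(2, 3), (3, 3), (3, 4), (4, 5)]
box = bounding_box(region)
```

The box is fully determined by the extreme pixel coordinates of the region, which is why a segmentation of the image can be turned directly into box proposals.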

The Selective Search method is an attempt to make object detection less computationally taxing than exhaustive search, while capturing the benefit of segmentation: the boundaries of the boxes follow the shape of the object being classified.

It avoids the need to guess a single correct set of parameters by using an ensemble of segmentation strategies rather than a single model. It also avoids a fixed bounding box size: regions are built by combining segments of varying sizes, rather than by searching over a fixed-size grid overlaid on the image, as the computationally feasible exhaustive approaches do. This allows it to detect objects at different scales.

Uses

Selective Search is the Region of Interest extractor used by the original Region Based Convolutional Neural Network (R-CNN) as well as by Fast R-CNN.

How It Works

The image is first split into an initial set of small starting regions using the fast segmentation method of Felzenszwalb and Huttenlocher. A greedy algorithm then groups regions together iteratively: the similarities between all neighbouring regions are calculated, and the two most similar regions are merged. New similarities are calculated between the merged region and its neighbours, and the process repeats until a single region spanning the whole image remains. The Hierarchical Grouping Algorithm is as follows:

Input: (color) image
Output: Set of object location hypotheses L

Obtain initial regions R = {r_1, ..., r_n} using the fast method of Felzenszwalb and Huttenlocher
Initialize similarity set S = Ø
foreach neighboring region pair (r_i, r_j) do
   Calculate similarity s(r_i, r_j)
   S = S ∪ s(r_i, r_j)
while S ≠ Ø do
   Get highest similarity s(r_i, r_j) = max(S)
   Merge corresponding regions r_t = r_i ∪ r_j
   Remove similarities regarding r_i: S = S \ s(r_i, r_*)
   Remove similarities regarding r_j: S = S \ s(r_*, r_j)
   Calculate similarity set S_t between r_t and its neighbors
   S = S ∪ S_t
   R = R ∪ r_t
Extract object location boxes L from all regions in R
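The grouping loop can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a reference implementation: regions are represented as sets of (x, y) pixels, the initial regions and their neighbour pairs are supplied by the caller instead of coming from the Felzenszwalb and Huttenlocher segmentation, and size_similarity is a hypothetical stand-in for the real similarity measure. All function names are illustrative.

```python
from itertools import count


def bounding_box(pixels):
    """Return (x, y, width, height) of the tightest box around (x, y) pixels."""
    xs, ys = zip(*pixels)
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


def hierarchical_grouping(initial_regions, neighbour_pairs, similarity):
    """Greedily merge the most similar neighbouring regions until one remains.

    initial_regions: dict mapping an integer region id to a set of pixels.
    neighbour_pairs: list of (id_i, id_j) tuples for adjacent regions.
    similarity:      function taking two pixel sets and returning a float.
    """
    regions = dict(initial_regions)                      # R
    adjacency = {rid: set() for rid in regions}
    for i, j in neighbour_pairs:
        adjacency[i].add(j)
        adjacency[j].add(i)
    # S holds one similarity per pair of neighbouring regions.
    S = {frozenset((i, j)): similarity(regions[i], regions[j])
         for i, j in neighbour_pairs}
    new_id = count(max(regions) + 1)
    while S:
        i, j = max(S, key=S.get)                         # most similar pair
        t = next(new_id)
        regions[t] = regions[i] | regions[j]             # r_t = r_i U r_j
        # Remove every similarity that involves r_i or r_j.
        S = {pair: s for pair, s in S.items()
             if i not in pair and j not in pair}
        # The merged region inherits the neighbours of both parts.
        adjacency[t] = (adjacency.pop(i) | adjacency.pop(j)) - {i, j}
        for n in adjacency[t]:
            adjacency[n] -= {i, j}
            adjacency[n].add(t)
            S[frozenset((t, n))] = similarity(regions[t], regions[n])
    boxes = sorted({bounding_box(p) for p in regions.values()})   # L
    return regions, boxes


def size_similarity(total_pixels):
    """Hypothetical stand-in measure that favours merging small regions."""
    def similarity(a, b):
        return 1.0 - (len(a) + len(b)) / total_pixels
    return similarity


# Toy input: a 2x2 image in which every pixel is its own starting region.
start = {0: {(0, 0)}, 1: {(1, 0)}, 2: {(0, 1)}, 3: {(1, 1)}}
pairs = [(0, 1), (0, 2), (1, 3), (2, 3)]
all_regions, boxes = hierarchical_grouping(start, pairs, size_similarity(4))
```

Because every iteration merges two regions into one, n starting regions always yield n - 1 merges and 2n - 1 regions in total, so the number of box proposals grows only linearly with the number of starting regions.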

Selective Search relies on variety in its assembly of larger regions, and it achieves this variety in three ways: using a variety of color spaces with different invariance properties, using different similarity measures, and varying the starting regions.
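One way to picture these three sources of variety is as a grid of strategies, each of which runs the grouping algorithm once, with the resulting boxes pooled at the end. The sketch below is illustrative only: the specific color spaces, weight combinations, and segmentation scales are assumed example values, and merge_proposals is a hypothetical helper.

```python
from itertools import product

# Illustrative option lists; the concrete values are assumptions.
color_spaces = ["HSV", "Lab"]
similarity_weights = [(1, 1, 1, 1), (0, 1, 1, 1)]   # (color, texture, size, fill)
segmentation_scales = [50, 100]                     # controls the starting regions

# Every combination of the three choices is one grouping strategy.
strategies = list(product(color_spaces, similarity_weights, segmentation_scales))


def merge_proposals(proposal_lists):
    """Pool boxes from several strategies, dropping exact duplicates."""
    seen = set()
    merged = []
    for proposals in proposal_lists:
        for box in proposals:
            if box not in seen:
                seen.add(box)
                merged.append(box)
    return merged


# Two strategies that happen to propose one identical (x, y, width, height) box.
pooled = merge_proposals([[(0, 0, 2, 2), (1, 1, 3, 3)], [(0, 0, 2, 2)]])
```

With two options for each of the three choices, this grid contains eight strategies. Adding options to any one list multiplies the number of grouping runs.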

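The similarity measure s(r_i, r_j) in the above algorithm is a combination of four component measures:

$$s(r_i, r_j) = a_1 S_{\text{color}}(r_i, r_j) + a_2 S_{\text{texture}}(r_i, r_j) + a_3 S_{\text{size}}(r_i, r_j) + a_4 S_{\text{fill}}(r_i, r_j), \qquad a_i \in \{0, 1\}$$

Each a_i indicates whether the corresponding measure is used.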
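The combination can be sketched in Python as below. The article does not define the four component measures, so the definitions used here are assumptions based on the original Selective Search paper: histogram intersection for color and texture, and size and fill terms normalised by the image size. The Region class, its precomputed histograms, and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Region:
    pixels: FrozenSet[Tuple[int, int]]
    color_hist: Tuple[float, ...]     # assumed to be normalised to sum to 1
    texture_hist: Tuple[float, ...]   # assumed to be normalised to sum to 1


def hist_intersection(h1, h2):
    """Overlap of two normalised histograms, between 0 and 1."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def make_measures(image_size):
    """Build the four component measures for an image with image_size pixels."""
    def s_color(ri, rj):
        return hist_intersection(ri.color_hist, rj.color_hist)

    def s_texture(ri, rj):
        return hist_intersection(ri.texture_hist, rj.texture_hist)

    def s_size(ri, rj):
        # Encourages small regions to merge early.
        return 1.0 - (len(ri.pixels) + len(rj.pixels)) / image_size

    def s_fill(ri, rj):
        # Measures how well the two regions fill their joint bounding box.
        xs, ys = zip(*(ri.pixels | rj.pixels))
        box_area = (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)
        return 1.0 - (box_area - len(ri.pixels) - len(rj.pixels)) / image_size

    return [s_color, s_texture, s_size, s_fill]


def combined_similarity(measures, weights):
    """Return s(r_i, r_j) as the sum of the measures selected by 0/1 weights."""
    if len(measures) != len(weights) or any(a not in (0, 1) for a in weights):
        raise ValueError("need one weight in {0, 1} per measure")

    def s(ri, rj):
        return sum(a * m(ri, rj) for a, m in zip(weights, measures))
    return s


# Two small regions in a 4x4 image (16 pixels).
ri = Region(frozenset({(0, 0), (1, 0)}), (0.5, 0.5), (0.25, 0.75))
rj = Region(frozenset({(2, 0), (2, 1)}), (1.0, 0.0), (0.25, 0.75))
measures = make_measures(image_size=16)
s_all = combined_similarity(measures, (1, 1, 1, 1))(ri, rj)
s_size_fill = combined_similarity(measures, (0, 0, 1, 1))(ri, rj)
```

Setting a weight to 0 drops that measure entirely, which is how different similarity measures can be obtained from the same four components.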