OpenVL: A Developer-Level Abstraction of Computer Vision

The image contents are modelled as segments: regions within the image which are distinct from their surroundings. The definition of distinct is provided by the developer through properties such as colour, texture, intensity and blur. These can be chosen based on knowledge of the problem: e.g. to separate coloured balls or balloons we would choose colour as the property, but to differentiate carpet and thread we may choose texture. If the chosen problem is segmentation, we use the model to segment the image based on the properties; otherwise, we use this information as the first component of the problem description and move on to the next step.

Once segmentation has been defined for an image, we can apply operations on segments to solve other problems. For example, we can define a matching operation on segments which uses a developer-provided set of variances describing how the segments vary between images (position, colour, intensity, size, etc.). If the developer defines a matching (or image registration) problem, the variances are used to select the correct algorithm: e.g. if the segments' intensity varies, OpenVL would choose an intensity-invariant method. Operations may be sequenced together to describe higher-level tasks: image registration, for example, is segmentation, then correspondence, then a global optimization to find a transform. The sequenced operations form the second component of the description, and the details (such as the variances) form the third and final component. When these are all in place, OpenVL interprets the description and executes a hidden method to produce the result.
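
To make the three-component description concrete, the following is a minimal C++ sketch, not the real OpenVL API: the types Property, Variance and Operation, and the interpret() function, are invented here purely to illustrate how a developer might state properties, a sequence of operations and the allowed variances, while the choice of algorithm stays hidden behind the interpreter.

    // Hypothetical sketch of a three-component problem description.
    // None of these names belong to the actual OpenVL interface.
    #include <iostream>
    #include <string>
    #include <vector>

    // Component 1: the properties that make a segment "distinct".
    enum class Property { Colour, Texture, Intensity, Blur, Size };

    // Component 3: how segments are allowed to vary between images.
    enum class Variance { Position, Colour, Intensity, Size };

    // Component 2: one step in the sequence of operations describing the task.
    struct Operation {
        std::string name;                 // e.g. "segment", "correspond", "optimise"
        std::vector<Property> properties; // properties the step relies on
        std::vector<Variance> variances;  // permitted variation between images
    };

    // Stand-in for the hidden interpreter: a real system would select and run
    // a concrete algorithm; this one only reports the selection logic.
    void interpret(const std::vector<Operation>& description) {
        for (const auto& op : description) {
            std::cout << "operation: " << op.name << '\n';
            for (Variance v : op.variances) {
                if (v == Variance::Intensity)
                    std::cout << "  -> choose an intensity-invariant method\n";
            }
        }
    }

    int main() {
        // Describe 2D image registration as segmentation, then correspondence,
        // then a global optimization to find the transform.
        std::vector<Operation> registration = {
            {"segment",    {Property::Colour}, {}},
            {"correspond", {},                 {Variance::Position, Variance::Intensity}},
            {"optimise",   {},                 {}},
        };
        interpret(registration);
        return 0;
    }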
The abstraction is designed to be easy to use, at a level above specific vision algorithms. This enables more effective acceleration: as with OpenGL, vendors can provide their own implementations of OpenVL which compete on power, precision and performance. For example, we can define the speed of operation in segments-per-second and detections-per-second, and use a standardized set of images and problems to evaluate quality. The quality-to-speed ratio would give some indication of the effectiveness of the implementation. Our reference implementation is CPU-based, and provides solutions for colour/texture/intensity/size segmentation, chroma-key matting, strong sparse correspondence, 2D image registration and front/profile face detection. A combined CPU/GPU version is also available for segmentation problems, and demonstrates the capacity of OpenVL to support hardware acceleration. We are continuously adding new problem descriptions and expanding the reference implementation. More information and development libraries are available from http://www.openvl.org.
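
As a rough illustration of the benchmarking idea, the sketch below (again hypothetical, not part of OpenVL) times a placeholder routine, segmentBenchmarkImage(), over a standardized image set and reports segments-per-second alongside a literal quality-to-speed ratio; real measurements would require an actual implementation and ground-truth quality scores, and the exact figure of merit is up to the benchmark designer.

    // Hypothetical benchmark harness: throughput and a quality-to-speed ratio.
    #include <chrono>
    #include <iostream>

    struct Result { int segments; double quality; }; // quality in [0, 1] vs. ground truth

    // Placeholder for a vendor implementation run on one benchmark image;
    // the constants make the reported numbers meaningless, but show the shape.
    Result segmentBenchmarkImage(int /*imageIndex*/) {
        return {120, 0.9};
    }

    int main() {
        const int numImages = 50;
        int totalSegments = 0;
        double totalQuality = 0.0;

        const auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < numImages; ++i) {
            const Result r = segmentBenchmarkImage(i);
            totalSegments += r.segments;
            totalQuality += r.quality;
        }
        const double seconds = std::chrono::duration<double>(
            std::chrono::steady_clock::now() - start).count();

        const double segmentsPerSecond = totalSegments / seconds;
        const double meanQuality = totalQuality / numImages;
        std::cout << "segments per second: " << segmentsPerSecond << '\n'
                  << "quality-to-speed ratio: " << meanQuality / segmentsPerSecond << '\n';
        return 0;
    }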
Presented at the SIGGRAPH Talks session in Vancouver, August 2014.
@InProceedings{Miller:SIGGRAPH2014:OpenVL,
author = {Gregor Miller and Sidney Fels},
title = {{OpenVL}: A Developer-Level Abstraction of Computer Vision},
booktitle = {Proceedings of the 41st Conference on Computer Graphics and Interactive Techniques; Talks},
series = {SIGGRAPH'14},
pages = {76:1--76:1},
articleno = {76},
month = {August},
year = {2014},
publisher = {ACM},
address = {New York, U.S.A.},
isbn = {978-1-4503-2960-6},
location = {Vancouver, Canada},
doi = {10.1145/2614106.2614206},
url = {http://www.openvl.org.uk/Publications/Publication.php?id=Miller:SIGGRAPH2014:Talk}
}