pixtrace Documentation

0.8.0

History

The software presented here is the result of several years of writing software used for sorting and inspecting mostly agricultural products. There have been many changes over the years in sorting technology, but the two biggest are the advent of solid-state, high-resolution cameras and the incredible versatility and computing power of microprocessors.

In the late sixties, agricultural sorters could be built with a vacuum tube photocell, a red 680 nm filter, and a simple lens. Some simple electronics between the photocell and an air jet or mechanical paddle could then be used to "blow on green". This worked well enough to sort green tomatoes from red ones.

In the seventies, the vacuum tube was replaced with an array of three or four silicon photodiodes behind a single lens (see Patent No. 3,981,590), and the electronics were replaced with the Fairchild 709 and then the 741 operational amplifier. Combining the op-amps with the CMOS switches (RCA 4016) that had also become available made possible some fairly sophisticated processing (see Patent No. 3,854,586).

In the late seventies, the first practical solid-state cameras using line-scan arrays became available. These could be positioned at the end of a conveyor belt and connected to analog comparators to do simple thresholding. They could then be used to "blow on green", or white, or black. Digital circuits were used for delaying the reject signals.

The early microprocessors such as the RCA 1802 and Motorola 6800 came out in the late seventies. The 1802 was a good choice for agriculture, because it did not generate much heat. It was not very fast, however.

The advent of the PC (personal computer) brought another major change, in that the special boards that used to be built to do the processing were no longer needed. The sorting problem changed from designing a special board to programming a PC.

The first PCs were not fast enough to do anything but simple thresholding, so the only way to find objects was to make sure they had enough contrast with the background. Thresholding usually worked fine, as long as the objects were not touching. Even when they were touching, it was possible to find where the edge was by using some kind of gradient edge detector. The slow digital computers could be used to calculate this gradient for the few cases where it was required, even if they were too slow to do the entire job with gradient-type detectors.

Computers are no longer slow, so it is now possible to simply do all of the edge detection with gradients. The routines in this library would have been too slow to use a few years ago. This trend is expected to continue in the future.

Edge-Detection

Many edge detectors use some form of the operator -1 0 1, which measures the size of a step function. Many modifications can be made to this, such as extending it to -1 -2 0 2 1 and/or to two dimensions. An edge can be found where this function peaks. Experience dictates that it is very difficult to use these kinds of operators to reliably find objects. Detectors of this kind are referred to in this article as slope detectors.

Another way of finding an edge is to use the difference of two slope detectors that are spaced apart. Two slope detectors, -1 0 1 0 0 and 0 0 -1 0 1, can be subtracted to give -1 0 2 0 -1. This is a roof or ridge detector that is very useful for finding lines, since it peaks when located over a line. But the roof detector has another property that is equally useful: its output changes sign as it passes over an edge. This can be combined with the slope measurement to reliably locate objects.
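The two operators above can be sketched as one-dimensional convolutions over a row of pixels. This is an illustrative sketch, not the pixtrace source (the real routine is in src/pix_make_roof.cxx); it shows the width-1 slope operator -1 0 1 and the roof operator -1 0 2 0 -1, which is the difference of the two spaced slope operators.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Width-1 slope: s[x] = p[x+1] - p[x-1]   (the -1 0 1 operator)
std::vector<int> slope_row(const std::vector<int>& p) {
    std::vector<int> s(p.size(), 0);
    for (size_t x = 1; x + 1 < p.size(); ++x)
        s[x] = p[x + 1] - p[x - 1];
    return s;
}

// Roof: r[x] = -p[x-2] + 2*p[x] - p[x+2]  (the -1 0 2 0 -1 operator),
// i.e. (-1 0 1 0 0) minus (0 0 -1 0 1).
std::vector<int> roof_row(const std::vector<int>& p) {
    std::vector<int> r(p.size(), 0);
    for (size_t x = 2; x + 2 < p.size(); ++x)
        r[x] = -p[x - 2] + 2 * p[x] - p[x + 2];
    return r;
}
```

Run over a step edge such as 0 0 0 0 10 10 10 10, the roof output is negative on the dark side and positive on the bright side, changing sign exactly at the edge; run over a single bright line such as 0 0 0 10 0 0 0, it peaks over the line.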

For simplicity in computing, the pixtrace routines calculate the slopes and roofs for the x and y directions separately. The roof values are computed from the slope values. The roof width in the examples is the distance between the center of the operator and the -peak or +peak. For example, -1 0 1 is said to have a width of 1, and -1 -2 -1 0 1 2 1 is said to have a width of 2.

The contrast value in the examples is the minimum absolute slope that a roof sign change must have to be considered an edge. This is needed because, in the absence of any edges, the roof value will randomly change from + to - and back, but the slope will be small at those points. Only those sign changes where the absolute value of the slope is greater than the contrast are called edges.
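The edge test just described can be sketched as follows. This is an illustration of the rule, not the pixtrace source (the real routine is in src/base_trace.cxx), and it omits the roof-holding behavior described below: an edge is declared where the roof changes sign and the slope magnitude there is at least the contrast.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Return the x positions where the roof changes sign AND the slope
// magnitude meets the contrast threshold -- the edge rule described
// in the text.
std::vector<size_t> find_edges(const std::vector<int>& slope,
                               const std::vector<int>& roof,
                               int contrast) {
    std::vector<size_t> edges;
    for (size_t x = 1; x < roof.size() && x < slope.size(); ++x) {
        bool sign_change = (roof[x - 1] < 0 && roof[x] > 0) ||
                           (roof[x - 1] > 0 && roof[x] < 0);
        if (sign_change && std::abs(slope[x]) >= contrast)
            edges.push_back(x);
    }
    return edges;
}
```

With a high contrast value the same sign change is ignored, which is how the random +/- flips in flat areas are rejected.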

When the roof changes sign at an edge, it holds that value as it moves on, for a distance in pixels similar to the roof width. This provides protection against the biggest problem with edge detectors: the gaps created where the edge is not distinct.

The routine that calculates the slope and roof is in src/pix_make_roof.cxx and the routine that uses the roof and slope to find edges is in src/base_trace.cxx.

Line-Detection

The important task of line finding is split into two parts, finding line segments that could be parts of lines, and assembling those segments into lines. The line segments are called CentLines because they consist primarily of a list of pixels for the line center.

CentLines are found either by connecting roof peaks or by searching with a neural net. The roof peak method is quicker, but the neural net method tends to find fewer unusable line segments. Once the CentLines are used in lines they are normally deleted.

Finding lines consists of combining the line centers to form the longest lines with the most contrast with their surroundings. Line centers are found by blindly following a ridge, or some combination of pixels, which can lead off in directions where an actual line does not go. Lines are made by taking the strongest parts of the line centers and bridging the gaps. Once the lines are formed, the line centers are deleted. (Note: the line routine has been around for a few years and needs work, so it is not available with this release. The line centers are.)

Pixtrace-Library-Use

To use the library, only the pix_trace/pix_trace.hh file needs to be included. pix_trace.hh describes the camera class PiXCam and the camera frame class PiXFrame. There can be as many cameras as needed, and each camera can have as many frames as needed. There is no communication between cameras or between frames; that is left up to the application.

Another important header is base.hh, which describes Base, on which all of the color objects are based. The color objects are Region, Blob, Spot, CentLine, and Line. Region, Blob, and Spot are found using edge tracing or thresholding techniques. Regions are the largest, and should be thought of as regions of similar colors. Regions can be divided into Blobs, which should be thought of as blobs of color. Spots are small areas that stand out despite their small size.

The picture data in the frames is arranged in layers. Frames have color layers, canvas layers, roof layers, and slope layers. The frames also keep track of the allocation and numbering of the Regions, Blobs, Spots, CentLines, and Lines belonging to the frame.

Five of the frame layers are for color. They are Luminance, U (red vs green), V (blue vs yellow), Chroma, and Hue. L, U, and V belong to the CIELUV color space, adopted by the International Commission on Illumination to define an encoding with uniformity in the perceptibility of color differences: dark color differences are emphasized and light color differences are de-emphasized. A major advantage is that it is possible to translate from RGB to LUV and back again.
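The RGB-to-LUV translation mentioned above follows standard CIE math. The sketch below is the textbook conversion (sRGB linearization, then CIE XYZ, then L*, u*, v*), not the pixtrace source; the reference white is taken as the XYZ of RGB (1,1,1) under the same matrix, so pure white maps exactly to L = 100, u = v = 0.

```cpp
#include <cassert>
#include <cmath>

struct Luv { double L, u, v; };

// Undo the sRGB gamma so the matrix below applies to linear light.
static double srgb_to_linear(double c) {
    return (c <= 0.04045) ? c / 12.92 : std::pow((c + 0.055) / 1.055, 2.4);
}

static void rgb_to_xyz(double r, double g, double b,
                       double& X, double& Y, double& Z) {
    r = srgb_to_linear(r); g = srgb_to_linear(g); b = srgb_to_linear(b);
    X = 0.4124 * r + 0.3576 * g + 0.1805 * b;
    Y = 0.2126 * r + 0.7152 * g + 0.0722 * b;
    Z = 0.0193 * r + 0.1192 * g + 0.9505 * b;
}

Luv rgb_to_luv(int R8, int G8, int B8) {
    double X, Y, Z, Xn, Yn, Zn;
    rgb_to_xyz(R8 / 255.0, G8 / 255.0, B8 / 255.0, X, Y, Z);
    rgb_to_xyz(1.0, 1.0, 1.0, Xn, Yn, Zn);          // reference white

    // u', v' chromaticities; fall back to the white point for black pixels.
    auto up = [](double X, double Y, double Z, double fb) {
        double d = X + 15 * Y + 3 * Z; return d > 0 ? 4 * X / d : fb;
    };
    auto vp = [](double X, double Y, double Z, double fb) {
        double d = X + 15 * Y + 3 * Z; return d > 0 ? 9 * Y / d : fb;
    };
    double un = up(Xn, Yn, Zn, 0), vn = vp(Xn, Yn, Zn, 0);
    double yr = Y / Yn;
    double L = yr > std::pow(6.0 / 29.0, 3)
                   ? 116.0 * std::cbrt(yr) - 16.0
                   : yr * std::pow(29.0 / 3.0, 3);   // linear near black
    return { L, 13.0 * L * (up(X, Y, Z, un) - un),
                13.0 * L * (vp(X, Y, Z, vn) - vn) };
}
```

Note the perceptual-uniformity property described above: the cube-root in L* stretches the dark end of the luminance scale and compresses the light end.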

An additional color layer is VidBuf which has the traditional 8-bit RGB values and is where the original camera data is stored. The roof and slope layers are allocated as needed. Layers are distinguished by their color source and the width in pixels of the roof/slope.

There is a separate canvas layer for each of the Regions, Blobs, Spots, CentLines, and Lines. Each consists of an x-y array of the locations of the base-objects. This allows an object to easily tell which objects are its neighbors, and a blob to tell which region it belongs to.
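The neighbor lookup that a canvas layer makes possible can be sketched as follows. This is an illustration of the idea, not the pixtrace data structure: an x-y array holds, at each pixel, the number of the base-object covering it (0 for background), and scanning the 4-neighborhood of an object's pixels yields its neighbors.

```cpp
#include <cassert>
#include <set>
#include <vector>

struct Canvas {
    int w, h;
    std::vector<int> id;                         // object number per pixel
    Canvas(int w, int h) : w(w), h(h), id(w * h, 0) {}
    int& at(int x, int y) { return id[y * w + x]; }

    // Object numbers touching any pixel of `obj` (4-connectivity).
    std::set<int> neighbors(int obj) {
        std::set<int> out;
        const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x) {
                if (at(x, y) != obj) continue;
                for (int k = 0; k < 4; ++k) {
                    int nx = x + dx[k], ny = y + dy[k];
                    if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                    int n = at(nx, ny);
                    if (n != obj && n != 0) out.insert(n);
                }
            }
        return out;
    }
};
```

The same array answers the containment question: to find which region a blob belongs to, read the region canvas at any of the blob's pixels.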

Base-Color-Objects

All objects that are derived from Base have a number, which is set by their frame, and a type, which is R for Region, B for Blob, S for Spot, C for CentLine, and L for Line. The base-objects store their edges internally in a simple run-length-encoded (rle) format. Their color is stored as a linear function of x and y in a structure called LMS_LUV, so that the proper shading across the object is maintained. The LMS_LUV structure also provides the directions of lines, the centers of mass of objects, and other mechanical data. The value of y that corresponds to the first rle line is stored as Y, the lowest value of x as X, and the maximum width as W. The height of the base-object is the number of rle lines. There is also a set of arrays labeled sr and el (start-right and end-left) that are created and used only by VanGogh when drawing base-objects.
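The rle storage just described can be sketched as below. The field names Y, X, and W mirror the text, but the struct itself is only illustrative; the actual pixtrace layout may differ. Each scan line of the object is one or more [start, end] runs, Y is the y of the first rle line, and the height is the number of rle lines.

```cpp
#include <cassert>
#include <limits>
#include <vector>

struct Run { int start, end; };                  // inclusive x range

struct RleObject {
    int Y = 0;                                   // y of the first rle line
    std::vector<std::vector<Run>> lines;         // one entry per scan line

    int height() const { return static_cast<int>(lines.size()); }

    int X() const {                              // lowest x in any run
        int x = std::numeric_limits<int>::max();
        for (const auto& line : lines)
            for (const auto& r : line) x = std::min(x, r.start);
        return x;
    }

    int W() const {                              // width of the widest line
        int w = 0;
        for (const auto& line : lines)
            if (!line.empty())
                w = std::max(w, line.back().end - line.front().start + 1);
        return w;
    }
};
```

A concave object simply has more than one run on some of its lines; X, W, and the height fall out of the runs with no per-pixel storage.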

VanGogh

VanGogh is intended only as a test vehicle. One of its major features is the ability to add a plug-in which can be used for designing and testing actual applications.

None of the routines in pixtrace will really identify anything; that requires specific knowledge of the articles being examined. This knowledge can be put in the plug-in. There are existing plug-ins that were created for VanGogh, but there have been so many changes to pixtrace that they will have to be rewritten. Once that is done, an example plug-in will be created for all to use.

This website is actually pretty good documentation in itself, since if you study it you will be able to see what can be done and what cannot be done. If you decide to try to use some of the functions, look at how they are used in the try routines. That should be instructive, since by using them you can also see what they do.


Generated on Fri Aug 1 22:02:11 2008 for pixtrace by  doxygen 1.5.5