The Vantage Project proposes to deploy many cameras around SRC and
to track people as they move from the field of view of one camera to another.
It should be possible to use this information to analyse the movements
of people and identify individuals.
This tracking (in particular the hand-off between cameras as a person
moves between their fields of view) will be easier if we can make inferences
about the real location of a person in the world based on that person's
position in a camera's image plane. It would also be helpful to know the
3D structure of the scene to guide the tracking and enforce the constraint
that the person is walking on the floor, not up the wall.
The problem can be divided into three areas:
Camera Calibration
A video camera can be approximated as a projection from the 3D world
onto a 2D image plane, and a calibrated camera is one for which the matrix
describing that projection is known. Current methods for automatic calibration and structure
recovery require stereo images, hand-registration of features or the observation
of known objects. We used a method based on the Manhattan assumption (that
most of the lines in a scene are aligned along three perpendicular axes)
to recover the camera calibration automatically from a single image.
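To illustrate why perpendicular scene directions constrain the calibration, here is a
minimal sketch of one standard relation: the vanishing points of two perpendicular
directions determine the focal length. It assumes square pixels, zero skew and a known
principal point, and it is not necessarily the method used in this project. The function
name and the example values are hypothetical.

```python
import math


def focal_length_from_vanishing_points(v1, v2, principal_point):
    """Estimate the focal length (in pixels) from two vanishing points.

    v1 and v2 are the image positions (x, y) of the vanishing points of two
    perpendicular scene directions. Assumes square pixels, zero skew and a
    known principal point (illustrative assumptions, not from the source).
    """
    px, py = principal_point
    # Back-projected directions are ((x - px) / f, (y - py) / f, 1).
    # Requiring them to be perpendicular gives
    #   (x1 - px)(x2 - px) + (y1 - py)(y2 - py) + f^2 = 0.
    dot = (v1[0] - px) * (v2[0] - px) + (v1[1] - py) * (v2[1] - py)
    if dot >= 0:
        raise ValueError("vanishing points are inconsistent with perpendicular directions")
    return math.sqrt(-dot)


# Synthetic example: vanishing points generated with f = 800, centre (320, 240).
f = focal_length_from_vanishing_points((1120.0, 240.0), (-480.0, 240.0), (320.0, 240.0))
```

In practice the vanishing points themselves would first have to be estimated from the
detected line segments, which this sketch does not attempt.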
Image Segmentation
Once the camera has been calibrated we want to segment the pixels in
the image into different regions that correspond to some kind of structure
in the world. Under our Manhattan assumption we expect that most of
the surfaces in the scene are planar and that they are separated by extended
lines in one of the three primary directions. Once the camera is calibrated
we can detect these extended lines in the image and use them to define
an initial set of regions. We then reduce this set by merging the most
similar neighbouring regions until the smallest difference between
neighbouring regions reaches a chosen minimum.
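The merging step can be sketched as a greedy loop. The text does not say how region
difference is measured, so this sketch assumes, purely for illustration, that it is the
absolute difference in mean intensity. The function name, the data layout and the
threshold parameter are also hypothetical.

```python
def merge_regions(means, sizes, adjacency, min_difference):
    """Greedily merge the most similar neighbouring regions.

    means:      dict of region id -> mean intensity (assumed similarity feature)
    sizes:      dict of region id -> pixel count
    adjacency:  iterable of (a, b) pairs of neighbouring region ids
    Stops once the smallest difference between neighbours is at least
    min_difference. Returns a dict of original id -> final region id.
    """
    means, sizes = dict(means), dict(sizes)
    neighbours = {r: set() for r in means}
    for a, b in adjacency:
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    label = {r: r for r in means}

    while True:
        # Find the most similar pair of neighbouring regions.
        best = None
        for a in neighbours:
            for b in neighbours[a]:
                if a < b:
                    diff = abs(means[a] - means[b])
                    if best is None or diff < best[0]:
                        best = (diff, a, b)
        if best is None or best[0] >= min_difference:
            break
        _, a, b = best
        # Merge b into a, keeping a size-weighted mean.
        total = sizes[a] + sizes[b]
        means[a] = (means[a] * sizes[a] + means[b] * sizes[b]) / total
        sizes[a] = total
        for n in neighbours.pop(b):
            neighbours[n].discard(b)
            if n != a:
                neighbours[n].add(a)
                neighbours[a].add(n)
        del means[b], sizes[b]
        for r in label:
            if label[r] == b:
                label[r] = a
    return label


# Four regions in a row: two dark ones followed by two bright ones.
labels = merge_regions(
    means={0: 10.0, 1: 12.0, 2: 50.0, 3: 52.0},
    sizes={0: 1, 1: 1, 2: 1, 3: 1},
    adjacency=[(0, 1), (1, 2), (2, 3)],
    min_difference=10.0,
)
```

The full rescan for the best pair on every pass keeps the sketch short; a priority queue
would be the usual choice for images with many initial regions.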
Structure Recovery
Once the image has been segmented we assume that each segment represents
a planar surface in the world. We begin by heuristically identifying the
floor region. Assuming that the camera has been installed roughly upright,
we can compute the orientation of this floor region. We then compute the
world coordinates of every pixel in the floor segment by performing simple
line-plane intersections. Using these world coordinates we can discern
the boundaries of the floor, which should assist greatly in tracking people
for the Vantage Project.
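A line-plane intersection of this kind can be sketched as follows. The sketch assumes a
pinhole camera with square pixels and expresses the floor plane in camera-centred
coordinates (x right, y down, z forward), so the returned point is relative to the camera
rather than to a global world frame. The function name, parameters and example values are
hypothetical.

```python
def pixel_to_floor(pixel, focal_length, principal_point, plane_normal, plane_offset):
    """Intersect the viewing ray through a pixel with the floor plane.

    The plane is the set of points X with dot(plane_normal, X) == plane_offset,
    in camera-centred coordinates. Returns the 3D point as a tuple, or None if
    the ray is parallel to the plane or meets it behind the camera.
    """
    u, v = pixel
    px, py = principal_point
    # Direction of the ray from the camera centre through the pixel.
    ray = ((u - px) / focal_length, (v - py) / focal_length, 1.0)
    denom = sum(n * r for n, r in zip(plane_normal, ray))
    if abs(denom) < 1e-12:
        return None  # ray is parallel to the plane
    t = plane_offset / denom
    if t <= 0:
        return None  # intersection lies behind the camera
    return tuple(t * r for r in ray)


# Upright camera 2 m above the floor: with y pointing down, the floor is y = 2.
point = pixel_to_floor(
    pixel=(320.0, 640.0),
    focal_length=800.0,
    principal_point=(320.0, 240.0),
    plane_normal=(0.0, 1.0, 0.0),
    plane_offset=2.0,
)
# A pixel above the horizon never reaches the floor.
above_horizon = pixel_to_floor((320.0, 100.0), 800.0, (320.0, 240.0), (0.0, 1.0, 0.0), 2.0)
```

Applying the function to every pixel of the floor segment gives the floor's extent, and
the None result for rays that miss the plane is one simple way to reject pixels that
cannot belong to it.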