Two presentations in a session on urban modeling here delved into generating three-dimensional models of buildings and streets from casual sets of photographs.
Generating 3D models from 2D images isn’t yet a particularly mature
field, so these two new approaches definitely caught my eye. The state
of the art requires a fair amount of user guidance to help the
image-processing algorithms differentiate between a target object and
visual clutter, such as trees, passing cars, and street signs. There's
plenty of room for improvement in accuracy and detail, and users can
always hope for a faster process and simpler interfaces.
Currently, the most accessible method of 3D modeling from photographs is probably Google SketchUp’s Photo Match feature. SketchUp
is a modeling application that Google bought and then released almost
three years ago. In Photo Match, a user imports an image and then
traces over the lines of a building—the more sets of parallel lines,
the better. Not surprisingly, those lines carry information about the
perspective of the camera when the image was shot. The program uses
that data to extrapolate the overall shape of the building. Once the
rough outline is in place, the software can extract patterns from the
photo to overlay texture detail. Voila, a quick-and-dirty 3D building.
For better results, you can do the whole thing over again with another
photo of the hidden sides.
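To make the parallel-lines idea concrete, here is a minimal sketch of the underlying geometry, not SketchUp's actual code. Lines that are parallel on the building converge in the photo at a vanishing point, and two traced segments are enough to locate it. The function names (`line_through`, `vanishing_point`) and the pixel coordinates are made up for illustration.

```python
def line_through(p, q):
    """Homogeneous line (a, b, c), with a*x + b*y + c = 0, through points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def vanishing_point(seg_a, seg_b, eps=1e-9):
    """Intersect the lines through two traced segments.

    Each segment is a pair of (x, y) image points. Returns the (x, y)
    intersection, or None if the lines are parallel in the image too
    (the vanishing point is then at infinity).
    """
    a1, b1, c1 = line_through(*seg_a)
    a2, b2, c2 = line_through(*seg_b)
    # The cross product of two homogeneous lines is their intersection point.
    x = b1 * c2 - b2 * c1
    y = c1 * a2 - c2 * a1
    w = a1 * b2 - a2 * b1
    if abs(w) < eps:
        return None
    return (x / w, y / w)


# Hypothetical traces of a roofline and a window ledge on the same wall.
roofline = ((0.0, 0.0), (4.0, 2.0))
ledge = ((0.0, 6.0), (4.0, 5.0))
print(vanishing_point(roofline, ledge))
```

With more than two traced lines, a real tool would presumably fit the best common intersection rather than trust any single pair, which is one reason more sets of parallel lines help.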
The two approaches presented here apply new techniques to processing a collection of photos of a target scene.
One technique came out of a partnership among the University of North
Carolina–Chapel Hill, UC-Berkeley, ETH Zurich, and Microsoft Research. This approach
starts with a jumble of images of a building or city. Preliminary image
analysis identifies each image’s vanishing points, much as Photo
Match does. A user traces the rectangular outlines of the primary building
walls, a geometric model is generated, and the textures from the
original photograph are applied. My sense is that the main advances
here over Photo Match are in the intelligent way that the photos are
processed together to create a preliminary model, and in a simpler user
experience. In ten to fifteen minutes, you can easily generate a model
of a building from eight or nine photos. Give it an hour and 120 photographs
and it’ll generate a fairly accurate model of a city. Of course, it’s a
trade-off between the quantity of data needed to start off and the
fidelity of the model.
The second method came from researchers at the Hong Kong University of Science and
Technology and the National University of Singapore. It focused on
facades rather than complete buildings. To start, a photographer drives
down a street and takes successive shots of a continuous facade (of a
shopping street, for example). Those photos are automatically lined up,
pattern-matched, and analyzed at a fairly deep level to generate a
large mapping of points that capture the color, texture, and depth of
various parts of a facade. The images are broken down into sections,
analyzed for things such as embedded symmetries (to identify evenly
spaced features that ought to be identical), then merged back together
to speed up the rendering. A user helps the program identify the
facade’s salient features (the talk left this step unclear), and
voila, an extremely detailed rendering of a street face pops up.
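The talk did not spell out how the evenly spaced features are found, so the following is only a toy illustration of the general idea, not the researchers' method. It swaps in a plain autocorrelation over a one-dimensional brightness profile (say, one horizontal scanline across a facade) and reports the spacing at which the profile best matches a shifted copy of itself. The function name `dominant_period` and the sample data are invented for this sketch.

```python
def dominant_period(profile, min_lag=2):
    """Estimate the repeat spacing of a 1D brightness profile.

    Returns the lag (in samples) with the highest positive average
    autocorrelation, preferring the smallest lag on ties, or None if
    no lag correlates positively (for example, a flat profile).
    """
    n = len(profile)
    mean = sum(profile) / n
    centered = [v - mean for v in profile]
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, n // 2 + 1):
        # Average product of the profile with itself shifted by `lag`.
        total = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        score = total / (n - lag)
        if score > best_score:  # strict: keeps the smallest lag on ties
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical scanline: a bright window frame every 4 samples.
scanline = [1, 0, 0, 0] * 6
print(dominant_period(scanline))
```

Once a spacing like this is known, features that sit one period apart can be treated as copies of the same element, which is the kind of shortcut the "ought to be identical" remark suggests.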
Neither approach is complete, but things move fast in the graphics
world. It could be a matter of months before something along these
lines gets incorporated into existing 3D modeling tools.