VisualSize, a Southern California-based company, will be launching shortly with a service that gives accurate three-dimensional measurements of objects shown in 2D photos.
To measure things accurately, the service needs two photos of the same scene taken from different angles. The VisualSize algorithm automatically detects feature points in the two pictures and finds the matching pairs. It then uses the matching pairs to calculate the coordinates of the two camera positions (x, y, z axes and origin in a 3D coordinate frame), and triangulates from the image planes into that 3D coordinate frame to reconstruct the 3D scene. VisualSize can then measure length, angle, area and volume with a high degree of accuracy. The screenshot shows an example of raw output, but the service itself is pre-alpha.
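As a rough illustration of the final triangulation step only, the sketch below recovers one 3D point from two viewing rays once both camera positions are known. It uses the textbook midpoint method in plain Python. It is not VisualSize's algorithm, and the function name `triangulate_midpoint` and the sample coordinates are made up for this example.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Return the 3D point closest to two viewing rays.

    c1, c2: camera centres; d1, d2: ray directions through the matched
    feature point in each photo (all 3-tuples, hypothetical inputs).
    """
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    w = tuple(a - b for a, b in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; move the camera between shots")
    # Parameters of the closest point on each ray (least-squares solution).
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    # With noisy matches the rays do not quite meet, so take the midpoint.
    return tuple((u + v) / 2 for u, v in zip(p1, p2))


# Two cameras one unit apart, both looking at the same (assumed) point.
point = triangulate_midpoint((0, 0, 0), (1, 2, 5), (1, 0, 0), (0, 2, 5))
print(point)
```

In this noise-free example the two rays intersect exactly, so the midpoint is the scene point itself. With real photos the rays miss each other slightly and the midpoint is only an estimate.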
This isn’t as sexy as, say, what Fotowoosh is doing by creating 3D images from a single 2D photo. But it does have commercial value. Home improvements, interior design, general contracting estimates, crime scene investigation, military planning, insurance claim assessments, etc. all need accurate measurements. VisualSize says they can create them with measurement errors of 0.1 to 5%.
The company says they will offer the service on their website. More importantly, there will be an API to allow others to build applications around the service.
The founders are Yuan-fang Wang (a UCSB professor) and Ping Liang (a UCR professor). They are self-funded via liquidity events from previous startups.





I believe it will require more information than simply two images. Without any known measurement in real-world units you can only measure up to an unknown scale factor. So you’d have very accurate relative measurements, but not in absolute units.
I can just imagine the fun that the paparazzi will have with this
This is actually useful if it works. An API could allow contractors - for example - to get materials without needing to go get exact measurements via phone pics and such. There is definitely a lot of industry out there that relies on measurements. I’d be interested to know how they plan on making a buck, but imagine a subscription service would be the way to do it.
I have looked at other companies with similar business plans. They generally do well on 2D, but 3D is an order of magnitude harder. I don’t know what the accuracy is for this method, but the ones I have seen do not have the minimum accuracies needed (at least tenths of an inch) for contracting purposes.
Also agree with Lex, you need a standard length for calibration. The other company I saw does use a rectangle pasted on the wall for reference.
http://www.bizorigin.com
This is how the USGS (and other people) make topographic maps.
An airplane flies over the area and takes two pictures, moving between the shots. Then as long as you know one benchmark elevation, the time the pictures were taken and the speed of the airplane, you can accurately determine elevations.
You can make your own anaglyphs out of these kinds of picture sets. Make one image red, the other blue (and slightly transparent) and use those funky 3D glasses. Fun.
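The arithmetic behind that description can be sketched in a few lines, assuming an idealized setup that the comment does not spell out: a pinhole camera pointing straight down, level flight, and a focal length known in pixels. All function names and numbers here are hypothetical.

```python
def stereo_depth(speed_m_s, dt_s, focal_px, disparity_px):
    """Distance from camera to ground point, for an idealized stereo pair."""
    baseline_m = speed_m_s * dt_s          # how far the plane moved between shots
    return focal_px * baseline_m / disparity_px


def elevation(benchmark_elev_m, benchmark_disp_px, point_disp_px,
              speed_m_s, dt_s, focal_px):
    """Elevation of a ground point, anchored to one benchmark elevation."""
    depth_benchmark = stereo_depth(speed_m_s, dt_s, focal_px, benchmark_disp_px)
    flying_altitude = benchmark_elev_m + depth_benchmark
    depth_point = stereo_depth(speed_m_s, dt_s, focal_px, point_disp_px)
    return flying_altitude - depth_point


# Assumed values: 50 m/s, 2 s between shots, 1000 px focal length.
hill = elevation(benchmark_elev_m=100.0, benchmark_disp_px=50.0,
                 point_disp_px=62.5, speed_m_s=50.0, dt_s=2.0, focal_px=1000.0)
print(hill)
```

A larger disparity means the point is closer to the camera, so it sits higher than the benchmark.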
Mike,
Thanks for the coverage.
Lex and Toni: you are correct, a reference will be needed. Our algorithm can accept either a reference of known size, e.g., letter size paper is 8.5″x11″, or any single measurement that is easy to obtain.
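A minimal sketch of how one known length can fix the scale, assuming the reconstruction has already produced 3D points in arbitrary units. The function names and coordinates are hypothetical and only illustrate the idea, not the VisualSize implementation.

```python
import math


def scale_from_reference(ref_a, ref_b, known_length):
    """Real-world units per reconstruction unit, from one known length."""
    return known_length / math.dist(ref_a, ref_b)


def measure(p, q, scale):
    """Length between two reconstructed points, in real-world units."""
    return math.dist(p, q) * scale


# Assumed example: the long edge of a letter-size sheet (11 inches)
# spans 2.0 units in the unscaled reconstruction.
scale = scale_from_reference((0, 0, 0), (0, 2, 0), known_length=11.0)
width_in = measure((0, 0, 0), (3, 4, 0), scale)
print(scale, width_in)
```

Every other length in the scene is then multiplied by the same factor, which is why a single reference is enough.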
Toni is absolutely correct that 3D is an order of magnitude harder than 2D, but 2D simply does not work except in very rare cases (when the object to be measured is in a plane that is parallel to the image plane of the camera). It is just impossible to get 3D measurements from a single 2D picture.
What we have been able to do is to come up with robust and fast algorithms that can reliably do two things:
1. Detect enough feature points in two images, find out which feature point in one image matches with which in the other image, and know when the match is accurate enough.
2. Calculate the coordinates of the camera and the camera motion from the first position to the second position. This involves calculating essential values of transformation matrices, etc.
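To give a feel for the matrices mentioned in step 2, here is a sketch of the standard essential-matrix relation from two-view geometry, E = [t]x R, and the epipolar constraint x2ᵀ E x1 = 0 that every correct match satisfies. It runs the easy direction (motion known, constraint checked). The service has to go the other way and recover the motion from the matches, and how it does that is not described here. The rotation, translation and point below are made-up values.

```python
import math


def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ v equals t x v."""
    tx, ty, tz = t
    return [[0, -tz, ty], [tz, 0, -tx], [-ty, tx, 0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def essential(rot, t):
    """Essential matrix for the motion X2 = rot @ X1 + t."""
    return matmul(skew(t), rot)


def epipolar_residual(e, x1, x2):
    """x2^T E x1; zero (up to rounding) for a correct match."""
    return sum(a * b for a, b in zip(x2, matvec(e, x1)))


# Assumed camera motion: small rotation about the y-axis plus a sideways step.
angle = 0.1
rot = [[math.cos(angle), 0, math.sin(angle)],
       [0, 1, 0],
       [-math.sin(angle), 0, math.cos(angle)]]
t = [1.0, 0.0, 0.2]

X1 = [0.5, -0.3, 4.0]                                # point in camera-1 frame
X2 = [r + ti for r, ti in zip(matvec(rot, X1), t)]   # same point, camera-2 frame
x1 = [v / X1[2] for v in X1]                         # normalized image coordinates
x2 = [v / X2[2] for v in X2]

E = essential(rot, t)
residual = epipolar_residual(E, x1, x2)
print(residual)
```

A wrong match generally gives a clearly non-zero residual, which is one common way to judge whether a pair of feature points is consistent with an estimated motion.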
For consumer use, our accuracies are in the 0.1 to 5% range, which may be sufficient for some applications and not enough for others. For professional use, we provide guidelines on how to take the pictures to get higher accuracy and we provide a confidence level for each measurement to tell you how accurate it is.
There are several ways to make money from the service, including licensing fees for business use of our 3D metrology engine, private labeling by home improvement and interior design companies, licensing to law enforcement and military planners, and advertisements.
Hi Kris:
Thank you for your comments. A digital elevation map (DEM) is most commonly built by remote sensing techniques, e.g., using data from interferometric synthetic aperture radars. A DEM can also be generated by interpolating sampled elevation datasets (by GPS or ground survey), but usually with a much lower density and smaller coverage.
Using stereo images from a flyover is much less common. Furthermore, you need the baseline information (how the aircraft moves in between the two pictures). While the aircraft (or camera) movement information can be obtained from an inertial navigation system in a flyover, such information is not available for most application scenarios in the consumer market. Our algorithm automatically infers the camera motion information in between stereo images.
While humans are exceedingly capable of perceiving QUALITATIVE 3D structure (from two images, even from a single image), they are a lot less capable at obtaining QUANTITATIVE dimension measurements robustly and accurately - which is what our system aims to achieve.
Sounds very similar to the technology in a product called SilverEye, which Microsoft acquired when they purchased Canadian software company GeoTango in late 2005. They use it to build their own 3D models I believe. I wonder what the differences are? And seems like an area that would be ripe with patents already.
It’s something people will probably use. It sounds similar to another LA company - iphotomeasure.
Google will probably snap this one up just to add that feature to SketchUp.
Alex,
We are aware of SilverEye. It creates a 3D visual effect of urban scenes from a single satellite image or aerial photo. First of all, it is single-image based, so no measurements. Secondly, it works on satellite images. It does not measure your window curtain or kitchen cabinet.
The big difference between what we are doing and other stereoscopic reconstruction of 3D models is that those methods require the camera motion, i.e., the translation and rotation of the camera from the first image to the second image, to be known. Once you know the camera motion, reconstruction becomes a forward problem and is much easier. In our case, we do not know the motion of the camera from the first image to the second image. All a user does is take one picture, walk a couple of steps and take the second picture. We have to infer the accurate camera motion from the data, which is an inverse problem and is much harder.
This would have been a great help when I modeled my house for Google Earth. A lot safer than hanging off the roof with a tape measure.
This would be cool technology for digital cameras with GPS built in - the item(s) in the picture could be mapped with the correct lon/lat - rather than the location of the camera taking the picture.
This sounds cool. I used to sell window coverings and had to drive to lots of homes just to give free estimates. 80% of the customers never bought, so I wasted lots of time and gas. I could not rely on customers to take the measurements for me because they did not know how much space to leave, and could not really follow instructions to measure correctly. With this, I just need them to email me a couple of pictures.
I see police using this to get measurements from crime scenes or accident scenes after the scenes are long gone.
SilverEye does do measurements, I used it regularly back then. You are correct though, it is limited to using satellite imagery in the versions released. That said, there is no reason they could not use the same algorithms to measure other things I imagine.
As for the arbitrary placement of the camera, yes, that is a strong advantage over systems that need to know the camera position, like most photogrammetry systems. That, ironically, sounds like the technology for creating 3D models from arbitrary photos in Microsoft Photosynth: http://labs.live.com/photosynth/
I agree with the comments, this is very useful technology. I just wonder how quickly it will become a commodity. Seems like there is a lot of work in this space already, and it is likely that others can easily match these technologies and don’t need to sell them as direct revenue products, or can use them as loss leaders for other things. It would be naive to underestimate the abilities of Microsoft or Google in this space given their build-up of photogrammetry and related technologies over the last few years.
“This isn’t as sexy as, say, what Fotowoosh is doing by creating 3D images from a single 2D photo.”
Well, except they don’t do it, never did and probably never will.
Looks like you did not read the comments to your own post.
The 4 pages “web site” did not change for the past 3 months.
Even the so called demo wasn’t originally produced by fotowoosh.
Alex:
Thank you for your comments. From a theoretical point of view, it is provable mathematically that
(1) You cannot recover 3D structure from a single image, without some prior information, domain-specific knowledge, and/or special imaging configuration, and
(2) You cannot obtain true 3D dimensions from discrete photographs (even from continuous videos) alone without some external reference of a known size.
Hence, to be a GENERAL 3D metrology system, you must use at least two photographs with a reference of a known size – this is what our system is designed to do. Anything less must fail for certain configurations or is applicable only for limited application domains and/or imaging configurations. Again, this is a mathematically-provable fact.
Furthermore, even with two pictures you must somehow obtain the movement of the camera between the two shots (i.e., the stereo baseline). For consumer applications, it is unrealistic to assume that an expensive, calibrated stereo rig is available or that the user can tell you, with precision, how the two pictures were taken. Our algorithm recovers the movement automatically with high precision.
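Point (2) above can be checked numerically with a toy pinhole camera. In the sketch below, the scene and the camera movement are both enlarged by the same factor, and the two photos come out identical, so nothing in the images alone can reveal the true size. The camera model (unit focal length, no rotation) and all values are simplifying assumptions for illustration.

```python
def project(point, t):
    """Image coordinates of a 3D point seen by a camera shifted by t."""
    x, y, z = (p + ti for p, ti in zip(point, t))
    return (x / z, y / z)


def scaled(v, k):
    return tuple(k * c for c in v)


scene_point = (1.0, 2.0, 10.0)
baseline = (0.5, 0.0, 0.0)      # camera movement between the two shots
k = 3.0                         # make the whole world three times bigger

small_world = (project(scene_point, (0, 0, 0)),
               project(scene_point, baseline))
big_world = (project(scaled(scene_point, k), (0, 0, 0)),
             project(scaled(scene_point, k), scaled(baseline, k)))
print(small_world, big_world)
```

Both worlds produce the same pair of image points, which is why a reference of known size is needed to obtain absolute dimensions.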
Thanks.
Quick, somebody try this on some porn sites and report back with results.
I think VisualSize could be useful to online clothing stores if it can be adapted to body shape. Clothing size is not an exact science so being able to order the right size by uploading a photo of yourself and your height will be compelling. With enough photos, one should be able to build and maintain full 3D measurement to ‘virtually try-on’ anything.
Alex,
You are correct that a lot of work has been done in 3D modeling and stereo vision, and Microsoft and Google have the resources to do similar things. We are familiar with the publications in the field and have published in the field ourselves.
Technology from Photosynth can be adapted to give measurements with some additional work. Photosynth was developed with very different goals and usage scenarios. It has taken a lot of work for Microsoft to build a few demo models. The key is to be able to get back accurate measurement results fast with a very simple and intuitive user interface. Our system is optimized for just that: picking the right features and just the right number of features, quickly inferring the camera motion and getting accurate measurements, and knowing when the measurements are accurate and when they are not. We believe we are doing this better than anyone else.
We do not try to reconstruct the entire 3D scene, just enough information for us to get the measurements, faster and more accurately than anyone else, at least so far.
Google Sketchup already has this. It’s called PhotoMatch.
It took me about four hours to learn how to use it, so there’s probably room for improvement.
In response to the posting from MK: I used iPhotoMEASURE’s software before and it simply didn’t work. You can only measure things that are on the same plane as their target and not too far away from it.
The idea of VisualSize sounds more correct (we have two eyes for perceiving three dimensions). I can see a lot of applications for this. The effort seems to be minimal – just one more click of the camera button. If the claim is true (you don’t even need to measure the movement of the camera between the two photos), it would be very cool.
Hi Vanr:
Google Photomatch is geared toward modeling (to generate dense 3D structures), not metrology (to obtain sparse 3D measurements). It makes some special assumptions about the scene (i.e., a two-point perspective model) to establish the position, orientation, and scale of the 3D model. In more detail,
(1) parallel lines and vanishing points are used to establish the model orientation,
(2) one single reference point (origin) is used to establish the model location, and
(3) one single length is used to establish the model scale.
Our system requires only (3) but not (1) and (2). The reason is that 3D dimensions (length, area, volume) are not affected by translation (different positions) and rotation (different orientations). Hence, if the goal is 3D metrology, requiring (1) and (2) only limits your applicability. The bottom line is that you do not need to build a complete computerized 3D model of the scene (which requires significant user interaction and can be quite computationally intensive and time consuming) to obtain a few 3D measurements.
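The invariance argument is easy to verify: moving and rotating the whole coordinate frame leaves every length unchanged, so the model's position and orientation never need to be pinned down for metrology. The sketch below checks this for one segment, using an arbitrary rotation about the z-axis and an arbitrary shift. The function name and values are assumptions for illustration.

```python
import math


def rigid_transform(p, angle, shift):
    """Rotate a point about the z-axis by `angle`, then translate by `shift`."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + shift[0],
            s * x + c * y + shift[1],
            z + shift[2])


a, b = (0.0, 0.0, 0.0), (3.0, 4.0, 12.0)
before = math.dist(a, b)

# Any rotation angle and any shift would do here.
angle, shift = 0.7, (10.0, -5.0, 2.5)
after = math.dist(rigid_transform(a, angle, shift),
                  rigid_transform(b, angle, shift))
print(before, after)
```

The two lengths agree up to floating-point rounding. Scale is the one quantity a rigid motion cannot fix, which matches requirement (3).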
I see this development of 3D measurement could potentially have a significant impact in various fields of applications. It is quite unique in that it is accurate and simple to use.
The idea is interesting and its implementation as promised seems to be very useful. I can easily think of many potential applications. It would be encouraging if a start-up company like this one, instead of Microsoft, Google, or any well-established software company, can successfully bring this development to fruition.
Ping and Yuan-fang,
it seems that what you are doing is some close range photogrammetry preceded with (claimed - as the service is not open yet) robust segmentation and feature extraction algorithms.
obviously this is not something totally new and the proof of the pudding is in the eating.
from my experience in such areas it would be *extremely* surprising if the application will allow for:
- ANY kind of input imagery (people, trees, buildings, water, interior) no matter how complex and irregular or how many undercuts or occlusions there are in the scene or what lighting levels are present
- no man in the loop intervention AT ALL
- complete robustness - the algorithm will always converge quickly and with no artifacts and will yield accurate results for any given two measurement points in the picture
is this what you are claiming?
if your rev model is indeed licensing the API - will there be the ability to tune the algorithms for various applications?
who do i need to pay off to get a beta code
Mike
Hi Mike:
We have tested our algorithms on both indoor and outdoor images with different lighting and weather conditions. Man-in-the-loop is often unnecessary (but the user does have to click on the objects that he/she wants to measure). Robustness is provided against errors in feature localization and correspondence. The algorithm always converges as no iteration is needed.
We will be happy to send you an invite code when the user interface is completed, if you are interested in participating in beta testing. If you have any questions, please do not hesitate to contact us at info@visualsize.com
– Yuan-fang
This is a pretty cool idea. Especially useful for my upcoming home improvements, as you noted in the main article.
Sounds awesome… looking forward to testing it out!
Here are some related tools, so readers can get a better view of the big picture.
A: Take two photos with your hand-held digital camera… and then perform 3D measurement.
iWitness - http://www.geocomp.com.au/
PhotoModeler - http://www.photomodeler.com/
Another interesting one that measures objects in a single photo (certainly with a different ‘technique’)
iPhotoMEASURE - http://www.iphotomeasure.com/
B: If the input source is aerial oblique photos, we probably have to mention the following two companies and their 3D measuring and modelling software. Pictometry’s aerial oblique photos have become a shining feature of the virtual globe platform Virtual Earth and have been heavily explored for the reconstruction of immersive 3D land surfaces. The technology underlying this sort of thing is becoming more relevant than ever as the competition to present a better virtual globe intensifies.
Pictometry http://www.pictometry.com
MultiVision http://www.mv-usa.com/
C: If the input source is space-based satellite imagery, too many algorithms and tools to be listed here for 3D measurement and reconstruction … Of course, it can be argued that in this case the main photogrammetric algorithm is different with some similarities …
All are interesting developments!
This is the method used by NASA to allow their rover to reconstruct 3D models of the Mars environment.
It would be cool to have a freeware version of this algorithm to apply it on my home-made robot!
Just to provide a new plugin for Google SketchUp with, it seems, similar technology.
Pixdim - http://www.pixdim.com
It does similar measurements under Google SketchUp.
Houa !
It’s great fun for reconstructing old buildings. But of course, it’s the same technology as Pixdim!
Was the CAD browser specially developed for TechCrunch, or is it a standard one?
What export file formats are available?