Today we’re looking at CSM’s AI image to 3D service.
Wait, what does “image to 3D” mean? It’s yet another AI-powered service, but instead of generating a picture from text, it accepts an image and makes a 3D model.
This is quite a feat, since a 3D model by definition is a 360-degree view of a subject, while an image is a 2D view that obscures the “other side”.
How can a service generate a 3D model when it hasn’t seen part of the subject? That’s the magic of AI.
If you think about it, you do this all the time. Look at a person. Can you imagine their hairstyle on the back of their head that you cannot actually see? Certainly you can. You can do this for many common objects.
This is the principle of CSM’s service: it’s trained on many common objects and is able to generate the most likely shape — on all sides — from a single image.
It sounds incredible, and I had to try this out.
As you can see above, the system seems to do OK with common objects, and in particular figurines. It seems that it “knows” body shapes and is able to generate relatively competent 3D models just from a single image.
I was curious to see how it would do with something a bit more mysterious, so I generated an image to send to CSM.
I used an AI text to image service to make a test image. My prompt was “A detailed image of a kitchen toaster, with a photorealistic finish, rendered in 8K resolution.”
Here’s what I got, and it’s pretty interesting. Note that you can easily imagine what the backside of this imaginary device might look like.
I had to remove the background for CSM to operate optimally, and that was also done by the AI service. I ended up with this for submission to CSM:
After submission, CSM came up with its guesses as to what the other sides of the toaster might look like:
Were these correct? Well, not really. But they weren’t totally unreasonable either, particularly for an imaginary toaster. I decided to proceed with the generation — as if I had a choice at that point, anyway.
The result was, erm, a toaster-like object. From the front it didn’t look too bad, although the shiny metallic finish somehow was tarnished:
But from the top you can see that the sides weren’t particularly straight:
The strange “stretch” seems to be along the viewpoint from the original image, if that means anything.
I was then able to download the mesh for this object in either GLB or USDZ format. Neither of these formats is particularly friendly for those of us with STL and 3MF tools, but it is possible to convert the file into something you can use.
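For the curious, here’s a minimal sketch of what a GLB to STL conversion involves, using only Python’s standard library. It is not the tool used for this article. The function names (`read_glb`, `read_accessor`, `glb_triangles`, `triangles_to_stl`) are my own inventions for illustration. It assumes an uncompressed GLB with a single embedded buffer and plain triangle lists, and it ignores node transforms, materials and textures, so a real file may well need a proper converter instead.

```python
import json
import struct

# glTF componentType codes for index data -> struct format characters
INDEX_FORMATS = {5121: "B", 5123: "H", 5125: "I"}


def read_glb(data):
    """Split GLB bytes into the JSON scene description and the binary buffer."""
    magic, version, _length = struct.unpack_from("<4sII", data, 0)
    if magic != b"glTF" or version != 2:
        raise ValueError("not a glTF 2.0 binary file")
    offset, doc, blob = 12, None, b""
    while offset + 8 <= len(data):
        chunk_len, chunk_type = struct.unpack_from("<I4s", data, offset)
        body = data[offset + 8 : offset + 8 + chunk_len]
        if chunk_type == b"JSON":
            doc = json.loads(body.decode("utf-8"))
        elif chunk_type == b"BIN\x00":
            blob = body
        offset += 8 + chunk_len
    if doc is None:
        raise ValueError("GLB file has no JSON chunk")
    return doc, blob


def read_accessor(doc, blob, index, fmt, width):
    """Read one accessor as a list of tuples holding `width` values each."""
    acc = doc["accessors"][index]
    view = doc["bufferViews"][acc["bufferView"]]
    start = view.get("byteOffset", 0) + acc.get("byteOffset", 0)
    layout = "<" + fmt * width
    stride = view.get("byteStride") or struct.calcsize(layout)
    return [
        struct.unpack_from(layout, blob, start + i * stride)
        for i in range(acc["count"])
    ]


def glb_triangles(data):
    """Collect every triangle in the file as three (x, y, z) tuples."""
    doc, blob = read_glb(data)
    triangles = []
    for mesh in doc.get("meshes", []):
        for prim in mesh["primitives"]:
            if prim.get("mode", 4) != 4:  # 4 means a plain triangle list
                continue
            points = read_accessor(doc, blob, prim["attributes"]["POSITION"], "f", 3)
            if "indices" in prim:
                ctype = doc["accessors"][prim["indices"]]["componentType"]
                rows = read_accessor(doc, blob, prim["indices"], INDEX_FORMATS[ctype], 1)
                order = [row[0] for row in rows]
            else:
                order = list(range(len(points)))
            for k in range(0, len(order) - 2, 3):
                triangles.append(tuple(points[order[k + j]] for j in range(3)))
    return triangles


def triangles_to_stl(triangles):
    """Encode triangles as binary STL bytes, leaving the normals as zero."""
    parts = [b"\0" * 80, struct.pack("<I", len(triangles))]
    for a, b, c in triangles:
        parts.append(struct.pack("<12fH", 0.0, 0.0, 0.0, *a, *b, *c, 0))
    return b"".join(parts)
```

To use it, read the downloaded GLB file as bytes, pass them to `glb_triangles`, and write the output of `triangles_to_stl` to a file ending in `.stl`. The normals are left as zero here for simplicity; a viewer that relies on stored normals would need them computed from the triangle vertices.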
After conversion, here’s what I got, as viewed in MeshLab:
The front seems reasonable, and so does the back. It’s just that CSM seems to have misjudged the squareness of the toaster. I suspect that’s because its training has been fine-tuned to figures rather than objects.
In other words, I think this concept can work, if tweaked a bit.
Once this is working, it just might be possible to generate competent 3D models of many things simply by taking a single 2D image.