We introduce Shape Tokens, a 3D representation that is continuous, compact, and easy to integrate into machine learning models. Shape Tokens serve as conditioning vectors that carry shape information within a 3D flow-matching model. This flow-matching model is trained to approximate probability density functions corresponding to delta functions concentrated on the surfaces of 3D shapes. By incorporating Shape Tokens into various machine learning models, we can generate new shapes, convert images to 3D, align 3D shapes with text and images, and render shapes directly at variable, user-specified resolutions. Shape Tokens also enable systematic analysis of geometric properties, including normals, density, and deformation fields. Across all tasks and experiments, Shape Tokens demonstrate strong performance compared to existing baselines.
Figure 1: Our Shape Tokens representation can be easily used as the input or output of machine learning models in various applications, including single-image-to-3D generation (left), neural rendering of normal maps (top right), and 3D-CLIP alignment (bottom right). The resulting models achieve strong performance compared to baselines on the individual tasks.
Video 1: Our single-image-to-3D point cloud results. The input images show unseen objects from the Objaverse test set. Each video first shows the input image and then the generated point cloud. (Credits)
Video 2: From the same unseen input image, we independently generate multiple point clouds. (Credits)
Figure 2: Overview of our architecture. (Left) We model a 3D shape as a probability density function that is concentrated on the surface, forming a 3D delta function. (Right) Our tokenizer uses cross-attention to aggregate information about the sampled point cloud into Shape Tokens (ST). The velocity estimator uses only cross-attention and MLPs, so that each point is processed independently of the others.
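The independence property described in the caption can be illustrated with a minimal NumPy sketch. This is not the actual architecture: the single-head attention, layer sizes, random weights, and the names `init_params` and `velocity` are all assumptions made for illustration. Each query point attends only to the shape tokens, never to other points, so the velocity predicted for a point does not depend on which other points are in the batch.

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(rng, d_tok=8, d_model=16):
    # Random weights stand in for trained ones; all sizes are hypothetical.
    s = lambda *shape: rng.normal(size=shape) / np.sqrt(shape[0])
    return {"Wq": s(4, d_model), "Wk": s(d_tok, d_model), "Wv": s(d_tok, d_model),
            "W1": s(d_model, d_model), "W2": s(d_model, 3)}

def velocity(points, t, tokens, p):
    """Toy velocity estimator: points (N, 3), tokens (K, d_tok) -> (N, 3)."""
    # Per-point input features: position and flow time.
    feat = np.concatenate([points, np.full((len(points), 1), t)], axis=1)
    q = feat @ p["Wq"]            # one query per point
    k = tokens @ p["Wk"]          # keys come from the shape tokens only
    v = tokens @ p["Wv"]          # values come from the shape tokens only
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=-1)   # (N, K)
    h = q + attn @ v              # cross-attention with a residual connection
    h = np.tanh(h @ p["W1"])      # per-point MLP
    return h @ p["W2"]

rng = np.random.default_rng(0)
params = init_params(rng)
tokens = rng.normal(size=(32, 8))     # stand-in for the Shape Tokens of one shape
points = rng.normal(size=(100, 3))

v_batch = velocity(points, 0.3, tokens, params)
v_single = velocity(points[5:6], 0.3, tokens, params)
```

Evaluating point 5 alone gives the same velocity as evaluating it within the full batch, which is what allows an arbitrary number of points to be sampled for the same set of tokens.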
Figure 3: Reconstruction, densification, and normal estimation on unseen point clouds from the GSO dataset. For each row, given a point cloud of 16,384 points (xyz only), we compute ST and show 262,144 points sampled i.i.d. from the resulting p(x | s). The columns show the input and the sampled point clouds from different viewpoints. As indicated by the labels in parentheses, we color the input points by their xyz coordinates, and the sampled points by the uvw coordinates of their initial noise or by their estimated normals (last two columns). Note that normals are not provided as input to the shape tokenizer.
Video 4: We compute Shape Tokens from input point clouds (16,384 points) of unseen objects from Google Scanned Objects. We then sample 16 times as many points (262,144 points). The video shows the uvw-to-xyz trajectory of the flow-matching sampling process, that is, the ODE trajectory. We color the points by their initial positions in noise space (uvw). (Credits)
Figure 4: The ODE integration path defines a mapping from xyz (data) to uvw (noise).
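As a rough illustration of the sampling process shown in Video 4 and Figure 4, the sketch below integrates an ODE from noise (uvw) to data (xyz) with fixed-step Euler updates. The velocity field here is a toy stand-in that pulls every point onto the unit sphere; in the real model it would be the learned velocity estimator conditioned on Shape Tokens. The time convention (noise at t = 0, data at t = 1), the step count, and the names `toy_velocity` and `sample_points` are assumptions.

```python
import numpy as np

def toy_velocity(x, t):
    # Hypothetical stand-in for the learned velocity field: moves each
    # point toward its radial projection onto the unit sphere.
    target = x / np.linalg.norm(x, axis=1, keepdims=True)
    return (target - x) / (1.0 - t)

def sample_points(n_points, n_steps=50, seed=0, velocity=toy_velocity):
    rng = np.random.default_rng(seed)
    uvw = rng.normal(size=(n_points, 3))   # initial noise, one row per point
    x = uvw.copy()
    dt = 1.0 / n_steps
    trajectory = [x.copy()]
    for k in range(n_steps):
        t = k * dt                          # t runs over 0, dt, ..., 1 - dt
        x = x + dt * velocity(x, t)         # Euler step, applied to every point independently
        trajectory.append(x.copy())
    return uvw, x, np.stack(trajectory)

uvw, xyz, traj = sample_points(4096)

# Color each sampled point by its initial noise position, rescaled to [0, 1].
colors = (uvw - uvw.min(0)) / (uvw.max(0) - uvw.min(0))
```

Because every point follows its own trajectory, the number of sampled points can be chosen freely at inference time, and running the same ODE backward gives the xyz-to-uvw mapping.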
Video 3: Results of recent single-image-to-3D methods on Google Scanned Objects, which none of the methods saw during training.
From left to right:
Input image
Splatter Image (CVPR 2024) – trained on Objaverse
Point-E (2022) – trained on several million proprietary 3D meshes
Make-A-Shape (ICML 2024) – trained on 18 datasets, including Objaverse
Ours – trained on Objaverse
Please note that this video is not intended as a comparison of the individual methods: the models differ in their training data (e.g., Point-E was trained on proprietary 3D meshes) and in their mechanisms (e.g., Splatter Image is not a generative model, and our method assumes a known camera model). We provide the results for the viewer's reference. (Credits)
Video 5: Neural rendering results on unseen point clouds. Given the Shape Tokens, a neural network independently estimates, for each ray, its intersection point with the surface and the surface normal at that point.
From left to right:
Ground-truth surface normals
Pointersect (CVPR 2023)
Ours
(Credits)
Mesh/Image Credits: Google Scanned Objects, feedomo.ru, Jacob.Elhatmi, WrenArt, undeadfae, Monicag97, STK_produktion, Andi R, xabi, th_jabba, johnnokomis, LasquetiSpice, AdiXXioN, taplinhvip111, Stolmark, Koppany.IDK, vetorprotensao, Jackson Sanders, remdwaas, GRAPHTEC AMERICA, iiircha, despinozavi, AstrumProjects, asleshka, ulmsklv, S.Duce, idcim, Darkkostas25, CREATRBOI, steam2020, feedomo.ru, AnirudhRao, 3DFoxHound, pattarrian, katienixdesigns, icepacha, A109082012, RyanCrosby, Armen Gevorgyan, EnjoyLife_Tlt, Fong Chen, WHA Arquitectos, andreagonzalez28, YouSaveTime, Cutestormy, amy3d, daand, EfrenR, Poppy, MARTINICE GROUP, julianChee, Whatsername, Stuart, danielleclark, redkaratz, LuDiChRiS, mbilalsiddique1, Frybrix, defnotdan, invisiprim, Brent Loncher, MrMaxICT, Stevie_66, Jesse Van Norman, WuhuAirline, anyaachan, Lustron.ru, КУКАЛЕВ, Maxmalow, Karolina K Bieńkowska, Steel Frame Solutions Limited, James Robson, tepapalearninglab, showcasebook, Christopher Cox, apoiocad, Padraig Daly, CurveCreativeStudio, DennisGray, awards, YouniqueĪdeaStudio, Nadieroo, dinomaster, pattarrian, rodrigo.ferrada, tamaliteitor123, George B, Csaba Baity (tsabszy), tim.a.schmitz, romane_bouverot, RPG_Engineer, rilisjr, DJMaesen, agglover, Adrian Carter, mohamedsuspeito, Kevin Bond, faizn0rdin, SpaceCowBoy, Giravolt, NukedGames, bhrf, mscla1r3, ScannerDev, Vikrama Raghuraman, NoobiePie, prostair.pl, Rzyas, Phil Gosch, gFiamma, pahlevidaffa, Onironauta digital, pixelsquare, SketchingSushi, Mateus Schwaab, archmixes, jacob_kenndey, lidija.simo, Jessica Peterson, Ltcolscotty , 3Dystopia, Vicente Laberge, frdifrn, Frédérick Pagé, camlaneve, Matt, IronEqual, Tursito, Davidk, Mrs.XAYarnArt, prostair.pl, ChrisLee, guseu, Guilhermino, dieterreinert, Mattyew, natalimedeiros, leopro , Trappemakersen, beehn, alisachen69, Chrifuf, cncbrasil, zuzana vajdova, nguyenlouis32, DarksProducer, globalshizaku, louayleo, semmert179, naruemol.pholnuangma, Eric Haines, 3DHA, Nick_Sherman, chaosexcell, 
ssarinareza, aveli.ladva, Tomas Rubianes, RainerWahnsinn, Lucas Jaenisch, cs_adam, trinityscsp, a109082026, JasseeNFT, Cowdi, Kisielev Mikhail, kay Quobad, secretariatep, me16019, scailman, Stichting Consortium Beroepsonderwijs, feedomo.ru, PatelDev, bipolarbear, Emm (Scenario), De Oliveira M., Наруто, Keita-sama, RodierGabrielle, mizuhi, shughes, Gregory Khodyrev, millerj449, Marko31, David_Holiday, edouard.angebault, feedomo.ru, Artem Shamsuarov, Alan Grice Staircase Co Ltd, THESTIG03, vamsikrishna.v, Dundee Howff Conservation Group, sinhoroto, jia100, 10668285, Born_Canadian, jashma82, aki.karppinen, DarkAaron999, Luckster, julius.j.bib, trolosqlfod, RBG_illustraciones, feedomo.ru, MOHMAX1, jamesdeantv1, moxmoin, Adrian21, andrea bocchini geometra, Re3xyyz, Binkley-Spacetrucker, FeralMan, unownlord, pigfinite, duperonvincent, ayekerik, 140813, antonio.a.longoria, cyber0063, Mateja Veljkovic, Vonka Stairs Ltd, Bresca, kishi, 97jana, Sogomonyan_Vaagn, Peachybunny, gb.prof.69, milen.margaryan2003 , nguyenhuydang, andysmiles4games, Aorie, jonamanz9673, mom long legs, buckygaming2019, gwen.domingo, PointXX, Lukas Guhse, arakiminoru, Tatiana Sumarokova, potaato, Lustron.ru, jhseok8927, Xillute | Dev, re1monsen, c4n, Ceat, joseph.terronez, matousekfoto, Max Wittig, rltw, lsbergin, KIΣITO, Aiden Huxley, 3Dystopia, MartyUkovGBS, Jamie Rose, Mihail.Burduja, ashpatz845, Schack-Trapper, brian.h.moyer, Excel Stairs Ltd, Behets, Noemi.Mancilla.Serrano, madison319478, Drake, xeratdragons, timpugh44, GSMRF, Lauren Hasegawa, Ca7chi, dewathoem, schaffsp, newfields-3dprinting, Dikart, MariaMam, Micayla Spiros, silvinomc00, Neut2000, Orie J. Braun, hafsa.ishtiaq97, Robwaah007 , shakiller, newfields-3dprinting, -Slash-, Saumleid, DreamSail Games – Graham, Jingbari, sualogo3d, maypassamon, Uğur Yakışık, Caitlin, LynSalvador, lanvalond, TheDesigner, e90r96, guilherme.vinicius, Lustron.ru, ZOMBIEFOLIFE, TroyMay21, Qubx. 3D