The health, fashion, and fitness industries have a strong interest in the difficult computer vision problem of reconstructing 3D human body parts from images. This study addresses the reconstruction of a human foot. Accurate foot models are useful for shoe shopping, orthotics, and personal health monitoring, and retrieving a 3D foot model from images has become increasingly attractive as these industries' digital markets grow. Existing foot reconstruction solutions fall into four categories: expensive scanning devices; noisy point clouds reconstructed from depth maps captured by phone-based sensors such as a TrueDepth camera; Structure from Motion (SfM) followed by Multi-View Stereo (MVS); and generative foot models fit to image silhouettes.
The authors conclude that none of these options is suitable for accurate scanning in a home environment: most people cannot afford expensive scanning equipment; phone-based depth sensors are neither widely available nor easy to use; noisy point clouds are difficult to use for downstream tasks such as rendering and measurement; and existing generative foot models have been low-quality and restrictive, while relying only on image silhouettes discards most of the geometric information in the images, which is especially problematic when few views are available. SfM also depends on many input views to match dense features between images, and MVS can likewise produce noisy point clouds.
The limited availability of paired images and real 3D foot data for training further constrains these approaches. To address this, researchers from the University of Cambridge present FOUND (Foot Optimization using Uncertain Normals for surface Deformation). The method augments conventional multi-view reconstruction optimization with per-pixel surface normals and their associated uncertainties, and requires only a small number of calibrated RGB input images. Rather than relying solely on silhouettes, which carry little geometric information, it uses surface normals and keypoints as complementary cues. The authors also release a large collection of photorealistic synthetic images paired with ground-truth labels so that these signals can be learned despite the scarcity of real data.
Their main contributions are described below:
• They release SynFoot, a large-scale synthetic dataset of 50,000 photorealistic foot images with precise silhouette, surface normal, and keypoint labels, to aid research into 3D foot reconstruction. Obtaining such labels for real photographs would require expensive scanning equipment, whereas their synthetic pipeline scales easily. They show that the dataset captures enough variation in foot appearance for downstream tasks to generalize to real images, even though it is built from only 8 real-world foot scans. They also release an evaluation dataset of 474 images of 14 real feet, each paired with a high-resolution 3D scan and ground-truth per-pixel surface normals. Finally, they release their custom Python library for Blender, which enables efficient creation of large-scale synthetic datasets.
• They show that an uncertainty-aware surface normal estimation network trained solely on their synthetic data, derived from just 8 foot scans, can generalize to real foot images in the wild. To reduce the domain gap between synthetic and real foot photographs, they employ aggressive appearance and perspective augmentation. The network predicts a surface normal and an associated uncertainty at each pixel. The uncertainty is useful in two ways: first, thresholding it yields accurate silhouettes without training a separate segmentation network; second, using it to weight the surface normal loss in their optimization scheme increases robustness to inaccurate predictions in some views.
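The two uses of uncertainty described above can be sketched in plain NumPy. Note that the threshold value, the inverse-uncertainty weighting, and the array shapes below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def silhouette_from_uncertainty(uncertainty, threshold=0.5):
    """Treat pixels with confident normal predictions as foreground.

    `threshold` is an assumed value; in practice it would be tuned.
    Returns a boolean mask, with no separate segmentation network needed.
    """
    return uncertainty < threshold

def weighted_normal_loss(pred_normals, rendered_normals, uncertainty, eps=1e-8):
    """Uncertainty-weighted angular loss between predicted and rendered normals.

    pred_normals, rendered_normals: (H, W, 3) arrays of unit vectors.
    uncertainty: (H, W) per-pixel uncertainty; confident pixels get more weight.
    The inverse-uncertainty weighting here is an illustrative choice.
    """
    cos = np.clip((pred_normals * rendered_normals).sum(-1), -1.0, 1.0)
    angular_err = np.arccos(cos)           # per-pixel angle in radians
    weights = 1.0 / (uncertainty + eps)    # down-weight uncertain predictions
    return (weights * angular_err).sum() / weights.sum()

# Toy check: identical normals give zero loss, low uncertainty gives foreground.
n = np.zeros((4, 4, 3)); n[..., 2] = 1.0   # all normals point along +z
u = np.full((4, 4), 0.1)
print(weighted_normal_loss(n, n, u))        # → 0.0
print(silhouette_from_uncertainty(u).all()) # → True
```

In this formulation, views whose normal predictions are unreliable contribute little to the total loss, which is the robustness property the authors describe.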
• They provide an optimization scheme that uses differentiable rendering to fit a generative foot model to a set of calibrated images with predicted keypoints and surface normals. Their pipeline is uncertainty-aware, reconstructs a watertight mesh from a limited number of views, outperforms state-of-the-art photogrammetry for surface reconstruction, and works on data captured with a consumer mobile phone.
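At a high level, this fitting stage minimizes losses over the foot model's parameters by gradient descent. The toy sketch below fits only a hypothetical linear keypoint model; FOUND's actual model, losses, and differentiable renderer are far richer, and every name and number here is a placeholder:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear "foot model": keypoints = mean + basis @ shape_params.
n_keypoints, n_params = 10, 4
mean = rng.normal(size=(n_keypoints, 3))
basis = rng.normal(size=(n_keypoints * 3, n_params))

def model_keypoints(params):
    return mean + (basis @ params).reshape(n_keypoints, 3)

# Synthetic "observed" keypoints generated from known ground-truth parameters.
true_params = np.array([0.5, -0.3, 0.8, 0.1])
observed = model_keypoints(true_params)

# Gradient descent on the squared keypoint error. The closed-form gradient
# stands in for the differentiable renderer used in the actual pipeline;
# the step size is chosen from the spectral norm for stable convergence.
params = np.zeros(n_params)
lr = 0.5 / np.linalg.norm(basis, 2) ** 2
for _ in range(500):
    residual = (model_keypoints(params) - observed).reshape(-1)
    grad = 2.0 * basis.T @ residual
    params -= lr * grad

print(np.allclose(params, true_params, atol=1e-3))  # → True
```

In the real pipeline the residuals come from rendered silhouettes, normals, and keypoints rather than a linear model, but the principle is the same: gradients of the image-space losses flow back to the foot model's parameters.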
Check out the Paper and Project. All credit for this research goes to the researchers of this project.
Aneesh Tickoo is a consulting intern at MarktechPost. She is currently pursuing her bachelor's degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT) Bhilai. She spends most of her time working on projects aimed at harnessing the power of machine learning. Her research interest is image processing, and she is passionate about building solutions around it. She loves connecting with people and collaborating on interesting projects.