Expressive human pose and shape estimation (EHPS) from monocular images or videos is a cutting-edge field with applications in animation, gaming, and fashion. To accurately capture the complexity of the human body, face, and hands, this work typically relies on parametric human models such as SMPL-X. Recent years have seen an influx of new datasets, giving the community more opportunities to study factors such as capture environment, pose distribution, body visibility, and camera viewpoint. However, state-of-the-art methods are still trained on only a small number of these datasets, which creates a performance bottleneck across scenarios and hinders generalization to unseen domains.
To build reliable and broadly applicable models for EHPS, the authors set out to thoroughly analyze the available datasets. To do this, they created the first systematic benchmark for EHPS using 32 datasets and evaluated their performance against four key standards. The results reveal significant inconsistencies between benchmarks, which highlights the complexity of the overall EHPS landscape and calls for data scaling to address the domain gaps between scenarios. This in-depth analysis shows the need to reevaluate how existing datasets are used for EHPS and argues for a shift to more competitive alternatives that generalize better.
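The kind of cross-benchmark comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code: the dataset names, benchmark names, and error values are hypothetical placeholders, and the ranking rule (sort by mean error across benchmarks) is an assumption made for the example.

```python
from statistics import mean

# Hypothetical per-benchmark errors in mm (lower is better).
# Placeholder values for illustration only, not results from the paper.
errors_mm = {
    "dataset_a": {"bench_1": 120.0, "bench_2": 90.0, "bench_3": 150.0},
    "dataset_b": {"bench_1": 100.0, "bench_2": 110.0, "bench_3": 105.0},
    "dataset_c": {"bench_1": 80.0, "bench_2": 160.0, "bench_3": 180.0},
}


def rank_by_mean_error(table):
    """Return (dataset, mean error) pairs sorted from best to worst."""
    scored = [(name, mean(per_bench.values())) for name, per_bench in table.items()]
    return sorted(scored, key=lambda pair: pair[1])


def best_per_benchmark(table):
    """Return the dataset with the lowest error on each benchmark."""
    benchmarks = next(iter(table.values())).keys()
    return {b: min(table, key=lambda name: table[name][b]) for b in benchmarks}


ranking = rank_by_mean_error(errors_mm)
winners = best_per_benchmark(errors_mm)

for name, score in ranking:
    print(f"{name}: mean error {score:.1f} mm")
print("best dataset per benchmark:", winners)
```

In this toy table a different dataset wins on each benchmark, while the dataset with the best mean error wins on only one of them. That is the sort of inconsistency a single-benchmark evaluation would hide.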
The research emphasizes the value of using multiple datasets to benefit from their complementary nature. The authors also carefully analyze the factors that affect the transferability of these datasets. Their findings offer useful guidance for future dataset collection: 1) According to their observations, datasets do not need to be particularly large to be beneficial, as long as they contain more than 100,000 instances. 2) If in-the-wild (including outdoor) collection is not feasible, varied indoor settings are a good alternative. 3) Synthetic datasets, despite having detectable domain gaps, are proving surprisingly effective. 4) When SMPL-X annotations are not available, pseudo-SMPL-X labels are useful.
Using the insights from the benchmark, researchers from Nanyang Technological University, SenseTime Research, Shanghai AI Laboratory, the University of Tokyo, and the International Digital Economy Academy (IDEA) created SMPLer-X. This generalist foundation model is trained on a variety of datasets and delivers remarkably balanced results across scenarios. The work demonstrates the power of a large amount of carefully selected data. The authors designed SMPLer-X with a minimalist philosophy to decouple it from algorithmic research: it has a very simple architecture with only the components most essential for EHPS. Rather than offering a rigorous analysis of algorithmic choices, SMPLer-X aims to enable large-scale data and parameter scaling and to serve as a foundation for future research in the field.
Experiments with various data combinations and model sizes yield a comprehensive model that outperforms all benchmark results and challenges the widespread practice of training on a limited set of datasets. The foundation models reduce the mean primary error on five major benchmarks (AGORA, UBody, EgoBody, 3DPW, and EHF) from more than 110 mm to less than 70 mm. They also show impressive generalization by transferring successfully to new scenarios such as RenBody and ARCTIC. Furthermore, the authors demonstrate that fine-tuning their generalist foundation models turns them into domain-specific experts with exceptional performance across the board.
Specifically, they apply the same data selection methodology to their specialist models, which achieve SOTA performance on EgoBody, UBody, and EHF and set a new record on the AGORA leaderboard, becoming the first model to reach 107.2 mm in NMVE (an 11.0% improvement). The authors list three distinct contributions. 1) Using extensive EHPS datasets, they build the first systematic benchmark, which offers crucial guidance for scaling up training data toward reliable and transferable EHPS. 2) They investigate both data and model scaling to build a generalist foundation model that delivers balanced results across many scenarios and extends effectively to unseen datasets. 3) They fine-tune their foundation model into a powerful multi-benchmark specialist by extending the data selection strategy.
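To make the millimetre figures above more concrete, here is a minimal sketch of how a vertex-based error and a relative improvement might be computed. It rests on several assumptions that do not come from the article: vertices are given in metres, the predicted and ground-truth meshes are already aligned, and the normalized variant divides the error by a detection F1 score (which is how NMVE is commonly described for the AGORA benchmark). All function names and input values are hypothetical.

```python
import math


def mean_vertex_error_mm(pred, gt):
    """Average Euclidean distance between corresponding vertices, in mm.

    pred and gt are equal-length lists of (x, y, z) tuples in metres.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    total = sum(math.dist(p, g) for p, g in zip(pred, gt))
    return 1000.0 * total / len(pred)


def normalized_error(error_mm, f1):
    """Divide the error by a detection F1 score (assumed form of NMVE)."""
    if not 0.0 < f1 <= 1.0:
        raise ValueError("f1 must be in (0, 1]")
    return error_mm / f1


def relative_improvement(old, new):
    """Percentage reduction of an error metric from old to new."""
    return 100.0 * (old - new) / old


# Toy two-vertex meshes (hypothetical values, in metres).
pred = [(0.0, 0.0, 0.0), (0.03, 0.04, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

mve = mean_vertex_error_mm(pred, gt)
print(f"mean vertex error: {mve:.1f} mm")
print(f"normalized by F1=0.5: {normalized_error(mve, 0.5):.1f} mm")
print(f"improvement from 120.0 to 108.0: {relative_improvement(120.0, 108.0):.1f}%")
```

Real evaluation pipelines typically add steps this sketch leaves out, such as root alignment or per-part (hand, face) breakdowns, so the numbers here are not comparable to leaderboard values.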
Check out the Paper, Project page, and GitHub. All credit for this research goes to the researchers of this project.
Aneesh Tickoo is a consulting intern at MarktechPost. She is currently pursuing her bachelor’s degree in Data Science and artificial intelligence at the Indian Institute of technology (IIT), Bhilai. She spends most of her time working on projects aimed at harnessing the power of machine learning. Her research interest is image processing and she is passionate about creating solutions around it. She loves connecting with people and collaborating on interesting projects.