SenseTime, a leading artificial intelligence company from China, has unveiled its latest breakthrough, SenseNova 5.5, at the 2024 World Artificial Intelligence Conference & High-Level Meeting on Global AI Governance. The announcement highlights SenseTime's commitment to innovation and practical application in various industries.
The SenseNova 5.5 large model is a comprehensive upgrade that integrates SenseNova 5o, the first real-time multimodal model in China. SenseNova 5o features a new AI interaction framework comparable to GPT-4’s streaming interaction capabilities. Its multimodal design allows it to process and respond to audio, text, image, and video data in real time, giving users an interactive experience similar to conversing with a human. This is particularly valuable for real-time conversation and speech recognition applications, which depend on the model’s adaptability and contextual response capabilities.
One of the key highlights of SenseNova 5.5 is its cost-effective, large-scale edge model, which reduces the cost per device to just RMB 9.90 per year. This affordability facilitates widespread deployment, making advanced AI accessible to a wide range of users and industries. SenseTime’s full-stack, cloud-to-edge large-model product array ensures continuous upgrades, delivering solutions for generative applications across multiple scenarios and industries. SenseNova’s large model has already been deployed to over 3,000 government and corporate customers, spanning the technology, healthcare, finance, and programming sectors.
Dr. Xu Li, Chairman and CEO of SenseTime, highlighted the significance of this upgrade, saying, “This is a critical year for large-scale models as they evolve from unimodal to multimodal. In line with user needs, SenseTime is also focused on driving interactivity. With applications that drive model development and capabilities, coupled with technological advancements in multimodal streaming interactions, we will witness unprecedented transformations in human-AI interactions.”
SenseNova 5.5 is built on a hybrid cloud-edge expert collaboration architecture, which optimizes cloud-edge synergy and reduces inference costs. The model was trained on tokens drawn from over 10TB of high-quality training data, including synthetically generated reasoning-chain data that enhanced its reasoning capabilities. Compared to its predecessor, SenseNova 5.0, the new model delivers a 30% improvement in overall performance, with stronger mathematical reasoning, English proficiency, and instruction following, bringing it close to GPT-4 on core metrics.
In addition to the major model upgrades, SenseTime has introduced SenseChat Lite-5.5, an edge model that cuts inference time to 0.19 seconds, a 40% improvement over SenseChat Lite-5.0. Inference speed has also increased by 15%, reaching 90.2 words per second. The edge model product array includes specialized models such as the SenseChat Mini Writing Assistant, Summarization Assistant, and Encyclopedia Assistant, each tailored to specific business needs.
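To put the SenseChat Lite-5.5 figures in context, the percentages above can be turned back into implied figures for SenseChat Lite-5.0. The sketch below is only illustrative arithmetic: it assumes that the "40% improvement" means a 40% reduction in inference time and that the "15%" is a relative increase in words per second. The announcement does not spell out either definition, and the function names are hypothetical.

```python
def implied_previous_latency(new_latency_s: float, reduction: float) -> float:
    """Back out the earlier latency, assuming new = old * (1 - reduction)."""
    return new_latency_s / (1.0 - reduction)


def implied_previous_rate(new_rate: float, increase: float) -> float:
    """Back out the earlier throughput, assuming new = old * (1 + increase)."""
    return new_rate / (1.0 + increase)


# Figures quoted for SenseChat Lite-5.5 in the article.
lite55_latency_s = 0.19
lite55_words_per_s = 90.2

# Implied SenseChat Lite-5.0 figures under the assumptions stated above.
prev_latency_s = implied_previous_latency(lite55_latency_s, 0.40)
prev_words_per_s = implied_previous_rate(lite55_words_per_s, 0.15)

print(f"Implied Lite-5.0 inference time: {prev_latency_s:.3f} s")
print(f"Implied Lite-5.0 speed: {prev_words_per_s:.1f} words/s")
```

Under these assumptions, the earlier model would have had an inference time of roughly 0.32 seconds and a speed of roughly 78 words per second. A different reading of "improvement" would give different baselines.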
An important addition to the SenseNova suite is Vimi, SenseTime’s first controllable AI avatar video generator. Vimi can generate short video clips with precise control of facial expressions and upper body movements. It is an ideal tool for long-form video generation in interactive and entertainment applications. This feature underscores SenseTime’s commitment to expanding generative AI applications under SenseNova’s large model series, serving diverse user needs and supporting industries in their digital transformation efforts.
SenseTime has also launched the “Project $0 Go” program, which offers a free, comprehensive onboarding package for enterprise users migrating from the OpenAI platform. This initiative includes a package of 50 million tokens and API migration consulting services, lowering the barriers to entry for enterprises looking to leverage SenseNova’s robust large-model capabilities.
In conclusion, 2024 is a banner year for large models, and it also marks SenseTime’s 10th anniversary. The company’s decade-long journey has culminated in a comprehensive large-model product array spanning applications from cloud to edge. As SenseTime continues to expand the SenseNova industry ecosystem, it remains dedicated to empowering more businesses and communities on their digital transformation journeys.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary engineer and entrepreneur, Asif is committed to harnessing the potential of AI for social good. His most recent initiative is the launch of an AI media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is technically sound and easily understandable to a wide audience. The platform has over 2 million monthly views, illustrating its popularity among the public.