Building information modeling (BIM) is a comprehensive method of representing built assets using geometric and semantic data. This data can be used throughout a building's life cycle and shared in specific forms among project stakeholders. Today's BIM authoring software must accommodate diverse design needs, and this unified approach has led to a large number of features and tools, increasing the complexity of the user interface. Translating design intentions into complicated command flows to generate building models in the software can be challenging for designers, who often need substantial training to overcome the steep learning curve.
Recent research suggests that large language models (LLMs) can be used to generate wall elements automatically. Advanced 3D generative models, such as Magic3D and DreamFusion, allow designers to convey their design intent in natural language rather than through laborious modeling commands, which is particularly useful in fields such as virtual reality and game development. However, these text-to-3D methods typically rely on implicit representations such as neural radiance fields (NeRFs) or voxels, which capture only surface-level geometry and carry neither semantic information nor any description of what lies inside a 3D object. The discrepancy between these purely geometric shapes and native BIM models makes them difficult to incorporate into BIM-based architectural design processes, and the missing semantics, combined with the fact that designers cannot directly edit the generated content in BIM authoring tools, makes such models hard to use in subsequent building simulation, analysis, and maintenance work.
A new study by researchers at the Technical University of Munich presents Text2BIM, an LLM-based multi-agent framework. The team employs four LLM-based agents with specific roles and capabilities that communicate with each other via text to realize the core idea: a product owner agent writes a comprehensive requirements document and refines the user's instructions; an architect agent develops a textual construction plan based on architectural knowledge; a programmer agent analyzes the requirements and writes modeling code; and a reviewer agent corrects model issues by suggesting ways to optimize the code. This collaborative approach ensures that the core idea of Text2BIM is implemented effectively and efficiently.
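In spirit, the agent collaboration resembles a simple orchestration loop. The sketch below is a minimal Python illustration of that idea; the role prompts and the `call_llm` helper are hypothetical placeholders, not the paper's actual implementation.

```python
# Minimal sketch of the four-agent loop described above.
# call_llm() and the role prompts are hypothetical; the real Text2BIM
# framework defines its own prompting and tool-calling protocol.

def call_llm(system_prompt: str, user_message: str) -> str:
    """Placeholder for a chat-completion call (e.g. GPT-4o, Mistral-Large-2)."""
    raise NotImplementedError

def text2bim_pipeline(user_instruction: str, max_refinements: int = 3) -> str:
    # 1. Product owner: expand the user's instruction into a requirements document.
    requirements = call_llm("You are a product owner. Write detailed requirements.",
                            user_instruction)

    # 2. Architect: turn the requirements into a textual construction plan.
    plan = call_llm("You are an architect. Produce a step-by-step building plan.",
                    requirements)

    # 3. Programmer: translate the plan into modeling code that calls BIM tools.
    code = call_llm("You are a programmer. Write code using the provided BIM tools.",
                    plan)

    # 4. Reviewer: iteratively check the generated model and suggest code fixes.
    for _ in range(max_refinements):
        issues = call_llm("You are a reviewer. List problems in the generated model.",
                          code)
        if "no issues" in issues.lower():
            break
        code = call_llm("You are a programmer. Revise the code to fix these issues.",
                        issues + "\n\n" + code)
    return code
```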
Manually created tool functions serve as concise, high-level API interfaces that LLMs can call naturally. Because the native BIM authoring software APIs are typically low-level and fine-grained, each tool encapsulates the logic of combining several callable API functions to accomplish its task. By embedding precise design criteria and engineering logic, a tool can carry out modeling jobs accurately while sparing the agents the complexity and tedium of low-level API calls. Building generic tool functions that handle diverse construction situations, however, is not straightforward.
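As a rough illustration, such a tool function might look like the following Python sketch, where `api` is a hypothetical stand-in for the BIM authoring software's low-level API; none of the method names are taken from the real SDK.

```python
# Hypothetical tool function: one high-level call that bundles several
# low-level steps (create geometry, apply parameters, attach semantics).
# The `api` object stands in for the BIM authoring software's low-level
# API wrapper; these method names are assumptions for illustration only.

def create_wall(api, start: tuple, end: tuple, height: float,
                storey: str, wall_type: str = "Generic Wall") -> int:
    """Create a wall between two points and return its element id."""
    wall_id = api.add_wall(start, end)        # low-level geometry call
    api.set_height(wall_id, height)           # apply engineering parameters
    api.assign_to_storey(wall_id, storey)     # attach location/semantic data
    api.set_wall_style(wall_id, wall_type)    # attach type information
    return wall_id
```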
To overcome this challenge, the researchers used quantitative and qualitative analysis to decide which tool functions to implement. They began by examining user log files to learn which commands (tools) human designers use most frequently when working with BIM authoring software. They used a single day's worth of log data collected from 1,000 anonymous users of the Vectorworks design software around the world, comprising roughly 25 million records in seven languages. After cleaning and filtering the raw data, they extracted the fifty most frequently used commands, ensuring that the Text2BIM framework is designed around actual user needs and preferences.
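Conceptually, this step is a frequency analysis over cleaned log records. The snippet below sketches it in Python under an assumed one-record-per-line, comma-separated log format; the actual log schema used in the study is an assumption here.

```python
# Rough sketch of the command-frequency analysis: count how often each
# command appears in the cleaned log records and keep the top fifty.
# The "user_id,command" line format is assumed, not taken from the paper.

from collections import Counter

def top_commands(log_lines: list[str], n: int = 50) -> list[tuple[str, int]]:
    counts = Counter()
    for line in log_lines:
        parts = line.strip().split(",")
        if len(parts) >= 2 and parts[1]:   # skip malformed or empty records
            counts[parts[1]] += 1          # command name in the second field
    return counts.most_common(n)
```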
To narrow these down to candidates for tool functions, they excluded commands that are primarily mouse-driven and highlighted the generic graphical modeling commands that can be implemented via APIs. The researchers also examined Vectorworks' built-in visual programming tool, Marionette, which is comparable to Dynamo and Grasshopper. Such visual scripting systems typically offer encapsulated versions of the underlying APIs tailored to specific scenarios, and the nodes or stacks that designers work with provide a more intuitive, higher-level programming interface. Software vendors group the default nodes by capability to make them easier for designers to understand and use. With a similar goal, the team drew on the nodes in the "BIM" category, since the use case targets conventional BIM models.
The researchers incorporated the proposed framework into Vectorworks, a BIM authoring tool, to create an interactive software prototype. Their implementation builds on the open-source Vectorworks Web Palette plugin template. Using Vue.js and a web environment based on the Chromium Embedded Framework (CEF), they embedded a dynamic web interface into Vectorworks with modern frontend technologies, yielding a web palette that is easy to use and understand. The palette's logic is implemented in C++, and the backend is a C++ application that defines and exposes asynchronous JavaScript functions within the web framework.
The evaluation uses a set of test user prompts (instructions) and compares the output of different LLMs, including GPT-4o, Mistral-Large-2, and Gemini-1.5-Pro. In addition, some construction constraints are deliberately omitted from the test prompts to probe the framework's ability to produce designs in open-ended settings. To account for the randomness of generative models, each test prompt was run five times per LLM, yielding 391 IFC models (including intermediate optimization results). The findings show that the method successfully creates building models that are well-structured and logically consistent with the abstract ideas specified by the user.
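The protocol can be pictured as a simple loop over prompts, models, and repetitions, as in the sketch below. The example prompt, the `generate_building` entry point, and the output paths are illustrative assumptions; only the model names and the five runs per prompt come from the study.

```python
# Sketch of the evaluation protocol: every test prompt is run five times per
# LLM to account for generation randomness, and each run exports an IFC model.

RUNS_PER_PROMPT = 5
llms = ["gpt-4o", "mistral-large-2", "gemini-1.5-pro"]
test_prompts = ["Create a three-storey residential building with a flat roof."]  # example only

def generate_building(prompt: str, llm: str, output_ifc: str) -> None:
    """Placeholder for one end-to-end Text2BIM run that exports an IFC file."""
    raise NotImplementedError

for llm in llms:
    for p_idx, prompt in enumerate(test_prompts):
        for run in range(RUNS_PER_PROMPT):
            generate_building(prompt, llm=llm,
                              output_ifc=f"results/{llm}/prompt{p_idx}_run{run}.ifc")
```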
The goal of this work is to generate regular building models during the early design stage. The produced models include only essential building elements, such as walls, slabs, roofs, doors, and windows, along with indicative semantic data, such as storeys, locations, and material descriptions. The work enables an intuitive expression of design intent by freeing designers from the monotony of repetitive modeling commands. The team notes that the user can always return to the BIM authoring tool and modify the generated models, striking a balance between automation and designer autonomy.
Take a look at the Paper. All credit for this research goes to the researchers of this project.
Dhanshree Shenwai is a Computer Science Engineer with extensive experience in FinTech companies spanning the Finance, Cards & Payments and Banking space and is keenly interested in the applications of artificial intelligence. She is excited to explore new technologies and advancements in today’s ever-changing world, making life easier for everyone.