Common-sense background knowledge is essential for making decisions under uncertainty in real-world settings. Say you want to assign labels to the scene in Figure 1. Once a few key elements are recognized, it becomes clear that the image shows a bathroom. This resolves some of the harder object labels, such as the shower curtain (rather than a window curtain) and the mirror (rather than a portrait on the wall). Beyond visual tasks, prior knowledge about which objects and events tend to co-occur is crucial for navigating new environments and understanding the actions of other agents. Such expectations are likewise essential for object categorization and reading comprehension.
Unlike robot demonstrations or segmented images, vast corpora of text are easily accessible and cover virtually every aspect of human experience. Current machine learning models rely on task-specific datasets to learn prior distributions over labels and judgments in most problem domains, which can lead to systematic errors when the training data is skewed or sparse, particularly on unusual or out-of-distribution inputs. How, then, can models be equipped with broader and more flexible prior knowledge? The researchers propose using language models (LMs), distributions learned over natural language strings, as task-general probabilistic priors.
LMs have been used as sources of prior knowledge for tasks ranging from common-sense question answering to modeling scripts and stories to synthesizing probabilistic programs, in language processing and other text generation settings. They encode much of this information, such as the fact that dishes are kept in kitchens and dining rooms and that eggs are cracked before they are beaten, often with greater diversity and fidelity than small task-specific datasets. Such linguistic supervision has also been suggested as a contributor to common-sense human knowledge in areas that are difficult to learn from first-hand experience.
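To make concrete how such knowledge can be read off an LM, here is a minimal sketch (not the paper's code) that scores natural-language statements with an off-the-shelf GPT-2 model via Hugging Face Transformers. A statement's total log-probability under the LM can serve as a rough measure of how plausible the LM finds it:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any off-the-shelf causal LM works for this demonstration; GPT-2 is small.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def lm_log_prob(text: str) -> float:
    """Total log-probability the LM assigns to `text` (summed over tokens)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    # With labels=ids the model returns the mean cross-entropy over the
    # (n - 1) predicted tokens; negate and rescale to get a summed log-prob.
    loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)

# A reasonable LM should assign a higher score to the common-sense ordering.
print(lm_log_prob("Eggs are cracked before they are beaten."))
print(lm_log_prob("Eggs are beaten before they are cracked."))
```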
Model-chaining techniques, which encode the output of perceptual systems as natural language strings that prompt LMs to produce labels or plans directly, have also been used to tackle grounded language understanding. This study instead treats LMs as a source of probabilistic background knowledge that can be integrated with existing domain models. LMs pair naturally with structured probabilistic modeling frameworks: they can be combined with domain-specific generative models or likelihood functions, integrating "top-down" prior knowledge with "bottom-up" task-specific predictors by using LMs to place priors on labels, decisions, or model parameters.
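As a toy illustration of this combination (the numbers and scoring below are hypothetical, not taken from the paper), Bayes' rule in log space can flip a hard perceptual decision: a bottom-up segmentation model slightly prefers "window curtain", but an LM prior conditioned on the scene being a bathroom shifts the posterior to "shower curtain":

```python
import math

def log_normalize(log_scores):
    """Convert unnormalized log scores into log-probabilities."""
    m = max(log_scores.values())
    z = m + math.log(sum(math.exp(v - m) for v in log_scores.values()))
    return {k: v - z for k, v in log_scores.items()}

# Bottom-up evidence: hypothetical log-likelihoods log p(pixels | label)
# from a segmentation model that finds the two curtains hard to tell apart.
likelihood = {"window curtain": -1.05, "shower curtain": -1.20}

# Top-down knowledge: hypothetical LM log-priors log p(label | bathroom),
# e.g. obtained by scoring "In a bathroom, there is a ..." with an LM.
lm_prior = log_normalize({"window curtain": -3.0, "shower curtain": -0.8})

# Bayes' rule in log space:
#   log p(label | pixels, scene) ∝ log p(pixels | label) + log p(label | scene)
posterior = log_normalize({k: likelihood[k] + lm_prior[k] for k in likelihood})

print(max(likelihood, key=likelihood.get))  # window curtain (bottom-up alone)
print(max(posterior, key=posterior.get))    # shower curtain (with LM prior)
```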
The researchers call this approach LAMPP. It offers a robust way to combine linguistic supervision with structured uncertainty about non-linguistic variables, making it possible to benefit from LM knowledge even on challenging tasks that LMs struggle to complete on their own. LAMPP is flexible and can be applied to many kinds of problems; the paper presents three case studies on semantic image segmentation, robot navigation, and video action segmentation. LAMPP frequently improves performance on rare, out-of-distribution, and structurally novel inputs, and sometimes even improves accuracy on examples within the domain model's training distribution. These results show that language is a useful source of background knowledge for general decision-making, and that uncertainty in this background knowledge can be effectively integrated with uncertainty in non-linguistic problem domains.
Check out the Paper. All credit for this research goes to the researchers on this project. Also, don't forget to join our 14k+ ML SubReddit, Discord channel, and email newsletter, where we share the latest AI research news, cool AI projects, and more.
Aneesh Tickoo is a consulting intern at MarktechPost. She is currently pursuing her bachelor's degree in Information Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. She spends most of her time working on projects aimed at harnessing the power of machine learning. Her research interest is image processing, and she is passionate about building solutions around it. She loves connecting with people and collaborating on interesting projects.