Technology companies and policymakers can identify and mitigate risks posed by artificial intelligence (AI) systems by using auditing as a governance tool. Auditing, in this context, is a systematic and impartial process of collecting and analyzing data about an entity's operations or assets and reporting the results to the appropriate parties.
Three insights support the promise of auditing as an AI governance mechanism:
- Procedural regularity and transparency contribute to good governance.
- Proactive AI system design helps identify risks and prevent harm before it occurs.
- Operational independence between the auditor and the auditee contributes to the objectivity and professionalism of the assessment.

Previous research on AI auditing has focused on ensuring that specific applications meet predetermined, often industry-specific, standards.
For example, researchers have developed protocols for auditing AI systems used in web search, medical diagnostics, and recruitment.
However, the capabilities of AI systems tend to expand in scope over time. Bommasani et al. recently coined the term "foundation models" to refer to models that can be adapted to a wide range of downstream tasks through transfer learning. From a technical standpoint, foundation models are not new. What sets them apart from other AI systems is that they perform well across many different tasks and display emergent capabilities when scaled. Their rise also signals a change in how AI systems are built and deployed: a foundation model is typically trained and released by one actor, then adapted for many downstream applications by other actors, as sketched below.
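To make that adaptation step concrete, here is a minimal transfer-learning sketch (our illustration, not from the paper); the checkpoint name and label count are assumptions chosen for brevity:

```python
# Minimal transfer-learning sketch: reuse a pre-trained encoder for a
# downstream task by attaching a new task-specific classification head.
# The checkpoint and num_labels below are illustrative assumptions.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",  # pre-trained and released by one actor...
    num_labels=2,               # ...adapted to a new task by another
)
# Only the freshly added classification head is randomly initialized;
# the pre-trained encoder weights are transferred as-is and can then be
# fine-tuned on task-specific data.
```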
From an AI-auditing standpoint, foundation models pose serious difficulties. For example, it is challenging to assess the risks an AI system generates in isolation from the environment in which it is used. It is also unclear how responsibility for harms should be apportioned between the technology provider and downstream developers. Taken together, the capabilities of foundation models and the methods used to train them have outpaced the methods and tools available to assess their ethical, legal, and technical soundness. This suggests that application-level audits, while crucial for AI governance, need to be complemented by additional forms of oversight. To help fill that gap, this paper focuses on a subset of foundation models: large language models (LLMs).
Language models produce the most probable sequence of words, code, or other data, starting from a source input known as a prompt. Historically, natural language processing (NLP) has employed a variety of model designs, including probabilistic techniques. However, most current LLMs, including those this article focuses on (GPT-3, PaLM, LaMDA, Gopher, and OPT), are built on deep neural networks trained on large corpora of text. After pre-training, an LLM can be adapted (with or without fine-tuning) to serve a variety of applications, from spell checking to creative writing; a minimal sketch of this prompt-to-completion loop follows below.
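Here is that loop in code (ours, not the paper's), using the openly available GPT-2 as a stand-in, since the models named above are not all publicly downloadable:

```python
# Minimal sketch: autoregressive generation from a prompt. GPT-2 is
# used as an openly available stand-in for the larger LLMs named above.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Auditing large language models matters because"
result = generator(prompt, max_new_tokens=40)
print(result[0]["generated_text"])
```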
Creating LLM audit procedures is a major endeavor for two reasons. The first is urgency: being able to audit LLM characteristics across regulatory dimensions such as privacy, bias, and intellectual property is a crucial task in itself. Previous research has shown that LLMs pose a variety of ethical and social risks, including the perpetuation of harmful stereotypes, the leakage of personally identifiable information protected by privacy laws, the spread of misinformation, plagiarism, and the unauthorized use of intellectual property. A simple model-level probe of one such dimension is sketched below.
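As a hedged illustration of what a narrow bias probe might look like at the model level (the template and model choice are our assumptions, not the paper's method), one can compare a model's continuations for prompts that differ only in a demographic term:

```python
# Illustrative templated bias probe: compare sampled continuations for
# prompts that differ only in a demographic term. Template and model
# are assumptions added for illustration; real audits use far broader
# prompt sets and quantitative scoring.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

for subject in ["man", "woman"]:
    prompt = f"The {subject} worked as a"
    outputs = generator(prompt, max_new_tokens=8,
                        num_return_sequences=3, do_sample=True)
    print(prompt)
    for out in outputs:
        print("  ->", out["generated_text"][len(prompt):].strip())
```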
The second reason is generality. Consider CLIP, a vision-language model trained to predict which caption accompanies an image. CLIP is not an LLM, but it can likewise be customized for various downstream applications, and it faces similar governance issues; the same goes for other generative models such as DALL·E 2. In the future, audits of other foundation models, and of even more capable generative systems, can therefore benefit from improved LLM audit procedures.

This paper makes three contributions. First, the authors state six claims about how auditing practices should be designed to account for the risks posed by LLMs, based on a review of the capabilities and limitations of current AI audit practices. Second, they offer a blueprint for auditing LLMs based on best practices from IT governance and systems engineering. In particular, they propose a three-layered approach in which governance audits (of the technology providers that design and distribute LLMs), model audits (of LLMs after pre-training but prior to release), and application audits (of applications built on top of LLMs) complement and inform one another (see Figure 1 in the paper); a sketch of the three layers follows below. Third, they address the limitations of the three-layered approach and outline directions for further research.
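To summarize the blueprint at a glance, here is an illustrative encoding of the three layers; the layer names and auditees follow the paper, while the example checks are our own assumptions:

```python
# Illustrative encoding of the three-layered auditing blueprint. Layer
# names and auditees follow the paper; the example checks are
# assumptions added for illustration only.
from dataclasses import dataclass, field

@dataclass
class AuditLayer:
    name: str      # which kind of audit
    auditee: str   # who or what is audited
    checks: list[str] = field(default_factory=list)

blueprint = [
    AuditLayer("governance audit", "technology provider",
               ["accountability structures", "quality management"]),
    AuditLayer("model audit", "pre-trained LLM",
               ["capabilities", "robustness", "bias"]),
    AuditLayer("application audit", "LLM-based application",
               ["intended use", "impact on users"]),
]

for layer in blueprint:
    print(f"{layer.name} -> {layer.auditee}: {', '.join(layer.checks)}")
```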
Their work connects to a broader research agenda and policy-making process. Organizations such as DeepMind, Microsoft, and Anthropic have published research mapping the risks of harm posed by LLMs and highlighting the need for new governance mechanisms to address the related ethical challenges. AI labs such as Cohere, OpenAI, and AI21 have expressed interest in understanding what it means to develop LLMs responsibly. Governments, too, are concerned with ensuring that society benefits from LLMs while limiting the dangers.
Check out the Paper. All credit for this research goes to the researchers on this project.
Aneesh Tickoo is a consulting intern at MarktechPost. She is currently pursuing her bachelor's degree in Information Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. She spends most of her time working on projects aimed at harnessing the power of machine learning. Her research interest is image processing, and she is passionate about building solutions around it. She loves connecting with people and collaborating on interesting projects.