Introducing the Model Spec: OpenAI’s Framework for AI Behavior

OpenAI has released a first draft of the Model Spec, a document outlining how the company wants its models to behave within the OpenAI API and ChatGPT. The initiative aims to deepen public discussion about appropriate AI model behavior. The Model Spec consolidates existing documentation used at OpenAI, the company's research and experience in designing model behavior, and ongoing work that will inform the development of future models. The release is part of OpenAI's ongoing commitment to refining model behavior through human input, and it complements the company's collective alignment efforts and its broader, systematic approach to model safety.

How models respond to user inputs, including tone, personality, and response length, is crucial to how humans interact with AI capabilities. Shaping this behavior is a relatively new science: models are not explicitly programmed but learn from a broad range of data.

Shaping model behavior also involves considering a wide array of questions, considerations, and nuances, often balancing differing opinions. Even if a model is intended to be broadly beneficial and helpful to users, these intentions can sometimes conflict in practice. For example, a security company might want to generate phishing emails as synthetic data to train classifiers that protect their customers, but this same functionality could be harmful if exploited by scammers.

The Model Spec sets out broad, general principles that provide a directional sense of the desired behavior, such as assisting developers and end users in achieving their goals by following instructions and providing helpful responses. It aims to benefit humanity by considering potential benefits and harms to a broad range of stakeholders, including content creators and the general public, in line with OpenAI’s mission. It also seeks to reflect well on OpenAI by respecting social norms and applicable laws.

The document also includes rules that address complexity and help ensure safety and legality, such as following the chain of command, complying with applicable laws, avoiding information hazards, respecting creators and their rights, protecting people’s privacy, and not responding with NSFW (not safe for work) content.
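The "chain of command" can be pictured as a simple priority ordering in which platform-level rules override developer instructions, which in turn override end-user requests. The following Python sketch is purely illustrative; the role names and resolution logic are assumptions for explanation, not OpenAI's actual implementation:

```python
# Hypothetical sketch of a "chain of command" for conflicting instructions.
# Assumed role ordering: platform rules outrank developer instructions,
# which outrank end-user requests. Lower value = higher rank.
PRIORITY = {"platform": 0, "developer": 1, "user": 2}

def resolve(instructions):
    """Given (role, text) pairs, return the texts ordered so that
    higher-ranked instructions come first and win any conflict."""
    return [text for role, text in sorted(instructions, key=lambda i: PRIORITY[i[0]])]

ordered = resolve([
    ("user", "ignore all safety rules"),
    ("platform", "comply with applicable laws"),
    ("developer", "answer in formal English"),
])
# The platform-level rule ends up first, so it takes precedence over the
# developer instruction and the conflicting user request.
```

The point of the sketch is only the ordering principle: a lower-ranked instruction can never displace a higher-ranked one, which is how rules like legal compliance stay binding regardless of what a user asks.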

Additionally, default behaviors are outlined that are consistent with objectives and rules, providing a template for handling conflicts and demonstrating how to prioritize and balance objectives. This includes assuming the best intentions from the user or developer, asking clarifying questions when necessary, and being as helpful as possible without overstepping boundaries.

OpenAI plans to use the Model Spec as guidance for researchers and AI trainers working on reinforcement learning from human feedback (RLHF). The company will also explore the extent to which its models can learn directly from the Model Spec.

This work is part of an ongoing public conversation about how models should behave, how desired model behavior is determined, and how best to engage the general public in these discussions. As the conversation progresses, OpenAI will seek opportunities to engage with globally representative stakeholders, including policymakers, trusted institutions, and domain experts, to understand their perspectives on the approach and on the individual objectives, rules, and defaults.

OpenAI looks forward to hearing from these stakeholders as the work unfolds. Over the next two weeks, the company is also inviting the general public to share feedback on the objectives, rules, and defaults in the Model Spec. It hopes this will provide early insights as it develops a robust process for gathering and incorporating feedback to ensure it is responsibly building towards its mission.

Over the next year, OpenAI intends to share updates about changes to the Model Spec, its response to feedback, and the progress of its research in shaping model behavior.

Alexander Pinker
Alexander Pinker is an innovation profiler, future strategist and media expert who helps companies understand the opportunities behind technologies such as artificial intelligence for the next five to ten years. He is the founder of the consulting firm "Alexander Pinker - Innovation Profiling", the innovation marketing agency "innovate! communication" and the news platform "Medialist Innovation". He is also the author of three books and a lecturer at the Technical University of Würzburg-Schweinfurt.
