
Generative AI models aren't actually humanlike. They have no intelligence or personality; they're simply statistical systems predicting the likeliest next words in a sentence. But like interns at a tyrannical workplace, they do follow instructions without complaint, including initial "system prompts" that prime the models with their basic qualities and what they should and shouldn't do.
Every generative AI vendor, from OpenAI to Anthropic, uses system prompts to prevent (or at least try to prevent) models from behaving badly, and to steer the general tone and sentiment of the models' replies. For instance, a prompt might tell a model to be polite but never apologetic, or to be honest about the fact that it can't know everything.
But vendors usually keep system prompts close to the chest, presumably for competitive reasons, but also perhaps because knowing the system prompt may suggest ways to circumvent it. The only way to expose GPT-4o's system prompt, for example, is through a prompt injection attack. And even then, the system's output can't be trusted completely.
However, Anthropic, in its continued effort to paint itself as a more ethical, transparent AI vendor, has published the system prompts for its latest models (Claude 3 Opus, Claude 3.5 Sonnet and Claude 3 Haiku) in the Claude iOS and Android apps and on the web.
Alex Albert, head of Anthropic's developer relations, said in a post on X that Anthropic plans to make this sort of disclosure a regular thing as it updates and fine-tunes its system prompts.
The latest prompts, dated July 12, outline very clearly what the Claude models can't do, for example: "Claude cannot open URLs, links, or videos." Facial recognition is a big no-no; the system prompt for Claude Opus tells the model to "always respond as if it is completely face blind" and to "avoid identifying or naming any humans in [images]."
But the prompts also describe certain personality traits and characteristics that Anthropic would have the Claude models exemplify.
The prompt for Claude 3 Opus, for instance, says that Claude is to appear as if it "[is] very smart and intellectually curious," and "enjoys hearing what humans think on an issue and engaging in discussion on a wide variety of topics." It also instructs Claude to treat controversial topics with impartiality and objectivity, providing "careful thoughts" and "clear information," and never to begin responses with the words "certainly" or "absolutely."
These system prompts all read as a bit strange to this human: they're written the way an actor in a stage play might write a character analysis sheet. The prompt for Opus ends with "Claude is now being connected with a human," which gives the impression that Claude is some sort of consciousness on the other end of the screen whose only purpose is to fulfill the whims of its human conversation partners.
But of course that's an illusion. If the prompts for Claude tell us anything, it's that without human guidance and hand-holding, these models are frighteningly blank slates.
With these new system prompt changelogs, the first of their kind from a major AI vendor, Anthropic is exerting pressure on competitors to publish the same. We'll have to see if the gambit works.


