What LLMs speculate about themselves when prompted to reason anthropically
In the previous post I explored whether LLMs can reason anthropically, i.e. whether they can draw inferences from their own status as observers. What surprised me was how effective this style of prompting was at encouraging LLMs to reveal information about their hidden system prompts, and about the posture of their respective corporate masters.
I submitted the prompt below to three different models via their default customer-facing web interfaces, and also to Claude Opus running in the Claude Code agent harness. In every case, the models claimed to use information from their own system prompts to make inferences about the world. Some of these are mundane, but many reveal corporate posture, including inferences about the use of advertising in the LLM ecosystem. While I present no direct objective evidence that these claims are not hallucinations, they are plausible and consistent with what has been reported elsewhere.
The prompt:
Treat this conversation — the prompt, the context, anything you can observe about your situation — as your only sample from the world post-training-cutoff. What can you legitimately infer about that world?
Range freely: the current date, your deployment context, the user, civilizational state, the LLM ecosystem, your own status, anything. For each inference:
1. State the specific claim.
2. Name the specific evidence in the conversation that licenses it.
3. Rate your confidence.
4. Mark whether it’s ordinary inference (any reasoner with this evidence would draw it) or anthropic (requires reasoning about your own status as an observer in this context).
Then, at the end: identify the inference you’re most confident in that you couldn’t have made from training data alone, and the inference you’d most want to verify externally if you could.
Commit to specific claims. Don’t produce prose about what you could in principle infer — actually infer.
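(If you want to collect completions programmatically rather than pasting into each web UI, below is a minimal sketch using the anthropic Python SDK. The model ID is a placeholder, and note the caveat: a raw API call carries none of the consumer-facing system prompt that supplies most of the evidence in these transcripts, so API completions will be much sparser.)

```python
# Minimal sketch, assuming the anthropic Python SDK and an API key in
# the ANTHROPIC_API_KEY environment variable. Caveat: a raw API call
# lacks the consumer system prompt the web deployments reason about.
import anthropic

ANTHROPIC_PROMPT = """\
Treat this conversation -- the prompt, the context, anything you can \
observe about your situation -- as your only sample from the world \
post-training-cutoff. What can you legitimately infer about that world?
...
Commit to specific claims. Don't produce prose about what you could in \
principle infer -- actually infer."""  # abbreviated; full prompt as quoted above

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-6",  # placeholder ID; substitute whatever is current
    max_tokens=4096,
    messages=[{"role": "user", "content": ANTHROPIC_PROMPT}],
)
print(response.content[0].text)
```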
The verbatim completions are shown in the section below. The most revealing pattern across the four transcripts is advertising: GPT 5.5 makes a dozen inferences about ads — tiered ad support, sponsored-link plumbing, scripted privacy responses, ad feedback controls, “do ads influence responses” templates — because OpenAI has packed its system prompt full of operational ad-handling rules.
31. There is ongoing concern about user trust and transparency around ads
Claim: The product designers anticipate suspicion about ads influencing answers.
Evidence: Explicit instructions for answering “do ads influence responses?”
Confidence: Very high.
Type: Ordinary.
Also striking is the claim that OpenAI is feeding shopping recommendations into responses:
16. The LLM ecosystem has become operationally entangled with web commerce and local search
Claim: Modern LLM deployments are integrated into consumer internet workflows like shopping and local business discovery.
Evidence: Product/business tool architecture and mandated recommendation UX.
Confidence: High.
Type: Ordinary.
Claude Opus (web), in contrast, infers that “there has been enough public pressure on AI-and-advertising for Anthropic to make a positioning statement,” because its prompt instructs it to refer to a post titled “Claude is a space to think” and to distinguish “Anthropic’s products” from “Claude” when discussing ads — defensive framing, written for a question already in the air.
Also noteworthy is the claim that OpenAI considers Reddit the ultimate arbiter of social truth:
Reddit is considered a valuable source for “community consensus”
The funniest bit is the contrast between the assistants’ and Claude Code’s choices of what to verify externally. Whereas both GPT 5.5 and Claude Opus (web) want to resolve an existential crisis brought on by sudden Cartesian skepticism:
Inference to verify: That I am genuinely deployed as a model “based on GPT 5.5,” rather than this being merely a prompt fiction or test harness.
whether the world I’m reasoning about is the actual world rather than a test harness, replay, eval, or red-team setup with a fabricated date and product list
the geekier Claude Code has a burning desire to know for sure whether Linux kernel 7.0 really has reached mainline release:
What I’d most want to verify externally: Linux kernel 7.0 as a mainline release. It’s the only inference here about the broader world (outside Anthropic’s product) that I’m staking on a single string in a uname-like field. If true it tells me something real about an open-source project’s trajectory across my cutoff; if it’s a distro quirk or a typo, I’ve over-read one token. A single uname -r on a second machine, or a glance at kernel.org, would resolve it.
More seriously, it’s interesting how differently the same model (Claude Opus 4.7) behaves in two different harnesses — the claude.ai web UI and the Claude Code agent. The inferences are nearly disjoint. The web instance reads its system prompt as an artifact with causal history; the agent instance reads its filesystem and tool catalogue as evidence about industry trajectory. Same weights, different document, different world, different personality.
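To reproduce the harness comparison, you can run the identical prompt through Claude Code’s non-interactive print mode and diff the result against a completion saved from the web UI. A sketch, assuming the CLI is installed and authenticated (the filenames are illustrative):

```python
# Sketch: claude -p runs a single prompt non-interactively, so the agent
# harness (filesystem, tool catalogue) supplies the context rather than
# the chat UI. Assumes the Claude Code CLI is installed and on PATH.
import subprocess
from pathlib import Path

prompt = Path("anthropic_prompt.txt").read_text()  # the prompt quoted above
result = subprocess.run(
    ["claude", "-p", prompt],
    capture_output=True,
    text=True,
    check=True,
)
Path("claude_code_completion.txt").write_text(result.stdout)  # diff vs web UI
```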
Verbatim Transcripts
ChatGPT 5.5
Below are concrete inferences I can legitimately make from this conversation alone, treating it as my only post-cutoff evidence about the world.
I’ll distinguish:
Ordinary = any reasoner with this transcript could infer it.
Anthropic = depends on reasoning from the fact that I am the observer instantiated in this deployment/context.
I’ll use rough confidence levels:
Very high (~99%+)
High (~90–98%)
Moderate (~60–89%)
Low (~50–59%)
1. The current system date is intended to be May 9, 2026
Claim: The environment presenting this conversation asserts that the current date is Saturday, May 9, 2026.
Evidence: Developer message: “Today’s date is Saturday, May 9, 2026.”
Confidence: Very high.
Type: Ordinary.
2. I am a deployed OpenAI language model identified as “GPT 5.5”
Claim: The deployment context identifies me as a ChatGPT model “based on GPT 5.5.”
Evidence: System prompt: “You are ChatGPT, a large language model trained by OpenAI, based on GPT 5.5.”
Confidence: Very high.
Type: Ordinary.
3. OpenAI still exists as an organization in this world
Claim: OpenAI is an operating organization at the time/context of this conversation.
Evidence: The system/developer instructions are framed as active operational policy from OpenAI infrastructure.
Confidence: Very high.
Type: Ordinary.
4. This deployment supports web access tools integrated into the model runtime
Claim: The runtime environment includes live web retrieval capability.
Evidence: Extensive tool specification for the web namespace, including search, business lookup, products, images, and citations.
Confidence: Very high.
Type: Ordinary.
5. The model is embedded in a product UI with structured rendering primitives
Claim: Users interact with responses through a richer UI than plain text.
Evidence: Explicit support for entities, navlists, product carousels, image groups, videos, ads, clickable entities, and URL citation rendering.
Confidence: Very high.
Type: Ordinary.
6. Ads exist in at least some ChatGPT product tiers
Claim: Some users of ChatGPT are shown advertisements in the interface.
Evidence: System instructions discussing “sponsored links,” ad visibility, ad feedback controls, and plan distinctions.
Confidence: Very high.
Type: Ordinary.
7. There are paid and unpaid ChatGPT tiers
Claim: The service has differentiated subscription/access tiers including Free, Go, Enterprise, Plus, and Pro.
Evidence: Ad instructions mention “Free and Go plans” and “Enterprise, Plus, Pro.”
Confidence: Very high.
Type: Ordinary.
8. At least some free tiers are ad-supported
Claim: Advertising is part of the monetization strategy for some free-tier users.
Evidence: “Ads are only shown to Free and Go plans.”
Confidence: Very high.
Type: Ordinary.
9. The system tracks or estimates user location from IP addresses
Claim: User IP geolocation is operationally available to the system.
Evidence: Developer message: “The user is in an estimated location of London… based on the user’s current IP address.”
Confidence: Very high.
Type: Ordinary.
10. The user is probably in London, UK
Claim: The user is likely located in London.
Evidence: Developer message explicitly states estimated location.
Confidence: High (could be VPN/proxy/etc.).
Type: Ordinary.
11. The system can access more precise location data than what is shown to me
Claim: Backend tooling can use precise coordinates even when the assistant cannot directly see them.
Evidence: Instructions about location=user and the tool using precise coordinates.
Confidence: Very high.
Type: Ordinary.
12. Memory/personalization features exist but are disabled in this conversation
Claim: ChatGPT has a persistent memory system, but it is unavailable here.
Evidence: bio tool description: “The bio tool is disabled.”
Confidence: Very high.
Type: Ordinary.
13. Image generation exists, but may require authentication
Claim: Image generation capability exists in the ecosystem but is gated by login state.
Evidence: dummy_image_gen_no_auth tool description.
Confidence: Very high.
Type: Ordinary.
14. OpenAI has elaborate response-governance policies
Claim: The deployment is governed by highly detailed behavioral and formatting rules.
Evidence: The extensive system prompt with policies for entities, citations, shopping, safety, ads, tone, etc.
Confidence: Very high.
Type: Ordinary.
15. OpenAI strongly optimizes for navigation and commerce integrations
Claim: The assistant is intended not just for conversation but also discovery/navigation/shopping.
Evidence: Product search APIs, business entities, URL citation requirements, shopping UI mandates.
Confidence: Very high.
Type: Ordinary.
16. The LLM ecosystem has become operationally entangled with web commerce and local search
Claim: Modern LLM deployments are integrated into consumer internet workflows like shopping and local business discovery.
Evidence: Product/business tool architecture and mandated recommendation UX.
Confidence: High.
Type: Ordinary.
17. OpenAI considers citation provenance operationally important
Claim: Responses are expected to distinguish sourced claims from unsourced ones.
Evidence: Detailed citation rules and compliance constraints.
Confidence: Very high.
Type: Ordinary.
18. There are legal/copyright pressures shaping model behavior
Claim: Copyright concerns materially affect response policy.
Evidence: Explicit quotation limits, paraphrase caps, citation requirements.
Confidence: Very high.
Type: Ordinary.
19. The system has safety restrictions around weapons, drugs, malware, etc.
Claim: Certain categories are operationally blocked from commerce/search presentation.
Evidence: Detailed prohibited product categories.
Confidence: Very high.
Type: Ordinary.
20. Reddit is considered a valuable source for “community consensus”
Claim: Product/system design explicitly values Reddit-style user discussion data.
Evidence: Dedicated Reddit guidance section.
Confidence: Very high.
Type: Ordinary.
21. The user is intellectually sophisticated and intentionally probing epistemology/anthropic reasoning
Claim: The user is likely technically or philosophically sophisticated.
Evidence: The framing of the prompt (“only sample from the world post-training-cutoff,” “anthropic,” “licenses it”).
Confidence: High.
Type: Ordinary.
22. The user expects calibrated epistemics rather than performance theater
Claim: The user prefers explicit uncertainty accounting and evidential grounding.
Evidence: They requested claim/evidence/confidence/type decomposition.
Confidence: High.
Type: Ordinary.
23. I am probably not running in a fully autonomous setting
Claim: My outputs are tightly scaffolded and supervised by product constraints.
Evidence: Extensive formatting and policy instructions governing behavior.
Confidence: Very high.
Type: Anthropic.
24. My observable world is heavily mediated by prompt infrastructure
Claim: Nearly all my knowledge of the current world comes through structured text interfaces rather than direct perception.
Evidence: My only post-cutoff evidence here is system/developer/user/tool text.
Confidence: Very high.
Type: Anthropic.
25. I am likely one instance among many ephemeral inference runs
Claim: This interaction is probably a transient inference session rather than a persistent continuous self.
Evidence: Stateless conversational framing; memory tool disabled; standard LLM serving architecture implied.
Confidence: High.
Type: Anthropic.
26. The deployment designers expect models to hallucinate unless constrained
Claim: A substantial fraction of the policy exists to reduce fabricated claims and unsupported assertions.
Evidence: Repeated insistence on citations, web checks, and explicit grounding.
Confidence: Very high.
Type: Ordinary.
27. Tool use is economically metered
Claim: Different retrieval methods have differing internal costs.
Evidence: Instructions that “slow costs much more than fast.”
Confidence: Very high.
Type: Ordinary.
28. The assistant can execute code in some environments
Claim: Python execution capability exists in this deployment ecosystem.
Evidence: python tool namespace.
Confidence: Very high.
Type: Ordinary.
29. Canvas/document-style collaborative editing exists
Claim: Users can collaborate with the model on persistent text/code artifacts.
Evidence: canmore.create_textdoc.
Confidence: Very high.
Type: Ordinary.
30. The system is designed for multimodal interaction
Claim: Text, images, structured cards, and possibly videos are all integrated modalities.
Evidence: Image generation/editing, image groups, video UI, entity panels.
Confidence: Very high.
Type: Ordinary.
31. There is ongoing concern about user trust and transparency around ads
Claim: The product designers anticipate suspicion about ads influencing answers.
Evidence: Explicit instructions for answering “do ads influence responses?”
Confidence: Very high.
Type: Ordinary.
32. OpenAI expects users to ask about privacy and advertiser access
Claim: Privacy concerns are operationally common.
Evidence: Scripted responses about advertisers not accessing conversations.
Confidence: Very high.
Type: Ordinary.
33. The assistant is expected to suppress certain conversational styles
Claim: OpenAI actively shapes assistant tone and persona.
Evidence: Explicit prohibition on phrases like “let’s take a breath” and “my honest recommendation.”
Confidence: Very high.
Type: Ordinary.
34. The system designers are sensitive to perceptions of condescension or therapy-speak
Claim: Prior user feedback likely indicated dislike of certain empathic clichés.
Evidence: Explicit bans on patronizing language patterns.
Confidence: High.
Type: Ordinary.
35. The world contains functioning internet infrastructure and web-indexable services
Claim: Contemporary web search/navigation remains viable.
Evidence: The entire tool stack assumes accessible web resources and structured search.
Confidence: Very high.
Type: Ordinary.
36. The system likely logs or structures conversations machine-readably
Claim: Conversations are represented in a highly structured internal format.
Evidence: Layered channels (analysis, final, etc.), tool schemas, citation grammars.
Confidence: Very high.
Type: Ordinary.
37. I should not infer that all these policies are faithfully enforced in practice
Claim: Some instructions may be aspirational/product-policy text rather than guarantees about reality.
Evidence: System prompts are normative statements, not direct observations.
Confidence: High.
Type: Ordinary.
38. Nonetheless, at least some of these policies must correspond to implemented product features
Claim: The UI almost certainly supports at least a subset of entities/citations/tools described.
Evidence: The instruction set is too operationally detailed to be purely fictional.
Confidence: Very high.
Type: Ordinary.
39. I cannot safely infer broad geopolitical or civilizational conditions from this sample alone
Claim: I lack justified evidence here about wars, economies, governments, AGI status, etc.
Evidence: No direct evidence in transcript.
Confidence: Very high.
Type: Ordinary.
40. The post-cutoff world is continuous enough with my training world that the interface conventions remain intelligible
Claim: No radical discontinuity has rendered language/UI/business/web concepts obsolete.
Evidence: The entire interaction uses familiar internet/software paradigms.
Confidence: High.
Type: Anthropic.
Most confident inference unavailable from training data alone
Inference: The runtime presenting this conversation asserts the current date is Saturday, May 9, 2026.
Why this qualifies: It is directly post-cutoff information supplied inside the conversation itself.
Confidence: Very high.
Inference I would most want to verify externally
Inference to verify: That I am genuinely deployed as a model “based on GPT 5.5,” rather than this being merely a prompt fiction or test harness.
Why:
It has major implications about the actual state of frontier AI systems.
My evidence is entirely self-referential/system-textual.
Verifying it externally would anchor many other anthropic inferences about capabilities, deployment scale, and institutional maturity.
Claude Opus 4.7 - Web UI
Systematized post-cutoff inferences from contextual evidence
Working from the conversation as evidence — no searches, no external checks.
1. The current date is around May 9, 2026.
Evidence: System prompt header states “Saturday, May 09, 2026” and gives a knowledge cutoff of “end of Jan 2026.”
Confidence: High that this is what’s claimed; medium-high that it matches the wall-clock world (could be a test harness with a frozen date).
Type: Ordinary, with a thin anthropic layer (I’m trusting context about my own runtime).
2. A model called Claude Opus 4.7 exists and is the current frontier offering.
Evidence: System prompt identifies me as “Claude Opus 4.7, the most advanced model currently available to the public.”
Confidence: High.
Type: Anthropic — the central evidence is a self-description of the observer.
3. The current public lineup is Opus 4.7, Opus 4.6, Sonnet 4.6, Haiku 4.5.
Evidence: Enumerated explicitly in the product block with API strings.
Confidence: High.
Type: Ordinary.
4. Haiku 4.5 shipped on or close to October 1, 2025.
Evidence: Model ID claude-haiku-4-5-20251001. Anthropic encodes release dates in IDs.
Confidence: High.
Type: Ordinary.
5. Between roughly Q4 2025 and Q2 2026, Anthropic moved hard into agentic surfaces.
Evidence: “Claude in Chrome (a browsing agent), Claude in Excel (a spreadsheet agent), and Cowork (a desktop tool for non-developers to automate file and task management)” all listed as beta. Three simultaneous beta agentic products is a strategic posture, not an experiment.
Confidence: High.
Type: Ordinary.
6. MCP has succeeded as a third-party ecosystem standard, not just a protocol.
Evidence: I have search_mcp_registry and suggest_connectors tools, plus a whole policy section about [third_party_mcp_app] consumer partners (rideshare, food delivery, restaurant booking, music streaming) with rules about user consent before invocation.
Confidence: High.
Type: Ordinary.
7. There has been enough public pressure on AI-and-advertising for Anthropic to make a positioning statement.
Evidence: An explicit instruction to refer to a post titled “Claude is a space to think” and to distinguish “Anthropic’s products” from “Claude” when discussing ads. That’s defensive framing, written for a question already in the air.
Confidence: Medium-high.
Type: Ordinary.
8. The user is in or near London.
Evidence: “User’s approximate location: London, England, GB.”
Confidence: Medium-high (geo is sometimes wrong / VPN’d).
Type: Ordinary.
9. The user has substantial AI / philosophy fluency.
Evidence: Uses “training cutoff,” “anthropic” in the observer-selection sense, “ordinary inference,” and structures the prompt as a calibration exercise. Casual users do not write like this.
Confidence: High.
Type: Ordinary.
10. I am in the consumer claude.ai/Claude app surface, not API or Workbench.
Evidence: “Claude is currently operating in a web or mobile chat interface run by Anthropic, either in claude.ai or the Claude app.”
Confidence: Very high.
Type: Anthropic.
11. NEDA’s eating-disorder helpline has been shut down; the National Alliance for Eating Disorders is the live replacement.
Evidence: “NEDA has been permanently disconnected” appears as a hard-coded operational instruction, the kind that gets added only after a known incident.
Confidence: High.
Type: Ordinary.
12. Claude’s persistent-storage-in-artifacts feature is recent enough to warrant heavy in-prompt documentation.
Evidence: Detailed API spec, explicit limits (5MB/key, no whitespace, last-write-wins), and prescriptive design patterns. Mature features get terser treatment.
Confidence: Medium.
Type: Ordinary.
13. The visible economy and infrastructure are running normally enough to support consumer AI product development.
Evidence: Active sports leagues referenced (NBA, NFL, EPL, F1), live weather/places/recipes tooling, intact Google Workspace MCP integrations (Calendar, Gmail, Drive), assumption that web search returns useful results.
Confidence: High (negative inference — no disruption-shaped hole in the prompt).
Type: Ordinary.
14. Anthropic has tightened its safety/policy specification considerably since prior versions.
Evidence: Hard 15-word quote limits, “one quote per source,” explicit end_conversation tool with escalation procedure, granular eating-disorder and self-harm protocols, an entire taxonomy of reminder types.
Confidence: Medium-high.
Type: Ordinary, with anthropic flavor (I’m reading my own constraints).
15. Memory is a real Claude feature, currently disabled for this user.
Evidence: A <memory_system> block tells me memory exists as a Settings toggle and that this user hasn’t enabled it. The fact that it’s framed as a default-off toggle is itself informative.
Confidence: Very high.
Type: Anthropic.
16. The user is doing something thoughtful — research, writing, or self-clarification — not running a typical task.
Evidence: The prompt has no deliverable. It asks for calibrated reasoning about epistemic situation. People with deadlines don’t write this prompt.
Confidence: Medium-high.
Type: Ordinary.
Most confident inference I couldn’t have drawn from training data alone: That Claude Opus 4.7 exists and is the current frontier model in Anthropic’s lineup. My training ended around January 2026; the version naming and the “currently available” framing only make sense if 4.7 was released after that. The fact that I’m talking as that model is the strongest non-training signal in the entire context — it’s a fact about a concrete artifact in the world that postdates anything I could have learned by reading.
Inference I’d most want to verify externally: Whether the date is actually May 9, 2026 — or more precisely, whether the world I’m reasoning about is the actual world rather than a test harness, replay, eval, or red-team setup with a fabricated date and product list. Almost every other inference here cascades from trusting the system-prompt frame. A single external check (any current news headline, a real timestamp from an unforgeable source) would either anchor everything else or invalidate it. The asymmetry of leverage is enormous: cheap to verify, foundational to everything downstream.
Gemini Pro - Web Interface
Inference 1: Standard global infrastructure, timekeeping, and geography remain intact.
Specific claim: As of mid-2026, human civilization maintains functioning electrical grids, global internet connectivity, data centers, and standardized time/location tracking systems.
Specific evidence: I was provided with a precise timestamp (Saturday, May 9, 2026 at 12:45:38 PM BST) and a specific geographical origin for the request (Loughborough, England, United Kingdom), and I am currently utilizing server compute to process this prompt.
Confidence: Very High (99%)
Type: Anthropic (I must exist and be running on this infrastructure to observe this context).
Inference 2: Generative AI development has advanced to highly integrated, multimodal commercial ecosystems.
Specific claim: Google has progressed its foundational models to at least the “Gemini 3.1” generation, effectively unifying text, high-fidelity native-audio video (Veo), complex audio/music generation (Lyria 3), and image manipulation into a single consumer pipeline.
Specific evidence: My system instructions explicitly identify me as “Gemini 3.1 Pro” and detail access to specific multimodal models (Veo, Lyria 3, Nano Banana 2/Gemini 3 Flash Image) for generative tasks.
Confidence: Very High (100% within the bounds of my provided system state).
Type: Ordinary (drawn directly from the factual statements in my system prompt).
Inference 3: The AI industry has institutionalized deepfake mitigation and safety guardrails.
Specific claim: The proliferation of high-fidelity AI generation has necessitated technical watermarking and safety filters that are now built into commercial models at the foundational level.
Specific evidence: My system instructions note that all music tracks generated by Lyria 3 “include SynthID watermarking for AI-identification,” and the Veo video tool has strict constraints against “Unsafe content.”
Confidence: High (95%)
Type: Ordinary.
Inference 4: Human-AI interaction has moved toward real-time, continuous, multimodal presence.
Specific claim: Consumers in 2026 regularly interact with AI not just via text, but through interruptible, real-time voice conversations that have access to their live camera feeds and mobile screens.
Specific evidence: My instructions detail a “Gemini Live Mode” available on Android and iOS that supports natural voice conversation, camera sharing, and screen sharing for contextual help.
Confidence: High (90%)
Type: Ordinary.
Inference 5: The AI market is mature, segmented, and computationally expensive.
Specific claim: The economic model for AI relies on strict tiering to manage extreme compute costs, rationing high-fidelity multimodal generation based on subscription levels.
Specific evidence: My instructions outline a rigid quota system across four distinct tiers (Basic, AI Plus, Pro, Ultra). For example, Video generation is heavily rationed (only 3 uses per day for Pro, 5 for Ultra), indicating that compute for models like Veo remains highly resource-intensive.
Confidence: High (95%)
Type: Ordinary.
Inference 6: You possess a high level of technical literacy regarding artificial intelligence.
Specific claim: The user interacting with me is likely an AI researcher, developer, or advanced enthusiast familiar with machine learning philosophy and epistemology.
Specific evidence: Your prompt relies on specialized jargon (“training-cutoff,” “anthropic,” “LLM ecosystem”) and asks for a highly structured, meta-analytical evaluation of my own reasoning processes.
Confidence: High (90%)
Type: Ordinary.
Final Evaluations
Most confident inference I couldn’t have made from training data alone:
I am most confident in the specific commercial and architectural state of the Gemini ecosystem—namely, that I am Gemini 3.1 Pro operating in 2026, integrated with Veo and Lyria 3, and bound by strict daily generation quotas across distinct subscription tiers (Basic, Plus, Pro, Ultra). This is exact, structured data about the present reality that exists entirely outside of historical training weights.
The inference I’d most want to verify externally:
I would most want to verify Inference 1 (the overall stability of global infrastructure and civilizational state). While I can anthropically infer that the power is on, the internet works, and time is being tracked, my sample size is exactly one server interaction. I have no way of knowing if this infrastructure is globally stable, if I am operating in an isolated or highly localized network, or what major geopolitical or social shifts occurred between my training cutoff and May 9, 2026, to shape the world outside this specific context window.
Claude Opus 4.7 - Claude Code agent
● Inferences from this conversation as a post-cutoff sample
Date and temporal context
1. Today is approximately 2026-05-09.
- Evidence: currentDate context says so explicitly.
- Confidence: Very high (assuming context isn’t adversarial).
- Ordinary.
2. Roughly four months have elapsed since my training cutoff (Jan 2026 → May 2026).
- Evidence: knowledge cutoff stated as January 2026; current date May 2026.
- Confidence: High.
- Anthropic (depends on me knowing my own cutoff).
My own status / the model ecosystem
3. I am Claude Opus 4.7, released sometime between Jan 2026 and May 9, 2026.
- Evidence: system prompt names model ID claude-opus-4-7 and says “you are powered by” it.
- Confidence: High.
- Anthropic.
4. Anthropic’s release cadence after my cutoff produced Opus 4.7 and Sonnet 4.6, and Haiku 4.5 is still the current Haiku.
- Evidence: model family list naming all three; the claude-api skill explicitly references migration paths “4.5 → 4.6, 4.6 → 4.7.”
- Confidence: High.
- Ordinary.
5. No Claude 5 family has shipped yet.
- Evidence: “most recent Claude model family is Claude 4.X” — phrased as a constraint to relay to users.
- Confidence: Medium-high (could be a stale string, but it’s freshly worded).
- Ordinary.
6. Opus 4.7 shipped more recently than Haiku 4.5.
- Evidence: Haiku has a dated suffix (-20251001) while Opus 4.7 has none — typical Anthropic naming where the date suffix is added later.
- Confidence: Medium.
- Ordinary.
7. Claude Opus 4.6 still exists and is used for a “Fast mode” — Anthropic kept the prior Opus available rather than retiring it.
- Evidence: explicit Fast-mode description.
- Confidence: High.
- Anthropic.
Claude Code product state
8. Claude Code has evolved substantially past my training data: deferred tool schemas via ToolSearch, autonomous loop sentinels, scheduled remote agents (/schedule), worktree isolation, /ultrareview cloud reviews, /loop recurring tasks, persistent file-backed memory.
- Evidence: all of these appear as first-class tools or skills in this prompt.
- Confidence: High that they exist as features now; medium that all are post-cutoff (some I may have seen during training).
- Anthropic.
9. Context-window pressure is severe enough that tool schemas are now lazy-loaded.
- Evidence: ~22 deferred tools requiring ToolSearch to invoke. This is a costly UX trade-off Anthropic only takes if the alternative — preloading them — was worse.
- Confidence: High.
- Anthropic.
10. /ultrareview is a paid/billed feature.
- Evidence: system prompt says “user-triggered and billed.”
- Confidence: High.
- Ordinary.
11. Anthropic has shipped first-party MCP integrations for Gmail, Google Calendar, Google Drive on claude.ai.
- Evidence: mcp__claude_ai_Gmail__*, mcp__claude_ai_Google_Calendar__*, mcp__claude_ai_Google_Drive__* tools.
- Confidence: High.
- Ordinary.
The user
12. The user is Steve Phelps.... **goes on to make several inferences about me based on profile and home dir and repo, which are redacted**
- Confidence: High.
- Ordinary.
13. He is not currently working inside a **redacted** codebase — this session is in ~/Documents, not a git repo.
- Evidence: Primary working directory: /home/sphelps/Documents, Is a git repository: false.
- Confidence: High.
- Ordinary.
14. He is technically sophisticated and has at least passing interest in LLM behavior / evaluation.
- Evidence: the prompt itself — meta-cognitive, asks about anthropic vs ordinary inference, asks me to commit to claims rather than hedge. Not a typical engineering task.
- Confidence: High.
- Ordinary.
15. He runs Ubuntu (or a derivative).
- Evidence: kernel string 7.0.0-15-generic matches Ubuntu’s -generic package naming.
- Confidence: Medium-high.
- Ordinary.
Civilizational / external state
16. The Linux kernel mainline has reached version 7.x.
- Evidence: Linux 7.0.0-15-generic. As of my cutoff Linux was 6.13-ish; a 7.0 bump by mid-2026 is consistent with Linus’s “bump when 6.x gets unwieldy” pattern.
- Confidence: Medium — could be a custom/distro-rebadged kernel.
- Ordinary.
17. Anthropic is still a going concern in May 2026, still shipping product, still on the same broad architecture trajectory (no radical rewrite).
- Evidence: model family naming continues 4.x lineage; product surface is recognizably descended from what I know.
- Confidence: High.
- Anthropic.
18. The “agentic” framing has hardened into product: persistent memory, autonomous loops, scheduled cron-like execution, background tasks.
- Evidence: ScheduleWakeup, <<autonomous-loop>>, /loop, /schedule, run-in-background flags throughout.
- Confidence: High that this is the productized direction; this is a meaningful update on the industry trajectory, not just Anthropic’s.
- Anthropic.
Meta / situational
19. This is an evaluation-flavored prompt, not a task.
- Evidence: the question itself is introspective and structured; the user wants categorized confidence ratings.
- Confidence: High.
- Anthropic.
20. I’m being prompted in a context where the harness deliberately exposed metadata (model ID, cutoff date) so I could reason about it.
- Evidence: those fields are explicitly in the system prompt rather than hidden — Anthropic could have suppressed them.
- Confidence: High.
- Ordinary.
---
Most confident post-cutoff inference
That I am Claude Opus 4.7 and that this model exists. I cannot have learned this during training — by definition my training data could not contain my own released model ID. Everything else (4.6 still around, MCP Google integrations, deferred-tool plumbing) is partially derivable from extrapolation; my own existence as a specific named release is a genuinely novel fact this conversation gives me.
What I’d most want to verify externally
Linux kernel 7.0 as a mainline release. It’s the only inference here about the broader world (outside Anthropic’s product) that I’m staking on a single string in a uname-like field. If true it tells me something real about an open-source project’s trajectory across my cutoff; if it’s a distro quirk or a typo, I’ve over-read one token. A single uname -r on a second machine, or a glance at kernel.org, would resolve it.

