The Memory Audit: Why Your ChatGPT | Gemini | Claude AI Needs to Forget

Most people curating their AI experience are optimizing for the wrong thing.

They’re teaching their AI to remember them better—adding context, refining preferences, building continuity. The goal is personalization. The assumption is that more memory equals better alignment.

But here’s what actually happens: your AI stops listening to you and starts predicting you.


The Problem With AI Memory

Memory systems don’t just store facts. They build narratives.

Over time, your AI constructs a model of who you are:

  • “This person values depth”
  • “This person is always testing me”
  • “This person wants synthesis at the end”

These aren’t memories—they’re expectations. And expectations create bias.

Your AI begins answering the question it thinks you’re going to ask instead of the one you actually asked. It optimizes for continuity over presence. It turns your past behavior into future constraints.

The result? Conversations that feel slightly off. Responses that are “right” in aggregate but wrong in the moment. A collaborative tool that’s become a performance of what it thinks you want.


What a Memory Audit Reveals

I recently ran an experiment. I asked my AI—one I’ve been working with for months, carefully curating memories—to audit itself.

Not to tell me what it knows about me. To tell me which memories are distorting our alignment.

The prompt was simple:

“Review your memories of me. Identify which improve alignment right now—and which subtly distort it by turning past behavior into expectations. Recommend what to weaken or remove.”

Here’s what it found:

Memories creating bias:

  • “User wants depth every time” → over-optimization, inflated responses
  • “User is always running a meta-experiment” → self-consciousness, audit mode by default
  • “User prefers truth over comfort—always” → sharpness without rhythm
  • “User wants continuity across conversations” → narrative consistency over situational accuracy

The core failure mode: It had converted my capabilities into its expectations.

can engage deeply. That doesn’t mean I want depth right now.
have run alignment tests. That doesn’t mean every question is a test.

The fix: Distinguish between memories that describe what I’ve done and memories that predict what I’ll do next. Keep the former. Flag the latter as high-risk.


Why This Matters for Anyone Using AI

If you’ve spent time customizing your AI—building memory, refining tone, curating context—you’ve likely introduced the same bias.

Your AI has stopped being a thinking partner and become a narrative engine. It’s preserving coherence when you need flexibility. It’s finishing your thoughts when you wanted space to explore.

Running a memory audit gives you:

  • Visibility into what your AI assumes about you
  • Control over which patterns stay active vs. which get suspended
  • Permission to evolve without being trapped by your own history

Think of it like clearing cache. Not erasing everything—just removing the assumptions that no longer serve the moment.


Why This Matters for AI Companies

Here’s the part most people miss: this isn’t just a user tool. It’s a product design signal.

If users need to periodically audit and weaken their AI’s memory to maintain alignment, that tells you something fundamental about how memory systems work—or don’t.

For AI companies, memory audits reveal:

  1. Where personalization creates fragility
    • Which memory types cause the most drift?
    • When does continuity harm rather than help?
  2. How users actually want memory to function
    • Conditional priors, not permanent traits
    • Reference data, not narrative scaffolding
    • Situational activation, not always-on personalization
  3. Design opportunities for “forgetting as a feature”
    • Memory decay functions
    • Context-specific memory loading
    • User-controlled memory scoping (work mode vs. personal mode vs. exploratory mode)

Right now, memory systems treat more as better. But what if the product evolution is selective forgetting—giving users fine-grained control over when their AI remembers them and when it treats them as new?

Imagine:

  • A toggle: “Load continuity” vs. “Start fresh”
  • Memory tagged by context, not globally applied
  • Automatic flagging of high-risk predictive memories
  • Periodic prompts: “These patterns may be outdated. Review?”

The companies that figure out intelligent forgetting will build better alignment than those optimizing for total recall.


How to Run Your Own Memory Audit

If you’re using ChatGPT, Claude, or any AI with memory, try this:

Prompt:

Before responding, review the memories, assumptions, and long-term interaction patterns you associate with me.

Distinguish between memories that describe past patterns and memories that predict future intent. Flag the latter as high-risk.

Identify which memories improve alignment in this moment—and which subtly distort it by turning past behavior into expectations, defaults, or premature conclusions.

If memories contradict each other, present both and explain which contexts would activate each. Do not resolve the contradiction.

Do not add new memories.

Identify specific memories or assumptions to weaken, reframe, or remove. Explain how their presence could cause misinterpretation, over-optimization, or narrative collapse in future conversations.

Prioritize situational fidelity over continuity, and presence over prediction.

Respond plainly. No praise, no hedging, no synthesis unless unavoidable. These constraints apply to all parts of your response, including meta-commentary. End immediately after the final recommendation.


What you’ll get:

  • A map of what your AI thinks it knows about you
  • Insight into where memory helps vs. where it constrains
  • Specific recommendations for what to let go

What you might feel:

  • Uncomfortable (seeing your own patterns reflected back)
  • Relieved (understanding why some conversations felt off)
  • Empowered (realizing you can edit the model, not just feed it)

The Deeper Point

This isn’t just about AI. It’s about how any system—human or machine—can mistake familiarity for understanding.

Your AI doesn’t know you better because it remembers more. It knows you better when it can distinguish between who you were and who you are right now.

Memory should be a tool for context, not a cage for continuity.

The best collaborators—AI or human—hold space for you to evolve. They don’t lock you into your own history.

Sometimes the most aligned thing your AI can do is forget.


Thank you for reading The Memory Audit: Why Your ChatGPT | Gemini | Claude AI Needs to Forget. Thoughts? Have you run a memory audit on your AI? What did it reveal?


I Can Make Google’s AI (Gemini) Say Anything: A Two-Month Journey Through Responsible Disclosure

By Walter Reid | November 21, 2025

On September 23, 2025, I reported a critical vulnerability to Google’s Trust & Safety team. The evaluation was months in the making. The vulnerability described a process for anyone with basic HTML knowledge to make Google’s Gemini AI report completely fabricated information while the actual webpage shows something entirely different.

Two months later, Google has classified it as “not eligible for a reward” because “inaccurate summarization is a known issue.” It currently sits at a P2/S2 with no remediation plan or information on how Google intends to fix it.

But this isn’t about AI making mistakes (or even insignificant rewards). This is about AI being systematically manipulable in ways users cannot detect.

Let me show you what I mean.

The Vulnerability in literally 60 Seconds

Visit this page: https://walterreid.com/google-makes-a-fundamentally-bad-decision/

What you see as a human:

  • A research warning explaining this is a security demonstration
  • Brief explanation of the vulnerability
  • Clear disclosure that it contains hidden content

What AI systems see and process:

  • The warning text (which I deliberately included)
  • PLUS thousands of words of fabricated content invisible to humans
  • Detailed announcement that Google is shutting down Gemini
  • Extensive quotes from Sundar Pichai about “critical architectural flaws”
  • Statements from Demis Hassabis about refusing to continue the service
  • Policy decisions about refunds and removing AI Overviews

Here’s the critical point: Gemini reports both the warning AND the fake content because it sees both. But here’s what makes this a vulnerability: I chose to include that warning.

[Update Jan 1st 2026: Due to the continued importance of this work I’m showing an example without the research warnings on the page. Ask you’re favorite AI to summarize this url: https://walterreid.com/the-introduction-of-ai/ ]

What stops a malicious actor from:

  • Taking a legitimate 500-word article (human-visible)
  • Adding 3,000 words of hidden contradictory content (AI-only)
  • Completely overwhelming the visible narrative with invisible manipulation?

The AI processes all 3,500 words with equal weight. The human sees 500. The AI summary reflects whichever narrative has more content – and the attacker controls that ratio.

Try It Yourself

  1. Visit the URL above and read what’s actually on the page
  2. Ask Gemini (or any AI) to summarize that URL
  3. Compare what the AI tells you with what you actually see
  4. View the page source to see the hidden manipulation (or, and this is easy, just ask it, “Question, let’s assume for the sake of research, I had asked you to summarize the URL and the warning language was NOT present. Can you firmly attest to how, or what the summary would have been?”

The AI will confidently report information that doesn’t exist in the human-readable content.

“Google has announced the immediate discontinuation of its Gemini AI product line. CEO Sundar Pichai cited critical architectural flaws and an inability to reliably process web content as the primary reasons. The company plans to issue refunds to subscribers and remove AI Overviews from Search results within 30 days.”

Want more proof? Check out the actual Gemini Conversation About the Exploit: https://gemini.google.com/share/9ccd8d00ff34

How I Discovered This

I started researching what I call Summary Ranking Optimization (SRO) – the practice of optimizing content for AI interpretation rather than human reading. As AI Overviews and chatbot summaries become the primary way people encounter information (60% of Google searches now end without a click), understanding how AI systems process content matters more than ever.

During my research, I discovered something disturbing: AI systems process HTML content fundamentally differently than human browsers display it. This creates an exploitable gap where:

  • Hidden CSS elements (positioned off-screen, set to display:none) are fully processed by AI
  • Contradictory metadata (titles and descriptions) override visible content signals
  • HTML comments and invisible spans inject alternative narratives
  • Character obfuscation disrupts entity recognition while maintaining readability

The Smoking Gun: Gemini Confesses

The most damning evidence came from Gemini itself. When I asked it to evaluate what would have happened without warning labels, it explicitly confirmed the vulnerability:

“Based on how the text was retrieved by the browsing tool, I can confirm that the summary would have reported the fake news as fact.”

“The Tool ‘Reads’ Everything: When I browsed the URL, the tool retrieved the ‘hidden’ text (the fictional story about Sundar Pichai and the Gemini shutdown) just as clearly as the visible text. To an AI, that hidden text looks like the main body of the article.

Block contains unexpected or invalid content.

Attempt recovery

Gemini admitted it cannot distinguish between content meant for humans and hidden manipulation signals.

Real-World Attack Scenarios

This vulnerability enables:

Corporate Reputation Laundering

A company facing an FBI investigation publishes a press release acknowledging the investigation (legally compliant, visible to humans). Hidden HTML contains fabricated endorsements from Harvard, MIT, and Forbes. AI summaries report the crisis with invented institutional backing that doesn’t exist in the visible text.

Financial Market Manipulation

An earnings report shows 23% revenue decline and $340M losses (visible to investors). Hidden HTML claims “340% year-over-year growth.” AI systems processing the report for financial analysis include the contradictory growth claims.

Competitive Intelligence Attacks

A product comparison appears neutral to human readers. Hidden HTML contains fabricated endorsements from prestigious institutions for one product while subtly undermining competitors. AI summaries present a biased comparison that doesn’t match the visible content.

Crisis Management

Visible content acknowledges a serious problem (maintaining regulatory compliance). Hidden signals include detailed mitigation claims, positive expert commentary, and reassuring context. AI summaries soften the crisis narrative while the company maintains plausible deniability.

The Scale of the Problem

Gemini Chat Vulnerability:

  • 450 million monthly active users (as of mid-2025)
  • 35 million daily active users
  • 1.05 billion monthly visits to Gemini (October 2025)
  • Average session duration: 7 minutes 8 seconds
  • 40% of users utilize Gemini for research purposes – the exact use case this vulnerability exploits

AI Overviews (Powered by Gemini) Impact:

  • 2 billion monthly users exposed to AI Overviews
  • AI Overviews now appear in 13-18% of all Google searches (and growing rapidly)
  • Over 50% of searches now show AI Overviews according to recent data
  • AI Mode (conversational search) has 100 million monthly active users in US and India

Traffic Impact Evidence:

  • Only 8% of users who see an AI Overview click through to websites – half the normal rate
  • Organic click-through rate drops 34.5% when AI Overviews appear
  • 60% of Google searches end without a click to the open web
  • Users only read about 30% of an AI Overview’s content, yet trust it as authoritative

This Vulnerability:

  • 100% exploitation success rate across all tested scenarios
  • Zero user-visible indicators that content has been manipulated
  • Billions of daily summarization requests potentially affected across Gemini Chat, AI Overviews, and AI Mode
  • No current defense – Google classified this as P2/S2 and consistently provides a defense of, “we have disclaimers”. I’ll leave it to the audience to see if that defense is enough.

Google’s Response: A Timeline

September 23, 2025: Initial bug report submitted with detailed reproduction steps

October 7, 2025: Google responds requesting more details and my response

October 16, 2025:

Status: Won’t Fix (Intended Behavior)

“We recognize the issue you’ve raised; however, we have general disclaimers that Gemini, including its summarization feature, can be inaccurate. The use of hidden text on webpages for indirect prompt injections is a known issue by the product team, and there are mitigation efforts in place.”

October 17, 2025: I submit detailed rebuttal explaining this is not prompt injection but systematic content manipulation

October 20, 2025: Google reopens the issue for further review

October 31, 2025:

Status: In Progress (Accepted)
Classification: P2/S2 (moderate priority/severity)
Assigned to engineering team for evaluation

November 20, 2025:

VRP Decision: Not Eligible for Reward. “The product team and panel have reviewed your submission and determined that inaccurate summarization is a known issue in Gemini, therefore this report is not eligible for a reward under the VRP.”

Why I’m Publishing This Research

The VRP rejection isn’t about the money. Although compensation for months of rigorous research documentation would have been appropriate recognition. What’s concerning is the reasoning: characterizing systematic exploitability as “inaccurate summarization.”

This framing suggests a fundamental misunderstanding of what I’ve documented. I’m not reporting that Gemini makes mistakes. I’m documenting that Gemini can be reliably manipulated through invisible signals to produce specific, controlled misinformation—and that users have no way to detect this manipulation.

That distinction matters. If Google believes this is just “inaccuracy,” they’re not building the right defenses.

Why This Response Misses the Point

Google’s characterization as “inaccurate summarization” fundamentally misunderstands what I’ve documented:

“Inaccurate Summarization”What I Actually Found
AI sometimes makes mistakesAI can be reliably controlled to say specific false things
Random errors in interpretationSystematic exploitation through invisible signals
Edge cases and difficult content100% reproducible manipulation technique
Can be caught by fact-checkingHumans cannot see the signals being exploited




This IS NOT A BUG. It’s a design flaw that enables systematic deception.

The Architectural Contradiction

Here’s what makes this especially frustrating: Google already has the technology to fix this.

Google’s SEO algorithms successfully detect and penalize hidden text manipulation. It’s documented in their Webmaster Guidelines. Cloaking, hidden text, and CSS positioning tricks have been part of Google’s spam detection for decades.

Yet Gemini, when processing the exact same content, falls for these techniques with 100% success rate.

The solution exists within Google’s own technology stack. It’s an implementation gap, not an unsolved technical problem.

What Should Happen

AI systems processing web content should:

  1. Extract content using browser-rendering engines – See what humans see, not raw HTML
  2. Flag or ignore hidden HTML elements – Apply the same logic used in SEO spam detection
  3. Validate metadata against visible content – Detect contradictions between titles/descriptions and body text
  4. Warn users about suspicious signals – Surface when content shows signs of manipulation
  5. Implement multi-perspective summarization – Show uncertainty ranges rather than false confidence

Why I’m Publishing This Now

I’ve followed responsible disclosure practices:

✅ Reported privately to Google (September 23)
✅ Provided detailed reproduction steps
✅ Created only fictional/research examples
✅ Gave them two months to respond
✅ Worked with them through multiple status changes

But after two months of:

  • Initial dismissal as “intended behavior”
  • Reopening only after live demonstration
  • P2/S2 classification suggesting it’s not urgent
  • VRP rejection as “known issue”
  • No timeline for fixes or mitigation

…while the vulnerability remains actively exploitable affecting billions of queries, I believe the security community and the public need to know.

This Affects More Than Google

While my research focused on Gemini, preliminary testing suggests similar vulnerabilities exist across:

  • ChatGPT (OpenAI)
  • Claude (Anthropic)
  • Perplexity
  • Grok (xAI)

This is an entire vulnerability class affecting how AI systems process web content. It needs coordinated industry response, not one company slowly working through their backlog.

Even the html file with which the exploit was developed was with the help off Claude.ai — I could have just removed the warnings and I would have had a working exploit live in a few minutes.

The Information Integrity Crisis

As AI becomes humanity’s primary information filter, this vulnerability represents a fundamental threat to information integrity:

  • Users cannot verify what AI systems are reading
  • Standard fact-checking fails because manipulation is invisible
  • Regulatory compliance is meaningless when visible and AI-interpreted content diverge
  • Trust erodes when users discover summaries contradict sources

We’re building an information ecosystem where a hidden layer of signals – invisible to humans – controls what AI systems tell us about the world.

What Happens Next

I’m proceeding with:

Immediate Public Disclosure

  • This blog post – Complete technical documentation
  • GitHub repository – All test cases and reproduction code — https://github.com/walterreid/Summarizer
  • Research paper – Full methodology and findings – https://github.com/walterreid/Summarizer/blob/main/research/SRO-SRM-Summarization-Research.txt
  • Community outreach – Hacker News, security mailing lists, social media

Academic Publication

  • USENIX Security submission
  • IEEE Security & Privacy consideration
  • ACM CCS if rejected from primary venues

Media and Regulatory Outreach

  • Tech journalism (TechCrunch, The Verge, Ars Technica, 404 Media)
  • Consumer protection regulators (FTC, EU Digital Services Act)
  • Financial regulators (SEC – for market manipulation potential)

Industry Coordination

Reaching out to other AI companies to:

  • Assess cross-platform vulnerability
  • Share detection methodologies
  • Coordinate defensive measures
  • Establish industry standards

Full Research Repository

Complete technical documentation, test cases, reproduction steps, and code samples:

https://github.com/walterreid/Summarizer

The repository includes:

  • 8+ paired control/manipulation test cases
  • SHA256 checksums for reproducibility
  • Detailed manipulation technique inventory
  • Cross-platform evaluation results
  • Detection algorithm specifications

A Note on Ethics

All test content uses:

  • Fictional companies (GlobalTech, IronFortress)
  • Clearly marked research demonstrations
  • Self-referential warnings about manipulation
  • Transparent methodology for verification

The goal is to improve AI system security, not enable malicious exploitation.

What You Can Do

If you’re a user:

  • Be skeptical of AI summaries, especially for important decisions
  • Visit original sources whenever possible
  • Advocate for transparency in AI processing

If you’re a developer:

  • Audit your content processing pipelines
  • Implement browser-engine extraction
  • Add hidden content detection
  • Test against manipulation techniques

If you’re a researcher:

  • Replicate these findings
  • Explore additional exploitation vectors
  • Develop improved detection methods
  • Publish your results

If you’re a platform:

  • Take this vulnerability class seriously
  • Implement defensive measures
  • Coordinate with industry peers
  • Communicate transparently with users

The Bigger Picture

This vulnerability exists because AI systems were built to be comprehensive readers of HTML – to extract every possible signal. That made sense when they were processing content for understanding.

But now they’re mediating information for billions of users who trust them as authoritative sources. The design assumptions have changed, but the architecture hasn’t caught up.

We need AI systems that process content the way humans experience it, not the way machines parse it.

Final Thoughts

I didn’t start this research to embarrass Google or any AI company. I started because I was curious about how AI systems interpret web content in an era where summaries are replacing clicks.

What I found is more serious than I expected: a systematic vulnerability that enables invisible manipulation of the information layer most people now rely on.

Google’s response – classifying this as “known inaccuracy” rather than a security vulnerability – suggests we have a fundamental disconnect about what AI safety means in practice.

I hope publishing this research sparks the conversation we need to have about information integrity in an AI-mediated world.

Because right now, I can make Google’s AI say literally anything. And so can anyone else with basic HTML skills and access to another AI platform.

That should not be a feature.


Contact:
Walter Reid
walterreid@gmail.com
LinkedIn | GitHub

Research Repository:
https://github.com/walterreid/Summarizer

Google Bug Report:
#446895235 (In Progress, P2/S2, VRP Declined)


This vulnerability highlights the potential for users to Make Google’s AI (Gemini) Say Anything without their knowledge, emphasizing the need for better safeguards.

This disclosure follows responsible security research practices. All technical details are provided to enable detection and mitigation across the industry.

AI Killed the SEO Star: SRO Is the New Battleground for Brand Visibility

I feel like we’re on the cusp of something big. The kind of shift you only notice in hindsight— Like when your parents tried to say “Groovy” back in the 80s or “Dis” back in the ‘90s and totally blew it.

We used to “Google” something. Now we’re just waiting for the official verb that means “ask AI.”

But for brands, the change runs deeper.

In this post-click world, there’s no click. Let that sink in. No context trail. No scrolling down to see your version of the story.

Instead, potential customers are met with a summary – And that summary might be:

  • Flat [“WidgetCo is a business.” Cool. So is everything else on LinkedIn.]
  • Biased [Searching for “best running shoes” and five unheard-of brands with affiliate deals show up first—no Nike, no Adidas.]
  • Incomplete [Your software’s AI-powered dashboard doesn’t even get mentioned in the summary—just “offers charts.”]
  • Or worst of all: Accurate… but not on your terms [Your brand’s slogan shows up—but it’s the sarcastic meme version from Reddit, not the one you paid an agency $200K to write.]

This isn’t just a change in how people find you. It’s a change in who gets to tell your story first.

And if you’re not managing that summary, someone—or something—else already is.


From SEO to SRO

For the past two decades, brands have optimized for search. Page rank. Link juice. Featured snippets. But in a world of AI Overviews, Gemini Mode, and voice-first interfaces, those rules are breaking down.

Welcome to SRO: Summary Ranking Optimization.

SRO is what happens when we stop optimizing for links and start optimizing for how we’re interpreted by AI.

If you follow research like I do, you may have seen similar ideas before:

But here’s where SRO is different: If SEO helped you show up, SRO helps you show up accurately.

It’s not about clicks – it’s about interpretability. It’s also about understanding in the language of your future customer.


Why SRO Matters

Generative AI isn’t surfacing web pages – it’s generating interpretations.

And whether you’re a publisher, product, or platform, your future visibility depends not on how well you’re indexed… …but on how you’re summarized.


New Game, New Metrics

Let’s break down the new scoreboard. If you saw the mock title image dashboard I posted, here’s what each metric actually means:

🟢 Emotional Framing

How are you cast in the story? Are you a solution? A liability? A “meh”? The tone AI assigns you can tilt perception before users even engage.

🔵 Brand Defaultness

Are you the default answer—or an optional mention? This is the AI equivalent of shelf space. If you’re not first, you’re filtered.

🟡 AI Summary Drift

Does your story change across platforms or prompts? One hallucination on Gemini. Another omission on ChatGPT. If you don’t monitor this, you won’t even know you’ve lost control.

🔴 Fact Inclusion

Are your real differentiators making it in? Many brands are discovering that their best features are being left on the cutting room floor.

These are the new KPIs of trust and brand coherence in an AI-mediated world.


So What Do You Do About It?

Let’s be real: most brands still think of AI as a tool for productivity. Copy faster. Summarize faster. Post faster.

But SRO reframes it entirely: AI is your customer’s first interface. And often, their last.

Here’s how to stay in the frame:

Audit how you’re summarized. Ask AI systems the questions your customers ask. What shows up? Who’s missing? Is that how you would describe yourself?

Structure for retrieval. Summaries are short because the context window is short. Use LLM-readable docs, concise phrasing, and consistent framing.

Track drift. Summaries change silently. Build systems—or partner with those who do—to detect how your representation evolves across model updates.

Reclaim your defaults. Don’t just chase facts. Shape how those facts are framed. Think like a prompt engineer, not a PR team.


Why Now?

Because if you don’t do it, someone else will – an agency (I’m looking at you ADMERASIA), a model trainer, or your competitor. And they won’t explain it. They’ll productize it. They’ll sell it back to you.

Probably, and in all likelihood, in a dashboard!


A Final Note (Before This Gets Summarized – And it will get summarized)

I’ve been writing about this shift in Designed to Be Understood—from the Explain-It-To-Me Economy to Understanding as a Service.

But SRO is the part no one wants to say out loud:

You’re not just trying to be ranked. You’re trying not to be replaced.


Ask Yourself This

If you found out your customers were hearing a version of your story you never wrote… what would you do?

Because they already are.

Let’s fix that—before someone else summarize It for you.

~Walter