Your Copilot Is Only as Smart as Your SharePoint: Why Stale Files Are Hallucinations Waiting to Happen

Microsoft 365 Copilot grounds every answer in your tenant’s own content. When that content is stale, contradictory, or duplicated, Copilot doesn’t shrug — it answers with confidence. Here’s why that’s a problem and what to do about it.

When you roll out Microsoft 365 Copilot, you’re handing every employee a large language model that grounds its responses in the documents, emails, chats, and meeting notes already sitting in your tenant.

That’s the value proposition: Copilot doesn’t make things up out of nothing; it answers from your data.

The trouble is, your data isn’t a clean source of truth. It’s an archaeological site.

Most tenants we scan have content from three different eras of the business: the founding-team docs from 2017 that nobody has touched in five years, the “transformation initiative” content from 2021 that everyone still references even though it was deprecated, and the current quarter’s working files.

Copilot doesn’t know which era to trust. It looks at all of them, weighs them by recency and relevance, and gives you an answer that could plausibly be drawn from any of them, sometimes blending facts that contradict each other.

That’s a hallucination. Not in the LLM sense of inventing facts from thin air, but in the more dangerous corporate sense: confidently surfacing content that’s wrong because it’s old, and presenting it with the same authority as content that’s right because it’s current. 

Three categories of trouble 

Stale policies. The 2019 expense policy capped lunch reimbursements at $25; the 2024 update raised it to $40. If both PDFs are in the same SharePoint site and the 2019 one has a more “official-sounding” filename, Copilot may quote $25 to a new hire, who then pays out of pocket for part of a $30 lunch that the current policy would have covered in full.

Duplicate-with-drift. Sales has version 1.0 of the price sheet in its site. Marketing has a copy of version 1.0 in its own. Six months later, Sales updates its copy to 1.1 but Marketing doesn’t. A prospect asks their account manager, who sits in Marketing, for current pricing. The account manager asks Copilot and gets 1.0 prices. The deal closes at the wrong number.

Orphaned answers. The 2018 acquisition’s product wiki is still in the tenant; nobody owns it and nobody maintains it. A support agent asks Copilot about a feature, and Copilot finds the answer in the orphaned wiki, even though the feature was deprecated in 2020. The customer is told it still exists.

In every case, the LLM is doing its job. The data is doing the lying. 
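
To make the duplicate-with-drift pattern concrete, here is a minimal Python sketch of how a scan might flag it. It assumes you already have an export of file copies with their site, name, and contents. The `FileCopy` type, the `find_drifted_duplicates` function, and the sample data are hypothetical names for illustration, not part of any Microsoft API. Matching on a normalized file name is a deliberate simplification; a real scan would likely need fuzzier matching.

```python
from collections import defaultdict
from dataclasses import dataclass
import hashlib


@dataclass(frozen=True)
class FileCopy:
    site: str       # hypothetical site name, e.g. "Sales"
    name: str       # file name as stored in that site
    content: bytes  # raw file bytes (assumed to be exported already)


def find_drifted_duplicates(copies):
    """Group copies by normalized file name; return the groups whose contents differ."""
    by_name = defaultdict(list)
    for copy in copies:
        by_name[copy.name.strip().lower()].append(copy)

    drifted = {}
    for name, group in by_name.items():
        digests = {hashlib.sha256(c.content).hexdigest() for c in group}
        # More than one distinct digest means same-named copies have diverged.
        if len(digests) > 1:
            drifted[name] = sorted({c.site for c in group})
    return drifted


# Illustrative data only: two price sheets that drifted, two handbooks that did not.
copies = [
    FileCopy("Sales", "Price Sheet.xlsx", b"prices v1.1"),
    FileCopy("Marketing", "price sheet.xlsx", b"prices v1.0"),
    FileCopy("HR", "Handbook.pdf", b"handbook"),
    FileCopy("Legal", "handbook.pdf", b"handbook"),
]
print(find_drifted_duplicates(copies))
# {'price sheet.xlsx': ['Marketing', 'Sales']}
```

Identical copies are left alone here because they waste storage but cannot contradict each other; the copies that have diverged are the ones that can feed Copilot two different answers.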

Why this gets worse with Copilot, not better 

Before Copilot, stale content was passively bad — it sat there until someone happened to read it.

Most of the time nobody did. Copilot changes the consumption model: every question pulls in candidate content, ranks it, and serves it.

Stale content goes from passively harmful to actively retrieved. The pile of “files nobody looks at anymore” effectively shrinks to zero, because Copilot can draw on anything the person asking has access to.

The fix is operational, not technical 

Microsoft’s response to this problem is “use sensitivity labels and information protection.” 

That’s correct but slow — it’s a multi-year program and most organisations don’t finish it. 

There’s a faster intervention available: identify the content that’s most likely to mislead Copilot (old, duplicate, unowned), then archive, delete, or update it.

A scan that produces this inventory takes under an hour for most tenants. The cleanup itself can be staged over a quarter — archive stale sites, dedupe price sheets, assign owners to orphans.

The result isn’t perfect Copilot accuracy, but it removes the most common categories of corporate hallucination. 
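
As a rough illustration of the “old” and “unowned” parts of that inventory, the sketch below flags items from a file listing. Everything in it is an assumption made for the example: the `InventoryItem` fields, the `flag_item` function, the sample paths, and the two-year staleness threshold are hypothetical, and a real scan would read this metadata from the tenant rather than from a hard-coded list.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class InventoryItem:
    path: str
    last_modified: date
    owner: Optional[str]  # None when no active owner is known


def flag_item(item: InventoryItem, today: date, stale_after_days: int = 730) -> List[str]:
    """Return the risk flags for one item; an empty list means nothing to do."""
    flags = []
    # Assumed threshold: untouched for more than two years counts as stale.
    if (today - item.last_modified).days > stale_after_days:
        flags.append("stale")
    if not item.owner:
        flags.append("orphaned")
    return flags


# Illustrative inventory only.
today = date(2025, 1, 1)
inventory = [
    InventoryItem("policies/expense-2019.pdf", date(2019, 3, 1), "finance"),
    InventoryItem("acquired-product-wiki/features", date(2018, 6, 1), None),
    InventoryItem("planning/q4-plan.docx", date(2024, 12, 1), "ops"),
]
for item in inventory:
    print(item.path, flag_item(item, today))
# policies/expense-2019.pdf ['stale']
# acquired-product-wiki/features ['stale', 'orphaned']
# planning/q4-plan.docx []
```

The flags map onto the staged cleanup: stale items are candidates for archiving, and orphaned items need an owner assigned before anyone can decide whether to keep them.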

A storage scan and a Copilot-readiness scan are essentially the same scan. Storage savings are the financial argument; Copilot accuracy is the operational one. Most tenants we work with start the project for the financial reason and discover the operational benefit was worth more. 
