ENGINEERING · April 16, 2026 · 10 min read
Why .DS_Store and Thumbs.db slow down your migration (and which files to skip)
Run a scan on any long-lived user drive and you will find that a large share of what is there is not what the user thinks is there. The user sees their folders, their spreadsheets, their photos. The filesystem also holds, invisibly in most native file managers, thousands of small files that operating systems and applications write for their own purposes and forget to clean up.
When you migrate that drive to SharePoint, every one of those invisible files becomes a visible file in SharePoint. Every one of them costs a Graph request to list, a Graph request to upload, a Graph request to set metadata, and a slice of your per-tenant throttle budget. On a 90,522-file Google Drive account, 34,272 of them were debris the user had never seen and would never miss. That is 38% of the file count and almost half the request count that the migration was going to spend.
There are nine recognisable classes of this debris that show up across macOS, Windows, Linux, Android, and Office. This is the breakdown of what they are, why they exist, and why none of them belong in SharePoint.
.DS_Store and ._* — the macOS Finder tax
Every time a macOS user opens a folder in Finder, macOS writes a .DS_Store file into that folder. It stores the window position, the view mode, the icon layout, and the folder’s custom background image. It is useful on the Mac that created it. It is nonsense on every other machine. When someone drops that folder into Google Drive or OneDrive, the .DS_Store comes along for the ride. Sync runs, the file propagates to the cloud, and now it lives in your tenant forever.
._* files are worse. macOS writes them on filesystems that cannot store extended attributes and resource forks natively (FAT or exFAT USB drives, some network shares, cloud sync roots) so that the metadata survives next to the data file. The result is that for every Report.docx on a shared drive you get a ._Report.docx sidecar. They are usually 4 KB each. In the dataset above they accounted for 15,998 files, the single largest category.
Neither file is useful in SharePoint. Neither renders. Neither is searchable. Both clutter the document library, both get indexed by Copilot, both cost request volume during migration. Filter them at the source.
Thumbs.db and desktop.ini — the Windows Explorer tax
Windows does the same thing with Thumbs.db (cached image thumbnails so Explorer renders folders faster) and desktop.ini (custom folder icons, localised folder names, view templates). Both are generated automatically by Windows, both carry the hidden and system attributes, and both are invisible in Explorer unless you show hidden items and protected operating system files.
When a user uploads a folder through the OneDrive or SharePoint client, both of these tag along. SharePoint has no concept of a thumbnail cache or a folder view template, so they are just files, sitting in every library folder the user has ever viewed on Windows. If a user has browsed ten thousand folders, they have ten thousand Thumbs.db candidates. The count scales directly with how much the user actually uses Explorer.
~$*.* — the Office lock files
When you open Budget.xlsx in Excel, Office writes a companion file called ~$Budget.xlsx in the same folder. It is the lock file: it tells any other Excel instance that the workbook is open and who has it open. When Excel closes cleanly it deletes the lock file. When Excel does not close cleanly (the laptop slept, the application crashed, the OneDrive sync flushed before the cleanup), the lock file is orphaned and stays forever.
Every orphaned ~$* file is misleading at best and harmful at worst. If it syncs to SharePoint it can confuse new Office sessions into thinking a file is locked by a session that ended three months ago. The user sees the file marked as in-use and cannot edit. Filtering them out during migration is pure upside; the user cannot miss a file they never edited.
thumbdata3-* — Android gallery
Any Android phone that has been used as a camera accumulates binary files named thumbdata3-1763508120, thumbdata3-1967290299, and so on inside the phone’s DCIM/.thumbnails/ directory. These are the platform’s cached gallery thumbnails. They can easily grow to several hundred megabytes apiece on phones with years of photos.
When a user backs up their phone’s photo folder to Google Drive (which is the default behaviour for many Android users), the thumbdata3-* files go with it. They are not photos. They are not viewable. They are just caches. In the scan above they were 447 files but took up about 11 GB, or roughly 25 MB per file on average, which is disproportionately large for cache debris. Filtering them out saved gigabytes of upload for this account alone.
.Trash-* and .Spotlight-V100 — the Linux and macOS system directories
Linux desktops that follow the FreeDesktop trash specification create a .Trash-1000/ folder (the number is the user's UID) at the top of any mounted volume on which the user sends a file to the trash. The folder contains a files/ subdirectory with the deleted items and an info/ subdirectory with their metadata. If the user mounts a cloud sync folder on Linux and deletes anything from it, the .Trash-1000 directory gets created and then synced up to the cloud. The file the user intentionally deleted lives on forever in the cloud.
.Spotlight-V100 is macOS’s search index, written at the root of every mounted volume. It is binary, it is not intended to be portable, and it is pure metadata. It never belongs in another filesystem.
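Because both of these are directories rather than single files, the cheapest way to skip them is to never descend into them. The following is a minimal sketch of that idea for a local or mounted tree, using Python's os.walk; the function name scan_tree and the pattern tuple are illustrative assumptions, and a cloud listing API would need the equivalent check applied to folder entries instead.

```python
import os
from fnmatch import fnmatchcase

# Directory-name patterns taken from this section; the tuple itself is illustrative.
DEBRIS_DIR_PATTERNS = (".Trash-*", ".Spotlight-V100")


def scan_tree(root):
    """Return (files, pruned_dirs) as paths relative to root.

    Debris directories are removed from the walk, so nothing inside
    them is ever listed.
    """
    files, pruned = [], []
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        keep = []
        for d in sorted(dirnames):
            if any(fnmatchcase(d, p) for p in DEBRIS_DIR_PATTERNS):
                pruned.append(os.path.relpath(os.path.join(dirpath, d), root))
            else:
                keep.append(d)
        # Assigning in place tells os.walk not to descend into pruned directories.
        dirnames[:] = keep
        for name in sorted(filenames):
            files.append(os.path.relpath(os.path.join(dirpath, name), root))
    return files, pruned
```

Pruning at the directory level means the deleted items under .Trash-1000/files/ never cost a listing call at all, and the pruned list can still be reported.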
.tmp, ~*.tmp — autosave and partial write remnants
Most applications that implement autosave create temporary files. Word saves ~WRD0001.tmp alongside the document. Photoshop writes .tmp files for every uncommitted edit. Many of them get cleaned up. Many of them do not: the application crashes mid-write, the user force-quits, the network disconnects during a save. The orphans stay on disk and eventually sync to the cloud.
Filtering these at scan time has a different character from the OS-junk categories. An autosave remnant could, in a small fraction of cases, be the only surviving copy of a document the user was editing when the system crashed. The pragmatic approach is to filter them by default but log them in the scan report so a careful operator can do a one-pass check before signing off.
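One way to picture "filter by default, but log" is a small gate that records every skipped .tmp path for later review. This is a hedged sketch, not the product's implementation: the class name TmpGate, its fields, and the '/'-separated path format are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TmpGate:
    """Skip *.tmp files by default, but remember each one for the scan report."""

    filter_tmp: bool = True  # the operator can switch this off
    skipped_for_review: list = field(default_factory=list)

    def include(self, path: str) -> bool:
        name = path.rsplit("/", 1)[-1]
        # "*.tmp" covers both plain .tmp files and ~*.tmp autosave remnants.
        if not name.lower().endswith(".tmp"):
            return True
        if not self.filter_tmp:
            return True
        self.skipped_for_review.append(path)
        return False


gate = TmpGate()
for path in ["Docs/Report.docx", "Docs/~WRD0001.tmp", "Art/layer.tmp"]:
    gate.include(path)
print(gate.skipped_for_review)  # ['Docs/~WRD0001.tmp', 'Art/layer.tmp']
```

The review list is what the operator checks before sign-off; constructing the gate with filter_tmp=False migrates every .tmp file instead.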
Why filter at scan time, not at upload time
It is possible to filter these files at upload time — read the file from the source, decide not to write it to the destination, move on. The better choice is to filter earlier, at scan time, for three reasons.
- Fewer requests to both sides. A scan-time filter means the engine never asks Google Drive for the file’s metadata, permissions, or content. The entry shows up in the listing and is skipped immediately. On a 90,000-file drive that saves tens of thousands of source-side requests before migration even starts.
- Accurate progress bars and billing. Filtering at upload time means the user sees a progress bar that says “15,998 files remaining” while the engine privately skips most of them. Users get distracted, ask why the count is jumping, and lose trust. Filtering at scan time keeps the visible number aligned with the number that will actually move.
- Clear audit trail. A scan report that says exactly what was filtered, by category, with a count, becomes a number that can go on a change ticket or a client sign-off document. “Migration scope: 46,088 files / 541 GB (34,272 debris files filtered).” That is an honest scope statement.
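The scan-time approach can be sketched as a classifier that runs over the source listing before any per-file request is made. This is an illustration under stated assumptions: the category labels, the function names, and the (path, size) listing shape are hypothetical, and matching on lower-cased names is a simplification chosen here, not something the article specifies. The patterns themselves are the ones named in the sections above.

```python
from collections import Counter
from fnmatch import fnmatchcase

# (category label, lower-cased pattern). Labels are illustrative.
FILE_PATTERNS = [
    ("ds_store", ".ds_store"),
    ("appledouble", "._*"),
    ("thumbs_db", "thumbs.db"),
    ("desktop_ini", "desktop.ini"),
    ("office_lock", "~$*"),
    ("android_thumbdata", "thumbdata3-*"),
    ("tmp", "*.tmp"),
]
DIR_PATTERNS = [
    ("linux_trash", ".trash-*"),
    ("spotlight", ".spotlight-v100"),
]


def classify(path):
    """Return a debris category for a '/'-separated path, or None to migrate it."""
    *parents, name = path.lower().split("/")
    for category, pattern in DIR_PATTERNS:
        if any(fnmatchcase(p, pattern) for p in parents):
            return category
    for category, pattern in FILE_PATTERNS:
        if fnmatchcase(name, pattern):
            return category
    return None  # includes ordinary dotfiles such as .bashrc or .env


def scan_report(listing):
    """Split (path, size_bytes) entries into migration scope and per-category counts."""
    included, filtered = [], Counter()
    for path, size in listing:
        category = classify(path)
        if category is None:
            included.append((path, size))
        else:
            filtered[category] += 1
    return included, filtered


listing = [
    ("Finance/Budget.xlsx", 48_000),
    ("Finance/~$Budget.xlsx", 165),
    ("Finance/.DS_Store", 6_148),
    ("Finance/._Budget.xlsx", 4_096),
    ("Photos/Thumbs.db", 20_000),
    ("home/.bashrc", 300),
    (".Trash-1000/files/old.txt", 10),
]
included, filtered = scan_report(listing)
print(f"Migration scope: {len(included)} files "
      f"({sum(filtered.values())} debris files filtered)")
# Migration scope: 2 files (5 debris files filtered)
```

Only entries that classify returns None for go on to metadata, permission, and content requests, and the Counter gives the per-category numbers for the scope statement.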
What should not be filtered
Dotfiles in general should not be filtered. A user who deliberately placed .bashrc or .env in their drive put it there on purpose, and a migration tool should not pretend to know which dotfiles are OS debris and which are intentional user content. Only the specific named patterns in the list above belong on the filter. Everything else is migrated.
Files that could plausibly be content also stay. .tmp is a coin flip — log it, but let the operator un-filter in one click if they know their source tooling writes real data to .tmp extensions. Filters are defaults, not policy.
The speed impact
On the 90,522-file test drive the filter saved:
- 34,272 files not listed, not downloaded, not uploaded
- 36 GB of payload not transferred
- ~274,000 Graph requests not spent (8 requests per file on average × 34,272 files)
- 29% wall-clock reduction on the full migration
The 29% number is conservative. It assumes the rest of the migration runs at the same throughput, which it will not — because eliminating tiny-file requests also improves Graph throttle headroom, and the remaining files are mostly larger items that amortise the per-request cost better. The real improvement tends to be 30–35% on long-lived drives.
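The file-count and request figures above follow directly from the three numbers quoted for the test drive. A short check, with a hypothetical helper name, assuming the stated average of 8 requests per file:

```python
def filter_savings(total_files, debris_files, requests_per_file):
    """Return (debris share of file count, requests not spent)."""
    share = debris_files / total_files
    requests_saved = debris_files * requests_per_file
    return share, requests_saved


share, saved = filter_savings(90_522, 34_272, 8)
print(f"debris share of file count: {share:.0%}")  # 38%
print(f"requests not spent: {saved:,}")            # 274,176
```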
How to see this on your own scan
Run any Google Drive or OneDrive scan at /scan. The filter runs automatically. The scan result shows the filtered-versus-included breakdown by category. The migration estimate uses the filtered counts. Toggle categories off in the profile panel if you want to keep a specific class of file for audit purposes.
Get started
Filter OS junk on your next scan at app.migrationfox.com/register. The filter is on by default; you see the full breakdown in the scan report.