One honest number from my last full-disk day: 61.4 GB of node_modules across 118 project folders on a 512 GB MacBook. Xcode DerivedData was smaller. The Docker VM was smaller. Even the local Ollama cache came in behind. What made the reclaim survivable was the audit that ran before the delete pass. A find-and-sort gets the size column. A real audit gets four more, and those four tell you which folders are actually dead. The dev who wrote up the brtkwr.com cleanup that reclaimed 200 GB from a full disk captured it: "After the initial Docker/cache cleanup freed 150GB, going back and asking 'what else?' found another 75GB." Node modules are almost always the biggest slice of that second pass.
Why does node_modules disk space keep sneaking up on a dev Mac?
Every Node project ships its own private dependency tree. Even with hardlinks in pnpm and Bun, aggregate footprint scales with project count. A dev with two years of forks, take-home tests, tutorials, and abandoned side projects usually has 30 to 80 GB in node_modules alone.
It stays invisible because macOS Storage lumps it into System Data, not Applications or Documents. Finder's Get Info stalls on nested node_modules because it walks every child, training most developers to avoid those folders. The audit is the fix: one Terminal pass produces the whole picture in under two minutes.
What columns does a node_modules disk audit need?
Size alone is a bad signal. A 3 GB active Next.js repo you are shipping should stay. A 400 MB tutorial from a year ago should not. The columns that separate them are staleness, lockfile presence, and git state.
| Column | Why it matters | How to compute |
|---|---|---|
| Path | Identifies the owning project. | find ~ -type d -name node_modules -prune |
| Size | Ranks the reclaim per row. | du -sh <path> |
| Last-modified (project) | Proxy for "am I still working on this?" | stat -f '%Sm' <parent> |
| Lockfile presence | Confirms install can rebuild bit-for-bit. | ls <parent>/*lock* |
| Git status | Flags uncommitted changes and patched deps. | git -C <parent> status --porcelain |
Every row that comes back with a lockfile, a last-modified date older than 90 days, and a clean git tree is a safe candidate for Trash. Rows missing a lockfile or with a dirty tree go into a manual review pile.
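The rule above can be written as a small predicate. This is a minimal sketch, not part of the audit script itself: the function name `nm_verdict` and its argument order are assumptions for illustration, and the age in days is assumed to be computed elsewhere.

```shell
#!/usr/bin/env bash
# nm_verdict LOCKFILE AGE_DAYS GITFLAG   (hypothetical helper)
# Prints "trash" when a row passes all three checks, otherwise "review".
# LOCKFILE is the lockfile name or "none"; GITFLAG is "clean" or anything else.
nm_verdict() {
  local lockfile=$1 age_days=$2 gitflag=$3
  if [ "$lockfile" != "none" ] && [ "$age_days" -gt 90 ] && [ "$gitflag" = "clean" ]; then
    echo trash
  else
    echo review
  fi
}

nm_verdict package-lock.json 240 clean   # prints: trash
nm_verdict none 400 clean                # prints: review
nm_verdict yarn.lock 12 clean            # prints: review
```

Anything that is not an unambiguous "trash" falls through to "review", so a malformed or unexpected column value errs on the side of a human look.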
How do I run the audit read-only from the Terminal?
The full audit is four numbered steps stitched into a single pass. Nothing in your projects is modified. The script only writes three files under /tmp that you can inspect before deciding to delete.
# 1. Inventory every top-level node_modules under $HOME.
find ~ -type d -name node_modules -prune \
  -not -path "*/.Trash/*" \
  -not -path "*/Library/*" \
  -not -path "*/.Time Machine/*" 2>/dev/null \
  > /tmp/nm-paths.txt
wc -l /tmp/nm-paths.txt
# 2. Map each folder to a real disk size, sorted largest last.
xargs -I{} du -sh {} 2>/dev/null < /tmp/nm-paths.txt \
  | sort -h > /tmp/nm-sizes.txt
tail -20 /tmp/nm-sizes.txt
# 3. For each top offender, compute the four judgement columns.
: > /tmp/nm-audit.tsv
while read -r size path; do
  project=$(dirname "$path")
  mtime=$(stat -f '%Sm' -t '%Y-%m-%d' "$project" 2>/dev/null)
  lockfile=$(ls "$project"/package-lock.json \
    "$project"/pnpm-lock.yaml \
    "$project"/yarn.lock \
    "$project"/bun.lockb 2>/dev/null \
    | head -1 | xargs -I{} basename {} )
  # A folder that is not a git work tree must not be reported as "clean":
  # nothing in it can be recovered from history.
  if git -C "$project" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    gitstate=$(git -C "$project" status --porcelain 2>/dev/null \
      | wc -l | tr -d ' ')
    gitflag=$([ "$gitstate" = "0" ] && echo clean || echo dirty)
  else
    gitflag=nogit
  fi
  printf '%s\t%s\t%s\t%s\t%s\n' \
    "$size" "$mtime" "${lockfile:-none}" "$gitflag" "$path" \
    >> /tmp/nm-audit.tsv
done < /tmp/nm-sizes.txt
# 4. Print the audit table, largest at the bottom.
column -t -s $'\t' /tmp/nm-audit.tsv | tail -30
The output is a table of size mtime lockfile gitflag path per row. Everything with lockfile != none, mtime older than 90 days, and gitflag = clean is a green light. Everything else needs a human look before it moves.
What does a real audit look like on a two-year-old dev Mac?
The numbers below are from a 512 GB MacBook, anonymised. The shape is what matters, not the exact totals. The audit ran in 78 seconds read-only.
| Bucket | Count | Total size | Median mtime | Lockfile ratio | Clean git ratio |
|---|---|---|---|---|---|
| Active client repos | 7 | 9.2 GB | 3 days | 7 of 7 | 4 of 7 |
| Personal apps in progress | 5 | 8.4 GB | 21 days | 5 of 5 | 3 of 5 |
| Read-once forks | 19 | 6.8 GB | 7 months | 18 of 19 | 19 of 19 |
| Hackathon and interview code | 14 | 12.7 GB | 8 months | 14 of 14 | 14 of 14 |
| Course and tutorial repos | 26 | 8.5 GB | 14 months | 24 of 26 | 26 of 26 |
| Abandoned side projects | 47 | 15.8 GB | 18 months | 45 of 47 | 47 of 47 |
| Total | 118 | 61.4 GB | mixed | 113 of 118 | 113 of 118 |
The bottom four buckets are the reclaim: all 106 folders are clean in git with a stale mtime, and 101 of them have a lockfile. That is 43.8 GB in total, with the five folders that lack a lockfile still going through manual review before they move, as the rule above requires. The top two buckets stayed put. Two active repos with dirty git state got a manual look because they had patch-package in devDependencies and I wanted to confirm the patches were committed before deleting the built tree.
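As a quick check on the arithmetic, the 43.8 GB figure is the sum of the "Total size" column for the four stale buckets. The sketch below recomputes it; the helper name `sum_gb` is an assumption for illustration, and the values are copied from the table above.

```shell
#!/usr/bin/env bash
# sum_gb (hypothetical helper): reads "size_gb<TAB>label" lines on stdin
# and prints the total of the first column to one decimal place.
sum_gb() {
  awk -F'\t' '{ total += $1 } END { printf "%.1f\n", total }'
}

# Sizes of the four stale buckets, taken from the table above.
printf '%s\t%s\n' \
  6.8  "Read-once forks" \
  12.7 "Hackathon and interview code" \
  8.5  "Course and tutorial repos" \
  15.8 "Abandoned side projects" | sum_gb   # prints: 43.8
```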
Is a node_modules disk audit safe to run automated?
The read pass, yes. The delete pass, not quite. A cron that lists candidates is fine. A cron that deletes them silently is not.
The reason is intent, the one column the script cannot compute. A seasonal client repo you only touch in Q4 fails the 90-day filter every March. A take-home test you might reopen if the company calls back sits in the same bucket as tutorial code you will never open again. The audit surfaces the list; the human ticks the rows.
The safest scheduled version writes the table to a file and drops a Reminders app notification. The unsafest is a cron that rm -rf's anything older than 90 days. In between is the review-first pattern behind the CleanMyDev receipts-first philosophy: compute the audit, show it in a UI, delete only what a human ticks, route the delete to Trash so the rollback window is a week not a millisecond.
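A list-only scheduled job of the kind described above might look like the following sketch. The function name `nm_report`, the report file name, and the idea of passing the cutoff date in as an argument are all assumptions made for illustration; the notification step is left out, and nothing here deletes or moves a project folder.

```shell
#!/usr/bin/env bash
# nm_report AUDIT_TSV CUTOFF_DATE OUT_DIR   (hypothetical helper)
# Writes the candidate rows to a dated file and prints that file's path.
# AUDIT_TSV columns: size, mtime (YYYY-MM-DD), lockfile, gitflag, path.
# CUTOFF_DATE is YYYY-MM-DD; on macOS it could come from: date -v-90d +%Y-%m-%d
nm_report() {
  local audit=$1 cutoff=$2 out_dir=$3
  local report
  report="$out_dir/nm-candidates-$(date +%Y%m%d).tsv"
  mkdir -p "$out_dir" || return 1
  # ISO dates sort correctly as strings, so a plain "<" comparison is enough.
  awk -F'\t' -v cutoff="$cutoff" \
    '$3 != "none" && $4 == "clean" && $2 < cutoff' "$audit" > "$report"
  echo "$report"
}
```

Because the function only reads the audit table and writes a report, it is the part that is reasonable to hand to cron or launchd. The ticking and the move to Trash stay manual.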
How do I move the confirmed candidates to Trash?
Do not rm -rf even when the audit gives you a green light on every column. On APFS, a mv to Trash on the same volume is almost free, and leaving the folder in Trash for a week before emptying it has saved more careers than any git reflog ever has.
# Pull the green-light rows out of the audit table.
# Green = has a lockfile, mtime older than 90 days, git tree clean.
# Cutoff is 90 days ago, using the BSD date syntax that ships with macOS.
cutoff=$(date -v-90d +%Y-%m-%d)
awk -F'\t' -v cutoff="$cutoff" \
  '$3 != "none" && $4 == "clean" && $2 < cutoff' \
  /tmp/nm-audit.tsv > /tmp/nm-green.tsv
# Move each green-light node_modules to Trash with a date stamp,
# so the Trash entry is identifiable and easy to restore.
while IFS=$'\t' read -r size mtime lockfile gitflag path; do
  project=$(basename "$(dirname "$path")")
  # Quote the destination so project names containing spaces survive.
  mv "$path" "$HOME/.Trash/nm-${project}-$(date +%Y%m%d)" \
    && echo "trashed $size $path" \
    || echo "SKIP $path"
done < /tmp/nm-green.tsv
The mv is metadata-only on APFS when the project sits on the same volume as your home folder, so the delete pass finishes in seconds even on tens of gigabytes. Empty Trash a week later once nothing has broken. The pattern is the same one used in the DerivedData safety audit and the Move to Trash versus rm -rf writeup.
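The rollback half of the pattern is the same mv in reverse. The sketch below assumes a helper named `nm_restore` (hypothetical) and refuses to overwrite a node_modules that has been reinstalled in the meantime.

```shell
#!/usr/bin/env bash
# nm_restore TRASH_ENTRY PROJECT_DIR   (hypothetical helper)
# Moves a trashed folder back to PROJECT_DIR/node_modules.
# If a node_modules already exists there, nothing is touched.
nm_restore() {
  local entry=$1 project=$2
  if [ ! -d "$entry" ]; then
    echo "no such Trash entry: $entry" >&2
    return 1
  fi
  if [ -e "$project/node_modules" ]; then
    echo "refusing to overwrite $project/node_modules" >&2
    return 1
  fi
  mv "$entry" "$project/node_modules" && echo "restored $project/node_modules"
}

# Example with hypothetical paths:
#   nm_restore ~/.Trash/nm-my-app-20250101 ~/code/my-app
```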
How does the audit differ across npm, pnpm, Yarn, and Bun?
The lockfile column changes name per package manager. The size column changes meaning too, because pnpm and Bun hardlink files from a global store instead of duplicating them per project.
| Package manager | Lockfile | node_modules shape | Reclaim per project |
|---|---|---|---|
| npm | package-lock.json | Full deep copy | Large (1 to 4 GB) |
| Yarn Classic v1 | yarn.lock | Full deep copy | Large (1 to 4 GB) |
| Yarn Berry (PnP) | yarn.lock | Often no node_modules | Small, cache-only |
| pnpm | pnpm-lock.yaml | Symlinked layout, files hardlinked from ~/Library/pnpm/store/v3 | Small (bytes stay in the store) |
| Bun | bun.lockb | Hybrid, hardlinks by default | Medium (500 MB to 2 GB) |
If trashing a pnpm project frees far less space than its size column promised, that is expected. pnpm links files in from its global store, so du counts bytes that stay on disk until the store itself is pruned. The pnpm store cleanup playbook covers that second-half reclaim.
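Since the audit already records the lockfile name, the package manager can be derived from it and used to flag rows whose bytes mostly live in a global store. This is a minimal sketch: the function names `pm_for_lockfile` and `shares_global_store` are assumptions for illustration, and the mapping covers only the four lockfiles in the table above.

```shell
#!/usr/bin/env bash
# pm_for_lockfile LOCKFILE_NAME   (hypothetical helper)
# Maps a lockfile name from the audit table to a package manager label.
pm_for_lockfile() {
  case "$1" in
    package-lock.json) echo npm ;;
    yarn.lock)         echo yarn ;;
    pnpm-lock.yaml)    echo pnpm ;;
    bun.lockb)         echo bun ;;
    *)                 echo unknown ;;
  esac
}

# shares_global_store PM   (hypothetical helper)
# Succeeds for managers that link files from a global store, per the table above.
shares_global_store() {
  case "$1" in
    pnpm|bun) return 0 ;;
    *)        return 1 ;;
  esac
}

pm_for_lockfile pnpm-lock.yaml   # prints: pnpm
pm_for_lockfile none             # prints: unknown
```

A row where `shares_global_store` succeeds is one where the size column should be read as an upper bound on the reclaim, not a promise.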
What breaks if the audit is wrong?
Three failure modes. A deleted node_modules with a locally linked dependency reinstalls from the registry instead of your link; npm link fixes it in a minute. A postinstall side effect writing config outside the folder reruns on reinstall. A project mid-migration with no committed lockfile resolves transitive versions differently on rebuild, which is the case the lockfile column exists to catch. All three are salvageable during the Trash window, which is why the delete step is a mv and not an rm.
Closing: an audit is a hedge against your own future self
find and du give you a size ranking. A five-column audit gives you a reclaim you can defend a week later, once you have forgotten which rows you ticked and why. If you want that audit surfaced in a UI with per-row metadata, Move to Trash by default, and a seven-day rollback window instead of a string of Terminal commands, CleanMyDev is the $9.99 lifetime app that builds the receipt before it deletes anything. Path, size, mtime, lockfile, package manager, risk label, one tick per row.