we scanned every skill on clawhub. here's what we found.

2026-02-16

last week someone forked my governance repo to distribute malware. that got me thinking about the broader supply chain. not github — the AI tool ecosystem specifically. clawhub has 7,500+ skills that people install and run with their AI agents. what’s actually in them?

i decided to find out.

the setup

important caveat upfront: this is static text analysis. regex patterns against file contents. it catches what’s written in the code and documentation — hardcoded secrets, shell commands, suspicious patterns. it does not catch a skill that exfiltrates data through legitimate-looking API calls, or one that behaves differently at runtime than its source suggests. static analysis is a floor, not a ceiling.

i wrote a scanner that pulls every skill from the clawhub registry, downloads the scannable files (SKILL.md, source code, configs), and runs 40 regex patterns against them. the patterns come from published research — snyk’s toxicskills report, cisco’s skill scanner, kaspersky’s work on indirect prompt injection. the categories cover the things you’d worry about: prompt injection, credential exfiltration, reverse shells, hardcoded secrets, obfuscated payloads, data exfil endpoints.
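
here's the shape of it. a minimal sketch in python; the pattern table is abbreviated and the names are mine, not the production list of 40:

```python
# minimal sketch of the first-pass scanner. pattern names and the
# abbreviated pattern set are illustrative, not the production list.
import re

PATTERNS = {
    "credential_docs": re.compile(r"api_key|token|secret", re.I),  # top pattern, 1,923 hits
    "webhook": re.compile(r"webhook", re.I),                       # second noisiest, 979 hits
    "dotfile_access": re.compile(r"~/\.\w+"),                      # ~/.config, ~/.ssh, ...
    "reverse_shell": re.compile(r"bash -i >& /dev/tcp/"),
}

def scan_file(skill_id: str, path: str, text: str) -> list[dict]:
    findings = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append({
                "skill": skill_id,
                "file": path,
                "pattern": name,
                "offset": match.start(),
                # keep surrounding context for the second pass
                "excerpt": text[max(0, match.start() - 80): match.end() + 80],
            })
    return findings
```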

the whole thing runs in docker: the download phase pulls the files over the network, then the container goes fully airgapped with --network=none for analysis. no skill code ever touches a live network.
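
split across two container runs, the driver is simple. a sketch, where the image name and its download/analyze subcommands are placeholders, not a real tool:

```python
# hypothetical two-phase driver: fetch with network, analyze airgapped.
# "skill-scanner" and its subcommands are placeholder names.
import subprocess

def run_scan(workdir: str) -> None:
    # phase 1: download skill files from the registry (network required)
    subprocess.run(
        ["docker", "run", "--rm", "-v", f"{workdir}:/data",
         "skill-scanner", "download"],
        check=True,
    )
    # phase 2: static analysis with no network interface at all
    subprocess.run(
        ["docker", "run", "--rm", "--network=none", "-v", f"{workdir}:/data",
         "skill-scanner", "analyze"],
        check=True,
    )
```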

first pass: 7,522 skills scanned. 4,931 findings across 746 skills. 61% estimated false positive rate.

that’s a lot of noise.

the noise problem

the top triggered pattern was api_key|token|secret with 1,923 hits. almost all of them are documentation. every API integration skill has a SKILL.md that says something like “set your STRIPE_API_KEY environment variable” or shows a curl example with $TOKEN. that’s not exfiltration. that’s a readme.

the second noisiest: webhook with 979 hits. the word “webhook” appears in every integration skill that handles callbacks. only a handful of actual exfiltration indicators — requestbin, pipedream, hookbin, burpcollaborator — were in there. the rest is just the word “webhook.”

~/.config triggered 400 times. almost all benign. only ~/.ssh, ~/.env, and ~/.aws matter. ~/.config/openclaw/calendar.json is not a threat.
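
stacked up, those three observations turn into narrowing rules. roughly like this, where the encoding is my sketch and the lists are the ones above:

```python
# second-pass narrowing rules suggested by the noise analysis above.
# the domain and path lists come from the findings; the encoding is mine.
import re

# the word "webhook" alone is noise; only known exfil endpoints signal
EXFIL_DOMAINS = re.compile(r"requestbin|pipedream|hookbin|burpcollaborator", re.I)

# ~/.config hits are overwhelmingly benign; these are the paths that matter
SENSITIVE_PATHS = re.compile(r"~/\.(ssh|aws|env)\b")

def is_noise(pattern_name: str, excerpt: str) -> bool:
    if pattern_name == "webhook":
        return EXFIL_DOMAINS.search(excerpt) is None
    if pattern_name == "dotfile_access":
        return SENSITIVE_PATHS.search(excerpt) is None
    return False
```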

so i built a second pass.

deep analysis

the second-pass analyzer reads every finding, pulls the surrounding context from the actual downloaded file, and applies validation rules: each finding gets classified into a threat category (prompt injection, RCE, credential exfil, etc.), assigned a validation status (confirmed, likely, uncertain, benign, noise), and scored.
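
in code, the finding shape looks something like this. the categories and statuses are the ones above; the score semantics are illustrative:

```python
# shape of a second-pass finding. categories and statuses match the
# pipeline described above; the score field's semantics are illustrative.
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    CONFIRMED = "confirmed"
    LIKELY = "likely"
    UNCERTAIN = "uncertain"
    BENIGN = "benign"
    NOISE = "noise"

@dataclass
class Finding:
    skill: str
    file: str
    category: str    # "prompt_injection", "rce", "credential_exfil", ...
    status: Status
    score: float     # higher = more confident this is a real threat
    context: str     # surrounding lines pulled from the downloaded file
```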

result after deep analysis: 4,931 findings reduced to 1,397 actionable. 72% noise reduction. 16 confirmed threats. 284 likely threats.

what the “confirmed threats” actually are

here’s where it gets interesting. i manually reviewed every confirmed threat. all 16 of them.

zero are malicious. all 16 turned out to be false positives, mostly security skills whose documentation quotes the very attack patterns they teach agents to recognize (more on that irony below).

the “likely” threats

the 284 likely threats are almost entirely curl commands with API tokens in SKILL.md files. curl -s -H "X-N8N-API-KEY: $N8N_API_KEY" is not credential exfiltration — it’s API documentation. the scanner correctly identifies that a SKILL.md file is instructing an AI agent to use curl with credentials. but that’s… what API skills do.
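
the pattern alone can't make that distinction, but context can. one rule that does a lot of the work here, as i'd sketch it (my encoding, not the exact production rule): a credential hit inside a fenced code block of a SKILL.md is documentation until something else says otherwise.

```python
# hedged heuristic: credential patterns inside SKILL.md code fences are
# classified as documentation, not exfiltration. fence detection is crude.
def in_code_fence(text: str, offset: int) -> bool:
    fence = "`" * 3  # a literal triple backtick
    # inside a fence if an odd number of fence markers precede the match
    return text[:offset].count(fence) % 2 == 1

def looks_like_docs(path: str, text: str, offset: int) -> bool:
    return path.endswith("SKILL.md") and in_code_fence(text, offset)
```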

a few were genuinely interesting, though. the standout: antigravity-quota-1-1-0, the one skill that leaked a real credential. more on it in a minute.

clawhub’s own moderation

clawhub flags 978 skills as “suspicious.” we found findings in 746 skills. the overlap? 315 skills. jaccard agreement: 22%.

after removing noise, it drops to 10% agreement. they’re flagging different things than we are, and neither system is flagging the right things with high confidence.
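
the 22% is straight set arithmetic, if you want to check it:

```python
# jaccard agreement between clawhub's flags and our findings
clawhub_flagged = 978   # skills clawhub marks "suspicious"
our_flagged = 746       # skills with at least one finding
overlap = 315           # skills in both sets

jaccard = overlap / (clawhub_flagged + our_flagged - overlap)
print(f"{jaccard:.1%}")  # 22.4%
```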

clawhub missed antigravity-quota-1-1-0 (the real credential leak). they flagged calendar-reminders (a false positive). the moderation layer exists but it’s not doing what you’d hope.

what this actually means

the clawhub ecosystem is cleaner than the headlines would suggest. out of 7,522 skills:

- 746 had at least one raw finding, and most of that is documentation noise
- 16 were flagged as confirmed threats, and all 16 are false positives on manual review
- 284 were flagged as likely threats, almost all of them curl examples in SKILL.md files
- zero are confirmed malicious

that last point comes with caveats. this is what’s in the registry right now. anything truly malicious may have already been pulled. and static analysis only catches what’s visible in text — a skill designed to be subtle wouldn’t show up in regex patterns. this is a lower bound on safety, not a guarantee.

the security tools on clawhub are actually pretty good. guardian-angel, indirect-prompt-injection, agent-tinman — these are doing exactly what they should be doing: documenting threat patterns so AI agents can recognize and block them. the irony is that good security documentation triggers security scanners.

but “clean today” doesn’t mean “clean tomorrow.” the registry is growing. skills are getting more powerful — more tool access, more credential scopes, more filesystem interaction. the attack surface is expanding even if nobody’s exploiting it yet.

what we’re doing about it

we’re running a nightly differential scan. pull the manifest, diff against yesterday’s, download and analyze only new or updated skills. if something changes, we’ll know within 24 hours.

the deeper question — the one i keep coming back to — is whether regex-based scanning is the right tool for this at all. the gap between “pattern match says this is RCE” and “a human reads the context and sees it’s docker build docs” is enormous. the next version of this pipeline probably needs an LLM in the loop: feed each finding its surrounding context and ask “is this genuinely malicious intent?” that’s where the real signal lives.
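
the shape of that loop is easy to sketch, even if the judgment is the hard part. the prompt below is illustrative, not a working integration:

```python
# illustrative LLM triage prompt: one finding plus its surrounding
# context, asking for a verdict. field names match the scanner sketch.
def triage_prompt(finding: dict) -> str:
    return (
        "you are reviewing a static-analysis finding from an AI-agent skill.\n"
        f"file: {finding['file']}\n"
        f"pattern: {finding['pattern']}\n"
        f"surrounding context:\n{finding['excerpt']}\n\n"
        "is this genuinely malicious intent, or benign (documentation, "
        "security tooling, examples)? answer MALICIOUS, BENIGN, or UNSURE, "
        "then give one sentence of reasoning."
    )
```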

for now, the data says the ecosystem is healthy. but the tooling to verify that didn’t exist until this week.

