scrubfile is a local document redaction tool focused on removing personally identifiable information (PII). Its website highlights "100% Local," no cloud APIs, no network calls at runtime, and support for a CLI, Python API, and JSON output. One important caveat: the homepage claims support for PDFs, images, DOCX, and automatic PII detection, but the detailed documentation says Phase 1 only supports redacting explicitly specified sensitive terms in PDFs. Images, DOCX, OCR, automatic detection, and the MCP Server are still on the roadmap, so the currently verifiable capability should be considered PDF-focused.
In terms of protection type, scrubfile is not a traditional gateway or endpoint security product; it is a document-level data redaction tool. For PDF processing, it uses PyMuPDF to locate text positions, add redaction annotations, and execute apply_redactions, removing text from the content stream rather than simply covering it with black boxes. It also clears standard and XMP metadata and saves files with garbage=3 and deflate enabled. Output file permissions are set to 0o600, and the CLI/JSON output masks original PII in the form of [TERM-1], reducing the risk of sensitive data leaking through logs. SSNs and US phone numbers support common format variants, but names, emails, and addresses are still primarily handled through exact matching.
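The PDF steps described above can be sketched directly against PyMuPDF. This is a minimal illustration of the technique, not scrubfile's actual source: the function names term_labels and redact_pdf are hypothetical, exact-match search is assumed, and applying the 0o600 permission after saving is one possible ordering.

```python
import os


def term_labels(terms):
    """Map each term to a placeholder like [TERM-1] so reports never echo raw PII."""
    return {term: f"[TERM-{i}]" for i, term in enumerate(terms, start=1)}


def redact_pdf(src, dst, terms):
    """Remove exact occurrences of each term from src and write the result to dst.

    Returns {placeholder: number_of_redacted_rectangles}; raw terms are not included.
    Hypothetical helper for illustration only.
    """
    import fitz  # PyMuPDF; imported lazily so term_labels works without it

    labels = term_labels(terms)
    counts = {label: 0 for label in labels.values()}

    doc = fitz.open(src)
    try:
        for page in doc:
            marked = False
            for term, label in labels.items():
                # search_for returns the rectangles covering each match on this page
                for rect in page.search_for(term):
                    page.add_redact_annot(rect, fill=(0, 0, 0))
                    counts[label] += 1
                    marked = True
            if marked:
                # Deletes the text under the annotations from the page content
                page.apply_redactions()

        doc.set_metadata({})    # clear standard metadata (author, title, ...)
        doc.del_xml_metadata()  # drop the XMP metadata stream
        doc.save(dst, garbage=3, deflate=True)
    finally:
        doc.close()

    os.chmod(dst, 0o600)  # owner read/write only
    return counts
```

Returning counts keyed by placeholder rather than by the original term mirrors the idea that logs and JSON output should never contain the sensitive value itself.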
Deployment is developer-friendly: it runs in a local Python 3.10+ environment, can be installed via pip/GitHub, and provides the scrubfile command, a Python redact() API, and machine-readable JSON output. Management and alerting are lightweight, mainly covering exit codes, processing status, redaction counts, and affected page counts. It does not provide centralized policy management, an audit platform, or alert integrations. For integration, it is well suited to scripts, batch jobs, CI/CD, or local data-processing pipelines. Although the MCP Server is marketed as Agent-ready, it is still listed as planned in the roadmap.
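For script or CI/CD use, the exit code and JSON output are the two signals a pipeline can act on. The sketch below is a generic wrapper, not part of scrubfile: run_json_tool is a hypothetical helper, the exact scrubfile arguments and JSON field names are not documented in this review, so the command is supplied by the caller and a stand-in Python one-liner is used in the demonstration. Treating exit code 0 as success is the usual convention and is assumed here.

```python
import json
import subprocess
import sys


def run_json_tool(command):
    """Run a command and return (exit_code, parsed_json_or_None).

    Hypothetical helper: it makes no assumption about the tool's JSON schema.
    """
    proc = subprocess.run(command, capture_output=True, text=True)
    try:
        report = json.loads(proc.stdout)
    except json.JSONDecodeError:
        report = None  # the tool printed nothing, or something that is not JSON
    return proc.returncode, report


# Stand-in command; in a real pipeline this would be the scrubfile invocation.
demo = [sys.executable, "-c", "import json; print(json.dumps({'ok': True}))"]
exit_code, report = run_json_tool(demo)
if exit_code != 0 or report is None:
    raise SystemExit("redaction step failed; stopping the pipeline")
```

Failing the job when either signal is missing keeps an unredacted file from moving further down the pipeline.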
The pricing page shows Free, with no disclosed commercial edition or paid support. No compliance certifications are mentioned. A key point to watch is its dependency on PyMuPDF, which the documentation marks as AGPL-3.0 licensed. Private local use is not restricted, but if you distribute the tool, binaries, or a network service, you should assess open-source license compliance risks.
Its strengths are a clear privacy boundary, offline operation, no echoing of sensitive terms in output, and permanent removal from the PDF content stream, which is more reliable than visual masking. Its limitations include limited current support for scanned documents, text inside images, fuzzy matching, non-English PII recognition, and handling untrusted input from the internet. It is best suited for security teams, legal/HR teams, and data engineers who need to batch-redact PDFs locally. If you need enterprise-grade DLP, Chinese OCR, centralized auditing, and policy governance, you should still consider Adobe Acrobat, Google DLP, Presidio, or local Chinese PDF/DLP/data redaction solutions. China access and payment information are not disclosed, and GitHub/PyPI availability may depend on the network environment.
This review is compiled from public sources and does not constitute a purchase recommendation. Verify all facts on the official scrubfile.com site.
scrubfile.com is an Unknown Cybersecurity provider. TG4G tracks its product information, with monthly pricing from $240.00, an overall rating of 8.0/10, and a China-accessibility score of China direct-connect friendly.