PII Privacy Consulting, Built on Open Source

I’m the founder of Philterd, LLC, a consultancy specializing in PII privacy. I help clients design privacy-focused architectures for cloud and AI, and I bring the engineering to build them.

Every engagement is backed by my own open source privacy toolkit: Philter, Arbiter, Phield, and others. That gives clients real leverage: auditable, portable, vendor-neutral PII protection platforms that scale with their cloud and AI workloads.

Industries served
  • Healthcare
  • Finance
  • Government
  • E-commerce
  • Shipping & Logistics
  • Natural Resources
  • Data & AI
Partnerships
  • Amazon Web Services
  • Google Cloud
  • Microsoft Azure

From the Founder

I started Philterd after a healthcare client’s PII pipeline failed in production and the commercial DLP tool couldn’t tell us why. After 15+ years building NLP systems and 17 years on AWS and GCP, I kept arriving at the same lesson: organizations need PII protection that’s auditable, open, and theirs to inspect. The Philterd toolkit gives clients a vendor-neutral foundation; the consulting helps them assemble it into infrastructure their own engineers can extend.

Jeff Zemerick, Founder · Philterd, LLC

Governed Cloud Architecture

Resilient, cloud-native architectures on AWS and GCP where privacy is wired in from day one: federated multi-cloud designs, Well-Architected governance audits, automated PII discovery with AWS Macie and GCP Sensitive Data Protection, Infrastructure-as-Code guardrails, edge-level privacy interception via Lambda@Edge and CloudFront Functions, and end-to-end encryption.

Multi-Cloud Strategies

I design federated architectures that bridge AWS and GCP for high availability, disaster recovery, and freedom from vendor lock-in. Workloads stay portable while identity, networking, and data planes remain consistent across providers.

Well-Architected Governance

I audit environments against the AWS and GCP Well-Architected frameworks, evaluating security, reliability, cost, and operational maturity. The deliverable is a prioritized remediation roadmap that raises your posture without disrupting delivery.

Automated PII Discovery at Scale

I implement cloud-native discovery using AWS Macie and GCP Sensitive Data Protection (DLP) to continuously monitor S3 and Cloud Storage for unencrypted or misplaced PII. Organizations move from manual audits to real-time visibility and remediation.

Privacy-by-Design Infrastructure

I bridge DevOps and data privacy by embedding security guardrails directly into Infrastructure as Code. Service Control Policies, OPA rules, and automated IAM audits ensure your data environments are private by default, not by exception.

Edge-Level Privacy Interception

I deploy real-time privacy filters at the network perimeter using Lambda@Edge and CloudFront Functions. Telemetry and user queries are sanitized before reaching centralized data lakes, mitigating accidental PII ingestion and shrinking your compliance footprint.
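As a rough illustration of the idea, the sketch below shows a Lambda@Edge viewer-request handler in Python that redacts email- and SSN-shaped values from the query string before the request travels any further. The two regular expressions and the `[EMAIL]` / `[SSN]` placeholders are assumptions chosen for brevity; a production filter would rely on a proper detection engine instead of two patterns.

```python
import re
from urllib.parse import parse_qsl, urlencode

# Illustrative patterns only (assumptions); real deployments need broader detection.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def sanitize(text: str) -> str:
    """Replace email- and SSN-shaped substrings with fixed placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return SSN.sub("[SSN]", text)


def handler(event, context):
    """Viewer-request handler: rewrite the query string, then forward the request."""
    request = event["Records"][0]["cf"]["request"]
    pairs = parse_qsl(request.get("querystring", ""), keep_blank_values=True)
    request["querystring"] = urlencode([(key, sanitize(value)) for key, value in pairs])
    return request
```

Returning the modified request object tells CloudFront to continue processing with the sanitized query string, so the raw values never reach the origin or its logs.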

Encryption & Key Management

I architect end-to-end encryption with AWS KMS, CloudHSM, and GCP Cloud KMS, including customer-managed keys, envelope encryption, and automated rotation. Field-level encryption keeps sensitive attributes protected at rest, in transit, and across multi-cloud boundaries.

Compliance Engineering

End-to-end data privacy for cloud and AI workloads, from data governance frameworks and NLP-driven PII/PHI discovery, through AI guardrails for generative and RAG pipelines, to mathematically provable differential privacy and utility-preserving anonymization. Built on the open source Philterd toolkit so the controls you ship are auditable, portable, and vendor-neutral.

Data Governance

I establish the data classification, retention, and stewardship frameworks that transform sensitive information from a compliance liability into a governed enterprise asset, aligned with GDPR, HIPAA, CCPA, and your industry’s regulatory baseline.

PII / PHI Discovery

Using advanced NLP and deep-learning models, I build automated PII and PHI detection across high-volume, unstructured text and document streams, powered by the Philterd toolkit so the entire pipeline stays auditable, tunable, and vendor-neutral.

AI Guardrails

I engineer secure gateways for generative AI and RAG pipelines that stop PII, PHI, and proprietary content from ever reaching the model layer, protecting both prompts and outputs against data exfiltration and prompt-injection attacks.

Local Differential Privacy

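I implement Local Differential Privacy at the source, adding calibrated noise to vector embeddings and telemetry before they leave the device or service that produced them. Because the noise is applied before collection, the privacy guarantee holds even if the central store is compromised, and it limits what membership-inference and vector-inversion attacks against your AI systems can recover about any individual record.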
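To make "calibrated noise" concrete, here is a minimal sketch of the Laplace mechanism applied to a single vector. It assumes the vector is first clipped to an L1 norm of at most `bound`, so that any two inputs differ by at most `2 * bound` in L1 distance; adding Laplace noise with scale `2 * bound / epsilon` to each coordinate then gives an epsilon-locally-differentially-private release. The function names and default values are illustrative, not part of any Philterd API.

```python
import random


def clip_l1(vec, bound):
    """Scale the vector down so its L1 norm is at most `bound`."""
    norm = sum(abs(x) for x in vec)
    if norm <= bound:
        return list(vec)
    return [x * bound / norm for x in vec]


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def privatize(vec, epsilon, bound=1.0, rng=None):
    """Clip, then add per-coordinate Laplace noise calibrated to sensitivity 2 * bound."""
    rng = rng or random.Random()
    clipped = clip_l1(vec, bound)
    scale = 2 * bound / epsilon
    return [x + laplace(scale, rng) for x in clipped]
```

Smaller values of `epsilon` mean more noise and stronger privacy; the clipping bound trades signal loss against noise magnitude and usually has to be tuned per workload.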

Differential Privacy

I move organizations from best-effort redaction to mathematically provable privacy, managing formal Privacy Budgets (ε) across queries and training pipelines to meet the strictest regulatory standards in healthcare, finance, and government workloads.
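One way to picture a privacy budget is as a ledger that every query must draw from. The sketch below assumes basic sequential composition, where the epsilons of successive releases simply add up; the `PrivacyBudget` class and its method names are hypothetical, and real deployments often use tighter accounting methods than this.

```python
class BudgetExceeded(Exception):
    """Raised when a query would push total spend past the allowed epsilon."""


class PrivacyBudget:
    """Tracks cumulative epsilon under basic sequential composition."""

    def __init__(self, total_epsilon: float):
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total = total_epsilon
        self.spent = 0.0

    @property
    def remaining(self) -> float:
        return self.total - self.spent

    def spend(self, epsilon: float) -> None:
        """Reserve `epsilon` for one query, or refuse if the budget cannot cover it."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Small tolerance so float rounding does not reject an exact fit.
        if self.spent + epsilon > self.total + 1e-12:
            raise BudgetExceeded(f"requested {epsilon}, only {self.remaining} left")
        self.spent += epsilon
```

A pipeline would call `spend` before each noisy release and stop answering once the ledger refuses, which is what turns an epsilon on paper into an enforced limit.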

Utility-Preserving Anonymization

Using Masked Language Models, I perform context-aware entity replacement that preserves the semantic integrity of unstructured text, producing synthetic data twins that remain fully private yet effective for training LLMs and refining NLP models.

Industry Leadership

Shaping privacy and NLP at the standards level: leading the open source Philterd toolkit used in production privacy stacks, serving as PMC Chair of Apache OpenNLP and as an ASF Member, speaking at international conferences including OpenSearchCon, ApacheCon, and Berlin Buzzwords, and contributing as an AWS Community Builder and certification exam author.

Open Source Privacy Toolkit

Through Philterd, LLC I develop the open source stack behind every engagement: Philter, Phileas, Arbiter, Phield, and supporting tooling. The goal is simple: give the engineering community high-performance, vendor-neutral building blocks for PII protection across modern cloud and AI pipelines.

Global Technical Advocacy

I am a frequent speaker at international technology conferences, including OpenSearchCon, ApacheCon, and Berlin Buzzwords. I share architectural insights on the intersection of cloud infrastructure and data privacy, helping organizations navigate the evolving landscape of emerging technologies and regulatory requirements.

Apache Foundation Leadership

I serve as PMC Chair of Apache OpenNLP and as an Apache Software Foundation Member. Beyond shipping code, that means shaping release cadence, mentoring committers, and stewarding a project the broader Java NLP community has relied on for over a decade.

AWS Community Building

I’ve been recognized as an AWS Community Builder, authored questions for AWS certification exams, and published commercial products to the AWS Marketplace. Each role keeps me close to the engineers using the platform and to the standards that define what excellence looks like on AWS.

Strategic Engagements

Direct advisory and engineering services for organizations requiring elite-level cloud architecture and AI privacy. I provide the governance and implementation expertise needed to bridge the gap between innovation and compliance.

Deep Audit

A high-impact evaluation of your existing stack. I identify structural vulnerabilities in cloud configuration and data privacy, delivering a prioritized roadmap to remediate risk while maintaining operational velocity.

Fractional Leadership

Technical oversight for high-stakes transitions. As an embedded advisor, I provide the governance and architectural direction necessary to scale AI and cloud initiatives with absolute confidence.

Specialized Engineering

Custom solutions for mission-critical challenges. I design and deploy bespoke redaction layers and NLP pipelines for environments where off-the-shelf tools fail to meet scale or latency requirements.

The Philterd Open Source Toolkit

The open source stack behind every engagement, from low-level libraries to turnkey services. I’m the primary developer on each of these projects, so you get a stack that is auditable, vendor-neutral, and yours to extend, backed by the maintainer who built it.

Discover

Map where sensitive data lives and how it moves before remediation begins. Phinder crawls files and storage to surface PII and PHI at rest; Phield monitors data flow across the organization and alerts on anomalies; PhEye hosts the AI and NLP detection models that power both.

Evaluate

Measure policy effectiveness and prove privacy guarantees. Philter Scope scores redaction policies on precision and recall against labeled test sets so tuning decisions are data-driven. Philter Diffuse applies differential privacy to analytics, letting organizations derive insights from sensitive datasets while meeting formal privacy requirements.
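For readers unfamiliar with the metrics, the sketch below shows how precision and recall can be computed for a redaction policy against a labeled test set. It assumes spans are `(start, end, type)` tuples and counts only exact matches as correct; that matching rule and the `score` function are illustrative assumptions, not a description of how Philter Scope works internally.

```python
def score(predicted, labeled):
    """Return (precision, recall) for predicted spans against labeled spans."""
    predicted, labeled = set(predicted), set(labeled)
    true_positives = len(predicted & labeled)
    # Precision: share of redactions that were correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of labeled sensitive spans that were caught.
    recall = true_positives / len(labeled) if labeled else 0.0
    return precision, recall


labeled = [(0, 4, "NAME"), (10, 21, "SSN"), (30, 45, "EMAIL")]
predicted = [(0, 4, "NAME"), (10, 21, "SSN"), (50, 55, "NAME"), (60, 62, "AGE")]
precision, recall = score(predicted, labeled)
```

Low precision means the policy over-redacts and damages data utility; low recall means sensitive values are slipping through, which is usually the more serious failure.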

Full toolkit at philterd.ai/open-source-software.

Frequently Asked Questions

Quick answers to the questions most clients ask before our first call.

How long are typical engagements?

Anything from a one-week Deep Audit to multi-month embedded engagements. Most clients land in the 3–6 month range.

What engagement models do you offer?

Hourly advisory, fixed-scope audits, and fractional retainers (typically 1–2 days per week), all contracted through Philterd, LLC.

Can you sign an NDA?

Yes, before any technical detail is shared. Mutual NDAs are standard, and I’m happy to use yours or provide one.

What do I get at the end of an engagement?

Documentation, runbooks, and a system your own engineers can extend. The toolkit underneath is open source, so there’s no licensing or vendor dependency on me once the work ships.

Do you write code, or just advise?

Both. I pair with your developers, contribute production-grade code to your repos, and ship features alongside the team, not just diagrams and decks.

What does a first call cover?

About 30 minutes to understand your stack, current privacy posture, and where the biggest risks or unknowns sit.

My Path to Privacy Consulting

I started out as a software engineer (JVM stack, search, NLP), drawn to the problems that lived in unstructured text. That work pulled me into Apache OpenNLP, a project I’ve contributed to for fifteen years and where I now serve as PMC Chair.

The cloud era turned infrastructure into the rest of my craft. Seventeen years on AWS and GCP, sixteen AWS certifications, recognition as an AWS Community Builder, and a stint authoring questions for the certification exams themselves. Cloud engineering stopped being adjacent to my software work; it became the foundation underneath it.

Privacy followed. As cloud, AI, and NLP converged in the same pipelines, the PII risk became impossible to ignore, and the NLP techniques I’d spent a decade refining were exactly what was needed to identify and protect sensitive data at scale. Consulting is where the three threads finally lined up against a single problem, and Philterd is the open source backbone underneath the work.

See my full resume →

My Experience and Philosophy

Everything below is the background behind the consulting work above: the experience, certifications, conference talks, and open source projects that make the case for hiring me.

Deep Experience

My background isn’t just cloud; it’s software engineering and architecture. I write code daily and worked in search, AI/ML, and NLP long before they were fashionable. I have 17 years on AWS, hold every current AWS certification, am an AWS Community Builder, and have even written questions for the AWS certification exams themselves.

Knowledge Transfer & Empowerment

My goal at the end of every engagement is to make myself optional. I document decisions, write runbooks, train your team, and leave behind systems they can evolve confidently, because the best privacy platforms are the ones your engineers fully understand and can extend on their own.

My Open Source Contributions

The Philterd open source toolkit is the foundation of every privacy engagement, alongside long-running contributions to Apache OpenNLP and OpenSearch.

Phileas

Open source library that provides the redaction, anonymization, and policy-control primitives used by Philter to precisely redact PII.

Arbiter

Part of the Philterd open source toolkit, extending the privacy stack with additional capabilities for PII-aware pipelines.

Phield

Part of the Philterd open source toolkit. Phield monitors data flow across the organization and alerts on anomalies.

Apache OpenNLP

User and developer for ~15 years, and now PMC Chair. Apache OpenNLP is a solid NLP framework for Java. I’m an ASF Member.

My Previous Clients

A selection of organizations across healthcare, finance, government, e-commerce, logistics, natural resources, and data & AI where I’ve helped design, secure, and ship cloud and privacy infrastructure.

Ready to get started?

Tell me a bit about your project. I usually respond within one business day.