Jeff Zemerick

Senior Engineer | Cloud, NLP & Search

25+ years designing high-stakes distributed systems, from NASA flight software and the FBI’s N-DEx to Fortune 500 enterprises and high-growth startups.

25+ Years Experience
17x AWS Certified
8x GCP Certified
17+ Conference Talks

Summary

I’m a senior engineer specializing in cloud architecture, NLP, and search, with over 25 years of experience designing high-stakes distributed systems. My career runs from mission-critical government work, including software engineering and verification for NASA’s Mars “Curiosity” rover and software development for the FBI’s N-DEx project, to Fortune 500 cloud transformations, data engineering, search relevance, and open-source leadership.

My work centers on multi-cloud architectures on AWS and GCP, scalable data pipelines, natural language processing, and search engineering. I hold 17 AWS certifications and 8 Google Cloud certifications, and I’ve been recognized as an AWS Community Builder. As PMC Chair of Apache OpenNLP and an Apache Software Foundation Member, I actively shape one of the most widely used NLP frameworks in the Java ecosystem.

Across every role I write production code, pair with engineering teams, and leave behind systems that are documented, tested, and fully owned by the people who inherit them.

Areas of Expertise

Cloud Architecture

Secure, highly available, scalable multi-cloud architectures on AWS and GCP. Infrastructure as Code, HIPAA-compliant systems, DevSecOps, and Well-Architected governance.

Data Engineering

Scalable data pipelines using Apache NiFi, Kafka, Flink, Spark, and HDFS. End-to-end ingest, transformation, and delivery for analytics and ML workloads.

Natural Language Processing

NLP system design and model development with Apache OpenNLP, ONNX Runtime, and deep-learning models. Named-entity recognition, document classification, and ML pipeline automation.

Search Engineering

Search relevance engineering with OpenSearch, Elasticsearch, and Apache Solr. Learning-to-rank models, vector search, NLP-enhanced pipelines, and multi-language search.

Data Governance

Data governance frameworks, classification, retention policies, and compliance engineering (HIPAA, GDPR, CCPA). Automated data discovery and stewardship at scale.

Open Source Leadership

PMC Chair, Apache OpenNLP; ASF Member; OpenSearch UBI maintainer; AWS Community Builder. Speaker at international conferences on NLP, search, and cloud infrastructure.

Previous Clients

A selection of organizations across healthcare, finance, government, e-commerce, logistics, natural resources, and data & AI.

Professional Experience

  • Independent Consultant January 2019 – Present

    Mountain Fog, Inc.

    • Designing and implementing highly available, secure cloud architectures on AWS and Google Cloud for clients ranging from high-growth startups to Fortune 500 enterprises across healthcare, natural resources, shipping, e-commerce, intelligence, and government.
    • Building efficient, maintainable data pipelines using open-source technologies.
    • Implementing search improvements through vector search, NLP, hybrid search, and data quality work.
    • Building NLP pipelines for entity recognition, document classification, and text processing using Apache OpenNLP and ONNX Runtime.
    • Building components for OpenSearch agentic search relevance and search engine migrations.
    • Emphasizing documentation and Infrastructure as Code to support knowledge transfer and long-term team ownership.
  • Founder April 2023 – Present

    Philterd, LLC

    • Founded an open-source software company; built and published software adopted in commercial products including Graylog and available on major cloud marketplaces.
    • Trained and fine-tuned NLP models and large language models (LLMs); constructed synthetic datasets for model evaluation and testing.
    • Built automated data pipelines for continuous model training, evaluation, and deployment.
    • Software implemented in Java, Python, .NET, and Go.
  • Search Relevance Engineer June 2021 – November 2022

    OpenSource Connections, Charlottesville, VA

    • Improved the performance and relevance of search systems for enterprise clients across multiple verticals.
    • Provided search relevance training; participated in client discovery engagements to assess existing search architectures.
    • Contributed to open-source projects including ONNX model support for Apache OpenNLP.
    • Defined search KPIs, built offline search labs, and developed learning-to-rank (LTR) models.
  • Cloud Architect September 2017 – January 2019

    M&S Consulting, Inc.

    • Agile DevOps lead for an AWS infrastructure team of 15 engineers building commercial healthcare applications.
    • Led corporate transition to a DevSecOps approach; responsible for designing HIPAA-compliant cloud architectures.
    • Stack included microservices, API gateways, Apache Kafka, Flink, ZooKeeper, ActiveMQ, Hive, Spark, and HDFS.
  • Big-Data Engineer April 2017 – August 2017

    TeleTracking, Inc., Pittsburgh, PA

    • Designed and built end-to-end ingest systems to move data from edge devices into analytic environments.
    • Leveraged Apache NiFi and MiNiFi for complex data flow management; contributed features back to both projects.
    • Championed AWS CloudFormation to automate deployment of a layered, scalable data processing stack.
  • Sr. Software Engineer December 2013 – April 2017

    Leidos, Morgantown, WV

    • Proposed and designed cloud migration architectures and cost estimates for government clients, transitioning legacy systems to AWS and OpenStack.
    • Established and led weekly internal training on AWS for a team of engineers.
    • Served as lead engineer for risk-reduction efforts, developing early-stage prototypes to prove viability and secure stakeholder buy-in.
    • Built cloud-based systems using Go, Java, Spring Boot, Hibernate, and RESTful APIs.
  • Sr. Software Engineer November 2010 – December 2013

    Raytheon / SRA International, Clarksburg, WV

    • Provided software development, operations support, and maintenance for the FBI’s N-DEx project.
    • Engineered new system features and resolved critical defects to ensure mission-critical stability.
    • Designed and implemented Service-Oriented Architecture (SOA) solutions using SOAP and REST web services.
    • Stack included Java, Spring MVC, Hibernate, Oracle databases, JBoss Application Servers, and DataPower.
  • System Engineer April 2009 – November 2010

    Northrop Grumman / TASC, Inc., Fairmont, WV

    • Provided software engineering and verification for NASA manned and scientific missions, ensuring safety and reliability of flight systems.
    • Performed requirements and system analysis; used static and dynamic source code analysis to identify vulnerabilities in flight software.
    • Engineered custom analysis tools on the Eclipse platform to streamline verification and to research dynamic code execution.
    • Honored with the NASA IV&V Special Act Award for technical contributions to the Mars Science Laboratory (MSL) “Curiosity” rover mission.
  • Founder & Software Developer May 2001 – December 2010

    Zemerick Software, Inc.

    • Designed and built specialized instant messaging management software serving over 10,000 customers worldwide, including Fortune 500 companies and government agencies.
    • Developed mission-critical monitoring and forensic logging tools used by parents, schools, and Internet Crimes Against Children (ICAC) law enforcement agencies.
    • Architected the products on the full Microsoft stack: .NET (C# and VB.NET), ASP.NET, SQL Server, and Windows Server.

Technical Skills

Programming Languages

Java, Spring Boot, Python, Go, Scala, C# / .NET, Bash

Cloud & Infrastructure

AWS, Google Cloud, Terraform, Kubernetes, Docker, CloudFormation

Data & Streaming

Apache Kafka, Apache NiFi, Apache Flink, Apache Spark, HDFS, Apache Hive

Search & NLP

OpenSearch, Elasticsearch, Apache Solr, Apache OpenNLP, ONNX, Learning to Rank

Community & Open Source

Apache OpenNLP | PMC Chair

I serve as PMC Chair of Apache OpenNLP and as an Apache Software Foundation Member. I’ve contributed to the project for over 15 years, shaping release cadence, mentoring committers, and stewarding one of the most widely used NLP frameworks in the Java ecosystem.

OpenSearch | UBI Maintainer

I maintain the OpenSearch UBI (User Behavior Insights) project, which enables search relevance teams to capture and analyze implicit user feedback for learning-to-rank and A/B testing. I’m an OpenSearch member and active contributor to the project.

Open Source Software

I founded Phileas, an open-source text processing library available in Java, Python, and Go, adopted in commercial products including Graylog. I contribute to a broad range of NLP, search, and data infrastructure projects.

AWS Community Builder

Recognized as an AWS Community Builder. I’ve authored questions for AWS certification exams and published commercial products to the AWS Marketplace, staying close to both the practitioner community and AWS engineering standards.

Conference Presentations

A selection of talks at international conferences on NLP, search, cloud, and AI.

Outside of Work

I’m a wilderness guide at Deliberate Pace Guiding, where I lead backcountry trips.