I am passionate about open source, NLP, building efficient data pipelines, and improving search.
I am a fully certified AWS architect and engineer. With 15 years of experience working with AWS, I can help you build, modernize, and secure your cloud applications.
I have many Google Cloud certifications and years of experience helping clients build their apps on GCP. Whether it's AI or an ecommerce site, I can help you build it.
I have helped design multi-cloud architectures for high-availability and disaster recovery.
I have been a Natural Language Processing practitioner for about 15 years. I am the chair of the Apache OpenNLP project, a Java framework for NLP methods.
I love data pipelines and have set up many. I can help you set up pipelines to ingest, analyze, and index data to make it usable and searchable.
I am passionate about protecting PII and PHI in data pipelines.
What's data without search? I have experience with OpenSearch and Elasticsearch making data accessible and useful.
I still write code and it's still one of my favorite things to do. While helping you design your cloud applications, I also have the confidence to implement them.
If you design it, you should have a hand in building it. It shows confidence in your design.
I have helped clients adopt DevOps environments and cultures, and helped to streamline and automate their development processes.
Whether it's Terraform or CloudFormation, I believe strongly in infrastructure-as-code.
While most of my work is hands-on, I am available for advisory services around emerging technology, best practices, cloud architecture, and application development.
The technology landscape moves fast and your business success can depend on being prepared for the changes. AI/ML has clearly shown us that.
I have experience guiding companies, from startups to large corporations, in both using emerging technology now and planning for its future adoption.
A Well-Architected review is an important part of a cloud architecture. A comprehensive review provides insights into your cloud posture and can identify areas of improvement.
I have experience performing Well-Architected reviews across many industries and would be glad to discuss a review of your cloud projects.
Great question, and one you should ask of anyone you consider hiring. Here are a couple of reasons why I think I can help you. Of course, we'll talk about it to make sure! :)
For instance, I have over 15 years of experience with AWS, going back to when S3 and SQS were about the only services. In those 15 years, I have passed all AWS certifications, been an AWS Community Builder, and written questions for the AWS certification exams. I have also published VM-based products to the AWS Marketplace.
I have worked with many clients across many industries, including e-commerce, natural resources, healthcare, shipping logistics, data and AI, and government, among others. This broad experience helps me see the big picture, allowing me to better understand your current position and needs.
I value sharing my work with the community whenever possible. Whether it's by contributing to open source software or presenting at conferences, a rising tide lifts all boats. It helps others learn, gives me the opportunity to meet new people, and allows others to get to know me.
I'm always happy to spend a few minutes discussing potential projects. I only take on projects where I am confident I have the skills to meet or exceed your goals.
I am Jeff, an independent engineer and consultant. I am a certified cloud architect with a background in software engineering, DevOps, big data, and natural language processing. I have earned many AWS and Google Cloud certifications. I am a software engineer at heart, but I enjoy everything from cloud architecture to data and search and, of course, AI/ML.
I am available for hire as a consultant. I can help with your cloud (AWS/Google Cloud) architecture, software development, data pipelines, search, and NLP projects.
I am proud to hold many AWS and Google Cloud certifications. Here are my certifications. I obtained my first AWS certifications in 2014. Back then the certifications had issue numbers! I am now fully AWS certified, having achieved all certifications.
I have been an AWS SME (starting in February 2020) and an AWS Lead SME, which means I helped contribute to the development of the AWS certification exams. I have found there is no better way to learn, because writing wrong answers requires an in-depth knowledge of AWS! I "retired" as an AWS SME in 2023 to let others share in the rewarding opportunity.
I was an AWS Community Builder for Machine Learning in 2021, 2022, and 2023.
It's one thing to have good data, but it's only as good as your ability to efficiently retrieve it. I help folks make search better, whether it's Amazon OpenSearch, Elasticsearch, or something else. Check out my work on User Behavior Insights and in OpenSearch.
I have worked in many industries, including government, security, healthcare, logistics, and education. I first started with cloud around 2008, when AWS was still new.
My interest in natural language processing was sparked in 2012 when I was working with unstructured data and learning about the challenges in making it useful. That led me to the Apache OpenNLP project, where I started contributing. Today, I am the chair of the Apache OpenNLP PMC (which just means I submit the project reports ;). I owe the Apache OpenNLP team a lot of gratitude for their help and guidance over the years. I added ONNX Runtime support to Apache OpenNLP to facilitate the use of large language models from Java. Check out the blog post I wrote on the Microsoft website!
As a programmer, I started with QBasic, QuickBasic, and Visual Basic for DOS. I picked up C++ through Visual Studio 6 before moving to .NET. I moved to mostly Java around 2010 and have been there since. I also develop in Python and Go, and have experience with Scala. I do a lot of Bash, CloudFormation, and Terraform scripting and use Linux almost exclusively. Ubuntu has been my first pick since it was versioned using single digits and mailed out on CD-ROMs!
Check out my conference presentations below for a general idea of my favorite tools and areas.
Today, I do consulting primarily in the areas of cloud, data, search, and NLP through my company Mountain Fog. I enjoy tackling challenges around data ingestion, search, and AI/ML because those tasks often fall directly in the intersections of my primary interests.
I believe search is often overlooked yet very important because without the ability to efficiently locate data, the data is worthless. Vector search is now adding a new dimension to search - pun totally intended.
I am the maintainer of the Amazon OpenSearch User Behavior Insights plugin, and I try to be active in the OpenSearch community.
I created and maintain the Phileas project, along with the Philter software, under Philterd, LLC. Redacting PII and PHI from text is very challenging but equally important.
My Work
Through Mountain Fog, I offer consulting services in the areas of cloud (AWS/GCP), data pipelines, search, and natural language processing (NLP).
I started Philterd to provide AI-powered software to redact PII. Learn more at www.philterd.ai.
Open Source
I am passionate about open source and try to contribute whenever possible. Where I'm most active:
Apache OpenNLP - I am a committer and PMC chair of the Java NLP library.
OpenSearch User Behavior Insights Plugin - I am a maintainer and developer of the OpenSearch plugin for search relevance.
Phileas - I am the creator and owner of the Java library for finding and redacting PII.
I have also published a number of other PII-focused open source projects on GitHub.
I believe industry certifications offer important encouragement and a valuable way for engineers to validate and showcase their experience and skills.
My AWS Certifications
I am fully AWS certified!
My AWS Certifications transcript
My Google Cloud Certifications
Visit my Google Cloud transcript and my Google Developers profile.
Building an NLP Model Training Pipeline with Apache OpenNLP and Apache NiFi
Apache Community Over Code
October 2024 - Denver, Colorado
Apache OpenNLP and LLMs – Where does OpenNLP fit in?
Apache Community Over Code
October 2023 - Halifax, Nova Scotia, Canada
Using Apache OpenNLP with OpenSearch k-NN Vector Search
Linux Foundation Open Source Summit
May 2023 – Vancouver, Canada
What’s New and Coming in Apache OpenNLP 2.0
ApacheCon
October 2022 – New Orleans, LA
Getting the most from your OpenSearch Contributions | Recording
Amazon Web Services OpenSearchCon
September 2022 – Seattle, WA
Searching for the right words: Bringing NLP Transformers to Apache Solr via Apache OpenNLP
Linux Foundation Open Source Summit
June 2022 – Austin, TX
Applied MLOps to Maintain Model Freshness on Kubernetes
Berlin Buzzwords
June 2021 – Virtual
From Training to Serving: Machine Learning Models with Terraform
HashiTalks
March 2021 – Virtual
Protecting the Healthcare Enterprise from PHI Breaches using Streaming and NLP
Strata Data
September 2019 – New York, NY USA
Leveraging Neural Networks and Learning-to-Rank in Document Workflows
Activate Search and AI Conference
September 2019 – Washington, DC, USA
Improving Organizational Knowledge with Natural Language Processing Enriched Data Pipelines
DataWorks Summit Washington DC
May 2019 – Washington, DC, USA
Using Sockeye Neural Machine Translation in a Streaming Pipeline
PyData Washington DC
November 2018 – McLean, Virginia, USA
Embracing Diversity: Searching Over Multiple Languages
Activate Search and AI Conference
October 2018 – Montreal, Quebec, Canada
Embracing Diversity: Implementing Multi-language Search
Haystack Search Relevance Conference
April 2018 – Charlottesville, Virginia, USA
I strongly believe in open source software. These are some of my contributions.
I have been involved with Apache OpenNLP as a user, developer, and PMC Chair for about 15 years. Apache OpenNLP is a solid NLP framework for Java. I'm an ASF Member.
Phileas is an open source library that provides many redaction capabilities. It supports redaction, anonymization, and logic controls to precisely redact PII.
I contribute to OpenSearch and am a maintainer of the User Behavior Insights component.
The Phinder PII Plugin for OpenSearch is a plugin that redacts PII from search results.
Philter is turnkey PII redaction software. Philter is also available on the cloud marketplaces.
© 2025 Jeff Zemerick. All Rights Reserved.