I am passionate about open source, NLP, building efficient data pipelines, and improving search.
I am a fully certified AWS architect and engineer. With 15 years of experience working with AWS, I can help you build, modernize, and secure your cloud applications.
I have many Google Cloud certifications and years of experience helping clients build their apps on GCP. Whether it's AI or an ecommerce site, I can help you build it.
I have helped design multi-cloud architectures for high-availability and disaster recovery.
I have been a Natural Language Processing practitioner for about 15 years. I am the chair of the Apache OpenNLP project, a Java framework for NLP methods.
I love data pipelines and have set up many. I can help you set up pipelines to ingest, analyze, and index data to make it usable and searchable.
I am passionate about protecting PII and PHI in data pipelines.
What's data without search? I have experience using OpenSearch and Elasticsearch to make data accessible and useful.
I still write code and it's still one of my favorite things to do. While helping you design your cloud applications, I also have the confidence to implement them.
If you design it, you should have a hand in building it. It shows confidence in your design.
I have helped clients adopt DevOps environments and cultures, and helped to streamline and automate their development processes.
Whether it's Terraform or CloudFormation, I believe strongly in infrastructure-as-code.
While most of my work is hands-on, I am available for advisory services around emerging technology, best practices, cloud architecture, and application development.
The technology landscape moves fast and your business success can depend on being prepared for the changes. AI/ML has clearly shown us that.
I have experience guiding companies, from startups to large corporations, both in using emerging technology now and in planning for its future adoption.
A Well-Architected review is an important part of a cloud architecture. A comprehensive review provides insights into your cloud posture and can identify areas of improvement.
I have experience performing Well-Architected reviews across many industries and would be glad to discuss a review of your cloud projects.
Great question, and one you should ask of anyone you consider hiring. Here are a couple of reasons why I think I can help you. Of course, we'll talk about it to make sure! :)
For instance, I have over 15 years of experience with AWS, going back to when S3 and SQS were about the only services. In those 15 years, I have passed all AWS certifications, been an AWS Community Builder, and written questions for the AWS certification exams. I have also published VM-based products to the AWS Marketplace.
I have worked with lots of clients across many industries, including e-commerce, natural resources, healthcare, shipping logistics, data and AI, and government. This broad experience helps me see the big picture, allowing me to better understand your current position and needs.
I value sharing my work with the community whenever possible. Whether it's by contributing to open source software or presenting at conferences, a rising tide lifts all boats. It helps others learn, gives me the opportunity to meet new people, and allows others to get to know me.
I'm always happy to spend a few minutes discussing potential projects. I only take on projects where I am confident I have the skills to meet or exceed your goals.
I believe industry certifications are an important source of motivation and a valuable way for engineers to validate and showcase their experience and skills.
My AWS Certifications
I am fully AWS certified!
My AWS Certifications transcript
My Google Cloud Certifications
Visit my Google Cloud transcript and my Google Developers profile.
Building an NLP Model Training Pipeline with Apache OpenNLP and Apache NiFi
Apache Community Over Code
October 2024 – Denver, Colorado
Apache OpenNLP and LLMs – Where does OpenNLP fit in?
Apache Community Over Code
October 2023 – Halifax, Nova Scotia, Canada
Using Apache OpenNLP with OpenSearch k-NN Vector Search
Linux Foundation Open Source Summit
May 2023 – Vancouver, Canada
What’s New and Coming in Apache OpenNLP 2.0
ApacheCon
October 2022 – New Orleans, LA
Getting the most from your OpenSearch Contributions | Recording
Amazon Web Services OpenSearchCon
September 2022 – Seattle, WA
Searching for the right words: Bringing NLP Transformers to Apache Solr via Apache OpenNLP
Linux Foundation Open Source Summit
June 2022 – Austin, TX
Applied MLOps to Maintain Model Freshness on Kubernetes
Berlin Buzzwords
June 2021 – Virtual
From Training to Serving: Machine Learning Models with Terraform
HashiTalks
March 2021 – Virtual
Protecting the Healthcare Enterprise from PHI Breaches using Streaming and NLP
Strata Data
September 2019 – New York, NY, USA
Leveraging Neural Networks and Learning-to-Rank in Document Workflows
Activate Search and AI Conference
September 2019 – Washington, DC, USA
Improving Organizational Knowledge with Natural Language Processing Enriched Data Pipelines
DataWorks Summit Washington DC
May 2019 – Washington, DC, USA
Using Sockeye Neural Machine Translation in a Streaming Pipeline
PyData Washington DC
November 2018 – McLean, Virginia, USA
Embracing Diversity: Searching Over Multiple Languages
Activate Search and AI Conference
October 2018 – Montreal, Quebec, Canada
Embracing Diversity: Implementing Multi-language Search
Haystack Search Relevance Conference
April 2018 – Charlottesville, Virginia, USA
I strongly believe in open source software. These are some of my contributions.
I have been involved with Apache OpenNLP as a user, developer, and PMC Chair for about 15 years. Apache OpenNLP is a solid NLP framework for Java. I'm an ASF Member.
Phileas is an open source library that provides many redaction capabilities. It supports redaction, anonymization, and logic controls to precisely redact PII.
I contribute to OpenSearch and am a maintainer of the User Behavior Insights component.
The Phinder PII Plugin for OpenSearch is a plugin that redacts PII from search results.
Philter is turnkey PII redaction software. Philter is also available on the cloud marketplaces.
© 2025 Jeff Zemerick. All Rights Reserved.