Blog

HashiTalks 2021

At HashiTalks 2021 I presented a Terraform project that manages the training and serving of NLP models. Built on AWS using ECS, S3, DynamoDB, SQS, Lambda, and EventBridge, the project automates containerized NLP model training. You can queue a model for training by describing the model […]

I am a 2021 AWS Community Builder for Machine Learning

I was selected as a 2021 AWS Community Builder for machine learning! The AWS Community Builders program selects participants to help share AWS knowledge and resources with the community through engagements such as blog posts, code, and videos. I was selected for machine learning, so you can expect to see some upcoming machine learning on […]

The Fallacy of Avoiding Cloud Vendor Lock-In

For over 10 years I have worked with many companies to help them either migrate to the cloud or develop new cloud applications. A very common requirement is that the designed architecture avoid using any vendor-specific cloud technologies or services. The rationale is usually that although we are running our application on vendor X […]

Querqy Chorus

For the past couple of months I have attended occasional presentations about Chorus, an open source stack for search, created by Querqy. The presentations have focused on the stack's components, including Apache Solr, SMUI (Search Management UI), and the search relevancy tool Quepid. There are a decent number of search-related open source projects out […]

Some First Steps for a New NiFi Cluster

After installing Apache NiFi there are a few steps you might want to take before making your cluster available for prime time. None of these steps are required, so make sure they are appropriate for your use case before implementing them. Lowering NiFi’s Log File Retention Properties: By default, Apache NiFi’s nifi-app.log files are capped at […]

A Tool for Every Data Engineer’s Toolbox

Collecting data from edge devices in manufacturing, processing medical records from electronic health systems, and analyzing text all sound like very different problems, each requiring a unique solution. While that certainly is true, these tasks have some things in common. Each task requires a scalable method of data ingestion, predictable performance, and capabilities for […]