Setting Up Grafana with Docker 📊

Grafana is your go-to tool for building charts and dashboards that grab attention. 📊 It’s the kind of thing that gets people talking about data in a way that doesn’t feel so dull. Let’s jump into getting it up and running with Docker! 🐳 Step 1: Toss Together a Compose File 📄 The first thing you’ll need is a file called docker-compose.yml. It’s basically a recipe for Docker. 🍴 Here’s what it might look like if you’re keeping things simple: ...

December 10, 2024 · 2 min · 374 words · Wes

Spinning Up a PostgreSQL Instance for Your Data Stack

Well, we’ve done it. We have our orchestration layer set up with Airflow, a documentation server, and a notification server. Now it’s time to add a data warehouse to our stack. As I’ve mentioned before, it doesn’t really matter what you use as long as your data is centralized. Having data scattered across multiple sources only forces you to write a bunch of ETLs to bring it back together again, which is unnecessarily time-consuming. I try to keep all of my development on a single database instance locally and then commit to a single cloud DB instance for analytics. ...

November 22, 2024 · 3 min · 558 words · Wes

Airflow Setup

Well, here it is: getting Apache Airflow up and running! 🎉 To begin, you’ll want to familiarize yourself with the example docker-compose.yaml that Apache provides in the Airflow repository: 🔗 Official Docker Compose Example 🔍 Overview of the Services, Environment Variables, and Volumes The Apache-provided docker-compose.yaml file is described as: “Basic Airflow cluster configuration for CeleryExecutor with Redis and PostgreSQL.” This setup provides a fully functional Airflow instance using a CeleryExecutor, supported by Redis for task queuing and PostgreSQL for the metadata database. Below is a breakdown of each service in the stack and its role: ...

November 21, 2024 · 6 min · 1179 words · Wes

Gotify Setup

Next up on our data stack journey is the push notification server. I’ll be using multiple notification channels, hoping that I’ll pay attention to at least one when something critical goes down. Alerting is a key piece of analytics infrastructure for me: data should come to you, not the other way around. So, what are we doing here? We’re bridging the gap between the data stack and the real world by setting up a notification system to deliver push notifications straight to my phone. Once my Airflow server is up and running, I’ll create a custom function to send alerts using this server. ...

November 20, 2024 · 2 min · 415 words · Wes

BookStack Setup

Welcome to the second post in my data stack setup journey! 🎉 My goal is to hustle through setting up all the components so I can get to the good stuff: data and analytics engineering. But first, I need a documentation server, because, let’s be real, I’m already forgetting where I’ve documented things. 🙃 What is BookStack? BookStack describes itself as a “simple, self-hosted, easy-to-use platform for organizing and storing information.” I like to think of it as: a wiki that’s easy to host, simple to back up, and doesn’t make you cry. ...

November 20, 2024 · 4 min · 815 words · Wes

Let's Talk About Data Stacks

I’m in the process of spinning up a data stack on my home servers for portfolio projects, and I thought I’d share the journey. What Is a Data Stack, and Why Do I Need It? A data stack is a collection of tools, technologies, and platforms that work together to manage, process, and analyze data. If you need to move data between multiple systems into a centralized location for analytics, dashboards, reporting, or other uses, you’ll quickly encounter challenges like tech debt and messy workflows. ...

November 19, 2024 · 6 min · 1203 words · Wes