Kestra.io — Powerful Declarative Workflows

Jack P
6 min read · May 17, 2024

Hello World!

In today’s fast-paced digital landscape, businesses are constantly seeking ways to streamline their operations and enhance productivity. As IT environments grow more complex, efficiently coordinating and developing data pipelines, automation pipelines, and general workflows becomes more critical than ever. This is where orchestration software steps in as a game-changer. In this blog post, we will delve into Kestra, a popular orchestration tool that offers a declarative approach to building workflows and an intuitive UI that welcomes experienced developers and newcomers alike.

Let’s dive in!

What is Kestra?

Kestra is an open-source, scalable orchestration platform that enables teams to come together and build business-critical workflows. These workflows can use hundreds of built-in plugins supplied by the open-source community and the Kestra team. They are written in a declarative YAML format, and users have access to an incredible UI that lets you drag and drop “tasks” into workflows and do so much more. You can also use an embedded VS Code editor to develop directly within your Kestra instance.

What is so appealing about Kestra?

I have been reading about Kestra over the past few months, after hearing their talented product team speak at the Airbyte data conference. What stood out to me was their strong Google Cloud Platform (GCP) support, their polished and responsive UI, and their ‘language-agnostic’ approach. On top of this, the Kestra community is welcoming and questions are answered very quickly. The documentation is nothing short of excellent and improves almost daily: their Product and Developer Relations teams are always updating it and creating new blueprints, which are code examples of workflows.

Kestra has an incredible GCP plugin network. With a few lines of YAML and little coding knowledge, users can run tasks against the BigQuery API, such as loading files from a GCS bucket into BigQuery or querying data that will be used in downstream tasks.
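To give a feel for what that looks like, here is a minimal sketch of such a flow. It is not the flow from our POC. The bucket, project, dataset, and table names are hypothetical, authentication settings are left out, and the task type identifiers and the `fetchOne` property are assumptions that can differ between Kestra releases, so check them against the plugin documentation for your version.

```yaml
id: gcs_to_bigquery
namespace: company.team

tasks:
  # Load CSV files from a GCS prefix into a BigQuery table.
  - id: load_csv
    type: io.kestra.plugin.gcp.bigquery.LoadFromGcs
    from:
      - "gs://my-bucket/google_ads/*.csv"  # hypothetical bucket and path
    destinationTable: "my_project.my_dataset.google_ads_raw"  # hypothetical table
    format: CSV
    autodetect: true
    csvOptions:
      skipLeadingRows: 1

  # Query the loaded table and keep the single result row as a task output.
  - id: count_rows
    type: io.kestra.plugin.gcp.bigquery.Query
    fetchOne: true
    sql: |
      SELECT COUNT(*) AS row_count
      FROM `my_project.my_dataset.google_ads_raw`

  # Use the query result in a downstream task through a Pebble expression.
  - id: log_count
    type: io.kestra.plugin.core.log.Log
    message: "Loaded {{ outputs.count_rows.row.row_count }} rows"
```

The point of the sketch is that the load and the query are configuration rather than code: no BigQuery client library is written by hand.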

Since running a proof of concept on Kestra, I have been able to move a lot of Python code out of our current repo and into short blocks of YAML, which has simplified the process for our team. Kestra also gives users the ability to run tasks in parallel with the GCP Batch Task Runner. In other words, you can run Dockerized tasks on multiple virtual machines (VMs) at the same time, woohoo!
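A rough sketch of that pattern, assuming a `Parallel` task that wraps two Python script tasks, each configured with a GCP Batch task runner. The project, region, and bucket values are hypothetical. The task runner's type identifier is an assumption in particular: it has been renamed across Kestra releases, and its availability may depend on your Kestra version and edition.

```yaml
id: parallel_extracts
namespace: company.team

tasks:
  # Child tasks of a Parallel task run concurrently instead of one after another.
  - id: extract_in_parallel
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      - id: extract_campaigns
        type: io.kestra.plugin.scripts.python.Script
        containerImage: python:3.11-slim
        taskRunner:
          # Assumption: this identifier differs between Kestra releases
          # (earlier versions used io.kestra.plugin.gcp.runner.GcpBatchTaskRunner).
          type: io.kestra.plugin.ee.gcp.runner.Batch
          projectId: my-project  # hypothetical
          region: us-central1    # hypothetical
          bucket: my-bucket      # hypothetical
        script: |
          print("extracting campaigns")

      - id: extract_ad_groups
        type: io.kestra.plugin.scripts.python.Script
        containerImage: python:3.11-slim
        taskRunner:
          type: io.kestra.plugin.ee.gcp.runner.Batch  # same assumption as above
          projectId: my-project
          region: us-central1
          bucket: my-bucket
        script: |
          print("extracting ad groups")
```

Each script runs in its own container, so adding another extraction job is a matter of adding another child task rather than changing any scheduling code.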

The UI of Kestra is visually appealing and, more importantly, it performs very well and offers many features: searching for specific logs, viewing all executions in a central place, a really cool Gantt chart for running processes, an incredible DAG view in the Topology tab, and much more.

Kestra Execution UI

The language-agnostic approach of Kestra is also very cool! We are now in a position where we can run Python and R scripts from the same workflow with ease.
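As a small illustration, a flow that mixes the two languages might look like the sketch below. The flow and task names are made up, and the script task type identifiers are assumptions to verify against the scripts plugin documentation for your Kestra version.

```yaml
id: python_and_r
namespace: company.team

tasks:
  # First step: an inline Python script.
  - id: python_step
    type: io.kestra.plugin.scripts.python.Script
    script: |
      print("Hello from Python")

  # Second step: an inline R script in the same flow.
  - id: r_step
    type: io.kestra.plugin.scripts.r.Script
    script: |
      print("Hello from R")
```

Tasks listed at the top level run one after another, so the R step starts once the Python step has finished.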

Need inspiration on what great documentation looks like? Look no further than the Kestra docs! Our team was able to get a GCP VM running Kestra in under an hour, and we were building pipelines and workflows on the same day. Kestra also has a great set of videos and intro documentation that taught us the basics just as quickly, and we were soon playing with Pebble templating, Inputs, Variables, Outputs, Tasks, Subflows, Namespaces, and much more.
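For readers new to these concepts, here is a minimal sketch of how inputs, variables, outputs, and Pebble expressions fit together in one flow. All names and values are hypothetical, and the core task type identifiers are assumptions that have changed between Kestra releases (older versions used an `io.kestra.core.tasks` prefix).

```yaml
id: templating_basics
namespace: company.team

# Inputs are supplied when an execution starts; this one has a default value.
inputs:
  - id: report_date
    type: STRING
    defaults: "2024-05-17"

# Variables are flow-level constants shared by every task.
variables:
  bucket: my-bucket

tasks:
  # Render a value from an input and a variable using Pebble expressions.
  - id: build_path
    type: io.kestra.plugin.core.debug.Return
    format: "gs://{{ vars.bucket }}/google_ads/{{ inputs.report_date }}.csv"

  # Read the previous task's output in a downstream task.
  - id: log_path
    type: io.kestra.plugin.core.log.Log
    message: "Next file to load: {{ outputs.build_path.value }}"
```

Everything inside `{{ ... }}` is a Pebble expression that Kestra renders at runtime, which is how values move from one task to the next.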

Do you ever have trouble generating ideas for a pipeline, or working out how to build an effective task line-up? Kestra’s team generously provides numerous workflow blueprints. Whether it’s a bare-minimum pipeline that loads local CSV files into a database, or one that runs multiple Fivetran syncs and then finishes with dbt, Kestra has a blueprint for it.

Our Kestra POC

We ran our Kestra POC on our Google Ads pipeline. This extensive pipeline runs 15 different jobs, lands the data in Google Cloud Storage (GCS), and then loads it from GCS into BigQuery (BQ). Lastly, a dbt Core job is run.

For the POC, we simplified our Google Ads jobs and shrank the current setup from thousands of lines of code to only hundreds, thanks to Kestra YAML workflows being decoupled from the Python files. All we had to write in Python was the code to fetch Google Ads data from the API, validate it, and land it in GCS. We did not have to write code to run the actual load to BQ; instead, we just added a simple Kestra task type from the GCP plugin. Building all 15 working Google Ads pipeline jobs took under a day, thanks to the great documentation, helpful community, and easy declarative YAML files.

Google Ads Kestra Topology

Our pipelines ran faster on the GCP Batch service than on Cloud Run Jobs, and they used less memory and CPU. I attribute this to the simplified code and to the simpler design, in which Kestra Task Runners only poll VMs as a whole.

Development was fast too: building the actual workflow in Kestra’s YAML was easy.

Conclusion

In this blog post, I have covered the orchestration software Kestra. I have talked about how it is language-agnostic and takes a declarative approach to orchestrating your workflows and pipelines. I have commented on its extensive, easy-to-navigate documentation, as well as its incredible blueprint bank. On top of this, it has an intuitive UI where technical and non-technical users can build simple YAML workflows with the embedded VS Code editor, a local VS Code editor, or an easy-to-use drag-and-drop interface. Lastly, I have described our successful and fast POC, which was made possible by the great documentation and responsive community.

I am sold on the scalability we will have with Kestra, as well as the strong, consistent development that will keep coming from their Engineering and Product teams.

The orchestration landscape is filled with many incredible, battle-tested tools, such as Prefect, Airflow, and more. These tools all have their own strengths and weaknesses, and with the right development they can be very effective, but I found that for my team’s specific needs Kestra is going to:

  • help us scale our pipelines with minimal development intervention, thanks to the new, efficient Task Runners and the GCP Batch option
  • increase the speed and efficiency of developing our pipelines and workflows, thanks to the intuitive UI and the decoupling of workflows from our Python extraction code
  • bring my team together in a UI that is easy to understand and easy to develop in
  • give us the option to easily integrate other languages, such as R, into our stack

Thank you for reading, happy coding, and most importantly…

Go check out Kestra!
