Sean Knapp, CEO of Ascend.io, discusses data pipelines and data pipeline automation. Sean spoke with host Robert Blumen about the ubiquity of data pipelines; what data pipelines do; where the data comes from, how it is transformed, and where it goes; and what it is used for (analytics, machine learning, reporting, alerting, business intelligence). They also covered semi-automated and ad hoc automation; costly manual recovery from failure modes; partial failures and bulk redo; pipeline automation and why to automate; the orchestration layer and its architecture; what type of state the orchestration layer keeps; failure modes and optimizing redo; monitoring pipelines; privacy and pipelines; and pipeline automation as a service.
Show Notes
Related Links
- SE Radio 198: Wil van der Aalst on Workflow Management
- SE Radio 351: Bernd Rücker on Orchestrating Microservices with Workflow Management
- SE Radio 289: James Turnbull on Declarative Programming
- Ascend.io
- Sean Knapp on Twitter, LinkedIn
- Rebuilding Reliable Data Pipelines Through Modern Tools by Ted Malaska
- Data Pipelines with Apache Airflow by Bas Harenslak and Julian de Ruiter
- Pipeline Driven by Roy Osherove
SE Radio theme: “Broken Reality” by Kevin MacLeod (incompetech.com — Licensed under Creative Commons: By Attribution 3.0)