March 19, 2025

Born Again Release 😈🦸‍♂️

Welcome to the Born Again release, a significant update that brings new life to your data engineering workflows. This release is packed with exciting features designed to enhance your data integration, streaming capabilities, and pipeline resilience.

🚀 Simplified and Enhanced Spark Integration

This release simplifies PySpark integration. Apache Spark is a unified analytics engine for large-scale data processing, and it lets data engineers query, analyze, and process big data across many sources from languages such as Python, Scala, and R. The enhancements also make it easier for data engineering and data science teams to collaborate and to integrate machine learning solutions with Spark.

📜 Doc: https://docs.mage.ai/integrations/compute/spark-pyspark#mage-pro:-effortless-pyspark-integration

🚀 Enhanced BigQuery Sink in Streaming Pipeline

Our BigQuery sink for streaming pipelines has been significantly upgraded, offering improved performance and reliability for real-time data ingestion into Google BigQuery.

🔄 OracleDB CDC Integration

Harness the power of Change Data Capture (CDC) with our new OracleDB integration. Capture and replicate data changes in real time, so your downstream systems always have the most up-to-date information.

📜 Doc: https://docs.mage.ai/guides/streaming/sources/oracledb#oracledb
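
To make the idea of CDC concrete, here is a minimal sketch of how a downstream consumer might apply a stream of change events to keep a replica in sync. The event shape (`op`, `key`, `row`) and the `apply_change` function are hypothetical and chosen for illustration; they are not the format emitted by the OracleDB source.

```python
from typing import Any, Dict

# Hypothetical change-event shape, for illustration only; the real
# OracleDB source may emit a different structure.
ChangeEvent = Dict[str, Any]


def apply_change(replica: Dict[int, Dict[str, Any]], event: ChangeEvent) -> None:
    """Apply one insert/update/delete event to an in-memory replica keyed by primary key."""
    op = event["op"]
    key = event["key"]
    if op in ("insert", "update"):
        # Upsert: merge the new column values over whatever the replica holds.
        current = replica.get(key, {})
        replica[key] = {**current, **event["row"]}
    elif op == "delete":
        replica.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op!r}")


replica: Dict[int, Dict[str, Any]] = {}
events = [
    {"op": "insert", "key": 1, "row": {"name": "Ada", "city": "London"}},
    {"op": "update", "key": 1, "row": {"city": "Paris"}},
    {"op": "insert", "key": 2, "row": {"name": "Linus", "city": "Helsinki"}},
    {"op": "delete", "key": 2},
]
for event in events:
    apply_change(replica, event)
```

Replaying the events in order leaves the replica with only row 1, carrying the updated city. This is why ordered delivery matters for CDC consumers.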

🛡️ Pipeline-Level Retry Configuration

Take control of your pipeline's resilience with our new pipeline-level retry configuration. Set up comprehensive retry policies to handle transient failures and ensure your data workflows remain robust and reliable.

📜 Doc: https://docs.mage.ai/orchestration/pipeline-runs/retrying-block-runs#3-pipelne-level-retry-config
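
As a rough illustration of what a retry policy does, the sketch below retries a failing task with an optional exponential backoff. The function `run_with_retries` and its parameters (`retries`, `delay`, `max_delay`, `exponential_backoff`) are assumptions made for this example, not the platform's configuration API; see the linked doc for the actual settings.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    retries: int = 3,
    delay: float = 1.0,
    max_delay: float = 30.0,
    exponential_backoff: bool = True,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying up to `retries` times after failures (hypothetical helper)."""
    attempt = 0
    while True:
        try:
            return task()
        except Exception:
            if attempt >= retries:
                raise  # Retry budget exhausted: surface the last error.
            # Double the wait on each attempt when backoff is enabled, capped at max_delay.
            wait = delay * (2 ** attempt) if exponential_backoff else delay
            sleep(min(wait, max_delay))
            attempt += 1


calls = {"count": 0}
waits: List[float] = []


def flaky_task() -> str:
    """Fail twice with a transient error, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


# Record the waits instead of actually sleeping, so the demo runs instantly.
result = run_with_retries(flaky_task, retries=3, delay=1.0, sleep=waits.append)
```

In this run the task succeeds on its third call, after waits of 1 and 2 seconds. A task that keeps failing would raise its last error once the retry budget is spent.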

📈 Stateful Streaming Pipeline

Elevate your streaming data processing with our new stateful streaming pipeline feature. Maintain and update state across multiple events, enabling complex event processing and aggregations over time.

📜 Doc: https://docs.mage.ai/guides/streaming/tutorials/streaming-stateful-store
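
The core idea of stateful processing can be sketched in a few lines: state outlives any single batch of messages, so each new event updates an aggregate rather than starting from scratch. The `RunningTotals` class, the `process_batch` function, and the message fields (`user_id`, `amount`) are hypothetical names for illustration; the linked tutorial describes the actual state store.

```python
from collections import defaultdict
from typing import Any, Dict, List


class RunningTotals:
    """Hypothetical in-memory state store: a running count and sum per key."""

    def __init__(self) -> None:
        self._state: Dict[str, Dict[str, float]] = defaultdict(
            lambda: {"count": 0, "total": 0.0}
        )

    def update(self, key: str, amount: float) -> Dict[str, Any]:
        """Fold one event into the state and return the updated aggregate."""
        entry = self._state[key]
        entry["count"] += 1
        entry["total"] += amount
        return {"key": key, "count": entry["count"], "total": entry["total"]}


def process_batch(store: RunningTotals, messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Process one batch of messages against state shared across batches."""
    return [store.update(m["user_id"], m["amount"]) for m in messages]


store = RunningTotals()
first = process_batch(
    store,
    [{"user_id": "a", "amount": 10.0}, {"user_id": "b", "amount": 5.0}],
)
second = process_batch(store, [{"user_id": "a", "amount": 2.5}])
```

The second batch sees the state left behind by the first, so user `a` ends up with a count of 2 and a total of 12.5. A production store would also need to persist state across restarts, which this in-memory sketch does not attempt.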

🕒 Window Aggregation

Implement sophisticated time-based analytics with our window aggregation feature for streaming pipelines. Perform calculations over specified time intervals, unlocking new possibilities for real-time data analysis.

📜 Doc: https://docs.mage.ai/guides/streaming/tutorials/streaming-stateful-store
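
One common form of window aggregation is the tumbling window, where events are grouped into fixed, non-overlapping time intervals. The sketch below is a plain-Python illustration under that assumption; the function name `tumbling_window_sums` and the `(timestamp, value)` event shape are hypothetical, not the platform's API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_window_sums(
    events: Iterable[Tuple[int, float]], window_seconds: int
) -> Dict[int, float]:
    """Sum event values per fixed window, keyed by the window's start timestamp."""
    sums: Dict[int, float] = defaultdict(float)
    for timestamp, value in events:
        # Round the timestamp down to the start of its window.
        window_start = timestamp - (timestamp % window_seconds)
        sums[window_start] += value
    return dict(sums)


# (epoch_seconds, value) pairs; timestamps are kept small for readability.
events = [(0, 1.0), (30, 2.0), (59, 3.0), (60, 4.0), (125, 5.0)]
sums = tumbling_window_sums(events, window_seconds=60)
```

With 60-second windows, the first three events fall into the window starting at 0, and the remaining two land in the windows starting at 60 and 120. Sliding or session windows would need a different bucketing rule.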

📚 DataHub Data Catalog Integration (Work in Progress)

We're working on integrating DataHub into our platform, aiming to provide comprehensive data cataloging and discovery capabilities. This will empower you to manage metadata more effectively and streamline data workflows.

⚙️ Reusable IO Config (Work in Progress)

We're developing a unified IO configuration system that will work seamlessly across standard batch pipelines, data integration pipelines, and streaming pipelines. This will reduce redundancy and enhance workflow efficiency.

This release marks a significant step forward in our mission to provide you with the most powerful and flexible data engineering tools. We're committed to helping you build robust and efficient data workflows that can withstand any challenge. Stay tuned for more updates!
