OSS Edition

Our open-source version is a streamlined, self-hosted solution designed for individuals and small teams who prioritize control over their infrastructure. It offers the essential tools to build, run, and manage data pipelines, which makes it a good fit for developers seeking flexibility with core functionality. For advanced features and enhanced scalability, Mage Pro is the next step.

Developer experience

Data Engineering Code Editor

Interactive code editor with visual feedback for immediately previewing execution results

OSS

Execute Python, SQL, and R code from a block within a pipeline

OSS

Mix and match dbt models and custom Python, SQL, or R code blocks within the same pipeline

OSS

Drag-and-drop code blocks in a visual dependency tree graph to customize the order of execution and flow of data between code blocks

OSS

View code side-by-side with the block's execution output while developing and building pipelines

Configure project, pipeline, or block level settings for limiting the data volume when running code blocks in development

New code editor with enhanced file browser, code management, and multi-row and multi-column layout

Autoscaling code execution framework for running blocks during pipeline development

Install and run VS Code extensions from the new code editor

New pipeline canvas editor for building complex graphs

Retrieval Augmented Generation (RAG) pipeline builder

AI Sidekick for creating pipelines, generating code blocks, and troubleshooting execution errors

Code Blocks

Control the flow of code executions using sensor blocks that pause a branch of code from running until a condition is met

OSS
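
As a rough illustration of the sensor idea, a sensor can be pictured as a condition that is polled until it returns true, at which point the paused branch is allowed to continue. The sketch below is plain Python and not Mage's sensor API; `wait_for`, `poll_interval`, `timeout`, and `file_has_arrived` are hypothetical names chosen for this example.

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool], poll_interval: float = 0.01,
             timeout: float = 1.0) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True  # condition met: the downstream branch may run
        time.sleep(poll_interval)
    return False  # condition never met: the branch stays paused


# Hypothetical condition that becomes true on the third check.
checks = {"count": 0}


def file_has_arrived() -> bool:
    checks["count"] += 1
    return checks["count"] >= 3


ready = wait_for(file_has_arrived)
```

In practice the condition would check something external, such as whether a file or table partition exists.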

Automatically retry block runs with a customizable number of retries, delay, maximum delay, and exponential backoff

OSS
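
To show how the retry settings above could interact, here is a minimal sketch that computes the wait before each retry. It assumes the delay doubles on every attempt and is capped at the maximum delay; the function and parameter names are hypothetical and the exact formula Mage uses may differ.

```python
from typing import List


def backoff_delays(retries: int, delay: float, max_delay: float,
                   exponential_backoff: bool = True) -> List[float]:
    """Return the wait (in seconds) before each retry attempt."""
    delays = []
    for attempt in range(retries):
        # Assumption: the delay doubles per attempt when backoff is enabled.
        wait = delay * (2 ** attempt) if exponential_backoff else delay
        delays.append(min(wait, max_delay))  # never exceed the cap
    return delays


schedule = backoff_delays(retries=5, delay=5, max_delay=60)
# schedule is [5, 10, 20, 40, 60]: the fifth wait (80) is capped at 60
```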

Search files, using full-text and natural language, across multiple projects and add code blocks to pipelines without duplicating code

Real-time Data Streaming

Build streaming pipelines that process real-time data as it arrives using no-code configurations

OSS

Execute custom Python code blocks on incoming real-time data from streaming sources

OSS

SQL Data Models

Interpolate runtime variables, environment variables, secrets, and function macros within the SQL command of a SQL code block

OSS
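
The interpolation described above can be sketched as a template-rendering step that runs before the SQL is sent to the database. The example assumes double-brace placeholders such as `{{ name }}` and uses only the Python standard library; `render_sql` and the variable names are hypothetical, and a real implementation would typically rely on a full template engine.

```python
import re
from typing import Mapping


def render_sql(template: str, variables: Mapping[str, object]) -> str:
    """Replace each {{ name }} placeholder with its value from `variables`."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for variable {name!r}")
        return str(variables[name])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


query = render_sql(
    "SELECT * FROM {{ schema }}.orders WHERE created_at >= '{{ execution_date }}'",
    {"schema": "analytics", "execution_date": "2024-01-01"},
)
```

Plain string substitution like this is only appropriate for trusted values; untrusted input should go through parameterized queries instead.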

Special SQL block connectors for DuckDB, OracleDB, Teradata, StarRocks, Couchbase, etc.

OSS

Execute SQL commands on data output from other Python, SQL, or R blocks

Upgraded developer experience for building, managing, and monitoring thousands of SQL models

Development workflow

Data Validation, Quality, and Unit Testing

Built-in data testing framework for conveniently writing tests within code blocks

OSS

Create and run unit tests in Python

OSS

Framework for validating acceptable state of data using data quality test suites

Data Orchestration

Scheduling

Configure multiple triggers for a single pipeline with different schedules, frequencies, conditions, SLAs, retries, and runtime variables

OSS

Schedule pipelines to trigger on a regular or custom interval, for example hourly, daily, weekly, monthly, or on a cron expression

OSS

Schedule pipelines to start running and complete their execution by a specific date or time of day

Backfills

Schedule, run, or re-run backfills multiple times across different windows of time

OSS
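
Running a backfill across windows of time amounts to splitting a date range into consecutive intervals and executing the pipeline once per interval. The sketch below shows only the splitting step, under the assumption of fixed-size, half-open windows; `backfill_windows` is a hypothetical helper and not part of any Mage API.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def backfill_windows(start: datetime, end: datetime,
                     step: timedelta) -> List[Tuple[datetime, datetime]]:
    """Split [start, end) into consecutive windows of at most `step`."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    windows = []
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)  # last window may be shorter
        windows.append((cursor, window_end))
        cursor = window_end
    return windows


windows = backfill_windows(
    datetime(2024, 1, 1), datetime(2024, 1, 4), timedelta(days=1)
)
# Three daily windows: Jan 1-2, Jan 2-3, Jan 3-4
```

Each window's start and end would then be passed to a pipeline run as runtime variables.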

Backfill data with dynamically generated configurations and variables at runtime using custom code blocks

Observability & monitoring

Observability

Alerting

Receive alerts through email, OpsGenie, Slack, Teams, Discord, Telegram, Google Chat, PagerDuty, etc.

OSS

Data pipeline execution runtime SLA and alerts

OSS

Customizable alert notifications and templates, per channel, for when a pipeline or block run succeeds, is cancelled, or fails

OSS

Fully integrated email alerts, with no need for an external email service provider

Custom events, metrics, and alert notification rules

Logging

Export events and logs to remote object storage locations and third-party services

OSS

Detailed structured logging for pipeline runs, pipeline triggers, and individual block runs

OSS

Find and browse the complete log history using full-text search

Performance & scaling

Performance

Dynamic Blocks

Dynamically create unique instances of a block at runtime and execute its code logic on distinct sets of input data from upstream sources

OSS

Reduce the output across all dynamically created blocks and produce a single data output for downstream consumers

OSS
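
The two features above follow a fan-out and reduce pattern: one block instance is created per item of upstream data, and the individual outputs are then merged into a single result for downstream consumers. The sketch below models this with ordinary Python functions and runs sequentially; `run_dynamic` and `reduce_output` are hypothetical names, and real dynamic blocks would be scheduled by the orchestrator rather than called in a loop.

```python
from typing import Callable, Iterable, List


def run_dynamic(items: Iterable[List[int]],
                block: Callable[[List[int]], List[int]]) -> List[List[int]]:
    """Fan out: run one instance of `block` per upstream item."""
    return [block(item) for item in items]


def reduce_output(outputs: Iterable[List[int]]) -> List[int]:
    """Reduce: merge the output of every instance into a single list."""
    merged: List[int] = []
    for output in outputs:
        merged.extend(output)
    return merged


# Hypothetical upstream output: three distinct sets of input data.
upstream_output = [[1, 2], [3], [4, 5, 6]]
child_outputs = run_dynamic(upstream_output, lambda rows: [r * 10 for r in rows])
reduced = reduce_output(child_outputs)
```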

Combine the data output from multiple dynamically created blocks and standard block types to create unique combinations of data operations

OSS

Execute 100,000+ dynamically created block runs concurrently

Stream block output data to dynamically generated blocks without waiting for the upstream parent block to finish executing

Scalability & Automation

Application upgrades and new product features are instantly installed in your environment

Reusable pipeline data output, accessible across multiple pipelines to reduce duplication, optimize compute, and ensure data consistency

Granular block settings for controlling read/write data partitions using output size, number of chunks, and item count

Batch generator framework to operate and process 1,000+ gigabytes (GB) of data without running out of memory
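
The general technique behind processing more data than fits in memory is to yield fixed-size batches from a generator, so that only one batch is held at a time. The sketch below illustrates that technique in plain Python; it is not the Mage Pro framework itself, and `batches` and `batch_size` are hypothetical names.

```python
from typing import Iterable, Iterator, List


def batches(rows: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Yield lists of at most `batch_size` rows, one batch at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[int] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []  # start a fresh list so memory use stays bounded
    if batch:
        yield batch  # final, possibly smaller, batch


# Aggregate incrementally instead of loading every row at once.
total = 0
for batch in batches(range(10), batch_size=4):
    total += sum(batch)
```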

Autoscaling orchestration scheduler for maximum pipeline trigger frequency

Automatically and intelligently scale data pipelines, both vertically and horizontally, using predictive analytics and machine learning

Customization & extensibility

Customization

© 2024 Mage Technologies, Inc.
