Specifications

From core capabilities to advanced features

Developer experience

Code Editor

Interactive code editor with visual feedback for immediately previewing execution results

Execute Python, SQL, and R code from a block within a pipeline

Mix and match dbt models and custom Python, SQL, or R code blocks within the same pipeline

Drag-and-drop code blocks in a visual dependency tree graph to customize the order of execution and flow of data between code blocks

View code side-by-side with the block's execution output while developing and building pipelines

Configure project-, pipeline-, or block-level settings to limit data volume when running code blocks in development

New code editor with enhanced file browser, code management, and multi-row and multi-column layout

Autoscaling code execution framework for running blocks during pipeline development

Install and run VS Code extensions from the new code editor

New pipeline canvas editor for building complex graphs

Retrieval Augmented Generation (RAG) pipeline builder

AI Sidekick for creating pipelines, generating code blocks, and troubleshooting execution errors

Code Blocks

Block types: Custom, Source, Destination, Transform, Sensor, Scratchpad, dbt, Markdown

Automatically retry block runs with customizable number of retries, delay, maximum delay, and exponential backoff

Control the flow of code execution using sensor blocks that pause a branch of the pipeline until a condition is met (see the sketch after this list)

Search files, using full-text and natural language, across multiple projects and add code blocks to pipelines without duplicating code
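
To illustrate the sensor pattern above, here is a minimal sketch of a sensor block in Python. The decorator import mirrors the pattern used in Mage's open-source block templates, and the readiness probe is a hypothetical placeholder; verify the details against your version.

```python
# Sensor block sketch: downstream blocks in this branch stay paused
# until this function returns True.
if 'sensor' not in globals():
    from mage_ai.data_preparation.decorators import sensor


@sensor
def check_condition(*args, **kwargs) -> bool:
    # Hypothetical readiness probe; swap in a real check such as an
    # API call, a warehouse row count, or an object-store listing.
    import os
    return os.path.exists('/data/incoming/orders_ready.flag')
```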

Data Integrations

Data integrations with sources and destinations from 100+ third-party services

Data connectors and integrations with data lakes using Apache Iceberg table format

Data syncs run 12-18x faster thanks to optimized concurrent reads and writes with high throughput and capacity

Run no-code data integrations alongside custom code blocks together within a single pipeline

Change data capture (CDC) with select databases

Create custom data integration sources and destinations without changing source code

Real-time Data Streaming

Build streaming pipelines that process real-time data as it arrives using no-code configurations

Execute custom Python code blocks on incoming real-time data from streaming sources
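
A sketch of the custom-code hook for streaming: a transformer receives each micro-batch of messages and returns the (possibly modified) batch. The import pattern follows Mage's open-source block templates; the message fields are assumptions for the example.

```python
from typing import Dict, List

if 'transformer' not in globals():
    from mage_ai.data_preparation.decorators import transformer


@transformer
def transform(messages: List[Dict], *args, **kwargs) -> List[Dict]:
    # Runs on each batch of messages as it arrives from the streaming
    # source (e.g. Kafka). 'amount_cents' is a hypothetical field.
    for message in messages:
        message['amount_usd'] = round(message.get('amount_cents', 0) / 100, 2)
    return messages
```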

SQL Data Models

Interpolate runtime variables, environment variables, secrets, and function macros within the SQL command of a SQL code block (see the sketch after this list)

Special SQL block connectors for DuckDB, OracleDB, Teradata, StarRocks, Couchbase, etc.

Execute SQL commands on data output from other Python, SQL, or R blocks

Upgraded developer experience for building, managing, and monitoring thousands of SQL models
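
In a SQL block, interpolation uses Jinja-style syntax such as {{ variables('env') }}. The same runtime variables reach Python blocks through kwargs; a minimal sketch, with the variable names as assumptions:

```python
if 'data_loader' not in globals():
    from mage_ai.data_preparation.decorators import data_loader


@data_loader
def load_data(*args, **kwargs):
    # Runtime variables defined on the trigger arrive in kwargs;
    # 'env' and 'row_limit' are hypothetical names for this sketch.
    env = kwargs.get('env', 'dev')
    row_limit = kwargs.get('row_limit', 1000)
    print(f'Loading up to {row_limit} rows from the {env} environment')
    return []
```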

dbt Integrations

Develop, build, test, run, document, manage, and monitor dbt models

Upgraded dbt developer experience with a new, more flexible user interface

Manage multiple dbt projects, with different remote repositories, across multiple Mage projects from a single application

Development workflow

Code Templates

Customizable pipeline and block templates on a team-by-team basis, reducing development time and production errors

Use 100+ predefined boilerplate templates for loading, transforming, and exporting data, or reuse existing code blocks

Data Validation, Quality, and Unit Testing

Create and run unit tests in Python

Built-in data testing framework for conveniently writing tests within code blocks (sketched below)

Framework for validating acceptable state of data using data quality test suites
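
The built-in testing framework works by decorating functions with @test in the same file as the block's code; each test receives the block's output after it runs. A minimal sketch, with the column name as an assumption:

```python
if 'test' not in globals():
    from mage_ai.data_preparation.decorators import test


@test
def test_no_null_ids(output, *args) -> None:
    # Runs automatically after the block executes; 'id' is a
    # hypothetical column for this sketch.
    assert output is not None, 'Block produced no output'
    assert not output['id'].isnull().any(), 'Found null ids'
```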

Development Environment

Built-in documentation using a native tagging system and Markdown blocks to document code blocks within a pipeline

Multiple Python virtual environments for development and code block execution

Automatically format code style and run linters to fix syntax issues

Real-time collaboration with commenting, assignable action items, and reminder notes

Customize workspace themes, personalize accounts, and re-configure UI component layouts

Manage multiple Mage projects and any other files from a single application

Global search bar, page launcher, browser-history navigation, and command shortcuts across the application

Production Environment

Multi-tenant support with granular access controls and isolated workspaces

Customizable project and pipeline settings per environment

Launch multiple environments and manage shared team workspaces

Integrate your existing CI/CD custom build jobs and deployment steps into Mage Pro

Automatically deploy new code changes and data pipelines to different environments

Versioning

Terminal for running console commands

Built-in version control application

Git terminal interface with built-in authentication and shortcuts

Remote repository import tool for easily migrating or syncing existing projects

Local file edit tracking and version history for restoring past changes

Data version control for data pipeline versioning

Data Orchestration

Scheduling

Schedule pipelines to trigger on a regular or custom interval, e.g. hourly, daily, weekly, monthly, or via a CRON expression

Schedule pipelines to start running and complete execution by a specific date or time of day

Configure multiple triggers for a single pipeline with different schedules, frequency, conditions, SLAs, retries, and runtime variables

Triggering

Trigger pipelines from code and optionally wait until completion before proceeding (see the sketch after this list)

Trigger pipelines based on events from external 3rd party services; e.g. object deleted from Azure Blob Storage

Trigger pipelines across different projects and workspaces

Share and reuse a single trigger across different pipelines
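
Triggering from code uses a helper exposed by the open-source project; this sketch assumes a hypothetical downstream pipeline named 'daily_metrics', so confirm the signature against your version.

```python
from mage_ai.orchestration.triggers.api import trigger_pipeline

if 'custom' not in globals():
    from mage_ai.data_preparation.decorators import custom


@custom
def kick_off_downstream(*args, **kwargs):
    # 'daily_metrics' is a hypothetical pipeline UUID for this sketch.
    trigger_pipeline(
        'daily_metrics',
        variables={'run_date': kwargs.get('execution_date')},
        check_status=True,      # wait for the triggered run to finish
        error_on_failure=True,  # raise if the triggered run fails
    )
```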

Backfills

Schedule, run, or re-run backfills multiple times across different windows of time

Backfill data with dynamically generated configurations and variables at runtime using custom code blocks
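
Each backfill run covers one window of time, and the window bounds are passed to blocks as runtime variables. A minimal sketch; the variable names follow Mage's documented interval variables and the table is hypothetical, so verify against your version:

```python
if 'data_loader' not in globals():
    from mage_ai.data_preparation.decorators import data_loader


@data_loader
def load_window(*args, **kwargs):
    # Mage passes the backfill window bounds in as runtime variables.
    start = kwargs.get('interval_start_datetime')
    end = kwargs.get('interval_end_datetime')
    # 'events' is a hypothetical table; execute via your own client.
    query = f"SELECT * FROM events WHERE ts >= '{start}' AND ts < '{end}'"
    print(query)
    return []
```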

Observability & monitoring

Alerting

Receive alerts through email, OpsGenie, Slack, Teams, Discord, Telegram, Google Chat, PagerDuty, etc.

Runtime SLAs and alerts for data pipeline executions

Customizable alert notifications and templates, per channel, when pipeline or block run succeeds, is cancelled, or fails

Fully integrated email alerts without the need for an external email service provider

Custom events, metrics, and alert notification rules

Monitoring

Monitoring dashboards for every pipeline and across the entire project

View current and past pipeline runs

Integrations with major 3rd party and open source tools, including DataDog, Metaplane, New Relic, Sentry, Prometheus, OpenTelemetry

Manage cross-pipeline dependencies and execution flow across every pipeline within a project

Customizable monitoring dashboard

System level metrics and logs with dashboard charts

Monitor upcoming pipeline execution runs from across all schedules, workspaces, and projects

Data catalog, metadata management, and data lineage

Logging

Detailed structured logging for pipeline runs, pipeline triggers, and individual block runs

Export events and logs to remote object storage locations and 3rd party services

Find and browse complete log history using full text search

Privacy & security

Data Governance

Securely store and access sensitive data or credentials using 3rd party or Mage's built-in secret manager (see the sketch after this list)

Granular pipeline and block run data output retention policies

User audit trail and logging

Single sign-on (SSO)

User management and RBAC permission system with fine-grained access controls to create customizable roles and permissions
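
Block code reads secrets through a helper so credentials never appear in the repository. A minimal sketch using the helper documented for the open-source project, with 'warehouse_password' as a hypothetical secret name; confirm the import path for your version:

```python
from mage_ai.data_preparation.shared.secrets import get_secret_value

if 'data_loader' not in globals():
    from mage_ai.data_preparation.decorators import data_loader


@data_loader
def connect(*args, **kwargs):
    # 'warehouse_password' is a hypothetical secret name; the value is
    # resolved at runtime from the secret manager, not from the repo.
    password = get_secret_value('warehouse_password')
    return {'connected': password is not None}
```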

Network & Security

Custom security network access rules to allow or deny inbound/outbound traffic from/to a user-defined set of IP addresses

Virtual Private Network (VPN) for application account sign-in

VPN connection and SSL certificate authentication for select databases

Infrastructure Development

Dedicated static IP address for development and running pipelines

Regional deployment for data processing operations

Performance & scaling

Compute Management

Build and run data pipelines using Spark, Snowpark, and Databricks (see the sketch after this list)

Spark compute management and resource monitoring

Customized GPU accelerated resources for running AI/ML/LLM pipelines
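
When a pipeline is configured to run on Spark, blocks receive the active session through kwargs. A minimal sketch; the injection via kwargs['spark'] follows Mage's Spark documentation, and the S3 path is a placeholder:

```python
if 'data_loader' not in globals():
    from mage_ai.data_preparation.decorators import data_loader


@data_loader
def load_events(*args, **kwargs):
    # In a Spark-enabled pipeline, the active SparkSession is injected
    # into kwargs; the path below is a placeholder.
    spark = kwargs.get('spark')
    return spark.read.parquet('s3://example-bucket/events/')
```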

Dynamic Blocks

Dynamically create unique instances of a block at runtime and execute its code logic on distinct sets of input data from upstream sources (see the sketch after this list)

Reduce the output across all dynamically created blocks and produce a single data output for downstream consumers

Combine the data output from multiple dynamically created blocks and standard block types to create unique combinations of data operations

Execute 100,000+ dynamically created block runs concurrently

Stream block output data to dynamically generated blocks without waiting for the upstream parent block to finish executing
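
Mechanically, a block becomes dynamic by returning a pair of lists: the items to fan out over, plus per-child metadata. Each downstream block is then instantiated once per item at runtime. A minimal sketch following the open-source dynamic-block convention, with a hypothetical input set:

```python
if 'data_loader' not in globals():
    from mage_ai.data_preparation.decorators import data_loader


@data_loader
def fan_out_users(*args, **kwargs):
    # Returning [items, metadata] marks this output as dynamic: each
    # downstream block runs once per item in 'users'.
    users = [{'id': 1}, {'id': 2}, {'id': 3}]  # hypothetical input set
    metadata = [{'block_uuid': f"user_{u['id']}"} for u in users]
    return [users, metadata]
```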

Scalability & Automation

Application upgrades and new product features are instantly installed in your environment

Reusable pipeline data output, accessible across multiple pipelines to reduce duplication, optimize compute, and ensure data consistency

Granular block settings for controlling read/write data partitions using output size, number of chunks, and item count

Batch generator framework for processing 1,000+ gigabytes (GB) of data without running out of memory

Autoscaling orchestration scheduler for maximum pipeline trigger frequency

Automatically and intelligently scale data pipelines, both vertically and horizontally, using predictive analytics and machine learning

Customization & extensibility

Flexible Execution

Execute data pipelines using fully customized Docker images

Control and execute pipelines from within other pipelines, using webhooks, or via API requests (see the sketch after this list)

Run pipelines and configure runtime variables using no-code user interface elements, such as dropdown menus and autocomplete inputs

Deploy high-performance, low-latency API endpoints for executing blocks and returning output data, such as inference endpoints
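
Triggering via API request is a POST to the URL generated on a trigger's settings page. A minimal sketch; the host, schedule ID, token, and variables are placeholders, and the payload shape follows Mage's documented API-trigger pattern, so verify against your version:

```python
import requests

# Placeholder URL copied from a trigger's settings page.
URL = 'https://your-mage-host/api/pipeline_schedules/42/pipeline_runs/your_token'

response = requests.post(
    URL,
    json={'pipeline_run': {'variables': {'env': 'prod'}}},  # hypothetical variables
    timeout=30,
)
response.raise_for_status()
print(response.json())
```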

Personal Setup

Install custom Python modules and system libraries on a per-project basis

Deploy and run third-party or custom services integrated with your application environment's cluster

Set up custom domains for different environments and workspaces

Manage, orchestrate, and configure infrastructure settings and system resources via API endpoints

Customizable Experience

High throughput API endpoints for integrating Mage Pro with any 3rd party or in-house services

Fully customize platform application behavior by executing code that transforms any API request payload or response

We're adding new features all the time. Explore more here.

Recover your precious developer time

Now you can focus on the fun, creative, and high-impact data engineering projects and let Mage AI handle the rest.

For engineers

Experience how Mage AI helps you ship data pipelines faster, giving you a better work-life balance.

For data teams

See how Mage AI accelerates your team velocity while reducing data and infrastructure costs.
