Data

Here we cover all topics involving data, which means a lot of math and analytics: datasets, big data, facts, statistics, and information. We cover data and the process of collecting, measuring, transforming, reporting, analyzing, and more. Data is a cornerstone of machine learning, and as you read through these posts you'll be one step closer to becoming a data scientist.

Thomas Chung

September 25, 2024

What are people saying about Mage, the modern-day open-source data pipeline tool for transforming and integrating your data? Discover the buzz for yourself by checking out these blogs and articles written by members of the data engineering community!

Cole Freeman

June 25, 2024

Organizations face challenges managing vast amounts of fragmented data. Centralized data systems using integration pipelines and incremental models offer a practical solution. These systems unify data, improve quality, and enhance efficiency. Incremental models process only new or updated data, reducing computation time and costs. This approach enables faster decision-making, better resource optimization, and improved analytics capabilities. While implementation can be complex, the long-term benefits make it a valuable strategy for organizations dealing with large-scale, frequently updated data.
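
As a rough illustration of the incremental idea described above, here is a minimal Python sketch that filters rows by a stored watermark so that only new or updated records are processed. The `updated_at` field, the `incremental_batch` helper, and the sample rows are hypothetical and are not taken from the article or from any specific tool.

```python
from datetime import datetime


def incremental_batch(rows, last_watermark):
    """Return the rows changed after last_watermark, plus the new watermark.

    Assumes every row carries an `updated_at` timestamp (a hypothetical field).
    """
    # Keep only rows that changed since the previous run.
    new_rows = [r for r in rows if r["updated_at"] > last_watermark]
    # Advance the watermark to the latest change seen; keep it if nothing changed.
    new_watermark = max((r["updated_at"] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark


# Hypothetical source data.
rows = [
    {"id": 1, "updated_at": datetime(2024, 6, 1)},
    {"id": 2, "updated_at": datetime(2024, 6, 10)},
    {"id": 3, "updated_at": datetime(2024, 6, 20)},
]

# First run: only rows updated after June 5 are processed.
batch, watermark = incremental_batch(rows, datetime(2024, 6, 5))

# Second run with the stored watermark: nothing new, so nothing is reprocessed.
next_batch, next_watermark = incremental_batch(rows, watermark)
```

Storing the watermark between runs is what lets each run skip rows it has already handled. A real pipeline would persist it somewhere durable and would also need to handle late-arriving or deleted records.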

Cole Freeman

June 19, 2024

Global hooks in Mage are a powerful feature that lets you execute custom code before or after API operations. They provide the flexibility to extend functionality, integrate with external systems, validate data, and more across different components of your application. With targeting conditions and asynchronous execution, global hooks offer granular control and performance optimization.

Cole Freeman

June 11, 2024

In this tutorial, we integrate dbt with Mage to create a data pipeline, moving data from a source to a PostgreSQL database and performing SQL transformations through staged models. By setting up Docker and PostgreSQL, and following a step-by-step process, we effectively manage data orchestration and analytics using Mage and dbt.

Matt Palmer

February 9, 2024

Mage now supports a suite of DuckDB and MotherDuck features, from reading and writing DuckDB databases to executing dbt with dbt-duckdb!

Shashank Mishra

May 15, 2023

Edit: June 1, 2023

This guide introduces Apache Flink and stream processing, explaining how to set up a Flink environment and create simple applications. Key Flink concepts are covered along with basic troubleshooting and monitoring techniques. It ends with resources for further learning and community support.

Tommy Dang

May 9, 2023

Edit: May 18, 2023

Combine powerful database features with the flexibility of an object storage system by using the Delta Lake framework.

Khuyen Tran

May 1, 2023

Discover the Hidden Benefits and Drawbacks of dbt.

Shashank Mishra

April 26, 2023

Edit: June 1, 2023

Apache Flink is a powerful open-source stream processing framework for big data, offering real-time and batch processing capabilities. With its flexibility and scalability, Flink is ideal for use cases like fraud detection, log analysis, IoT (Internet of Things), anomaly detection, and machine learning, making it a go-to solution for organizations needing real-time analytics and insights.

Thomas Chung

January 31, 2023

Edit: May 2, 2023

We set up Hyperquery at Mage to organize and streamline our growth data for better insights. In this blog post, we go over our experience of quickly adopting the tool and easily sharing our analysis with our peers.