by Rocío Klan | Jan 21, 2025 | Blog, Data Analytics, Data Engineering
Apache Spark is a distributed data processing engine. In this article, we explain what it is, the key concepts to keep in mind, and how to start using it easily. What is Apache Spark? Apache Spark is a technology that employs...
by Rocío Klan | Oct 7, 2024 | Blog, Databricks
In this article, we explain what a data lakehouse is, what problems it aims to solve, and what its main characteristics are. We also expand on the concept of the medallion architecture and delve into each of its layers. The problems of the data lake As data...
by Rocío Klan | Sep 25, 2024 | Blog, Data Analytics
In this article, we explain in detail what a data lake is and what its advantages and disadvantages are. We also describe how these architectures are composed and what happens in each of their layers. Data lake: historical context In the early 21st century, new...
by Rocío Klan | Jun 25, 2024 | Blog, Databricks
In this article, we show how to implement a data architecture in Azure Databricks. We also explain what the components of a lakehouse architecture are and which Microsoft services to use in each of its phases. What is a lakehouse data architecture? A...
by Rocío Klan | Jun 13, 2024 | Blog, Databricks
In this article, we explain what Azure Databricks is and what its costs and benefits are. We also detail which variables to consider when creating a budget to avoid billing surprises. What is Azure Databricks? Azure Databricks is the...