An Overview of Snowflake’s Data Cloud

Published: at 10:00 AM

In this article, we are going to discuss what Snowflake actually is. Is it a warehouse? Is it a BI tool? What is Snowflake?

Understanding what Snowflake offers will help you and your organisation make an informed choice about whether it is the right tool to unify your data and produce actionable insights.

An Overview of Snowflake’s Data Cloud

Snowflake’s Data Cloud is an advanced data platform for data storage, processing, and analytics. It is provided as a self-managed service, meaning the service manages itself and users have no infrastructure to run.

Snowflake positions itself as easier to use, faster, and more flexible than traditional data warehouse offerings (we will not name them here 🙂).

Snowflake is not built on existing database technologies or big data software platforms such as Hadoop. Instead, it combines a new SQL query engine with a cloud-native architecture.

Data Platform as a Self-Managed Service

Snowflake operates as a fully managed service, eliminating the need for hardware or software installation, configuration, or management. Key features of Snowflake’s managed service include:

  • Snowflake runs exclusively on cloud infrastructure, meaning servers and storage that are accessed securely over the internet. More information on cloud computing can be found here.

  • Snowflake uses virtual compute instances (known as virtual warehouses) for its computing needs and a storage service for persistent data storage.
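
To make the idea of a virtual warehouse more concrete, here is a minimal sketch in Python that only builds the SQL text for creating one. The helper name create_warehouse_sql, the warehouse name, and the short list of sizes are assumptions made for illustration. The sketch does not connect to Snowflake, so the statement would still need to be run through a Snowflake session or client.

```python
# Hypothetical helper: it builds SQL text only and never contacts Snowflake.
# The sizes below are a partial list, chosen for illustration.
VALID_SIZES = {"XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"}


def create_warehouse_sql(name: str, size: str = "XSMALL",
                         auto_suspend_seconds: int = 60) -> str:
    """Return a CREATE WAREHOUSE statement for a virtual warehouse."""
    # Reject names that are not simple identifiers to avoid building unsafe SQL.
    if not name.isidentifier():
        raise ValueError(f"unsafe warehouse name: {name!r}")
    size = size.upper()
    if size not in VALID_SIZES:
        raise ValueError(f"unsupported size in this sketch: {size!r}")
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WITH WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = {auto_suspend_seconds} "  # idle seconds before suspending
        "AUTO_RESUME = TRUE"  # wake up again when a query arrives
    )


if __name__ == "__main__":
    print(create_warehouse_sql("reporting_wh", "small"))
```

Compute is requested with a statement like this one rather than by installing servers, which is the practical meaning of having no hardware to manage.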

Snowflake Architecture

Snowflake’s architecture is a hybrid of traditional shared-disk and shared-nothing database architectures.

What does that mean? Shared-disk? Shared-nothing? More information on shared-disk and shared-nothing database architectures can be found here.

This hybrid gives Snowflake two things. Like a shared-disk system, it has a central data repository that is accessible from all compute nodes. Like a shared-nothing system, it processes queries with MPP (massively parallel processing) compute clusters in which each node stores a portion of the data set locally. Together, these make it easier for users and teams to work on data in the same place.
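
The hybrid idea can be sketched with a toy model in Python. This is not how Snowflake is implemented. It only illustrates the two halves: one central store that every node can read (shared-disk) and nodes that each work on their own slice in parallel (shared-nothing). The names CENTRAL_STORE, node_partition, and cluster_sum are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Shared-disk idea: one central copy of the data that every node can read.
CENTRAL_STORE = list(range(1, 101))  # toy data set: the integers 1..100


def node_partition(node_id: int, node_count: int) -> list:
    """Shared-nothing idea: the slice of the data one node is responsible for."""
    return CENTRAL_STORE[node_id::node_count]


def cluster_sum(node_count: int) -> int:
    """Each node sums its own slice in parallel, then the partials are combined."""
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(
            pool.map(lambda n: sum(node_partition(n, node_count)),
                     range(node_count))
        )
    return sum(partials)


if __name__ == "__main__":
    print(cluster_sum(4))  # 5050, the same answer for any number of nodes
```

Changing the number of nodes changes how the work is divided but not the result, because every node reads from the same central data.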

Architecture Layers

Snowflake’s architecture consists of three key layers:

  1. Database Storage:

    • Data loaded into Snowflake is re-organised into an optimised, compressed, columnar format stored in cloud storage.
    • Snowflake manages all aspects of data storage, including organisation, file size, structure, compression, metadata, and statistics.
  2. Query Processing:

    • Query execution is performed using “virtual warehouses,” which are MPP compute clusters.
    • Each virtual warehouse operates as an independent compute cluster, ensuring no resource sharing or performance impact between warehouses.
  3. Cloud Services:

    • This layer comprises a collection of services that coordinate activities across Snowflake, running on compute instances provisioned by Snowflake.
    • Services include authentication, infrastructure management, metadata management, query parsing and optimisation, and access control.
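
The columnar format mentioned under Database Storage can be illustrated with a short Python sketch. It shows the general idea of column-oriented storage, not Snowflake’s internal file format. The sample rows and the helper names to_columnar and column_stats are made up for this example.

```python
# Toy row-oriented records, as they might arrive when data is loaded.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]


def to_columnar(records: list) -> dict:
    """Pivot row records into one list of values per column."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


def column_stats(values: list) -> dict:
    """Simple per-column statistics of the kind a storage layer can keep as metadata."""
    return {"min": min(values), "max": max(values), "count": len(values)}


columns = to_columnar(rows)

# An aggregate over one column only needs to read that column's values.
total_amount = sum(columns["amount"])

if __name__ == "__main__":
    print(total_amount)                     # 250.0
    print(column_stats(columns["amount"]))  # {'min': 50.0, 'max': 120.0, 'count': 3}
```

Storing values column by column is what lets an analytical query skip the columns it does not need, and per-column statistics give the query optimiser something to work with.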

This can look and sound scary, but remember that Snowflake provisions and manages these services in the background. You bring your data, ideas, and security posture, and Snowflake makes it happen.