Snowflake Zero copy Cloning | How to Clone Database Schema Table View Stage | Examples | Tutorial
Hello Data Pros, and welcome back to another exciting episode of our Snowflake learning series!
In our last video, we explained what time travel is! and demonstrated how it works with appropriate examples!
Today, we'll be exploring cloning; most widely known as zero-copy cloning within the Snowflake ecosystem!
Let’s jump right in!
Unlike cloning feature offered by other traditional databases, Snowflake does not physically duplicate the underlying data when cloning objects.
Let’s understand it further, imagine an employee table with 5 micro-partitions. When we clone this table, no changes occur in the storage layer. Instead, within the metadata service of the cloud services layer, another table named "employee-clone" is created, pointing to the existing micro-partitions of the employee table.
Since cloning is a metadata operation, and most importantly, since the underlying data is not physically duplicated, this process is popularly known as zero-copy cloning.
Immediately after cloning, both the base table and cloned table share the same micro-partitions, and if you run a query against, they return the same results.
However, please be aware that from the user perspective, these are 2 different tables, we can independently insert, update, or delete records from these two tables.
For instance, if you insert few records into the "employee-cloned" table, new micro-partitions are created, but they are associated only with the "employee-cloned" table.
Likewise, if some records are updated or deleted from the employee table, certain micro partitions are end-dated, and new ones are added, but that only affects the employee table, not for the cloned table.
It's also worth noting that the same micro-partition could be marked with an end date for one table, while it remains active for the other table.
There are some key considerations to keep in mind when cloning objects in Snowflake.
Snowflake supports cloning individual objects such as tables, stages, file formats, streams, tasks, etc.
In addition, container objects like databases and schemas can also be cloned.
For container objects, cloning is recursive, which means cloning a database will clone all the schemas and other supported child objects within it.
Similarly, cloning a schema will clone all the supported child objects within the schema.
Some objects, such as external tables and internal named stages, are never cloned, even if they are part of a database or schema that you clone.
On the other hand, objects such as views, UDFs, and procedures, cannot be cloned individually. However, they are implicitly cloned when you clone the schema or database they belong to.
With all this understanding, let's dive into the Snowflake UI and explore these cloning features in action.
To begin, we'll set up a database and a couple of schemas within it.
Following this, we'll switch to the newly created database and schema to carry out further operations.
Next, we'll create two tables: customers and orders, and populate the customers table with some initial data.
Let's just note down the current timestamp at this stage, so that I can later demonstrate cloning with Time Travel.
To make this demonstration broader, we'll also create a view, an internal stage, and an external stage.
So far, we've laid the groundwork by setting up the necessary objects to illustrate cloning across various scenarios.
Now, moving on to the cloning process, first, we clone the entire database demo-db!
When a database is cloned, all schemas and supported child objects within the schema are cloned. It's worth noting that internal named stages are not cloned as part of this. This behavior aligns with the fact that internal named stages are among the objects that do not support cloning.
Let's consider a scenario where I choose not to clone the entire database, but instead want to clone only a specific schema!
In this case, I create demo schema clone from demo schema. This action clones only that specific schema, along with its supported child objects.
But what if I do not want to clone all objects within my schema as well! But rather prefer to clone only a specific object or perhaps just a table.
Snowflake supports cloning individual objects too. For instance, this command creates a clone of the customer table with the name 'customers-clone'.
As we said earlier, immediately after cloning, both the original table and the cloned table, reference the same micropartitions!
If you query the tables, you will observe the same set of records in both.
However, it's important to note that from the user's perspective, both of these tables are considered completely separate objects. Users can independently insert, update, or delete records from these two tables without affecting the other.
Let’s witness this in action! I’ll insert 3 new rows into the original customers table.
Upon running these queries, it becomes apparent that the 3 new records are only reflected in the customers table, while the customers clone table remains unaffected.
Similarly, let’s attempt to insert one record into customers-clone, and rerun the select statements.
This aligns perfectly with our expectations!
Despite sharing some micro-partitions internally, the operations on the cloned table do not affect the original table, and vice versa.
Furthermore, you can indeed use the time travel feature while cloning a database, schema, or table, to clone an object as it existed at a specific timestamp in the past.
As we wrap up this video, here's a fun fact: you can even create a cloned object from a previously cloned one.
Overall, cloning enables quick backups and recovery, facilitates the rapid setup of testing environments, and offers numerous other benefits —all without physically duplicating the identical data, leading to significant reductions in storage costs!
That's all for today! Please stay tuned for our next video where we’ll explore more advanced Snowflake features!
If you found this video helpful, please give it a thumbs up, and hit that subscribe button to stay up-to-date with the latest technologies!
Thanks for watching!
Comments
Post a Comment