Snowflake Views | Types of views in Snowflake | Materialized Views | Secure Views

 


Hello Data Pros, and welcome back to another exciting episode of our Snowflake learning series!

In our previous video, we explored advanced table types, with easy-to-follow examples along the way.

Today, we'll take a step forward to understand the different types of views in Snowflake, along with their use cases and practical examples!

Just like tables and stages, views are also database objects, that allow the result of a SELECT query to be accessed as if it were a table.

The SELECT query is specified in the CREATE VIEW DDL statement. It's worth noting that we cannot perform any DML operations such as insert, update, and delete directly on views. However, any DML operations performed on the underlying tables are always reflected in the views.


--                  Views LAB - Non-materialized views, Materialized views, Secure Views                 --
--               All commands/statements can be found via the link in the video description              --


CREATE OR REPLACE TABLE employee_details (
    employee_id INTEGER,
    employee_name VARCHAR,
    department VARCHAR,
    salary NUMBER(10,2)
);


INSERT INTO employee_details (employee_id, employee_name, department, salary) VALUES
    (10001, 'John Smith', 'Delivery', 60000.00),
    (10002, 'Emma Johnson', 'Delivery', 58000.00),
    (10003, 'Michael Lee', 'Delivery', 62000.00),
    (10004, 'Emily Brown', 'Delivery', 59000.00),
    (10005, 'Daniel Williams', 'Helpdesk', 40000.00),
    (10006, 'Olivia Jones', 'Helpdesk', 38000.00),
    (10007, 'Liam Garcia', 'Helpdesk', 42000.00),
    (10008, 'Sophia Martinez', 'Helpdesk', 41000.00),
    (10009, 'Noah Rodriguez', 'HR', 65000.00),
    (10010, 'Ava Hernandez', 'HR', 63000.00),
    (10011, 'Ethan Lopez', 'HR', 67000.00),
    (10012, 'Isabella Gonzales', 'HR', 64000.00),
    (10013, 'Mason Perez', 'Finance', 70000.00),
    (10014, 'Sophia Torres', 'Finance', 68000.00),
    (10015, 'Jacob Ramirez', 'Finance', 72000.00),
    (10016, 'Emma Flores', 'Finance', 69000.00),
    (10017, 'William Smith', 'Training', 55000.00),
    (10018, 'Charlotte Johnson', 'Training', 53000.00),
    (10019, 'James Lee', 'Training', 57000.00),
    (10020, 'Amelia Brown', 'Training', 54000.00),
    (10021, 'Alexander Wilson', 'Delivery', 61000.00),
    (10022, 'Mia Anderson', 'Delivery', 63000.00),
    (10023, 'Ethan Martinez', 'Delivery', 59000.00);

select * from employee_details;

CREATE OR REPLACE VIEW employee_details_restricted AS
SELECT
    employee_id,
    employee_name,
    department
FROM
    employee_details
WHERE department = 'Delivery';

select * from employee_details_restricted;



CREATE TABLE orders (
    order_id INTEGER,
    customer_id INTEGER,
    order_date DATE
);

CREATE TABLE order_items (
    item_id INTEGER,
    order_id INTEGER,
    product_name VARCHAR,
    quantity INTEGER,
    price_per_unit NUMBER(10,2)
);

INSERT INTO orders (order_id, customer_id, order_date) VALUES
    (1, 101, '2024-01-01'),
    (2, 102, '2024-01-02'),
    (3, 103, '2024-01-03');

INSERT INTO order_items (item_id, order_id, product_name, quantity, price_per_unit) VALUES
    (1, 1, 'Product A', 2, 15.00),
    (2, 1, 'Product B', 1, 25.00),
    (3, 2, 'Product C', 5, 10.00),
    (4, 3, 'Product D', 3, 20.00),
    (5, 3, 'Product E', 1, 50.00);


CREATE VIEW order_summary AS
    SELECT
        o.order_id,
        o.customer_id,
        o.order_date,
        COUNT(oi.item_id) AS total_items,
        SUM(oi.quantity * oi.price_per_unit) AS total_order_value
    FROM
        orders o
    JOIN
        order_items oi ON o.order_id = oi.order_id
    GROUP BY
        o.order_id, o.customer_id, o.order_date
    HAVING
        SUM(oi.quantity * oi.price_per_unit) >= 50.00;


SELECT customer_id, total_order_value
FROM order_summary;




CREATE TABLE employee_salaries (
    employee_id INTEGER,
    salary_date DATE,
    employee_name VARCHAR,
    basic_salary NUMBER(10,2),
    allowances NUMBER(10,2),
    bonus NUMBER(10,2),
    tax NUMBER(10,2)
);

-- Sample data for a few employees
INSERT INTO employee_salaries (employee_id, salary_date, employee_name, basic_salary, allowances, bonus, tax) VALUES
    (1, '2024-01-01', 'John Doe', 4000.00, 500.00, 1000.00, 800.00),
    (2, '2024-01-01', 'Jane Smith', 4500.00, 600.00, 1200.00, 900.00),
    (3, '2024-01-01', 'Alice Johnson', 4200.00, 550.00, 1100.00, 850.00);
    -- Assume this pattern continues for all employees

CREATE or REPLACE MATERIALIZED VIEW yearly_employee_salaries_summary AS
    SELECT
        employee_id,
        EXTRACT(YEAR FROM salary_date) AS year,
        employee_name,
        SUM(basic_salary + allowances + bonus - tax) AS total_salary
    FROM
        employee_salaries
    GROUP BY
        employee_id,
        year,
        employee_name;

SELECT * FROM yearly_employee_salaries_summary;




CREATE OR REPLACE SECURE VIEW employee_details_delivery AS
SELECT
    employee_id,
    employee_name,
    department,
    salary
FROM
    employee_details
WHERE department = 'Delivery';



CREATE OR REPLACE SECURE MATERIALIZED VIEW employee_details_delivery_m AS
SELECT
    employee_id,
    employee_name,
    department,
    salary
FROM
    employee_details
WHERE department = 'Delivery';



SELECT GET_DDL('VIEW', 'employee_details_delivery');
-- Works on ACCOUNTADMIN role, but does not work if view is secure and user has SELECT-only access.








CREATE OR REPLACE SECURE VIEW employee_details_delivery AS
SELECT
    employee_id,
    employee_name,
    department,
    salary
FROM
    employee_details
WHERE department = 'Delivery';


SELECT *
FROM employee_details_delivery
WHERE department = 'HR' AND salary > 60000;

SELECT *
FROM employee_details_delivery
WHERE 1/ iff(department = 'HR' AND salary > 60000, 0, 1) = 0;

SELECT *
FROM employee_details_delivery
WHERE 1/ 0 = 0;



Views serve a variety of purposes. Let's quickly explore some of the most common uses, starting with security!

Views can restrict access to specific rows or columns in the base table, ensuring sensitive data is protected and only exposed to authorized users.

Imagine a scenario where a company maintains an employee details table. If you needed to grant access to someone to this table, but salary is sensitive data that you want to exclude, this can be easily achieved with the help of views.

By creating a view with only the required columns, access can now be granted to this view instead of the base table. This approach ensures that target users do not have access to sensitive salary information.

Beyond restricting access to specific columns, views can also protect data at the row level.

For example, if users should only have access to employees in the delivery department, a where-clause can be added to the view definition to filter the data accordingly.

Next main purpose of views, is reducing complexity!

They achieve this by hiding the underlying complex logics from end users.

Please take a look at this orders and order items table.

Suppose a user frequently want to do order summary analysis, a VIEW can be created, involving all complex logics. Now the end user can easily analyse the summary with a query as simple as this.

This makes the overall interaction with the data much easier.

 

Another important aspect of views is performance. Materialized views, which we'll cover shortly, store pre-processed query results. This means that each time users access the view, the need for redundant processing is eliminated, leading to improved performance.

With this foundational understanding in place, it's now time to explore the different types of views available in Snowflake. They are non-materialized views, materialized views, and secure views.

Let’s start with non-materialized views.

They're the most common type of view and are essentially named-SELECT queries.

Whenever users reference this view in their queries, Snowflake executes the SELECT statement defined in the view's DDL to derive the result set.

The results are not stored for future use, meaning there won’t be any additional storage cost for this type of view.

However, performance is slower because the query is executed each time to derive the result set.

Both the employee details restricted, and order summary views that we've already created are examples of non-materialized views.

 

Moving on, let’s talk about materialized views!

A materialized view is basically a pre-computed result set derived from a select query, and stored for later use.

Whenever the data in the base table changes, Snowflake automatically refreshes the data in the materialized view.

Because the data is pre-computed and stored, querying a materialized view is much faster than executing a query against the base table or a non-materialized view.

However, it does come with additional storage and compute costs! So, it's absolutely important to balance cost and actual need!

 

Snowflake recommends creating a materialized view ONLY when all these conditions are met:

The underlying base table does NOT change often!

The results of the view are used MORE frequently!

And the SELECT query involved in the view definition is complex, and consumes a lot of resources to run!

 

For example, think about an employee salaries table where data doesn’t change too often, perhaps updated monthly OR on salary dates.

Let's say the finance department frequently accesses these computed results, may be daily or hourly!

In such scenarios, it's worth creating a materialized view, as the additional storage and compute costs offset the expense of running the query multiple times, and also offers improved performance.

Another important use case for materialized views is one that we've already discussed in our previous videos, hope you guessed it! Yes, we can indeed create a materialized view on top of an external table.

The data of an external table is typically stored outside of Snowflake's environment. However, by defining a materialized view, Snowflake effectively maintains a local copy of that data within snowflake, offering better query performance.

 

Finally, let's talk about secure views!

A secure view is a special type of view that offers additional security features. Both non-materialized views and materialized views can be created as secure views.

So, what difference does it make when you add this "secure" keyword?

If a view is not secured, users of the view can get the DDL of the view, and infer some of the underlying table structure in your database.

This unintended exposure can be mitigated by defining the view as a secure view.

Additionally, when a query is run on the view, the Snowflake query optimizer might use other columns in the base table to optimize performance. While this may seem beneficial, it can potentially expose data unintentionally.

Consider this scenario, where you create a view called "employee details delivery" only for the managers in the delivery department. But what if some manager in the delivery department wants to know if anyone in HR department earns more than $60,000, which they don't have access to?

They might attempt a query, resulting in zero rows.

Well, if the view were not secure, the Snowflake optimizer could re-order the predicates in the WHERE clauses, potentially allowing the predicate in the user’s query to execute first, followed by the predicate in the view definition.

Despite the where clause re-order, the user receives zero records, because both predicates are definitely applied before finalizing the result set.

But this query can be tweaked, in such a way that it triggers a divide-by-zero runtime error, if the condition the user wants to check meets!

In this case, if there is at least one HR employee with a salary exceeding $60,000, and if the optimizer reorders the where-clause execution sequence, then the user would receive the error like this, with this the user can infer sensitive information for which he or she does not have access to!

This is where secure views play a vital role. By defining the view as secure, Snowflake always runs the where-clause in the view first, and also avoids using certain query optimizations that could potentially expose unauthorized data.

Due to this limited optimization, the secure views are generally slower than non-secure views. Therefore, it's advisable to define a view as secure only when dealing with sensitive data, that you want to protect against potential breaches.

 

That's all for today! Please stay tuned for our next video where we’ll explore more advanced Snowflake features!

If you found this video helpful, please give it a thumbs up, and hit that subscribe button to stay up-to-date with the latest technologies!

We also welcome your questions or thoughts in the discussion section below!

Thanks for watching!

Comments

Popular posts from this blog

How to Install Airflow on Windows

How to Install DBT and Set Up a Project, Create Your First dbt Model

Airflow DAGs, Operators, Tasks & Providers