What is Z-ORDER in Synapse Analytics Delta Table?

Z-Order is an optimization technique used in the Delta Lake. However, before discussing what is Z-Order, we need to first understand what happens when we write data to a delta table. Whenever we write data, statics gets collected, which are used when querying the table. When filter is applied to the column, the statistics are used to determine which file needs to be opened and scanned to return the result back to the user.

Using Z-Order enhances this functionality and helps in optimizing the table.

We recommend reading our previous blog on OPTIMIZE before diving into Z-Order, as both works well in conjunction with each other.

Let’s understand a scenario where we have applied the OPTIMIZE without using Z-Order.

If we perform OPTIMIZE, it just merges the files into a single or multiple files depending on their size, without any proper ordering of the rows. Since there is no ordering for a single record, the engine will have to scan many files, which ultimately hits the performance.

Let’s understand with the example.

If you look at the example, before performing Z-Order, the file which got created was not sorted, and the engine has to perform the scanning on all the files.

However, in the example below, after performing OPTIMIZE with Z-Order the statistics have been collected after sorting the records based on the EMP ID column (The column needs to be mentioned while performing the operation).

%%sql

OPTIMIZE BUSINESS.EMPLOYEE
ZORDER BY (EMP ID)

Multiple columns can be specified in the ZORDER, but it reduces effectiveness of the operation. For good practice, we should perform the operation on the single column and that column should be used frequently in FILTER,JOINS and WHERE conditions.

After performing the operation, it will return the information on how many file got removed/merged and the number of the files got added.

If you open the transaction log, we can see the name of the files that were removed/merged or added to the table. You can see the below screenshot.

Hope this blog gave you insights on what Z-Order is and how it can help to optimize our Delta table for good performance.

Leave a Reply

Your email address will not be published. Required fields are marked *