TL;DR: Key Findings Every Databricks cluster node downloads over 12 GiB of data during startup. When routed via Azure Firewall (Data Exfiltration pattern), this can lead to monthly costs exceeding €3,000+ just for control plane traffic. For one bank’s project, this behavior caused over 100 TB of unexpected traffic throughCzytaj dalej / Read more
Category: No category
Understanding the 4-Minute SLA in Azure Data Factory Activities
SLA of the Azure Data Factory service This might be the first time you’re hearing this, but the Azure Data Factory service does NOT GUARANTEE an immediate or even a few seconds’ delay in starting any single activity in your pipelines! In fact, the guarantee for execution time is justCzytaj dalej / Read more
Exporting data to PARQUET file and ADLS Gen2 using Azure Synapse Serverless
Exporting query data is quite simple as one-two-three: One: define your file format
1 2 |
CREATE EXTERNAL FILE FORMAT parquetfile1 WITH ( FORMAT_TYPE = PARQUET, DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec' ); |
Two: define your file location (note: you should have read/write/list permission the path)
1 2 |
CREATE EXTERNAL DATA SOURCE ADLS_DS WITH( LOCATION = 'abfss://synapse@deltapoc0storage0dest.dfs.core.windows.net') |
Three: Create external table in particular location, using format and path from previous steps:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
CREATE EXTERNAL TABLE dbo.n10M WITH ( LOCATION = '/n10M', DATA_SOURCE = ADLS_DS, FILE_FORMAT = parquetfile1 ) AS WITH lv0 AS (SELECT 0 g UNION ALL SELECT 0) ,lv1 AS (SELECT 0 g FROM lv0 a CROSS JOIN lv0 b) -- 4 ,lv2 AS (SELECT 0 g FROM lv1 a CROSS JOIN lv1 b) -- 16 ,lv3 AS (SELECT 0 g FROM lv2 a CROSS JOIN lv2 b) -- 256 ,lv4 AS (SELECT 0 g FROM lv3 a CROSS JOIN lv3 b) -- 65,536 ,lv5 AS (SELECT 0 g FROM lv4 a CROSS JOIN lv4 b) -- 4,294,967,296 ,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM lv5) SELECT TOP (10000000) n1=n, n2=n, n3=n, n4=n, n5=n, n6=n, n7=n, n8=n, n9=n, n10=n FROM Tally |
Took 43sec toCzytaj dalej / Read more