
Optimize your dbt projects for Snowflake with DataChonk-specific patterns and best practices.
Configure your Snowflake connection in `.datachonk.yml`:

```yaml
version: 1
warehouse: snowflake

connections:
  prod:
    type: snowflake
    account: xy12345.us-east-1  # Account locator
    warehouse: ANALYTICS_WH
    database: ANALYTICS
    role: ANALYST_ROLE
    # Password stored securely - not in config
  dev:
    type: snowflake
    account: xy12345.us-east-1
    warehouse: DEV_WH
    database: ANALYTICS_DEV
    role: DEVELOPER_ROLE
```

Use transient tables for staging models to reduce Time Travel storage costs:
```sql
{{ config(
    materialized='table',
    transient=true,
    cluster_by=['loaded_date']
) }}

select
    id,
    customer_name,
    loaded_date::date as loaded_date
from {{ source('raw', 'customers') }}
```

DataChonk generates efficient merge statements for incremental models:
```sql
{{ config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='merge',
    merge_update_columns=['status', 'updated_at']
) }}

select
    order_id,
    customer_id,
    status,
    updated_at
from {{ source('raw', 'orders') }}

{% if is_incremental() %}
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

Use Snowflake Dynamic Tables for near-real-time transformations:
```sql
{{ config(
    materialized='dynamic_table',
    target_lag='1 minute',
    snowflake_warehouse='STREAMING_WH'
) }}

select
    date_trunc('minute', event_time) as event_minute,
    count(*) as event_count
from {{ source('streaming', 'events') }}
group by 1
```

Tables over 1TB benefit from clustering. DataChonk analyzes your JOIN and WHERE clauses to suggest optimal clustering keys.
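To check whether an existing table would benefit, you can inspect its clustering depth directly in Snowflake (the table and column names below are illustrative):

```sql
-- Returns clustering depth and overlap statistics as JSON.
-- Deep, heavily overlapping micro-partitions on a frequently
-- filtered column suggest a clustering key would help.
select system$clustering_information('ANALYTICS.PUBLIC.ORDERS', '(order_date)');
```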
Snowflake serves repeated identical queries from its result cache for 24 hours at no warehouse compute cost, provided the underlying data has not changed. Structure your models to take advantage of this free caching.
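The cache only matches byte-identical query text over unchanged data, so runtime-evaluated functions defeat it. A sketch (table name is illustrative):

```sql
-- A repeated execution of this exact text can be served from the result cache
select count(*) from analytics.orders
where order_date >= '2024-01-01';

-- current_timestamp() is evaluated at run time, so this query
-- cannot be satisfied from the result cache
select count(*) from analytics.orders
where order_date >= dateadd(day, -30, current_timestamp());
```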
Use separate warehouses for different workloads (loading, transformation, BI) so they don't compete for compute. DataChonk suggests optimal warehouse sizes based on query complexity.
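dbt's Snowflake adapter lets you route an individual model to a dedicated warehouse with the `snowflake_warehouse` config (the warehouse and model names here are illustrative):

```sql
{{ config(
    materialized='table',
    snowflake_warehouse='TRANSFORM_XL'  -- heavy aggregation runs on its own warehouse
) }}

select
    customer_id,
    sum(amount) as lifetime_value
from {{ ref('stg_orders') }}
group by 1
```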
VARIANT columns are flexible, but overusing them hurts query performance and partition pruning. DataChonk suggests when to flatten semi-structured data into typed columns.
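When downstream queries repeatedly extract the same paths from a VARIANT column, materializing those paths as typed columns is usually worthwhile. A sketch using Snowflake's LATERAL FLATTEN (the source and field names are illustrative):

```sql
select
    e.event_id,
    f.value:sku::varchar as sku,
    f.value:qty::number  as quantity
from {{ source('raw', 'events') }} e,
    lateral flatten(input => e.payload:items) f
```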
**Slow queries on large tables**

Cause: Warehouse too small or missing clustering.
Solution: Scale up the warehouse temporarily, or add clustering on frequently filtered columns.

**High Time Travel storage costs**

Cause: Default 1-day retention on all tables.
Solution: Use transient tables for staging/intermediate models. Set `DATA_RETENTION_TIME_IN_DAYS = 0` for ephemeral tables.

**Degrading incremental merge performance**

Cause: Micro-partitions accumulating from many small merges.
Solution: Periodically run `dbt run --full-refresh` or enable automatic clustering.