
Guides for migrating to DataChonk from various starting points.
1. Export your existing SQL files and documentation
2. Create a new DataChonk project connected to your warehouse
3. Run a schema scan to discover tables and relationships
4. Create Source chonks for each raw data source
5. Use AI to generate Staging chonks from existing transformation logic
6. Build Entity and Fact chonks, referencing existing business rules
7. Export the generated dbt project to your repository
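The workflow above maps onto the datachonk CLI commands used throughout this guide. A minimal sketch, assuming a Snowflake connection; the `export` subcommand and its flag are illustrative and may differ in your installation:

```
# Step 3: discover tables and relationships in your warehouse
datachonk scan --connection snowflake://...

# Step 4: create Source chonks for the raw tables you need
datachonk generate source --tables orders,customers,products

# Steps 5-6 happen in the Brain chat or web UI, where you generate
# Staging, Entity, and Fact chonks from your existing logic

# Step 7: export the generated dbt project into your repository
# (subcommand name is illustrative)
datachonk export --path ./my-dbt-project
```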
If you're currently writing SQL by hand or using simple scripts, DataChonk can help you adopt dbt best practices while preserving your business logic.
Use the Brain chat to analyze your existing SQL and get recommendations:
You: I have this SQL transformation that calculates customer lifetime value:
SELECT
    customer_id,
    SUM(order_total) as total_spend,
    COUNT(*) as order_count,
    MIN(order_date) as first_order,
    MAX(order_date) as last_order
FROM orders
GROUP BY customer_id
How should I model this in DataChonk?
Chonk: Great question! This looks like a customer dimension with aggregated order metrics.
I'd recommend:
1. Source chonk: 'orders' - reference your raw orders table
2. Staging chonk: 'stg_orders' - clean and standardize the raw data
3. Entity chonk: 'dim_customers' - this is where your aggregation lives
Would you like me to generate these chonks for you?

After scanning your warehouse, create Source chonks for each table you need:
# Scan warehouse and create sources
datachonk scan --connection snowflake://...
# Generate source chonks for specific tables
datachonk generate source --tables orders,customers,products
# Or use the web UI to visually select tables

DataChonk works great alongside existing dbt projects. Import your project to extend it with AI-powered generation.
# Import from local path
datachonk import --path ./my-dbt-project
# Or from GitHub
datachonk import --repo https://github.com/org/dbt-project --branch main
# DataChonk analyzes:
# - Model structure and dependencies
# - Naming conventions
# - Materialization patterns
# - Test coverage
# - Documentation style

New chonks automatically follow your existing patterns:
-- DataChonk detected your patterns:
-- - Naming: stg_<source>__<table>
-- - Materialization: view for staging
-- - CTEs: source → renamed → final
-- - Tests: unique, not_null on keys
{{ config(materialized='view') }}

with source as (
    select * from {{ source('stripe', 'payments') }}
),

renamed as (
    select
        id as payment_id,
        customer_id,
        amount_cents / 100.0 as amount,
        created_at as payment_at
    from source
)
select * from renamed

Migrating from tools like Informatica, Talend, or SSIS? DataChonk can help translate complex transformation logic to modern SQL.
Use the Brain to help translate ETL logic to SQL:
You: I have an Informatica mapping that implements SCD Type 2 for customers.
It tracks changes to name, email, and address. How do I replicate this in dbt?
Chonk: For SCD Type 2 in dbt, I recommend using dbt's built-in snapshot feature
or building a custom incremental model. Here's what I'd generate:
1. A snapshot for tracking changes:
{% snapshot snp_customers %}

{{ config(
    target_schema='snapshots',
    unique_key='customer_id',
    strategy='check',
    check_cols=['name', 'email', 'address']
) }}

select * from {{ source('crm', 'customers') }}

{% endsnapshot %}
2. A dimension model that joins current and historical records:
[generates full dim_customers model with SCD2 support]
Would you like me to create these as chonks?

Complex migrations can be challenging. Our team can help you plan and execute your migration to DataChonk.
Contact Migration Team