Definitions for common terms used in DataChonk and dbt development
A model that appears upstream in the lineage graph. An ancestor is a dependency of the current model.
A test that validates data quality by checking that a condition is true for all rows in a model.
The foundational layer in a dbt project, made up of models that directly reference source tables. Also known as the staging layer.
A DataChonk term for a reusable, atomic unit of data modeling. Chonks come in five types: Source, Staging, Entity, Fact, and Metric.
A temporary named result set defined within a SQL statement using the WITH clause. CTEs make complex queries more readable.
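As a rough illustration of the WITH clause, the sketch below runs a small query against an in-memory SQLite database. The `orders` table, its columns, and the `customer_totals` name are hypothetical and chosen only for this example.

```python
import sqlite3

# Hypothetical sample data, created in memory so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 10, 25.0), (2, 10, 75.0), (3, 20, 40.0);
""")

# The WITH clause names an intermediate result (customer_totals) that the
# final SELECT then reads from like an ordinary table.
query = """
    WITH customer_totals AS (
        SELECT customer_id, SUM(amount) AS total_amount
        FROM orders
        GROUP BY customer_id
    )
    SELECT customer_id, total_amount
    FROM customer_totals
    WHERE total_amount > 50
    ORDER BY customer_id
"""
big_spenders = conn.execute(query).fetchall()
print(big_spenders)
```

Splitting the aggregation into a named step keeps the final SELECT short, which is the readability benefit the definition refers to.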
DataChonk's AI corgi companion that understands your data context and generates dbt code using natural language. Woof!
A directed acyclic graph showing the dependencies between dbt models. Each node is a model, and each edge is a ref() relationship.
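A minimal sketch of the idea, using Python's standard `graphlib` module rather than dbt itself: each model maps to the set of models it references, a topological sort gives a valid build order, and walking the edges upstream yields a model's ancestors. The model names and the `ancestors` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical models: each key maps to the set of models it references.
dependencies = {
    "stg_orders": set(),
    "stg_customers": set(),
    "fct_orders": {"stg_orders", "stg_customers"},
    "orders_by_month": {"fct_orders"},
}

# A topological sort lists every model after all of its dependencies.
build_order = list(TopologicalSorter(dependencies).static_order())


def ancestors(model):
    """Return every model upstream of `model`, directly or transitively."""
    found = set()
    stack = list(dependencies[model])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(dependencies[parent])
    return found


print(build_order)
print(ancestors("orders_by_month"))
```

Because the graph has no cycles, a build order always exists; a model can only run once everything upstream of it has been built.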
A subset of a data warehouse focused on a specific business area or team. Typically built from fact and dimension tables.
A centralized repository of structured data optimized for analytical queries. Examples include Snowflake, BigQuery, and Redshift.
An open-source command line tool that enables analytics engineers to transform data in their warehouses using SQL SELECT statements.
A hosted version of dbt that provides a web IDE, job scheduling, and documentation hosting.
The open-source command line version of dbt that runs locally or in CI/CD pipelines.
A model that appears downstream in the lineage graph. A descendant depends on the current model.
A descriptive attribute used to segment and filter fact data. Examples include customer name, product category, and date.
A DataChonk type representing a business entity with slowly changing attributes. Corresponds to dimension tables in traditional data modeling.
A dbt materialization that doesn't create a database object. The model's SQL is injected as a CTE into downstream models.
A DataChonk type representing business events or transactions. Contains measures that can be aggregated.
A table containing business events or transactions with quantitative measures. The central table in a star schema.
A measure of how recently source data was updated. dbt can warn or error if source data is stale.
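The warn-or-error behavior can be sketched as a comparison between the age of the newest source record and two thresholds. The `freshness_status` function below is a hypothetical illustration of that logic, not dbt's implementation; its `warn_after` and `error_after` parameters are named after the corresponding dbt source freshness settings.

```python
from datetime import datetime, timedelta


def freshness_status(last_loaded_at, now, warn_after, error_after):
    """Classify source data as 'pass', 'warn', or 'error' based on its age."""
    age = now - last_loaded_at
    if age > error_after:
        return "error"
    if age > warn_after:
        return "warn"
    return "pass"


now = datetime(2024, 1, 2, 12, 0)
status = freshness_status(
    last_loaded_at=datetime(2024, 1, 1, 20, 0),  # loaded 16 hours ago
    now=now,
    warn_after=timedelta(hours=12),
    error_after=timedelta(hours=24),
)
print(status)
```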
The level of detail represented by a single row in a table. A table's grain defines what each row represents.
A dbt materialization that only processes new or changed records, improving performance for large tables.
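One common way to process only new records is a high-water mark: each run appends the source rows that are newer than the latest row already in the target. The sketch below shows that pattern with SQLite; it is an assumption about how such a filter might look, not dbt's generated SQL, and the table names and `incremental_run` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_events (event_id INTEGER, loaded_at TEXT);
    CREATE TABLE fct_events (event_id INTEGER, loaded_at TEXT);
    INSERT INTO raw_events VALUES (1, '2024-01-01'), (2, '2024-01-02');
""")


def incremental_run(conn):
    """Append only source rows newer than the latest row already loaded."""
    before = conn.execute("SELECT COUNT(*) FROM fct_events").fetchone()[0]
    conn.execute("""
        INSERT INTO fct_events
        SELECT event_id, loaded_at
        FROM raw_events
        WHERE loaded_at > (SELECT COALESCE(MAX(loaded_at), '') FROM fct_events)
    """)
    after = conn.execute("SELECT COUNT(*) FROM fct_events").fetchone()[0]
    return after - before


first_run = incremental_run(conn)   # target is empty, so every row is new
conn.execute("INSERT INTO raw_events VALUES (3, '2024-01-03')")
second_run = incremental_run(conn)  # only the row that arrived since the first run
print(first_run, second_run)
```

The second run touches one row instead of three, which is where the performance gain on large tables comes from.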
DataChonk's feature that keeps the Brain up-to-date with your existing dbt project, semantic layer, and data warehouse schema.
The path of data transformations from source to final output. Shows dependencies between models.
A reusable piece of Jinja code that can be called in dbt models. Macros help keep a project DRY (Don't Repeat Yourself).
A transformation layer containing business-defined tables ready for consumption by BI tools and analysts.
How dbt persists a model in the warehouse. Options include table, view, incremental, and ephemeral.
A numeric value that can be aggregated in a fact table. Examples include revenue, quantity, and count.
A business calculation defined in the semantic layer. Combines measures with dimensions and time constraints.
A DataChonk type for semantic layer metric definitions. Includes calculation logic, dimensions, and time grains.
A SQL SELECT statement in dbt that creates a table or view. The fundamental building block of a dbt project.
A category of database systems optimized for complex analytical queries rather than transactional processing.
A column or combination of columns that uniquely identifies each row in a table. Essential for data integrity.
A technique for tracking historical changes to dimension attributes. Types include SCD Type 1 (overwrite) and SCD Type 2 (history).
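The difference between the two types can be sketched with plain Python rows. The column names (`valid_from`, `valid_to`) and both helper functions are hypothetical and only illustrate the overwrite-versus-history distinction.

```python
from datetime import date


def apply_type1(rows, customer_id, new_city):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    for row in rows:
        if row["customer_id"] == customer_id:
            row["city"] = new_city
    return rows


def apply_type2(rows, customer_id, new_city, changed_on):
    """Type 2: close the current row and append a new one, keeping history."""
    for row in rows:
        if row["customer_id"] == customer_id and row["valid_to"] is None:
            if row["city"] == new_city:
                return rows  # nothing changed, so no new version is needed
            row["valid_to"] = changed_on
    rows.append({"customer_id": customer_id, "city": new_city,
                 "valid_from": changed_on, "valid_to": None})
    return rows


history = [{"customer_id": 1, "city": "Lisbon",
            "valid_from": date(2023, 1, 1), "valid_to": None}]
apply_type2(history, 1, "Porto", date(2024, 6, 1))
print(history)
```

After the Type 2 update the table holds two rows for the customer: the closed Lisbon version and the open Porto version, whose `valid_to` is still empty.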
A namespace within a database that contains tables, views, and other objects. Used to organize models.
A metadata layer that defines business metrics, dimensions, and relationships. Provides consistent definitions across tools.
A dbt feature that records how the rows of a mutable source table change over time, enabling SCD Type 2 tracking.
A raw table in the data warehouse that dbt doesn't manage. Defined in YAML and referenced with {{ source() }}.
A DataChonk type representing a raw source table. Contains metadata about the source and freshness expectations.
The first transformation layer that cleans and standardizes raw source data. Also called the base layer.
A DataChonk type for staging models. Handles column renaming, type casting, and basic cleaning.
A data modeling pattern with a central fact table surrounded by dimension tables. Optimized for analytical queries.
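A small sketch of the pattern, run against an in-memory SQLite database: one fact table holds the measure (`revenue`), and two dimension tables supply the attributes used to group it. All table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes keyed by a unique identifier.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER);
    -- Fact table: one row per sale, with keys to each dimension and a measure.
    CREATE TABLE fct_sales (product_key INTEGER, date_key INTEGER, revenue REAL);

    INSERT INTO dim_product VALUES (1, 'toys'), (2, 'books');
    INSERT INTO dim_date VALUES (20240101, 2024), (20250101, 2025);
    INSERT INTO fct_sales VALUES
        (1, 20240101, 10.0), (1, 20250101, 5.0),
        (2, 20240101, 7.5), (2, 20240101, 2.5);
""")

# Join the fact table out to its dimensions, then aggregate the measure.
revenue_by_category = conn.execute("""
    SELECT p.category, d.year, SUM(f.revenue) AS revenue
    FROM fct_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
""").fetchall()
print(revenue_by_category)
```

Every analytical question follows the same shape: start from the fact table, join to whichever dimensions are needed, and aggregate.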
An artificial unique identifier generated for a table, typically using a hash function or auto-increment.
A validation that checks data quality. dbt supports generic tests (unique, not_null) and custom singular tests.
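One way to picture these checks is as queries that select the offending rows, so that an empty result means the check passed. The sketch below hand-writes queries in the spirit of `unique` and `not_null` against SQLite; they are illustrative assumptions, not the SQL dbt generates, and the `customers` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, email TEXT);
    INSERT INTO customers VALUES
        (1, 'a@example.com'), (2, NULL), (2, 'c@example.com');
""")

# Uniqueness check: any customer_id that appears more than once is a failure.
unique_failures = conn.execute("""
    SELECT customer_id, COUNT(*) AS occurrences
    FROM customers
    WHERE customer_id IS NOT NULL
    GROUP BY customer_id
    HAVING COUNT(*) > 1
""").fetchall()

# Not-null check: any row with a missing email is a failure.
not_null_failures = conn.execute("""
    SELECT customer_id
    FROM customers
    WHERE email IS NULL
""").fetchall()

print(unique_failures, not_null_failures)
```

Both queries return rows here, so both checks would fail on this sample data.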
A dbt materialization that creates a database view. The SQL is executed on each query, always showing current data.
A human-readable data format used in dbt for configuration, documentation, and test definitions.
If you think we're missing an important term, please let us know.