Reference

Glossary

Definitions for common terms used in DataChonk and dbt development

A

Ancestor

A model that appears upstream in the lineage graph. An ancestor is a dependency of the current model.

Related: Descendant, Lineage, ref

Assertion

A test that validates data quality by checking that a condition is true for all rows in a model.

Related: Test, Data Quality

B

Base Model

A model in the foundational layer of a dbt project that selects directly from source tables. Also known as a staging model.

Related: Staging, Source

See also: Staging Chonks

C

Chonk

A DataChonk term for a reusable, atomic unit of data modeling. Chonks come in five types: Source, Staging, Entity, Fact, and Metric.

CTE (Common Table Expression)

A temporary named result set defined within a SQL statement using the WITH clause. CTEs make complex queries more readable.

Related: SQL, Subquery
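As a minimal sketch of the WITH clause, the following uses Python's built-in sqlite3 module. The orders table and its rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical orders table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 20.0), (1, 30.0), (2, 5.0)],
)

# The WITH clause names an intermediate result (customer_totals)
# that the final SELECT then reads like a table.
query = """
WITH customer_totals AS (
    SELECT customer_id, SUM(amount) AS total_amount
    FROM orders
    GROUP BY customer_id
)
SELECT customer_id, total_amount
FROM customer_totals
WHERE total_amount > 10
ORDER BY customer_id
"""
big_spenders = conn.execute(query).fetchall()
print(big_spenders)
```

The same query could be written with a subquery in the FROM clause; the CTE form gives the intermediate step a name, which tends to read better as queries grow.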

Chonk AI

DataChonk's AI corgi companion that understands your data context and generates dbt code using natural language. Woof!

D

DAG (Directed Acyclic Graph)

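A graph structure showing the dependencies between dbt models. Each node is a model (or another resource, such as a source, seed, or snapshot), and each edge is a ref() or source() relationship. Because the graph has no cycles, dbt can always find a valid build order.

Related: Lineage, ref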
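To illustrate why the graph must be acyclic, here is a small sketch using Python's standard graphlib module. The model names are hypothetical, and this is not how dbt itself schedules runs; it only shows that a DAG always admits a build order and a cycle does not.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical project: each key is a model, each value is the set of
# models it selects from via ref().
dependencies = {
    "stg_orders": set(),
    "stg_customers": set(),
    "fct_orders": {"stg_orders", "stg_customers"},
    "orders_daily": {"fct_orders"},
}


def build_order(deps):
    """Return one valid order in which every model follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def has_cycle(deps):
    """A cycle means no valid build order exists."""
    try:
        build_order(deps)
        return False
    except CycleError:
        return True


order = build_order(dependencies)
print(order)
```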

Data Mart

A subset of a data warehouse focused on a specific business area or team. Typically built from fact and dimension tables.

Data Warehouse

A centralized repository of structured data optimized for analytical queries. Examples include Snowflake, BigQuery, and Redshift.

Related: Data Mart, OLAP

dbt (data build tool)

An open-source command-line tool that lets analytics engineers transform data in their warehouses using SQL SELECT statements.

dbt Cloud

A hosted version of dbt that provides a web IDE, job scheduling, and documentation hosting.

Related: dbt Core

dbt Core

The open-source command-line version of dbt that runs locally or in CI/CD pipelines.

Related: dbt Cloud

Descendant

A model that appears downstream in the lineage graph. A descendant depends on the current model.

Related: Ancestor, Lineage

Dimension

A descriptive attribute used to segment and filter fact data. Examples include customer name, product category, and date.

Related: Fact Table, Entity Chonk

See also: Entity Chonks

E

Entity Chonk

A DataChonk type representing a business entity with slowly changing attributes. Corresponds to dimension tables in traditional data modeling.

Related: Dimension, SCD

See also: Entity Chonks

Ephemeral

A dbt materialization that doesn't create a database object. The model's SQL is injected as a CTE into downstream models.

Related: Materialization, CTE
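The idea of injecting a model as a CTE can be sketched as plain string handling. The helper inline_ephemeral and the model names below are hypothetical; dbt's real compiler is more involved (for example, it must cope with downstream queries that already have a WITH clause), so treat this as an illustration of the concept only.

```python
import sqlite3


def inline_ephemeral(name, ephemeral_sql, downstream_sql):
    """Prepend the ephemeral model as a CTE and point the downstream query at it."""
    placeholder = "{{ ref('" + name + "') }}"
    body = downstream_sql.replace(placeholder, name)
    return f"WITH {name} AS (\n{ephemeral_sql}\n)\n{body}"


ephemeral_sql = "SELECT id, amount FROM raw_orders WHERE amount > 0"
downstream_sql = "SELECT COUNT(*) FROM {{ ref('valid_orders') }}"
compiled = inline_ephemeral("valid_orders", ephemeral_sql, downstream_sql)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [(1, 10.0), (2, -5.0), (3, 7.5)],
)
valid_count = conn.execute(compiled).fetchone()[0]

# The database still contains only raw_orders: valid_orders was never created.
objects = [row[0] for row in conn.execute("SELECT name FROM sqlite_master")]
print(compiled)
```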

F

Fact Chonk

A DataChonk type representing business events or transactions. Contains measures that can be aggregated.

Related: Fact Table, Measure

Fact Table

A table containing business events or transactions with quantitative measures. The central table in a star schema.

Related: Dimension, Star Schema, Measure

Freshness

A measure of how recently source data was updated. dbt can warn or error if source data is stale.

Related: Source, Data Quality
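The warn-or-error behavior amounts to comparing the age of the newest source row against two thresholds. The function below is a sketch under that assumption; the function name and the example timestamps are hypothetical, and the parameter names are chosen to echo dbt's warn_after and error_after settings.

```python
from datetime import datetime, timedelta


def freshness_status(max_loaded_at, now, warn_after, error_after):
    """Classify a source as pass, warn, or error based on the age of its newest row."""
    age = now - max_loaded_at
    if age > error_after:
        return "error"
    if age > warn_after:
        return "warn"
    return "pass"


now = datetime(2024, 1, 2, 12, 0)
thresholds = {"warn_after": timedelta(hours=12), "error_after": timedelta(hours=24)}

fresh = freshness_status(datetime(2024, 1, 2, 6, 0), now, **thresholds)
aging = freshness_status(datetime(2024, 1, 1, 20, 0), now, **thresholds)
stale = freshness_status(datetime(2023, 12, 31, 0, 0), now, **thresholds)
print(fresh, aging, stale)
```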

G

Grain

The level of detail represented by a single row in a table. A table's grain defines what each row represents.

Related: Fact Table, Aggregation
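One practical way to confirm a table's grain is to check that the columns said to define it never repeat. The sketch below assumes a hypothetical order_items table whose grain is one row per order line; the helper grain_violations is illustrative, not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_items (order_id INTEGER, line_number INTEGER, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(1, 1, 2), (1, 2, 1), (2, 1, 5)],
)


def grain_violations(conn, table, grain_columns):
    """Return the grain values that appear on more than one row."""
    cols = ", ".join(grain_columns)
    sql = (
        f"SELECT {cols}, COUNT(*) FROM {table} "
        f"GROUP BY {cols} HAVING COUNT(*) > 1"
    )
    return conn.execute(sql).fetchall()


# One row per (order_id, line_number): no violations.
per_line = grain_violations(conn, "order_items", ["order_id", "line_number"])
# Claiming one row per order_id would be wrong: order 1 has two rows.
per_order = grain_violations(conn, "order_items", ["order_id"])
print(per_line, per_order)
```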

I

Incremental

A dbt materialization that only processes new or changed records, improving performance for large tables.

Related: Materialization, Merge
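The core of an incremental run is a filter that skips rows the target already holds. The sketch below imitates that with sqlite3; the table names and the loaded_at column are hypothetical. It only appends new rows, so handling changed records would additionally need a merge or delete-and-insert step, which is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (id INTEGER, loaded_at TEXT)")
conn.execute("CREATE TABLE fct_events (id INTEGER, loaded_at TEXT)")


def incremental_run(conn):
    """Append only source rows newer than the newest row already in the target."""
    conn.execute(
        """
        INSERT INTO fct_events
        SELECT id, loaded_at
        FROM raw_events
        WHERE loaded_at > (SELECT COALESCE(MAX(loaded_at), '') FROM fct_events)
        """
    )
    return conn.execute("SELECT COUNT(*) FROM fct_events").fetchone()[0]


conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?)",
    [(1, "2024-01-01"), (2, "2024-01-02")],
)
rows_after_first_run = incremental_run(conn)

conn.execute("INSERT INTO raw_events VALUES (3, '2024-01-03')")
rows_after_second_run = incremental_run(conn)

# Running again with no new source rows adds nothing.
rows_after_third_run = incremental_run(conn)
```

The first run behaves like a full load because the target is empty; later runs touch only the rows that arrived since.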

J

Jinja

A templating language used in dbt to write dynamic SQL. Enables macros, loops, conditionals, and variable substitution.

Related: Macro, ref

K

Knowledge Sync

A DataChonk feature that keeps the Brain up to date with your existing dbt project, semantic layer, and data warehouse schema.

L

Lineage

The path of data transformations from source to final output. Shows dependencies between models.

Related: DAG, Ancestor, Descendant
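Ancestors and descendants can both be derived from the direct dependencies alone. The following sketch uses a hypothetical four-model project; the function names are illustrative.

```python
# Hypothetical project: model -> models it references directly (its parents).
parents = {
    "stg_orders": [],
    "stg_customers": [],
    "fct_orders": ["stg_orders", "stg_customers"],
    "orders_daily": ["fct_orders"],
}


def ancestors(model, parents):
    """All models upstream of `model`, found by walking parent links."""
    seen = set()
    stack = list(parents.get(model, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


def descendants(model, parents):
    """All models downstream of `model`: those that count it among their ancestors."""
    return {m for m in parents if model in ancestors(m, parents)}


print(ancestors("orders_daily", parents))
print(descendants("stg_orders", parents))
```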

M

Macro

A reusable piece of Jinja code that can be called in dbt models. Macros help keep SQL DRY (Don't Repeat Yourself).

Related: Jinja, dbt_utils

Mart

A transformation layer containing business-defined tables ready for consumption by BI tools and analysts.

Related: Data Mart, Gold Layer

Materialization

How dbt persists a model in the warehouse. Options include table, view, incremental, and ephemeral.

Related: Table, View, Incremental, Ephemeral

Measure

A numeric value that can be aggregated in a fact table. Examples include revenue, quantity, and count.

Related: Fact Table, Metric

Metric

A business calculation defined in the semantic layer. Combines measures with dimensions and time constraints.

Related: Semantic Layer, Measure

See also: Metric Chonks

Metric Chonk

A DataChonk type for semantic layer metric definitions. Includes calculation logic, dimensions, and time grains.

See also: Metric Chonks

Model

A SQL SELECT statement in dbt that creates a table or view. The fundamental building block of a dbt project.

Related: Materialization, ref

O

OLAP (Online Analytical Processing)

A category of database systems optimized for complex analytical queries rather than transactional processing.

Related: Data Warehouse, OLTP

P

Primary Key

A column or combination of columns that uniquely identifies each row in a table. Essential for data integrity.

Related: Surrogate Key, Natural Key

R

ref

A dbt function that references another model, creating a dependency in the DAG. Written as {{ ref('model_name') }}.

Related: DAG, Lineage, source
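Conceptually, ref does two jobs: it records a dependency and it resolves to the relation name of the referenced model. The sketch below mimics both with a regular expression. It is not dbt's implementation (dbt renders models with Jinja), and the relation name in the mapping is hypothetical.

```python
import re

# Hypothetical mapping from model name to the relation it is built as.
relations = {"stg_orders": "analytics.staging.stg_orders"}

# Matches {{ ref('model_name') }} with optional whitespace.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*'([^']+)'\s*\)\s*\}\}")


def resolve_refs(sql, relations):
    """Return (compiled_sql, dependencies) for a model's SQL."""
    deps = REF_PATTERN.findall(sql)
    compiled = REF_PATTERN.sub(lambda m: relations[m.group(1)], sql)
    return compiled, deps


compiled, deps = resolve_refs("SELECT * FROM {{ ref('stg_orders') }}", relations)
print(compiled)
print(deps)
```

The list of dependencies collected this way is what becomes the edges of the DAG.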

S

SCD (Slowly Changing Dimension)

A technique for tracking historical changes to dimension attributes. Types include SCD Type 1 (overwrite) and SCD Type 2 (history).
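The difference between the two types is easiest to see on a single changing attribute. The sketch below implements a minimal Type 2 update over an in-memory list; the field names (valid_from, valid_to) and the customer data are hypothetical. A Type 1 update would simply overwrite attrs on the existing row and keep no history.

```python
def apply_scd2(history, key, new_attrs, changed_at):
    """Close the current row for `key` if its attributes changed, then add a new current row."""
    current = next(
        (row for row in history if row["key"] == key and row["valid_to"] is None),
        None,
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return history  # nothing changed, keep the current row open
        current["valid_to"] = changed_at
    history.append(
        {"key": key, "attrs": new_attrs, "valid_from": changed_at, "valid_to": None}
    )
    return history


history = []
apply_scd2(history, 1, {"tier": "free"}, "2024-01-01")
apply_scd2(history, 1, {"tier": "pro"}, "2024-03-01")
apply_scd2(history, 1, {"tier": "pro"}, "2024-04-01")  # no change, no new row

for row in history:
    print(row)
```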

Schema

A namespace within a database that contains tables, views, and other objects. Used to organize models.

Related: Database, Table

Semantic Layer

A metadata layer that defines business metrics, dimensions, and relationships. Provides consistent definitions across tools.

Related: Metric, Dimension

Snapshot

A dbt feature that records changes to a mutable source table over time, enabling SCD Type 2 tracking.

Related: SCD, History

Source

A raw table in the data warehouse that dbt doesn't manage. Defined in YAML and referenced with {{ source('source_name', 'table_name') }}.

Related: source (function), Staging

See also: Source Chonks

Source Chonk

A DataChonk type representing a raw source table. Contains metadata about the source and freshness expectations.

See also: Source Chonks

Staging

The first transformation layer that cleans and standardizes raw source data. Also called the base layer.

Related: Base Model, Source

See also: Staging Chonks

Staging Chonk

A DataChonk type for staging models. Handles column renaming, type casting, and basic cleaning.

See also: Staging Chonks

Star Schema

A data modeling pattern with a central fact table surrounded by dimension tables. Optimized for analytical queries.
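A typical star schema query joins the fact table to a dimension and aggregates a measure by a dimension attribute. The sketch below shows that shape with sqlite3; the fct_sales and dim_product tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fct_sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, revenue REAL);
    INSERT INTO dim_product VALUES (1, 'toys'), (2, 'books');
    INSERT INTO fct_sales VALUES (10, 1, 5.0), (11, 1, 7.0), (12, 2, 3.0);
    """
)

# The fact table supplies the measure (revenue); the dimension supplies
# the attribute used to group it (category).
revenue_by_category = conn.execute(
    """
    SELECT d.category, SUM(f.revenue) AS revenue
    FROM fct_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
    """
).fetchall()
print(revenue_by_category)
```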

Surrogate Key

An artificial unique identifier generated for a table, typically using a hash function or auto-increment.

Related: Primary Key, Natural Key
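A hash-based surrogate key can be sketched in a few lines with Python's hashlib. The separator and the null placeholder below are arbitrary choices made for this illustration, not the values any particular dbt package uses.

```python
import hashlib


def surrogate_key(*values, separator="-", null_token="_null_"):
    """Hash the natural-key values into one fixed-length identifier."""
    # Replace None with a token so that NULL and the empty string hash differently.
    parts = [null_token if value is None else str(value) for value in values]
    joined = separator.join(parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


key_a = surrogate_key(42, "2024-01-01")
key_b = surrogate_key(42, "2024-01-02")
print(key_a, key_b)
```

Because the key is derived only from its inputs, rebuilding the table yields the same keys, which an auto-increment column does not guarantee.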

T

Test

A validation that checks data quality. dbt supports generic tests (unique, not_null) and custom singular tests.

Related: Assertion, Data Quality
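A dbt test is a query that selects failing rows, and the test passes when the query returns none. The sketch below expresses unique and not_null in that style with sqlite3. The customers table is hypothetical, and the SQL is a simplified approximation rather than the exact queries dbt generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (2, "c@example.com")],
)


def not_null_failures(conn, table, column):
    """Rows where the column is NULL; an empty result means the test passes."""
    return conn.execute(f"SELECT * FROM {table} WHERE {column} IS NULL").fetchall()


def unique_failures(conn, table, column):
    """Values that occur more than once; an empty result means the test passes."""
    sql = (
        f"SELECT {column}, COUNT(*) FROM {table} "
        f"WHERE {column} IS NOT NULL "
        f"GROUP BY {column} HAVING COUNT(*) > 1"
    )
    return conn.execute(sql).fetchall()


email_nulls = not_null_failures(conn, "customers", "email")
id_nulls = not_null_failures(conn, "customers", "customer_id")
id_duplicates = unique_failures(conn, "customers", "customer_id")
print(email_nulls, id_nulls, id_duplicates)
```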

V

View

A dbt materialization that creates a database view. The SQL is executed on each query, always showing current data.

Related: Materialization, Table
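The contrast with a table materialization shows up as soon as the underlying data changes. In this sqlite3 sketch, with hypothetical object names, a view and a table are created from the same query, and a row is then added to the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE raw_orders (id INTEGER);
    INSERT INTO raw_orders VALUES (1), (2);

    -- Same SELECT, persisted two ways.
    CREATE VIEW orders_view AS SELECT id FROM raw_orders;
    CREATE TABLE orders_table AS SELECT id FROM raw_orders;

    -- New data arrives after both objects were created.
    INSERT INTO raw_orders VALUES (3);
    """
)

# The view re-runs its SELECT on every query, so it sees the new row.
view_count = conn.execute("SELECT COUNT(*) FROM orders_view").fetchone()[0]
# The table holds a copy from creation time and stays stale until rebuilt.
table_count = conn.execute("SELECT COUNT(*) FROM orders_table").fetchone()[0]
print(view_count, table_count)
```

The trade-off is that the view pays the query cost on every read, while the table pays it once per build.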

Y

YAML

A human-readable data format used in dbt for configuration, documentation, and test definitions.

Related: Schema file, Properties

Missing a term?

If you think we're missing an important term, please let us know.
