Update dbt State data storage FAQ with metadata specs and hosting details#9349

Open
luna-bianca wants to merge 2 commits into current from state-faqs

luna-bianca (Contributor) commented Jun 8, 2026

What are you changing in this pull request and why?

Closes https://dbtlabs.atlassian.net/browse/PRODDOCS-1104

Updates the dbt State "How is data stored?" FAQ to clarify what metadata is stored, data transmission behavior, and hosting details.

Changes:

  • Clarified that dbt State stores last-modified timestamps and hashed SQL statements (not warehouse data)
  • Added that no warehouse data is transmitted to dbt Labs
  • Noted that the service runs in a single US multi-tenant instance
  • Added that dbt State makes no live connections to the data warehouse
  • Added a link to the dbt Labs privacy policy for data retention details


@luna-bianca luna-bianca requested a review from a team as a code owner June 8, 2026 15:45
vercel Bot commented Jun 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs-getdbt-com | Ready | Preview | Jun 15, 2026 10:16am |

@github-actions github-actions Bot added the content Improvements or additions to content label Jun 8, 2026

No actual data from your warehouse is transmitted.

The dbt State service runs in a single US multi-tenant (MT) instance and does _not_ make any live connections to your data warehouse. For data retention details, refer to the [dbt Labs privacy policy](https://www.getdbt.com/cloud/privacy-policy).

Contributor Author

@jtcohen6 Is it correct that we should link here?

Collaborator

Suggested change (first line is the current text, second is the proposed replacement):
The dbt State service runs in a single US multi-tenant (MT) instance and does _not_ make any live connections to your data warehouse. For data retention details, refer to the [dbt Labs privacy policy](https://www.getdbt.com/cloud/privacy-policy).
The dbt State service runs in a single US multi-tenant (MT) instance. The service never connects to your data warehouse. No actual data from your warehouse is transmitted. The only connection is to your running dbt process (CLI or platform) in order to exchange the metadata described above. For data retention details, refer to the [dbt Labs privacy policy](https://www.getdbt.com/cloud/privacy-policy).

@coastlines

Contributor

LGTM but defer to @jtcohen6 to confirm technical details

@jtcohen6 jtcohen6 left a comment

Collaborator

Small updates

dbt State sends the following metadata to dbt Labs servers:

- **Last-modified timestamps** — used to determine whether upstream data has changed since the last run
- **SQL statement hashes** — SQL statements are hashed before transmission, so dbt Labs cannot see the contents; only hashes are stored and compared across runs to detect logic changes

Collaborator

Suggested change (first line is the current text, second is the proposed replacement):
- **SQL statement hashes** — SQL statements are hashed before transmission, so dbt Labs cannot see the contents; only hashes are stored and compared across runs to detect logic changes
- **SQL statement hashes** — SQL statements are processed to detect and classify changes, then hashed. Only the hash is persisted for future comparisons.
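
For readers who want a concrete picture of the two metadata items discussed in this thread, here is a minimal sketch of persisting only a SQL hash and a last-modified timestamp, then comparing them on a later run. It is an illustration under stated assumptions, not dbt State's implementation: the names `StoredState`, `hash_sql`, and `detect_changes` are hypothetical, and the whitespace normalization and the choice of SHA-256 are assumptions, since this PR does not specify how statements are processed or which hash function is used.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class StoredState:
    """Hypothetical persisted record: a hash and a timestamp, never the SQL text."""
    sql_hash: str
    upstream_last_modified: datetime


def hash_sql(sql: str) -> str:
    # Assumption: collapse whitespace, then hash with SHA-256.
    # The real processing and hash function are not stated in this PR.
    normalized = " ".join(sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def detect_changes(
    stored: StoredState,
    current_sql: str,
    upstream_last_modified: datetime,
) -> tuple[bool, bool]:
    """Return (logic_changed, data_changed) relative to the stored record."""
    logic_changed = hash_sql(current_sql) != stored.sql_hash
    data_changed = upstream_last_modified > stored.upstream_last_modified
    return logic_changed, data_changed


if __name__ == "__main__":
    last_run = datetime(2026, 6, 1, tzinfo=timezone.utc)
    stored = StoredState(hash_sql("select id from orders"), last_run)

    # Same logic with different whitespace, no newer upstream data.
    print(detect_changes(stored, "select  id\nfrom orders", last_run))

    # Edited SQL: the hash no longer matches the stored one.
    print(detect_changes(stored, "select id, total from orders", last_run))
```

In this sketch the comparison needs only the stored hash and timestamp, which mirrors the point in the suggestion above that only the hash is persisted for future comparisons.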

