What Is a Semantic Layer?
The semantic layer is an abstraction layer in the analytics architecture that sits between raw data storage (Iceberg tables, databases) and data consumers (BI tools, SQL analysts, AI agents). It translates the physical data model — with its technical table names, cryptic column names, normalized join structures, and implicit business rules — into a clean, business-friendly logical model where data is expressed in the vocabulary of the business.
A semantic layer answers the question that every analyst asks when they first encounter a new data source: 'What does this table actually mean, and how do I use it correctly?' Without a semantic layer, each analyst answers this question independently, often incorrectly or inconsistently. With a semantic layer, the answer is defined once by data experts and reused by everyone.
In the data lakehouse, the semantic layer is what makes the difference between a sophisticated data storage system that only engineers can navigate and a business intelligence platform that the entire organization can use effectively for self-service analytics.
Components of a Semantic Layer
A complete semantic layer has five components:
- Logical tables/views: Named representations of business entities (Customer, Order, Product) that may join multiple physical tables and apply business filters — the equivalent of Virtual Datasets in Dremio
- Business metrics: Named, formally defined calculations (for example, LTV = SUM(order_value) per customer; Churn Rate = churned_customers / total_customers over a stated period) that are computed the same way everywhere they appear
- Dimensions: The axes along which metrics are analyzed, such as time dimensions (day, week, month, quarter, year) and business dimensions (region, segment, product category)
- Business glossary linkages: Connections between semantic objects and their formal business definitions in the data catalog
- Access control: User and role-based visibility of semantic objects — some metrics visible only to finance, some dimensions visible only to specific regions
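The components above can be sketched as plain data structures. This is a minimal illustration, not Dremio's object model: the class names (Metric, Dimension, SemanticModel), the customer_orders logical table, and the role names are all hypothetical, and the glossary linkage is reduced to a description string.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Metric:
    name: str
    expression: str    # SQL aggregate evaluated over the logical table
    description: str   # stands in for a business glossary linkage
    allowed_roles: frozenset = frozenset({"analyst"})  # access control


@dataclass(frozen=True)
class Dimension:
    name: str
    column: str        # column of the logical table to group by


@dataclass
class SemanticModel:
    logical_table: str  # hypothetical business-friendly view name
    metrics: dict = field(default_factory=dict)
    dimensions: dict = field(default_factory=dict)

    def add_metric(self, metric: Metric) -> None:
        self.metrics[metric.name] = metric

    def add_dimension(self, dimension: Dimension) -> None:
        self.dimensions[dimension.name] = dimension

    def compile(self, metric_name: str, dimension_name: str, role: str) -> str:
        """Turn a (metric, dimension) request into SQL, enforcing role visibility."""
        metric = self.metrics[metric_name]
        if role not in metric.allowed_roles:
            raise PermissionError(f"role {role!r} may not see metric {metric_name!r}")
        dim = self.dimensions[dimension_name]
        return (
            f"SELECT {dim.column} AS {dim.name}, "
            f"{metric.expression} AS {metric.name} "
            f"FROM {self.logical_table} GROUP BY {dim.column}"
        )


# Assumed example objects: one metric restricted to a 'finance' role.
model = SemanticModel(logical_table="customer_orders")
model.add_dimension(Dimension(name="region", column="region"))
model.add_metric(
    Metric(
        name="ltv",
        expression="SUM(order_value)",
        description="Lifetime value: total order value per customer",
        allowed_roles=frozenset({"finance"}),
    )
)

sql = model.compile("ltv", "region", role="finance")
print(sql)
```

Because the metric expression lives in one place, every consumer that asks for ltv by region gets the same SQL, and a role outside allowed_roles gets an error rather than a differently computed number.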

Semantic Layer in Dremio
Dremio's semantic layer is built on Virtual Datasets (VDSs): named SQL views that define business logic transformations on top of Iceberg tables. VDSs are typically organized in layers: foundation VDSs that clean and organize raw Iceberg data, business logic VDSs that implement metric calculations and joins, and BI-ready VDSs that are optimized for specific analytical domains.
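The three-layer view hierarchy can be illustrated with ordinary SQL views. The sketch below uses Python's built-in sqlite3 module as a stand-in engine, not Dremio or Iceberg; the table, column, and view names (raw_orders, foundation_orders, business_customer_ltv, bi_customer_tiers) and the 50.0 tier threshold are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Physical table with cryptic names and an implicit unit (cents).
    CREATE TABLE raw_orders (ord_id INTEGER, cust_id INTEGER,
                             amt_cents INTEGER, status TEXT);
    INSERT INTO raw_orders VALUES
        (1, 10, 2500, 'complete'),
        (2, 10, 1500, 'complete'),
        (3, 11, 9900, 'cancelled'),
        (4, 11, 6000, 'complete');

    -- Foundation layer: rename columns and normalize units.
    CREATE VIEW foundation_orders AS
    SELECT ord_id AS order_id,
           cust_id AS customer_id,
           amt_cents / 100.0 AS order_value,
           status
    FROM raw_orders;

    -- Business logic layer: the rule "only completed orders count"
    -- and the LTV metric are defined once, here.
    CREATE VIEW business_customer_ltv AS
    SELECT customer_id, SUM(order_value) AS ltv
    FROM foundation_orders
    WHERE status = 'complete'
    GROUP BY customer_id;

    -- BI-ready layer: shaped for a specific dashboard.
    CREATE VIEW bi_customer_tiers AS
    SELECT customer_id,
           ltv,
           CASE WHEN ltv >= 50.0 THEN 'high' ELSE 'standard' END AS value_tier
    FROM business_customer_ltv;
    """
)

rows = conn.execute(
    "SELECT customer_id, ltv, value_tier FROM bi_customer_tiers ORDER BY customer_id"
).fetchall()
print(rows)  # [(10, 40.0, 'standard'), (11, 60.0, 'high')]
```

A consumer of bi_customer_tiers never sees amt_cents or the cancelled-order filter; changing the business rule means editing one view rather than every downstream query.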
Reflections can be created on top of VDSs. A Reflection pre-computes the result of the semantic layer's transformations, so queries against business-friendly views can be served from the precomputed data instead of re-executing complex joins at query time. This pairs semantic clarity with the fast, often sub-second, response times that interactive BI needs.
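The general idea behind pre-computing a view can be sketched as a simple materialization. This is only an analogy under stated assumptions: it uses sqlite3, the helper names refresh and resolve are hypothetical, and the routing is done by hand in application code, whereas Dremio's query planner decides when to use a Reflection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (customer_id INTEGER, order_value REAL);
    INSERT INTO orders VALUES (10, 25.0), (10, 15.0), (11, 60.0);

    -- The semantic view whose aggregation we want to avoid re-running.
    CREATE VIEW customer_ltv AS
    SELECT customer_id, SUM(order_value) AS ltv
    FROM orders
    GROUP BY customer_id;
    """
)

# Maps a view name to the table holding its precomputed result.
materialized = {}


def refresh(view: str) -> None:
    """Store the current result of a view in a physical table."""
    table = f"{view}__materialized"
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} AS SELECT * FROM {view}")
    materialized[view] = table


def resolve(view: str) -> str:
    """Return the precomputed table if one exists, else the view itself."""
    return materialized.get(view, view)


def query_ltv():
    source = resolve("customer_ltv")
    rows = conn.execute(
        f"SELECT customer_id, ltv FROM {source} ORDER BY customer_id"
    ).fetchall()
    return source, rows


before = query_ltv()     # computed from the view at query time
refresh("customer_ltv")
after = query_ltv()      # served from the precomputed table
print(before[0], after[0])
```

The caller asks for customer_ltv in both cases and gets the same rows; only the physical source changes. The trade-off is freshness: the precomputed table reflects the data as of the last refresh.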
The AI Semantic Layer extends Dremio's VDS model with richer natural language descriptions, metric definitions with dimensional context, and an MCP server endpoint — enabling AI agents to autonomously discover and query business metrics by name and concept, not by technical table path.

Summary
The semantic layer is the organizational intelligence layer of the data lakehouse: it encodes business knowledge into reusable, governed data objects that translate raw technical data into consistent business metrics and dimensions. For organizations that have built solid lakehouse pipelines but struggle with analyst adoption and inconsistent numbers, the semantic layer is often the highest-ROI next investment. Implemented in Dremio through Virtual Datasets and the AI Semantic Layer, it is what turns the open lakehouse into a trusted, self-service analytics platform for the entire organization.