Dremio-Specific Engine & Optimizations Last updated: May 29, 2026

Dremio Raw Reflections

Dremio Raw Reflections are pre-computed data layouts that preserve raw, row-level columns from a dataset, optimized with custom sorting, partitioning, and distribution settings to accelerate raw scans, selective filters, and complex joins.

Dremio Raw Reflections are pre-computed materializations that store row-level, granular data from a physical or virtual dataset. Unlike Aggregation Reflections, which group and summarize fields, Raw Reflections preserve the original structure of individual rows. They act as optimized, physical copies of the source data, reorganizing column layouts, sorting keys, and partitioning boundaries to minimize resource usage during query planning and execution.

When user queries request specific detailed records, perform lookups, run complex multi-table joins, or apply selective filter predicates, the Dremio query planner automatically redirects the scan phase to use the Raw Reflection files instead of reading the raw source tables.

Configurations and Optimizations

To maximize the performance benefits of Raw Reflections, administrators can configure three layout settings: the sort columns, the partition columns, and the distribution columns.
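
As a rough illustration of why sorting and partitioning matter, the following is a minimal Python sketch, not Dremio code. It models a reflection as rows grouped by a partition column and sorted within each group, then counts how many rows remain as candidates for a selective lookup. The `build_layout` and `lookup` helpers and the sample `orders` rows are hypothetical and exist only for this example.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical sample rows: (order_date, order_id, amount)
orders = [
    ("2024-01-01", 3, 30.0),
    ("2024-01-01", 1, 10.0),
    ("2024-01-02", 2, 20.0),
    ("2024-01-02", 5, 50.0),
    ("2024-01-03", 4, 40.0),
]

def build_layout(rows):
    """Group rows by the partition key (order_date), then sort each group by order_id."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[0]].append(row)
    for part in partitions.values():
        part.sort(key=lambda r: r[1])
    return dict(partitions)

def lookup(layout, order_date, order_id):
    """Return (matching row or None, number of candidate rows left after pruning)."""
    part = layout.get(order_date, [])   # partition pruning: other dates are never touched
    ids = [r[1] for r in part]          # sorted ids inside the surviving partition
    i = bisect_left(ids, order_id)      # binary search over the sorted ids
    row = part[i] if i < len(part) and ids[i] == order_id else None
    return row, len(part)

layout = build_layout(orders)
row, candidates = lookup(layout, "2024-01-02", 5)
print(row, f"candidate rows: {candidates} of {len(orders)}")
```

In this toy model the filter on `order_date` discards every partition except one, and the sort order lets the lookup locate the record without comparing against every row. A real engine works on columnar files rather than Python lists, so treat this only as an intuition aid.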

Acceleration Use Cases

Raw Reflections are highly effective for specific query patterns:

  1. Selective Filters: Queries containing WHERE clauses filtering on sorted or partitioned columns (for example, looking up orders by a specific ID or date range).
  2. Complex Joins: Queries combining multiple large tables. By creating Raw Reflections on the tables with identical partition and distribution settings, Dremio can execute local co-segmented joins, eliminating expensive network shuffles.
  3. BI Tool Dashboards: Reports that allow users to drill down from high-level summaries into detailed transaction records.
  4. Data Extraction: Large-scale data exports and ETL pipelines that need to read massive record sets with minimal latency.
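
The join case in item 2 can be sketched in Python as a conceptual model, not as Dremio's implementation. It assumes integer join keys and a hypothetical `NUM_NODES` executor count; `distribute` and `local_join` are illustrative names. Because both tables are bucketed by the same key with the same rule, matching rows always land in the same bucket, so each bucket can be joined without moving rows between nodes.

```python
from collections import defaultdict

NUM_NODES = 3  # hypothetical number of executor nodes

def distribute(rows, key_index, num_nodes=NUM_NODES):
    """Assign each row to a node bucket by its join key (integer keys assumed)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[key_index] % num_nodes].append(row)
    return buckets

def local_join(left_buckets, right_buckets, num_nodes=NUM_NODES):
    """Join bucket i of the left side only with bucket i of the right side."""
    result = []
    for node in range(num_nodes):
        # Build a per-node hash table for the right side.
        right_by_key = defaultdict(list)
        for r in right_buckets.get(node, []):
            right_by_key[r[0]].append(r)
        # Probe it with the left rows stored on the same node.
        for l in left_buckets.get(node, []):
            for r in right_by_key.get(l[0], []):
                result.append((l[0], l[1], r[1]))
    return sorted(result)

# Hypothetical sample data: (customer_id, value)
customers = [(1, "Ana"), (2, "Bo"), (3, "Cy"), (4, "Di")]
order_rows = [(1, 10.0), (3, 30.0), (3, 35.0), (4, 40.0), (5, 99.0)]

joined = local_join(distribute(customers, 0), distribute(order_rows, 0))
print(joined)
```

The per-bucket result matches what a join over the full tables would produce, which is the property that lets matching layouts avoid a network shuffle.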

Incremental vs. Full Refreshes

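Because Raw Reflections store row-level records, keeping them synchronized with source changes is vital. Dremio supports two refresh methods: a full refresh, which rebuilds the reflection from the source data, and an incremental refresh, which updates the existing reflection with only the data that is new since the last refresh.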
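
The trade-off between the full and incremental approaches named in this section's heading can be sketched with a small Python model. This is a simplified illustration, not Dremio's refresh mechanism: it assumes an append-only source whose first field increases monotonically, and the `full_refresh` and `incremental_refresh` functions are hypothetical names.

```python
def full_refresh(source_rows):
    """Rebuild the materialization from scratch by copying every source row."""
    return list(source_rows), len(source_rows)

def incremental_refresh(materialized, source_rows, key_index=0):
    """Append only rows whose key is above the current high-water mark.

    Assumes the source is append-only and the key increases monotonically.
    """
    high_water = max((r[key_index] for r in materialized), default=None)
    new_rows = [r for r in source_rows
                if high_water is None or r[key_index] > high_water]
    return materialized + new_rows, len(new_rows)

# Initial load: every source row is copied.
source = [(1, "a"), (2, "b"), (3, "c")]
reflection, rows_read = full_refresh(source)

# Two rows arrive later; only those two are written on the next refresh.
source += [(4, "d"), (5, "e")]
reflection, rows_written = incremental_refresh(reflection, source)
print(reflection == source, rows_written)
```

In this model the incremental path writes two rows where a second full refresh would copy all five. It is only correct under the append-only assumption: updates or deletes in the source would go unnoticed, which is the kind of situation where a full rebuild is the safer choice.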
