Table Format Maintenance & Operations Last updated: May 29, 2026

Iceberg Spark Procedure add_files

A Spark SQL procedure in Apache Iceberg used to register existing Parquet or ORC data files directly into an Iceberg table without copying data.

The Iceberg Spark Procedure add_files is a utility executed via Spark SQL to import existing data files into an Apache Iceberg table. If a data team has large volumes of historical data stored in Parquet or ORC format, copying that data to create a new Iceberg table can be expensive. The add_files procedure references the storage paths of these files and registers them directly in the Iceberg table’s metadata without copying or modifying the data.

Syntax and Parameters

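The procedure takes the target Iceberg table (table) and a source (source_table). For files at a storage path, the source is written as the backtick-quoted file format followed by the backtick-quoted directory location. Partition values are read from Hive-style directory names (for example, year=2026), and an optional partition_filter map restricts the import to matching partitions: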

/* Add existing Parquet files from an external path into the Iceberg table */
CALL prod.system.add_files(
    table => 'db.web_logs',
    source_table => '`parquet`.`s3://my-bucket/historical_logs/`',
    partition_filter => map('year', '2026')
);
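
When many partitions have to be imported one at a time, it can help to generate the CALL statement rather than hand-edit it. The following is a minimal sketch in Python, not part of Iceberg itself: the helper name build_add_files_call and its parameters are hypothetical, and it only assembles the same statement text shown above. It assumes the inputs contain no quote or backtick characters and rejects them rather than attempting to escape them; the format check is limited to the two formats this page mentions.

```python
def build_add_files_call(catalog, table, source_path,
                         file_format="parquet", partition_filter=None):
    """Return the text of a CALL <catalog>.system.add_files(...) statement.

    Hypothetical helper: it only formats a string and never talks to Spark.
    """
    # Assumption: restrict to the formats named on this page.
    if file_format not in ("parquet", "orc"):
        raise ValueError(f"unsupported file format: {file_format}")

    # No escaping is attempted; refuse values that would break the quoting.
    values = [catalog, table, source_path]
    if partition_filter:
        for key, value in partition_filter.items():
            values.extend([str(key), str(value)])
    for text in values:
        if "'" in text or "`" in text:
            raise ValueError(f"quote characters are not supported: {text!r}")

    # Named arguments, in the same order as the example above.
    args = [
        f"table => '{table}'",
        f"source_table => '`{file_format}`.`{source_path}`'",
    ]
    if partition_filter:
        pairs = ", ".join(
            f"'{key}', '{value}'" for key, value in partition_filter.items()
        )
        args.append(f"partition_filter => map({pairs})")

    body = ",\n    ".join(args)
    return f"CALL {catalog}.system.add_files(\n    {body}\n)"


if __name__ == "__main__":
    # Rebuilds the statement from the example above for one partition.
    print(build_add_files_call(
        "prod",
        "db.web_logs",
        "s3://my-bucket/historical_logs/",
        partition_filter={"year": "2026"},
    ))
```

With an active SparkSession (assumed here to be bound to a variable named spark), the returned string could be passed to spark.sql(...) to run the procedure. Looping over a list of years and calling the helper once per year keeps each import small and easy to retry.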

Key Considerations
