- Native management: RisingWave manages the table’s lifecycle (creation, schema, writes). You interact with it like any other RisingWave table (querying, inserting, using in materialized views).
- Iceberg format storage: Data is physically stored according to the Iceberg specification, using a configured Iceberg catalog and object storage path. This ensures compatibility with the Iceberg ecosystem.
- Simplified pipelines: You don’t need a separate CREATE SINK step to export data out of RisingWave into Iceberg format if Iceberg is your desired end format managed by RisingWave. Data ingested or computed can land directly in these Iceberg tables.
- Interoperability: Tables created with the Iceberg engine are standard Iceberg tables and can be read by external Iceberg-compatible query engines (like Spark, Trino, Flink, Dremio) using the same catalog and storage configuration.
Setup and usage
1. Create an Iceberg connection
The Iceberg connection contains the catalog and object storage configuration. For syntax and properties, see CREATE CONNECTION.
The following examples show how to create an Iceberg connection using different catalog types.
For enhanced security, you can store credentials like S3 access keys as secrets instead of providing them directly. If you wish to use this feature, see Manage secrets.
JDBC catalog
Glue catalog
REST catalog
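As an illustration, a connection using a JDBC catalog might look like the sketch below; the connection name, JDBC URI, bucket, and credentials are placeholders, and the exact property set depends on your catalog and storage (see CREATE CONNECTION for the full list):

```sql
CREATE CONNECTION my_iceberg_conn WITH (
    type = 'iceberg',
    catalog.type = 'jdbc',
    catalog.uri = 'jdbc:postgresql://127.0.0.1:5432/metadata',
    catalog.jdbc.user = 'postgres',
    catalog.jdbc.password = '<password>',
    catalog.name = 'dev',
    warehouse.path = 's3://my-iceberg-bucket/warehouse',
    s3.access.key = '<access-key>',
    s3.secret.key = '<secret-key>',
    s3.region = 'us-east-1'
);
```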
2. Configure the engine to use the connection
Next, configure the Iceberg table engine to use the connection you just created. All tables created with the Iceberg table engine will then use this connection by default.
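A sketch, assuming the parameter is named iceberg_engine_connection (as in recent RisingWave releases) and that the connection created above is public.my_iceberg_conn:

```sql
-- Set the default connection for the Iceberg table engine cluster-wide.
ALTER SYSTEM SET iceberg_engine_connection = 'public.my_iceberg_conn';

-- Or set it for the current session only.
SET iceberg_engine_connection = 'public.my_iceberg_conn';
```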
3. Create a table with the Iceberg engine
Now you can create a table using the standard CREATE TABLE syntax, adding the ENGINE = iceberg clause.
The commit_checkpoint_interval parameter controls how frequently (every N checkpoints) RisingWave commits changes to the Iceberg table, creating a new Iceberg snapshot. The default value is 60. With the typical checkpoint interval of 1 second, this means RisingWave commits changes to the Iceberg table roughly every 60 seconds.
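A minimal sketch, assuming the connection configured above; the table name and columns are placeholders, and the commit_checkpoint_interval shown in the WITH clause is optional:

```sql
CREATE TABLE users_iceberg (
    id INT PRIMARY KEY,
    name VARCHAR,
    created_at TIMESTAMPTZ
)
WITH (commit_checkpoint_interval = 60)
ENGINE = iceberg;
```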
4. Basic operations (insert and select)
For tables created with the Iceberg table engine, you can insert data using standard INSERT statements and query the data with SELECT. You can use these tables as base tables for materialized views or join them with regular tables. However, RisingWave does not support renaming natively managed Iceberg tables or changing their schemas.
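For example, continuing with the hypothetical users_iceberg table:

```sql
-- Insert rows like any other RisingWave table.
INSERT INTO users_iceberg VALUES
    (1, 'alice', '2024-01-01 00:00:00+00:00'),
    (2, 'bob',   '2024-01-01 00:05:00+00:00');

-- Query it directly.
SELECT * FROM users_iceberg WHERE id = 1;

-- Use it as a base table for a materialized view.
CREATE MATERIALIZED VIEW user_count AS
SELECT COUNT(*) AS cnt FROM users_iceberg;
```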
5. Streaming ingestion into Iceberg tables
You can define a source connector directly within the CREATE TABLE statement that uses the Iceberg engine. Data then flows from the external source (e.g., Kafka, Pulsar) directly into the Iceberg table format managed by RisingWave, as in the sketch below.
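A sketch of Kafka ingestion straight into an Iceberg-engine table; the topic name, broker address, and schema are placeholders:

```sql
CREATE TABLE events_iceberg (
    event_id VARCHAR,
    user_id INT,
    event_ts TIMESTAMPTZ
)
WITH (
    connector = 'kafka',
    topic = 'events',
    properties.bootstrap.server = 'broker:9092'
)
FORMAT PLAIN ENCODE JSON
ENGINE = iceberg;
```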
Use Amazon S3 Tables with the Iceberg engine
RisingWave’s Iceberg table engine supports using Amazon S3 Tables as an Iceberg catalog. This allows you to manage Iceberg tables directly within AWS S3 Tables via RisingWave. Follow these steps to configure this integration:
- Create an Iceberg connection
Use the CREATE CONNECTION statement with the rest catalog type. You must provide your AWS credentials and specify the necessary S3 Tables REST catalog configurations.
Required REST Catalog Parameters:
| Parameter Name | Description | Value for S3 Tables |
|---|---|---|
| catalog.rest.signing_region | The AWS region for signing REST catalog requests. | e.g., us-east-1 |
| catalog.rest.signing_name | The service name for signing REST catalog requests. | s3tables |
| catalog.rest.sigv4_enabled | Enables SigV4 signing for REST catalog requests. Set to true. | true |
Define the connection with a CREATE CONNECTION statement.
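A sketch, assuming an S3 Tables table bucket; the connection name, region, account ID, bucket ARN, credentials, and the REST endpoint URI are placeholders to adapt to your setup:

```sql
CREATE CONNECTION s3_tables_conn WITH (
    type = 'iceberg',
    catalog.type = 'rest',
    catalog.uri = 'https://s3tables.<your-region>.amazonaws.com/iceberg',
    warehouse.path = 'arn:aws:s3tables:<your-region>:<account-id>:bucket/<bucket-name>',
    catalog.rest.signing_region = '<your-region>',
    catalog.rest.signing_name = 's3tables',
    catalog.rest.sigv4_enabled = 'true',
    s3.access.key = '<access-key>',
    s3.secret.key = '<secret-key>',
    s3.region = '<your-region>'
);
```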
Replace placeholder values <...> with your specific configuration details.
- Configure the Iceberg table engine
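For example, assuming the same iceberg_engine_connection parameter as above and the connection defined in the previous step:

```sql
ALTER SYSTEM SET iceberg_engine_connection = 'public.s3_tables_conn';
```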
- Create a table using the Iceberg table engine
Now you can create tables with ENGINE = iceberg. These tables will be registered in your specified Amazon S3 Tables catalog.
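A minimal sketch; the column definitions are placeholders:

```sql
CREATE TABLE my_iceberg_table (
    id INT PRIMARY KEY,
    name VARCHAR
)
ENGINE = iceberg;
```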
This creates a table named my_iceberg_table using the Iceberg format, with metadata stored and accessible via Amazon S3 Tables.
Features and considerations
Time travel
Iceberg’s snapshotting mechanism enables time travel queries. You can query the state of the table as of a specific timestamp or snapshot ID using the FOR SYSTEM_TIME AS OF or FOR SYSTEM_VERSION AS OF clauses.
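For example, against the hypothetical users_iceberg table (the timestamp and snapshot ID are placeholders):

```sql
-- Query the table as of a point in time.
SELECT * FROM users_iceberg FOR SYSTEM_TIME AS OF '2024-01-01 00:00:00+00:00';

-- Query the table as of a specific Iceberg snapshot ID.
SELECT * FROM users_iceberg FOR SYSTEM_VERSION AS OF 1234567890123456789;
```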
External access
Since the data is stored in the standard Iceberg format, external systems (like Spark, Trino, Dremio) can directly query the tables created by RisingWave’s Iceberg engine. To do this, configure the external system with the same Iceberg connection details used in RisingWave:
- Catalog type (storage, jdbc, glue, rest)
- Catalog configuration (URI, warehouse path, credentials)
- Object storage configuration (endpoint, credentials)
For example, a table created as public.users_iceberg in RisingWave would be accessed as table users_iceberg within the public namespace/database in the configured Iceberg catalog by an external tool.
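For instance, once a Spark session is configured with an Iceberg catalog (here called my_catalog, a placeholder) that points at the same catalog and warehouse, the table can be read with ordinary Spark SQL:

```sql
-- Spark SQL, assuming an Iceberg catalog named my_catalog pointing at the same
-- catalog and warehouse that RisingWave uses.
SELECT * FROM my_catalog.public.users_iceberg LIMIT 10;
```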
Limitations
RisingWave does not currently have a built-in automatic compaction service for tables created with the Iceberg engine. Streaming ingestion, especially with frequent commits (a low commit_checkpoint_interval), can lead to many small files. To optimize read performance, you may need to run compaction manually using external tools that operate on Iceberg tables. We are actively working on integrating compaction features.
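As an illustration of one such external step, Iceberg’s Spark procedures include rewrite_data_files, which compacts small data files; the catalog and table names below are placeholders:

```sql
-- Spark SQL: compact small files in an Iceberg table written by RisingWave.
CALL my_catalog.system.rewrite_data_files(table => 'public.users_iceberg');
```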