Skip to content

Conversation

@jerry-024
Copy link
Contributor

@jerry-024 jerry-024 commented Oct 31, 2025

Purpose

Linked issue: close #xxx

Tests

API and Format

Documentation

@jerry-024 jerry-024 marked this pull request as draft October 31, 2025 09:28
@jerry-024 jerry-024 requested a review from Copilot October 31, 2025 09:30
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces automatic file compression configuration for format tables based on their file format type. Instead of requiring users to explicitly set 'file.compression'='none' in table properties, the system now automatically determines appropriate compression defaults: none for CSV/JSON, snappy for Parquet, and zstd for ORC.

Key changes:

  • Added FORMAT_TABLE_FILE_COMPRESSION config option with fallback to FILE_COMPRESSION
  • Implemented getFormatTableFileCompression() utility method to determine compression based on format
  • Updated catalog implementations (HiveCatalog, RESTCatalog) to auto-populate file compression during table creation
  • Removed explicit 'file.compression'='none' declarations from test cases

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
paimon-api/src/main/java/org/apache/paimon/CoreOptions.java Added FORMAT_TABLE_FILE_COMPRESSION config option and formatTableFileImplementation() method
paimon-core/src/main/java/org/apache/paimon/catalog/CatalogUtils.java Implemented getFormatTableFileCompression() with format-specific defaults
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalog.java Auto-set file compression during format table creation
paimon-core/src/main/java/org/apache/paimon/rest/RESTCatalog.java Auto-set file compression during format table creation
paimon-core/src/main/java/org/apache/paimon/table/format/FormatTableFileWriter.java Updated to use formatTableFileImplementation() instead of fileCompression()
paimon-core/src/test/java/org/apache/paimon/catalog/CatalogTestBase.java Added test coverage for format-specific file compression defaults
paimon-spark/paimon-spark-ut/src/test/scala/org/apache/paimon/spark/table/PaimonFormatTableTest.scala Removed explicit file compression settings from test cases
paimon-spark/paimon-spark-ut/src/test/scala/org/apache/paimon/spark/sql/FormatTableTestBase.scala Removed explicit file compression settings from test cases
paimon-spark/paimon-spark-ut/src/test/java/org/apache/paimon/spark/SparkCatalogWithHiveTest.java Removed explicit file compression settings from test cases

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

} else if (options.containsKey(FORMAT_TABLE_FILE_COMPRESSION.key())) {
return options.get(FORMAT_TABLE_FILE_COMPRESSION);
} else {
return fileCompression();
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback logic at line 2311 calls fileCompression() which returns options.get(FILE_COMPRESSION) and can be null. This creates a circular logic issue: if FILE_COMPRESSION key is not present (checked at line 2306), calling fileCompression() will still return null. This method can return null, causing potential NullPointerException when used. The method should either return a non-null default value based on format type (like CatalogUtils.getFormatTableFileCompression() does) or be annotated as @Nullable.

Suggested change
return fileCompression();
// Fallback to a sensible default value if neither key is present.
// For example, "none" or another appropriate default.
return "none";
Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

1 participant