Skip to content

fix: Support on all-literal RLIKE expression#3647

Open
0lai0 wants to merge 5 commits intoapache:mainfrom
0lai0:native_engine_literal_RLIKE
Open

fix: Support on all-literal RLIKE expression#3647
0lai0 wants to merge 5 commits intoapache:mainfrom
0lai0:native_engine_literal_RLIKE

Conversation

@0lai0
Copy link
Contributor

@0lai0 0lai0 commented Mar 8, 2026

Which issue does this PR close?

Closes #3343

Rationale for this change

When ConstantFolding is disabled, all-literal RLIKE expressions crash the native engine with a misleading error. The engine should handle scalar literals gracefully.

What changes are included in this PR?

  • Fixed all-literal RLIKE expression handling in native engine
  • Added proper scalar evaluation for RLIKE with literal inputs
  • Improved error handling and fallback logic

How are these changes tested?

  • Added tests for all-literal RLIKE with ConstantFolding disabled
  • Verified rlike_enabled.sql test now passes
  • All existing RLIKE tests continue to pass
  • ./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite rlike_enabled" -Dtest=none

Copy link
Contributor

@coderfender coderfender left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding additional support. Left couple of comments

let is_match = match scalar {
ScalarValue::Utf8(Some(s)) => self.pattern.is_match(s.as_str()),
ScalarValue::LargeUtf8(Some(s)) => self.pattern.is_match(s.as_str()),
ScalarValue::Utf8View(Some(s)) => self.pattern.is_match(s.as_str()),
Copy link
Contributor

@coderfender coderfender Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@0lai0 would we ever get a UTF8View input from the spark side?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, On the current Spark -> Comet proto -> native planner path, string literals are deserialized as ScalarValue::Utf8. So I think keeping Utf8 handling here is necessary.
Let me know if I missed anything.

ScalarValue::LargeUtf8(Some(s)) => self.pattern.is_match(s.as_str()),
ScalarValue::Utf8View(Some(s)) => self.pattern.is_match(s.as_str()),
_ => {
return internal_err!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@0lai0 could we also add test on rust side ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, thanks @coderfender.

Comment on lines +25 to +26
#[cfg(test)]
use datafusion::physical_expr::expressions::Literal;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import should just be moved inside mod test which is already behind the #[cfg(test)] guard

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for the import that needs moving. Thanks @0lai0!

@0lai0
Copy link
Contributor Author

0lai0 commented Mar 11, 2026

Thanks @andygrove for the review. PR updated.

#[test]
fn test_rlike_scalar_utf8_literal() {
let expr = RLike::try_new(
Arc::new(Literal::new(ScalarValue::Utf8(Some("Rose".to_string())))),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This tests the happy path (Utf8(Some)).
It would be good to also add a test case for Utf8(None).

You could also extend it to cover the LargeUtf8 and Utf8View paths too:

for scalar in &[ScalarValue::Utf8(Some("Utf8".to_string())), ScalarValue::LargeUtf8(Some("LargeUtf8".to_string())), ScalarValue::Utf8View(Some("Utf8View".to_string()))] { ... }

and some non-Utf8 type, e.g. ScalarValue::Boolean to trigger the error.

let is_match = match scalar {
ScalarValue::Utf8(Some(s))
| ScalarValue::LargeUtf8(Some(s))
| ScalarValue::Utf8View(Some(s)) => self.pattern.is_match(s.as_str()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| ScalarValue::Utf8View(Some(s)) => self.pattern.is_match(s.as_str()),
| ScalarValue::Utf8View(Some(s)) => self.pattern.is_match(&s),

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Native engine crashes on all-literal RLIKE expression

4 participants