[VL] Results are mismatch with vanilla Spark when get_json_object({"dScore":0.0215434648799772}, "$.dScore") #4928

Open
kecookier opened this issue Mar 12, 2024 · 5 comments
Labels
bug Something isn't working triage

@kecookier
Contributor

Backend

VL (Velox)

Bug description

The following query appears to return different results with Gluten than with vanilla Spark, though it is not yet certain whether other factors are involved.
I will try to reproduce it later.

SELECT get_json_object(factor_context, '$.factor_trust_model_score') AS model_score
FROM mart_finrisk.dwd_risk_antifraud_approve_scene_log_inc_d;

Spark version

None

Spark configurations

No response

System information

No response

Relevant logs

| gluten_model_score | vanilla_model_score  |
+--------------------+----------------------+
| 0.0215435          | 0.0215434648799772   |
| 0.0128806          | 0.012880573063864434 |
| 0.0114058          | 0.011405845787963208 |
| 0.00863517         | 0.008635174896913173 |

@kecookier added the bug and triage labels on Mar 12, 2024
@kecookier
Contributor Author

The following unit test reproduces the issue. I'm sure the wrong value is produced while parsing the double in SIMDGetJsonObjectFunction::extractStringResult().
For more details about the unit test, see this commit: kecookier/velox@f87115d#diff-fdbb1f97f88c92ea26b933c047486a61efcf71cc9cd1995f77d09fd3c7578d7aR39

  EXPECT_EQ(
      "0.0215434648799772",
      getJsonObject(R"({"dScore":0.0215434648799772})", "$.dScore"));

---------------------------------------------------------------
[zk] dv:0.0215435 numberResult:0.0215435
Expected equality of these values:
  "0.0215434648799772"
    Which is: 0x53d043d
  getJsonObject(R"({"dScore":0.0215434648799772})", "$.dScore")
    Which is: ("0.0215435")
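
For readers following along: every truncated value above has six significant digits, which happens to be the default precision of C++ stream formatting. The thread does not confirm that this is what the Velox code does, so the snippet below is only a minimal sketch under that assumption. `formatDefault` and `formatRoundTrip` are hypothetical helper names used for illustration, not Velox functions.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: formats a double with the stream's default
// precision, which is 6 significant digits.
std::string formatDefault(double value) {
  std::ostringstream os;
  os << value;
  return os.str();
}

// Hypothetical helper: formats a double with max_digits10 (17 for double)
// significant digits, which is enough to read the same double back.
// Note: this may print more digits than the original JSON text contained.
std::string formatRoundTrip(double value) {
  std::ostringstream os;
  os << std::setprecision(std::numeric_limits<double>::max_digits10) << value;
  return os.str();
}
```

With these helpers, `formatDefault(0.0215434648799772)` yields `"0.0215435"`, the same string as in the failing assertion. Because `formatRoundTrip` can emit extra trailing digits, returning the number's original text from the JSON input is probably the closer match to vanilla Spark's output, though that too is an assumption rather than something stated in this thread.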

Hi @PHILO-HE, judging from the commit history, I believe you have more expertise in this part of the code. Would you be willing to help resolve this issue?

@kecookier changed the title from "[VL] Results are mismatch with vanilla Spark, it could be get_json_object() causing the issue." to "[VL] Results are mismatch with vanilla Spark when get_json_object({"dScore":0.0215434648799772}, "$.dScore")" on Mar 13, 2024

@PHILO-HE
Contributor

Thanks for reporting this issue! I will take a look.

@PHILO-HE
Contributor

Hi @kecookier, it looks like the small patch below fixes this issue. Please help verify it. Thanks!

PHILO-HE/velox@a1e9be0

@kecookier
Contributor Author

@PHILO-HE Thanks for your help, I'll try it out later.

@kecookier
Contributor Author

Hi @PHILO-HE, I have tested it, and the patch fixes the bug.

