Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] NeuralQueryBuilder doEquals() and doHashCode() don't consider all the parameters that can impact the query result #1010

Open
bzhangam opened this issue Dec 12, 2024 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@bzhangam
Copy link
Contributor

bzhangam commented Dec 12, 2024

What is the bug?

The doEquals() and doHashCode() methods in NeuralQueryBuilder fail to account for all parameters that influence query results. This oversight can lead to scenarios where two queries are mistakenly identified as equal, even though their outcomes differ.

How can one reproduce the bug?

Whenever two NeuralQueryBuilder instances have differing parameters that are not accounted for in the doEquals() and doHashCode() methods, these methods will incorrectly identify the instances as equal.

Below is the current implementation:

@Override
protected boolean doEquals(NeuralQueryBuilder obj) {
    if (this == obj) return true;
    if (obj == null || getClass() != obj.getClass()) return false;
    EqualsBuilder equalsBuilder = new EqualsBuilder();
    equalsBuilder.append(fieldName, obj.fieldName);
    equalsBuilder.append(queryText, obj.queryText);
    equalsBuilder.append(modelId, obj.modelId);
    equalsBuilder.append(k, obj.k);
    equalsBuilder.append(filter, obj.filter);
    return equalsBuilder.isEquals();
}

@Override
protected int doHashCode() {
    return new HashCodeBuilder().append(fieldName).append(queryText).append(modelId).append(k).toHashCode();
}

But it has more parameters.

What is the expected behavior?

doEquals() and doHashCode() methods should account for all the parameters that can impact the query result.

What is your host/environment?

N/A

Do you have any screenshots?

N/A

Do you have any additional context?

If these methods are used for OpenSearch caching, it could lead to incorrect results being returned when the difference between a new query and an old query lies solely in the parameters that are not considered. Further investigation is required to understand how these two functions are utilized within OpenSearch.

@bzhangam bzhangam added bug Something isn't working untriaged labels Dec 12, 2024
@bzhangam
Copy link
Contributor Author

Looking into this issue.

@martin-gaievski
Copy link
Member

that's interesting finding, thanks fro reporting it. There is a filter field that is missing from the hashCode method. Do you know scenario (in core or plugin specific) where we see effect of this?

@bzhangam
Copy link
Contributor Author

bzhangam commented Dec 16, 2024

Discussed with Martin offline that we don't know how those functions are used in core/plugin. But we should try to build them in a right behavior. So we propose to have parameters fieldName, queryText, modelId, filter, k, maxDistance, maxDistance, methodParameters, expandNested, rescoreContext in doEquals() and doHashCode() to match the KNNQueryBuilder since current NeuralQuery is just a wrapper of the KNNQuery. And we will rely on existing e2e tests to verify it should not break anything. Meanwhile will consult other people about this issue.

@heemin32 heemin32 moved this to Backlog(Hot) in Neural Search RoadMap Dec 26, 2024
@heemin32 heemin32 closed this as completed by moving to Backlog(Hot) in Neural Search RoadMap Dec 26, 2024
@heemin32 heemin32 reopened this Dec 27, 2024
@bzhangam
Copy link
Contributor Author

bzhangam commented Dec 27, 2024

Also noticed that in HybridQueryBuilder the doEquals() and doHashCode() doesn't match so will also fix it as part of this one. The missing field in the doHashCode() is fieldName which is not being used anywhere. I think we should remove it. Would wait for Martin's input on this before taking the next step since it was added by him.

@dblock
Copy link
Member

dblock commented Jan 6, 2025

[Catch All Triage - 1, 2, 3, 4, 5, 6]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: 2.19
Development

No branches or pull requests

4 participants