Vector index should not be used when the ORDER BY function differs from the index metric_type #55009

Open
ZiheLiu opened this issue Jan 13, 2025 · 0 comments · May be fixed by #55123
Labels
type/bug Something isn't working

Steps to reproduce the behavior (Required)

ADMIN SET FRONTEND CONFIG ("enable_experimental_vector" = "true");

CREATE TABLE t6 (
    id bigint(20) NOT NULL COMMENT "",
    vector ARRAY<FLOAT> NOT NULL COMMENT "",
    INDEX index_vector (vector) USING VECTOR (
        "index_type" = "hnsw", 
        "dim"="5", 
        "metric_type" = "cosine_similarity", 
        "is_vector_normed" = "false", 
        "M" = "16", 
        "efconstruction" = "40"
    )
) ENGINE=OLAP
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES (
    "replication_num" = "1"
);

insert into t6 values 
    (1, [10,12,13,14,15]),
    (2, [1,2,3,4,5]),
    (3, [0.1,0.2,0.3,0.4,0.5]);


explain select id, vector, approx_l2_distance([0.1,0.2,0.3,0.4,0.5], vector), l2_distance([0.1,0.2,0.3,0.4,0.5], vector), cosine_similarity([0.1,0.2,0.3,0.4,0.5], vector) from t6 order by approx_l2_distance([0.1,0.2,0.3,0.4,0.5], vector)  limit 1000;

select id, vector, approx_l2_distance([0.1,0.2,0.3,0.4,0.5], vector), l2_distance([0.1,0.2,0.3,0.4,0.5], vector), cosine_similarity([0.1,0.2,0.3,0.4,0.5], vector) from t6 order by approx_l2_distance([0.1,0.2,0.3,0.4,0.5], vector)  limit 1000;
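To make the mismatch concrete, here is a minimal Python sketch that recomputes both metrics for the three inserted rows. It assumes that l2_distance reports the squared Euclidean distance, which is what the 44.55 and 793.75 values in the output below are consistent with. The helper names are illustrative only and are not StarRocks APIs.

```python
import math

# Query vector and table rows copied from the reproduction steps above.
query = [0.1, 0.2, 0.3, 0.4, 0.5]
rows = {
    1: [10, 12, 13, 14, 15],
    2: [1, 2, 3, 4, 5],
    3: [0.1, 0.2, 0.3, 0.4, 0.5],
}


def squared_l2(a, b):
    """Squared Euclidean distance (assumed to match l2_distance here)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


l2 = {row_id: squared_l2(query, vec) for row_id, vec in rows.items()}
cos = {row_id: cosine_similarity(query, vec) for row_id, vec in rows.items()}

# Ascending L2 distance is the order the ORDER BY clause asks for.
order_by_l2 = sorted(rows, key=lambda row_id: l2[row_id])

for row_id in order_by_l2:
    print(row_id, round(l2[row_id], 4), round(cos[row_id], 7))
```

Rows 2 and 3 are parallel to the query vector, so their cosine similarity is the same even though their L2 distances differ. An index built for cosine_similarity therefore cannot supply the values or the ordering that approx_l2_distance requires.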

Expected behavior (Required)

> explain select id, vector, approx_l2_distance([0.1,0.2,0.3,0.4,0.5], vector), l2_distance([0.1,0.2,0.3,0.4,0.5], vector), cosine_similarity([0.1,0.2,0.3,0.4,0.5], vector) from t6 order by approx_l2_distance([0.1,0.2,0.3,0.4,0.5], vector)  limit 1000;

|   0:OlapScanNode                                                                                |
|      TABLE: t6                                                                                  |
|      PREAGGREGATION: ON                                                                         |
|      VECTORINDEX: FALSE                                                                            |

> select id, vector, approx_l2_distance([0.1,0.2,0.3,0.4,0.5], vector), l2_distance([0.1,0.2,0.3,0.4,0.5], vector), cosine_similarity([0.1,0.2,0.3,0.4,0.5], vector) from t6 order by approx_l2_distance([0.1,0.2,0.3,0.4,0.5], vector)  limit 1000;
+----+-----------------------+-------------------------------------------------------+------------------------------------------------+------------------------------------------------------+
| id | vector                | approx_l2_distance([0.1, 0.2, 0.3, 0.4, 0.5], vector) | l2_distance([0.1, 0.2, 0.3, 0.4, 0.5], vector) | cosine_similarity([0.1, 0.2, 0.3, 0.4, 0.5], vector) |
+----+-----------------------+-------------------------------------------------------+------------------------------------------------+------------------------------------------------------+
| 1  | [10,12,13,14,15]      | 0.9525018                                             | 793.75                                         | 0.9525017                                            |
| 2  | [1,2,3,4,5]           | 1.0                                                   |  44.55                                         | 1.0                                                  |
| 3  | [0.1,0.2,0.3,0.4,0.5] | 1.0                                                   |   0.0                                          | 0.9999999                                            |
+----+-----------------------+-------------------------------------------------------+------------------------------------------------+------------------------------------------------------+

Real behavior (Required)

> explain select id, vector, approx_l2_distance([0.1,0.2,0.3,0.4,0.5], vector), l2_distance([0.1,0.2,0.3,0.4,0.5], vector), cosine_similarity([0.1,0.2,0.3,0.4,0.5], vector) from t6 order by approx_l2_distance([0.1,0.2,0.3,0.4,0.5], vector)  limit 1000;

|   0:OlapScanNode                                                                                |
|      TABLE: t6                                                                                  |
|      PREAGGREGATION: ON                                                                         |
|      VECTORINDEX: ON                                                                            |

> select id, vector, approx_l2_distance([0.1,0.2,0.3,0.4,0.5], vector), l2_distance([0.1,0.2,0.3,0.4,0.5], vector), cosine_similarity([0.1,0.2,0.3,0.4,0.5], vector) from t6 order by approx_l2_distance([0.1,0.2,0.3,0.4,0.5], vector)  limit 1000;
+----+-----------------------+-------------------------------------------------------+------------------------------------------------+------------------------------------------------------+
| id | vector                | approx_l2_distance([0.1, 0.2, 0.3, 0.4, 0.5], vector) | l2_distance([0.1, 0.2, 0.3, 0.4, 0.5], vector) | cosine_similarity([0.1, 0.2, 0.3, 0.4, 0.5], vector) |
+----+-----------------------+-------------------------------------------------------+------------------------------------------------+------------------------------------------------------+
| 1  | [10,12,13,14,15]      | 0.9525018                                             | 793.75                                         | 0.9525017                                            |
| 2  | [1,2,3,4,5]           | 1.0                                                   |  44.55                                         | 1.0                                                  |
| 3  | [0.1,0.2,0.3,0.4,0.5] | 1.0                                                   |   0.0                                          | 0.9999999                                            |
+----+-----------------------+-------------------------------------------------------+------------------------------------------------+------------------------------------------------------+
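The approx_l2_distance column above carries the cosine values, which suggests the index result is returned regardless of which function the query asked for. The check this issue asks for can be sketched as follows. This is a hypothetical illustration of the rule, not the StarRocks planner code: the function name and the mapping table are assumptions, with approx_cosine_similarity and the l2_distance metric name taken from the StarRocks vector index documentation.

```python
# Hypothetical mapping from an approximate ORDER BY function to the
# index metric_type it is compatible with (assumed, not StarRocks source).
METRIC_FOR_APPROX_FUNCTION = {
    "approx_l2_distance": "l2_distance",
    "approx_cosine_similarity": "cosine_similarity",
}


def can_use_vector_index(order_by_function, index_metric_type):
    """Return True only when the ORDER BY function matches the index metric."""
    expected = METRIC_FOR_APPROX_FUNCTION.get(order_by_function.lower())
    return expected is not None and expected == index_metric_type.lower()


# The case from this issue: L2 ordering against a cosine_similarity index.
print(can_use_vector_index("approx_l2_distance", "cosine_similarity"))
```

With a check of this kind, the query in this report would fall back to a regular scan instead of reading the index.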

StarRocks version (Required)

main ada1be1

@ZiheLiu ZiheLiu added the type/bug Something isn't working label Jan 13, 2025
@ZiheLiu ZiheLiu linked a pull request Jan 16, 2025 that will close this issue