[Bug]: vector forcefilter search results failed #23796

@heni02

Description

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Branch Name

3.0-dev

Commit ID

f8a38d8

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

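Test scenario: the test tool generates the expected results with a brute-force search and then runs a recall test against them. On 3.0-dev the recall is only about 0.1, while on main it is 0.73, so we infer that the brute-force search results on 3.0-dev are wrong.
3.0-dev recall test results: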

Image

main recall test results:

Image
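
For readers who want to reproduce the comparison, the recall figure quoted above can be computed roughly as follows. This is only a sketch of a common recall@k definition; the actual test tool may compute it differently, and the names `recall_at_k` and `mean_recall` are hypothetical.

```python
def recall_at_k(expected_ids, actual_ids, k):
    """Fraction of the top-k expected IDs that also appear in the top-k actual IDs.

    expected_ids: IDs from the trusted brute-force result, nearest first.
    actual_ids:   IDs returned by the query under test, nearest first.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    expected_top = set(expected_ids[:k])
    if not expected_top:
        return 0.0
    actual_top = set(actual_ids[:k])
    return len(expected_top & actual_top) / len(expected_top)


def mean_recall(pairs, k):
    """Average recall@k over (expected_ids, actual_ids) pairs, one per query vector."""
    if not pairs:
        raise ValueError("at least one query is required")
    return sum(recall_at_k(exp, act, k) for exp, act in pairs) / len(pairs)
```

With this definition, a value near 0.1 means that on average only about one in ten of the expected nearest neighbours is returned.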

Test SQL:
SELECT md5_id
FROM ca_comprehensive_dataset
WHERE delete_flag is null
AND locate('日更', content_type) = 0
AND (allow_access IS NULL OR FIND_IN_SET('OPEN', allow_access))
AND (allow_identities IS NULL OR locate('SA_O', allow_identities) > 0)
ORDER BY l2_distance(question_vector, %s) ASC
LIMIT %s;
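
One way to check which branch returns the wrong ordering is to recompute the nearest neighbours on the client side, independently of the database. The sketch below assumes the candidate rows have already been fetched with the same WHERE predicates as the query above and are available as `(md5_id, vector)` pairs; `brute_force_top_k` is a hypothetical helper, not part of the test tool.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_top_k(rows, query, k):
    """Return the md5_id values of the k rows nearest to `query`, nearest first.

    rows: iterable of (md5_id, vector) pairs that already satisfy the filter.
    Ties on distance are broken by md5_id so the result is deterministic.
    """
    nearest = heapq.nsmallest(
        k, rows, key=lambda row: (l2_distance(row[1], query), row[0])
    )
    return [md5_id for md5_id, _ in nearest]
```

Comparing this list with the IDs returned by each branch for the same query vector and LIMIT should show whether 3.0-dev or main deviates from the exact ordering.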

Expected Behavior

No response

Steps to Reproduce

The ca_comprehensive_dataset table from the regression tests can be used for analysis.
DDL:
CREATE TABLE if not exists ca_comprehensive_dataset (
  md5_id varchar(255) NOT NULL,
  question text DEFAULT NULL,
  answer json DEFAULT NULL,
  source_type varchar(255) DEFAULT NULL,
  content_type varchar(255) DEFAULT NULL,
  keyword varchar(255) DEFAULT NULL,
  question_vector vecf32(1024) DEFAULT NULL COMMENT '摘要的向量集',
  allow_access varchar(511) DEFAULT NULL,
  allow_identities varchar(512) DEFAULT NULL,
  delete_flag int DEFAULT NULL,
  created_at timestamp DEFAULT CURRENT_TIMESTAMP(),
  updated_at timestamp DEFAULT CURRENT_TIMESTAMP() ON UPDATE CURRENT_TIMESTAMP(),
  PRIMARY KEY (md5_id),
  KEY idx_comprehensive_allow_access (allow_access),
  KEY idx_comprehensive_allow_identities (allow_identities),
  KEY idx_comprehensive_content_type (content_type)
);

load data url s3option {'endpoint'='http://cos.ap-guangzhou.myqcloud.com','access_key_id'='***','secret_access_key'='***','bucket'='mo-load-guangzhou-1308875761', 'filepath'='mo-big-data/ca_ai_ca_comprehensive_dataset.csv'} into table ca_comprehensive_dataset FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\n' PARALLEL 'TRUE'

Additional information

No response

Metadata

Labels

  • kind/bug (Something isn't working)
  • severity/s0 (Extreme impact: Cause the application to break down and seriously affect the use)
