WEIGHT_STRING of U+FFFD collate utf8mb4_general_ci is changed between v8.5.2 and v8.5.3, leading to silent data-index inconsistency #64144

@kennytm

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. Start a tagged playground running v8.5.2.
tiup playground v8.5.2 --db 1 --kv 1 --pd 1 --tiflash 0 --without-monitor -T revert61972
  2. Create a table with an index on a utf8mb4_general_ci column, then insert a row containing U+FFFD.
create table test.a (id bigint primary key, name varchar(10) collate utf8mb4_general_ci, key(name));
insert into test.a values (1, 'A�B');
select weight_string(name) from test.a;
-- 0x0041FFFD0042
  3. Upgrade the playground to v8.5.3.
# Ctrl+C
tiup playground v8.5.3 --db 1 --kv 1 --pd 1 --tiflash 0 --without-monitor -T revert61972
  4. Check the WEIGHT_STRING. Note that everything from the U+FFFD onward is truncated.
select name, weight_string(name) from test.a where id = 1;
-- 'A�B', 0x0041
  5. Note that the index cannot be used to look up the affected row.
select * from test.a where name = (select name from test.a where id = 1);
-- Empty set
  6. Note that deleting the row leads to Error 8141.
delete from test.a where id = 1;
-- ERROR 8141 (HY000): assertion failed: key: 74800000000000006e5f698000000000000001010041000000000000f9038000000000000001, assertion: Exist, start_ts: 461786157551976449, existing start ts: 0, existing commit ts: 0
  7. Note that ADMIN CHECK TABLE is unable to detect any problem.
admin check table test.a;
-- Query OK, 0 rows affected

2. What did you expect to see? (Required)

3. What did you see instead (Required)

Everything after step 4 is unexpected.

4. What is your TiDB version? (Required)

v8.5.2 → v8.5.3
