SQLITE3 FTS5 prefix query get mismatch result

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

SQLITE3 FTS5 prefix query get mismatch result

ze tian
(I am not sure whether you have seen my message, so resend again)
Hi,
When I use FTS5, I have met that, there are some cases which will get mis-match results with prefix search.

Like "select * from tbl_fts where tbl_fts match 'lucy*';",which I want to get records like "lucya","lucyabc" etc, and

"lux" or "lulu" is not what I want but returned.

Such problems are not common, But I have tried to build such test case which can lead to this problem very easy. Here is how I generate it:

1) create an fts5 table. and insert some record like "lucya","lucyb".

2) prepare some records: a) lusheng b)lulu; c)lunix; d)luma; e) pengyu. a,b,c,d are have some same prefix(lu), e is some other random case.

3) before insert into the fts table with 2) records, appending some random letter to make each record different.

Like: "lulu","luluabc","luluefg", also "lunix","lunixabc",etc

4) for-loop insert, and each loop trying to lantch the query 'lucy*', \

check the match result will finding the mis-match result, the corrent results should be "lucya","lucyb", not "luluabc"...


When mis-match happen, I try to analyze the prefix search mechanism and find that, there are 2 points which I think have problems:

1) fts5LeafSeek, when search failed, and exec goto search_failed, in search_failed, the 2 if condition will not satisfy commonly. In my mind, I think it should return,

but not, and then the search_success logic exec.

2) fts5SetupPrefixIter, when gather results, the logic to set the flag bNewTerm has some leak, which will set bNewTerm=false,

but the record is not what we want indeed.


These 2 logic problems lead to mis-match results. I try to remove the bNewTerm logic directly, and make it compare every loog,

then, the mis-match results disappear.

// relevant code

Change below

if (bNewTerm) {

if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;

}

to

if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;


Need your help to recheck the FTS5 prefix search logic, thank you very much.

Yours,

xiaojin zheng


获取 Outlook for Android<https://aka.ms/ghei36>

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLITE3 FTS5 prefix query get mismatch result

Dan Kennedy-4
On 03/20/2018 02:31 PM, zheng xiaojin wrote:
> (I am not sure whether you have seen my message, so resend again)
> Hi,
> When I use FTS5, I have met that, there are some cases which will get mis-match results with prefix search.
>
> Like "select * from tbl_fts where tbl_fts match 'lucy*';",which I want to get records like "lucya","lucyabc" etc, and
>
> "lux" or "lulu" is not what I want but returned.
>
> Such problems are not common, But I have tried to build such test case which can lead to this problem very easy.

Thanks for looking into and reporting this problem. Are you able to post
the source for the program you used to execute the test described below?

Dan.


> Here is how I generate it:
>
> 1) create an fts5 table. and insert some record like "lucya","lucyb".
>
> 2) prepare some records: a) lusheng b)lulu; c)lunix; d)luma; e) pengyu. a,b,c,d are have some same prefix(lu), e is some other random case.
>
> 3) before insert into the fts table with 2) records, appending some random letter to make each record different.
>
> Like: "lulu","luluabc","luluefg", also "lunix","lunixabc",etc
>
> 4) for-loop insert, and each loop trying to lantch the query 'lucy*', \
>
> check the match result will finding the mis-match result, the corrent results should be "lucya","lucyb", not "luluabc"...
>
>
> When mis-match happen, I try to analyze the prefix search mechanism and find that, there are 2 points which I think have problems:
>
> 1) fts5LeafSeek, when search failed, and exec goto search_failed, in search_failed, the 2 if condition will not satisfy commonly. In my mind, I think it should return,
>
> but not, and then the search_success logic exec.
>
> 2) fts5SetupPrefixIter, when gather results, the logic to set the flag bNewTerm has some leak, which will set bNewTerm=false,
>
> but the record is not what we want indeed.
>
>
> These 2 logic problems lead to mis-match results. I try to remove the bNewTerm logic directly, and make it compare every loog,
>
> then, the mis-match results disappear.
>
> // relevant code
>
> Change below
>
> if (bNewTerm) {
>
> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>
> }
>
> to
>
> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>
>
> Need your help to recheck the FTS5 prefix search logic, thank you very much.
>
> Yours,
>
> xiaojin zheng
>
>
> 获取 Outlook for Android<https://aka.ms/ghei36>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

here is my test case//Re: SQLITE3 FTS5 prefix query get mismatch result

ze tian
(Please add.the head file and sqlite lib yourself, Thank you very much)

char *orgName[] = {
"yun j ji yunji yunj yji yj z zh zha zhao q qi qia qian d du dua duan zhaoqian zhaoq zhqian zhq zqian zq qianduan qiand qduan qd zhaoqianduan zhaoqiand zhaoqduan zhaoqd zhqianduan zhqiand zhqduan zhqd zqianduan zqiand zqduan zqd",
"murren m mu l lu lun mulun mul mlun ml",
"l lu m ma man luman lum lman lm",
"d   di   din   ding   d   di   din   ding   f   fu   w   wu   dingding   dingd   dding   dd   dingfu   dingf   dfu   df   fuwu   fuw   fwu   fw   dingdingfu   dingdingf   dingdfu   dingdf   ddingfu   ddingf   ddfu   ddf   dingfuwu   dingfuw   dingfwu   dingfw   dfuwu   dfuw   dfwu   dfw   dingdingfuwu   dingdingfuw   dingdingfwu   dingdingfw   dingdfuwu   dingdfuw   dingdfwu   dingdfw   ddingfuwu   ddingfuw   ddingfwu   ddingfw   ddfuwu   ddfuw   ddfwu   ddfw   l   li   x   xi   xia   lixia   lix   lxia   lx",
"lucy y ya yan x xi yanxi yanx yxi yx",
"p pe pen peng l le lei penglei pengl plei pl lucy"
};
char *nickName[] = {
"ab",
"bc",
"cd",
"de",
"ef",
"fg"
};
int id[] = { 21078,21218,21125,53234,40824,164873 };

void prepareSchema_prefix(sqlite3 *db) {
char *schemas[] = {
"PRAGMA journal_mode=WAL;",
"drop table if exists tbl;",
"drop table if exists tbl_fts;",
"create table if not exists tbl(name text, uid int primary key, nick text);",
"create virtual table if not exists tbl_fts USING fts5(name, uid);",
"create trigger if not exists trigger_insert_tbl after insert on tbl \
begin \
replace into tbl_fts(rowid, name, uid) values(new.rowid<http://new.rowid>,new.name<http://new.name>,new.uid<http://new.uid>); \
end;",
"create trigger if not exists trigger_update_tbl after update on tbl \
begin \
replace into tbl_fts(rowid, name, uid) values(new.rowid<http://new.rowid>,new.name<http://new.name>,new.uid<http://new.uid>);\
end;"
};
int i, j, rc, cnt;
cnt = sizeof(schemas) / sizeof(char*);
char *errMsg;
for (i = 0; i < cnt; i++) {
rc = sqlite3_exec(db, schemas[i], NULL, NULL, &errMsg);
if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
printf("prepareSchema error: %s\n", errMsg);
return;
}
}
printf("prepareSchema over\n");
}

void fts5_prefix_match_test() {
sqlite3 *db;
int i, j, k, rc;
char *errMsg;
char *append = "abcdefghijklmnopqrstuvwxyz";
sqlite3_open("a.db<http://a.db>", &db);
prepareSchema_prefix(db);
int test_cnt = 300, rc_copy = 0;
char zSql[2048];
char *tmp;
const int len_zSql = 2047;
int len_left;
int nRow = 0, nCol = 0;
char **pazResult;
const int trx_cnt = 10000;
int trx_idx;
for (trx_idx = 0; trx_idx < 10000; trx_idx++) {
for (i = 0; i < test_cnt; i++) {
j = random(6);
len_left = len_zSql;
tmp = zSql;
if (j < 4) {
int uid = id[j] + random(300);
rc_copy = snprintf(tmp, len_left, "insert into tbl values(\"%s%c\",%d,\"%s%c\");",
orgName[j], append[random(26)], uid, nickName[j], append[random(26)]);
//printf("%s\n", zSql);
rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
if (rc != SQLITE_CONSTRAINT) {
printf("error %s: %s\n", zSql, errMsg);
continue;
}
else {
rc_copy = snprintf(tmp, len_left, "update tbl set name=\"%s%c\" where uid=%d\n",
orgName[j], append[random(26)], uid);
rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
printf("error %s: %s\n", zSql, errMsg);
continue;
}
}
}
}
else {
rc_copy = snprintf(zSql, len_zSql, "select * from tbl where uid=%d;", id[j]);
rc = sqlite3_get_table(db, zSql, &pazResult, &nRow, &nCol, &errMsg);
if (nRow == 0) {
rc_copy = snprintf(tmp, len_left, "insert into tbl(name, uid, nick) values(\"%s\",%d,\"%s\");",
orgName[j], id[j], nickName[j]);
}
else {
rc_copy = snprintf(zSql, len_zSql, "update tbl set name=\"%s\" where uid=%d;",
orgName[j], id[j]);
}
rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
printf("error %s: %s\n", zSql, errMsg);
continue;
}
}
}
printf("insert 300 records\n");
sqlite3_sleep(200);
}
}


unsigned __stdcall writeFunc(void *pIn) {
fts5_prefix_match_test();
printf("write over\n");
return 0;
}
unsigned __stdcall readFunc(void *pIn) {
sqlite3 *db;
int i, j, rc;
char *errMsg;
char **pazResult;
int nRow, nCol;
sqlite3_sleep(2000);
rc = sqlite3_open("a.db<http://a.db>", &db);
if (rc != SQLITE_OK) {
printf("open db error\n");
return 0;
}
char *zSql = "select rowid, * from tbl_fts where tbl_fts match \'lucy*\';";
while (1) {
rc = sqlite3_get_table(db, zSql, &pazResult, &nRow, &nCol, &errMsg);
if (nRow > 2) {
printf("mis-match\n");
for (i = 0; i <= nRow; i++) {
for (j = 0; j < nCol; j++) {
printf("%s\t", pazResult[i*nCol + j]);
}
printf("\n");
}
return 0;
}
else {
printf("match cnt = %d\n", nRow);
}
sqlite3_free_table(pazResult);
sqlite3_sleep(3000);
}
return 0;
}
#define THREAD_NUM 2

int thread(ThreadFuncType writeFunc, ThreadFuncType readFunc) {
HANDLE handle[THREAD_NUM];
if (writeFunc) {
handle[0] = (HANDLE)_beginthreadex(NULL, 0, writeFunc, NULL, 0, NULL);
//WaitForMultipleObjects(1, handle, TRUE, INFINITE);
}
if (readFunc) {
for (int i = 1; i < THREAD_NUM; i++) {
handle[i] = (HANDLE)_beginthreadex(NULL, 0, readFunc, NULL, 0, NULL);
Sleep(5);
}
if (writeFunc) {
WaitForMultipleObjects(THREAD_NUM, &handle[0], TRUE, INFINITE);
}
else {
WaitForMultipleObjects(THREAD_NUM - 1, &handle[1], TRUE, INFINITE);
}
printf("multithread are all over now.\n");
}
else {
WaitForMultipleObjects(1, &handle[0], TRUE, INFINITE);
printf("multithread are all over now.\n");
}
return 0;
}

int main() {
thread(writeFunc, readFunc);
return 0;
}






?取 Outlook for Android<https://aka.ms/ghei36>

________________________________
From: sqlite-users <[hidden email]> on behalf of Dan Kennedy <[hidden email]>
Sent: Tuesday, March 20, 2018 7:49:37 PM
To: [hidden email]
Subject: Re: [sqlite] SQLITE3 FTS5 prefix query get mismatch result

On 03/20/2018 02:31 PM, zheng xiaojin wrote:
> (I am not sure whether you have seen my message, so resend again)
> Hi,
> When I use FTS5, I have met that, there are some cases which will get mis-match results with prefix search.
>
> Like "select * from tbl_fts where tbl_fts match 'lucy*';",which I want to get records like "lucya","lucyabc" etc, and
>
> "lux" or "lulu" is not what I want but returned.
>
> Such problems are not common, But I have tried to build such test case which can lead to this problem very easy.

Thanks for looking into and reporting this problem. Are you able to post
the source for the program you used to execute the test described below?

Dan.


> Here is how I generate it:
>
> 1) create an fts5 table. and insert some record like "lucya","lucyb".
>
> 2) prepare some records: a) lusheng b)lulu; c)lunix; d)luma; e) pengyu. a,b,c,d are have some same prefix(lu), e is some other random case.
>
> 3) before insert into the fts table with 2) records, appending some random letter to make each record different.
>
> Like: "lulu","luluabc","luluefg", also "lunix","lunixabc",etc
>
> 4) for-loop insert, and each loop trying to lantch the query 'lucy*', \
>
> check the match result will finding the mis-match result, the corrent results should be "lucya","lucyb", not "luluabc"...
>
>
> When mis-match happen, I try to analyze the prefix search mechanism and find that, there are 2 points which I think have problems:
>
> 1) fts5LeafSeek, when search failed, and exec goto search_failed, in search_failed, the 2 if condition will not satisfy commonly. In my mind, I think it should return,
>
> but not, and then the search_success logic exec.
>
> 2) fts5SetupPrefixIter, when gather results, the logic to set the flag bNewTerm has some leak, which will set bNewTerm=false,
>
> but the record is not what we want indeed.
>
>
> These 2 logic problems lead to mis-match results. I try to remove the bNewTerm logic directly, and make it compare every loog,
>
> then, the mis-match results disappear.
>
> // relevant code
>
> Change below
>
> if (bNewTerm) {
>
> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>
> }
>
> to
>
> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>
>
> Need your help to recheck the FTS5 prefix search logic, thank you very much.
>
> Yours,
>
> xiaojin zheng
>
>
> ?取 Outlook for Android<https://aka.ms/ghei36>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: here is my test case//Re: SQLITE3 FTS5 prefix query get mismatch result

Dan Kennedy-4
On 03/21/2018 03:16 PM, zheng xiaojin wrote:
> (Please add.the head file and sqlite lib yourself, Thank you very much)


Hi,

Thanks for doing this. I'm running this version of the test program
modified for posix threads:

   https://pastebin.com/d1HCX2aJ

but I haven't seen any errors yet. How long do you usually have to run
it for before seeing the mismatch error?

Which version of SQLite are you seeing the errors with?

Are you building SQLite with SQLITE_THREADSAFE=0?

Do you have access to a Linux or similar system on which you can run the
pthreads version above? Does it produce errors for you?

Thanks,
Dan Kennedy.





>
> char *orgName[] = {
> "yun j ji yunji yunj yji yj z zh zha zhao q qi qia qian d du dua duan zhaoqian zhaoq zhqian zhq zqian zq qianduan qiand qduan qd zhaoqianduan zhaoqiand zhaoqduan zhaoqd zhqianduan zhqiand zhqduan zhqd zqianduan zqiand zqduan zqd",
> "murren m mu l lu lun mulun mul mlun ml",
> "l lu m ma man luman lum lman lm",
> "d   di   din   ding   d   di   din   ding   f   fu   w   wu   dingding   dingd   dding   dd   dingfu   dingf   dfu   df   fuwu   fuw   fwu   fw   dingdingfu   dingdingf   dingdfu   dingdf   ddingfu   ddingf   ddfu   ddf   dingfuwu   dingfuw   dingfwu   dingfw   dfuwu   dfuw   dfwu   dfw   dingdingfuwu   dingdingfuw   dingdingfwu   dingdingfw   dingdfuwu   dingdfuw   dingdfwu   dingdfw   ddingfuwu   ddingfuw   ddingfwu   ddingfw   ddfuwu   ddfuw   ddfwu   ddfw   l   li   x   xi   xia   lixia   lix   lxia   lx",
> "lucy y ya yan x xi yanxi yanx yxi yx",
> "p pe pen peng l le lei penglei pengl plei pl lucy"
> };
> char *nickName[] = {
> "ab",
> "bc",
> "cd",
> "de",
> "ef",
> "fg"
> };
> int id[] = { 21078,21218,21125,53234,40824,164873 };
>
> void prepareSchema_prefix(sqlite3 *db) {
> char *schemas[] = {
> "PRAGMA journal_mode=WAL;",
> "drop table if exists tbl;",
> "drop table if exists tbl_fts;",
> "create table if not exists tbl(name text, uid int primary key, nick text);",
> "create virtual table if not exists tbl_fts USING fts5(name, uid);",
> "create trigger if not exists trigger_insert_tbl after insert on tbl \
> begin \
> replace into tbl_fts(rowid, name, uid) values(new.rowid<http://new.rowid>,new.name<http://new.name>,new.uid<http://new.uid>); \
> end;",
> "create trigger if not exists trigger_update_tbl after update on tbl \
> begin \
> replace into tbl_fts(rowid, name, uid) values(new.rowid<http://new.rowid>,new.name<http://new.name>,new.uid<http://new.uid>);\
> end;"
> };
> int i, j, rc, cnt;
> cnt = sizeof(schemas) / sizeof(char*);
> char *errMsg;
> for (i = 0; i < cnt; i++) {
> rc = sqlite3_exec(db, schemas[i], NULL, NULL, &errMsg);
> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
> printf("prepareSchema error: %s\n", errMsg);
> return;
> }
> }
> printf("prepareSchema over\n");
> }
>
> void fts5_prefix_match_test() {
> sqlite3 *db;
> int i, j, k, rc;
> char *errMsg;
> char *append = "abcdefghijklmnopqrstuvwxyz";
> sqlite3_open("a.db<http://a.db>", &db);
> prepareSchema_prefix(db);
> int test_cnt = 300, rc_copy = 0;
> char zSql[2048];
> char *tmp;
> const int len_zSql = 2047;
> int len_left;
> int nRow = 0, nCol = 0;
> char **pazResult;
> const int trx_cnt = 10000;
> int trx_idx;
> for (trx_idx = 0; trx_idx < 10000; trx_idx++) {
> for (i = 0; i < test_cnt; i++) {
> j = random(6);
> len_left = len_zSql;
> tmp = zSql;
> if (j < 4) {
> int uid = id[j] + random(300);
> rc_copy = snprintf(tmp, len_left, "insert into tbl values(\"%s%c\",%d,\"%s%c\");",
> orgName[j], append[random(26)], uid, nickName[j], append[random(26)]);
> //printf("%s\n", zSql);
> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
> if (rc != SQLITE_CONSTRAINT) {
> printf("error %s: %s\n", zSql, errMsg);
> continue;
> }
> else {
> rc_copy = snprintf(tmp, len_left, "update tbl set name=\"%s%c\" where uid=%d\n",
> orgName[j], append[random(26)], uid);
> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
> printf("error %s: %s\n", zSql, errMsg);
> continue;
> }
> }
> }
> }
> else {
> rc_copy = snprintf(zSql, len_zSql, "select * from tbl where uid=%d;", id[j]);
> rc = sqlite3_get_table(db, zSql, &pazResult, &nRow, &nCol, &errMsg);
> if (nRow == 0) {
> rc_copy = snprintf(tmp, len_left, "insert into tbl(name, uid, nick) values(\"%s\",%d,\"%s\");",
> orgName[j], id[j], nickName[j]);
> }
> else {
> rc_copy = snprintf(zSql, len_zSql, "update tbl set name=\"%s\" where uid=%d;",
> orgName[j], id[j]);
> }
> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
> printf("error %s: %s\n", zSql, errMsg);
> continue;
> }
> }
> }
> printf("insert 300 records\n");
> sqlite3_sleep(200);
> }
> }
>
>
> unsigned __stdcall writeFunc(void *pIn) {
> fts5_prefix_match_test();
> printf("write over\n");
> return 0;
> }
> unsigned __stdcall readFunc(void *pIn) {
> sqlite3 *db;
> int i, j, rc;
> char *errMsg;
> char **pazResult;
> int nRow, nCol;
> sqlite3_sleep(2000);
> rc = sqlite3_open("a.db<http://a.db>", &db);
> if (rc != SQLITE_OK) {
> printf("open db error\n");
> return 0;
> }
> char *zSql = "select rowid, * from tbl_fts where tbl_fts match \'lucy*\';";
> while (1) {
> rc = sqlite3_get_table(db, zSql, &pazResult, &nRow, &nCol, &errMsg);
> if (nRow > 2) {
> printf("mis-match\n");
> for (i = 0; i <= nRow; i++) {
> for (j = 0; j < nCol; j++) {
> printf("%s\t", pazResult[i*nCol + j]);
> }
> printf("\n");
> }
> return 0;
> }
> else {
> printf("match cnt = %d\n", nRow);
> }
> sqlite3_free_table(pazResult);
> sqlite3_sleep(3000);
> }
> return 0;
> }
> #define THREAD_NUM 2
>
> int thread(ThreadFuncType writeFunc, ThreadFuncType readFunc) {
> HANDLE handle[THREAD_NUM];
> if (writeFunc) {
> handle[0] = (HANDLE)_beginthreadex(NULL, 0, writeFunc, NULL, 0, NULL);
> //WaitForMultipleObjects(1, handle, TRUE, INFINITE);
> }
> if (readFunc) {
> for (int i = 1; i < THREAD_NUM; i++) {
> handle[i] = (HANDLE)_beginthreadex(NULL, 0, readFunc, NULL, 0, NULL);
> Sleep(5);
> }
> if (writeFunc) {
> WaitForMultipleObjects(THREAD_NUM, &handle[0], TRUE, INFINITE);
> }
> else {
> WaitForMultipleObjects(THREAD_NUM - 1, &handle[1], TRUE, INFINITE);
> }
> printf("multithread are all over now.\n");
> }
> else {
> WaitForMultipleObjects(1, &handle[0], TRUE, INFINITE);
> printf("multithread are all over now.\n");
> }
> return 0;
> }
>
> int main() {
> thread(writeFunc, readFunc);
> return 0;
> }
>
>
>
>
>
>
> ?取 Outlook for Android<https://aka.ms/ghei36>
>
> ________________________________
> From: sqlite-users <[hidden email]> on behalf of Dan Kennedy <[hidden email]>
> Sent: Tuesday, March 20, 2018 7:49:37 PM
> To: [hidden email]
> Subject: Re: [sqlite] SQLITE3 FTS5 prefix query get mismatch result
>
> On 03/20/2018 02:31 PM, zheng xiaojin wrote:
>> (I am not sure whether you have seen my message, so resend again)
>> Hi,
>> When I use FTS5, I have met that, there are some cases which will get mis-match results with prefix search.
>>
>> Like "select * from tbl_fts where tbl_fts match 'lucy*';",which I want to get records like "lucya","lucyabc" etc, and
>>
>> "lux" or "lulu" is not what I want but returned.
>>
>> Such problems are not common, But I have tried to build such test case which can lead to this problem very easy.
> Thanks for looking into and reporting this problem. Are you able to post
> the source for the program you used to execute the test described below?
>
> Dan.
>
>
>> Here is how I generate it:
>>
>> 1) create an fts5 table. and insert some record like "lucya","lucyb".
>>
>> 2) prepare some records: a) lusheng b)lulu; c)lunix; d)luma; e) pengyu. a,b,c,d are have some same prefix(lu), e is some other random case.
>>
>> 3) before insert into the fts table with 2) records, appending some random letter to make each record different.
>>
>> Like: "lulu","luluabc","luluefg", also "lunix","lunixabc",etc
>>
>> 4) for-loop insert, and each loop trying to lantch the query 'lucy*', \
>>
>> check the match result will finding the mis-match result, the corrent results should be "lucya","lucyb", not "luluabc"...
>>
>>
>> When mis-match happen, I try to analyze the prefix search mechanism and find that, there are 2 points which I think have problems:
>>
>> 1) fts5LeafSeek, when search failed, and exec goto search_failed, in search_failed, the 2 if condition will not satisfy commonly. In my mind, I think it should return,
>>
>> but not, and then the search_success logic exec.
>>
>> 2) fts5SetupPrefixIter, when gather results, the logic to set the flag bNewTerm has some leak, which will set bNewTerm=false,
>>
>> but the record is not what we want indeed.
>>
>>
>> These 2 logic problems lead to mis-match results. I try to remove the bNewTerm logic directly, and make it compare every loog,
>>
>> then, the mis-match results disappear.
>>
>> // relevant code
>>
>> Change below
>>
>> if (bNewTerm) {
>>
>> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>>
>> }
>>
>> to
>>
>> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>>
>>
>> Need your help to recheck the FTS5 prefix search logic, thank you very much.
>>
>> Yours,
>>
>> xiaojin zheng
>>
>>
>> ?取 Outlook for Android<https://aka.ms/ghei36>
>>
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: here is my test case//Re: SQLITE3 FTS5 prefix query get mismatch result

ze tian
Sorry, but testcase maybe have some format problem, changed belowed,
I run this on SQLITE 3.15.2, both on Windows and linux. and only less than 2min can happen. It doesn't matter about threadsafe, because the write func is only used to protect the test data. It means you can stop the. write func when problem happen, and then you can just shell to the db and run the same query then problem happen definitely.



> "create trigger if not exists trigger_insert_tbl after insert on tbl \
> begin \
> replace into tbl_fts(rowid, name, uid) values(new.rowid<http://new.rowid>,new.name<http://new.name>,,new.uid<http://new.uid>); \
> end;",
> "create trigger if not exists trigger_update_tbl after update on tbl \
> begin \
> replace into tbl_fts(rowid, name, uid) values(new.rowi<http://new.rowid>d,new.name<http://new.name>,new.uid<http://new.uid>);\
> end;"


?取 Outlook for Android<https://aka.ms/ghei36>

________________________________
From: sqlite-users <[hidden email]> on behalf of Dan Kennedy <[hidden email]>
Sent: Thursday, March 22, 2018 4:56:15 PM
To: [hidden email]
Subject: Re: [sqlite] here is my test case//Re: SQLITE3 FTS5 prefix query get mismatch result

On 03/21/2018 03:16 PM, zheng xiaojin wrote:
> (Please add.the head file and sqlite lib yourself, Thank you very much)


Hi,

Thanks for doing this. I'm running this version of the test program
modified for posix threads:

   https://pastebin.com/d1HCX2aJ

but I haven't seen any errors yet. How long do you usually have to run
it for before seeing the mismatch error?

Which version of SQLite are you seeing the errors with?

Are you building SQLite with SQLITE_THREADSAFE=0?

Do you have access to a Linux or similar system on which you can run the
pthreads version above? Does it produce errors for you?

Thanks,
Dan Kennedy.





>
> char *orgName[] = {
> "yun j ji yunji yunj yji yj z zh zha zhao q qi qia qian d du dua duan zhaoqian zhaoq zhqian zhq zqian zq qianduan qiand qduan qd zhaoqianduan zhaoqiand zhaoqduan zhaoqd zhqianduan zhqiand zhqduan zhqd zqianduan zqiand zqduan zqd",
> "murren m mu l lu lun mulun mul mlun ml",
> "l lu m ma man luman lum lman lm",
> "d   di   din   ding   d   di   din   ding   f   fu   w   wu   dingding   dingd   dding   dd   dingfu   dingf   dfu   df   fuwu   fuw   fwu   fw   dingdingfu   dingdingf   dingdfu   dingdf   ddingfu   ddingf   ddfu   ddf   dingfuwu   dingfuw   dingfwu   dingfw   dfuwu   dfuw   dfwu   dfw   dingdingfuwu   dingdingfuw   dingdingfwu   dingdingfw   dingdfuwu   dingdfuw   dingdfwu   dingdfw   ddingfuwu   ddingfuw   ddingfwu   ddingfw   ddfuwu   ddfuw   ddfwu   ddfw   l   li   x   xi   xia   lixia   lix   lxia   lx",
> "lucy y ya yan x xi yanxi yanx yxi yx",
> "p pe pen peng l le lei penglei pengl plei pl lucy"
> };
> char *nickName[] = {
> "ab",
> "bc",
> "cd",
> "de",
> "ef",
> "fg"
> };
> int id[] = { 21078,21218,21125,53234,40824,164873 };
>
> void prepareSchema_prefix(sqlite3 *db) {
> char *schemas[] = {
> "PRAGMA journal_mode=WAL;",
> "drop table if exists tbl;",
> "drop table if exists tbl_fts;",
> "create table if not exists tbl(name text, uid int primary key, nick text);",
> "create virtual table if not exists tbl_fts USING fts5(name, uid);",
> "create trigger if not exists trigger_insert_tbl after insert on tbl \
> begin \
> replace into tbl_fts(rowid, name, uid) values(new.rowid<http://new.rowid>,new.name<http://new.name>,new.uid<http://new.uid>); \
> end;",
> "create trigger if not exists trigger_update_tbl after update on tbl \
> begin \
> replace into tbl_fts(rowid, name, uid) values(new.rowid<http://new.rowid>,new.name<http://new.name>,new.uid<http://new.uid>);\
> end;"
> };
> int i, j, rc, cnt;
> cnt = sizeof(schemas) / sizeof(char*);
> char *errMsg;
> for (i = 0; i < cnt; i++) {
> rc = sqlite3_exec(db, schemas[i], NULL, NULL, &errMsg);
> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
> printf("prepareSchema error: %s\n", errMsg);
> return;
> }
> }
> printf("prepareSchema over\n");
> }
>
> void fts5_prefix_match_test() {
> sqlite3 *db;
> int i, j, k, rc;
> char *errMsg;
> char *append = "abcdefghijklmnopqrstuvwxyz";
> sqlite3_open("a.db<http://a.db>", &db);
> prepareSchema_prefix(db);
> int test_cnt = 300, rc_copy = 0;
> char zSql[2048];
> char *tmp;
> const int len_zSql = 2047;
> int len_left;
> int nRow = 0, nCol = 0;
> char **pazResult;
> const int trx_cnt = 10000;
> int trx_idx;
> for (trx_idx = 0; trx_idx < 10000; trx_idx++) {
> for (i = 0; i < test_cnt; i++) {
> j = random(6);
> len_left = len_zSql;
> tmp = zSql;
> if (j < 4) {
> int uid = id[j] + random(300);
> rc_copy = snprintf(tmp, len_left, "insert into tbl values(\"%s%c\",%d,\"%s%c\");",
> orgName[j], append[random(26)], uid, nickName[j], append[random(26)]);
> //printf("%s\n", zSql);
> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
> if (rc != SQLITE_CONSTRAINT) {
> printf("error %s: %s\n", zSql, errMsg);
> continue;
> }
> else {
> rc_copy = snprintf(tmp, len_left, "update tbl set name=\"%s%c\" where uid=%d\n",
> orgName[j], append[random(26)], uid);
> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
> printf("error %s: %s\n", zSql, errMsg);
> continue;
> }
> }
> }
> }
> else {
> rc_copy = snprintf(zSql, len_zSql, "select * from tbl where uid=%d;", id[j]);
> rc = sqlite3_get_table(db, zSql, &pazResult, &nRow, &nCol, &errMsg);
> if (nRow == 0) {
> rc_copy = snprintf(tmp, len_left, "insert into tbl(name, uid, nick) values(\"%s\",%d,\"%s\");",
> orgName[j], id[j], nickName[j]);
> }
> else {
> rc_copy = snprintf(zSql, len_zSql, "update tbl set name=\"%s\" where uid=%d;",
> orgName[j], id[j]);
> }
> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
> printf("error %s: %s\n", zSql, errMsg);
> continue;
> }
> }
> }
> printf("insert 300 records\n");
> sqlite3_sleep(200);
> }
> }
>
>
> unsigned __stdcall writeFunc(void *pIn) {
> fts5_prefix_match_test();
> printf("write over\n");
> return 0;
> }
> unsigned __stdcall readFunc(void *pIn) {
> sqlite3 *db;
> int i, j, rc;
> char *errMsg;
> char **pazResult;
> int nRow, nCol;
> sqlite3_sleep(2000);
> rc = sqlite3_open("a.db<http://a.db>", &db);
> if (rc != SQLITE_OK) {
> printf("open db error\n");
> return 0;
> }
> char *zSql = "select rowid, * from tbl_fts where tbl_fts match \'lucy*\';";
> while (1) {
> rc = sqlite3_get_table(db, zSql, &pazResult, &nRow, &nCol, &errMsg);
> if (nRow > 2) {
> printf("mis-match\n");
> for (i = 0; i <= nRow; i++) {
> for (j = 0; j < nCol; j++) {
> printf("%s\t", pazResult[i*nCol + j]);
> }
> printf("\n");
> }
> return 0;
> }
> else {
> printf("match cnt = %d\n", nRow);
> }
> sqlite3_free_table(pazResult);
> sqlite3_sleep(3000);
> }
> return 0;
> }
> #define THREAD_NUM 2
>
> int thread(ThreadFuncType writeFunc, ThreadFuncType readFunc) {
> HANDLE handle[THREAD_NUM];
> if (writeFunc) {
> handle[0] = (HANDLE)_beginthreadex(NULL, 0, writeFunc, NULL, 0, NULL);
> //WaitForMultipleObjects(1, handle, TRUE, INFINITE);
> }
> if (readFunc) {
> for (int i = 1; i < THREAD_NUM; i++) {
> handle[i] = (HANDLE)_beginthreadex(NULL, 0, readFunc, NULL, 0, NULL);
> Sleep(5);
> }
> if (writeFunc) {
> WaitForMultipleObjects(THREAD_NUM, &handle[0], TRUE, INFINITE);
> }
> else {
> WaitForMultipleObjects(THREAD_NUM - 1, &handle[1], TRUE, INFINITE);
> }
> printf("multithread are all over now.\n");
> }
> else {
> WaitForMultipleObjects(1, &handle[0], TRUE, INFINITE);
> printf("multithread are all over now.\n");
> }
> return 0;
> }
>
> int main() {
> thread(writeFunc, readFunc);
> return 0;
> }
>
>
>
>
>
>
> ?取 Outlook for Android<https://aka.ms/ghei36>
>
> ________________________________
> From: sqlite-users <[hidden email]> on behalf of Dan Kennedy <[hidden email]>
> Sent: Tuesday, March 20, 2018 7:49:37 PM
> To: [hidden email]
> Subject: Re: [sqlite] SQLITE3 FTS5 prefix query get mismatch result
>
> On 03/20/2018 02:31 PM, zheng xiaojin wrote:
>> (I am not sure whether you have seen my message, so resend again)
>> Hi,
>> When I use FTS5, I have met that, there are some cases which will get mis-match results with prefix search.
>>
>> Like "select * from tbl_fts where tbl_fts match 'lucy*';",which I want to get records like "lucya","lucyabc" etc, and
>>
>> "lux" or "lulu" is not what I want but returned.
>>
>> Such problems are not common, But I have tried to build such test case which can lead to this problem very easy.
> Thanks for looking into and reporting this problem. Are you able to post
> the source for the program you used to execute the test described below?
>
> Dan.
>
>
>> Here is how I generate it:
>>
>> 1) create an fts5 table. and insert some record like "lucya","lucyb".
>>
>> 2) prepare some records: a) lusheng b)lulu; c)lunix; d)luma; e) pengyu. a,b,c,d are have some same prefix(lu), e is some other random case.
>>
>> 3) before insert into the fts table with 2) records, appending some random letter to make each record different.
>>
>> Like: "lulu","luluabc","luluefg", also "lunix","lunixabc",etc
>>
>> 4) for-loop insert, and each loop trying to lantch the query 'lucy*', \
>>
>> check the match result will finding the mis-match result, the corrent results should be "lucya","lucyb", not "luluabc"...
>>
>>
>> When mis-match happen, I try to analyze the prefix search mechanism and find that, there are 2 points which I think have problems:
>>
>> 1) fts5LeafSeek, when search failed, and exec goto search_failed, in search_failed, the 2 if condition will not satisfy commonly. In my mind, I think it should return,
>>
>> but not, and then the search_success logic exec.
>>
>> 2) fts5SetupPrefixIter, when gather results, the logic to set the flag bNewTerm has some leak, which will set bNewTerm=false,
>>
>> but the record is not what we want indeed.
>>
>>
>> These 2 logic problems lead to mis-match results. I try to remove the bNewTerm logic directly, and make it compare every loog,
>>
>> then, the mis-match results disappear.
>>
>> // relevant code
>>
>> Change below
>>
>> if (bNewTerm) {
>>
>> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>>
>> }
>>
>> to
>>
>> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>>
>>
>> Need your help to recheck the FTS5 prefix search logic, thank you very much.
>>
>> Yours,
>>
>> xiaojin zheng
>>
>>
>> ?取 Outlook for Android<https://aka.ms/ghei36>
>>
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: here is my test case//Re: SQLITE3 FTS5 prefix query get mismatch result

Dan Kennedy-4
On 03/22/2018 05:38 PM, zheng xiaojin wrote:
> Sorry, but testcase maybe have some format problem, changed belowed,
> I run this on SQLITE 3.15.2, both on Windows and linux. and only less than 2min can happen. It doesn't matter about threadsafe, because the write func is only used to protect the test data. It means you can stop the. write func when problem happen, and then you can just shell to the db and run the same query then problem happen definitely.


Can you do that (create a db that exhibits the problem) and upload it
somewhere or email it directly to me? Then I'll be able to demonstrate
the problem using just the shell.

Thanks,
Dan.



>
>
>
>> "create trigger if not exists trigger_insert_tbl after insert on tbl \
>> begin \
>> replace into tbl_fts(rowid, name, uid) values(new.rowid<http://new.rowid>,new.name<http://new.name>,,new.uid<http://new.uid>); \
>> end;",
>> "create trigger if not exists trigger_update_tbl after update on tbl \
>> begin \
>> replace into tbl_fts(rowid, name, uid) values(new.rowi<http://new.rowid>d,new.name<http://new.name>,new.uid<http://new.uid>);\
>> end;"
>
> ?取 Outlook for Android<https://aka.ms/ghei36>
>
> ________________________________
> From: sqlite-users <[hidden email]> on behalf of Dan Kennedy <[hidden email]>
> Sent: Thursday, March 22, 2018 4:56:15 PM
> To: [hidden email]
> Subject: Re: [sqlite] here is my test case//Re: SQLITE3 FTS5 prefix query get mismatch result
>
> On 03/21/2018 03:16 PM, zheng xiaojin wrote:
>> (Please add.the head file and sqlite lib yourself, Thank you very much)
>
> Hi,
>
> Thanks for doing this. I'm running this version of the test program
> modified for posix threads:
>
>     https://pastebin.com/d1HCX2aJ
>
> but I haven't seen any errors yet. How long do you usually have to run
> it for before seeing the mismatch error?
>
> Which version of SQLite are you seeing the errors with?
>
> Are you building SQLite with SQLITE_THREADSAFE=0?
>
> Do you have access to a Linux or similar system on which you can run the
> pthreads version above? Does it produce errors for you?
>
> Thanks,
> Dan Kennedy.
>
>
>
>
>
>> char *orgName[] = {
>> "yun j ji yunji yunj yji yj z zh zha zhao q qi qia qian d du dua duan zhaoqian zhaoq zhqian zhq zqian zq qianduan qiand qduan qd zhaoqianduan zhaoqiand zhaoqduan zhaoqd zhqianduan zhqiand zhqduan zhqd zqianduan zqiand zqduan zqd",
>> "murren m mu l lu lun mulun mul mlun ml",
>> "l lu m ma man luman lum lman lm",
>> "d   di   din   ding   d   di   din   ding   f   fu   w   wu   dingding   dingd   dding   dd   dingfu   dingf   dfu   df   fuwu   fuw   fwu   fw   dingdingfu   dingdingf   dingdfu   dingdf   ddingfu   ddingf   ddfu   ddf   dingfuwu   dingfuw   dingfwu   dingfw   dfuwu   dfuw   dfwu   dfw   dingdingfuwu   dingdingfuw   dingdingfwu   dingdingfw   dingdfuwu   dingdfuw   dingdfwu   dingdfw   ddingfuwu   ddingfuw   ddingfwu   ddingfw   ddfuwu   ddfuw   ddfwu   ddfw   l   li   x   xi   xia   lixia   lix   lxia   lx",
>> "lucy y ya yan x xi yanxi yanx yxi yx",
>> "p pe pen peng l le lei penglei pengl plei pl lucy"
>> };
>> char *nickName[] = {
>> "ab",
>> "bc",
>> "cd",
>> "de",
>> "ef",
>> "fg"
>> };
>> int id[] = { 21078,21218,21125,53234,40824,164873 };
>>
>> void prepareSchema_prefix(sqlite3 *db) {
>> char *schemas[] = {
>> "PRAGMA journal_mode=WAL;",
>> "drop table if exists tbl;",
>> "drop table if exists tbl_fts;",
>> "create table if not exists tbl(name text, uid int primary key, nick text);",
>> "create virtual table if not exists tbl_fts USING fts5(name, uid);",
>> "create trigger if not exists trigger_insert_tbl after insert on tbl \
>> begin \
>> replace into tbl_fts(rowid, name, uid) values(new.rowid<http://new.rowid>,new.name<http://new.name>,new.uid<http://new.uid>); \
>> end;",
>> "create trigger if not exists trigger_update_tbl after update on tbl \
>> begin \
>> replace into tbl_fts(rowid, name, uid) values(new.rowid<http://new.rowid>,new.name<http://new.name>,new.uid<http://new.uid>);\
>> end;"
>> };
>> int i, j, rc, cnt;
>> cnt = sizeof(schemas) / sizeof(char*);
>> char *errMsg;
>> for (i = 0; i < cnt; i++) {
>> rc = sqlite3_exec(db, schemas[i], NULL, NULL, &errMsg);
>> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
>> printf("prepareSchema error: %s\n", errMsg);
>> return;
>> }
>> }
>> printf("prepareSchema over\n");
>> }
>>
>> void fts5_prefix_match_test() {
>> sqlite3 *db;
>> int i, j, k, rc;
>> char *errMsg;
>> char *append = "abcdefghijklmnopqrstuvwxyz";
>> sqlite3_open("a.db<http://a.db>", &db);
>> prepareSchema_prefix(db);
>> int test_cnt = 300, rc_copy = 0;
>> char zSql[2048];
>> char *tmp;
>> const int len_zSql = 2047;
>> int len_left;
>> int nRow = 0, nCol = 0;
>> char **pazResult;
>> const int trx_cnt = 10000;
>> int trx_idx;
>> for (trx_idx = 0; trx_idx < 10000; trx_idx++) {
>> for (i = 0; i < test_cnt; i++) {
>> j = random(6);
>> len_left = len_zSql;
>> tmp = zSql;
>> if (j < 4) {
>> int uid = id[j] + random(300);
>> rc_copy = snprintf(tmp, len_left, "insert into tbl values(\"%s%c\",%d,\"%s%c\");",
>> orgName[j], append[random(26)], uid, nickName[j], append[random(26)]);
>> //printf("%s\n", zSql);
>> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
>> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
>> if (rc != SQLITE_CONSTRAINT) {
>> printf("error %s: %s\n", zSql, errMsg);
>> continue;
>> }
>> else {
>> rc_copy = snprintf(tmp, len_left, "update tbl set name=\"%s%c\" where uid=%d\n",
>> orgName[j], append[random(26)], uid);
>> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
>> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
>> printf("error %s: %s\n", zSql, errMsg);
>> continue;
>> }
>> }
>> }
>> }
>> else {
>> rc_copy = snprintf(zSql, len_zSql, "select * from tbl where uid=%d;", id[j]);
>> rc = sqlite3_get_table(db, zSql, &pazResult, &nRow, &nCol, &errMsg);
>> if (nRow == 0) {
>> rc_copy = snprintf(tmp, len_left, "insert into tbl(name, uid, nick) values(\"%s\",%d,\"%s\");",
>> orgName[j], id[j], nickName[j]);
>> }
>> else {
>> rc_copy = snprintf(zSql, len_zSql, "update tbl set name=\"%s\" where uid=%d;",
>> orgName[j], id[j]);
>> }
>> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
>> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
>> printf("error %s: %s\n", zSql, errMsg);
>> continue;
>> }
>> }
>> }
>> printf("insert 300 records\n");
>> sqlite3_sleep(200);
>> }
>> }
>>
>>
>> unsigned __stdcall writeFunc(void *pIn) {
>> fts5_prefix_match_test();
>> printf("write over\n");
>> return 0;
>> }
>> unsigned __stdcall readFunc(void *pIn) {
>> sqlite3 *db;
>> int i, j, rc;
>> char *errMsg;
>> char **pazResult;
>> int nRow, nCol;
>> sqlite3_sleep(2000);
>> rc = sqlite3_open("a.db<http://a.db>", &db);
>> if (rc != SQLITE_OK) {
>> printf("open db error\n");
>> return 0;
>> }
>> char *zSql = "select rowid, * from tbl_fts where tbl_fts match \'lucy*\';";
>> while (1) {
>> rc = sqlite3_get_table(db, zSql, &pazResult, &nRow, &nCol, &errMsg);
>> if (nRow > 2) {
>> printf("mis-match\n");
>> for (i = 0; i <= nRow; i++) {
>> for (j = 0; j < nCol; j++) {
>> printf("%s\t", pazResult[i*nCol + j]);
>> }
>> printf("\n");
>> }
>> return 0;
>> }
>> else {
>> printf("match cnt = %d\n", nRow);
>> }
>> sqlite3_free_table(pazResult);
>> sqlite3_sleep(3000);
>> }
>> return 0;
>> }
>> #define THREAD_NUM 2
>>
>> int thread(ThreadFuncType writeFunc, ThreadFuncType readFunc) {
>> HANDLE handle[THREAD_NUM];
>> if (writeFunc) {
>> handle[0] = (HANDLE)_beginthreadex(NULL, 0, writeFunc, NULL, 0, NULL);
>> //WaitForMultipleObjects(1, handle, TRUE, INFINITE);
>> }
>> if (readFunc) {
>> for (int i = 1; i < THREAD_NUM; i++) {
>> handle[i] = (HANDLE)_beginthreadex(NULL, 0, readFunc, NULL, 0, NULL);
>> Sleep(5);
>> }
>> if (writeFunc) {
>> WaitForMultipleObjects(THREAD_NUM, &handle[0], TRUE, INFINITE);
>> }
>> else {
>> WaitForMultipleObjects(THREAD_NUM - 1, &handle[1], TRUE, INFINITE);
>> }
>> printf("multithread are all over now.\n");
>> }
>> else {
>> WaitForMultipleObjects(1, &handle[0], TRUE, INFINITE);
>> printf("multithread are all over now.\n");
>> }
>> return 0;
>> }
>>
>> int main() {
>> thread(writeFunc, readFunc);
>> return 0;
>> }
>>
>>
>>
>>
>>
>>
>> ?取 Outlook for Android<https://aka.ms/ghei36>
>>
>> ________________________________
>> From: sqlite-users <[hidden email]> on behalf of Dan Kennedy <[hidden email]>
>> Sent: Tuesday, March 20, 2018 7:49:37 PM
>> To: [hidden email]
>> Subject: Re: [sqlite] SQLITE3 FTS5 prefix query get mismatch result
>>
>> On 03/20/2018 02:31 PM, zheng xiaojin wrote:
>>> (I am not sure whether you have seen my message, so resend again)
>>> Hi,
>>> When I use FTS5, I have met that, there are some cases which will get mis-match results with prefix search.
>>>
>>> Like "select * from tbl_fts where tbl_fts match 'lucy*';",which I want to get records like "lucya","lucyabc" etc, and
>>>
>>> "lux" or "lulu" is not what I want but returned.
>>>
>>> Such problems are not common, But I have tried to build such test case which can lead to this problem very easy.
>> Thanks for looking into and reporting this problem. Are you able to post
>> the source for the program you used to execute the test described below?
>>
>> Dan.
>>
>>
>>> Here is how I generate it:
>>>
>>> 1) create an fts5 table. and insert some record like "lucya","lucyb".
>>>
>>> 2) prepare some records: a) lusheng b)lulu; c)lunix; d)luma; e) pengyu. a,b,c,d are have some same prefix(lu), e is some other random case.
>>>
>>> 3) before insert into the fts table with 2) records, appending some random letter to make each record different.
>>>
>>> Like: "lulu","luluabc","luluefg", also "lunix","lunixabc",etc
>>>
>>> 4) for-loop insert, and each loop trying to lantch the query 'lucy*', \
>>>
>>> check the match result will finding the mis-match result, the corrent results should be "lucya","lucyb", not "luluabc"...
>>>
>>>
>>> When mis-match happen, I try to analyze the prefix search mechanism and find that, there are 2 points which I think have problems:
>>>
>>> 1) fts5LeafSeek, when search failed, and exec goto search_failed, in search_failed, the 2 if condition will not satisfy commonly. In my mind, I think it should return,
>>>
>>> but not, and then the search_success logic exec.
>>>
>>> 2) fts5SetupPrefixIter, when gather results, the logic to set the flag bNewTerm has some leak, which will set bNewTerm=false,
>>>
>>> but the record is not what we want indeed.
>>>
>>>
>>> These 2 logic problems lead to mis-match results. I try to remove the bNewTerm logic directly, and make it compare every loog,
>>>
>>> then, the mis-match results disappear.
>>>
>>> // relevant code
>>>
>>> Change below
>>>
>>> if (bNewTerm) {
>>>
>>> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>>>
>>> }
>>>
>>> to
>>>
>>> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>>>
>>>
>>> Need your help to recheck the FTS5 prefix search logic, thank you very much.
>>>
>>> Yours,
>>>
>>> xiaojin zheng
>>>
>>>
>>> ?取 Outlook for Android<https://aka.ms/ghei36>
>>>
>>> _______________________________________________
>>> sqlite-users mailing list
>>> [hidden email]
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: here is my test case//Re: SQLITE3 FTS5 prefix query get mismatch result

Dan Kennedy-4
On 03/22/2018 06:05 PM, Dan Kennedy wrote:

> On 03/22/2018 05:38 PM, zheng xiaojin wrote:
>> Sorry, but testcase maybe have some format problem, changed belowed,
>> I run this on SQLITE 3.15.2, both on Windows and linux. and only less
>> than 2min can happen. It doesn't matter about threadsafe, because the
>> write func is only used to protect the test data. It means you can
>> stop the. write func when problem happen, and then you can just shell
>> to the db and run the same query then problem happen definitely.
>
>
> Can you do that (create a db that exhibits the problem) and upload it
> somewhere or email it directly to me? Then I'll be able to demonstrate
> the problem using just the shell.

Actually don't worry. I finally got it to work. I think it's the bug
fixed exactly a year and a day ago here:

   http://www.sqlite.org/src/info/840042cb2bed2924

Fix was published in 3.19.0. Can you confirm that 3.19.0 do not produce
the problem for you?

Thanks,
Dan.





>
> Thanks,
> Dan.
>
>
>
>>
>>
>>
>>> "create trigger if not exists trigger_insert_tbl after insert on tbl \
>>> begin \
>>> replace into tbl_fts(rowid, name, uid)
>>> values(new.rowid<http://new.rowid>,new.name<http://new.name>,,new.uid<http://new.uid>);
>>> \
>>> end;",
>>> "create trigger if not exists trigger_update_tbl after update on tbl \
>>> begin \
>>> replace into tbl_fts(rowid, name, uid)
>>> values(new.rowi<http://new.rowid>d,new.name<http://new.name>,new.uid<http://new.uid>);\
>>> end;"
>>
>> ?取 Outlook for Android<https://aka.ms/ghei36>
>>
>> ________________________________
>> From: sqlite-users <[hidden email]> on
>> behalf of Dan Kennedy <[hidden email]>
>> Sent: Thursday, March 22, 2018 4:56:15 PM
>> To: [hidden email]
>> Subject: Re: [sqlite] here is my test case//Re: SQLITE3 FTS5 prefix
>> query get mismatch result
>>
>> On 03/21/2018 03:16 PM, zheng xiaojin wrote:
>>> (Please add.the head file and sqlite lib yourself, Thank you very much)
>>
>> Hi,
>>
>> Thanks for doing this. I'm running this version of the test program
>> modified for posix threads:
>>
>>     https://pastebin.com/d1HCX2aJ
>>
>> but I haven't seen any errors yet. How long do you usually have to run
>> it for before seeing the mismatch error?
>>
>> Which version of SQLite are you seeing the errors with?
>>
>> Are you building SQLite with SQLITE_THREADSAFE=0?
>>
>> Do you have access to a Linux or similar system on which you can run the
>> pthreads version above? Does it produce errors for you?
>>
>> Thanks,
>> Dan Kennedy.
>>
>>
>>
>>
>>
>>> char *orgName[] = {
>>> "yun j ji yunji yunj yji yj z zh zha zhao q qi qia qian d du dua
>>> duan zhaoqian zhaoq zhqian zhq zqian zq qianduan qiand qduan qd
>>> zhaoqianduan zhaoqiand zhaoqduan zhaoqd zhqianduan zhqiand zhqduan
>>> zhqd zqianduan zqiand zqduan zqd",
>>> "murren m mu l lu lun mulun mul mlun ml",
>>> "l lu m ma man luman lum lman lm",
>>> "d   di   din   ding   d   di   din   ding   f   fu   w   wu
>>> dingding   dingd   dding   dd   dingfu   dingf   dfu   df fuwu  
>>> fuw   fwu   fw   dingdingfu   dingdingf   dingdfu dingdf   ddingfu  
>>> ddingf   ddfu   ddf   dingfuwu   dingfuw dingfwu   dingfw   dfuwu  
>>> dfuw   dfwu   dfw   dingdingfuwu dingdingfuw   dingdingfwu  
>>> dingdingfw   dingdfuwu dingdfuw   dingdfwu   dingdfw   ddingfuwu  
>>> ddingfuw ddingfwu   ddingfw   ddfuwu   ddfuw   ddfwu   ddfw   l   li
>>> x   xi   xia   lixia   lix   lxia   lx",
>>> "lucy y ya yan x xi yanxi yanx yxi yx",
>>> "p pe pen peng l le lei penglei pengl plei pl lucy"
>>> };
>>> char *nickName[] = {
>>> "ab",
>>> "bc",
>>> "cd",
>>> "de",
>>> "ef",
>>> "fg"
>>> };
>>> int id[] = { 21078,21218,21125,53234,40824,164873 };
>>>
>>> void prepareSchema_prefix(sqlite3 *db) {
>>> char *schemas[] = {
>>> "PRAGMA journal_mode=WAL;",
>>> "drop table if exists tbl;",
>>> "drop table if exists tbl_fts;",
>>> "create table if not exists tbl(name text, uid int primary key, nick
>>> text);",
>>> "create virtual table if not exists tbl_fts USING fts5(name, uid);",
>>> "create trigger if not exists trigger_insert_tbl after insert on tbl \
>>> begin \
>>> replace into tbl_fts(rowid, name, uid)
>>> values(new.rowid<http://new.rowid>,new.name<http://new.name>,new.uid<http://new.uid>);
>>> \
>>> end;",
>>> "create trigger if not exists trigger_update_tbl after update on tbl \
>>> begin \
>>> replace into tbl_fts(rowid, name, uid)
>>> values(new.rowid<http://new.rowid>,new.name<http://new.name>,new.uid<http://new.uid>);\
>>> end;"
>>> };
>>> int i, j, rc, cnt;
>>> cnt = sizeof(schemas) / sizeof(char*);
>>> char *errMsg;
>>> for (i = 0; i < cnt; i++) {
>>> rc = sqlite3_exec(db, schemas[i], NULL, NULL, &errMsg);
>>> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
>>> printf("prepareSchema error: %s\n", errMsg);
>>> return;
>>> }
>>> }
>>> printf("prepareSchema over\n");
>>> }
>>>
>>> void fts5_prefix_match_test() {
>>> sqlite3 *db;
>>> int i, j, k, rc;
>>> char *errMsg;
>>> char *append = "abcdefghijklmnopqrstuvwxyz";
>>> sqlite3_open("a.db<http://a.db>", &db);
>>> prepareSchema_prefix(db);
>>> int test_cnt = 300, rc_copy = 0;
>>> char zSql[2048];
>>> char *tmp;
>>> const int len_zSql = 2047;
>>> int len_left;
>>> int nRow = 0, nCol = 0;
>>> char **pazResult;
>>> const int trx_cnt = 10000;
>>> int trx_idx;
>>> for (trx_idx = 0; trx_idx < 10000; trx_idx++) {
>>> for (i = 0; i < test_cnt; i++) {
>>> j = random(6);
>>> len_left = len_zSql;
>>> tmp = zSql;
>>> if (j < 4) {
>>> int uid = id[j] + random(300);
>>> rc_copy = snprintf(tmp, len_left, "insert into tbl
>>> values(\"%s%c\",%d,\"%s%c\");",
>>> orgName[j], append[random(26)], uid, nickName[j], append[random(26)]);
>>> //printf("%s\n", zSql);
>>> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
>>> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
>>> if (rc != SQLITE_CONSTRAINT) {
>>> printf("error %s: %s\n", zSql, errMsg);
>>> continue;
>>> }
>>> else {
>>> rc_copy = snprintf(tmp, len_left, "update tbl set name=\"%s%c\"
>>> where uid=%d\n",
>>> orgName[j], append[random(26)], uid);
>>> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
>>> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
>>> printf("error %s: %s\n", zSql, errMsg);
>>> continue;
>>> }
>>> }
>>> }
>>> }
>>> else {
>>> rc_copy = snprintf(zSql, len_zSql, "select * from tbl where
>>> uid=%d;", id[j]);
>>> rc = sqlite3_get_table(db, zSql, &pazResult, &nRow, &nCol, &errMsg);
>>> if (nRow == 0) {
>>> rc_copy = snprintf(tmp, len_left, "insert into tbl(name, uid, nick)
>>> values(\"%s\",%d,\"%s\");",
>>> orgName[j], id[j], nickName[j]);
>>> }
>>> else {
>>> rc_copy = snprintf(zSql, len_zSql, "update tbl set name=\"%s\" where
>>> uid=%d;",
>>> orgName[j], id[j]);
>>> }
>>> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
>>> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
>>> printf("error %s: %s\n", zSql, errMsg);
>>> continue;
>>> }
>>> }
>>> }
>>> printf("insert 300 records\n");
>>> sqlite3_sleep(200);
>>> }
>>> }
>>>
>>>
>>> unsigned __stdcall writeFunc(void *pIn) {
>>> fts5_prefix_match_test();
>>> printf("write over\n");
>>> return 0;
>>> }
>>> unsigned __stdcall readFunc(void *pIn) {
>>> sqlite3 *db;
>>> int i, j, rc;
>>> char *errMsg;
>>> char **pazResult;
>>> int nRow, nCol;
>>> sqlite3_sleep(2000);
>>> rc = sqlite3_open("a.db<http://a.db>", &db);
>>> if (rc != SQLITE_OK) {
>>> printf("open db error\n");
>>> return 0;
>>> }
>>> char *zSql = "select rowid, * from tbl_fts where tbl_fts match
>>> \'lucy*\';";
>>> while (1) {
>>> rc = sqlite3_get_table(db, zSql, &pazResult, &nRow, &nCol, &errMsg);
>>> if (nRow > 2) {
>>> printf("mis-match\n");
>>> for (i = 0; i <= nRow; i++) {
>>> for (j = 0; j < nCol; j++) {
>>> printf("%s\t", pazResult[i*nCol + j]);
>>> }
>>> printf("\n");
>>> }
>>> return 0;
>>> }
>>> else {
>>> printf("match cnt = %d\n", nRow);
>>> }
>>> sqlite3_free_table(pazResult);
>>> sqlite3_sleep(3000);
>>> }
>>> return 0;
>>> }
>>> #define THREAD_NUM 2
>>>
>>> int thread(ThreadFuncType writeFunc, ThreadFuncType readFunc) {
>>> HANDLE handle[THREAD_NUM];
>>> if (writeFunc) {
>>> handle[0] = (HANDLE)_beginthreadex(NULL, 0, writeFunc, NULL, 0, NULL);
>>> //WaitForMultipleObjects(1, handle, TRUE, INFINITE);
>>> }
>>> if (readFunc) {
>>> for (int i = 1; i < THREAD_NUM; i++) {
>>> handle[i] = (HANDLE)_beginthreadex(NULL, 0, readFunc, NULL, 0, NULL);
>>> Sleep(5);
>>> }
>>> if (writeFunc) {
>>> WaitForMultipleObjects(THREAD_NUM, &handle[0], TRUE, INFINITE);
>>> }
>>> else {
>>> WaitForMultipleObjects(THREAD_NUM - 1, &handle[1], TRUE, INFINITE);
>>> }
>>> printf("multithread are all over now.\n");
>>> }
>>> else {
>>> WaitForMultipleObjects(1, &handle[0], TRUE, INFINITE);
>>> printf("multithread are all over now.\n");
>>> }
>>> return 0;
>>> }
>>>
>>> int main() {
>>> thread(writeFunc, readFunc);
>>> return 0;
>>> }
>>>
>>>
>>>
>>>
>>>
>>>
>>> ?取 Outlook for Android<https://aka.ms/ghei36>
>>>
>>> ________________________________
>>> From: sqlite-users <[hidden email]> on
>>> behalf of Dan Kennedy <[hidden email]>
>>> Sent: Tuesday, March 20, 2018 7:49:37 PM
>>> To: [hidden email]
>>> Subject: Re: [sqlite] SQLITE3 FTS5 prefix query get mismatch result
>>>
>>> On 03/20/2018 02:31 PM, zheng xiaojin wrote:
>>>> (I am not sure whether you have seen my message, so resend again)
>>>> Hi,
>>>> When I use FTS5, I have met that, there are some cases which will
>>>> get mis-match results with prefix search.
>>>>
>>>> Like "select * from tbl_fts where tbl_fts match 'lucy*';",which I
>>>> want to get records like "lucya","lucyabc" etc, and
>>>>
>>>> "lux" or "lulu" is not what I want but returned.
>>>>
>>>> Such problems are not common, But I have tried to build such test
>>>> case which can lead to this problem very easy.
>>> Thanks for looking into and reporting this problem. Are you able to
>>> post
>>> the source for the program you used to execute the test described
>>> below?
>>>
>>> Dan.
>>>
>>>
>>>> Here is how I generate it:
>>>>
>>>> 1) create an fts5 table. and insert some record like "lucya","lucyb".
>>>>
>>>> 2) prepare some records: a) lusheng b)lulu; c)lunix; d)luma; e)
>>>> pengyu. a,b,c,d are have some same prefix(lu), e is some other
>>>> random case.
>>>>
>>>> 3) before insert into the fts table with 2) records, appending some
>>>> random letter to make each record different.
>>>>
>>>> Like: "lulu","luluabc","luluefg", also "lunix","lunixabc",etc
>>>>
>>>> 4) for-loop insert, and each loop trying to lantch the query
>>>> 'lucy*', \
>>>>
>>>> check the match result will finding the mis-match result, the
>>>> corrent results should be "lucya","lucyb", not "luluabc"...
>>>>
>>>>
>>>> When mis-match happen, I try to analyze the prefix search mechanism
>>>> and find that, there are 2 points which I think have problems:
>>>>
>>>> 1) fts5LeafSeek, when search failed, and exec goto search_failed,
>>>> in search_failed, the 2 if condition will not satisfy commonly. In
>>>> my mind, I think it should return,
>>>>
>>>> but not, and then the search_success logic exec.
>>>>
>>>> 2) fts5SetupPrefixIter, when gather results, the logic to set the
>>>> flag bNewTerm has some leak, which will set bNewTerm=false,
>>>>
>>>> but the record is not what we want indeed.
>>>>
>>>>
>>>> These 2 logic problems lead to mis-match results. I try to remove
>>>> the bNewTerm logic directly, and make it compare every loog,
>>>>
>>>> then, the mis-match results disappear.
>>>>
>>>> // relevant code
>>>>
>>>> Change below
>>>>
>>>> if (bNewTerm) {
>>>>
>>>> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>>>>
>>>> }
>>>>
>>>> to
>>>>
>>>> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>>>>
>>>>
>>>> Need your help to recheck the FTS5 prefix search logic, thank you
>>>> very much.
>>>>
>>>> Yours,
>>>>
>>>> xiaojin zheng
>>>>
>>>>
>>>> ?取 Outlook for Android<https://aka.ms/ghei36>
>>>>
>>>> _______________________________________________
>>>> sqlite-users mailing list
>>>> [hidden email]
>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>> _______________________________________________
>>> sqlite-users mailing list
>>> [hidden email]
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>> _______________________________________________
>>> sqlite-users mailing list
>>> [hidden email]
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: here is my test case//Re: SQLITE3 FTS5 prefix query get mismatch result

ze tian
OK,Thank you very much,I will have a try

?取 Outlook for Android<https://aka.ms/ghei36>

________________________________
From: sqlite-users <[hidden email]> on behalf of Dan Kennedy <[hidden email]>
Sent: Thursday, March 22, 2018 7:26:29 PM
To: [hidden email]
Subject: Re: [sqlite] here is my test case//Re: SQLITE3 FTS5 prefix query get mismatch result

On 03/22/2018 06:05 PM, Dan Kennedy wrote:

> On 03/22/2018 05:38 PM, zheng xiaojin wrote:
>> Sorry, but testcase maybe have some format problem, changed belowed,
>> I run this on SQLITE 3.15.2, both on Windows and linux. and only less
>> than 2min can happen. It doesn't matter about threadsafe, because the
>> write func is only used to protect the test data. It means you can
>> stop the. write func when problem happen, and then you can just shell
>> to the db and run the same query then problem happen definitely.
>
>
> Can you do that (create a db that exhibits the problem) and upload it
> somewhere or email it directly to me? Then I'll be able to demonstrate
> the problem using just the shell.

Actually don't worry. I finally got it to work. I think it's the bug
fixed exactly a year and a day ago here:

   http://www.sqlite.org/src/info/840042cb2bed2924

Fix was published in 3.19.0. Can you confirm that 3.19.0 do not produce
the problem for you?

Thanks,
Dan.





>
> Thanks,
> Dan.
>
>
>
>>
>>
>>
>>> "create trigger if not exists trigger_insert_tbl after insert on tbl \
>>> begin \
>>> replace into tbl_fts(rowid, name, uid)
>>> values(new.rowid<http://new.rowid>,new.name<http://new.name>,,new.uid<http://new.uid>);
>>> \
>>> end;",
>>> "create trigger if not exists trigger_update_tbl after update on tbl \
>>> begin \
>>> replace into tbl_fts(rowid, name, uid)
>>> values(new.rowi<http://new.rowid>d,new.name<http://new.name>,new.uid<http://new.uid>);\
>>> end;"
>>
>> ?取 Outlook for Android<https://aka.ms/ghei36>
>>
>> ________________________________
>> From: sqlite-users <[hidden email]> on
>> behalf of Dan Kennedy <[hidden email]>
>> Sent: Thursday, March 22, 2018 4:56:15 PM
>> To: [hidden email]
>> Subject: Re: [sqlite] here is my test case//Re: SQLITE3 FTS5 prefix
>> query get mismatch result
>>
>> On 03/21/2018 03:16 PM, zheng xiaojin wrote:
>>> (Please add.the head file and sqlite lib yourself, Thank you very much)
>>
>> Hi,
>>
>> Thanks for doing this. I'm running this version of the test program
>> modified for posix threads:
>>
>>     https://pastebin.com/d1HCX2aJ
>>
>> but I haven't seen any errors yet. How long do you usually have to run
>> it for before seeing the mismatch error?
>>
>> Which version of SQLite are you seeing the errors with?
>>
>> Are you building SQLite with SQLITE_THREADSAFE=0?
>>
>> Do you have access to a Linux or similar system on which you can run the
>> pthreads version above? Does it produce errors for you?
>>
>> Thanks,
>> Dan Kennedy.
>>
>>
>>
>>
>>
>>> char *orgName[] = {
>>> "yun j ji yunji yunj yji yj z zh zha zhao q qi qia qian d du dua
>>> duan zhaoqian zhaoq zhqian zhq zqian zq qianduan qiand qduan qd
>>> zhaoqianduan zhaoqiand zhaoqduan zhaoqd zhqianduan zhqiand zhqduan
>>> zhqd zqianduan zqiand zqduan zqd",
>>> "murren m mu l lu lun mulun mul mlun ml",
>>> "l lu m ma man luman lum lman lm",
>>> "d   di   din   ding   d   di   din   ding   f   fu   w   wu
>>> dingding   dingd   dding   dd   dingfu   dingf   dfu   df fuwu
>>> fuw   fwu   fw   dingdingfu   dingdingf   dingdfu dingdf   ddingfu
>>> ddingf   ddfu   ddf   dingfuwu   dingfuw dingfwu   dingfw   dfuwu
>>> dfuw   dfwu   dfw   dingdingfuwu dingdingfuw   dingdingfwu
>>> dingdingfw   dingdfuwu dingdfuw   dingdfwu   dingdfw   ddingfuwu
>>> ddingfuw ddingfwu   ddingfw   ddfuwu   ddfuw   ddfwu   ddfw   l   li
>>> x   xi   xia   lixia   lix   lxia   lx",
>>> "lucy y ya yan x xi yanxi yanx yxi yx",
>>> "p pe pen peng l le lei penglei pengl plei pl lucy"
>>> };
>>> char *nickName[] = {
>>> "ab",
>>> "bc",
>>> "cd",
>>> "de",
>>> "ef",
>>> "fg"
>>> };
>>> int id[] = { 21078,21218,21125,53234,40824,164873 };
>>>
>>> void prepareSchema_prefix(sqlite3 *db) {
>>> char *schemas[] = {
>>> "PRAGMA journal_mode=WAL;",
>>> "drop table if exists tbl;",
>>> "drop table if exists tbl_fts;",
>>> "create table if not exists tbl(name text, uid int primary key, nick
>>> text);",
>>> "create virtual table if not exists tbl_fts USING fts5(name, uid);",
>>> "create trigger if not exists trigger_insert_tbl after insert on tbl \
>>> begin \
>>> replace into tbl_fts(rowid, name, uid)
>>> values(new.rowid<http://new.rowid>,new.name<http://new.name>,new.uid<http://new.uid>);
>>> \
>>> end;",
>>> "create trigger if not exists trigger_update_tbl after update on tbl \
>>> begin \
>>> replace into tbl_fts(rowid, name, uid)
>>> values(new.rowid<http://new.rowid>,new.name<http://new.name>,new.uid<http://new.uid>);\
>>> end;"
>>> };
>>> int i, j, rc, cnt;
>>> cnt = sizeof(schemas) / sizeof(char*);
>>> char *errMsg;
>>> for (i = 0; i < cnt; i++) {
>>> rc = sqlite3_exec(db, schemas[i], NULL, NULL, &errMsg);
>>> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
>>> printf("prepareSchema error: %s\n", errMsg);
>>> return;
>>> }
>>> }
>>> printf("prepareSchema over\n");
>>> }
>>>
>>> void fts5_prefix_match_test() {
>>> sqlite3 *db;
>>> int i, j, k, rc;
>>> char *errMsg;
>>> char *append = "abcdefghijklmnopqrstuvwxyz";
>>> sqlite3_open("a.db<http://a.db>", &db);
>>> prepareSchema_prefix(db);
>>> int test_cnt = 300, rc_copy = 0;
>>> char zSql[2048];
>>> char *tmp;
>>> const int len_zSql = 2047;
>>> int len_left;
>>> int nRow = 0, nCol = 0;
>>> char **pazResult;
>>> const int trx_cnt = 10000;
>>> int trx_idx;
>>> for (trx_idx = 0; trx_idx < 10000; trx_idx++) {
>>> for (i = 0; i < test_cnt; i++) {
>>> j = random(6);
>>> len_left = len_zSql;
>>> tmp = zSql;
>>> if (j < 4) {
>>> int uid = id[j] + random(300);
>>> rc_copy = snprintf(tmp, len_left, "insert into tbl
>>> values(\"%s%c\",%d,\"%s%c\");",
>>> orgName[j], append[random(26)], uid, nickName[j], append[random(26)]);
>>> //printf("%s\n", zSql);
>>> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
>>> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
>>> if (rc != SQLITE_CONSTRAINT) {
>>> printf("error %s: %s\n", zSql, errMsg);
>>> continue;
>>> }
>>> else {
>>> rc_copy = snprintf(tmp, len_left, "update tbl set name=\"%s%c\"
>>> where uid=%d\n",
>>> orgName[j], append[random(26)], uid);
>>> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
>>> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
>>> printf("error %s: %s\n", zSql, errMsg);
>>> continue;
>>> }
>>> }
>>> }
>>> }
>>> else {
>>> rc_copy = snprintf(zSql, len_zSql, "select * from tbl where
>>> uid=%d;", id[j]);
>>> rc = sqlite3_get_table(db, zSql, &pazResult, &nRow, &nCol, &errMsg);
>>> if (nRow == 0) {
>>> rc_copy = snprintf(tmp, len_left, "insert into tbl(name, uid, nick)
>>> values(\"%s\",%d,\"%s\");",
>>> orgName[j], id[j], nickName[j]);
>>> }
>>> else {
>>> rc_copy = snprintf(zSql, len_zSql, "update tbl set name=\"%s\" where
>>> uid=%d;",
>>> orgName[j], id[j]);
>>> }
>>> rc = sqlite3_exec(db, zSql, NULL, NULL, &errMsg);
>>> if (rc != SQLITE_OK&&rc != SQLITE_DONE) {
>>> printf("error %s: %s\n", zSql, errMsg);
>>> continue;
>>> }
>>> }
>>> }
>>> printf("insert 300 records\n");
>>> sqlite3_sleep(200);
>>> }
>>> }
>>>
>>>
>>> unsigned __stdcall writeFunc(void *pIn) {
>>> fts5_prefix_match_test();
>>> printf("write over\n");
>>> return 0;
>>> }
>>> unsigned __stdcall readFunc(void *pIn) {
>>> sqlite3 *db;
>>> int i, j, rc;
>>> char *errMsg;
>>> char **pazResult;
>>> int nRow, nCol;
>>> sqlite3_sleep(2000);
>>> rc = sqlite3_open("a.db<http://a.db>", &db);
>>> if (rc != SQLITE_OK) {
>>> printf("open db error\n");
>>> return 0;
>>> }
>>> char *zSql = "select rowid, * from tbl_fts where tbl_fts match
>>> \'lucy*\';";
>>> while (1) {
>>> rc = sqlite3_get_table(db, zSql, &pazResult, &nRow, &nCol, &errMsg);
>>> if (nRow > 2) {
>>> printf("mis-match\n");
>>> for (i = 0; i <= nRow; i++) {
>>> for (j = 0; j < nCol; j++) {
>>> printf("%s\t", pazResult[i*nCol + j]);
>>> }
>>> printf("\n");
>>> }
>>> return 0;
>>> }
>>> else {
>>> printf("match cnt = %d\n", nRow);
>>> }
>>> sqlite3_free_table(pazResult);
>>> sqlite3_sleep(3000);
>>> }
>>> return 0;
>>> }
>>> #define THREAD_NUM 2
>>>
>>> int thread(ThreadFuncType writeFunc, ThreadFuncType readFunc) {
>>> HANDLE handle[THREAD_NUM];
>>> if (writeFunc) {
>>> handle[0] = (HANDLE)_beginthreadex(NULL, 0, writeFunc, NULL, 0, NULL);
>>> //WaitForMultipleObjects(1, handle, TRUE, INFINITE);
>>> }
>>> if (readFunc) {
>>> for (int i = 1; i < THREAD_NUM; i++) {
>>> handle[i] = (HANDLE)_beginthreadex(NULL, 0, readFunc, NULL, 0, NULL);
>>> Sleep(5);
>>> }
>>> if (writeFunc) {
>>> WaitForMultipleObjects(THREAD_NUM, &handle[0], TRUE, INFINITE);
>>> }
>>> else {
>>> WaitForMultipleObjects(THREAD_NUM - 1, &handle[1], TRUE, INFINITE);
>>> }
>>> printf("multithread are all over now.\n");
>>> }
>>> else {
>>> WaitForMultipleObjects(1, &handle[0], TRUE, INFINITE);
>>> printf("multithread are all over now.\n");
>>> }
>>> return 0;
>>> }
>>>
>>> int main() {
>>> thread(writeFunc, readFunc);
>>> return 0;
>>> }
>>>
>>>
>>>
>>>
>>>
>>>
>>> ?取 Outlook for Android<https://aka.ms/ghei36>
>>>
>>> ________________________________
>>> From: sqlite-users <[hidden email]> on
>>> behalf of Dan Kennedy <[hidden email]>
>>> Sent: Tuesday, March 20, 2018 7:49:37 PM
>>> To: [hidden email]
>>> Subject: Re: [sqlite] SQLITE3 FTS5 prefix query get mismatch result
>>>
>>> On 03/20/2018 02:31 PM, zheng xiaojin wrote:
>>>> (I am not sure whether you have seen my message, so resend again)
>>>> Hi,
>>>> When I use FTS5, I have met that, there are some cases which will
>>>> get mis-match results with prefix search.
>>>>
>>>> Like "select * from tbl_fts where tbl_fts match 'lucy*';",which I
>>>> want to get records like "lucya","lucyabc" etc, and
>>>>
>>>> "lux" or "lulu" is not what I want but returned.
>>>>
>>>> Such problems are not common, But I have tried to build such test
>>>> case which can lead to this problem very easy.
>>> Thanks for looking into and reporting this problem. Are you able to
>>> post
>>> the source for the program you used to execute the test described
>>> below?
>>>
>>> Dan.
>>>
>>>
>>>> Here is how I generate it:
>>>>
>>>> 1) create an fts5 table. and insert some record like "lucya","lucyb".
>>>>
>>>> 2) prepare some records: a) lusheng b)lulu; c)lunix; d)luma; e)
>>>> pengyu. a,b,c,d are have some same prefix(lu), e is some other
>>>> random case.
>>>>
>>>> 3) before insert into the fts table with 2) records, appending some
>>>> random letter to make each record different.
>>>>
>>>> Like: "lulu","luluabc","luluefg", also "lunix","lunixabc",etc
>>>>
>>>> 4) for-loop insert, and each loop trying to lantch the query
>>>> 'lucy*', \
>>>>
>>>> check the match result will finding the mis-match result, the
>>>> corrent results should be "lucya","lucyb", not "luluabc"...
>>>>
>>>>
>>>> When mis-match happen, I try to analyze the prefix search mechanism
>>>> and find that, there are 2 points which I think have problems:
>>>>
>>>> 1) fts5LeafSeek, when search failed, and exec goto search_failed,
>>>> in search_failed, the 2 if condition will not satisfy commonly. In
>>>> my mind, I think it should return,
>>>>
>>>> but not, and then the search_success logic exec.
>>>>
>>>> 2) fts5SetupPrefixIter, when gather results, the logic to set the
>>>> flag bNewTerm has some leak, which will set bNewTerm=false,
>>>>
>>>> but the record is not what we want indeed.
>>>>
>>>>
>>>> These 2 logic problems lead to mis-match results. I try to remove
>>>> the bNewTerm logic directly, and make it compare every loog,
>>>>
>>>> then, the mis-match results disappear.
>>>>
>>>> // relevant code
>>>>
>>>> Change below
>>>>
>>>> if (bNewTerm) {
>>>>
>>>> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>>>>
>>>> }
>>>>
>>>> to
>>>>
>>>> if (nTerm < nToken || memcmp(pToken, pTerm, nToken)) break;
>>>>
>>>>
>>>> Need your help to recheck the FTS5 prefix search logic, thank you
>>>> very much.
>>>>
>>>> Yours,
>>>>
>>>> xiaojin zheng
>>>>
>>>>
>>>> ?取 Outlook for Android<https://aka.ms/ghei36>
>>>>
>>>> _______________________________________________
>>>> sqlite-users mailing list
>>>> [hidden email]
>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>> _______________________________________________
>>> sqlite-users mailing list
>>> [hidden email]
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>> _______________________________________________
>>> sqlite-users mailing list
>>> [hidden email]
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users