Hi. I've got kind of a weird situation and I'd value your thoughts.

Our OpenWRT-based embedded Linux system, on its very first boot, starts up with its root filesystem on a RAM (ramFS) mount while the JFFS partition receives its initial format. Once the JFFS is formatted and most of the init scripts have run, the system copies all the files from the temporary RAM filesystem to the JFFS and uses the pivot_root() system call to make the JFFS the new root. (Subsequent boots use the now-formatted JFFS mount as root directly at startup.)

We have a SQLite DB that various init scripts open, use, and then close. Then the copy and pivot happen, and our main application opens the DB on the JFFS volume. (That is, we don't keep any connections to the DB open during the switch from the temp ramFS root to the final JFFS root.)

Our main app is multi-threaded with one connection per thread. It runs the DB in WAL mode. Because SQLite apparently can't mmap the .shm file on JFFS, we place that file on /tmp (another ramFS) using the semi-obscure SQLITE_SHM_DIRECTORY compile-time define.
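
To make the access pattern concrete, here is a minimal sketch of it. It uses Python's sqlite3 module rather than our actual code, and the table name, key, and temp-file path are made up for illustration. Each thread opens its own connection in WAL mode; on a normal filesystem, a write committed on one connection should be visible to a read that starts afterward on another.

```python
import os
import sqlite3
import tempfile
import threading


def open_conn(path):
    # One connection per thread, switched to WAL journaling.
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    return conn


def writer(path, committed, read_done):
    conn = open_conn(path)
    # Hypothetical table and row, for illustration only.
    conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
    conn.execute("INSERT OR REPLACE INTO kv VALUES ('greeting', 'hello')")
    conn.commit()
    committed.set()      # tell the reader the write is committed
    read_done.wait()     # keep this connection open while the reader runs
    conn.close()


def reader(path, committed, read_done, out):
    committed.wait()     # only read after the writer has committed
    conn = open_conn(path)
    out.append(conn.execute("SELECT v FROM kv WHERE k = 'greeting'").fetchone())
    conn.close()
    read_done.set()


def run_demo():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    committed, read_done = threading.Event(), threading.Event()
    out = []
    threads = [
        threading.Thread(target=writer, args=(path, committed, read_done)),
        threading.Thread(target=reader, args=(path, committed, read_done, out)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out[0]


if __name__ == "__main__":
    print(run_demo())
```

On an ordinary boot this is the behavior we get: the reader sees the row. The sketch does not reproduce the SQLITE_SHM_DIRECTORY build or the ramFS-to-JFFS switch; it only shows what "written on one connection, read on another" means here.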

What I am seeing, and ONLY in this first-boot situation (where a ramFS -> copy -> pivot -> JFFS sequence occurred), is that some info written on one connection in one thread is not present when read back on another connection in another thread. Since we start the app AFTER the transition to JFFS, I believe all connection handles the app holds should be to the same/right DB file on JFFS. If that is so, the problem should NOT be that one thread is writing into "the wrong" DB on ramFS while another is reading from "the right" DB on JFFS.
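
One way to test the "same file" assumption would be to compare the device and inode numbers behind the paths involved. The following is only a sketch in Python; the helper names are hypothetical and it is not something our app currently does. Two paths refer to the same underlying file exactly when both numbers match, and a copy onto another volume will show a different device number.

```python
import os


def file_identity(path):
    # (device, inode) uniquely identifies a file on a running system.
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def same_file(path_a, path_b):
    # True only if both paths resolve to the same underlying file.
    return file_identity(path_a) == file_identity(path_b)
```

Logging file_identity() for the DB path from each thread, right after it opens its connection, would show whether every handle really points at the JFFS copy.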

However, the .shm file has lived happily on /tmp this whole time, whether the DB was on ramFS or JFFS. My impression was this file only held file offsets for housekeeping -- but is there anything else in there that could break if the DB file it was tracking was secretly and transparently moved from one volume to another?

Or do you have any other conjecture?


-- Ward

