Column alignment in bash with utf8


Eddy-11
Hello,

I'm sorry if this subject has been dealt with before but I was unable to
find anything that I could understand.

I'm a very basic user of sqlite3 (version 3.2.1 from an Ubuntu package)
and I manipulate my database through bash commands in a terminal (KDE's
Konsole actually) using the utf-8 charset.

If I create the following database
    BEGIN TRANSACTION;
    CREATE TABLE test(Field1 text, Field2 text);
    INSERT INTO "test" VALUES('eee', 'fff');
    INSERT INTO "test" VALUES('eée', 'fff');
    COMMIT;
and then type
    .mode column
    select * from test;
I get the following annoying result
    eee         fff
    eée        fff
where the first "f" on the second line is shifted one character to the
left thus breaking the intended alignment in column mode. Of course if
the "é" was displayed as two characters, the alignment would be correct :(

First I thought konsole was responsible for this but I get the same
result if I redirect output to a file using
    .output test.txt

Am I doing something wrong?
Is it a known bug of sqlite3?
Is there a way to avoid it?

Thanks for any advice.
In the meantime I'll have to stick to using iso-8859-1 with sqlite,
quite a pity knowing that sqlite uses unicode internally.

--
Eddy
Message written and sent under GNU/Linux,
composed of 100% free electrons.


Re: Column alignment in bash with utf8

D. Richard Hipp
Eddy <[hidden email]> wrote:
>
> Am I doing something wrong?
> Is it a known bug of sqlite3?

Please distinguish between SQLite the C library and sqlite the
command-line shell that you can use to access databases.  We work
very hard to make sure that SQLite the C library is free of bugs.
But sqlite the command-line shell is considerably less well tested.

I suppose you could argue that this is a bug in the command-line
shell.  I won't contradict you.  But being a monolingual American,
I have no clue how to reproduce the problem, much less fix it.

> Is there a way to avoid it ?
>

Write your own command-line shell.  Perhaps fix the CSV support
while you are at it.  Submit your patches.  Or put the sources
to your rewrite on the wiki someplace.

This is not an overly complex task.  The current command-line
shell is a single file containing 1492 lines of C code.  How
hard can it be to replace that?

--
D. Richard Hipp <[hidden email]>


Re: Column alignment in bash with utf8

Matt Wilson-4
On Mon, Jan 02, 2006 at 04:47:31PM -0500, [hidden email] wrote:
>
> I suppose you could argue that this is a bug in the command-line
> shell.  I won't contradict you.  But being a monolingual American,
> I have no clue how to reproduce the problem, much less fix it.

You just need to compensate for the discrepancy between the number of
chars in a multibyte string and the actual amount of space (in
columns) that is used when printing that multibyte string.  The
attached patch is one way to do this.  It uses wcswidth(), which is a
UNIX98 function, so this isn't extremely portable.  Therefore, I
wouldn't recommend applying it without adding some autoconf-style
detection and stubbing out wstrlen() for platforms that don't have the
needed functions.
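
For readers without the attachment, here is a minimal sketch of the
idea.  It is not the patch itself.  The helper names display_width(),
utf8_codepoints() and print_padded() are made up for illustration, and
the fallback for single-byte locales (assume the data is UTF-8 and
count code points) is my own assumption, not something the shell does.

```c
#define _XOPEN_SOURCE 700   /* exposes wcswidth() on glibc */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Count UTF-8 code points: every byte that is not a continuation
 * byte (10xxxxxx) starts a new character. */
static int utf8_codepoints(const char *s)
{
    int n = 0;
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Hypothetical helper: number of terminal columns s occupies.
 * Uses wcswidth() when the current locale is multibyte; otherwise
 * (or on any conversion failure) assumes UTF-8 input and falls back
 * to counting code points. */
static int display_width(const char *s)
{
    if (MB_CUR_MAX > 1) {
        size_t n = mbstowcs(NULL, s, 0);      /* length in wide chars */
        if (n != (size_t)-1) {
            wchar_t *w = malloc((n + 1) * sizeof *w);
            if (w != NULL) {
                mbstowcs(w, s, n + 1);
                int cols = wcswidth(w, n);    /* -1 if unprintable */
                free(w);
                if (cols >= 0)
                    return cols;
            }
        }
    }
    return utf8_codepoints(s);
}

/* Hypothetical helper: print s left-aligned in a field of `width`
 * columns, padding by display columns instead of bytes. */
static void print_padded(FILE *out, const char *s, int width)
{
    assert(width >= 0);
    int pad = width - display_width(s);
    fputs(s, out);
    while (pad-- > 0)
        fputc(' ', out);
}
```

With this, print_padded(stdout, "eée", 10) and
print_padded(stdout, "eee", 10) both end on the same column, whereas
padding with a byte count gives the "é" row one space too few.  For the
wcswidth() path to be taken, the program has to call
setlocale(LC_ALL, "") first, from <locale.h>, and run under a UTF-8
locale.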

Cheers,

Matt
--
Matt Wilson
rPath, Inc.
[hidden email]