printf() with UTF-8 and \n \t format

Tony Papadimitriou
A couple of questions about printf

1. Does it work with UTF-8? If so, how?

2. Does it understand \n and \t?  I put actual line breaks inside the string which is OK if run from script file but it won’t work with one-liners on the command-line.

Thank you
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: printf() with UTF-8 and \n \t format

R Smith-2

On 2017/12/19 8:37 PM, Tony Papadimitriou wrote:
> A couple of questions about printf
>
> 1. Does it work with UTF-8? If so, how?

- Yes.
- Very nicely.

> 2. Does it understand \n and \t?  I put actual line breaks inside the string which is OK if run from script file but it won’t work with one-liners on the command-line.

The \n, \t, \r etc. are really dependent on some factors (OS etc.). They
get interpreted outside of SQLite, and differently via different
IO/console mechanisms. Which command line are you using? Do you mean
the sqlite3.exe CLI (or one of its brethren on other OSes)?

If the first question about UTF8 is also based on the Command-Line
interface's IO handling of UTF8, then that might get tricky. In Windows
(for instance) it depends on the selected Code-Page. Perhaps just try it
and if you don't get it right, please ask on here stating your exact OS
and tried method, then I promise you someone will show you the easy way
to achieve it (the question comes along often and always gets answered
well, but it always depends on specifics).

At this point we know very little about your use-case, so it's a little
hard to answer.


Good luck!
Ryan

Re: printf() with UTF-8 and \n \t format

Keith Medcalf
In reply to this post by Tony Papadimitriou

Which printf?  There are a lot of them.

Assuming that you mean the SQLite3 built-in function printf() (as in SELECT PRINTF(...);), that function does not interpret backslash escape sequences.  Interpretation of such things is a user I/O feature, not a data storage/retrieval feature.

As for the first question, every string in SQLite3 is UTF-8 unless you explicitly tell it that you want one of the UTF-16 formats, or you use a blob (bag-o-bytes), which can be any old bag-o-bytes you happen to like and is just that: a bag-o-bytes containing, well, bytes.
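
To make both points concrete, here is a minimal sketch. It assumes Python's standard sqlite3 module as the host (not something used in this thread), and the variable names are purely illustrative. It passes a backslash-n to printf() untouched, the way a command-line one-liner would, and then formats some non-ASCII text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The raw string keeps the backslash, so SQLite receives the two
# characters '\' and 'n', just as it would from a typed one-liner.
literal = conn.execute(r"SELECT printf('%s', 'a\nb')").fetchone()[0]

# printf() hands the backslash and the 'n' back untouched: 4 characters.
print(repr(literal), len(literal))

# Non-ASCII text goes through printf() unchanged.
greek = conn.execute("SELECT printf('%s: %s', 'name', 'Τόνυ')").fetchone()[0]
print(greek)
```

Any translation of \n into a linefeed therefore has to happen before the text reaches SQLite.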

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


Re: printf() with UTF-8 and \n \t format

Tony Papadimitriou
In reply to this post by R Smith-2
-----Original Message-----
From: R Smith

On 2017/12/19 8:37 PM, Tony Papadimitriou wrote:
>> A couple of questions about printf
>>
>> 1. Does it work with UTF-8? If so, how?
>
>- Yes.
>- Very nicely.

I'm using SQLite v3.21 and UTF-8 does not work correctly.  (Not from the
command line.)

I tried with latest trunk and it works fine.  Hmmm.
Let's try with the precompiled Windows binary for v3.21.0
It works.  Hmmm!
(...many, many trials later...)

Let's try with different default options.

Ta da!!!

When using the -column option (my own binary has this as default) the
problem shows up.  With the official default of -list option the problem is
not there.
And, it happens with the latest trunk, also.

So, that looks like a bug.

>> 2. Does it understand \n and \t?  I put actual line breaks inside the
>> string which is OK if run from script file but it won’t work with
>> one-liners on the command-line.
>
>The \n, \t, \r etc. are really dependent on some factors (OS etc.).

Yes, I know all that.  The question was if it understands them, not how they
might behave depending on OS.

When I use \t or \n I get actual \t and \n strings displayed instead of
tabbing and advancing line, respectively.

So, is there any way to advance to next line from a command line printf()?

>Ryan

Thanks

Re: printf() with UTF-8 and \n \t format

Keith Medcalf

>So, is there any way to advance to next line from a command line
>printf()?

Print a linefeed.  That is how you tell a computer output device to advance to the beginning of the next line.

sqlite> select printf('%s%s%s', 'line 1', char(10), 'line 2');
line 1
line 2
sqlite>
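
The same idea extends to tabs. The following is only a sketch: it assumes Python's standard sqlite3 module as a convenient way to run the statement and inspect the result, and the sample strings are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# char(9) is a tab and char(10) is a linefeed; printf() simply
# splices them in through ordinary %s substitutions.
text = conn.execute(
    "SELECT printf('%s%s%s%s%s', 'col 1', char(9), 'col 2', char(10), 'line 2')"
).fetchone()[0]

# repr() makes the control characters visible for inspection.
print(repr(text))
```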

Re: printf() with UTF-8 and \n \t format

R Smith-2
In reply to this post by Tony Papadimitriou


On 2017/12/19 10:24 PM, Tony Papadimitriou wrote:

>
>>> 2. Does it understand \n and \t?  I put actual line breaks inside
>>> the string which is OK if run from script file but it won’t work
>>> with one-liners on the command-line.
>>
>> The \n, \t, \r etc. are really dependent on some factors (OS etc.).
>
> Yes, I know all that.  The question was if it understands them, not
> how they might behave depending on OS.

But that is the entire point.
SQLite doesn't handle or understand \n, \t, \r, \x, \Father
Christmas....  Those things are OS-dependent escapes to get
non-printable characters into command line input. SQLite calls a
Line-Feed a Line-Feed. It is agnostic to how the Line-Feed gets passed
to it. The OS or its shell (sometimes) facilitates a conversion of the
character sequence "\" + "n" to a Line-Feed when passing it to a
command-line interface, such as sqlite3.exe in our case, but by the time
sqlite sees it, it is already a Linefeed character (i.e. ASCII/Unicode/UTF8
character no. 10). sqlite doesn't know anything about "\n"; it cannot,
does not, and need not interpret it. It is completely irrelevant to
sqlite.  "\n" is a convenience of the OS to allow the user to more easily
send character(10) to a console application; the console application
(thank goodness) need not know anything about "\n". I.e. if your SQLite
DOESN'T understand \n, that is a shortcoming/design decision/feature of
your OS; it has nothing to do with SQLite.

Is that more clear?

The same goes for UTF8.

SQLite isn't so much "ABLE" to read UTF8; rather, UTF8 is the only way
it communicates. It doesn't do non-UTF8. UTF8 is the only language it
speaks. So if your UTF8 doesn't get passed correctly to SQLite, it is
typically a problem of the OS code page or of the sqlite-using application
running in the console. (Btw. sqlite3.exe and its ilk are all command
line applications and aren't the sqlite engine itself. Your fixes wrt the
UTF8 probably fixed the CLI in some way, and it may well be a bug there,
so if you can submit a test case it would be helpful - please give all
the details).
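
One way to check what the engine itself holds, independent of any console or code page, is to look at the stored bytes. This is only a sketch, assuming Python's standard sqlite3 module and a fresh in-memory database with the default text encoding; the euro sign is just a sample character.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A freshly created database uses the default text encoding.
encoding = conn.execute("PRAGMA encoding").fetchone()[0]

# length() counts characters, the BLOB cast counts bytes, and hex()
# shows the bytes the engine actually holds for the string.
chars, nbytes, raw = conn.execute(
    "SELECT length('€'), length(CAST('€' AS BLOB)), hex('€')"
).fetchone()

print(encoding, chars, nbytes, raw)
```

If the hex dump shows the expected UTF-8 bytes but the console shows garbage, the problem lies on the display side and not in the stored data.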


> So, is there any way to advance to next line from a command line
> printf()?

Extremely easily:
SELECT Char(10);
- or -
SELECT printf( '%s', Char(10) );

And before you ask....
Char(07) = BELL
Char(08) = BACKSPACE
Char(09) = TAB = \t
Char(10) = LF = \n
Char(11) = VTAB (Vertical Tab)
Char(12) = FF (FormFeed) - in case you print to an old-style line
printer (LPT1)
Char(13) = CR = \r  (Carriage Return)
Char(27) = ESC
Char(32) = SPACE

That more or less covers the important non-printables / invisible
characters.
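
The list above can be cross-checked against the escapes of a language that does interpret backslashes. The sketch below assumes Python's standard sqlite3 module; the dictionary name is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Code point -> the Python escape that denotes the same character.
expected = {
    7: "\a", 8: "\b", 9: "\t", 10: "\n", 11: "\v",
    12: "\f", 13: "\r", 27: "\x1b", 32: " ",
}

# Ask SQLite for each character by number; no escapes are involved
# on the SQL side, only the integer code.
from_sqlite = {
    code: conn.execute("SELECT char(?)", (code,)).fetchone()[0]
    for code in expected
}

matches = from_sqlite == expected
print(matches)
```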

Hope that helps :)
Ryan

Re: printf() with UTF-8 and \n \t format

Tony Papadimitriou
In reply to this post by Keith Medcalf
Great! Didn't think of the char() function at all.  (Although I would prefer
a platform independent \n)

Thanks.

Re: printf() with UTF-8 and \n \t format

R Smith-2

On 2017/12/19 11:13 PM, Tony Papadimitriou wrote:
> Great! Didn't think of the char() function at all.  (Although I would
> prefer a platform independent \n)

"\n" is NOT platform independent.  Char(10) on the other hand IS
platform independent. That's perhaps the root of the misunderstanding.

I mean there are other factors too... Like on Linux we like to use only
Char(10) (which can sometimes be signaled by "\n" to your app if the
user pleases) having the great advantage of being nice and lean as a
line and record separator.

On Windows we use both a Char(13) and Char(10) to signal a line-break or
record separator (or "\r\n" if you will) which is fatter but allows you
to use a normal linefeed character within records without
breaking/ending the record.
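
The two conventions are easy to tell apart at the byte level. The following is a rough sketch, assuming Python's standard sqlite3 module; the sample text and variable names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# hex() shows the raw bytes of each line-ending convention.
lf, crlf = conn.execute(
    "SELECT hex(char(10)), hex(char(13) || char(10))"
).fetchone()

# Text with Windows-style line endings, bound as a parameter.
windows_text = "row 1\r\nrow 2"

# Collapsing CR+LF pairs to a bare LF is a plain replace().
normalized = conn.execute(
    "SELECT replace(?, char(13) || char(10), char(10))", (windows_text,)
).fetchone()[0]

print(lf, crlf, repr(normalized))
```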

These things are all OS dependent and so are the escape sequences
"\anything" - they have nothing to do with data storage. In sqlite (or any
other database) we use the actual control characters, not
some escaped representation of them.  If you need to produce database
output that will be fed via a command line or console to a next
application and want to pop out those escape character sequences
instead of the real McCoy, then you can easily do that too by making a UDF
or even doing this:

SELECT  replace(replace(replace(replace( "SomeDataColumn", char(10),
'\n'), char(13), '\r'), char(09), '\t')....

But I have to tell you, that makes my spine twitch....
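
For anyone who wants to try that, here is a complete three-level version of the nested replace(). It is only a sketch, assuming Python's standard sqlite3 module; a bound parameter stands in for "SomeDataColumn", and the sample value is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Raw string: the '\n', '\r' and '\t' below reach SQLite as a
# backslash followed by a letter, which is what we want to emit.
sql = r"""
SELECT replace(replace(replace(?, char(10), '\n'),
                       char(13), '\r'),
               char(9), '\t')
"""

# Real control characters go in ...
data = "a\tb\r\nc"

# ... and their escaped spellings come out.
escaped = conn.execute(sql, (data,)).fetchone()[0]
print(escaped)
```

Note that this sketch does not escape a backslash already present in the data, so the output is not guaranteed to be reversible.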

Re: [EXTERNAL] Re: printf() with UTF-8 and \n \t format

Hick Gunter
In reply to this post by Tony Papadimitriou
The most common "problem" with UTF-8 and string lengths is that multibyte UTF-8 characters (most often characters with diacritical marks, e.g. German umlauts, or special characters like the EUR sign) get truncated in between their constituent bytes. This leads to invalid byte sequences at the "end" of strings, which picky programs gag on.

The second most common "problem" I have come across is "double translation", i.e. an ISO input string gets converted to UTF-8 and the result is subjected to another conversion. This will lead to invalid byte sequences "in the middle of" a string, to the same effect as above.
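
Both failure modes can be reproduced in a few lines. This is a sketch in plain Python, with ISO-8859-1 (latin-1) assumed as the "ISO" encoding; the variable names are illustrative.

```python
# 1. Truncation between the bytes of one character.
euro = "€"
encoded = euro.encode("utf-8")   # three bytes for one character
truncated = encoded[:2]          # cut inside the character

try:
    truncated.decode("utf-8")
    truncation_ok = True
except UnicodeDecodeError:
    # A strict decoder rejects the incomplete sequence.
    truncation_ok = False

# 2. Double translation: UTF-8 bytes are mistaken for latin-1 text
#    and converted to UTF-8 a second time.
umlaut = "ü"
once = umlaut.encode("utf-8")
twice = once.decode("latin-1").encode("utf-8")
garbled = twice.decode("utf-8")

print(truncation_ok, once, twice, garbled)
```

In this particular path the doubly converted bytes still decode, but to the wrong characters; other conversion chains can leave sequences that do not decode at all.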
