Why Unicode is difficult

Why Unicode is difficult

Simon Slavin-3
Every so often someone asks on this list for Unicode to be handled properly.  I did it myself.  Then other people have to explain how hard this is.  So here’s an article which, after introductory material, discusses the hard questions in Unicode:

<https://norasandler.com/2017/11/02/Around-the-with-Unicode.html>

Are two strings the same?
How long is a string?
How do you sort things in alphabetical order?

The first and third questions are requirements for implementing COLLATE in SQLite.  And the fact that the second question is a difficult one emphasises that one shouldn’t take Unicode as simple.
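The three questions above can be made concrete with a small sketch. It uses only Python's standard `unicodedata` module; the helper names (`same_text`, `lengths`, `sort_ignoring_accents`) are made up for illustration, and the accent-stripping sort key is a deliberately naive stand-in for real collation, not what SQLite or any collation library actually does.

```python
import unicodedata

# Two spellings of the same visible word: a precomposed "é" (U+00E9)
# versus a plain "e" followed by a combining acute accent (U+0301).
composed = "caf\u00e9"
decomposed = "cafe\u0301"

def same_text(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization (canonical equivalence only)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def lengths(s: str) -> dict:
    """Return several different, equally defensible answers to 'how long is it?'."""
    return {
        "code_points": len(s),
        "utf8_bytes": len(s.encode("utf-8")),
        "utf16_units": len(s.encode("utf-16-le")) // 2,
    }

def sort_ignoring_accents(items):
    """Naive illustration: sort on the text with combining marks removed."""
    def key(word):
        parts = unicodedata.normalize("NFD", word)
        base = "".join(ch for ch in parts if not unicodedata.combining(ch))
        return base.casefold()
    return sorted(items, key=key)

words = ["zebra", "\u00e9clair", "apple"]

print(composed == decomposed)          # False: the code points differ
print(same_text(composed, decomposed)) # True: equal after normalization
print(lengths(composed))               # 4 code points, 5 UTF-8 bytes
print(lengths(decomposed))             # 5 code points, 6 UTF-8 bytes
print(sorted(words))                   # code-point order puts "éclair" after "zebra"
print(sort_ignoring_accents(words))    # "éclair" now sorts between "apple" and "zebra"
```

Even this leaves out user-perceived characters (grapheme clusters) and language-specific ordering rules, which is where the real difficulty starts.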

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: Why Unicode is difficult

Jay Kreibich


Next, we can talk about how dates and times are simple and straight-forward.

 -j




Re: Why Unicode is difficult

Igor Korot
Hi,

On Mon, Dec 4, 2017 at 7:42 AM, Jay Kreibich <[hidden email]> wrote:
>
>
> Next, we can talk about how dates and times are simple and straight-forward.

And then the number representation...

Thank you.


Re: Why Unicode is difficult

Stephen Chrzanowski
... as in how 1 != "1"?

On Mon, Dec 4, 2017 at 11:07 AM, Igor Korot <[hidden email]> wrote:

> Hi,
>
> On Mon, Dec 4, 2017 at 7:42 AM, Jay Kreibich <[hidden email]> wrote:
> >
> >
> > Next, we can talk about how dates and times are simple and straight-forward.
>
> And then the number representation...
>
> Thank you.

Re: Why Unicode is difficult

Igor Korot
Stephen,

On Mon, Dec 4, 2017 at 1:01 PM, Stephen Chrzanowski <[hidden email]> wrote:
> ... as in how 1 != "1"?

No.
1000 vs 1,000 vs 1.000 vs 1,000.00 vs whatever.
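
A minimal sketch of why those spellings are a problem, assuming nothing beyond the Python standard library. The `format_number` and `parse_number` helpers are hypothetical and written only for this illustration; real code would normally rely on a locale or internationalization library instead.

```python
def format_number(value: float, thousands: str, decimal: str, places: int = 2) -> str:
    """Format with Python's "," grouping and "." decimal point, then swap the symbols."""
    text = f"{value:,.{places}f}"
    return text.translate(str.maketrans({",": thousands, ".": decimal}))

def parse_number(text: str, thousands: str, decimal: str) -> float:
    """Parse a number, given which symbols the writer used (an assumption the text itself cannot confirm)."""
    return float(text.replace(thousands, "").replace(decimal, "."))

print(format_number(1000, "", ".", 0))   # 1000
print(format_number(1000, ",", ".", 0))  # 1,000
print(format_number(1000, ".", ",", 0))  # 1.000
print(format_number(1000, ",", "."))     # 1,000.00

# The same characters mean different values under different conventions.
print(parse_number("1.000", ",", "."))   # 1.0
print(parse_number("1.000", ".", ","))   # 1000.0
```

The string "1.000" alone does not say which convention produced it, so the convention has to be known from somewhere else.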


Re: Why Unicode is difficult

Jay Kreibich

> On Dec 4, 2017, at 1:33 PM, Igor Korot <[hidden email]> wrote:
>
> Stephen,
>
> On Mon, Dec 4, 2017 at 1:01 PM, Stephen Chrzanowski <[hidden email]> wrote:
>> ... as in how 1 != "1"?
>
> No.
> 1000 vs 1,000 vs 1.000 vs 1,000.00 vs whatever.

I thought you meant how to represent 0.1

And the fact there are so many interpretations of “number representation” ought to give a clue about how complex something “so simple” can be.
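
For anyone who has not run into the 0.1 problem, here is a small sketch using Python's standard `decimal` module. It assumes the usual IEEE 754 binary doubles that Python floats use on common platforms; the variable names are only for illustration.

```python
from decimal import Decimal

# The literal 0.1 is stored as the nearest binary double, which is not exactly one tenth.
stored = Decimal(0.1)       # the exact value of the stored double
intended = Decimal("0.1")   # the exact decimal value one tenth

print(stored == intended)   # False
print(stored > intended)    # True: the stored double is slightly above 0.1

# The rounding error shows up in ordinary arithmetic.
print(0.1 + 0.2 == 0.3)     # False

# Decimal arithmetic on the decimal strings behaves as written.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```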

 -j




Re: Why Unicode is difficult

Keith Medcalf
In reply to this post by Stephen Chrzanowski

That depends on whether the value of the table column called "1" is 1 or not ...

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.



Re: Why Unicode is difficult

John Gillespie-2
In reply to this post by Simon Slavin-3
Fascinating article.
Thanks.
John Gillespie

