Data types for date and time functions

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Data types for date and time functions

Dave Wellman
Hi,

 

We're designing a new feature which will involve a lot of date/time
calculations which we intend to do within the SQLite engine. As far as we
can tell it has the functions that we need. Basically we'll be loading data
into SQLite and performing analysis and calculations using SQL.

 

The question has come up as to how we should store date/time values in our
tables? Basically how should we define our date/time columns?

 

For us an obvious choice is to store dates/times as REAL. We're working in a
Windows environment and so in the application code we're dealing with MS
serial date values.

 

Looking at the 'date and tine functions' page
(http://www.sqlite.org/lang_datefunc.html ) it says "All five date and time
functions take a time string as an argument". So my initial reaction is to
store dates and times as TEXT. I think this means that when passing such
date/time values into the functions there is one less conversion to do.

 

But then looking at some of the examples on that page I came across the
following:

Compute the date and time given a unix timestamp 1092941466.

SELECT datetime(1092941466, 'unixepoch');

 

In the sql syntax that I'm used to (which I thought was the same for SQLite)
a series of digits like that shown above is treated as a numeric data value
(and type), not text (i.e. a  "time string"). If that was meant to be string
data (i.e. text) then I'd have expected:

 

SELECT datetime('1092941466', 'unixepoch');

 

So to clarify: Is our idea of storing date/time values as TEXT data a
sensible one?

 

We're essentially thinking of performance and there are almost certainly
pros and cons to doing this.

Pro: (I think) more efficient processing as the data is supplied to these
functions as text which is what they're expecting/require - and therefore
there is no additional conversion required.

Con: Our full timestamps will be 19 bytes compared to a REAL which is only 8
bytes. So they will require more storage/disk space which ultimately means
more I/O to read the same number of rows and columns.

 

I accept that from a performance perspective there may not be much in it,
but I'd be interested in people's thoughts.

 

In anticipation, many thanks.

 

Cheers,

Dave

 

 

 

Ward Analytics Ltd - information in motion

Tel: +44 (0) 118 9740191

Fax: +44 (0) 118 9740192

www:  <http://www.ward-analytics.com> http://www.ward-analytics.com

 

Registered office address: The Oriel, Sydenham Road, Guildford, Surrey,
United Kingdom, GU1 3SR

Registered company number: 3917021 Registered in England and Wales.

 

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Data types for date and time functions

Clemens Ladisch
David Wellman wrote:
> The question has come up as to how we should store date/time values in our
> tables? Basically how should we define our date/time columns?

SQLite does not have a separate date/time type.

If you want to use the built-in date/time function, you can store values
in one of three different formats:
http://www.sqlite.org/datatype3.html#datetime

> in the application code we're dealing with MS serial date values.

That is not one of SQLite's supported formats.


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Data types for date and time functions

Paul Sanderson
In reply to this post by Dave Wellman
Internally SQLite stores and process numbers as Julian day numbers, the
number of days since noon in Greenwich on November 24, 4714 B.C.

I have not examined the code in any depth but would assume that if you
store the data in the same format it would save on any processing overhead
for calculations but would just require a conversion for display.





Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence

On 25 September 2017 at 09:12, David Wellman <[hidden email]>
wrote:

> Hi,
>
>
>
> We're designing a new feature which will involve a lot of date/time
> calculations which we intend to do within the SQLite engine. As far as we
> can tell it has the functions that we need. Basically we'll be loading data
> into SQLite and performing analysis and calculations using SQL.
>
>
>
> The question has come up as to how we should store date/time values in our
> tables? Basically how should we define our date/time columns?
>
>
>
> For us an obvious choice is to store dates/times as REAL. We're working in
> a
> Windows environment and so in the application code we're dealing with MS
> serial date values.
>
>
>
> Looking at the 'date and tine functions' page
> (http://www.sqlite.org/lang_datefunc.html ) it says "All five date and
> time
> functions take a time string as an argument". So my initial reaction is to
> store dates and times as TEXT. I think this means that when passing such
> date/time values into the functions there is one less conversion to do.
>
>
>
> But then looking at some of the examples on that page I came across the
> following:
>
> Compute the date and time given a unix timestamp 1092941466.
>
> SELECT datetime(1092941466, 'unixepoch');
>
>
>
> In the sql syntax that I'm used to (which I thought was the same for
> SQLite)
> a series of digits like that shown above is treated as a numeric data value
> (and type), not text (i.e. a  "time string"). If that was meant to be
> string
> data (i.e. text) then I'd have expected:
>
>
>
> SELECT datetime('1092941466', 'unixepoch');
>
>
>
> So to clarify: Is our idea of storing date/time values as TEXT data a
> sensible one?
>
>
>
> We're essentially thinking of performance and there are almost certainly
> pros and cons to doing this.
>
> Pro: (I think) more efficient processing as the data is supplied to these
> functions as text which is what they're expecting/require - and therefore
> there is no additional conversion required.
>
> Con: Our full timestamps will be 19 bytes compared to a REAL which is only
> 8
> bytes. So they will require more storage/disk space which ultimately means
> more I/O to read the same number of rows and columns.
>
>
>
> I accept that from a performance perspective there may not be much in it,
> but I'd be interested in people's thoughts.
>
>
>
> In anticipation, many thanks.
>
>
>
> Cheers,
>
> Dave
>
>
>
>
>
>
>
> Ward Analytics Ltd - information in motion
>
> Tel: +44 (0) 118 9740191
>
> Fax: +44 (0) 118 9740192
>
> www:  <http://www.ward-analytics.com> http://www.ward-analytics.com
>
>
>
> Registered office address: The Oriel, Sydenham Road, Guildford, Surrey,
> United Kingdom, GU1 3SR
>
> Registered company number: 3917021 Registered in England and Wales.
>
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Data types for date and time functions

R Smith
In reply to this post by Dave Wellman
On 2017/09/25 10:12 AM, David Wellman wrote:
> Hi,
>  
>
> We're designing a new feature which will involve a lot of date/time
> calculations which we intend to do within the SQLite engine//....

> The question has come up as to how we should store date/time values in our
> tables? Basically how should we define our date/time columns?

A short summary of considerations for that decision:

There are three standard date-time storage options (there are others
too, but these are the defacto go-to for most systems, all of which
SQLite handles well):
A - Storing an INTEGER Unix timestamp (Seconds or Milliseconds since a
base UTC date) or Julian day value,
B - Storing FLOAT/REAL where the Integer part describes the days since a
base date, typically 1/1/1900 or 1/1/1970, depending on who you roll
with and
C - Storing a string with a date or date and time, typically the
standard form is ISO8601 which looks like 'YYYY-MM-DDTHH:NN:SS.MSS
+ZZ:ZZ' with the T optionally being a space and the + could be + or - to
indicate offset (Zulu time) etc. Some parts are optional, there is a
good argument to not store Zulu time (offsets) but revert to UTC and let
the displaying app format the time to whatever time-frame the user
expects (much like Unix timestamps), so most actual stored dates end up
as simply 'YYYY-MM-DD HH:NN:SS' or even just 'YYYY-MM-DD' if not
interested in the time.

All of these have pro's and cons. Integer storage is usually most
efficient, but it takes some calculation to interpret, however SQLite is
very efficient at it, but if you want to do it in your own code, you
will need some standard functions to do so.

REAL/FLOAT is used by many systems (such as MS Excel) and it is a really
great storage method for doing time-intensive calculations and being
easy to understand by humans (well, a bit more-so than Unix stamps
anyway). Adding a day is as simple as adding 1 to the value (with the
Unix timestamp you have to add 86400), adding 12 hours (half a day) is
as easy as adding 0.5, etc. Subtracting 3 o'clock from 9 o'clock leaves
you with 0.25, which is a quarter of a day, so 6 hours exactly, and so on.

Strings take up some more space (though not that much actually) but has
the one significant advantage when trouble-shooting or debugging, that
the date is completely human readable as it is stored in the table, a
mere glance at the data will already tell you if the date is a factor in
whatever problem you are solving, whereas the other formats need a bit
of interpretation first (though very easy with a basic format statement
in the query). It has the disadvantage that, although SQLite provides
great and easy-to-use date-time adding/calculating functions, they do
take some more CPU cycles parsing the date string. (To be fair, I've
measured this at some point, and it was negligible, and even though it
was long ago, I doubt SQLite got worse at parsing dates.)

I always use ISO8601 strings storing UTC (non-offset) dates and times,
UNLESS data storage space is a big problem or date-time calculation is a
constant feature of the intended use of the table, then an Integer Unix
timestamp. If the date is simply stored and referenced in a normal way,
ISO date strings it is.

Note that these are my preferences when balancing data storage
considerations with ease of use and working/debugging efficiency, they
are by no means the "industry standard", if such a thing exists.

PS: If you do store ISO date strings, the suggested column TYPE in
SQLite terms is "NUMERIC" (not "TEXT"). At least that is what SQLite
itself relays DATE and DATETIME types to. I doubt it matters greatly,
but I use NUMERIC for any kind of date and it seems to work well, and if
I decide to change the date-time storage format later, it requires
merely an UPDATE query as opposed to a schema change.
Ref: Section 3.1.1 on this page:
http://www.sqlite.org/datatype3.html#affinity_name_examples

Cheers,
Ryan

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Data types for date and time functions

J. King-3
On 2017-09-25 08:19:52, "R Smith" <[hidden email]> wrote:

>On 2017/09/25 10:12 AM, David Wellman wrote:
>
>C - Storing a string with a date or date and time, typically the
>standard form is ISO8601 which looks like 'YYYY-MM-DDTHH:NN:SS.MSS
>+ZZ:ZZ' with the T optionally being a space and the + could be + or -
>to indicate offset (Zulu time) etc. Some parts are optional, there is a
>good argument to not store Zulu time (offsets) but revert to UTC and
>let the displaying app format the time to whatever time-frame the user
>expects (much like Unix timestamps), so most actual stored dates end up
>as simply 'YYYY-MM-DD HH:NN:SS' or even just 'YYYY-MM-DD' if not
>interested in the time.

It's worth noting that the CURRENT_TIMESTAMP keyword uses "YYYY-MM-DD
HH:MM:SS" format, and implied UTC.

--
J. King


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Data types for date and time functions

Keith Medcalf
In reply to this post by R Smith

On Monday, 25 September, 2017 06:20, R Smith <[hidden email]> wrote:

>On 2017/09/25 10:12 AM, David Wellman wrote:

>All of these have pro's and cons. Integer storage is usually most
>efficient, but it takes some calculation to interpret, however SQLite
>is very efficient at it, but if you want to do it in your own code,
>you will need some standard functions to do so.

>REAL/FLOAT is used by many systems (such as MS Excel) and it is a
>really great storage method for doing time-intensive calculations
>and being easy to understand by humans (well, a bit more-so than
>Unix stamps anyway). Adding a day is as simple as adding 1 to the
>value (with the Unix timestamp you have to add 86400), adding 12
>hours (half a day) is as easy as adding 0.5, etc. Subtracting
>3 o'clock from 9 o'clock leaves you with 0.25, which is a quarter
>of a day, so 6 hours exactly, and so on.

>Strings take up some more space (though not that much actually) but
>has the one significant advantage when trouble-shooting or debugging,
>that the date is completely human readable as it is stored in the
>table, a mere glance at the data will already tell you if the date
>is a factor in whatever problem you are solving, whereas the other
>formats need a bit of interpretation first (though very easy with
>a basic format statement in the query). It has the disadvantage
>that, although SQLite provides great and easy-to-use date-time
>adding/calculating functions, they do take some more CPU cycles
>parsing the date string. (To be fair, I've measured this at
>some point, and it was negligible, and even though it was long
>ago, I doubt SQLite got worse at parsing dates.)

>I always use ISO8601 strings storing UTC (non-offset) dates and
>times, UNLESS data storage space is a big problem or date-time
>calculation is a constant feature of the intended use of the
>table, then an Integer Unix timestamp. If the date is simply
>stored and referenced in a normal way, ISO date strings it is.

To add to this, you can use any standard supported format, timestrings (ISO8601), julian day numbers (float), or Unix timestamps.  The advantage of using either the Julian or Unix timestamp is that they are already UTC based (or ought to be) and just about everything can handle them.  Using the timestring format can cause issues, however, because unless you "know" that the string is UTC, someone "just looking" or "editing" the database by hand may try to store a localtime value.  In this case, obfuscation is better (I believe).

Of course, an issue that you will run into, no matter what format you choose, is that Microsoft Products cannot convert between UTC and Localtime except for the current set of timezone rules (and assumes that the current rules are propeleptic), and you will need an external library and tables to do this conversion correctly.  This is a limitation of the core runtime libraries and cannot be fixed if you let Windows Platform API's do the conversions.

For this reason it is generally better to store a format such as the julian day or the unix timestamp so you do not have to deal with the vagaries of processing/converting "localtime" on Microsoft platforms.




_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users