(Info) Shox96 Compression as SQLite UDF


(Info) Shox96 Compression as SQLite UDF

Arun - Siara Logics (cc)
Shox96 is a compression technique for short strings. It can achieve up to 65% compression. It is available as a loadable extension for compressing text columns in SQLite, in the repository https://github.com/siara-cc/Shox96_Sqlite_UDF.

Output screenshot: https://github.com/siara-cc/Shox96_Sqlite_UDF/blob/master/output.png?raw=true
To find out more about Shox96 click: https://github.com/siara-cc/Shox96 
To find out how Shox96 works click: https://github.com/siara-cc/Shox96/blob/master/Shox96_Article_0_2_0.pdf?raw=true 

P.S.: The compressor and decompressor are built for short strings and use little memory, which makes them suitable for constrained environments such as Arduino Uno and ESP8266. As a result, they may not be as fast as Zip or GZip.
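
For readers who have not used compression functions inside SQL before, the general pattern can be sketched as below. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module, zlib stands in for Shox96, and the function names compress_text and decompress_text are hypothetical, not the names exported by the extension. The real extension would be loaded through SQLite's loadable-extension mechanism instead of being registered from Python.

```python
import sqlite3
import zlib


def compress_text(text):
    """Stand-in compressor (zlib, NOT Shox96): TEXT in, BLOB out."""
    if text is None:
        return None
    return zlib.compress(text.encode("utf-8"))


def decompress_text(blob):
    """Inverse of compress_text: BLOB in, TEXT out."""
    if blob is None:
        return None
    return zlib.decompress(blob).decode("utf-8")


conn = sqlite3.connect(":memory:")
# Hypothetical function names; the real extension defines its own.
conn.create_function("compress_text", 1, compress_text)
conn.create_function("decompress_text", 1, decompress_text)

conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body BLOB)")

original = "Hello World, this is a short string. " * 3
# Compress on the way in ...
conn.execute("INSERT INTO notes (body) VALUES (compress_text(?))", (original,))
# ... and decompress on the way out.
restored = conn.execute(
    "SELECT decompress_text(body) FROM notes WHERE id = 1"
).fetchone()[0]
stored = conn.execute("SELECT body FROM notes WHERE id = 1").fetchone()[0]

print(len(original.encode("utf-8")), "bytes as text,", len(stored), "bytes stored")
```

The column holds a BLOB, so every query that needs the text has to call the decompression function explicitly.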

Related projects:
Sqlite3 Library for ESP32
https://github.com/siara-cc/esp32_arduino_sqlite3_lib 
Sqlite3 Library for ESP8266
https://github.com/siara-cc/esp_arduino_sqlite3_lib 
Sqlite3 Library for ESP-IDF
https://github.com/siara-cc/esp32-idf-sqlite3 
Storing compressed Shox96 text content in Arduino Flash Memory
https://github.com/siara-cc/Shox96_Arduino_Progmem_lib 
Shox96 Compression Library for Arduino
https://github.com/siara-cc/Shox96_Arduino_lib 

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: (Info) Shox96 Compression as SQLite UDF

wmertens
Wonderful!

Things I wonder:

* Would it be possible to set up columns in such a way that the compression
is transparent, that is, existing queries remain unchanged?
* How does it fare on JSON strings? I notice that double-quotes are short,
but array and object delimiters are 11/12 bits?

It seems to me that it could be worthwhile to do transparent compression on
JSON strings, perhaps using a differently trained dictionary…
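
One way the transparency asked about here might be approximated in stock SQLite is a view over the compressed table, with INSTEAD OF triggers handling writes. The sketch below is an assumption-laden illustration, not something the extension is known to provide: zlib again stands in for Shox96, and the names pack, unpack, docs and docs_raw are hypothetical.

```python
import sqlite3
import zlib


def pack(text):
    """Stand-in compressor (zlib, NOT Shox96)."""
    return None if text is None else zlib.compress(text.encode("utf-8"))


def unpack(blob):
    """Inverse of pack."""
    return None if blob is None else zlib.decompress(blob).decode("utf-8")


conn = sqlite3.connect(":memory:")
conn.create_function("pack", 1, pack)
conn.create_function("unpack", 1, unpack)

conn.executescript("""
-- Physical table: holds the compressed bytes.
CREATE TABLE docs_raw (id INTEGER PRIMARY KEY, body BLOB);

-- Logical table: existing queries read and write this view.
CREATE VIEW docs AS
    SELECT id, unpack(body) AS body FROM docs_raw;

CREATE TRIGGER docs_insert INSTEAD OF INSERT ON docs
BEGIN
    INSERT INTO docs_raw (id, body) VALUES (NEW.id, pack(NEW.body));
END;

CREATE TRIGGER docs_update INSTEAD OF UPDATE ON docs
BEGIN
    UPDATE docs_raw SET body = pack(NEW.body) WHERE id = OLD.id;
END;

CREATE TRIGGER docs_delete INSTEAD OF DELETE ON docs
BEGIN
    DELETE FROM docs_raw WHERE id = OLD.id;
END;
""")

text = '{"name":"Ada","city":"Pune"}'
# Plain SQL against the view; no compression calls in the queries.
conn.execute("INSERT INTO docs (body) VALUES (?)", (text,))
conn.execute("UPDATE docs SET body = body || ' (edited)' WHERE id = 1")
body = conn.execute("SELECT body FROM docs WHERE id = 1").fetchone()[0]
stored = conn.execute("SELECT body FROM docs_raw WHERE id = 1").fetchone()[0]
print(body)
```

The trade-off is that any filter on body through the view decompresses each candidate row, and an index on the compressed column cannot serve text comparisons.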

Wout.


On Wed, Feb 27, 2019 at 12:26 PM Arun - Siara Logics (cc) <[hidden email]>
wrote:

> [snip]

Re: (Info) Shox96 Compression as SQLite UDF

Jens Alfke-2


> On Feb 28, 2019, at 3:14 PM, Wout Mertens <[hidden email]> wrote:
>
> It seems to me that it could be worthwhile to do transparent compression on
> JSON strings, perhaps using a differently trained dictionary…

Regular LZ/zip-type compression already does a good job on JSON, even on fairly short snippets. Not just because of repeated key strings, but because there are a number of very common byte sequences like
        },{    ":"    ","    {{    }}    "}    "},    "},"
(those are literal quotes)
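
A rough way to check this on one's own data is to compare plain zlib output with zlib primed by a preset dictionary of such sequences, which also touches on the "differently trained dictionary" idea quoted above. The following is a minimal sketch using Python's zlib module; the sample record and the key names in the dictionary are made up for illustration.

```python
import zlib

# Preset dictionary: the common JSON byte sequences listed above, plus two
# hypothetical key fragments. zlib favours entries near the end of the
# dictionary, so the most likely matches go last.
COMMON = [b'{{', b'}}', b'"},"', b'},{', b'":"', b'","',
          b'{"name":"', b'","city":"']
JSON_DICT = b"".join(COMMON)


def deflate(data, zdict=None):
    """Compress with zlib, optionally primed with a preset dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def inflate(blob, zdict=None):
    """Decompress; the same dictionary must be supplied if one was used."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


# Made-up sample record set, short enough to count as a "short string".
sample = b'[{"name":"Ada","city":"Pune"},{"name":"Bob","city":"Oslo"}]'

plain = deflate(sample)
primed = deflate(sample, JSON_DICT)

print("raw bytes:      ", len(sample))
print("zlib:           ", len(plain))
print("zlib + dict:    ", len(primed))
```

The dictionary has to be stored alongside the application, since the decompressor needs exactly the same bytes to restore the data.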

Shox96 is an interesting idea, but it seems like a real disadvantage that it only compresses alphanumeric characters, fails on non-ASCII, and is biased toward English.

—Jens