Avoiding resource exhaustion caused by executing untrusted SQL queries


Avoiding resource exhaustion caused by executing untrusted SQL queries

Simon Willison
I'm working on a Python web application project (Datasette) where I'd like
to accept untrusted SQL queries from users and execute them against a
SQLite database, without risk of malicious SQL being used to perform a
denial of service attack against my application.

I'm opening the database file in SQLite's read-only mode (?mode=ro in
the URI passed to sqlite3.connect()), so I'm not worried about them
being able to modify or corrupt the database file.

My concern is more about queries that might use an enormous quantity of
RAM or starve the CPU.

One example: if I have a database with 10 million rows, I worry that the
following query may break my application:

select group_concat(large_column) from large_table;

I'm using the Python standard library sqlite3 module, but I'm willing to
ship a custom compiled build of SQLite (based on
https://github.com/coleifer/pysqlite3 ) if necessary.

My absolute ideal solution would be a way of saying to SQLite "if a query
takes longer than X seconds or consumes more than Y amount of memory,
cancel the query and return an error".

I've actually implemented a time limit for queries by hooking into the
progress callback - and having it cancel the query if more than a second
has passed since it started - but I'm not sure how to handle the memory
challenge.
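For reference, a sketch of that progress-callback time limit (the
one-second default and the interval of 1000 virtual-machine instructions
are arbitrary choices, not anything SQLite prescribes):

```python
import sqlite3
import time

def execute_with_time_limit(conn, sql, limit_seconds=1.0):
    """Run sql, cancelling it if it takes longer than limit_seconds.

    The progress handler is invoked every 1000 SQLite VM instructions;
    returning a true value from it interrupts the current statement,
    which surfaces as sqlite3.OperationalError ("interrupted").
    """
    start = time.monotonic()

    def handler():
        # Non-zero return value tells SQLite to abort the query.
        return 1 if time.monotonic() - start > limit_seconds else 0

    conn.set_progress_handler(handler, 1000)
    try:
        return conn.execute(sql).fetchall()
    finally:
        # Remove the handler so later queries aren't affected.
        conn.set_progress_handler(None, 1000)
```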

Or maybe I'm overthinking this?  Any tips on directions I could be
investigating here would be very much appreciated!

Thanks,

Simon
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: Avoiding resource exhaustion caused by executing untrusted SQL queries

Richard Hipp
On 8/5/19, Simon Willison <[hidden email]> wrote:
>
> My concern is more queries that might use an enormous quantity of RAM or
> starve the CPU.
>
> I'm using the Python standard library sqlite3 module,

(1) You could monitor SQLite memory usage in your progress callback
and interrupt the running query if the memory usage goes over some
threshold, perhaps.

(2) There is also a patch
(https://www.sqlite.org/src/info/b0ccef61a7f92d20) that adds a new
"sqlite3_hard_heap_limit()" interface and PRAGMA.  If you apply this
patch, then you could set an upper bound on the amount of memory
SQLite is allowed to use and it will return SQLITE_NOMEM if it ever
goes over that limit.
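(That patch was subsequently merged: SQLite 3.31.0 and later ship the
hard_heap_limit PRAGMA, so with a recent enough library it can be set
directly from Python. The 30 MB figure below is just an example value.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Requires SQLite >= 3.31.0, where the hard heap limit landed.
if sqlite3.sqlite_version_info >= (3, 31, 0):
    # Cap SQLite's total heap usage at ~30 MB; queries that would push
    # usage past the cap fail with an out-of-memory error (SQLITE_NOMEM).
    conn.execute("PRAGMA hard_heap_limit = 30000000")
```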

(3) You can also register your own memory-allocator at start time, and
instrument your custom memory allocator to fail after using too much
memory.

(4) If you compile with -DSQLITE_ENABLE_MEMSYS5, then SQLite builds in
its own optimal memory allocator that uses only memory from a single
big allocation you provide it at start-time.  So you give SQLite 30MB
(or whatever you think is appropriate) to play with, and it uses that
and only that memory, and never goes to malloc().
--
D. Richard Hipp
[hidden email]