issue_comments

18 rows where issue = 837308703 sorted by updated_at descending

user 1

  • simonw 18

issue 1

  • Figure out why SpatiaLite 5.0 hangs the database page on Linux · 18 ✖

author_association 1

  • OWNER 18
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
804261915 https://github.com/simonw/datasette/issues/1268#issuecomment-804261915 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwNDI2MTkxNQ== simonw 9599 2021-03-22T17:41:12Z 2021-03-22T17:41:12Z OWNER

Closing this because I've figured out the root of the problem now, and I have a potential solution.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803802957 https://github.com/simonw/datasette/issues/1268#issuecomment-803802957 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzgwMjk1Nw== simonw 9599 2021-03-22T06:38:14Z 2021-03-22T06:38:14Z OWNER

Also worth trying is to change this code:

n = 1000
if ms < 50:
    n = 1

What happens with n = 10 instead?
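For context, n is the second argument to set_progress_handler(): SQLite calls the handler roughly every n virtual machine instructions. The following is a minimal sketch of a progress-handler time limit, assuming a helper named time_limit (a hypothetical name, not Datasette's own sqlite_timelimit) and an endless recursive query as the slow statement.

```python
import sqlite3
import time
from contextlib import contextmanager


@contextmanager
def time_limit(conn, ms, n=1000):
    """Abort any query on conn that runs longer than ms milliseconds."""
    deadline = time.perf_counter() + ms / 1000

    def handler():
        # A non-zero return value tells SQLite to abort the current query.
        return 1 if time.perf_counter() >= deadline else 0

    # SQLite calls handler() roughly every n virtual machine instructions.
    conn.set_progress_handler(handler, n)
    try:
        yield
    finally:
        # Remove the handler again once the block exits.
        conn.set_progress_handler(None, n)


conn = sqlite3.connect(":memory:")
# An endless recursive CTE stands in for a slow query.
slow_sql = """
with recursive counter(x) as (
    select 1 union all select x + 1 from counter
)
select count(*) from counter
"""
try:
    with time_limit(conn, ms=20, n=10):
        conn.execute(slow_sql).fetchall()
    interrupted = False
except sqlite3.OperationalError:
    interrupted = True
```

A smaller n checks the deadline more often at the cost of more Python calls per query. Whether a different n changes the SpatiaLite 5.0 behaviour is exactly the open question here.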

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803784902 https://github.com/simonw/datasette/issues/1268#issuecomment-803784902 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc4NDkwMg== simonw 9599 2021-03-22T05:59:06Z 2021-03-22T05:59:06Z OWNER

Even if I implement that workaround in #1269 I'm concerned that this could still allow users to deliberately crash Datasette (if it's running SpatiaLite 5.0) by executing select count(*) from SpatialIndex.

That interrupt timeout mechanism is worth digging into further.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803782705 https://github.com/simonw/datasette/issues/1268#issuecomment-803782705 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc4MjcwNQ== simonw 9599 2021-03-22T05:54:19Z 2021-03-22T05:54:19Z OWNER

Got two new TILs out of this:

  • Tracing every executed Python statement
  • Running gdb against a Python process in a running Docker container
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803777724 https://github.com/simonw/datasette/issues/1268#issuecomment-803777724 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc3NzcyNA== simonw 9599 2021-03-22T05:42:50Z 2021-03-22T05:43:23Z OWNER

If I want to avoid counting virtual tables, I need to detect which tables are virtual tables.

The safest way to do this is probably to pull the sql for every table and then, in Python, lowercase each value and check whether it starts with create virtual table, allowing any amount of whitespace between the words.

This would catch things like CREATE virtual TABLE which might be missed by a SQL LIKE query.
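A minimal sketch of that check, assuming hypothetical helper names (is_virtual_table_sql, virtual_table_names) and a case-insensitive regular expression in place of an explicit lower-case conversion:

```python
import re
import sqlite3

# "create virtual table" in any letter case, with any whitespace between words.
VIRTUAL_TABLE_RE = re.compile(r"^\s*create\s+virtual\s+table\b", re.IGNORECASE)


def is_virtual_table_sql(sql):
    """Return True if a sqlite_master.sql value defines a virtual table."""
    return bool(sql) and VIRTUAL_TABLE_RE.match(sql) is not None


def virtual_table_names(conn):
    """Return the names of all tables whose schema SQL looks virtual."""
    rows = conn.execute(
        "select name, sql from sqlite_master where type = 'table'"
    ).fetchall()
    return [name for name, sql in rows if is_virtual_table_sql(sql)]


conn = sqlite3.connect(":memory:")
conn.execute("create table plain (id integer primary key)")
found = virtual_table_names(conn)  # no virtual tables in this database
```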

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803775121 https://github.com/simonw/datasette/issues/1268#issuecomment-803775121 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc3NTEyMQ== simonw 9599 2021-03-22T05:36:26Z 2021-03-22T05:36:26Z OWNER

So one fix could be to avoid running counts for anything that turns out to be a virtual table.
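As a rough sketch of that idea (not Datasette's actual code), the count loop could take a set of table names to skip. The function name table_counts and the choice of None for skipped tables are assumptions made for illustration.

```python
import sqlite3


def table_counts(conn, skip=()):
    """Return {table: row_count}, with None for tables listed in skip."""
    names = [
        row[0]
        for row in conn.execute(
            "select name from sqlite_master where type = 'table'"
        ).fetchall()
    ]
    counts = {}
    for name in names:
        if name in skip:
            counts[name] = None  # deliberately not counted
            continue
        quoted = name.replace('"', '""')  # escape embedded double quotes
        counts[name] = conn.execute(
            'select count(*) from "{}"'.format(quoted)
        ).fetchone()[0]
    return counts


conn = sqlite3.connect(":memory:")
conn.execute("create table a (id integer)")
conn.execute("create table b (id integer)")
conn.executemany("insert into a values (?)", [(1,), (2,)])
counts = table_counts(conn, skip={"b"})
```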

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803774926 https://github.com/simonw/datasette/issues/1268#issuecomment-803774926 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc3NDkyNg== simonw 9599 2021-03-22T05:35:56Z 2021-03-22T05:35:56Z OWNER

That's in this code here: https://github.com/simonw/datasette/blob/c4f1ec7f33fd7d5b93f0f895dafb5351cc3bfc5b/datasette/database.py#L221-L241

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803774518 https://github.com/simonw/datasette/issues/1268#issuecomment-803774518 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc3NDUxOA== simonw 9599 2021-03-22T05:34:57Z 2021-03-22T05:34:57Z OWNER

... and sure enough, adding this code fixed the problem:

diff --git a/datasette/database.py b/datasette/database.py
index 3579cce..b466b12 100644
--- a/datasette/database.py
+++ b/datasette/database.py
@@ -224,6 +226,9 @@ class Database:
         # Try to get counts for each table, $limit timeout for each count
         counts = {}
         for table in await self.table_names():
+            if table == "SpatialIndex":
+                counts[table] = 0
+                continue
             try:
                 table_count = (
                     await self.execute(

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803773484 https://github.com/simonw/datasette/issues/1268#issuecomment-803773484 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc3MzQ4NA== simonw 9599 2021-03-22T05:32:29Z 2021-03-22T05:32:29Z OWNER

To figure out which SQL query triggers the problem I added this code to write to a log file:

with sqlite_timelimit(conn, time_limit_ms):
    try:
        cursor = conn.cursor()
        with open("/tmp/sql.log", "ab", buffering=0) as fp:
            fp.write(("{}: {}\n".format(sql, params)).encode("utf-8"))
        cursor.execute(sql, params if params is not None else {})

I had to use ab binary mode because Python doesn't allow buffering=0 for non-binary file operations.
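The buffering restriction can be seen in isolation. This is a small standalone sketch, assuming a temporary log path and a hypothetical log_sql helper rather than the patched Datasette code:

```python
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "sql.log")

# Text mode refuses unbuffered I/O and raises ValueError.
try:
    open(log_path, "a", buffering=0)
    text_mode_error = None
except ValueError as ex:
    text_mode_error = str(ex)


def log_sql(sql, params):
    """Append one line to the log; each write goes straight to the OS."""
    with open(log_path, "ab", buffering=0) as fp:
        fp.write("{}: {}\n".format(sql, params).encode("utf-8"))


log_sql("select count(*) from [SpatialIndex]", None)
with open(log_path, "rb") as fp:
    logged = fp.read()
```

Writing unbuffered matters when the process is about to hang: the query text reaches the file before execute() is called, so the last line of the log identifies the offending statement.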

With the log enabled, I used docker exec -it 589ae68de943 bash to attach to the running container and tail -f /tmp/sql.log to see the logs. Here's where it broke:

select count(*) from [idx_civici_geom_parent]: None
select count(*) from [sqlite_stat1]: None
select count(*) from [sqlite_stat3]: None
select count(*) from [SpatialIndex]: None

So attempting to run a count(*) against the SpatialIndex virtual table is the thing that triggers the bug.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803764919 https://github.com/simonw/datasette/issues/1268#issuecomment-803764919 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc2NDkxOQ== simonw 9599 2021-03-22T05:11:11Z 2021-03-22T05:11:11Z OWNER

Maybe I could implement SQLite query timeouts using the interrupt() method instead of the progress handler hack I'm currently using?

https://stackoverflow.com/questions/43240496/python-sqlite3-how-to-quickly-and-cleanly-interrupt-long-running-query-with-e has some tips.
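A minimal sketch of the interrupt() approach, assuming a hypothetical execute_with_timeout helper and a timer thread. It is untested against SpatiaLite 5.0, so it may or may not avoid the hang.

```python
import sqlite3
import threading


def execute_with_timeout(conn, sql, seconds):
    """Run sql, calling conn.interrupt() from a timer thread after `seconds`."""
    timer = threading.Timer(seconds, conn.interrupt)
    timer.start()
    try:
        return conn.execute(sql).fetchall()
    finally:
        # Stop the timer if the query finished (or failed) before it fired.
        timer.cancel()


# check_same_thread=False because the timer thread touches the connection.
conn = sqlite3.connect(":memory:", check_same_thread=False)
# An endless recursive CTE stands in for a slow query.
slow_sql = """
with recursive counter(x) as (
    select 1 union all select x + 1 from counter
)
select count(*) from counter
"""
try:
    execute_with_timeout(conn, slow_sql, 0.05)
    interrupted = False
except sqlite3.OperationalError:
    interrupted = True

fast_rows = execute_with_timeout(conn, "select 1", 1.0)
```

Unlike the progress handler, this does not run Python code inside the query loop; the cost is one extra thread per timed query.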

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803764200 https://github.com/simonw/datasette/issues/1268#issuecomment-803764200 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc2NDIwMA== simonw 9599 2021-03-22T05:09:13Z 2021-03-22T05:09:13Z OWNER

I tried building a container where the conn.set_progress_handler(handler, n) line was commented out... and it fixed the bug.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803762969 https://github.com/simonw/datasette/issues/1268#issuecomment-803762969 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc2Mjk2OQ== simonw 9599 2021-03-22T05:05:51Z 2021-03-22T05:05:51Z OWNER

I had to run docker kill 16197781a7b5 to kill the broken container - Ctrl+C in the Datasette console window didn't do anything.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803762609 https://github.com/simonw/datasette/issues/1268#issuecomment-803762609 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc2MjYwOQ== simonw 9599 2021-03-22T05:05:00Z 2021-03-22T05:05:00Z OWNER

Using https://til.simonwillison.net/docker/attach-bash-to-running-container - I figured out how to run gdb. I had to use --privileged here because otherwise gdb showed a "Could not attach to process" error.

```
docker exec --privileged -it 16197781a7b5 bash
apt-get install gdb python3-dbg
gdb /usr/bin/python3 -p 20
```

This paused the process. I tried running this:

```
(gdb) py-bt
Traceback (most recent call first):
  File "/usr/lib/python3.8/asyncio/base_events.py", line 1845, in _run_once
    if handle._cancelled:
  File "/usr/lib/python3.8/asyncio/base_events.py", line 570, in run_forever
    self._run_once()
  File "/usr/lib/python3.8/asyncio/base_events.py", line 603, in run_until_complete
    self.run_forever()
  File "/usr/local/lib/python3.8/dist-packages/uvicorn/server.py", line 49, in run
    loop.run_until_complete(self.serve(sockets=sockets))
  File "/usr/local/lib/python3.8/dist-packages/uvicorn/main.py", line 386, in run
    server.run()
  File "/usr/local/lib/python3.8/dist-packages/datasette/cli.py", line 575, in serve
    uvicorn.run(ds.app(), **uvicorn_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/bin/datasette", line 8, in <module>
    sys.exit(cli())
  <built-in method exec of module object at remote 0x7f0981a280e0>
  File "/usr/lib/python3.8/trace.py", line 450, in runctx
    exec(cmd, globals, locals)
  File "/usr/lib/python3.8/trace.py", line 6632, in main
  File "/usr/lib/python3.8/trace.py", line 756, in <module>
    main()
  <built-in method exec of module object at remote 0x7f0981a280e0>
  File "/usr/lib/python3.8/runpy.py", line 343, in _run_code
  File "/usr/lib/python3.8/runpy.py", line 450, in _run_module_as_main
```

Not sure if that's useful or not.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803759051 https://github.com/simonw/datasette/issues/1268#issuecomment-803759051 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc1OTA1MQ== simonw 9599 2021-03-22T04:55:22Z 2021-03-22T04:55:22Z OWNER

So I think there's a bug in the way the set_progress_handler() mechanism works when used in conjunction with SpatiaLite 5.0 on Linux.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803758793 https://github.com/simonw/datasette/issues/1268#issuecomment-803758793 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc1ODc5Mw== simonw 9599 2021-03-22T04:54:32Z 2021-03-22T04:54:32Z OWNER

Hitting http://localhost:8001/tuscany_housenumbers triggers the bug. It gets stuck in a loop that looks like this:

Which looks to me like this code: https://github.com/simonw/datasette/blob/8e18c7943181f228ce5ebcea48deb59ce50bee1f/datasette/utils/__init__.py#L139-L158

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803758182 https://github.com/simonw/datasette/issues/1268#issuecomment-803758182 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc1ODE4Mg== simonw 9599 2021-03-22T04:52:15Z 2021-03-22T04:52:15Z OWNER

Hitting http://localhost:8001/ successfully shows the homepage (after a lot more scrolling).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803757746 https://github.com/simonw/datasette/issues/1268#issuecomment-803757746 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc1Nzc0Ng== simonw 9599 2021-03-22T04:50:40Z 2021-03-22T04:51:52Z OWNER

Here's a fun debugging trick:

docker run -it -p 8001:8001 -v `pwd`:/mnt datasette-spatialite:latest bash
root@16197781a7b5:/# python3 -m trace --trace $(which datasette) \
  -p 8001 -h 0.0.0.0 /mnt/tuscany_housenumbers.sqlite \
  --load-extension=spatialite

A huge amount of stuff scrolls past as Datasette starts up, since we are tracing every executed line of Python.

After about a minute it's finished starting and gets to this point:

selectors.py(452): if timeout is None:
selectors.py(454): elif timeout <= 0:
selectors.py(459): timeout = math.ceil(timeout * 1e3) * 1e-3
selectors.py(464): max_ev = max(len(self._fd_to_key), 1)
selectors.py(466): ready = []
selectors.py(467): try:
selectors.py(468): fd_event_list = self._selector.poll(timeout, max_ev)

Now I can make some HTTP requests against it.
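The same tracing can be applied to a single call from inside Python using the standard library's trace module, which is sometimes less noisy than tracing a whole process. A small sketch, where the add function is a made-up stand-in for the code under investigation:

```python
import trace


def add(a, b):
    total = a + b
    return total


# trace=True prints each executed line; count=False skips coverage counting.
tracer = trace.Trace(trace=True, count=False)
result = tracer.runfunc(add, 2, 3)
```

runfunc() returns whatever the traced function returns, so the call can be dropped into existing code while the executed lines are printed to standard output.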

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  
803756495 https://github.com/simonw/datasette/issues/1268#issuecomment-803756495 https://api.github.com/repos/simonw/datasette/issues/1268 MDEyOklzc3VlQ29tbWVudDgwMzc1NjQ5NQ== simonw 9599 2021-03-22T04:46:04Z 2021-03-22T04:46:04Z OWNER

gdb may be able to help debug this: https://www.podoliaka.org/2016/04/10/debugging-cpython-gdb/

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Figure out why SpatiaLite 5.0 hangs the database page on Linux 837308703  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 19.538ms · About: github-to-sqlite