
issue_comments


30 rows where issue = 1901416155 and user = 9599 sorted by updated_at descending


user 1

  • simonw · 30 ✖

issue 1

  • Server hang on parallel execution of queries to named in-memory databases · 30 ✖

author_association 1

  • OWNER 30
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1730388418 https://github.com/simonw/datasette/issues/2189#issuecomment-1730388418 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5nI6HC simonw 9599 2023-09-21T22:26:19Z 2023-09-21T22:26:19Z OWNER

1.0a7 is out with this fix as well now: https://docs.datasette.io/en/1.0a7/changelog.html#a7-2023-09-21

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1730232308 https://github.com/simonw/datasette/issues/2189#issuecomment-1730232308 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5nIT_0 simonw 9599 2023-09-21T20:11:16Z 2023-09-21T20:11:16Z OWNER

We're planning a breaking change in 1.0a7: - #2191

Since that's a breaking change I'm going to ship 1.0a7 right now with this fix, then ship that breaking change as 1.0a8 instead.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1730231404 https://github.com/simonw/datasette/issues/2189#issuecomment-1730231404 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5nITxs simonw 9599 2023-09-21T20:10:28Z 2023-09-21T20:10:28Z OWNER

Release 0.64.4: https://docs.datasette.io/en/stable/changelog.html#v0-64-4

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1730162283 https://github.com/simonw/datasette/issues/2189#issuecomment-1730162283 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5nIC5r simonw 9599 2023-09-21T19:19:47Z 2023-09-21T19:19:47Z OWNER

I'm going to release this in 1.0a7, and I'll backport it to a 0.64.4 release too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1726749355 https://github.com/simonw/datasette/issues/2189#issuecomment-1726749355 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5m7Bqr simonw 9599 2023-09-20T01:28:16Z 2023-09-20T01:28:16Z OWNER

Added a note to that example in the documentation: https://github.com/simonw/datasette/blob/4e6a34179eaedec44c1263275d7592fd83d7e2ac/docs/internals.rst?plain=1#L1320

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724325068 https://github.com/simonw/datasette/issues/2189#issuecomment-1724325068 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxxzM simonw 9599 2023-09-18T20:29:41Z 2023-09-18T20:29:41Z OWNER

The one other thing affected by this change is this documentation, which suggests a not-actually-safe pattern: https://github.com/simonw/datasette/blob/6ed7908580fa2ba9297c3225d85c56f8b08b9937/docs/internals.rst#L1292-L1321

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724317367 https://github.com/simonw/datasette/issues/2189#issuecomment-1724317367 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxv63 simonw 9599 2023-09-18T20:25:44Z 2023-09-18T20:25:44Z OWNER

My current hunch is that SQLite gets unhappy if multiple threads access the same underlying C object - which sometimes happens with in-memory connections and Datasette presumably because they are faster than file-backed databases.

I'm going to remove the asyncio.gather() code from the table view. I'll ship a 0.x release with that fix too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724315591 https://github.com/simonw/datasette/issues/2189#issuecomment-1724315591 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxvfH simonw 9599 2023-09-18T20:24:30Z 2023-09-18T20:24:30Z OWNER

Using SQLite In Multi-Threaded Applications

That indicates that there's a SQLite option for "Serialized" mode where it's safe to access anything SQLite provides from multiple threads, but as far as I can tell Python doesn't give you an option to turn that mode on or off for a connection - you can read sqlite3.threadsafety to see if that mode was compiled in or not, but not actually change it.

On my Mac sqlite3.threadsafety is 1, which according to https://docs.python.org/3/library/sqlite3.html#sqlite3.threadsafety means "Multi-thread: In this mode, SQLite can be safely used by multiple threads provided that no single database connection is used simultaneously in two or more threads." It would need to be 3 for that serialized mode.
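A small sketch for checking this on any machine. The `THREADSAFETY_LEVELS` table and `describe_threadsafety()` helper are names made up for this illustration; the level descriptions paraphrase the Python `sqlite3` documentation. As far as I can tell from those docs, Python versions before 3.11 hard-code the value to 1, so on older interpreters it may not reflect how the underlying SQLite library was compiled.

```python
import sqlite3

# DB-API threadsafety levels, paraphrased from the Python sqlite3 docs.
# (Hypothetical lookup table for this example only.)
THREADSAFETY_LEVELS = {
    0: "single-thread: threads may not share the module",
    1: "multi-thread: threads may share the module, but not connections",
    3: "serialized: threads may share the module and connections",
}


def describe_threadsafety(level: int) -> str:
    """Return a human-readable description of a sqlite3.threadsafety value."""
    return THREADSAFETY_LEVELS.get(level, f"unknown level {level}")


if __name__ == "__main__":
    # sqlite3.threadsafety is a read-only module attribute, not a setting.
    print(sqlite3.threadsafety, "->", describe_threadsafety(sqlite3.threadsafety))
```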

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724305169 https://github.com/simonw/datasette/issues/2189#issuecomment-1724305169 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxs8R simonw 9599 2023-09-18T20:16:22Z 2023-09-18T20:16:36Z OWNER

Looking again at this code:

https://github.com/simonw/datasette/blob/6ed7908580fa2ba9297c3225d85c56f8b08b9937/datasette/database.py#L87-L117

check_same_thread=False really stands out here.

Python docs at https://docs.python.org/3/library/sqlite3.html

check_same_thread (bool) -- If True (default), ProgrammingError will be raised if the database connection is used by a thread other than the one that created it. If False, the connection may be accessed in multiple threads; write operations may need to be serialized by the user to avoid data corruption. See threadsafety for more information.

I think I'm playing with fire by allowing multiple threads to access the same connection without doing my own serialization of those requests.

I do do that using the write connection - and in this particular case the bug isn't coming from write queries, it's coming from read queries - but perhaps SQLite has issues with threading for reads, too.
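For illustration, one way to "do my own serialization" is to guard a shared connection with a lock, so only one thread touches it at a time. This is a rough sketch using the standard library only; `LockedConnection` and `run_from_threads` are hypothetical names, not anything in Datasette, and it says nothing about whether this would fix the hang described here.

```python
import sqlite3
import threading


class LockedConnection:
    """Hypothetical wrapper: one shared connection, one query at a time."""

    def __init__(self, path: str = ":memory:") -> None:
        # check_same_thread=False allows use from other threads; the lock
        # below is what actually serializes that use.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()

    def execute(self, sql: str, params: tuple = ()) -> list:
        with self._lock:
            rows = self._conn.execute(sql, params).fetchall()
            self._conn.commit()
            return rows


def run_from_threads(db: LockedConnection, thread_count: int = 8) -> list:
    """Run the same read query from several threads and collect the counts."""
    results = []

    def worker() -> None:
        rows = db.execute("select count(*) from t")
        results.append(rows[0][0])

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

The trade-off is that reads no longer overlap at all, which gives up whatever benefit parallel queries against a single connection were meant to provide.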

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724298817 https://github.com/simonw/datasette/issues/2189#issuecomment-1724298817 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxrZB simonw 9599 2023-09-18T20:11:26Z 2023-09-18T20:11:26Z OWNER

Now that I've confirmed that parallel query execution of the kind introduced in https://github.com/simonw/datasette/commit/942411ef946e9a34a2094944d3423cddad27efd3 can cause hangs (presumably some kind of locking issue) against in-memory databases, some options:

  1. Disable parallel execution entirely and rip out related code.
  2. Disable parallel execution entirely by leaving that code but having it always behave as if _noparallel=1
  3. Continue digging and try and find some way to avoid this problem

The parallel execution work is something I was playing with last year in the hope of speeding up Datasette pages like the table page which need to execute a bunch of queries - one for each facet, plus one for each column to see if it should be suggested as a facet.

I wrote about this at the time here: https://simonwillison.net/2022/May/6/weeknotes/

My hope was that despite Python's GIL this optimization would still help, because the SQLite C module releases the GIL once it gets to SQLite.

But... that didn't hold up. It looked like enough work was happening in Python land with the GIL that the optimization didn't improve things.

Running the nogil fork of Python DID improve things though! I left the code in partly in the hope that the nogil fork would be accepted into Python core.

... which it now has! But it will still be a year or two before it fully lands: https://discuss.python.org/t/a-steering-council-notice-about-pep-703-making-the-global-interpreter-lock-optional-in-cpython/30474

So I'm not particularly concerned about dropping the parallel execution. If I do drop it though do I leave the potentially complex code in that relates to it?
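To make the difference between parallel and `_noparallel=1` behaviour concrete, here is a minimal sketch of a switchable gather helper. The function names (`gather_parallel`, `gather_sequential`, `pick_gather`, `demo`) are invented for this example and are not Datasette's actual implementation; the two fake tasks stand in for the facet and suggested-facet queries.

```python
import asyncio


async def gather_parallel(*coros):
    """Run the awaitables concurrently, as asyncio.gather() does."""
    return await asyncio.gather(*coros)


async def gather_sequential(*coros):
    """Await each awaitable in turn, so only one runs at a time."""
    return [await coro for coro in coros]


def pick_gather(noparallel: bool):
    """Choose the helper, mimicking an opt-out flag such as ?_noparallel=1."""
    return gather_sequential if noparallel else gather_parallel


async def demo(noparallel: bool) -> list:
    """Record the order in which two fake queries start and finish."""
    order = []

    async def task(name: str, delay: float) -> str:
        order.append(f"start {name}")
        await asyncio.sleep(delay)
        order.append(f"end {name}")
        return name

    gather = pick_gather(noparallel)
    await gather(task("facets", 0.05), task("suggested", 0.01))
    return order
```

With `noparallel=True` the second task does not start until the first has finished; with `noparallel=False` both start before either ends.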

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724281824 https://github.com/simonw/datasette/issues/2189#issuecomment-1724281824 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxnPg simonw 9599 2023-09-18T19:58:06Z 2023-09-18T19:58:06Z OWNER

I also confirmed that http://127.0.0.1:8064/airtable_refs/airtable_refs?_noparallel=1 does not trigger the bug but http://127.0.0.1:8064/airtable_refs/airtable_refs does.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724278386 https://github.com/simonw/datasette/issues/2189#issuecomment-1724278386 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxmZy simonw 9599 2023-09-18T19:55:32Z 2023-09-18T19:55:32Z OWNER

OK it looks like it found it!

```
942411ef946e9a34a2094944d3423cddad27efd3 is the first bad commit
commit 942411ef946e9a34a2094944d3423cddad27efd3
Author: Simon Willison <swillison@gmail.com>
Date:   Tue Apr 26 15:48:56 2022 -0700

    Execute some TableView queries in parallel

    Use ?_noparallel=1 to opt out (undocumented, useful for benchmark comparisons)

    Refs #1723, #1715

 datasette/views/table.py | 93 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 67 insertions(+), 26 deletions(-)
bisect found first bad commit
```

https://github.com/simonw/datasette/commit/942411ef946e9a34a2094944d3423cddad27efd3 does look like the cause of this problem.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724276917 https://github.com/simonw/datasette/issues/2189#issuecomment-1724276917 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxmC1 simonw 9599 2023-09-18T19:54:23Z 2023-09-18T19:54:23Z OWNER

Turned out I wasn't running the datasette from the current directory, so it was not testing what I intended.

Fixed that with pip install -e . in the datasette/ directory.

Now I'm seeing some passes, which look like this:

    running '../airtable-export/testit.sh'
    INFO: Started server process [77810]
    INFO: Waiting for application startup.
    INFO: Application startup complete.
    INFO: Uvicorn running on http://127.0.0.1:8064 (Press CTRL+C to quit)
    Running curls
    INFO: 127.0.0.1:59439 - "GET /airtable_refs/airtable_refs HTTP/1.1" 200 OK
    INFO: 127.0.0.1:59440 - "GET /airtable_refs/airtable_refs HTTP/1.1" 200 OK
    INFO: 127.0.0.1:59441 - "GET /airtable_refs/airtable_refs HTTP/1.1" 200 OK
    All curl succeeded
    Killing datasette server with PID 77810
    ../airtable-export/testit.sh: line 54: 77810 Killed: 9 datasette pottery2.db -p $port
    All three curls succeeded.
    Bisecting: 4 revisions left to test after this (roughly 2 steps)
    [7463b051cf8d7f856df5eba9f7aa944183ebabe5] Cosmetic tweaks after blacken-docs, refs #1718
    running '../airtable-export/testit.sh'
    INFO: Started server process [77826]
    INFO: Waiting for application startup.
    INFO: Application startup complete.
    INFO: Uvicorn running on http://127.0.0.1:8064 (Press CTRL+C to quit)
    Running curls

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724257290 https://github.com/simonw/datasette/issues/2189#issuecomment-1724257290 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxhQK simonw 9599 2023-09-18T19:39:27Z 2023-09-18T19:44:26Z OWNER

I'm now trying this test script:

```bash
#!/bin/bash

port=8064

# Start datasette server in the background and get its PID
datasette pottery2.db -p $port &
server_pid=$!

# Wait for a moment to ensure the server has time to start up
sleep 2

# Initialize counters and parameters
retry_count=0
max_retries=3
success_count=0
path="/airtable_refs/airtable_refs"

# Function to run curl with a timeout
function test_curl {
    # Run the curl command with a timeout of 3 seconds
    timeout 3s curl -s "http://localhost:${port}${path}" > /dev/null
    if [ $? -eq 0 ]; then
        # Curl was successful
        ((success_count++))
    fi
}

# Try three parallel curl requests
while [[ $retry_count -lt $max_retries ]]; do
    # Reset the success counter
    success_count=0

    # Run the curls in parallel
    echo "  Running curls"
    test_curl
    test_curl
    test_curl #  & test_curl & test_curl &

    # Wait for all curls to finish
    #wait

    # Check the success count
    if [[ $success_count -eq 3 ]]; then
        # All curls succeeded, break out of the loop
        echo "  All curl succeeded"
        break
    fi

    ((retry_count++))
done

# Kill the datasette server
echo "Killing datasette server with PID $server_pid"
kill -9 $server_pid
sleep 2

# Print result
if [[ $success_count -eq 3 ]]; then
    echo "All three curls succeeded."
    exit 0
else
    echo "Error: Not all curls succeeded after $retry_count attempts."
    exit 1
fi
```

I run it like this:

```bash
git bisect reset
git bisect start
git bisect good 0.59.4
git bisect bad 1.0a6
git bisect run ../airtable-export/testit.sh
```

But... it's not having the desired result, I think because the bug is intermittent so each time I run it the bisect spits out a different commit as the one that is to blame.
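One possible way to make a bisect check more robust against an intermittent bug is to treat a commit as good only if every one of several attempts passes. The script above stops retrying as soon as one round succeeds, which can let a flaky bad commit look good. The sketch below is only an illustration of the stricter rule; `commit_looks_good` and `bisect_exit_code` are hypothetical helpers, and `check` stands for whatever probe is used (for example, a request with a timeout).

```python
from typing import Callable


def commit_looks_good(check: Callable[[], bool], attempts: int = 10) -> bool:
    """Return True only if every attempt passes.

    An intermittent hang may not show up on a single run, so one failure
    in `attempts` tries is treated as evidence that the commit is bad.
    """
    for _ in range(attempts):
        if not check():
            return False
    return True


def bisect_exit_code(check: Callable[[], bool], attempts: int = 10) -> int:
    """Map the result to exit codes understood by git bisect run.

    0 marks the commit as good; 1 marks it as bad.
    """
    return 0 if commit_looks_good(check, attempts) else 1
```

More attempts lower the chance of mislabelling a bad commit as good, at the cost of a slower bisect.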

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724263390 https://github.com/simonw/datasette/issues/2189#issuecomment-1724263390 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxive simonw 9599 2023-09-18T19:44:03Z 2023-09-18T19:44:03Z OWNER

I knocked it down to 1 retry just to see what happened.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724259229 https://github.com/simonw/datasette/issues/2189#issuecomment-1724259229 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxhud simonw 9599 2023-09-18T19:40:56Z 2023-09-18T19:40:56Z OWNER

I tried it with a path of / and everything passed - so it's definitely the path of /airtable_refs/airtable_refs (an in-memory database created by an experimental branch of https://github.com/simonw/airtable-export) that triggers the problem.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724258279 https://github.com/simonw/datasette/issues/2189#issuecomment-1724258279 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxhfn simonw 9599 2023-09-18T19:40:13Z 2023-09-18T19:40:13Z OWNER

Output while it is running looks like this:

    running '../airtable-export/testit.sh'
    INFO: Started server process [75649]
    INFO: Waiting for application startup.
    INFO: Application startup complete.
    INFO: Uvicorn running on http://127.0.0.1:8064 (Press CTRL+C to quit)
    Running curls
    Running curls
    Running curls
    Killing datasette server with PID 75649
    ../airtable-export/testit.sh: line 54: 75649 Killed: 9 datasette pottery2.db -p $port
    Error: Not all curls succeeded after 3 attempts.
    Bisecting: 155 revisions left to test after this (roughly 7 steps)
    [247e460e08bf823142f7b84058fe44e43626787f] Update beautifulsoup4 requirement (#1703)
    running '../airtable-export/testit.sh'
    INFO: Started server process [75722]
    INFO: Waiting for application startup.
    INFO: Application startup complete.
    INFO: Uvicorn running on http://127.0.0.1:8064 (Press CTRL+C to quit)
    Running curls
    Running curls
    Running curls
    Killing datasette server with PID 75722
    ../airtable-export/testit.sh: line 54: 75722 Killed: 9 datasette pottery2.db -p $port
    Error: Not all curls succeeded after 3 attempts.
    Bisecting: 77 revisions left to test after this (roughly 6 steps)
    [3ef47a0896c7e63404a34e465b7160c80eaa571d] Link rel=alternate header for tables and rows
    running '../airtable-export/testit.sh'
    INFO: Started server process [75818]
    INFO: Waiting for application startup.
    INFO: Application startup complete.
    INFO: Uvicorn running on http://127.0.0.1:8064 (Press CTRL+C to quit)
    Running curls

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724159882 https://github.com/simonw/datasette/issues/2189#issuecomment-1724159882 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxJeK simonw 9599 2023-09-18T18:32:29Z 2023-09-18T18:32:29Z OWNER

This worked, including on macOS even though GPT-4 thought timeout would not work there: https://chat.openai.com/share/cc4628e9-5240-4f35-b640-16a9c178b315

```bash
#!/bin/bash

# Run the command with a timeout of 5 seconds
timeout 5s datasette pottery2.db -p 8045 --get /airtable_refs/airtable_refs

# Check the exit code from timeout
if [ $? -eq 124 ]; then
    echo "Error: Command timed out after 5 seconds."
    exit 1
fi
```
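For environments where the `timeout` command is not available, a rough Python equivalent can be built on `subprocess.run()` and its `timeout` argument. `finishes_in_time` is a hypothetical helper name; the datasette command in the comment is the one from this thread.

```python
import subprocess
import sys


def finishes_in_time(command: list, seconds: float) -> bool:
    """Run command and return False if it is still running after `seconds`."""
    try:
        subprocess.run(
            command,
            timeout=seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising TimeoutExpired.
        return False
    return True


if __name__ == "__main__":
    # Example use, mirroring the shell script above:
    # finishes_in_time(
    #     ["datasette", "pottery2.db", "-p", "8045",
    #      "--get", "/airtable_refs/airtable_refs"],
    #     5,
    # )
    ok = finishes_in_time([sys.executable, "-c", "pass"], 10)
    sys.exit(0 if ok else 1)
```

Note that this only reports whether the command finished in time; it ignores the command's own exit status.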

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724157182 https://github.com/simonw/datasette/issues/2189#issuecomment-1724157182 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mxIz- simonw 9599 2023-09-18T18:30:30Z 2023-09-18T18:30:30Z OWNER

OK, I can trigger the bug like this:

    datasette pottery2.db -p 8045 --get /airtable_refs/airtable_refs

Can I write a bash script that fails (and terminates the process) if it takes longer than X seconds?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724089666 https://github.com/simonw/datasette/issues/2189#issuecomment-1724089666 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mw4VC simonw 9599 2023-09-18T17:49:24Z 2023-09-18T17:49:24Z OWNER

I switched that particular implementation to using an on-disk database instead of an in-memory database and could no longer recreate the bug.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724084199 https://github.com/simonw/datasette/issues/2189#issuecomment-1724084199 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mw2_n simonw 9599 2023-09-18T17:47:01Z 2023-09-18T17:47:01Z OWNER

I managed to trigger it by loading http://127.0.0.1:8045/airtable_refs/airtable_refs - which worked - and then hitting refresh on that page a bunch of times until it hung.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724083324 https://github.com/simonw/datasette/issues/2189#issuecomment-1724083324 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mw2x8 simonw 9599 2023-09-18T17:46:21Z 2023-09-18T17:46:21Z OWNER

Sometimes it takes a few clicks for the bug to occur, but it does seem to always be within the in-memory database.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724081909 https://github.com/simonw/datasette/issues/2189#issuecomment-1724081909 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mw2b1 simonw 9599 2023-09-18T17:45:27Z 2023-09-18T17:45:27Z OWNER

Maybe it's not related to faceting - I just got it on a hit to http://127.0.0.1:8045/airtable_refs/airtable_refs instead.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724072390 https://github.com/simonw/datasette/issues/2189#issuecomment-1724072390 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mw0HG simonw 9599 2023-09-18T17:39:06Z 2023-09-18T17:39:06Z OWNER

Landing a version of that test anyway.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724064440 https://github.com/simonw/datasette/issues/2189#issuecomment-1724064440 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mwyK4 simonw 9599 2023-09-18T17:36:00Z 2023-09-18T17:36:00Z OWNER

I wrote this test, but it passes:

    import pytest
    from datasette.app import Datasette


    @pytest.mark.asyncio
    async def test_facet_against_in_memory_database():
        ds = Datasette()
        db = ds.add_memory_database("mem")
        await db.execute_write("create table t (id integer primary key, name text)")
        await db.execute_write_many(
            "insert into t (name) values (?)", [["one"], ["one"], ["two"]]
        )
        # Plain table view against the named in-memory database
        response1 = await ds.client.get("/mem/t.json")
        assert response1.status_code == 200
        # Same table with a facet applied
        response2 = await ds.client.get("/mem/t.json?_facet=name")
        assert response2.status_code == 200
        assert response2.json() == {
            "ok": True,
            "next": None,
            "facet_results": {
                "results": {
                    "name": {
                        "name": "name",
                        "type": "column",
                        "hideable": True,
                        "toggle_url": "/mem/t.json",
                        "results": [
                            {
                                "value": "one",
                                "label": "one",
                                "count": 2,
                                "toggle_url": "http://localhost/mem/t.json?_facet=name&name=one",
                                "selected": False,
                            },
                            {
                                "value": "two",
                                "label": "two",
                                "count": 1,
                                "toggle_url": "http://localhost/mem/t.json?_facet=name&name=two",
                                "selected": False,
                            },
                        ],
                        "truncated": False,
                    }
                },
                "timed_out": [],
            },
            "rows": [
                {"id": 1, "name": "one"},
                {"id": 2, "name": "one"},
                {"id": 3, "name": "two"},
            ],
            "truncated": False,
        }

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724055823 https://github.com/simonw/datasette/issues/2189#issuecomment-1724055823 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mwwEP simonw 9599 2023-09-18T17:31:10Z 2023-09-18T17:31:10Z OWNER

That line was added in https://github.com/simonw/datasette/commit/942411ef946e9a34a2094944d3423cddad27efd3 which first shipped in 0.62a0.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724051886 https://github.com/simonw/datasette/issues/2189#issuecomment-1724051886 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mwvGu simonw 9599 2023-09-18T17:28:20Z 2023-09-18T17:30:30Z OWNER

The bug exhibits when I try to add a facet. I think it's caused by the parallel query execution I added to facets at some point.

  • http://127.0.0.1:8045/airtable_refs/airtable_refs - no error
  • http://127.0.0.1:8045/airtable_refs/airtable_refs?_facet=table_name#facet-table_name - hangs the server

Crucial line in the traceback:

    await gather(execute_facets(), execute_suggested_facets())

From here: https://github.com/simonw/datasette/blob/917272c864ad7b8a00c48c77f5c2944093babb4e/datasette/views/table.py#L568

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724049538 https://github.com/simonw/datasette/issues/2189#issuecomment-1724049538 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mwuiC simonw 9599 2023-09-18T17:26:44Z 2023-09-18T17:26:44Z OWNER

Just managed to get this exception trace:

        return await self.route_path(scope, receive, send, path)
      File "/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/app.py", line 1354, in route_path
        response = await view(request, send)
      File "/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/views/base.py", line 134, in view
        return await self.dispatch_request(request)
      File "/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/views/base.py", line 91, in dispatch_request
        return await handler(request)
      File "/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/views/base.py", line 361, in get
        response_or_template_contexts = await self.data(request, **data_kwargs)
      File "/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/views/table.py", line 158, in data
        return await self._data_traced(request, default_labels, _next, _size)
      File "/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/views/table.py", line 568, in _data_traced
        await gather(execute_facets(), execute_suggested_facets())
      File "/Users/simon/.local/share/virtualenvs/airtable-export-Ca4U-3qk/lib/python3.8/site-packages/datasette/views/table.py", line 177, in _gather_parallel
        return await asyncio.gather(*args)
    asyncio.exceptions.CancelledError
    INFO: 127.0.0.1:64109 - "GET /airtable_refs/airtable_refs?_facet=table_name&table_name=Sessions HTTP/1.1" 500 Internal Server Error
    ^CError in atexit._run_exitfuncs:
    Traceback (most recent call last):
      File "/Users/simon/.pyenv/versions/3.8.17/lib/python3.8/concurrent/futures/thread.py", line 40, in _python_exit
        t.join()
      File "/Users/simon/.pyenv/versions/3.8.17/lib/python3.8/threading.py", line 1011, in join
        self._wait_for_tstate_lock()
      File "/Users/simon/.pyenv/versions/3.8.17/lib/python3.8/threading.py", line 1027, in _wait_for_tstate_lock
        elif lock.acquire(block, timeout):
    KeyboardInterrupt

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724048314 https://github.com/simonw/datasette/issues/2189#issuecomment-1724048314 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mwuO6 simonw 9599 2023-09-18T17:25:55Z 2023-09-18T17:25:55Z OWNER

The good news is that this bug is currently unlikely to affect most users, since named in-memory databases (created using datasette.add_memory_database("airtable_refs"), see the docs) are a pretty obscure feature, only available to plugins.
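For readers unfamiliar with the idea, a named in-memory database can be sketched with the standard `sqlite3` module and a shared-cache URI: every connection that opens the same name sees the same data. I believe Datasette does something along these lines internally, but that is an assumption here rather than something this thread states; `connect_named_memory` and the `airtable_refs_demo` name are made up for the example.

```python
import sqlite3


def connect_named_memory(name: str) -> sqlite3.Connection:
    """Open a connection to a named, shared-cache in-memory database."""
    return sqlite3.connect(
        f"file:{name}?mode=memory&cache=shared",
        uri=True,
        check_same_thread=False,
    )


# Two separate connections to the same named database.
first = connect_named_memory("airtable_refs_demo")
second = connect_named_memory("airtable_refs_demo")

first.execute("create table t (id integer primary key, name text)")
with first:  # commits the insert when the block exits
    first.execute("insert into t (name) values (?)", ("one",))

# The second connection sees the table and row created by the first.
rows = second.execute("select name from t").fetchall()
```

The database only lives as long as at least one connection to it stays open, which is why `first` is kept around while `second` reads.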

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  
1724045748 https://github.com/simonw/datasette/issues/2189#issuecomment-1724045748 https://api.github.com/repos/simonw/datasette/issues/2189 IC_kwDOBm6k_c5mwtm0 simonw 9599 2023-09-18T17:24:07Z 2023-09-18T17:24:07Z OWNER

I need reliable steps to reproduce, then I can bisect and figure out which exact version of Datasette introduced the problem.

I have a hunch that it relates to changes made to the datasette/database.py module, maybe one of these changes here: https://github.com/simonw/datasette/compare/0.61...0.63.1#diff-4e20309c969326a0008dc9237f6807f48d55783315fbfc1e7dfa480b550e16f9

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Server hang on parallel execution of queries to named in-memory databases 1901416155  


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 1313.813ms · About: github-to-sqlite