home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

5 rows where issue = 1216508080 sorted by updated_at descending

✖
✖

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 1

  • simonw 5

issue 1

  • Research running SQL in table view in parallel using `asyncio.gather()` · 5 ✖

author_association 1

  • OWNER 5
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1110330554 https://github.com/simonw/datasette/issues/1723#issuecomment-1110330554 https://api.github.com/repos/simonw/datasette/issues/1723 IC_kwDOBm6k_c5CLky6 simonw 9599 2022-04-26T23:06:20Z 2022-04-26T23:06:20Z OWNER

Deployed here: https://latest-with-plugins.datasette.io/github/commits?_facet=repo&_trace=1&_facet=committer

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research running SQL in table view in parallel using `asyncio.gather()` 1216508080  
1110305790 https://github.com/simonw/datasette/issues/1723#issuecomment-1110305790 https://api.github.com/repos/simonw/datasette/issues/1723 IC_kwDOBm6k_c5CLev- simonw 9599 2022-04-26T22:19:04Z 2022-04-26T22:19:04Z OWNER

I realized that seeing the total time in queries wasn't enough to understand this, because if the queries were executed in serial or parallel it should still sum up to the same amount of SQL time (roughly).

Instead I need to know how long the page took to render. But that's hard to display on the page since you can't measure it until rendering has finished!

So I built an ASGI plugin to handle that measurement: https://github.com/simonw/datasette-total-page-time

And with that plugin installed, http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel2&_facet=other_fuel1&_parallel=1 (the parallel version) takes 377ms:

While http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel2&_facet=other_fuel1 (the serial version) takes 762ms:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research running SQL in table view in parallel using `asyncio.gather()` 1216508080  
1110279869 https://github.com/simonw/datasette/issues/1723#issuecomment-1110279869 https://api.github.com/repos/simonw/datasette/issues/1723 IC_kwDOBm6k_c5CLYa9 simonw 9599 2022-04-26T21:45:39Z 2022-04-26T21:45:39Z OWNER

Getting some nice traces out of this:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research running SQL in table view in parallel using `asyncio.gather()` 1216508080  
1110278577 https://github.com/simonw/datasette/issues/1723#issuecomment-1110278577 https://api.github.com/repos/simonw/datasette/issues/1723 IC_kwDOBm6k_c5CLYGx simonw 9599 2022-04-26T21:44:04Z 2022-04-26T21:44:04Z OWNER

And some simple benchmarks with ab - using the ?_parallel=1 hack to try it with and without a parallel asyncio.gather():

``` ~ % ab -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2'
This is ApacheBench, Version 2.3 <$Revision: 1879490 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done

Server Software: uvicorn Server Hostname: 127.0.0.1 Server Port: 8001

Document Path: /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2 Document Length: 314187 bytes

Concurrency Level: 1 Time taken for tests: 68.279 seconds Complete requests: 100 Failed requests: 13 (Connect: 0, Receive: 0, Length: 13, Exceptions: 0) Total transferred: 31454937 bytes HTML transferred: 31418437 bytes Requests per second: 1.46 [#/sec] (mean) Time per request: 682.787 [ms] (mean) Time per request: 682.787 [ms] (mean, across all concurrent requests) Transfer rate: 449.89 [Kbytes/sec] received

Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.0 0 0 Processing: 621 683 68.0 658 993 Waiting: 620 682 68.0 657 992 Total: 621 683 68.0 658 993

Percentage of the requests served within a certain time (ms) 50% 658 66% 678 75% 687 80% 711 90% 763 95% 879 98% 926 99% 993 100% 993 (longest request)


In parallel:

~ % ab -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1' This is ApacheBench, Version 2.3 <$Revision: 1879490 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done

Server Software: uvicorn Server Hostname: 127.0.0.1 Server Port: 8001

Document Path: /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1 Document Length: 315703 bytes

Concurrency Level: 1 Time taken for tests: 34.763 seconds Complete requests: 100 Failed requests: 11 (Connect: 0, Receive: 0, Length: 11, Exceptions: 0) Total transferred: 31607988 bytes HTML transferred: 31570288 bytes Requests per second: 2.88 [#/sec] (mean) Time per request: 347.632 [ms] (mean) Time per request: 347.632 [ms] (mean, across all concurrent requests) Transfer rate: 887.93 [Kbytes/sec] received

Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.0 0 0 Processing: 311 347 28.0 338 450 Waiting: 311 347 28.0 338 450 Total: 312 348 28.0 338 451

Percentage of the requests served within a certain time (ms) 50% 338 66% 348 75% 361 80% 367 90% 396 95% 408 98% 436 99% 451 100% 451 (longest request)


With concurrency 10, not parallel:

~ % ab -c 10 -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=' This is ApacheBench, Version 2.3 <$Revision: 1879490 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done

Server Software: uvicorn Server Hostname: 127.0.0.1 Server Port: 8001

Document Path: /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel= Document Length: 314346 bytes

Concurrency Level: 10 Time taken for tests: 38.408 seconds Complete requests: 100 Failed requests: 93 (Connect: 0, Receive: 0, Length: 93, Exceptions: 0) Total transferred: 31471333 bytes HTML transferred: 31433733 bytes Requests per second: 2.60 [#/sec] (mean) Time per request: 3840.829 [ms] (mean) Time per request: 384.083 [ms] (mean, across all concurrent requests) Transfer rate: 800.18 [Kbytes/sec] received

Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.1 0 1 Processing: 685 3719 354.0 3774 4096 Waiting: 684 3707 353.7 3750 4095 Total: 685 3719 354.0 3774 4096

Percentage of the requests served within a certain time (ms) 50% 3774 66% 3832 75% 3855 80% 3878 90% 3944 95% 4006 98% 4057 99% 4096 100% 4096 (longest request)


Concurrency 10 parallel:

~ % ab -c 10 -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1' This is ApacheBench, Version 2.3 <$Revision: 1879490 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done

Server Software: uvicorn Server Hostname: 127.0.0.1 Server Port: 8001

Document Path: /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1 Document Length: 315703 bytes

Concurrency Level: 10 Time taken for tests: 36.762 seconds Complete requests: 100 Failed requests: 89 (Connect: 0, Receive: 0, Length: 89, Exceptions: 0) Total transferred: 31606516 bytes HTML transferred: 31568816 bytes Requests per second: 2.72 [#/sec] (mean) Time per request: 3676.182 [ms] (mean) Time per request: 367.618 [ms] (mean, across all concurrent requests) Transfer rate: 839.61 [Kbytes/sec] received

Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.1 0 0 Processing: 381 3602 419.6 3609 4458 Waiting: 381 3586 418.7 3607 4457 Total: 381 3603 419.6 3609 4458

Percentage of the requests served within a certain time (ms) 50% 3609 66% 3741 75% 3791 80% 3821 90% 3972 95% 4074 98% 4386 99% 4458 100% 4458 (longest request)

Trying -c 3 instead. Non parallel:

~ % ab -c 3 -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=' This is ApacheBench, Version 2.3 <$Revision: 1879490 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done

Server Software: uvicorn Server Hostname: 127.0.0.1 Server Port: 8001

Document Path: /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel= Document Length: 314346 bytes

Concurrency Level: 3 Time taken for tests: 39.365 seconds Complete requests: 100 Failed requests: 83 (Connect: 0, Receive: 0, Length: 83, Exceptions: 0) Total transferred: 31470808 bytes HTML transferred: 31433208 bytes Requests per second: 2.54 [#/sec] (mean) Time per request: 1180.955 [ms] (mean) Time per request: 393.652 [ms] (mean, across all concurrent requests) Transfer rate: 780.72 [Kbytes/sec] received

Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.0 0 0 Processing: 731 1153 126.2 1189 1359 Waiting: 730 1151 125.9 1188 1358 Total: 731 1153 126.2 1189 1359

Percentage of the requests served within a certain time (ms) 50% 1189 66% 1221 75% 1234 80% 1247 90% 1296 95% 1309 98% 1343 99% 1359 100% 1359 (longest request)


Parallel:

~ % ab -c 3 -n 100 'http://127.0.0.1:8001/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1' This is ApacheBench, Version 2.3 <$Revision: 1879490 $> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done

Server Software: uvicorn Server Hostname: 127.0.0.1 Server Port: 8001

Document Path: /global-power-plants/global-power-plants?_facet=primary_fuel&_facet=other_fuel1&_facet=other_fuel3&_facet=other_fuel2&_parallel=1 Document Length: 315703 bytes

Concurrency Level: 3 Time taken for tests: 34.530 seconds Complete requests: 100 Failed requests: 18 (Connect: 0, Receive: 0, Length: 18, Exceptions: 0) Total transferred: 31606179 bytes HTML transferred: 31568479 bytes Requests per second: 2.90 [#/sec] (mean) Time per request: 1035.902 [ms] (mean) Time per request: 345.301 [ms] (mean, across all concurrent requests) Transfer rate: 893.87 [Kbytes/sec] received

Connection Times (ms) min mean[+/-sd] median max Connect: 0 0 0.0 0 0 Processing: 412 1020 104.4 1018 1280 Waiting: 411 1018 104.1 1014 1275 Total: 412 1021 104.4 1018 1280

Percentage of the requests served within a certain time (ms) 50% 1018 66% 1041 75% 1061 80% 1079 90% 1136 95% 1176 98% 1251 99% 1280 100% 1280 (longest request) ```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research running SQL in table view in parallel using `asyncio.gather()` 1216508080  
1110278182 https://github.com/simonw/datasette/issues/1723#issuecomment-1110278182 https://api.github.com/repos/simonw/datasette/issues/1723 IC_kwDOBm6k_c5CLYAm simonw 9599 2022-04-26T21:43:34Z 2022-04-26T21:43:34Z OWNER

Here's the diff I'm using: ```diff diff --git a/datasette/views/table.py b/datasette/views/table.py index d66adb8..f15ef1e 100644 --- a/datasette/views/table.py +++ b/datasette/views/table.py @@ -1,3 +1,4 @@ +import asyncio import itertools import json

@@ -5,6 +6,7 @@ import markupsafe

from datasette.plugins import pm from datasette.database import QueryInterrupted +from datasette import tracer from datasette.utils import ( await_me_maybe, CustomRow, @@ -150,6 +152,16 @@ class TableView(DataView): default_labels=False, _next=None, _size=None, + ): + with tracer.trace_child_tasks(): + return await self._data_traced(request, default_labels, _next, _size) + + async def _data_traced( + self, + request, + default_labels=False, + _next=None, + _size=None, ): database_route = tilde_decode(request.url_vars["database"]) table_name = tilde_decode(request.url_vars["table"]) @@ -159,6 +171,20 @@ class TableView(DataView): raise NotFound("Database not found: {}".format(database_route)) database_name = db.name

  • For performance profiling purposes, ?_parallel=1 turns on asyncio.gather

  • async def _gather_parallel(*args):
  • return await asyncio.gather(*args) +
  • async def _gather_sequential(*args):
  • results = []
  • for fn in args:
  • results.append(await fn)
  • return results +
  • gather = (
  • _gather_parallel if request.args.get("_parallel") else _gather_sequential
  • ) + # If this is a canned query, not a table, then dispatch to QueryView instead canned_query = await self.ds.get_canned_query( database_name, table_name, request.actor @@ -174,8 +200,12 @@ class TableView(DataView): write=bool(canned_query.get("write")), )

  • is_view = bool(await db.get_view_definition(table_name))

  • table_exists = bool(await db.table_exists(table_name))
  • is_view, table_exists = map(
  • bool,
  • await gather(
  • db.get_view_definition(table_name), db.table_exists(table_name)
  • ),
  • )

     # If table or view not found, return 404
     if not is_view and not table_exists:
    

    @@ -497,33 +527,44 @@ class TableView(DataView): ) )

  • if not nofacet:

  • for facet in facet_instances:
  • (
  • async def execute_facets():
  • if not nofacet:
  • Run them in parallel

  • facet_awaitables = [facet.facet_results() for facet in facet_instances]
  • facet_awaitable_results = await gather(*facet_awaitables)
  • for ( instance_facet_results, instance_facets_timed_out,
  • ) = await facet.facet_results()
  • for facet_info in instance_facet_results:
  • base_key = facet_info["name"]
  • key = base_key
  • i = 1
  • while key in facet_results:
  • i += 1
  • key = f"{base_key}_{i}"
  • facet_results[key] = facet_info
  • facets_timed_out.extend(instance_facets_timed_out)

  • Calculate suggested facets

  • ) in facet_awaitable_results:
  • for facet_info in instance_facet_results:
  • base_key = facet_info["name"]
  • key = base_key
  • i = 1
  • while key in facet_results:
  • i += 1
  • key = f"{base_key}_{i}"
  • facet_results[key] = facet_info
  • facets_timed_out.extend(instance_facets_timed_out) + suggested_facets = []
  • if (
  • self.ds.setting("suggest_facets")
  • and self.ds.setting("allow_facet")
  • and not _next
  • and not nofacet
  • and not nosuggest
  • ):
  • for facet in facet_instances:
  • suggested_facets.extend(await facet.suggest()) +
  • async def execute_suggested_facets():
  • Calculate suggested facets

  • if (
  • self.ds.setting("suggest_facets")
  • and self.ds.setting("allow_facet")
  • and not _next
  • and not nofacet
  • and not nosuggest
  • ):
  • Run them in parallel

  • facet_suggest_awaitables = [
  • facet.suggest() for facet in facet_instances
  • ]
  • for suggest_result in await gather(*facet_suggest_awaitables):
  • suggested_facets.extend(suggest_result) +
  • await gather(execute_facets(), execute_suggested_facets())
     # Figure out columns and rows for the query
     columns = [r[0] for r in results.description]
    

    ```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research running SQL in table view in parallel using `asyncio.gather()` 1216508080  

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 18.102ms · About: github-to-sqlite
  • Sort ascending
  • Sort descending
  • Facet by this
  • Hide this column
  • Show all columns
  • Show not-blank rows