issue_comments

12 rows where issue = 1087181951 (Traces should include SQL executed by subtasks created with `asyncio.gather`) sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body
1030530071 https://github.com/simonw/datasette/issues/1576#issuecomment-1030530071 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c49bKQX simonw 9599 2022-02-05T05:21:35Z 2022-02-05T05:21:35Z OWNER

New documentation section: https://docs.datasette.io/en/latest/internals.html#datasette-tracer

1030528532 https://github.com/simonw/datasette/issues/1576#issuecomment-1030528532 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c49bJ4U simonw 9599 2022-02-05T05:09:57Z 2022-02-05T05:09:57Z OWNER

Needs documentation. I'll document `from datasette.tracer import trace` too.
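A minimal sketch of that usage, assuming `trace()` is a context manager that takes a trace type string plus arbitrary keyword arguments to record:

```python
from datasette.tracer import trace

async def fetch_data():
    # Work done inside the block shows up in Datasette's trace output
    # (e.g. when the request is made with ?_trace=1)
    with trace("fetch-data", url="https://example.com/data.json"):
        ...  # the code being traced
```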

1030525218 https://github.com/simonw/datasette/issues/1576#issuecomment-1030525218 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c49bJEi simonw 9599 2022-02-05T04:45:11Z 2022-02-05T04:45:11Z OWNER

Got a prototype working with contextvars - it identified two parallel executing queries using the patch above.
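The prototype itself isn't included in this comment; a minimal sketch of the contextvars approach, with hypothetical names, would be:

```python
import asyncio
import contextvars

# Hypothetical sketch: the parent task records a trace ID in a ContextVar.
# Tasks created by asyncio.gather() copy the parent's context, so both
# parallel queries can be attributed to the same trace.
trace_id = contextvars.ContextVar("trace_id", default=None)

async def run_query(sql):
    # Both parallel queries see the trace ID set by the parent task
    print(f"trace={trace_id.get()} sql={sql!r}")

async def handle_request():
    trace_id.set("request-1")
    await asyncio.gather(
        run_query("select count(*) from searchable"),
        run_query("select * from searchable limit 10"),
    )

asyncio.run(handle_request())
```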

1017112543 https://github.com/simonw/datasette/issues/1576#issuecomment-1017112543 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c48n-ff simonw 9599 2022-01-20T04:35:00Z 2022-02-05T04:33:46Z OWNER

I dropped support for Python 3.6 in fae3983c51f4a3aca8335f3e01ff85ef27076fbf, so I'm now free to use contextvars for this.

1027635925 https://github.com/simonw/datasette/issues/1576#issuecomment-1027635925 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c49QHrV simonw 9599 2022-02-02T06:47:20Z 2022-02-02T06:47:20Z OWNER

Here's what I was hacking around with when I uncovered this problem:

```diff
diff --git a/datasette/views/table.py b/datasette/views/table.py
index 77fb285..8c57d08 100644
--- a/datasette/views/table.py
+++ b/datasette/views/table.py
@@ -1,3 +1,4 @@
+import asyncio
 import urllib
 import itertools
 import json
@@ -615,44 +616,37 @@ class TableView(RowTableShared):
         if request.args.get("_timelimit"):
             extra_args["custom_time_limit"] = int(request.args.get("_timelimit"))
 
-        # Execute the main query!
-        results = await db.execute(sql, params, truncate=True, **extra_args)
-
-        # Calculate the total count for this query
-        filtered_table_rows_count = None
-        if (
-            not db.is_mutable
-            and self.ds.inspect_data
-            and count_sql == f"select count(*) from {table} "
-        ):
-            # We can use a previously cached table row count
-            try:
-                filtered_table_rows_count = self.ds.inspect_data[database]["tables"][
-                    table
-                ]["count"]
-            except KeyError:
-                pass
-
-        # Otherwise run a select count(*) ...
-        if count_sql and filtered_table_rows_count is None and not nocount:
-            try:
-                count_rows = list(await db.execute(count_sql, from_sql_params))
-                filtered_table_rows_count = count_rows[0][0]
-            except QueryInterrupted:
-                pass
-
-        # Faceting
-        if not self.ds.setting("allow_facet") and any(
-            arg.startswith("_facet") for arg in request.args
-        ):
-            raise BadRequest("_facet= is not allowed")
+        async def execute_count():
+            # Calculate the total count for this query
+            filtered_table_rows_count = None
+            if (
+                not db.is_mutable
+                and self.ds.inspect_data
+                and count_sql == f"select count(*) from {table} "
+            ):
+                # We can use a previously cached table row count
+                try:
+                    filtered_table_rows_count = self.ds.inspect_data[database][
+                        "tables"
+                    ][table]["count"]
+                except KeyError:
+                    pass
+
+            if count_sql and filtered_table_rows_count is None and not nocount:
+                try:
+                    count_rows = list(await db.execute(count_sql, from_sql_params))
+                    filtered_table_rows_count = count_rows[0][0]
+                except QueryInterrupted:
+                    pass
+
+            return filtered_table_rows_count
+
+        filtered_table_rows_count = await execute_count()
 
         # pylint: disable=no-member
         facet_classes = list(
             itertools.chain.from_iterable(pm.hook.register_facet_classes())
         )
-        facet_results = {}
-        facets_timed_out = []
         facet_instances = []
        for klass in facet_classes:
             facet_instances.append(
@@ -668,33 +662,58 @@ class TableView(RowTableShared):
                 )
             )
 
-        if not nofacet:
-            for facet in facet_instances:
-                (
-                    instance_facet_results,
-                    instance_facets_timed_out,
-                ) = await facet.facet_results()
-                for facet_info in instance_facet_results:
-                    base_key = facet_info["name"]
-                    key = base_key
-                    i = 1
-                    while key in facet_results:
-                        i += 1
-                        key = f"{base_key}_{i}"
-                    facet_results[key] = facet_info
-                facets_timed_out.extend(instance_facets_timed_out)
-
-        # Calculate suggested facets
-        suggested_facets = []
-        if (
-            self.ds.setting("suggest_facets")
-            and self.ds.setting("allow_facet")
-            and not _next
-            and not nofacet
-            and not nosuggest
-        ):
-            for facet in facet_instances:
-                suggested_facets.extend(await facet.suggest())
+        async def execute_suggested_facets():
+            # Calculate suggested facets
+            suggested_facets = []
+            if (
+                self.ds.setting("suggest_facets")
+                and self.ds.setting("allow_facet")
+                and not _next
+                and not nofacet
+                and not nosuggest
+            ):
+                for facet in facet_instances:
+                    suggested_facets.extend(await facet.suggest())
+            return suggested_facets
+
+        async def execute_facets():
+            facet_results = {}
+            facets_timed_out = []
+            if not self.ds.setting("allow_facet") and any(
+                arg.startswith("_facet") for arg in request.args
+            ):
+                raise BadRequest("_facet= is not allowed")
+
+            if not nofacet:
+                for facet in facet_instances:
+                    (
+                        instance_facet_results,
+                        instance_facets_timed_out,
+                    ) = await facet.facet_results()
+                    for facet_info in instance_facet_results:
+                        base_key = facet_info["name"]
+                        key = base_key
+                        i = 1
+                        while key in facet_results:
+                            i += 1
+                            key = f"{base_key}_{i}"
+                        facet_results[key] = facet_info
+                    facets_timed_out.extend(instance_facets_timed_out)
+
+            return facet_results, facets_timed_out
+
+        # Execute the main query, facets and facet suggestions in parallel:
+        (
+            results,
+            suggested_facets,
+            (facet_results, facets_timed_out),
+        ) = await asyncio.gather(
+            db.execute(sql, params, truncate=True, **extra_args),
+            execute_suggested_facets(),
+            execute_facets(),
+        )
+
+        results = await db.execute(sql, params, truncate=True, **extra_args)
 
         # Figure out columns and rows for the query
         columns = [r[0] for r in results.description]
```

It's a hacky attempt at running some of the table page queries in parallel to see what happens.

1000935523 https://github.com/simonw/datasette/issues/1576#issuecomment-1000935523 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c47qRBj simonw 9599 2021-12-24T21:33:05Z 2021-12-24T21:33:05Z OWNER

Another option would be to attempt to import contextvars and, if the import fails (on Python 3.6), continue using the current mechanism - then note in the documentation that Python 3.6 users will miss out on nested traces.
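A sketch of that fallback shape (hypothetical helper, simplified):

```python
import asyncio

try:
    import contextvars  # only available on Python 3.7+

    _task_id = contextvars.ContextVar("task_id", default=None)

    def get_task_id():
        # A value set by a parent task is inherited by gather() subtasks,
        # enabling nested traces; otherwise fall back to the task's id()
        override = _task_id.get()
        return override if override is not None else id(asyncio.current_task())

except ImportError:
    # Python 3.6: keep the current mechanism, no nested traces
    def get_task_id():
        return id(asyncio.Task.current_task())
```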

999990414 https://github.com/simonw/datasette/issues/1576#issuecomment-999990414 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c47mqSO simonw 9599 2021-12-23T02:08:39Z 2021-12-23T18:16:35Z OWNER

It's tiny: I'm tempted to vendor it. https://github.com/Skyscanner/aiotask-context/blob/master/aiotask_context/__init__.py

No, I'll add it as a pinned dependency, which I can then drop when I drop 3.6 support.
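With an environment marker the pin would only apply on older Pythons, so dropping 3.6 support later is a one-line deletion - a hypothetical `setup.py` fragment (version shown is illustrative):

```python
install_requires = [
    # ...existing dependencies...
    # Only needed for task-local context on Python 3.6;
    # delete this line when 3.6 support is dropped.
    'aiotask-context==0.6.1; python_version < "3.7"',
]
```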

999987418 https://github.com/simonw/datasette/issues/1576#issuecomment-999987418 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c47mpja simonw 9599 2021-12-23T01:59:58Z 2021-12-23T02:02:12Z OWNER

Another option: https://github.com/Skyscanner/aiotask-context - looks like it might be better as it's been updated for Python 3.7 in this commit https://github.com/Skyscanner/aiotask-context/commit/67108c91d2abb445655cc2af446fdb52ca7890c4

The Skyscanner one doesn't attempt to wrap any existing factories, but that's OK for my purposes since I don't need to handle arbitrary asyncio code written by other people.
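As I read its README, usage looks roughly like this (a sketch, not verified against Datasette's code):

```python
import asyncio
import aiotask_context as context

async def child(n):
    # Both children see the value set by the parent task
    print(n, context.get("trace_id"))

async def parent():
    context.set("trace_id", "abc123")
    await asyncio.gather(child(1), child(2))

loop = asyncio.get_event_loop()
# You opt in explicitly - it doesn't wrap whatever factory was already set
loop.set_task_factory(context.task_factory)
loop.run_until_complete(parent())
```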

999876666 https://github.com/simonw/datasette/issues/1576#issuecomment-999876666 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c47mOg6 simonw 9599 2021-12-22T20:59:22Z 2021-12-22T21:18:09Z OWNER

This article is relevant: Context information storage for asyncio - in particular the section https://blog.sqreen.com/asyncio/#context-inheritance-between-tasks, which describes exactly the problem I have. Their solution involves this trickery:

```python
def request_task_factory(loop, coro):
    child_task = asyncio.tasks.Task(coro, loop=loop)
    parent_task = asyncio.Task.current_task(loop=loop)
    current_request = getattr(parent_task, 'current_request', None)
    setattr(child_task, 'current_request', current_request)
    return child_task

loop = asyncio.get_event_loop()
loop.set_task_factory(request_task_factory)
```

They released their solution as a library: https://pypi.org/project/aiocontext/ and https://github.com/sqreen/AioContext - but that company was acquired by Datadog back in April and doesn't seem to be actively maintaining their open source stuff any more: https://twitter.com/SqreenIO/status/1384906075506364417

999878907 https://github.com/simonw/datasette/issues/1576#issuecomment-999878907 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c47mPD7 simonw 9599 2021-12-22T21:03:49Z 2021-12-22T21:10:46Z OWNER

`contextvars` can solve this, but it was introduced in Python 3.7: https://www.python.org/dev/peps/pep-0567/

Python 3.6 support ends in a few days' time, and it looks like Glitch has updated to 3.7 now - so maybe I can get away with Datasette requiring 3.7 these days?

Tweeted about that here: https://twitter.com/simonw/status/1473761478155010048

999874886 https://github.com/simonw/datasette/issues/1576#issuecomment-999874886 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c47mOFG simonw 9599 2021-12-22T20:55:42Z 2021-12-22T20:57:28Z OWNER

One way to solve this would be to introduce a `set_task_id()` method, which sets an ID which will be returned by `get_task_id()` instead of using `id(current_task(loop=loop))`.

It would be really nice if I could solve this using `with` syntax somehow. Something like:

```python
with trace_child_tasks():
    (
        suggested_facets,
        (facet_results, facets_timed_out),
    ) = await asyncio.gather(
        execute_suggested_facets(),
        execute_facets(),
    )
```
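For what it's worth, a contextvars-based sketch of what such a `trace_child_tasks()` context manager could look like (hypothetical, not necessarily the eventual implementation):

```python
import asyncio
import contextvars
from contextlib import contextmanager

# The parent task publishes its own task id; gather() subtasks inherit
# a copy of the context, so their traces group under the parent.
trace_task_id = contextvars.ContextVar("trace_task_id", default=None)

def get_task_id():
    return trace_task_id.get() or id(asyncio.current_task())

@contextmanager
def trace_child_tasks():
    token = trace_task_id.set(id(asyncio.current_task()))
    try:
        yield
    finally:
        trace_task_id.reset(token)
```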

999874484 https://github.com/simonw/datasette/issues/1576#issuecomment-999874484 https://api.github.com/repos/simonw/datasette/issues/1576 IC_kwDOBm6k_c47mN-0 simonw 9599 2021-12-22T20:54:52Z 2021-12-22T20:54:52Z OWNER

Here's the full current relevant code from `tracer.py`: https://github.com/simonw/datasette/blob/ace86566b28280091b3844cf5fbecd20158e9004/datasette/tracer.py#L8-L64
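In outline (simplified, not verbatim - `record_trace` is a hypothetical stand-in), that mechanism keys collected traces on the id() of the current task, which is exactly why queries run in gather() subtasks go missing: each subtask is a new Task object with a different id():

```python
import asyncio

tracers = {}  # id(task) -> collector for that task's traces

def get_task_id():
    try:
        loop = asyncio.get_event_loop()
    except RuntimeError:
        return None
    return id(asyncio.Task.current_task(loop=loop))

def record_trace(info):
    tracer = tracers.get(get_task_id())
    if tracer is None:
        return  # a gather() subtask has a different task id: trace is dropped
    tracer.append(info)
```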


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 25.866ms · About: github-to-sqlite