home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

2 rows where issue = 910088936 sorted by updated_at descending


This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 1

  • simonw 2

issue 1

  • datasette --get should efficiently handle streaming CSV · 2

author_association 1

  • OWNER 2
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1073362979 https://github.com/simonw/datasette/issues/1355#issuecomment-1073362979 https://api.github.com/repos/simonw/datasette/issues/1355 IC_kwDOBm6k_c4_-jgj simonw 9599 2022-03-20T22:38:53Z 2022-03-20T22:38:53Z OWNER

Built a research prototype:

```diff
diff --git a/datasette/app.py b/datasette/app.py
index 5c8101a..5cd3e63 100644
--- a/datasette/app.py
+++ b/datasette/app.py
@@ -1,6 +1,7 @@
 import asyncio
 import asgi_csrf
 import collections
+import contextlib
 import datetime
 import functools
 import glob
@@ -1490,3 +1491,11 @@ class DatasetteClient:
         return await client.request(
             method, self._fix(path, avoid_path_rewrites), **kwargs
         )
+
+    @contextlib.asynccontextmanager
+    async def stream(self, method, path, **kwargs):
+        async with httpx.AsyncClient(app=self.app) as client:
+            print("async with as client")
+            async with client.stream(method, self._fix(path), **kwargs) as response:
+                print("async with client.stream about to yield response")
+                yield response
diff --git a/datasette/cli.py b/datasette/cli.py
index 3c6e1b2..3025ead 100644
--- a/datasette/cli.py
+++ b/datasette/cli.py
@@ -585,11 +585,19 @@ def serve(
         asyncio.get_event_loop().run_until_complete(check_databases(ds))

     if get:
-        client = TestClient(ds)
-        response = client.get(get)
-        click.echo(response.text)
-        exit_code = 0 if response.status == 200 else 1
-        sys.exit(exit_code)
+
+        async def _run_get():
+            print("_run_get")
+            async with ds.client.stream("GET", get) as response:
+                print("Got response:", response)
+                async for chunk in response.aiter_bytes(chunk_size=1024):
+                    print("  chunk")
+                    sys.stdout.buffer.write(chunk)
+                    sys.stdout.buffer.flush()
+                exit_code = 0 if response.status_code == 200 else 1
+                sys.exit(exit_code)
+
+        asyncio.get_event_loop().run_until_complete(_run_get())
         return

     # Start the server
```

But for some reason it didn't appear to stream out the response - it would print this out:

```
% datasette covid.db --get '/covid/ny_times_us_counties.csv?_size=10&_stream=on'
_run_get
async with as client
```

And then hang. I would expect it to start printing out chunks of CSV data here, but instead it looks like it waited for everything to be generated before returning anything to the console.

No idea why. I dropped this for the moment.
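The shape of the prototype above can be reproduced without Datasette or httpx. Here is a stdlib-only sketch of the same pattern - an `asynccontextmanager` that yields a streaming response whose chunks are consumed with `async for` - with all class and function names being illustrative stand-ins, not Datasette's or httpx's actual API:

```python
import asyncio
import contextlib


class FakeStreamingResponse:
    """Illustrative stand-in for an httpx streaming response."""

    def __init__(self, chunks):
        self._chunks = chunks
        self.status_code = 200

    async def aiter_bytes(self, chunk_size=1024):
        # Yield one chunk at a time, with a cooperative yield point so
        # each chunk can be flushed before the next is produced.
        for chunk in self._chunks:
            await asyncio.sleep(0)
            yield chunk


@contextlib.asynccontextmanager
async def stream(method, path):
    # Mirrors the shape of DatasetteClient.stream() from the diff:
    # open a response and yield it to the caller while it is still live.
    yield FakeStreamingResponse([b"id,name\n", b"1,foo\n", b"2,bar\n"])


async def main():
    received = []
    async with stream("GET", "/example.csv") as response:
        async for chunk in response.aiter_bytes(chunk_size=1024):
            received.append(chunk)
    return received


chunks = asyncio.run(main())
print(b"".join(chunks).decode())
```

If the server side truly produces chunks incrementally, this consumer sees them one at a time; the hang described above suggests the production side was buffering the full body before the first chunk reached the consumer.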

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
datasette --get should efficiently handle streaming CSV 910088936  
853557439 https://github.com/simonw/datasette/issues/1355#issuecomment-853557439 https://api.github.com/repos/simonw/datasette/issues/1355 MDEyOklzc3VlQ29tbWVudDg1MzU1NzQzOQ== simonw 9599 2021-06-03T04:43:14Z 2021-06-03T04:43:14Z OWNER

It's using `TestClient` at the moment, which is a wrapper around httpx (as of ) that uses the `@async_to_sync` decorator to hide the async nature.

https://github.com/simonw/datasette/blob/f78ebdc04537a6102316d6dbbf6c887565806078/datasette/utils/testing.py#L102-L156

Maybe the fix here is to switch the --get implementation to using httpx directly with https://www.python-httpx.org/async/#streaming-responses
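A stdlib-only sketch of why that switch matters, assuming nothing about Datasette's internals: a sync wrapper in the style of `@async_to_sync` has to run the coroutine to completion and hand back the whole body at once, while an async streaming consumer can act on each chunk as it is produced (all names here are illustrative):

```python
import asyncio


async def produce_csv():
    """Illustrative async generator standing in for a streaming CSV response."""
    for row in (b"id,name\n", b"1,foo\n", b"2,bar\n"):
        yield row


def get_sync(aiter_factory):
    # What a TestClient-style sync wrapper must do: drain the entire
    # response inside the event loop and return it as one blob.
    async def _collect():
        return b"".join([chunk async for chunk in aiter_factory()])

    return asyncio.run(_collect())


async def get_streaming(aiter_factory, on_chunk):
    # What --get wants instead: hand each chunk to the caller as soon
    # as it arrives, without accumulating the full body first.
    async for chunk in aiter_factory():
        on_chunk(chunk)


whole_body = get_sync(produce_csv)

seen = []
asyncio.run(get_streaming(produce_csv, seen.append))
```

For a small response the results are identical, but for a multi-gigabyte streaming CSV the first approach holds everything in memory while the second writes each chunk straight to stdout.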

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
datasette --get should efficiently handle streaming CSV 910088936  

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
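The page header says these rows come from filtering on `issue = 910088936` and sorting by `updated_at` descending. A sketch of that query against the schema above using only the stdlib `sqlite3` module - the `REFERENCES` clauses are dropped because the `users` and `issues` tables are not shown here, and the two inserted rows reuse ids and timestamps from this page:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER,
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
""")
conn.executemany(
    "INSERT INTO issue_comments (id, [user], updated_at, issue) VALUES (?, ?, ?, ?)",
    [
        (853557439, 9599, "2021-06-03T04:43:14Z", 910088936),
        (1073362979, 9599, "2022-03-20T22:38:53Z", 910088936),
    ],
)

# The query behind "2 rows where issue = 910088936 sorted by updated_at
# descending"; ISO 8601 timestamps sort correctly as plain text.
rows = conn.execute(
    "SELECT id FROM issue_comments WHERE issue = ? ORDER BY updated_at DESC",
    (910088936,),
).fetchall()
print(rows)  # newest comment first
```

The `idx_issue_comments_issue` index lets SQLite satisfy the `WHERE issue = ?` filter without scanning the whole table.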
Powered by Datasette · Queries took 22.402ms · About: github-to-sqlite