issue_comments: 991805516


html_url issue_url id node_id user created_at updated_at author_association body reactions issue performed_via_github_app
https://github.com/simonw/datasette/issues/1550#issuecomment-991805516 https://api.github.com/repos/simonw/datasette/issues/1550 991805516 IC_kwDOBm6k_c47HcBM 9599 2021-12-11T23:43:24Z 2021-12-11T23:43:24Z OWNER

I built a tiny Starlette app to experiment with this a bit:

```python
import asyncio
import janus
from starlette.applications import Starlette
from starlette.responses import JSONResponse, HTMLResponse, StreamingResponse
from starlette.routing import Route
import sqlite3
from concurrent import futures

executor = futures.ThreadPoolExecutor(max_workers=10)


async def homepage(request):
    return HTMLResponse(
        """
<html>
<head><title>SQL CSV Server</title>
<style>body { width: 40rem; font-family: helvetica; margin: 2em auto; }</style>
</head>
<body>
<h1>SQL CSV Server</h1>
<form action="/csv">
<label style="display: block">SQL query:
<textarea style="width: 90%; height: 20em" name="sql"></textarea>
</label>
<input type="submit">
</form>
</body>
</html>
    """
    )


def run_query_in_thread(sql, sync_q):
    db = sqlite3.connect("../datasette/covid.db")
    cursor = db.cursor()
    cursor.arraysize = 100  # Default is 1 apparently?
    cursor.execute(sql)
    columns = [d[0] for d in cursor.description]
    sync_q.put([columns])
    # Now start putting batches of rows
    while True:
        rows = cursor.fetchmany()
        if rows:
            sync_q.put(rows)
        else:
            break
    # Let queue know we are finished
    sync_q.put(None)


async def csv_query(request):
    sql = request.query_params["sql"]

    queue = janus.Queue()
    loop = asyncio.get_running_loop()

    async def csv_generator():
        loop.run_in_executor(None, run_query_in_thread, sql, queue.sync_q)
        while True:
            rows = await queue.async_q.get()
            if rows is not None:
                for row in rows:
                    yield ",".join(map(str, row)) + "\n"
                queue.async_q.task_done()
            else:
                # Cleanup
                queue.close()
                await queue.wait_closed()
                break

    return StreamingResponse(csv_generator(), media_type='text/plain')


app = Starlette(
    debug=True,
    routes=[
        Route("/", homepage),
        Route("/csv", csv_query),
    ],
)
```

But if I run this in a terminal window:

```
/tmp % wget 'http://127.0.0.1:8000/csv?sql=select+*+from+ny_times_us_counties'
```

it takes about 20 seconds to run and returns a 50MB file - but while it is running no other requests can be served by that server - not even the homepage! So something is blocking the event loop.
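One way to check that the event loop itself is stalling (rather than the server just being slow) is to run a small heartbeat task and measure how late it wakes up. This is a minimal stdlib-only sketch, not part of the app above: `measure_loop_lag`, `blocking_work` and `demo` are hypothetical names, and `time.sleep` is only a stand-in for whatever is blocking in the real server.

```python
import asyncio
import time


async def measure_loop_lag(interval=0.05, samples=5):
    """Return the worst delay (in seconds) beyond `interval` between wake-ups."""
    worst = 0.0
    for _ in range(samples):
        start = time.perf_counter()
        await asyncio.sleep(interval)
        # If the loop was busy, we wake up later than requested.
        lag = time.perf_counter() - start - interval
        worst = max(worst, lag)
    return worst


async def blocking_work(seconds):
    # Deliberately bad (assumption for the demo): time.sleep() never
    # yields control back to the event loop.
    time.sleep(seconds)


async def demo():
    monitor = asyncio.create_task(measure_loop_lag())
    await asyncio.sleep(0)  # let the monitor start its first sleep
    await blocking_work(0.3)
    return await monitor


worst_lag = asyncio.run(demo())
print(f"worst event loop lag: {worst_lag:.3f}s")
```

A heartbeat like this could be started alongside the app; a lag that grows while the CSV download is running would point at code that is not yielding to the loop.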

Maybe I should be using `fut = loop.run_in_executor(None, run_query_in_thread, sql, queue.sync_q)` and then awaiting `fut` somewhere, like in the Janus documentation? I don't think that's needed though. It needs more work to figure out why this is blocking.
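For reference, here is a rough sketch of what keeping and awaiting that future could look like. To stay self-contained it swaps Janus for a plain `asyncio.Queue` fed through `loop.call_soon_threadsafe`, and it replaces the SQLite query with an in-memory list of batches; `produce_batches`, `stream_rows` and `collect` are hypothetical names. It is not claimed to fix the blocking described above. It only shows that awaiting the future surfaces exceptions raised in the worker thread.

```python
import asyncio


def produce_batches(loop, queue, batches):
    # Runs in a worker thread. asyncio.Queue is not thread-safe, so each
    # put is handed to the event loop via call_soon_threadsafe.
    try:
        for batch in batches:
            loop.call_soon_threadsafe(queue.put_nowait, batch)
    finally:
        # Always send the sentinel so the consumer cannot wait forever.
        loop.call_soon_threadsafe(queue.put_nowait, None)


async def stream_rows(batches):
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    # Keep the future instead of discarding it.
    fut = loop.run_in_executor(None, produce_batches, loop, queue, batches)
    while True:
        rows = await queue.get()
        if rows is None:
            break
        for row in rows:
            yield ",".join(map(str, row)) + "\n"
    # Awaiting the future re-raises any exception from the worker thread.
    await fut


async def collect():
    return [line async for line in stream_rows([[("a", 1)], [("b", 2)]])]


lines = asyncio.run(collect())
print(lines)
```

Without the final `await fut`, an error inside the worker thread would be stored on the future and never raised to the caller.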

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1077628073  