

Reading the database `execute` method, I have a few questions.

https://github.com/simonw/datasette/blob/cb1e093fd361b758120aefc1a444df02462389a3/datasette/database.py#L229-L242


Unless I'm missing something (which is very likely!!), the `max_returned_rows` argument doesn't actually offer any protection against running very expensive queries.

It's not like adding a `LIMIT max_rows` clause to the query, and it makes sense that it isn't, because the query could already contain its own `LIMIT` clause. Doing something like `select * from ({query}) limit {max_returned_rows}` might be protective, but wouldn't always be.
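
For example (a minimal sketch, not from the Datasette codebase): wrapping a query this way caps how many rows come back, but SQLite may still have to evaluate the entire inner query first, e.g. when it contains an `ORDER BY`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table big (n integer)")
conn.executemany("insert into big values (?)", ((i,) for i in range(100_000)))

max_returned_rows = 100
inner = "select n from big order by random()"  # must sort every row
wrapped = f"select * from ({inner}) limit {max_returned_rows}"

rows = conn.execute(wrapped).fetchall()
assert len(rows) == max_returned_rows
# Only 100 rows come back, but SQLite still evaluated and sorted the
# full inner result set before the outer LIMIT could apply.
```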

Instead, the code executes the full original query and, if it still has time, fetches the first `max_returned_rows + 1` rows.
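
In other words, the flow looks roughly like this (a simplified sketch of that pattern; `execute_capped` is a made-up name, not Datasette's API):

```python
import sqlite3

def execute_capped(conn: sqlite3.Connection, sql: str, max_returned_rows: int = 1000):
    # Run the query exactly as written, then pull back at most
    # max_returned_rows + 1 rows so the caller can tell whether
    # the result set was truncated.
    cursor = conn.execute(sql)
    rows = cursor.fetchmany(max_returned_rows + 1)
    truncated = len(rows) > max_returned_rows
    return rows[:max_returned_rows], truncated
```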

This does offer some protection against memory exhaustion, since you won't hydrate a huge result set into Python (though there are data-flow patterns that could avoid that too).
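
One such pattern (just an illustration; `stream_rows` is a hypothetical helper) is to iterate the cursor in batches instead of calling `fetchall()`, so only a bounded number of rows are held in memory at once:

```python
import sqlite3
from typing import Iterator

def stream_rows(conn: sqlite3.Connection, sql: str, batch_size: int = 500) -> Iterator[tuple]:
    # Yield rows lazily: at most batch_size rows are in memory at
    # a time, instead of hydrating the whole result set at once.
    cursor = conn.execute(sql)
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            return
        yield from batch
```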

Given the current architecture, I don't see how creating a new connection would be of use.


If we just removed the `max_returned_rows` limitation, then I think most things would be fine except for the `QueryView`. Right now, rendering just 5,000 rows takes a lot of client-side memory, so some form of pagination would be required.
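
As a rough sketch of what that pagination could look like (hypothetical, not a concrete proposal for Datasette's implementation), the server could wrap the user's query and serve one page at a time:

```python
def paginated_sql(query: str, page_size: int = 100, page: int = 0) -> str:
    # Wrap the user's query so each request returns only one page.
    # Note: OFFSET still makes SQLite step past all the skipped rows,
    # so deep pages get progressively slower; keyset pagination avoids
    # that but requires a stable ordering column.
    return f"select * from ({query}) limit {page_size} offset {page * page_size}"
```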
