issue_comments

5 rows where issue = 830567275 sorted by updated_at descending

user 1

  • simonw 5

issue 1

  • Research using CTEs for faster facet counts · 5 ✖

author_association 1

  • OWNER 5
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
803674728 https://github.com/simonw/datasette/issues/1259#issuecomment-803674728 https://api.github.com/repos/simonw/datasette/issues/1259 MDEyOklzc3VlQ29tbWVudDgwMzY3NDcyOA== simonw 9599 2021-03-21T22:55:31Z 2021-03-21T22:55:31Z OWNER

CTEs were added in SQLite 3.8.3 (released 2014-02-03) - so I think it's OK to depend on them for Datasette.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research using CTEs for faster facet counts 830567275  
797827038 https://github.com/simonw/datasette/issues/1259#issuecomment-797827038 https://api.github.com/repos/simonw/datasette/issues/1259 MDEyOklzc3VlQ29tbWVudDc5NzgyNzAzOA== simonw 9599 2021-03-13T00:15:40Z 2021-03-13T00:15:40Z OWNER

If all of the facets were being calculated in a single query, I'd be willing to bump the facet time limit up to something a lot higher, maybe even a full second. There's a chance that could work amazingly well with a materialized CTE.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research using CTEs for faster facet counts 830567275  
797804869 https://github.com/simonw/datasette/issues/1259#issuecomment-797804869 https://api.github.com/repos/simonw/datasette/issues/1259 MDEyOklzc3VlQ29tbWVudDc5NzgwNDg2OQ== simonw 9599 2021-03-12T23:05:05Z 2021-03-12T23:05:05Z OWNER

I wonder if I could optimize facet suggestion in the same way?

One challenge: the query time limit will apply to the full CTE query, not to the individual columns.
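
To illustrate why the limit applies to the whole statement: a time limit enforced through SQLite's progress handler is attached to the connection, so it interrupts whichever statement is running, however many CTEs it contains. The sketch below is a minimal illustration using Python's standard `sqlite3` module; the `time_limit` helper, its parameters and the slow recursive query are hypothetical names chosen for this example, and Datasette's own mechanism may differ in detail.

```python
import sqlite3
import time
from contextlib import contextmanager


@contextmanager
def time_limit(conn, ms, every_n_instructions=1000):
    """Abort any statement on conn that runs longer than ms milliseconds."""
    deadline = time.perf_counter() + ms / 1000

    def handler():
        # A non-zero return value tells SQLite to interrupt the statement.
        return 1 if time.perf_counter() >= deadline else 0

    conn.set_progress_handler(handler, every_n_instructions)
    try:
        yield
    finally:
        # Remove the handler so later statements are not affected.
        conn.set_progress_handler(None, every_n_instructions)


conn = sqlite3.connect(":memory:")

# A deliberately slow statement: the limit covers the entire CTE query.
slow_sql = """
WITH RECURSIVE counter(x) AS (
  SELECT 1 UNION ALL SELECT x + 1 FROM counter WHERE x < 100000000
)
SELECT count(*) FROM counter
"""

try:
    with time_limit(conn, 20):
        conn.execute(slow_sql).fetchall()
    interrupted = False
except sqlite3.OperationalError:
    interrupted = True
```

With this kind of limit, one slow column in a combined facet query would cause every facet in that query to fail together, which is the trade-off described above.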

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research using CTEs for faster facet counts 830567275  
797801075 https://github.com/simonw/datasette/issues/1259#issuecomment-797801075 https://api.github.com/repos/simonw/datasette/issues/1259 MDEyOklzc3VlQ29tbWVudDc5NzgwMTA3NQ== simonw 9599 2021-03-12T22:53:56Z 2021-03-12T22:55:16Z OWNER

OK, a better comparison:

https://global-power-plants.datasettes.com/global-power-plants?sql=WITH+data+as+%28%0D%0A++select%0D%0A++++%0D%0A++from%0D%0A++++%5Bglobal-power-plants%5D%0D%0A%29%2C%0D%0Acountry_long+as+%28select+%0D%0A++%27country_long%27+as+col%2C+country_long+as+value%2C+count%28%29+as+c+from+data+group+by+country_long%0D%0A++order+by+c+desc+limit+31%0D%0A%29%2C%0D%0Aprimary_fuel+as+%28%0D%0Aselect%0D%0A++%27primary_fuel%27+as+col%2C+primary_fuel+as+value%2C+count%28%29+as+c+from+data+group+by+primary_fuel%0D%0A++order+by+c+desc+limit+31%0D%0A%29%2C%0D%0Aowner+as+%28%0D%0Aselect%0D%0A++%27owner%27+as+col%2C+owner+as+value%2C+count%28%29+as+c+from+data+group+by+owner%0D%0A++order+by+c+desc+limit+31%0D%0A%29%0D%0Aselect++from+primary_fuel+union+select++from+country_long%0D%0Aunion+select++from+owner+order+by+col%2C+c+desc calculates facets against three columns. It takes 78.5ms (and 34.5ms when I refreshed it, presumably after warming a SQLite cache of some sort).

https://global-power-plants.datasettes.com/global-power-plants/global-power-plants?_facet=country_long&_facet=primary_fuel&_trace=1&_size=0 shows those facets with size=0 on the SQL query - and shows a SQL trace at the bottom of the page.

The country_long facet query takes 45.36ms, owner takes 38.45ms, primary_fuel takes 49.04ms - so a total of 132.85ms

That's against SQLite 3.27.3 (according to https://global-power-plants.datasettes.com/-/versions) - so even on a SQLite version that doesn't materialize the CTEs there's a significant performance boost to doing all three facets in a single CTE query.
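
For readers who prefer not to decode the URL, the single-query approach can be sketched as follows. This is a minimal sketch, not Datasette's implementation: the `facet_counts_sql` helper, the in-memory `plants` table and its sample rows are hypothetical stand-ins for `global-power-plants`. The `*` in `select *` and `count(*)` appears to have been dropped from the URL as displayed here, so restoring it is an assumption.

```python
import sqlite3


def facet_counts_sql(table, columns, limit=31):
    """Build one query that returns (col, value, c) facet counts for each column.

    Identifiers are interpolated directly, so table and columns must be trusted.
    """
    # Shared CTE that every per-column facet CTE reads from.
    ctes = [f"data AS (SELECT * FROM [{table}])"]
    selects = []
    for i, col in enumerate(columns):
        name = f"facet_{i}"
        ctes.append(
            f"{name} AS ("
            f"SELECT '{col}' AS col, [{col}] AS value, count(*) AS c "
            f"FROM data GROUP BY [{col}] ORDER BY c DESC LIMIT {limit})"
        )
        selects.append(f"SELECT * FROM {name}")
    return (
        "WITH " + ", ".join(ctes) + " "
        + " UNION ".join(selects)
        + " ORDER BY col, c DESC"
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plants (country_long TEXT, primary_fuel TEXT, owner TEXT)"
)
conn.executemany(
    "INSERT INTO plants VALUES (?, ?, ?)",
    [
        ("France", "Nuclear", "EDF"),
        ("France", "Hydro", "EDF"),
        ("Brazil", "Hydro", "Itaipu"),
    ],
)

sql = facet_counts_sql("plants", ["country_long", "primary_fuel", "owner"])
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Each result row carries the column name in `col`, so the caller can regroup the combined result into one facet per column.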

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research using CTEs for faster facet counts 830567275  
797790017 https://github.com/simonw/datasette/issues/1259#issuecomment-797790017 https://api.github.com/repos/simonw/datasette/issues/1259 MDEyOklzc3VlQ29tbWVudDc5Nzc5MDAxNw== simonw 9599 2021-03-12T22:22:12Z 2021-03-12T22:22:12Z OWNER

https://sqlite.org/lang_with.html

Prior to SQLite 3.35.0, all CTEs were treated as if the NOT MATERIALIZED phrase was present

It looks like this optimization is completely unavailable on SQLite prior to 3.35.0 (released 12th March 2021). But I could still rewrite the faceting to work in this way, using the exact same SQL - it would just be significantly faster on 3.35.0+ (assuming it's actually faster in practice - would need to benchmark).
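
If an explicit hint were wanted rather than the exact same SQL everywhere, one option would be to add the MATERIALIZED keyword only when the SQLite library in use supports it, since the keyword is a syntax error before 3.35.0. The sketch below is an assumption-laden illustration: the `data_cte` helper and the `plants` table are hypothetical, and whether the hint actually speeds up faceting would still need the benchmark mentioned above.

```python
import sqlite3


def data_cte(table, version_info=sqlite3.sqlite_version_info):
    """Return a WITH clause, adding MATERIALIZED only where SQLite accepts it."""
    # The MATERIALIZED / NOT MATERIALIZED hints were introduced in SQLite 3.35.0.
    hint = "MATERIALIZED " if version_info >= (3, 35, 0) else ""
    return f"WITH data AS {hint}(SELECT * FROM [{table}])"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plants (primary_fuel TEXT)")
conn.execute("INSERT INTO plants VALUES ('Hydro')")

# The generated SQL runs on whichever SQLite version Python is linked against.
sql = data_cte("plants") + " SELECT count(*) FROM data"
total = conn.execute(sql).fetchone()[0]
```

Passing `version_info` explicitly, for example `data_cte("plants", (3, 27, 3))`, shows the SQL that would be generated for an older library.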

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Research using CTEs for faster facet counts 830567275  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 65.094ms · About: github-to-sqlite