issue_comments

9 rows where author_association = "OWNER" and issue = 1096558279 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1009521921 https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1009521921 https://api.github.com/repos/simonw/sqlite-utils/issues/365 IC_kwDOCGYnMM48LBUB simonw 9599 2022-01-11T01:37:53Z 2022-01-11T01:37:53Z OWNER

I decided to go with making this opt-in, mainly for consistency with the other places where I added this feature - see:

  • #379
  • #366

You can now run the following:

sqlite-utils create-index mydb.db mytable mycolumn --analyze

And ANALYZE will be run on the index once it has been created.
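
For readers who want to see what this amounts to at the SQL level, here is a minimal sketch using Python's standard `sqlite3` module rather than sqlite-utils itself. It assumes the index is named `idx_mytable_mycolumn` (following the `idx_<table>_<column>` pattern visible in the schema on this page) and uses an in-memory database with made-up rows; it is an illustration of `CREATE INDEX` followed by `ANALYZE <index>`, not the tool's actual implementation.

```python
import sqlite3

# In-memory database with illustrative data (table and values are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, mycolumn TEXT)")
conn.executemany(
    "INSERT INTO mytable (mycolumn) VALUES (?)",
    [(f"value-{i % 10}",) for i in range(100)],
)

# Create the index, then gather statistics for that one index only.
# The index name is an assumption made for this sketch.
conn.execute("CREATE INDEX idx_mytable_mycolumn ON mytable (mycolumn)")
conn.execute("ANALYZE idx_mytable_mycolumn")

# ANALYZE stores its results in the sqlite_stat1 table.
stats = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
print(stats)
```

The `stat` column holds a space-separated string whose first number is the row count SQLite saw in the index at the time ANALYZE ran.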

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
create-index should run analyze after creating index 1096558279  
1008229839 https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008229839 https://api.github.com/repos/simonw/sqlite-utils/issues/365 IC_kwDOCGYnMM48GF3P simonw 9599 2022-01-09T04:51:44Z 2022-01-09T04:51:44Z OWNER

Found one report on Stack Overflow from 9 years ago of someone seeing broken performance after running ANALYZE. Hard to say whether that's a trend rather than a single weird edge case, though! https://stackoverflow.com/q/12947214/6083

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
create-index should run analyze after creating index 1096558279  
1008163585 https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008163585 https://api.github.com/repos/simonw/sqlite-utils/issues/365 IC_kwDOCGYnMM48F1sB simonw 9599 2022-01-08T22:14:39Z 2022-01-09T03:03:07Z OWNER

The reason I'm hesitating on this is that I've not actually used ANALYZE at all in nearly five years of messing around with SQLite! So I'm nervous that there are surprise downsides I haven't thought of.

My hunch is that ANALYZE is only worth worrying about on much larger databases, in which case I'm OK supporting it as a thoroughly documented power-user feature rather than a default.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
create-index should run analyze after creating index 1096558279  
1008163050 https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008163050 https://api.github.com/repos/simonw/sqlite-utils/issues/365 IC_kwDOCGYnMM48F1jq simonw 9599 2022-01-08T22:10:51Z 2022-01-08T22:10:51Z OWNER

Is there a downside to having a sqlite_stat1 table if it has wildly incorrect statistics in it?

Imagine the following sequence of events:

  • User imports a few records, creating the table, using sqlite-utils insert
  • User runs sqlite-utils create-index ... which also creates and populates the sqlite_stat1 table
  • User runs insert again to populate several million new records

The user now has a database file with several million records and a statistics table that is wildly out of date, having been populated when they only had a few.

Will this result in surprisingly bad query performance compared to if that statistics table did not exist at all?

If so, I lean much harder towards ANALYZE as a strictly opt-in optimization, maybe with the --analyze option added to sqlite-utils insert too, to help users opt in to updating their statistics after running big inserts.
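
The sequence of events described above can be reproduced with a short sketch using Python's standard `sqlite3` module. The table name `records`, the column `category`, the index name and the row counts are all hypothetical, and 10,000 rows stand in for "several million". The sketch only shows that `sqlite_stat1` is not refreshed by later inserts; it does not measure whether stale statistics actually hurt query performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, category TEXT)")


def insert_rows(count):
    """Insert `count` rows spread across five made-up categories."""
    conn.executemany(
        "INSERT INTO records (category) VALUES (?)",
        [(f"cat-{i % 5}",) for i in range(count)],
    )


def index_stat():
    """Return the stored statistics string for the example index."""
    row = conn.execute(
        "SELECT stat FROM sqlite_stat1 WHERE idx = 'idx_records_category'"
    ).fetchone()
    return row[0]


# Step 1: import a few records.
insert_rows(10)

# Step 2: create the index and run ANALYZE, which creates sqlite_stat1.
conn.execute("CREATE INDEX idx_records_category ON records (category)")
conn.execute("ANALYZE")
stat_before = index_stat()

# Step 3: a much larger insert. The statistics are not updated by it.
insert_rows(10_000)
stat_stale = index_stat()
actual_rows = conn.execute("SELECT count(*) FROM records").fetchone()[0]

# Running ANALYZE again brings the statistics back in line with the data.
conn.execute("ANALYZE")
stat_fresh = index_stat()

print("stored before big insert:", stat_before)
print("stored after big insert: ", stat_stale, "| actual rows:", actual_rows)
print("after re-running ANALYZE:", stat_fresh)
```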

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
create-index should run analyze after creating index 1096558279  
1008158357 https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1008158357 https://api.github.com/repos/simonw/sqlite-utils/issues/365 IC_kwDOCGYnMM48F0aV simonw 9599 2022-01-08T21:33:07Z 2022-01-08T21:33:07Z OWNER

The one thing that worries me a little bit about doing this by default is that it adds a surprising new table to the database - it may be confusing to users if they run create-index and their database suddenly has a new sqlite_stat1 table, see https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1008157132

Options here are:

  • Do it anyway. People can tolerate a surprise table appearing when they create an index.
  • Only run ANALYZE if the user says sqlite-utils create-index ... --analyze
  • Use the --analyze option, but also automatically run ANALYZE if they create an index and the database they are working with already has a sqlite_stat1 table

I'm currently leaning towards that third option - @fgregg any thoughts?
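
A rough sketch of what that third option could look like, written against Python's standard `sqlite3` module. The helper names `has_stat_table` and `create_index`, the `analyze` parameter and the index naming scheme are hypothetical and chosen for illustration; this is not how sqlite-utils implements the feature.

```python
import sqlite3


def has_stat_table(conn):
    """Return True if the database already contains a sqlite_stat1 table."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'sqlite_stat1'"
    ).fetchone()
    return row is not None


def create_index(conn, table, column, analyze=False):
    """Create an index; run ANALYZE on it if asked to, or if the database
    has already been analyzed before (hypothetical helper)."""
    index_name = f"idx_{table}_{column}"
    conn.execute(f'CREATE INDEX "{index_name}" ON "{table}" ("{column}")')
    if analyze or has_stat_table(conn):
        conn.execute(f'ANALYZE "{index_name}"')
    return index_name


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b TEXT)")
conn.execute("INSERT INTO t VALUES ('x', 'y')")

# No sqlite_stat1 yet and no opt-in: nothing is analyzed.
create_index(conn, "t", "a")
no_stats_yet = not has_stat_table(conn)

# Explicit opt-in: this creates sqlite_stat1.
create_index(conn, "t", "b", analyze=True)

# sqlite_stat1 now exists, so later indexes are analyzed automatically.
conn.execute("CREATE TABLE u (c TEXT)")
conn.execute("INSERT INTO u VALUES ('z')")
create_index(conn, "u", "c")

analyzed = {row[0] for row in conn.execute("SELECT idx FROM sqlite_stat1")}
print(no_stats_yet, sorted(analyzed))
```

The existence check keeps the surprise table from appearing for users who never opted in, while keeping statistics current for databases that already have them.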

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
create-index should run analyze after creating index 1096558279  
1007643254 https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007643254 https://api.github.com/repos/simonw/sqlite-utils/issues/365 IC_kwDOCGYnMM48D2p2 simonw 9599 2022-01-07T18:37:56Z 2022-01-07T18:37:56Z OWNER

Or I could leave off --no-analyze and tell people that if they want to add an index without running analyze they can execute the CREATE INDEX themselves.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
create-index should run analyze after creating index 1096558279  
1007642831 https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007642831 https://api.github.com/repos/simonw/sqlite-utils/issues/365 IC_kwDOCGYnMM48D2jP simonw 9599 2022-01-07T18:37:18Z 2022-01-07T18:37:18Z OWNER

After implementing #366 I can make it so sqlite-utils create-index automatically runs db.analyze(index_name) afterwards, maybe with a --no-analyze option in case anyone wants to opt out of that for specific performance reasons.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
create-index should run analyze after creating index 1096558279  
1007634999 https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007634999 https://api.github.com/repos/simonw/sqlite-utils/issues/365 IC_kwDOCGYnMM48D0o3 simonw 9599 2022-01-07T18:26:22Z 2022-01-07T18:26:22Z OWNER

I've not used the ANALYZE feature in SQLite at all before. Should probably add Python library methods for it.

Annoyingly I use the word "analyze" to mean something else in the CLI - for these features:

  • #207
  • #320

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
create-index should run analyze after creating index 1096558279  
1007633376 https://github.com/simonw/sqlite-utils/issues/365#issuecomment-1007633376 https://api.github.com/repos/simonw/sqlite-utils/issues/365 IC_kwDOCGYnMM48D0Pg simonw 9599 2022-01-07T18:24:07Z 2022-01-07T18:24:07Z OWNER

Relevant documentation: https://www.sqlite.org/lang_analyze.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
create-index should run analyze after creating index 1096558279  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 29.707ms · About: github-to-sqlite