issue_comments


10 rows where issue = 1096563265 and user = 9599 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1009508865 https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009508865 https://api.github.com/repos/simonw/sqlite-utils/issues/366 IC_kwDOCGYnMM48K-IB simonw 9599 2022-01-11T01:08:51Z 2022-01-11T01:08:51Z OWNER

The Python methods are all done now, next step is the CLI options. I'll do those in a separate issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Python library methods for calling ANALYZE 1096563265  
1009288898 https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009288898 https://api.github.com/repos/simonw/sqlite-utils/issues/366 IC_kwDOCGYnMM48KIbC simonw 9599 2022-01-10T19:54:04Z 2022-01-10T19:54:04Z OWNER

Having browsed the API reference I think the methods that would benefit from an analyze=True parameter are:

  • db.create_index
  • table.insert_all
  • table.upsert_all
  • table.delete_where
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Python library methods for calling ANALYZE 1096563265  
1009285627 https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009285627 https://api.github.com/repos/simonw/sqlite-utils/issues/366 IC_kwDOCGYnMM48KHn7 simonw 9599 2022-01-10T19:49:19Z 2022-01-10T19:51:25Z OWNER

Documentation for those two new methods: https://sqlite-utils.datasette.io/en/latest/python-api.html#optimizing-index-usage-with-analyze

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Python library methods for calling ANALYZE 1096563265  
1009286373 https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009286373 https://api.github.com/repos/simonw/sqlite-utils/issues/366 IC_kwDOCGYnMM48KHzl simonw 9599 2022-01-10T19:50:22Z 2022-01-10T19:50:22Z OWNER

With respect to #365, I'm now thinking that having the ability to say "... and then run ANALYZE" could be useful for a bunch of Python methods. For example:

db["dogs"].insert_all(list_of_dogs, analyze=True)
db["dogs"].create_index(["name"], analyze=True)
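The "... and then run ANALYZE" idea can be sketched with the standard-library `sqlite3` module. This is only an illustration under stated assumptions: the `insert_all` function below is a hypothetical stand-in, not the sqlite-utils implementation, and the `dogs` table and `idx_dogs_name` index are made up for the example.

```python
import sqlite3


def insert_all(conn, table, rows, analyze=False):
    """Insert dict rows into a table, optionally running ANALYZE afterwards.

    Hypothetical helper for illustration only; not the sqlite-utils code.
    """
    rows = list(rows)
    if rows:
        columns = list(rows[0].keys())
        sql = "INSERT INTO [{}] ({}) VALUES ({})".format(
            table,
            ", ".join("[{}]".format(c) for c in columns),
            ", ".join("?" for _ in columns),
        )
        conn.executemany(sql, [[row[c] for c in columns] for row in rows])
    if analyze:
        # Refresh the planner statistics for just this table and its indexes.
        conn.execute("ANALYZE [{}]".format(table))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_dogs_name ON dogs (name)")

list_of_dogs = [{"name": "Cleo"}, {"name": "Pancakes"}, {"name": "Cleo"}]
insert_all(conn, "dogs", list_of_dogs, analyze=True)

# ANALYZE has populated sqlite_stat1 with one row for the index on dogs.
stats = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
```

Making the flag opt-in keeps bulk inserts cheap by default, since ANALYZE has to scan the indexes it gathers statistics for.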

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Python library methods for calling ANALYZE 1096563265  
1009273525 https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1009273525 https://api.github.com/repos/simonw/sqlite-utils/issues/366 IC_kwDOCGYnMM48KEq1 simonw 9599 2022-01-10T19:32:39Z 2022-01-10T19:32:39Z OWNER

I'm going to implement the Python library methods based on the prototype:

```diff
commit 650f97a08f29a688c530e5f6c9eedc9269ed7bdc
Author: Simon Willison <swillison@gmail.com>
Date:   Sat Jan 8 13:34:01 2022 -0800

    Initial prototype of .analyze(), refs #366

diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py
index dfc4723..1348b4a 100644
--- a/sqlite_utils/db.py
+++ b/sqlite_utils/db.py
@@ -923,6 +923,13 @@ class Database:
         "Run a SQLite VACUUM against the database."
         self.execute("VACUUM;")
 
+    def analyze(self, name=None):
+        "Run ANALYZE against the entire database or a named table or index."
+        sql = "ANALYZE"
+        if name is not None:
+            sql += " [{}]".format(name)
+        self.execute(sql)
+
 
 class Queryable:
     def exists(self) -> bool:
@@ -2902,6 +2909,10 @@ class Table(Queryable):
         )
         return self
 
+    def analyze(self):
+        "Run ANALYZE against this table"
+        self.db.analyze(self.name)
+
     def analyze_column(
         self, column: str, common_limit: int = 10, value_truncate=None, total_rows=None
     ) -> "ColumnDetails":
```
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Python library methods for calling ANALYZE 1096563265  
1008158616 https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1008158616 https://api.github.com/repos/simonw/sqlite-utils/issues/366 IC_kwDOCGYnMM48F0eY simonw 9599 2022-01-08T21:35:32Z 2022-01-08T21:35:32Z OWNER

Built a prototype in a branch, see #367.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Python library methods for calling ANALYZE 1096563265  
1008157132 https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1008157132 https://api.github.com/repos/simonw/sqlite-utils/issues/366 IC_kwDOCGYnMM48F0HM simonw 9599 2022-01-08T21:23:08Z 2022-01-08T21:25:05Z OWNER

Running ANALYZE creates a new visible table called sqlite_stat1: https://www.sqlite.org/fileformat.html#the_sqlite_stat1_table

This should be added to the default list of hidden tables in Datasette.

It looks something like this:

| tbl                             | idx                                | stat      |
|---------------------------------|------------------------------------|-----------|
| _counts                         | sqlite_autoindex__counts_1         | 5 1       |
| global-power-plants_fts_config  | global-power-plants_fts_config     | 1 1       |
| global-power-plants_fts_docsize |                                    | 33643     |
| global-power-plants_fts_idx     | global-power-plants_fts_idx        | 199 40 1  |
| global-power-plants_fts_data    |                                    | 136       |
| global-power-plants             | "global-power-plants_owner"        | 33643 4   |
| global-power-plants             | "global-power-plants_country_long" | 33643 202 |

In each such row, the sqlite_stat1.stat column will be a string consisting of a list of integers followed by zero or more arguments. The first integer in this list is the approximate number of rows in the index. (The number of rows in the index is the same as the number of rows in the table, except for partial indexes.) The second integer is the approximate number of rows in the index that have the same value in the first column of the index. The third integer is the number of rows in the index that have the same value for the first two columns. The N-th integer (for N>1) is the estimated average number of rows in the index which have the same value for the first N-1 columns. For a K-column index, there will be K+1 integers in the stat column. If the index is unique, then the last integer will be 1.
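To make the layout of the stat column concrete, here is a small sketch that splits a stat string into its leading integers and any trailing arguments. The `parse_stat` and `describe_stat` names are hypothetical helpers written for this note; they are not part of SQLite, sqlite-utils or Datasette.

```python
def parse_stat(stat):
    """Split a sqlite_stat1.stat string into (integers, trailing arguments)."""
    integers, arguments = [], []
    for token in stat.split():
        # Integers come first; once an argument is seen, the rest are arguments.
        if token.isdigit() and not arguments:
            integers.append(int(token))
        else:
            arguments.append(token)
    return integers, arguments


def describe_stat(stat):
    """Label the integers following the description quoted above."""
    integers, arguments = parse_stat(stat)
    return {
        # First integer: approximate number of rows in the index (or table).
        "approx_rows": integers[0],
        # N-th integer (N>1): average rows sharing the first N-1 column values.
        "avg_rows_per_prefix": integers[1:],
        "arguments": arguments,
    }


owner = describe_stat("33643 4")
country_long = describe_stat("33643 202")
```

Read this way, the two `global-power-plants` rows in the table say that a lookup on the owner index matches about 4 rows on average, while a lookup on the country_long index matches about 202.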

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Python library methods for calling ANALYZE 1096563265  
1007641634 https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1007641634 https://api.github.com/repos/simonw/sqlite-utils/issues/366 IC_kwDOCGYnMM48D2Qi simonw 9599 2022-01-07T18:35:35Z 2022-01-07T18:35:35Z OWNER

Since the existing CLI feature is this:

$ sqlite-utils analyze-tables github.db tags

I can add sqlite-utils analyze to reflect the Python library method.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Python library methods for calling ANALYZE 1096563265  
1007639860 https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1007639860 https://api.github.com/repos/simonw/sqlite-utils/issues/366 IC_kwDOCGYnMM48D100 simonw 9599 2022-01-07T18:32:59Z 2022-01-07T18:33:07Z OWNER

From the SQLite docs:

If no arguments are given, all attached databases are analyzed. If a schema name is given as the argument, then all tables and indices in that one database are analyzed. If the argument is a table name, then only that table and the indices associated with that table are analyzed. If the argument is an index name, then only that one index is analyzed.

So I think this becomes two methods:

  • db.analyze() calls analyze on the whole database
  • db.analyze(name_of_table_or_index) for a specific named table or index
  • table.analyze() is a shortcut for db.analyze(table.name)
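
The difference between analyzing a single named table and analyzing everything can be seen with the standard-library `sqlite3` module. This is a rough sketch, not the sqlite-utils code: the `analyze` function only mirrors the shape of the proposed `db.analyze(name=None)`, and the `dogs` and `cats` tables are invented for the example.

```python
import sqlite3


def analyze(conn, name=None):
    """Run ANALYZE on everything, or on one named table or index."""
    sql = "ANALYZE"
    if name is not None:
        # Square-bracket quoting; assumes the name contains no "]" character.
        sql += " [{}]".format(name)
    conn.execute(sql)


def analyzed_tables(conn):
    """Return the set of table names that currently have rows in sqlite_stat1."""
    return {row[0] for row in conn.execute("SELECT tbl FROM sqlite_stat1")}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE cats (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_dogs_name ON dogs (name)")
conn.execute("CREATE INDEX idx_cats_name ON cats (name)")
conn.executemany("INSERT INTO dogs (name) VALUES (?)", [("Cleo",), ("Pancakes",)])
conn.executemany("INSERT INTO cats (name) VALUES (?)", [("Tom",)])

analyze(conn, "dogs")  # only dogs and its indexes get statistics
after_table = analyzed_tables(conn)

analyze(conn)  # no argument: every table and index is analyzed
after_all = analyzed_tables(conn)
```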
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Python library methods for calling ANALYZE 1096563265  
1007637963 https://github.com/simonw/sqlite-utils/issues/366#issuecomment-1007637963 https://api.github.com/repos/simonw/sqlite-utils/issues/366 IC_kwDOCGYnMM48D1XL simonw 9599 2022-01-07T18:30:13Z 2022-01-07T18:30:13Z OWNER

Annoyingly I use the word "analyze" to mean something else in the CLI - for these features:

  • #207
  • #320

There's only one method with a similar name in the Python library though, and that's this one:

https://github.com/simonw/sqlite-utils/blob/6e46b9913411682f3a3ec66f4d58886c1db8654b/sqlite_utils/db.py#L2904-L2906

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Python library methods for calling ANALYZE 1096563265  


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 539.76ms · About: github-to-sqlite