issue_comments


13 rows where issue = 421971339 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
481310295 https://github.com/simonw/datasette/issues/420#issuecomment-481310295 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ4MTMxMDI5NQ== simonw 9599 2019-04-09T15:50:52Z 2019-04-09T15:50:52Z OWNER

Efficient row counts are even more important for the DatabaseView and IndexView pages.

The row counts on those pages don't have to be precise, so one option is for me to calculate them and cache them occasionally. I could even have a dedicated thread which just does the counting?

In #422 I've figured out a mechanism for getting accurate or lower-bound counts within a time limit (accurate if possible, lower-bound otherwise).
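A minimal sketch of one way to get "accurate if possible, lower-bound otherwise" counts, not necessarily the mechanism from #422. It assumes a plain sqlite3 connection; the helper names, the 50 ms default and the 10000 row limit are hypothetical. The exact count is attempted under a progress handler that aborts the query once the deadline passes, and the fallback counts at most `limit` rows.

```python
import sqlite3
import time


def lower_bound_count(conn, table, limit=10000):
    """Count at most `limit` rows, so the amount of work is bounded."""
    sql = "select count(*) from (select 1 from [{}] limit ?)".format(table)
    return conn.execute(sql, (limit,)).fetchone()[0]


def count_within_time_limit(conn, table, time_limit_ms=50, limit=10000):
    """Return (count, is_exact) for `table` (hypothetical helper)."""
    deadline = time.perf_counter() + time_limit_ms / 1000

    def should_abort():
        # A non-zero return value tells SQLite to abort the running query
        return 1 if time.perf_counter() > deadline else 0

    # Ask SQLite to call should_abort() roughly every 1000 VM instructions
    conn.set_progress_handler(should_abort, 1000)
    try:
        sql = "select count(*) from [{}]".format(table)
        return conn.execute(sql).fetchone()[0], True
    except sqlite3.OperationalError:
        # The query was interrupted: fall through to the bounded count
        pass
    finally:
        conn.set_progress_handler(None, 1000)
    return lower_bound_count(conn, table, limit), False
```

If the bounded count comes back smaller than `limit` it is in fact exact; otherwise it should be displayed as "at least N rows".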

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  
480556166 https://github.com/simonw/datasette/issues/420#issuecomment-480556166 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ4MDU1NjE2Ng== simonw 9599 2019-04-07T03:35:59Z 2019-04-07T03:48:14Z OWNER

Still need to solve: TableView.data() - but this is the one with a row count in it, hence the need to solve #422.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  
480552387 https://github.com/simonw/datasette/issues/420#issuecomment-480552387 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ4MDU1MjM4Nw== simonw 9599 2019-04-07T02:06:20Z 2019-04-07T02:06:20Z OWNER

expand_foreign_keys() relies on the .inspect() command having automatically derived the label_column for a table, which it does using this code:

https://github.com/simonw/datasette/blob/97331f3435ba1583a0f9dbcaffc25de8894cf1f8/datasette/inspect.py#L34-L42

This needs access to the column names for the table. I think we can drop this entirely in favour of a new utility function - and that function can incorporate the metadata check as well.
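A rough sketch of what such a utility function could look like. The function name, the metadata layout and the two-column heuristic are all assumptions made for illustration; the linked inspect.py lines are the authority on what the current code actually does.

```python
import sqlite3


def label_column_for_table(conn, table, metadata=None):
    """Return the label column for `table`, or None (hypothetical helper)."""
    # Explicit configuration wins. Assumed metadata layout:
    # {"tables": {"mytable": {"label_column": "title"}}}
    tables = (metadata or {}).get("tables", {})
    explicit = tables.get(table, {}).get("label_column")
    if explicit:
        return explicit
    # PRAGMA table_info returns one row per column; index 1 is the column name
    sql = "PRAGMA table_info([{}])".format(table)
    column_names = [row[1] for row in conn.execute(sql)]
    # Assumed heuristic: in a two-column table that has an "id" column,
    # the other column is treated as the label
    if len(column_names) == 2 and "id" in column_names:
        return [c for c in column_names if c != "id"][0]
    return None
```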

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  
478393116 https://github.com/simonw/datasette/issues/420#issuecomment-478393116 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ3ODM5MzExNg== simonw 9599 2019-03-31T22:52:48Z 2019-03-31T22:52:48Z OWNER

This means the Datasette class needs a new property, keeping track of all of the connected databases.

ds.databases = {
    "name_used_in_urls": {
        "type": "file",  # or "memory"
        "path": filepath,  # or None if memory
        "mutable": True,  # or False
        "hash": "...",  # or None if mutable
    }
}

Maybe these should be objects, not dictionaries.
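For comparison, a sketch of the object version using a dataclass. The class name and field names are hypothetical; they mirror the dictionary keys above, with "type" derived from whether a path is set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Database:
    name: str  # the name used in URLs
    path: Optional[str] = None  # None for an in-memory database
    is_mutable: bool = False
    hash: Optional[str] = None  # None if the database is mutable

    @property
    def type(self):
        # Derived rather than stored, so it cannot disagree with path
        return "memory" if self.path is None else "file"


# ds.databases would then map URL names to Database objects
databases = {
    "fixtures": Database(name="fixtures", path="fixtures.db", hash="..."),
}
```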

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  
478391708 https://github.com/simonw/datasette/issues/420#issuecomment-478391708 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ3ODM5MTcwOA== simonw 9599 2019-03-31T22:33:32Z 2019-03-31T22:34:02Z OWNER

Next I need to fix this:

https://github.com/simonw/datasette/blob/0209a0a344503157351e625f0629b686961763c9/datasette/app.py#L420-L435

Given the name of the database (from the URL e.g. https://latest.datasette.io/fixtures) I need to figure out what name I used to cache the collection.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  
477636768 https://github.com/simonw/datasette/issues/420#issuecomment-477636768 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ3NzYzNjc2OA== simonw 9599 2019-03-28T15:09:27Z 2019-03-28T15:09:27Z OWNER

Even more tricky: table_exists() is currently a synchronous function. If it's going to be executing a SQL query it needs to become an async function.
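A sketch of what the async version might look like, assuming the blocking SQL is pushed onto the default thread pool executor. The signature is hypothetical: it takes a file path instead of a database name so that the example is self-contained, and it opens a fresh connection inside the worker thread.

```python
import asyncio
import sqlite3


def _table_exists_sync(path, table):
    # Runs in a worker thread, so the connection is created and used there
    conn = sqlite3.connect(path)
    try:
        row = conn.execute(
            "select 1 from sqlite_master where type = 'table' and name = ?",
            (table,),
        ).fetchone()
        return row is not None
    finally:
        conn.close()


async def table_exists(path, table):
    loop = asyncio.get_running_loop()
    # None selects the event loop's default executor
    return await loop.run_in_executor(None, _table_exists_sync, path, table)
```

Every caller then has to await table_exists(...), which is what makes this change ripple through the views.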

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  
477633354 https://github.com/simonw/datasette/issues/420#issuecomment-477633354 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ3NzYzMzM1NA== simonw 9599 2019-03-28T15:01:37Z 2019-03-28T15:01:37Z OWNER

I started looking at how I would implement table_exists() with a direct call that uses sqlite-utils to see if a table exists.

https://github.com/simonw/datasette/blob/82fec6048148b58748040a7e2caa163387e982a3/datasette/app.py#L303-L304

sqlite-utils needs access to the database connection - but the database connection itself is currently only available in code that runs in a thread inside the .execute() method:

https://github.com/simonw/datasette/blob/82fec6048148b58748040a7e2caa163387e982a3/datasette/app.py#L413-L426

So I'm going to need to refactor this a bit. I think I need a way to say "here is a function which needs access to the connection object for database named X - run that function in a thread, give it access to that connection and then give me back the result".
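One possible shape for that refactor, as a sketch only. The class and method names are hypothetical, and it assumes one connection per database per worker thread, stored in thread-local storage.

```python
import asyncio
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor


class ConnectionRunner:
    def __init__(self, paths, max_workers=3):
        self.paths = paths  # e.g. {"fixtures": "/path/to/fixtures.db"}
        self.executor = ThreadPoolExecutor(max_workers=max_workers)
        self._local = threading.local()

    def _connection(self, name):
        # Each worker thread lazily opens and keeps its own connections
        conns = getattr(self._local, "conns", None)
        if conns is None:
            conns = self._local.conns = {}
        if name not in conns:
            conns[name] = sqlite3.connect(self.paths[name])
        return conns[name]

    async def run_with_connection(self, name, fn):
        """Run fn(conn) in a thread and return its result."""
        def in_thread():
            return fn(self._connection(name))

        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(self.executor, in_thread)
```

A caller would then write something like `await runner.run_with_connection("fixtures", lambda conn: conn.execute("select 1").fetchone())`, and any library that wants a raw connection can be used inside that function.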

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  
474407617 https://github.com/simonw/datasette/issues/420#issuecomment-474407617 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ3NDQwNzYxNw== simonw 9599 2019-03-19T14:55:51Z 2019-03-19T14:55:51Z OWNER

A microbenchmark against fivethirtyeight.db (415 tables):

In [1]: import sqlite3                                                                                              
In [2]: c = sqlite3.connect("fivethirtyeight.db")                                                                   
In [3]: %timeit c.execute("select name from sqlite_master where type = 'table'").fetchall()                         
283 µs ± 12.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [4]: tables = [r[0] for r in c.execute("select name from sqlite_master where type = 'table'").fetchall()]        
In [5]: len(tables)                                                                                                 
Out[5]: 415
In [6]: %timeit [c.execute("pragma foreign_keys([{}])".format(t)).fetchall() for t in tables]                       
1.81 ms ± 161 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

So running pragma foreign_keys() against 415 tables only takes 1.81ms. This is going to be fine.
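One caveat worth checking: in SQLite, PRAGMA foreign_keys is the switch for foreign key enforcement, while PRAGMA foreign_key_list(table) is the one that returns a table's foreign keys. Below is a sketch for timing the latter with the standard library timeit module. The helper names are hypothetical and no timings are claimed for it here.

```python
import sqlite3
import timeit


def all_foreign_keys(conn):
    """Return {table_name: [foreign_key_list rows]} for every table."""
    tables = [
        r[0]
        for r in conn.execute("select name from sqlite_master where type = 'table'")
    ]
    return {
        t: conn.execute("PRAGMA foreign_key_list([{}])".format(t)).fetchall()
        for t in tables
    }


def time_all_foreign_keys(path, number=100):
    """Average seconds per full pass over every table in the database."""
    conn = sqlite3.connect(path)
    try:
        total = timeit.timeit(lambda: all_foreign_keys(conn), number=number)
    finally:
        conn.close()
    return total / number
```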

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  
474399630 https://github.com/simonw/datasette/issues/420#issuecomment-474399630 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ3NDM5OTYzMA== simonw 9599 2019-03-19T14:38:14Z 2019-03-19T14:38:14Z OWNER

Most of these can be replaced with relatively straightforward direct introspection of the SQLite table.

The one exception is the incoming foreign keys: these can only be found by inspecting ALL of the other tables.

This requires running PRAGMA foreign_key_list([table_name]) against every other table in the database. How expensive is doing this on a database with hundreds of tables?
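A sketch of that scan using only the sqlite3 module. The function name and the shape of the returned dictionaries are hypothetical.

```python
import sqlite3


def incoming_foreign_keys(conn, table):
    """Find foreign keys in any table that point at `table`."""
    results = []
    all_tables = [
        r[0]
        for r in conn.execute("select name from sqlite_master where type = 'table'")
    ]
    for other in all_tables:
        sql = "PRAGMA foreign_key_list([{}])".format(other)
        for fk in conn.execute(sql):
            # Row layout: id, seq, table, from, to, on_update, on_delete, match
            if fk[2] == table:
                results.append(
                    {"other_table": other, "other_column": fk[3], "column": fk[4]}
                )
    return results
```

The cost is one PRAGMA call per table in the database, which is why the number of tables matters here.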

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  
474398127 https://github.com/simonw/datasette/issues/420#issuecomment-474398127 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ3NDM5ODEyNw== simonw 9599 2019-03-19T14:34:55Z 2019-03-19T14:34:55Z OWNER

I systematically reviewed the codebase for things that .inspect() is used for:

In app.py:

  • table_exists() uses table in self.inspect().get(database, {}).get("tables")
  • .execute() looks up the database name to get the info["file"] (the correct filename with the .db extension)

In cli.py:

  • The datasette inspect command dumps it to JSON
  • datasette skeleton iterates over it
  • datasette serve calls it on startup (to populate static cache of inspect data)

In base.py:

  • .database_url(database) calls it to look up the hash (if hash_urls config turned on)
  • .resolve_db_name() uses it to look up the hash

In database.py:

  • DatabaseView uses it to look up the list of tables and views to display, plus the size of the DB file in bytes
  • DatabaseDownload uses it to get the filepath for download

In index.py:

  • IndexView uses it extensively - to loop through every database and every table. This would make a good starting point for the refactor.

In table.py:

  • sortable_columns_for_table() uses it to find the columns in a table
  • expandable_columns() uses it to find foreign keys
  • expand_foreign_keys() uses it to find foreign keys
  • display_columns_and_rows() uses it to find primary keys and foreign keys... but also has access to a cursor.description which it uses to list the columns
  • TableView.data uses it to look up columns, primary keys and the table_rows_count (used if the thing isn't a view), and probably a few more things. This method is huge!
  • RowView.data uses it for primary keys
  • foreign_key_tables() uses it for foreign keys

In the tests it's used by test_api.test_inspect_json() and by a couple of tests in test_inspect.
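Several of the table.py lookups listed above (columns, primary keys) map directly onto PRAGMA table_info. Here is a sketch of what direct introspection could look like, with hypothetical helper names:

```python
import sqlite3


def table_columns(conn, table):
    # PRAGMA table_info row layout: cid, name, type, notnull, dflt_value, pk
    sql = "PRAGMA table_info([{}])".format(table)
    return [row[1] for row in conn.execute(sql)]


def primary_keys(conn, table):
    sql = "PRAGMA table_info([{}])".format(table)
    rows = conn.execute(sql).fetchall()
    # pk is 0 for non-key columns, otherwise the 1-based position in the key
    keyed = sorted((row[5], row[1]) for row in rows if row[5])
    return [name for _, name in keyed]
```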

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  
473744172 https://github.com/simonw/datasette/issues/420#issuecomment-473744172 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ3Mzc0NDE3Mg== simonw 9599 2019-03-18T02:08:12Z 2019-03-18T02:08:12Z OWNER

Maybe this is a good opportunity to improve the introspection capabilities in sqlite-utils and add it as a dependency.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  
473726587 https://github.com/simonw/datasette/issues/420#issuecomment-473726587 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ3MzcyNjU4Nw== simonw 9599 2019-03-17T23:29:22Z 2019-03-17T23:29:22Z OWNER

Needed for #419

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  
473713946 https://github.com/simonw/datasette/issues/420#issuecomment-473713946 https://api.github.com/repos/simonw/datasette/issues/420 MDEyOklzc3VlQ29tbWVudDQ3MzcxMzk0Ng== simonw 9599 2019-03-17T20:56:38Z 2019-03-17T20:58:17Z OWNER

Some examples:

https://github.com/simonw/datasette/blob/1f54e092306b208125f39d06712b02895eb75168/datasette/views/table.py#L34-L40

https://github.com/simonw/datasette/blob/1f54e092306b208125f39d06712b02895eb75168/datasette/views/table.py#L45-L48

https://github.com/simonw/datasette/blob/1f54e092306b208125f39d06712b02895eb75168/datasette/views/table.py#L62-L65

https://github.com/simonw/datasette/blob/1f54e092306b208125f39d06712b02895eb75168/datasette/views/table.py#L112-L123

https://github.com/simonw/datasette/blob/1f54e092306b208125f39d06712b02895eb75168/datasette/views/index.py#L11-L19

https://github.com/simonw/datasette/blob/afe9aa3ae03c485c5d6652741438d09445a486c1/datasette/views/base.py#L143-L147

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Fix all the places that currently use .inspect() data 421971339  


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);