issue_comments

10 rows where issue = 431800286 sorted by updated_at descending



user 1

  • simonw 10

issue 1

  • New design for facet abstraction, including querystring and metadata.json · 10

author_association 1

  • OWNER 10
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
488564761 https://github.com/simonw/datasette/issues/427#issuecomment-488564761 https://api.github.com/repos/simonw/datasette/issues/427 MDEyOklzc3VlQ29tbWVudDQ4ODU2NDc2MQ== simonw 9599 2019-05-02T06:24:49Z 2019-05-03T00:07:16Z OWNER

https://github.com/simonw/datasette/compare/facet-refactor-2 is almost ready to merge now. The remaining things to do are listed as TODOs there:

  • [x] Ensure facet is not suggested if it is already active
  • [x] Don't allow facets to be hidden if they were configured in metadata.json
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New design for facet abstraction, including querystring and metadata.json 431800286  
488564891 https://github.com/simonw/datasette/issues/427#issuecomment-488564891 https://api.github.com/repos/simonw/datasette/issues/427 MDEyOklzc3VlQ29tbWVudDQ4ODU2NDg5MQ== simonw 9599 2019-05-02T06:25:41Z 2019-05-02T06:25:41Z OWNER

It would be neat to ship at least one additional facet with this work - probably either ArrayFacet or DateFacet. I think ArrayFacet because it demonstrates the only-if-json1-enabled functionality.
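
A minimal sketch of what an only-if-json1-enabled check might look like, assuming the probe is simply to run a JSON function and see whether SQLite raises an error. The helper name detect_json1 is an assumption for illustration, not something this thread confirms.

```python
import sqlite3


def detect_json1(conn=None):
    """Return True if this SQLite build can run JSON functions."""
    if conn is None:
        conn = sqlite3.connect(":memory:")
    try:
        conn.execute("select json('{}')")
        return True
    except sqlite3.OperationalError:
        # "no such function: json" means the JSON functions are unavailable
        return False


# A hypothetical ArrayFacet could decline to suggest itself when this is False
print(detect_json1())
```

A facet class that depends on JSON support could call a check like this once at startup and skip its suggestions when the result is False.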

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New design for facet abstraction, including querystring and metadata.json 431800286  
482865424 https://github.com/simonw/datasette/issues/427#issuecomment-482865424 https://api.github.com/repos/simonw/datasette/issues/427 MDEyOklzc3VlQ29tbWVudDQ4Mjg2NTQyNA== simonw 9599 2019-04-13T18:56:25Z 2019-04-13T19:42:08Z OWNER

I think there's a Facet base class.

class ColumnFacet(Facet): is the default behaviour we have today

class ArrayFacet(Facet): facet by JSON array

class ManyToManyFacet(Facet): facet by M2M table

class DateFacet(Facet): facet by date

class DateTimeFacet(Facet): facet by datetime

class EmojiFacet(Facet): super-fun demo plugin I have planned

Could even have a facet against a numerical column which loads the entire set of column values into numpy or pandas and calculates complex statistics facets in memory.

There’s actually a lot of potential for Datasette plugins that load several MBs of data and analyze using other Python libraries.
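
As a rough illustration of the in-memory statistics idea, the sketch below loads every value of a numeric column and summarises it in Python. It uses the standard library statistics module as a stand-in for numpy or pandas so that it stays self-contained; the function name numeric_stats and the trees table are made up for the example.

```python
import sqlite3
import statistics


def numeric_stats(conn, table, column):
    """Load every non-null value of a column and summarise it in memory."""
    # Identifiers are wrapped in [] here; this sketch trusts its inputs
    sql = "select [{c}] from [{t}] where [{c}] is not null".format(c=column, t=table)
    values = [row[0] for row in conn.execute(sql)]
    if not values:
        return {}
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


conn = sqlite3.connect(":memory:")
conn.execute("create table trees (height integer)")
conn.executemany("insert into trees values (?)", [(3,), (5,), (8,), (None,), (12,)])
stats = numeric_stats(conn, "trees", "height")
print(stats)
```

Pulling the whole column into memory is only reasonable for tables of modest size, which matches the "several MBs of data" scale mentioned above.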

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New design for facet abstraction, including querystring and metadata.json 431800286  
482864457 https://github.com/simonw/datasette/issues/427#issuecomment-482864457 https://api.github.com/repos/simonw/datasette/issues/427 MDEyOklzc3VlQ29tbWVudDQ4Mjg2NDQ1Nw== simonw 9599 2019-04-13T18:51:44Z 2019-04-13T18:57:51Z OWNER

A facet needs to:

  • Given a SQL query and a list of configs, return a list of buckets
  • Know how to generate URLs for selecting and deselecting a filter (along with the underlying filter application SQL logic)
  • Tell if a specific filter is currently selected or not
  • Set a time limit and report if it times out
  • Generate human-readable labels
  • In some cases, expand foreign keys. This means facets need access to foreign key information: just the name of the table and the name of the column is enough to call expand_foreign_keys() (I moved that to the Datasette class to make it easier to access)
  • Make suggestions for facets. Let's give it access to the whole table here, so it could either run against each column in turn and reply with a list of suggestions, or it could spot e.g. a latitude and a longitude column
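
The responsibilities listed in this comment could map onto a class roughly like the sketch below. This is one possible shape under stated assumptions, not the design that was merged: every method name here (results, suggest, is_selected, toggle_url) is hypothetical, and the time limit and foreign key expansion are left out to keep it short.

```python
import sqlite3
from urllib.parse import parse_qsl, urlencode


class Facet:
    """Hypothetical base class; method names are illustrative only."""

    type = None

    def __init__(self, conn, sql, querystring, configs):
        self.conn = conn
        self.sql = sql  # the query whose results are being faceted
        self.pairs = parse_qsl(querystring)
        self.configs = configs  # for ColumnFacet: a list of column names

    def suggest(self):
        raise NotImplementedError

    def results(self):
        raise NotImplementedError


class ColumnFacet(Facet):
    """Exact-match facet: one bucket per distinct column value."""

    type = "column"

    def is_selected(self, column, value):
        return (column, str(value)) in self.pairs

    def toggle_url(self, column, value):
        pair = (column, str(value))
        if pair in self.pairs:
            pairs = [p for p in self.pairs if p != pair]  # deselect
        else:
            pairs = self.pairs + [pair]  # select
        return "?" + urlencode(pairs)

    def results(self):
        buckets = []
        for column in self.configs:
            rows = self.conn.execute(
                "select [{c}] as value, count(*) as n from ({sql}) "
                "group by [{c}] order by n desc, value".format(c=column, sql=self.sql)
            ).fetchall()
            for value, n in rows:
                buckets.append({
                    "column": column,
                    "label": str(value),
                    "count": n,
                    "selected": self.is_selected(column, value),
                    "toggle_url": self.toggle_url(column, value),
                })
        return buckets


conn = sqlite3.connect(":memory:")
conn.execute("create table trees (species text)")
conn.executemany("insert into trees values (?)", [("oak",), ("oak",), ("elm",)])
facet = ColumnFacet(conn, "select * from trees", "_sort=species", ["species"])
for bucket in facet.results():
    print(bucket)
```

Keeping URL generation and selection detection on the facet class itself means a non-exact-match facet can override both without the view code needing to know how that facet is applied.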

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New design for facet abstraction, including querystring and metadata.json 431800286  
482864837 https://github.com/simonw/datasette/issues/427#issuecomment-482864837 https://api.github.com/repos/simonw/datasette/issues/427 MDEyOklzc3VlQ29tbWVudDQ4Mjg2NDgzNw== simonw 9599 2019-04-13T18:53:43Z 2019-04-13T18:53:43Z OWNER

TableView.data is currently the longest, hairiest method in the codebase. It's 775 - 177 = 598 lines of code! Extracting faceting logic should help reduce that quite a bit.

https://github.com/simonw/datasette/blob/274ef43bb7b129ddc2e68805b4f4ff3776fb9503/datasette/views/table.py#L177-L775

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New design for facet abstraction, including querystring and metadata.json 431800286  
482627099 https://github.com/simonw/datasette/issues/427#issuecomment-482627099 https://api.github.com/repos/simonw/datasette/issues/427 MDEyOklzc3VlQ29tbWVudDQ4MjYyNzA5OQ== simonw 9599 2019-04-12T15:54:41Z 2019-04-12T15:54:41Z OWNER

Bonus idea: since we will have a Facet abstraction, we should allow additional facet types to be registered by plugins.

Fun idea for a (very inefficient) demo plugin: facet-by-emoji! Would work by counting all emoji in text fields using a horribly slow full-scan regular expression, then would apply selected emoji facets using a LIKE query.
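
A toy version of the facet-by-emoji idea might look like the following. The regular expression only covers a few common Unicode blocks, so it is an approximation of "emoji" rather than a complete definition, and the function names and the comments table are invented for the example.

```python
import re
import sqlite3
from collections import Counter

# Rough approximation of "emoji": a few common Unicode blocks, not the full set
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

ROCKET = "\U0001F680"
PARTY = "\U0001F389"


def emoji_counts(conn, table, column):
    """Full scan: count how many rows contain each emoji."""
    counts = Counter()
    sql = "select [{}] from [{}]".format(column, table)
    for (text,) in conn.execute(sql):
        # set() so that a row is counted once per emoji, however often it repeats
        counts.update(set(EMOJI_RE.findall(text or "")))
    return counts.most_common()


def rows_with_emoji(conn, table, column, emoji):
    """Apply the selected facet value using a LIKE query."""
    sql = "select * from [{}] where [{}] like ?".format(table, column)
    return conn.execute(sql, ["%{}%".format(emoji)]).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("create table comments (body text)")
conn.executemany(
    "insert into comments values (?)",
    [("Ship it " + ROCKET,), (ROCKET + ROCKET + " to the moon " + PARTY,), ("no emoji here",)],
)
print(emoji_counts(conn, "comments", "body"))
print(len(rows_with_emoji(conn, "comments", "body", ROCKET)))
```

Both steps scan every row, which is exactly the inefficiency the comment warns about; the point is only to show that counting and applying can use different mechanisms.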

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New design for facet abstraction, including querystring and metadata.json 431800286  
482626534 https://github.com/simonw/datasette/issues/427#issuecomment-482626534 https://api.github.com/repos/simonw/datasette/issues/427 MDEyOklzc3VlQ29tbWVudDQ4MjYyNjUzNA== simonw 9599 2019-04-12T15:52:53Z 2019-04-12T15:52:53Z OWNER

I just realized: a key part of faceting is being able to correctly apply the facet (and know that it has been applied).

Existing facets are exact match only, so they can be applied and detected with ?foo=bar

More advanced facets like _facet_array and _facet_m2m will need different ways of applying themselves. This needs to be bundled up in the new Facet abstraction somehow.
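
To make the difference concrete, here is a sketch of how an exact-match filter and an array filter might each be turned into SQL and detected in the querystring. The __arraycontains suffix and both function names are assumptions for illustration only, and the array clause relies on SQLite's json_each function being available.

```python
from urllib.parse import parse_qsl

ARRAY_SUFFIX = "__arraycontains"  # assumed naming, for illustration only


def where_clauses(querystring):
    """Translate querystring pairs into (sql_fragment, param) tuples."""
    clauses = []
    for key, value in parse_qsl(querystring):
        if key.startswith("_"):
            continue  # _facet, _sort and friends are not filters
        if key.endswith(ARRAY_SUFFIX):
            column = key[: -len(ARRAY_SUFFIX)]
            # Needs SQLite's JSON functions (json_each)
            sql = "? in (select value from json_each([{}]))".format(column)
        else:
            sql = "[{}] = ?".format(key)
        clauses.append((sql, value))
    return clauses


def is_applied(querystring, key, value):
    """A facet value counts as applied if its pair is in the querystring."""
    return (key, value) in parse_qsl(querystring)


qs = "_facet=state&state=CA&tags__arraycontains=oak"
print(where_clauses(qs))
print(is_applied(qs, "state", "CA"))
```

The exact-match case needs nothing more than ?foo=bar, while the array case needs both a distinct querystring key and a distinct SQL fragment, which is why each facet type has to own its own application logic.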

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New design for facet abstraction, including querystring and metadata.json 431800286  
481957313 https://github.com/simonw/datasette/issues/427#issuecomment-481957313 https://api.github.com/repos/simonw/datasette/issues/427 MDEyOklzc3VlQ29tbWVudDQ4MTk1NzMxMw== simonw 9599 2019-04-11T04:07:00Z 2019-04-11T04:07:40Z OWNER

This means the metadata.json format can look like this:

{
    "databases": {
        "sf-trees": {
            "tables": {
                "Street_Tree_List": {
                    "facets": [
                        "qLegalStatus",
                        {"array": "tags"},
                        {"percentile": {"blah": "options"}}
                    ]
                }
            }
        }
    }
}

So any advanced facets are represented here as a dictionary with a single key - the type - that maps to the options.
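
A small sketch of how such a "facets" list could be normalised into (type, options) pairs. Treating a bare string as the "column" type is an assumption based on ColumnFacet being described as the default behaviour; the function name normalize_facets is hypothetical.

```python
import json

METADATA = json.loads("""
{
    "databases": {
        "sf-trees": {
            "tables": {
                "Street_Tree_List": {
                    "facets": ["qLegalStatus", {"array": "tags"}, {"percentile": {"blah": "options"}}]
                }
            }
        }
    }
}
""")


def normalize_facets(facets):
    """Turn each metadata entry into a (facet_type, options) pair."""
    normalized = []
    for entry in facets:
        if isinstance(entry, str):
            # A bare string is assumed to mean the default column facet
            normalized.append(("column", entry))
        elif isinstance(entry, dict) and len(entry) == 1:
            facet_type, options = next(iter(entry.items()))
            normalized.append((facet_type, options))
        else:
            raise ValueError("Invalid facet config: {!r}".format(entry))
    return normalized


table = METADATA["databases"]["sf-trees"]["tables"]["Street_Tree_List"]
print(normalize_facets(table["facets"]))
```

Rejecting dictionaries with more than one key keeps the "single key is the type" rule unambiguous.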

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New design for facet abstraction, including querystring and metadata.json 431800286  
481957014 https://github.com/simonw/datasette/issues/427#issuecomment-481957014 https://api.github.com/repos/simonw/datasette/issues/427 MDEyOklzc3VlQ29tbWVudDQ4MTk1NzAxNA== simonw 9599 2019-04-11T04:05:07Z 2019-04-11T04:05:07Z OWNER

OK, I have a plan:

?_facet=foo
?_facet_facettype=options

Options here can be one of the following:

  • A single value which is the name of a column
  • A comma separated list of options
  • A JSON object starting with { or [

If the column name itself contains a ,, { or [ then you have to escape it by putting it in a JSON object, for example ?_facet_percentile={"column":"{this_is,a_weird[column_name"}.
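
The parsing rules in this plan could be sketched as follows. The function names are hypothetical, and mapping a plain ?_facet=foo to a "column" type is an assumption; the three cases for the options value follow the list in this comment.

```python
import json
from urllib.parse import parse_qsl

PREFIX = "_facet_"


def parse_options(raw):
    """Apply the three rules: JSON, comma separated list, or single value."""
    if raw.startswith(("{", "[")):
        return json.loads(raw)
    if "," in raw:
        return raw.split(",")
    return raw


def facets_from_querystring(querystring):
    """Return a list of (facet_type, options) pairs in querystring order."""
    facets = []
    for key, value in parse_qsl(querystring):
        if key == "_facet":
            facets.append(("column", value))  # assumed default type
        elif key.startswith(PREFIX):
            facets.append((key[len(PREFIX):], parse_options(value)))
    return facets


qs = '_facet=foo&_facet_array=tags&_facet_percentile={"column":"{this_is,a_weird[column_name"}'
print(facets_from_querystring(qs))
```

Checking for a leading { or [ before looking for commas is what lets the JSON form act as an escape hatch for awkward names.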

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New design for facet abstraction, including querystring and metadata.json 431800286  
481940539 https://github.com/simonw/datasette/issues/427#issuecomment-481940539 https://api.github.com/repos/simonw/datasette/issues/427 MDEyOklzc3VlQ29tbWVudDQ4MTk0MDUzOQ== simonw 9599 2019-04-11T02:26:43Z 2019-04-11T02:26:43Z OWNER

I quite like the Solr idea. It could look like this for Datasette:

?_facet=name - default behaviour, same as today. But that's actually an alias for ?_facet.name=name - which defines a name for the facet.

?_facet.tags.array=tags - would define a facet called tags that uses an array facet against the tags column.

I don't like the need to say tags twice in that though.
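
For comparison, the Solr-style keys floated in this comment might be parsed along these lines. This is only a sketch of the idea as described here; the function name and the shape of the returned dictionary are made up for the example.

```python
from urllib.parse import parse_qsl


def parse_solr_style(querystring):
    """Map facet name -> {"type": ..., "column": ...} from dotted keys."""
    facets = {}
    for key, value in parse_qsl(querystring):
        if key == "_facet":
            # ?_facet=name is shorthand for ?_facet.name=name
            facets[value] = {"type": "column", "column": value}
        elif key.startswith("_facet."):
            parts = key.split(".")[1:]
            name = parts[0]
            facet_type = parts[1] if len(parts) > 1 else "column"
            facets[name] = {"type": facet_type, "column": value}
    return facets


print(parse_solr_style("_facet=name&_facet.tags.array=tags"))
```

The repetition the comment objects to shows up directly: "tags" appears once as the facet name in the key and again as the column in the value.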

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
New design for facet abstraction, including querystring and metadata.json 431800286  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 206.989ms · About: github-to-sqlite