issue_comments

12 rows where issue = 323718842 sorted by updated_at descending

user 3

  • simonw 8
  • mhalle 2
  • rayvoelker 2

author_association 2

  • OWNER 8
  • NONE 4

issue 1

  • Mechanism for ranking results from SQLite full-text search · 12
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
880153069 https://github.com/simonw/datasette/issues/268#issuecomment-880153069 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDg4MDE1MzA2OQ== simonw 9599 2021-07-14T19:31:00Z 2021-07-14T19:31:00Z OWNER

... though interestingly I can't replicate that error on latest.datasette.io - https://latest.datasette.io/fixtures/searchable?_search=park.&_searchmode=raw

That's running SQLite 3.35.4 (https://latest.datasette.io/-/versions), whereas https://www.niche-museums.com/-/versions is running 3.27.2 (the most recent version available with Vercel), but there's nothing in the SQLite changelog (https://www.sqlite.org/changes.html) between those two versions that suggests changes to how the FTS5 parser works.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
880150755 https://github.com/simonw/datasette/issues/268#issuecomment-880150755 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDg4MDE1MDc1NQ== simonw 9599 2021-07-14T19:26:47Z 2021-07-14T19:29:08Z OWNER

> What are the side-effects of turning that on in the query string, or even by default as you suggested? I see that you stated in the docs... "to ensure they do not cause any confusion for users who are not aware of them", but I'm not sure what those could be.

Mainly that it's possible to generate SQL queries that crash with an error. This was the example that convinced me to default to escaping:

  • https://www.niche-museums.com/browse/museums?_search=park.&_searchmode=raw (returns fts5: syntax error near ".")
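
The failure mode can be reproduced outside Datasette. The following is a minimal sketch, assuming a Python build whose bundled SQLite includes FTS5. The table name `museums_fts` and the helper `quote_fts_terms` are hypothetical, and the helper is a simplified stand-in for escaping, not Datasette's actual `escape_fts()` implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical FTS5 table; assumes the bundled SQLite was compiled with FTS5.
conn.execute("create virtual table museums_fts using fts5(name)")
conn.executemany(
    "insert into museums_fts (name) values (?)",
    [("Golden Gate Park",), ("Cable Car Museum",)],
)


def quote_fts_terms(query):
    """Simplified stand-in for escaping: wrap each whitespace-separated
    term in double quotes, doubling any embedded double quotes."""
    return " ".join('"{}"'.format(term.replace('"', '""')) for term in query.split())


def search(match_expression):
    sql = "select name from museums_fts where museums_fts match ?"
    return [row[0] for row in conn.execute(sql, (match_expression,))]


# Raw mode: the trailing "." is not valid FTS5 query syntax, so SQLite raises.
try:
    search("park.")
    raw_error = None
except sqlite3.OperationalError as ex:
    raw_error = str(ex)

# Escaped mode: the quoted string is run through the tokenizer,
# which discards the "." and matches the token "park".
escaped_results = search(quote_fts_terms("park."))
print(raw_error)
print(escaped_results)
```

Quoting every term trades away the FTS5 operators in exchange for queries that cannot fail on punctuation, which is the trade-off discussed in this comment.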
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
876721585 https://github.com/simonw/datasette/issues/268#issuecomment-876721585 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDg3NjcyMTU4NQ== rayvoelker 9308268 2021-07-08T20:22:17Z 2021-07-08T20:22:17Z NONE

I do like the idea of there being an option for turning that on by default so that you could use those terms in the default "Search" bar presented when you browse to a table where FTS has been enabled. Maybe even a small inline pop-up with a short bit explaining the FTS feature and the keywords (e.g. case matters). What are the side-effects of turning that on in the query string, or even by default as you suggested? I see that you stated in the docs... "to ensure they do not cause any confusion for users who are not aware of them", but I'm not sure what those could be.

Isn't it the case that those keywords are only picked up by SQLite where you're using the MATCH clause?
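
A small experiment can illustrate both points (keywords are case-sensitive, and they only have meaning inside MATCH). This is a sketch under the assumption that the bundled SQLite includes FTS5; the table `pets_fts` and the helper names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create virtual table pets_fts using fts5(body)")
conn.executemany(
    "insert into pets_fts (body) values (?)",
    [("dog",), ("cat",), ("dog or cat",)],
)


def match(expression):
    # The expression is interpreted by the FTS5 query parser.
    sql = "select body from pets_fts where pets_fts match ? order by rowid"
    return [row[0] for row in conn.execute(sql, (expression,))]


def equals(value):
    # Ordinary comparison: the value is never parsed as an FTS5 query.
    sql = "select body from pets_fts where body = ? order by rowid"
    return [row[0] for row in conn.execute(sql, (value,))]


upper_or = match("dog OR cat")          # uppercase OR is a boolean operator
lower_or = match("dog or cat")          # lowercase "or" is just another term
quoted_or = match('"dog" "OR" "cat"')   # quoting turns OR into a plain term
plain_eq = equals("dog OR cat")         # outside MATCH it is a plain string
print(upper_or, lower_or, quoted_or, plain_eq)
```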

Seems like a really powerful feature (even though there are a lot of hurdles around setting it up in the sqlite db ... sqlite-utils makes that so simple by the way!)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
876616414 https://github.com/simonw/datasette/issues/268#issuecomment-876616414 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDg3NjYxNjQxNA== simonw 9599 2021-07-08T17:29:04Z 2021-07-08T17:29:04Z OWNER

> I had set up a full text search on my instance of Datasette for title data for our public library, and was noticing that some of the features of the SQLite FTS weren't working as expected ... and maybe the issue is in the escape_fts() function

That's a deliberate feature (albeit controversial, see #759) - part of the main problem here is that it's easy to construct a SQLite full-text search string which results in a database error. This is a bad user experience!

You can opt in to raw SQL queries by appending ?_searchmode=raw to the page URL, see https://docs.datasette.io/en/stable/full_text_search.html#advanced-sqlite-search-queries

But maybe there should be an option for turning that on by default without needing the query string?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
876428348 https://github.com/simonw/datasette/issues/268#issuecomment-876428348 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDg3NjQyODM0OA== rayvoelker 9308268 2021-07-08T13:13:12Z 2021-07-08T13:13:12Z NONE

I had set up a full text search on my instance of Datasette for title data for our public library, and was noticing that some of the features of the SQLite FTS weren't working as expected ... and maybe the issue is in the escape_fts() function

vs removing the function...

Also, on the issue of sorting by rank by default... perhaps something like this could work for the baked-in default SQL query for Datasette?

link to the above search in my instance of Datasette

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
790257263 https://github.com/simonw/datasette/issues/268#issuecomment-790257263 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDc5MDI1NzI2Mw== mhalle 649467 2021-03-04T03:20:23Z 2021-03-04T03:20:23Z NONE

It's kind of an ugly hack, but you can try out what using the fts5 table as an actual datasette-accessible table looks like without changing any datasette code by creating yet another view on top of the fts5 table:

create view proxyview as select *, rank, table_fts as fts from table_fts;

That's now visible from datasette, just like any other view, but you can use fts match escape_fts(search_string) order by rank.

This is only good as a proof of concept because you're inefficiently going from view -> fts5 external content table -> view -> data table. However, it does show it works.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
789409126 https://github.com/simonw/datasette/issues/268#issuecomment-789409126 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDc4OTQwOTEyNg== mhalle 649467 2021-03-03T03:57:15Z 2021-03-03T03:58:40Z NONE

In FTS5, I think doing an FTS search is actually much easier than doing a join against the main table like datasette does now. In fact, FTS5 external content tables provide a transparent interface back to the original table or view.

Here's what I'm currently doing:

  • Build a view that joins whatever tables I want and rename the columns to non-joiny names (e.g., chapter.name AS chapter_name in the view where needed).
  • Create an FTS5 table with content="viewname".
  • As described in the "external content tables" section (https://www.sqlite.org/fts5.html#external_content_tables), SQL queries can be made directly to the FTS table, which behind the covers makes select calls to the content table when the content of the original columns is needed.
  • In addition, you get "rank" and "bm25()" available to you when you select on the _fts table.
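
The approach in the list above can be sketched roughly as follows. To keep the example small it uses a plain table rather than a view as the content source, which is a simplification and not the setup described here. The names `chapters` and `chapters_fts` are hypothetical, and an FTS5-enabled SQLite build is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table chapters (id integer primary key, chapter_name text, body text);
    insert into chapters (chapter_name, body) values
        ('Arrival', 'hotel hotel hotel lobby'),
        ('Journey', 'a long train ride that eventually ends near a hotel by the sea'),
        ('Harbour', 'boats and gulls'),
        ('Market', 'fruit stalls and spices'),
        ('Garden', 'roses and gravel paths');

    -- External content table: the index lives in chapters_fts, while
    -- column values are read back from chapters when they are selected.
    create virtual table chapters_fts using fts5(
        chapter_name, body, content='chapters', content_rowid='id'
    );
    insert into chapters_fts (rowid, chapter_name, body)
        select id, chapter_name, body from chapters;
    """
)

# Query the FTS table directly: no join back to chapters is written out,
# and both rank and bm25() are available alongside the content columns.
rows = conn.execute(
    """
    select chapter_name, rank, bm25(chapters_fts) as score
    from chapters_fts
    where chapters_fts match :search
    order by rank
    """,
    {"search": "hotel"},
).fetchall()
names = [row[0] for row in rows]
print(names)
```

With an external content table the index has to be kept in sync with the content source separately (for example with triggers or a rebuild), which this sketch does not cover.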

Unfortunately, datasette doesn't currently seem happy being coerced into doing a real query on an fts5 table. This works: select col1, col2, col3 from table_fts where col1='value' and table_fts match escape_fts('search term') order by rank

But this doesn't work in the datasette SQL query interface: select col1, col2, col3 from table_fts where col1='value' and table_fts match escape_fts(:search) order by rank (the "search" input text field doesn't show up)

For what datasette is doing right now, I think you could just use contentless fts5 tables (content=""), since all you care about is the rowid: all you're doing is a subselect to get the rowid anyway.
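
As a rough illustration of the contentless idea, the sketch below indexes a hypothetical `docs` table in a contentless FTS5 table and uses a rowid subselect to fetch the real rows. It assumes an FTS5-enabled SQLite build and is not taken from Datasette's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table docs (id integer primary key, title text);
    insert into docs (title) values
        ('Museum of the hotel lobby'), ('Railway archive'), ('Grand hotel');

    -- Contentless table: stores only the index, not the column values.
    create virtual table docs_fts using fts5(title, content='');
    insert into docs_fts (rowid, title) select id, title from docs;
    """
)

# The FTS table is only asked for rowids; the data comes from docs.
sql = """
select id, title from docs
where rowid in (select rowid from docs_fts where docs_fts match :search)
order by id
"""
matches = conn.execute(sql, {"search": "hotel"}).fetchall()

# Reading an indexed column straight from the contentless table yields NULL.
stored = conn.execute(
    "select title from docs_fts where docs_fts match 'hotel'"
).fetchall()
print(matches, stored)
```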

I guess if you want to follow this suggestion, you'd need a somewhat different code path for fts5.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
726419027 https://github.com/simonw/datasette/issues/268#issuecomment-726419027 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDcyNjQxOTAyNw== simonw 9599 2020-11-13T00:09:04Z 2020-11-13T00:09:04Z OWNER

Part of the challenge here is that this is the first time the TableView will have had a complete rewrite of the SQL it is going to execute. That SQL is currently constructed here: https://github.com/simonw/datasette/blob/5eb8e9bf250b26e30b017d39a392c33973997656/datasette/views/table.py#L628-L636

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
723740546 https://github.com/simonw/datasette/issues/268#issuecomment-723740546 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDcyMzc0MDU0Ng== simonw 9599 2020-11-09T04:01:50Z 2020-11-09T04:01:50Z OWNER

I should depend on sqlite-fts4 - I'm doing that in sqlite-utils now and it works great: https://github.com/simonw/sqlite-utils/issues/198

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
721896822 https://github.com/simonw/datasette/issues/268#issuecomment-721896822 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDcyMTg5NjgyMg== simonw 9599 2020-11-04T18:23:29Z 2020-11-04T18:23:29Z OWNER

Worth noting that joining to get the rank works for FTS5 but not for FTS4 - see comment here: https://github.com/simonw/sqlite-utils/issues/192#issuecomment-721420539

Easiest solution would be to only support sort-by-rank for FTS5 tables. Alternative would be to depend on https://github.com/simonw/sqlite-fts4
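
One way the "FTS5 only" option might be gated is by inspecting the table's CREATE statement. The helpers below (`fts_version_from_sql`, `supports_rank`) are hypothetical names for a sketch, not part of Datasette or sqlite-utils, and the regular expression is a deliberately simple assumption about how the statement is written.

```python
import re
import sqlite3

_FTS_RE = re.compile(r"\busing\s+(fts[345])\b", re.IGNORECASE)


def fts_version_from_sql(create_sql):
    """Return 'fts3', 'fts4' or 'fts5' for a CREATE VIRTUAL TABLE
    statement, or None if it does not look like an FTS table."""
    match = _FTS_RE.search(create_sql or "")
    return match.group(1).lower() if match else None


def supports_rank(conn, table):
    """Hypothetical check: only FTS5 tables expose a built-in rank column."""
    row = conn.execute(
        "select sql from sqlite_master where name = ?", (table,)
    ).fetchone()
    return row is not None and fts_version_from_sql(row[0]) == "fts5"


conn = sqlite3.connect(":memory:")
conn.execute("create table items (name text)")
# Assumes the bundled SQLite was compiled with FTS5.
conn.execute("create virtual table items_fts using fts5(name)")
print(supports_rank(conn, "items_fts"), supports_rank(conn, "items"))
```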

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
675725464 https://github.com/simonw/datasette/issues/268#issuecomment-675725464 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDY3NTcyNTQ2NA== simonw 9599 2020-08-18T21:18:07Z 2020-08-18T21:18:35Z OWNER

I want this on the table page - but that means that the table page will need to run a slightly more complex query since it needs access to a rank column to sort by - which it gets from running a join.

BUT... that join needs to be constructed in a way that keeps existing filters, ?_where= clauses etc intact.

Here's a prototype using SQLite CTEs: https://register-of-members-interests.datasettes.com/regmem?sql=with+original+as+%28select+rowid%2C++from+items%29%0D%0Aselect%0D%0A++original.%2C%0D%0A++items_fts.rank+as+items_fts_rank%0D%0Afrom%0D%0A++original+join+items_fts+on+original.rowid+%3D+items_fts.rowid%0D%0Awhere%0D%0A++items_fts+match+escape_fts%28%3Asearch%29%0D%0Aorder+by+items_fts_rank+desc+limit+10&search=hotel

with original as (
  select rowid, * from items
)
select
  original.*,
  items_fts.rank as items_fts_rank
from
  original join items_fts on original.rowid = items_fts.rowid
where
  items_fts match escape_fts(:search)
order by items_fts_rank desc limit 10
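
To show how existing filters could stay inside the CTE while the rank comes from the join, here is a self-contained sketch. The `category` filter, the sample rows and the `quote_terms` stand-in for `escape_fts()` are all assumptions made for illustration, and an FTS5-enabled SQLite build is assumed. The sketch sorts ascending, since FTS5 assigns numerically lower rank values to better matches, whereas the prototype above sorts descending.

```python
import sqlite3


def quote_terms(query):
    # Simplified stand-in for escape_fts(): quote every whitespace-separated
    # term. This is an assumption, not Datasette's actual implementation.
    return " ".join('"{}"'.format(t.replace('"', '""')) for t in query.split())


conn = sqlite3.connect(":memory:")
conn.create_function("escape_fts", 1, quote_terms)
conn.executescript(
    """
    create table items (name text, category text);
    insert into items (name, category) values
        ('hotel hotel hotel', 'gifts'),
        ('a weekend at a country house with a hotel style breakfast included', 'gifts'),
        ('hotel stay', 'travel'),
        ('theatre tickets', 'gifts'),
        ('train fare', 'travel');
    create virtual table items_fts using fts5(name, content='items');
    insert into items_fts (rowid, name) select rowid, name from items;
    """
)

sql = """
with original as (
  -- Existing table filters live here, untouched by the search logic.
  select rowid, * from items where category = :category
)
select
  original.*,
  items_fts.rank as items_fts_rank
from
  original join items_fts on original.rowid = items_fts.rowid
where
  items_fts match escape_fts(:search)
order by items_fts_rank  -- ascending: lower FTS5 rank means a better match
limit 10
"""
rows = conn.execute(sql, {"category": "gifts", "search": "hotel"}).fetchall()
names = [row[1] for row in rows]
print(names)
```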

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  
504880796 https://github.com/simonw/datasette/issues/268#issuecomment-504880796 https://api.github.com/repos/simonw/datasette/issues/268 MDEyOklzc3VlQ29tbWVudDUwNDg4MDc5Ng== simonw 9599 2019-06-24T06:47:23Z 2019-06-24T06:47:23Z OWNER

I did a bunch of research relevant to this a while ago: https://simonwillison.net/2019/Jan/7/exploring-search-relevance-algorithms-sqlite/

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Mechanism for ranking results from SQLite full-text search 323718842  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 137.021ms · About: github-to-sqlite