issue_comments

4 rows where issue = 607770595 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
620177365 https://github.com/simonw/datasette/issues/743#issuecomment-620177365 https://api.github.com/repos/simonw/datasette/issues/743 MDEyOklzc3VlQ29tbWVudDYyMDE3NzM2NQ== simonw 9599 2020-04-27T19:11:01Z 2020-04-27T19:11:30Z OWNER

Huh... turns out the documentation already claims that wildcards work! Closing this as wontfix:

https://datasette.readthedocs.io/en/stable/full_text_search.html#the-table-view-api

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
escape_fts() does not correctly escape * wildcards 607770595  
620174977 https://github.com/simonw/datasette/issues/743#issuecomment-620174977 https://api.github.com/repos/simonw/datasette/issues/743 MDEyOklzc3VlQ29tbWVudDYyMDE3NDk3Nw== simonw 9599 2020-04-27T19:05:56Z 2020-04-27T19:05:56Z OWNER

The other option would be to leave this as-is, and let people wildcard search all they want. I'm leaning in that direction.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
escape_fts() does not correctly escape * wildcards 607770595  
620170826 https://github.com/simonw/datasette/issues/743#issuecomment-620170826 https://api.github.com/repos/simonw/datasette/issues/743 MDEyOklzc3VlQ29tbWVudDYyMDE3MDgyNg== simonw 9599 2020-04-27T18:58:04Z 2020-04-27T18:58:04Z OWNER

Maybe this is moot because you can't store a * character in an FTS table anyway, so it would never make sense to search for one? In which case maybe escape_fts() should just strip out * entirely?

Best source of information I could find was this tiny thread from 2014 about FTS4:

http://sqlite.1065341.n5.nabble.com/Escaping-conventions-for-FTS4-virtual-table-queries-td74589.html

Dave Baggett wrote:

> What if I want docids of documents containing the exact literal token "any*"?

You would have to use one of the Unicode tokenizers, and configure it to interpret * as a token character.

> how do I escape the asterisk so that it's not interpreted as a wildcard?

There are no escapes. When * is a token character, you lose the ability to do prefix searches.

I could investigate further by learning to use the fts5vocab virtual table debugging tool to see what's actually stored in those FTS5 indexes and check if * is indeed stripped by them.

https://www.sqlite.org/fts5.html#the_fts5vocab_virtual_table_module
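A minimal sketch of that investigation, assuming a Python build whose SQLite includes FTS5. The helper name `stored_terms`, the table names `docs` and `docs_vocab`, and the sample text are all hypothetical; only the `fts5vocab` module and the `unicode61` `tokenchars` option come from the SQLite FTS5 documentation.

```python
import sqlite3


def stored_terms(text, tokenize=None):
    """Index `text` in a throwaway FTS5 table and return the terms fts5vocab reports."""
    conn = sqlite3.connect(":memory:")
    options = ""
    if tokenize is not None:
        # Tokenizer spec is interpolated directly; acceptable only in a local sketch.
        options = ', tokenize = "{}"'.format(tokenize)
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(body{})".format(options))
    conn.execute("INSERT INTO docs (body) VALUES (?)", (text,))
    # A 'row' type fts5vocab table has one row per distinct term in the index.
    conn.execute("CREATE VIRTUAL TABLE docs_vocab USING fts5vocab(docs, row)")
    terms = [
        row[0]
        for row in conn.execute("SELECT term FROM docs_vocab ORDER BY term")
    ]
    conn.close()
    return terms
```

Calling `stored_terms("any* thing")` shows what the default tokenizer keeps, and `stored_terms("any* thing", "unicode61 tokenchars '*'")` shows the effect of treating * as a token character, as suggested in the thread quoted above.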

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
escape_fts() does not correctly escape * wildcards 607770595  
620166959 https://github.com/simonw/datasette/issues/743#issuecomment-620166959 https://api.github.com/repos/simonw/datasette/issues/743 MDEyOklzc3VlQ29tbWVudDYyMDE2Njk1OQ== simonw 9599 2020-04-27T18:50:30Z 2020-04-27T18:50:30Z OWNER

Here's the escape_fts() function: https://github.com/simonw/datasette/blob/89c4ddd4828623888e91a1d2cb396cba12d4e7b4/datasette/utils/__init__.py#L742-L753

https://latest.datasette.io/fixtures?sql=select+escape_fts%28%27bar%2A%27%29

So apparently wrapping a SQLite FTS search term in double quotes, as in "bar*", doesn't prevent SQLite from expanding the wildcard.
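One way to check this locally is sketched below, assuming a Python build whose SQLite includes FTS4. `quote_terms` is a hypothetical, simplified stand-in for escape_fts() (it is not the Datasette implementation linked above), and the `docs` table and its three rows are made up for the experiment.

```python
import sqlite3


def quote_terms(query):
    """Wrap each whitespace-separated term in double quotes (simplified stand-in)."""
    return " ".join(
        '"{}"'.format(term.replace('"', "")) for term in query.split()
    )


def search(query):
    """Run the quoted query against a throwaway FTS4 table and return matching rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts4(body)")
    conn.executemany(
        "INSERT INTO docs (body) VALUES (?)", [("bar",), ("barry",), ("foo",)]
    )
    rows = conn.execute(
        "SELECT body FROM docs WHERE docs MATCH ? ORDER BY body",
        (quote_terms(query),),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

If the behaviour described above holds on your SQLite build, `search("bar*")` should return the "barry" row as well as "bar", even though the term passed to MATCH is the quoted string `"bar*"`.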

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
escape_fts() does not correctly escape * wildcards 607770595  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 21.217ms · About: github-to-sqlite