home / github / issue_comments

Menu
  • Search all tables
  • GraphQL API

issue_comments: 546752311

This data as json

html_url issue_url id node_id user created_at updated_at author_association body reactions issue performed_via_github_app
https://github.com/simonw/datasette/issues/607#issuecomment-546752311 https://api.github.com/repos/simonw/datasette/issues/607 546752311 MDEyOklzc3VlQ29tbWVudDU0Njc1MjMxMQ== 8431341 2019-10-28T00:37:10Z 2019-10-28T00:37:10Z NONE

UPDATE: According to tips suggested in Squeezing Performance from SQLite: Indexes? Indexes! I have added an index to my large table and benchmarked query speeds in the case where I want to return all rows, rows exactly equal to 'Musk Elon' and, rows like 'musk'. Indexing reduced query time for each of those measures and dramatically reduced the time to return rows exactly equal to 'Musk Elon' as shown below:

table: edgar_idx rows: 16,428,090 rows indexed: False Return all rows where company name exactly equal to Musk Elon query: select rowid, * from edgar_idx where "company" = :p0 order by rowid limit 101 query time: Query took 21821.031ms

Return all rows where company name contains Musk query: select rowid, * from edgar_idx where "company" like :p0 order by rowid limit 101 query time: Query took 20505.029ms

Return everything query: select rowid, * from edgar_idx order by rowid limit 101 query time: Query took 7985.011ms

indexed: True Return all rows where company name exactly equal to Musk Elon query: select rowid, * from edgar_idx where "company" = :p0 order by rowid limit 101 query time: Query took 30.0ms

Return all rows where company name contains Musk query: select rowid, * from edgar_idx where "company" like :p0 order by rowid limit 101 query time: Query took 13340.019ms

Return everything query: select rowid, * from edgar_idx order by rowid limit 101 query time: Query took 2190.003ms

So indexing reduced query time for an exact match to "Musk Elon" from almost 22 seconds to 30.0ms. That's amazing and truly promising! However, an autocomplete feature relies on fuzzy / incomplete matching, which is more similar to the contains 'musk' query... Unfortunately, that takes 13 seconds even after indexing. So the hunt for a fast fuzzy / autocomplete search capability persists.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
512996469  
Powered by Datasette · Queries took 0.886ms · About: github-to-sqlite