home / github / issues

Menu
  • Search all tables
  • GraphQL API

issues: 688670158

This data as json

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association pull_request body repo type active_lock_reason performed_via_github_app reactions draft state_reason
688670158 MDU6SXNzdWU2ODg2NzAxNTg= 147 SQLITE_MAX_VARS maybe hard-coded too low 96218 open 0     7 2020-08-30T07:26:45Z 2021-02-15T21:27:55Z   CONTRIBUTOR  

I came across this while about to open an issue and PR against the documentation for batch_size, which is a bit incomplete.

As mentioned in #145, while:

SQLITE_MAX_VARIABLE_NUMBER ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0.

it is common that it is increased at compile time. Debian-based systems, for example, seem to ship with a version of sqlite compiled with SQLITE_MAX_VARIABLE_NUMBER set to 250,000, and I believe this is the case for homebrew installations too.

In working to understand what batch_size was actually doing and why, I realized that by setting SQLITE_MAX_VARS in db.py to match the value my sqlite was compiled with (I'm on Debian), I was able to decrease the time to insert_all() my test data set (~128k records across 7 tables) from ~26.5s to ~3.5s. Given that this about .05% of my total dataset, this is time I am keen to save...

Unfortunately, it seems that sqlite3 in the python standard library doesn't expose the get_limit() C API (even though pysqlite used to), so it's hard to know what value sqlite has been compiled with (note that this could mean, I suppose, that it's less than 999, and even hardcoding SQLITE_MAX_VARS to the conservative default might not be adequate. It can also be lowered -- but not raised -- at runtime). The best I could come up with is echo "" | sqlite3 -cmd ".limits variable_number" (only available in sqlite >= 2015-05-07 (3.8.10)).

Obviously this couldn't be relied upon in sqlite_utils, but I wonder what your opinion would be about exposing SQLITE_MAX_VARS as a user-configurable parameter (with suitable "here be dragons" warnings)? I'm going to go ahead and monkey-patch it for my purposes in any event, but it seems like it might be worth considering.

140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   

Links from other tables

  • 1 row from issues_id in issues_labels
  • 7 rows from issue in issue_comments
Powered by Datasette · Queries took 2.018ms · About: github-to-sqlite