issue_comments

4 rows where issue = 473083260 sorted by updated_at descending

id: 1303660293
html_url: https://github.com/simonw/sqlite-utils/issues/50#issuecomment-1303660293
issue_url: https://api.github.com/repos/simonw/sqlite-utils/issues/50
node_id: IC_kwDOCGYnMM5NtEcF
user: chapmanjacobd (7908073)
created_at: 2022-11-04T14:38:36Z
updated_at: 2022-11-04T14:38:36Z
author_association: CONTRIBUTOR
body:

Where did you see the limit as 999? I believe the limit has been 32766 for quite some time. If you could detect which one it is, this could speed up batch inserts of some types of data significantly.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"Too many SQL variables" on large inserts 473083260  
id: 515752204
html_url: https://github.com/simonw/sqlite-utils/issues/50#issuecomment-515752204
issue_url: https://api.github.com/repos/simonw/sqlite-utils/issues/50
node_id: MDEyOklzc3VlQ29tbWVudDUxNTc1MjIwNA==
user: simonw (9599)
created_at: 2019-07-28T10:48:14Z
updated_at: 2019-07-28T10:48:14Z
author_association: OWNER
body:

Here's the diff where I tried to use .executemany() and ran into the lastrowid problem:

diff --git a/sqlite_utils/db.py b/sqlite_utils/db.py
index ef55976..7f85759 100644
--- a/sqlite_utils/db.py
+++ b/sqlite_utils/db.py
@@ -881,13 +881,10 @@ class Table:
                 or_what=or_what,
                 table=self.name,
                 columns=", ".join("[{}]".format(c) for c in all_columns),
-                rows=", ".join(
-                    """
+                rows="""
                     ({placeholders})
                 """.format(
-                        placeholders=", ".join(["?"] * len(all_columns))
-                    )
-                    for record in chunk
+                    placeholders=", ".join(["?"] * len(all_columns))
                 ),
             )
             values = []
@@ -902,15 +899,15 @@ class Table:
                         extract_table = extracts[key]
                         value = self.db[extract_table].lookup({"value": value})
                     record_values.append(value)
-                values.extend(record_values)
+                values.append(record_values)
             with self.db.conn:
                 try:
-                    result = self.db.conn.execute(sql, values)
+                    result = self.db.conn.executemany(sql, values)
                 except sqlite3.OperationalError as e:
                     if alter and (" has no column " in e.args[0]):
                         # Attempt to add any missing columns, then try again
                         self.add_missing_columns(chunk)
-                        result = self.db.conn.execute(sql, values)
+                        result = self.db.conn.executemany(sql, values)
                     else:
                         raise
             self.last_rowid = result.lastrowid

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"Too many SQL variables" on large inserts 473083260  
id: 515752129
html_url: https://github.com/simonw/sqlite-utils/issues/50#issuecomment-515752129
issue_url: https://api.github.com/repos/simonw/sqlite-utils/issues/50
node_id: MDEyOklzc3VlQ29tbWVudDUxNTc1MjEyOQ==
user: simonw (9599)
created_at: 2019-07-28T10:46:49Z
updated_at: 2019-07-28T10:46:49Z
author_association: OWNER
body:

The problem with .executemany() is it breaks lastrowid:

This read-only attribute provides the rowid of the last modified row. It is only set if you issued an INSERT or a REPLACE statement using the execute() method. For operations other than INSERT or REPLACE or when executemany() is called, lastrowid is set to None.

So I think I need to continue to use my existing way of executing bulk inserts (with a giant repeated INSERT INTO ... VALUES block) but ensure that I calculate the chunk size such that I don't ever try to pass more than 999 values at once.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"Too many SQL variables" on large inserts 473083260  
id: 515751719
html_url: https://github.com/simonw/sqlite-utils/issues/50#issuecomment-515751719
issue_url: https://api.github.com/repos/simonw/sqlite-utils/issues/50
node_id: MDEyOklzc3VlQ29tbWVudDUxNTc1MTcxOQ==
user: simonw (9599)
created_at: 2019-07-28T10:40:11Z
updated_at: 2019-07-28T10:40:11Z
author_association: OWNER
body:

I think the fix here is for me to switch to using executemany() - example from the Python docs: https://docs.python.org/3/library/sqlite3.html

purchases = [('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
             ('2006-04-05', 'BUY', 'MSFT', 1000, 72.00),
             ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
            ]
c.executemany('INSERT INTO stocks VALUES (?,?,?,?,?)', purchases)

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
"Too many SQL variables" on large inserts 473083260  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
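
The filtered, sorted view above (issue = 473083260, most recently updated first) can be reproduced against this schema with a query along these lines; a minimal sketch, assuming a hypothetical github.db file produced by github-to-sqlite:

import sqlite3

conn = sqlite3.connect("github.db")  # hypothetical path to the github-to-sqlite database
conn.row_factory = sqlite3.Row

comments = conn.execute(
    """
    SELECT id, [user], created_at, updated_at, author_association, body
    FROM issue_comments
    WHERE issue = ?
    ORDER BY updated_at DESC
    """,
    (473083260,),
).fetchall()

for comment in comments:
    print(comment["id"], comment["created_at"], comment["author_association"])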