
issues


494 rows where repo = 140912432 and type = "issue" sorted by updated_at descending


state
  • closed · 409
  • open · 85

type
  • issue · 494

repo
  • sqlite-utils · 494
Columns: id, node_id, number, title, user, state, locked, assignee, milestone, comments, created_at, updated_at (sorted descending), closed_at, author_association, pull_request, body, repo, type, active_lock_reason, performed_via_github_app, reactions, draft, state_reason
1066474200 I_kwDOCGYnMM4_kRrY 344 Support STRICT tables simonw 9599 closed 0     14 2021-11-29T20:32:23Z 2023-12-08T05:22:39Z 2023-12-08T05:22:39Z OWNER  

New in SQLite 3.37.0, released a few days ago: https://www.sqlite.org/stricttables.html
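
A quick illustration (my own sketch, not from the issue) of what the new syntax does, assuming the SQLite library available to Python's sqlite3 module is 3.37.0 or newer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
print(sqlite3.sqlite_version)  # needs to be 3.37.0 or newer for STRICT to work

# A STRICT table rejects values that don't match the declared column types,
# instead of silently coercing or storing them as-is.
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, name TEXT) STRICT")
conn.execute("INSERT INTO demo (id, name) VALUES (1, 'ok')")

# On a STRICT table this next statement would raise an error rather than
# storing a text value in an INTEGER column:
# conn.execute("INSERT INTO demo (id, name) VALUES ('not-an-int', 'x')")
```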

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/344/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1988525411 I_kwDOCGYnMM52hn1j 603 Python 3.12 Bug report constantinedev 1324252 open 0     1 2023-11-10T22:57:48Z 2023-12-08T05:10:31Z   NONE  

I started with the new Python version 3.12.0 and hit this error whenever I connect to a database:

Traceback (most recent call last):
  File "/home/t/Development/python/FKPJ/ClinicSYS/run.py", line 1, in <module>
    import re, os, io, json, sqlite_utils, requests, pytz, logging
  File "/home/t/.local/lib/python3.12/site-packages/sqlite_utils/__init__.py", line 1, in <module>
    from .db import Database
  File "/home/t/.local/lib/python3.12/site-packages/sqlite_utils/db.py", line 277, in <module>
    class Database:
  File "/home/t/.local/lib/python3.12/site-packages/sqlite_utils/db.py", line 306, in Database
    filename_or_conn: Optional[Union[str, pathlib.Path, sqlite3.Connection]] = None,
                                                         ^^^^^^^^^^^^^^^^^^

This bug appears in sqlite-utils since v3.33. Has anyone else run into it?

For now the workaround is to keep sqlite-utils pinned to v3.32.1 [tested] on Python 3.12, but the sqlite3.Connection problem remains.

This doesn't happen on Python versions down to 3.11 [tested], only on Python 3.12.0. I have tested that the error comes from the sqlite3 connection: the traceback points at sqlite_utils and the sqlite3.Connection annotation. What can I do?

Let's fix this together.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/603/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
2007893839 I_kwDOCGYnMM53rgdP 605 Insert fails with `Error: Python int too large to convert to SQLite INTEGER`; can we use `NUMERIC` here? Zac-HD 12229877 closed 0     1 2023-11-23T10:19:46Z 2023-12-08T05:07:54Z 2023-12-08T05:07:54Z NONE  

I'm currently working on a new feature for Hypothesis, where we can dump a tidy jsonlines table of all the test cases we tried - including arguments, outcomes, timings, coverage, etc. Exploring this seems like a perfect case for sqlite-utils and datasette, but I pretty quickly ran into an integer overflow problem and don't want to recommend that experience to my users.

I originally went to report this as a bug... and then found https://github.com/simonw/sqlite-utils/issues/309#issuecomment-895581038 almost exactly matched my repro 😅

https://github.com/simonw/sqlite-utils/issues/110#issuecomment-626391063 suggests that using NUMERIC would avoid this overflow error, although "If the TEXT value is a well-formed integer literal that is too large to fit in a 64-bit signed integer, it is converted to REAL." suggests that this would come at the cost of rounding to the nearest float value. Maybe I should just convert large integers to float before writing out my json?

After a bit more hacking, "manually cast large integers to float" seems like a decent solution for my particular case, but having written it up I thought I might as well post this issue anyway - I hope it's useful feedback, and won't mind at all if you close as wontfix if it's not.
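
A small sketch of the "manually cast large integers to float" workaround described above (the clamp() helper is my own name, not part of sqlite-utils):

```python
import sqlite_utils

db = sqlite_utils.Database(memory=True)
big = 2**63  # one past the largest 64-bit signed integer SQLite can store

# db["t"].insert({"n": big})  # this is what triggers "Python int too large to convert to SQLite INTEGER"

def clamp(value):
    # Cast integers outside SQLite's signed 64-bit range to float,
    # accepting the precision loss that comes with a double.
    if isinstance(value, int) and not (-2**63 <= value < 2**63):
        return float(value)
    return value

db["t"].insert({"n": clamp(big)})
print(list(db["t"].rows))  # [{'n': 9.223372036854776e+18}]
```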

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/605/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
2029161033 I_kwDOCGYnMM548opJ 606 str and int as aliases for text and integer simonw 9599 closed 0     2 2023-12-06T18:35:49Z 2023-12-06T19:44:04Z 2023-12-06T18:49:32Z OWNER  

I keep making this mistake:

```bash
sqlite-utils add-column content.db assets _since int
```

```
Usage: sqlite-utils add-column [OPTIONS] PATH TABLE COL_NAME
                               [[integer|float|blob|text|INTEGER|FLOAT|BLOB|TEXT]]
Try 'sqlite-utils add-column -h' for help.

Error: Invalid value for '[[integer|float|blob|text|INTEGER|FLOAT|BLOB|TEXT]]': 'int' is not one of 'integer', 'float', 'blob', 'text', 'INTEGER', 'FLOAT', 'BLOB', 'TEXT'.
```
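
A hypothetical sketch of what the requested aliasing could look like - the ALIASES mapping and resolve_column_type() are illustrative names, not the actual sqlite-utils implementation:

```python
# Map Python-style names onto the SQLite type names before validating.
ALIASES = {"str": "text", "int": "integer", "bytes": "blob"}

def resolve_column_type(value: str) -> str:
    value = value.lower()
    value = ALIASES.get(value, value)
    if value not in {"integer", "float", "blob", "text"}:
        raise ValueError(f"Invalid column type: {value}")
    return value

print(resolve_column_type("int"))   # integer
print(resolve_column_type("TEXT"))  # text
```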

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/606/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1978603203 I_kwDOCGYnMM517xbD 602 `sqlite-utils transform` removes the `AUTOINCREMENT` keyword ArsTapatun 4472046 open 0     0 2023-11-06T08:48:43Z 2023-11-06T08:48:43Z   NONE  

Context

We ran into this bug randomly, noticing that deleted ROWID would get reused after migrating the DB. Using transform to change any column in the table will also unexpectedly strip away the AUTOINCREMENT keyword from the primary key definition, even if it was not the transformation target.

Reproducible example

Original database

```sql
$ sqlite3 test.db << EOF
CREATE TABLE mytable (
    col1 INTEGER PRIMARY KEY AUTOINCREMENT,
    col2 TEXT NOT NULL
)
EOF

$ sqlite3 test.db ".schema mytable"
CREATE TABLE mytable (
    col1 INTEGER PRIMARY KEY AUTOINCREMENT,
    col2 TEXT NOT NULL
);
```

Modified database after sqlite-utils

```sql
$ sqlite-utils transform test.db mytable --rename col2 renamedcol2

$ sqlite3 test.db "SELECT sql FROM sqlite_master WHERE name = 'mytable';"
CREATE TABLE IF NOT EXISTS "mytable" (
    [col1] INTEGER PRIMARY KEY,
    [renamedcol2] TEXT NOT NULL
);
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/602/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1977155641 I_kwDOCGYnMM512QA5 601 Move plugin directory into documentation simonw 9599 open 0     0 2023-11-04T04:07:52Z 2023-11-04T04:07:52Z   OWNER  

https://github.com/simonw/sqlite-utils-plugins should be in the official documentation.

I can use the same pattern as https://llm.datasette.io/en/stable/plugins/directory.html

https://til.simonwillison.net/readthedocs/stable-docs

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/601/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1976986318 I_kwDOCGYnMM511mrO 599 Cannot find spatialite on arm64 linux MikeCoats 37802088 closed 0     1 2023-11-03T22:05:51Z 2023-11-04T01:06:31Z 2023-11-04T00:33:28Z CONTRIBUTOR  

Initially, I found an issue in datasette where it wouldn’t find spatialite when running on my Radxa Rock 5B - an RK3588 powered SBC, running the arm64 build of Debian Bullseye. I confirmed the same behaviour on my Raspberry Pi 4 - a BCM2711 powered SBC, running the arm64 build of Debian Bookworm.

$ datasette --load-extension=spatialite example.db Error: Could not find SpatiaLite extension

I did some digging and realised the issue originates in this project. Even with the libsqlite3-mod-spatialite package installed, pytest skips all of the GIS tests in the project.

```
$ apt list --installed | grep spatial
[…]
libsqlite3-mod-spatialite/stable,now 5.0.1-3 arm64 [installed]

$ ls -l /usr/lib/*/spatial*
lrwxrwxrwx 1 root root      23 Dec  1  2022 /usr/lib/aarch64-linux-gnu/mod_spatialite.so -> mod_spatialite.so.7.1.0
lrwxrwxrwx 1 root root      23 Dec  1  2022 /usr/lib/aarch64-linux-gnu/mod_spatialite.so.7 -> mod_spatialite.so.7.1.0
-rw-r--r-- 1 root root 7348584 Dec  1  2022 /usr/lib/aarch64-linux-gnu/mod_spatialite.so.7.1.0

$ pytest
tests/test_get.py ......          [ 73%]
tests/test_gis.py ssssssssssss    [ 75%]
tests/test_hypothesis.py ....     [ 75%]
```

I tracked the issue down to the find_sqlite() function in the utils.py file. The SPATIALITE_PATHS array doesn’t have an entry for the location of this module on arm64 linux.
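
A sketch of the kind of fix being suggested: the SPATIALITE_PATHS list mentioned above would simply gain the Debian arm64 location from the ls output. The list contents shown here are illustrative, not the real ones from sqlite_utils/utils.py:

```python
import os

SPATIALITE_PATHS = (
    "/usr/lib/x86_64-linux-gnu/mod_spatialite.so",
    "/usr/local/lib/mod_spatialite.dylib",
    # The missing entry that would cover arm64 Debian (Radxa Rock 5B, Raspberry Pi 4):
    "/usr/lib/aarch64-linux-gnu/mod_spatialite.so",
)

def find_spatialite():
    # Mirrors the behaviour described in the issue: return the first path that exists.
    for path in SPATIALITE_PATHS:
        if os.path.exists(path):
            return path
    return None
```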

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/599/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1553425465 I_kwDOCGYnMM5cl2Q5 522 Add COLUMN_TYPE_MAPPING for timedelta maport 81377 closed 0     0 2023-01-23T16:49:54Z 2023-11-04T00:49:51Z 2023-11-04T00:49:51Z NONE  

Currently trying to create a column with Python type datetime.timedelta results in an error:

```
from sqlite_utils import Database
db = Database("test.db")
test_tbl = db['test']
test_tbl.insert({'col1': datetime.timedelta()})

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/sqlite_utils/db.py", line 2979, in insert
    return self.insert_all(
  File "/usr/local/lib/python3.10/dist-packages/sqlite_utils/db.py", line 3082, in insert_all
    self.create(
  File "/usr/local/lib/python3.10/dist-packages/sqlite_utils/db.py", line 1574, in create
    self.db.create_table(
  File "/usr/local/lib/python3.10/dist-packages/sqlite_utils/db.py", line 961, in create_table
    sql = self.create_table_sql(
  File "/usr/local/lib/python3.10/dist-packages/sqlite_utils/db.py", line 852, in create_table_sql
    column_type=COLUMN_TYPE_MAPPING[column_type],
KeyError: <class 'datetime.timedelta'>
```

The reason this would be useful is that MySQLdb uses timedelta for MySQL TIME columns:

```
import MySQLdb
conn = MySQLdb.connect(host='database', user='user', passwd='pw')
csr = conn.cursor()
csr.execute("SELECT CAST('11:20' AS TIME)")
tuple(csr)
((datetime.timedelta(seconds=40800),),)
```

So currently any attempt to convert a MySQL DB with a TIME column using db-to-sqlite will result in the above error.

I was rather surprised that MySQLdb uses timedelta for TIME columns but I see that this column type is intended for time intervals as well as the time of day so it makes sense.
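
Until timedelta is added to COLUMN_TYPE_MAPPING, one workaround (my own sketch, not an official recommendation) is to convert the values into something sqlite-utils already understands before inserting:

```python
import datetime
from sqlite_utils import Database

db = Database(memory=True)
row = {"col1": datetime.timedelta(seconds=40800)}

# Convert timedelta values to seconds (a float); str(value) would also work
# if a human-readable representation is preferred.
converted = {
    k: (v.total_seconds() if isinstance(v, datetime.timedelta) else v)
    for k, v in row.items()
}
db["test"].insert(converted)
print(list(db["test"].rows))  # [{'col1': 40800.0}]
```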

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/522/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1239034903 I_kwDOCGYnMM5J2iwX 433 CLI eats my cursor chapmanjacobd 7908073 closed 0     10 2022-05-17T18:52:52Z 2023-11-04T00:46:30Z 2023-11-04T00:46:30Z CONTRIBUTOR  

I'm not sure why this happens but sqlite-utils makes my terminal cursor disappear after running commands like sqlite-utils insert. I've only noticed this behavior in sqlite-utils, not in any other CLI tools

I can still type commands after it runs but the text cursor is invisible

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/433/reactions",
    "total_count": 5,
    "+1": 5,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1920416843 I_kwDOCGYnMM5ydzxL 597 sqlite-utils insert-files should be able to convert fields grimnight 1737541 open 0     0 2023-09-30T22:20:47Z 2023-09-30T22:20:47Z   NONE  

Currently using both insert-files and convert is needed in order to create sqlar files, it would be more convenient if it could be done with just one command.

```shell
~ ❯ cat test.py
import os


class Example:
    def __init__(self, arg1, arg2):
        self.arg1 = arg1

~ ❯ sqlite-utils insert-files test.sqlar sqlar test.py -c name:name -c data:content -c mode:mode -c mtime:mtime -c sz:size --pk=name
  [####################################]  100%

~ ❯ sqlite-utils convert test.sqlar sqlar data "zlib.compress(value)" --import=zlib --where "name = 'test.py'"
  [####################################]  100%

~ ❯ cat test.py | sqlite-utils convert test.sqlar sqlar data "zlib.compress(sys.stdin.buffer.read())" --import=zlib --import=sys --where "name = 'test.py'" # Alternative way
  [####################################]  100%

~ ❯ sqlite3 test.sqlar "SELECT hex(data) FROM sqlar WHERE name = 'test.py';" | python3 -c "import sys, zlib; sys.stdout.buffer.write(zlib.decompress(bytes.fromhex(sys.stdin.read())))"
import os


class Example:
    def __init__(self, arg1, arg2):
        self.arg1 = arg1

~ ❯ rm test.py

~ ❯ sqlar -l test.sqlar
test.py

~ ❯ sqlar -x test.sqlar

~ ❯ cat test.py
import os


class Example:
    def __init__(self, arg1, arg2):
        self.arg1 = arg1
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/597/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1907281675 I_kwDOCGYnMM5xrs8L 595 Cascading DELETE not working with Table.delete(pk) cycle-data 123451970 closed 0     1 2023-09-21T15:46:41Z 2023-09-25T09:38:57Z 2023-09-25T09:38:13Z NONE  

Hi! I noticed that when I try to use the delete method of the Table object, the record gets properly deleted from the table, but the cascading delete triggers on foreign keys do not activate.

self.db["contact"].delete(contact_id)

I tried querying the database directly via DB Browser and the triggers work without any issue. Looked up the source code and behind the scene this method is just querying the database normally so I'm not exactly sure where this behavior comes from.

Thank you in advance for your time !
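
One plausible explanation worth checking (my guess, not a confirmed diagnosis): SQLite only fires cascading foreign key actions when foreign key enforcement is enabled on the connection, and Python's sqlite3 leaves it off by default, whereas DB Browser may enable it. A minimal sketch, assuming a contact table referenced by ON DELETE CASCADE foreign keys and a hypothetical database file name:

```python
import sqlite_utils

db = sqlite_utils.Database("clinic.db")  # hypothetical file name

# Turn on foreign key enforcement for this connection; without it,
# ON DELETE CASCADE actions never fire.
db.execute("PRAGMA foreign_keys = ON")

contact_id = 1  # hypothetical primary key value
db["contact"].delete(contact_id)  # dependent rows should now cascade
```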

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/595/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
944846776 MDU6SXNzdWU5NDQ4NDY3NzY= 297 Option for importing CSV data using the SQLite .import mechanism simonw 9599 open 0     23 2021-07-14T22:36:41Z 2023-09-22T20:49:52Z   OWNER  

As seen in https://til.simonwillison.net/sqlite/import-csv - .mode csv and then .import school.csv schools is hugely faster than importing via sqlite-utils insert and doing the work in Python - but it can only be implemented by shelling out to the sqlite3 CLI tool; it's not functionality that is exposed to the Python sqlite3 module.

An option to use this would be useful - maybe something like this:

sqlite-utils insert blah.db blah blah.csv --fast
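
A rough sketch of what such an option could do under the hood (my assumption, not the eventual implementation): shell out to the sqlite3 binary, which owns the fast .import code path that the Python module doesn't expose. The function name here is hypothetical:

```python
import subprocess

def fast_csv_import(db_path, table, csv_path):
    # The sqlite3 CLI accepts multiple dot-commands as arguments and runs them in order.
    subprocess.run(
        ["sqlite3", db_path, ".mode csv", f".import {csv_path} {table}"],
        check=True,
    )

# fast_csv_import("blah.db", "blah", "blah.csv")
```
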
sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/297/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1891614971 I_kwDOCGYnMM5wv8D7 594 Represent compound foreign keys in table.foreign_keys output simonw 9599 open 0     2 2023-09-12T03:48:24Z 2023-09-12T03:51:13Z   OWNER  

Given this schema:

```sql
CREATE TABLE departments (
    campus_name TEXT NOT NULL,
    dept_code TEXT NOT NULL,
    dept_name TEXT,
    PRIMARY KEY (campus_name, dept_code)
);
CREATE TABLE courses (
    course_code TEXT PRIMARY KEY,
    course_name TEXT,
    campus_name TEXT NOT NULL,
    dept_code TEXT NOT NULL,
    FOREIGN KEY (campus_name, dept_code) REFERENCES departments(campus_name, dept_code)
);
```

The output of db["courses"].foreign_keys right now is:

[ForeignKey(table='courses', column='campus_name', other_table='departments', other_column='campus_name'),
 ForeignKey(table='courses', column='dept_code', other_table='departments', other_column='dept_code')]

Which suggests two normal foreign keys, not one compound foreign key.
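
For reference, PRAGMA foreign_key_list already groups the columns of a compound foreign key under a shared id, so a richer representation could be derived from it. A sketch (not the proposed API):

```python
import itertools
import sqlite_utils

db = sqlite_utils.Database(memory=True)
db.conn.executescript("""
CREATE TABLE departments (
    campus_name TEXT NOT NULL, dept_code TEXT NOT NULL, dept_name TEXT,
    PRIMARY KEY (campus_name, dept_code)
);
CREATE TABLE courses (
    course_code TEXT PRIMARY KEY, course_name TEXT,
    campus_name TEXT NOT NULL, dept_code TEXT NOT NULL,
    FOREIGN KEY (campus_name, dept_code) REFERENCES departments(campus_name, dept_code)
);
""")

# Rows of PRAGMA foreign_key_list are (id, seq, table, from, to, ...);
# rows that belong to the same (possibly compound) foreign key share an id.
rows = db.execute("PRAGMA foreign_key_list(courses)").fetchall()
for fk_id, group in itertools.groupby(rows, key=lambda r: r[0]):
    columns = [(r[3], r[4]) for r in group]  # (from_column, to_column) pairs
    print(fk_id, columns)
```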

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/594/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1886771493 I_kwDOCGYnMM5wddkl 592 `table.transform()` should preserve `rowid` values simonw 9599 closed 0     6 2023-09-08T00:42:38Z 2023-09-10T17:46:41Z 2023-09-09T00:45:32Z OWNER  

I just spotted a bug when using https://datasette.io/plugins/datasette-configure-fts and https://datasette.io/plugins/datasette-edit-schema at the same time.

Steps to reproduce:

  • Configure FTS for a table, then run a test search
  • Edit the schema for that table and change the order of columns
  • Run the test search again

I got the wrong search results, which I think is because the _fts table pointed to the first table by rowid but those rowid values were entirely rewritten as a consequence of running table.transform() on the table.

Reconfiguring FTS on the table fixed the problem.

I think table.transform() should be able to preserve rowid values.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/592/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1879214365 I_kwDOCGYnMM5wAokd 590 Ability to tell if a Database is an in-memory one simonw 9599 open 0     1 2023-09-03T19:50:15Z 2023-09-03T19:50:36Z   OWNER  

Currently the constructor accepts memory=True or memory_name=... and uses those to create a connection, but does not record what those values were:

https://github.com/simonw/sqlite-utils/blob/1260bdc7bfe31c36c272572c6389125f8de6ef71/sqlite_utils/db.py#L307-L349

This makes it hard to tell whether a database object refers to an in-memory or a file-based database, which is sometimes useful to know.
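
A sketch of the sort of change being asked for (the class and attribute names here are illustrative, not the shipped API): record what the constructor was given instead of discarding it.

```python
class DatabaseLike:
    def __init__(self, filename=None, memory=False, memory_name=None):
        # Remember the constructor arguments so callers can ask about them later.
        self.memory = bool(memory or memory_name or filename is None)
        self.memory_name = memory_name
        self.filename = filename

    @property
    def is_memory(self) -> bool:
        return self.memory


db = DatabaseLike(memory=True)
print(db.is_memory)  # True
```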

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/590/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1879209560 I_kwDOCGYnMM5wAnZY 589 Mechanism for de-registering registered SQL functions simonw 9599 open 0     3 2023-09-03T19:32:39Z 2023-09-03T19:36:34Z   OWNER  

I used a custom SQL function in a migration script and then realized that it should be de-registered before the end of the script to avoid leaking into the calling code.
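
At the raw sqlite3 level this is possible today - in recent Python versions, passing None as the callable removes a previously registered function - so the issue is about exposing something equivalent through sqlite-utils. A minimal sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Register a custom function for the duration of a migration...
conn.create_function("reverse_text", 1, lambda s: s[::-1])
print(conn.execute("select reverse_text('abc')").fetchone())  # ('cba',)

# ...then de-register it so it can't leak into the calling code.
conn.create_function("reverse_text", 1, None)
# conn.execute("select reverse_text('abc')")  # would now fail: no such function
```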

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/589/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1868713944 I_kwDOCGYnMM5vYk_Y 588 `table.get(column=value)` option for retrieving things not by their primary key simonw 9599 open 0     1 2023-08-28T00:41:23Z 2023-08-28T00:41:54Z   OWNER  

This came up working on this feature: - https://github.com/simonw/llm/pull/186

I have a table with this schema:

```sql
CREATE TABLE [collections] (
   [id] INTEGER PRIMARY KEY,
   [name] TEXT,
   [model] TEXT
);
CREATE UNIQUE INDEX [idx_collections_name]
    ON [collections] ([name]);
```

So the primary key is an integer (because it's going to have a huge number of rows foreign key related to it, and I don't want to store a larger text value thousands of times), but there is a unique constraint on the name - that would be the primary key column if not for all of those foreign keys.

Problem is, fetching the collection by name is actually pretty inconvenient.

Fetch by numeric ID:

```python
try:
    table["collections"].get(1)
except NotFoundError:
    # It doesn't exist
```

Fetching by name:

```python
def get_collection(db, collection):
    rows = db["collections"].rows_where("name = ?", [collection])
    try:
        return next(rows)
    except StopIteration:
        raise NotFoundError("Collection not found: {}".format(collection))
```

It would be neat if, for columns where we know that we should always get 0 or one result, we could do this instead:

```python
try:
    collection = table["collections"].get(name="entries")
except NotFoundError:
    # It doesn't exist
```

The existing .get() method doesn't have any non-positional arguments, so using **kwargs like that should work:

https://github.com/simonw/sqlite-utils/blob/1260bdc7bfe31c36c272572c6389125f8de6ef71/sqlite_utils/db.py#L1495

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/588/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1857851384 I_kwDOCGYnMM5uvI_4 587 New .add_foreign_key() can break if PRAGMA legacy_alter_table=ON and there's an invalid foreign key reference simonw 9599 closed 0     3 2023-08-19T20:01:26Z 2023-08-19T20:04:33Z 2023-08-19T20:04:32Z OWNER  

Extremely detailed story of how I got to this point:

  • https://github.com/simonw/llm/issues/162

Steps to reproduce (only if that pragma is on though):

```bash
python -c '
import sqlite_utils
db = sqlite_utils.Database(memory=True)
db.execute("""
CREATE TABLE "logs" (
   [id] INTEGER PRIMARY KEY,
   [model] TEXT,
   [prompt] TEXT,
   [system] TEXT,
   [prompt_json] TEXT,
   [options_json] TEXT,
   [response] TEXT,
   [response_json] TEXT,
   [reply_to_id] INTEGER,
   [chat_id] INTEGER REFERENCES [log]([id]),
   [duration_ms] INTEGER,
   [datetime_utc] TEXT
);
""")
db["logs"].add_foreign_key("reply_to_id", "logs", "id")
'
```

This succeeds in some environments, fails in others.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/587/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1817289521 I_kwDOCGYnMM5sUaMx 577 Get `add_foreign_keys()` to work without modifying `sqlite_master` simonw 9599 closed 0     9 2023-07-23T20:40:18Z 2023-08-18T17:43:11Z 2023-08-18T00:48:10Z OWNER  

https://github.com/simonw/sqlite-utils/blob/13ebcc575d2547c45e8d31288b71a3242c16b886/sqlite_utils/db.py#L1165-L1174

This is the only place in the code that attempts to modify sqlite_master directly, which fails on some Python installations.

Could this use the .transform() trick instead?

Or automatically switch to that trick if it hits an error?

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/577/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1856075668 I_kwDOCGYnMM5uoXeU 586 .transform() fails to drop column if table is part of a view simonw 9599 open 0     3 2023-08-18T05:25:22Z 2023-08-18T06:13:47Z   OWNER  

I got this error trying to drop a column from a table that was part of a SQL view:

error in view plugins: no such table: main.pypi_releases

Upon further investigation I found that this pattern seemed to fix it:

```python
def transform_the_table(conn):
    # Run this in a transaction:
    with conn:
        # We have to read all the views first, because we need to drop and recreate them
        db = sqlite_utils.Database(conn)
        views = {v.name: v.schema for v in db.views if table.lower() in v.schema.lower()}
        for view in views.keys():
            db[view].drop()
        db[table].transform(
            types=types,
            rename=rename,
            drop=drop,
            column_order=[p[0] for p in order_pairs],
        )
        # Now recreate the views
        for name, schema in views.items():
            db.create_view(name, schema)
```

So grab a copy of any view that might reference this table, start a transaction, drop those views, run the transform, recreate the views again.

I wonder if this should become an option in sqlite-utils? Maybe a recreate_views=True argument for table.transform(...)? Should it be opt-in or opt-out?

Originally posted by @simonw in https://github.com/simonw/datasette-edit-schema/issues/35#issuecomment-1683370548

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/586/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1855894222 I_kwDOCGYnMM5unrLO 585 CLI equivalents to `transform(add_foreign_keys=)` simonw 9599 closed 0     7 2023-08-18T01:07:15Z 2023-08-18T01:51:16Z 2023-08-18T01:51:15Z OWNER  

The new options added in:

  • #577

deserve consideration in the CLI as well.

https://github.com/simonw/sqlite-utils/blob/d2bcdc00c6ecc01a6e8135e775ffdb87572b802b/sqlite_utils/db.py#L1706-L1708

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/585/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1754174496 I_kwDOCGYnMM5ojpQg 558 Ability to define unique columns when creating a table aguinane 1910303 open 0     0 2023-06-13T06:56:19Z 2023-08-18T01:06:03Z   NONE  

When creating a new table, it would be good to have an option to set unique columns similar to how not_null is set.

```python
from sqlite_utils import Database

columns = {"mRID": str, "name": str}
db = Database("example.db")
db["ExampleTable"].create(columns, pk="mRID", not_null=["mRID"], if_not_exists=True)
db["ExampleTable"].create_index(["mRID"], unique=True, if_not_exists=True)
```

So something like this would add the UNIQUE flag to the table definition.

python db["ExampleTable"].create(columns, pk="mRID", not_null=["mRID"], unique=["mRID"], if_not_exists=True)

sql CREATE TABLE ExampleTable ( mRID TEXT PRIMARY KEY NOT NULL UNIQUE, name TEXT );

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/558/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1855836914 I_kwDOCGYnMM5undLy 583 Get rid of test.utils.collapse_whitespace simonw 9599 closed 0     1 2023-08-17T23:31:09Z 2023-08-18T00:59:19Z 2023-08-18T00:59:19Z OWNER  

I have a neater pattern for this now - instead of: https://github.com/simonw/sqlite-utils/blob/1dc6b5aa644a92d3654f7068110ed7930989ce71/tests/test_create.py#L472-L475

I now prefer:

https://github.com/simonw/sqlite-utils/blob/1dc6b5aa644a92d3654f7068110ed7930989ce71/tests/test_create.py#L1163-L1171

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/583/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1818838294 I_kwDOCGYnMM5saUUW 578 Plugin hook for adding new output formats simonw 9599 open 0     5 2023-07-24T17:29:18Z 2023-08-07T15:41:49Z   OWNER  

What would it take to add a format hook? I'm still thinking about my GIS workflow, and being able to do sqlite-utils query ... --geojson would be nice. It's the one place my Datasette workflow is messy, having to do datasette . --get /path/to/query.geojson --setting max_rows_returned 10000 --load-extension spatialite. I know the current pattern is --csv, but maybe --format geojson is more future-proof.

https://discord.com/channels/823971286308356157/997738192360964156/1133076679011602432

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/578/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1839344979 I_kwDOCGYnMM5toi1T 582 Handling CSV/file input that contains NUL bytes betatim 1448859 open 0     0 2023-08-07T12:24:14Z 2023-08-07T12:24:14Z   NONE  

I was using sqlite-utils to create a DB from a CSV and it turns out the CSV contains a NUL byte.

When the processing reaches the line that contains the NUL an exception is raised.

I'm wondering if there is something that can be done in sqlite-utils to say "skip lines with encoding errors" or some such. I think it isn't super straightforward though as the exception comes from inside the csv module that does all the parsing.

Concretely the file is the KernelVersions.csv from https://www.kaggle.com/datasets/kaggle/meta-kaggle

This is the command and output:

$ sqlite-utils insert --csv kaggle.db kaggle KernelVersions.csv
  [------------------------------------]    0%
  [#####################---------------]   60%  00:04:24
Traceback (most recent call last):
  File "/home/foobar/miniconda/envs/meta-kaggle/bin/sqlite-utils", line 10, in <module>
    sys.exit(cli())
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py", line 1223, in insert
    insert_upsert_implementation(
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py", line 1085, in insert_upsert_implementation
    db[table].insert_all(
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/db.py", line 3198, in insert_all
    chunk = list(chunk)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/db.py", line 3742, in fix_square_braces
    for record in records:
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py", line 1071, in <genexpr>
    docs = (decode_base64_values(doc) for doc in docs)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py", line 1068, in <genexpr>
    docs = (verify_is_dict(doc) for doc in docs)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py", line 1003, in <genexpr>
    docs = (dict(zip(headers, row)) for row in reader)
_csv.Error: line contains NUL
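
A workaround sketch (not a sqlite-utils feature): strip NUL bytes from the stream before the csv module parses it, then feed the cleaned rows to insert_all(). The function name is my own:

```python
import csv

def rows_without_nul(path, encoding="utf-8"):
    # Filter NUL bytes out of each line before csv.DictReader sees them.
    with open(path, newline="", encoding=encoding, errors="replace") as f:
        cleaned = (line.replace("\x00", "") for line in f)
        yield from csv.DictReader(cleaned)

# import sqlite_utils
# db = sqlite_utils.Database("kaggle.db")
# db["kaggle"].insert_all(rows_without_nul("KernelVersions.csv"))
```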

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/582/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1823160748 I_kwDOCGYnMM5sqzms 581 `sqlite-utils convert --pdb` option simonw 9599 closed 0     1 2023-07-26T21:02:50Z 2023-07-26T21:07:45Z 2023-07-26T21:06:10Z OWNER  

While using sqlite-utils convert I realized it would be handy if you could pass --pdb to have it open the debugger at the first instance of a failed conversion.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/581/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1822918995 I_kwDOCGYnMM5sp4lT 580 Add way to export to a csv file using the Python library kevinlinxc 44324811 open 0     0 2023-07-26T18:09:26Z 2023-07-26T18:09:26Z   NONE  

According to the documentation, we can make a csv output using the CLI tool, but not the Python library. Could we have the latter?
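
Until such a helper exists, CSV export from the Python API is a few lines with the standard library. A sketch using db.query(), with hypothetical database and table names:

```python
import csv
import sqlite_utils

db = sqlite_utils.Database("example.db")       # hypothetical database
rows = db.query("select * from my_table")      # hypothetical table name; yields dicts

first = next(rows, None)
if first is not None:
    with open("out.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(first.keys()))
        writer.writeheader()
        writer.writerow(first)
        writer.writerows(rows)
```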

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/580/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1821108702 I_kwDOCGYnMM5si-ne 579 Special handling for SQLite column of type `JSON` asg017 15178711 open 0     0 2023-07-25T20:37:23Z 2023-07-25T20:37:23Z   CONTRIBUTOR  

sqlite-utils should detect and have specially handling for column with a JSON column. For example:

sql CREATE TABLE "dogs" ( id INTEGER PRIMARY KEY, name TEXT, friends JSON );

Automatic Nesting

According to "Nested JSON Values", sqlite-utils will only expand JSON if the --json-cols flag is passed. It looks like it'll try to json.load all text column to test if its JSON, which can get expensive on non-json columns.

Instead, sqlite-utils should be default (ie without the --json-cols flags) do the maybe_json() operation on columns with a declared JSON type. So the above table would expand the "friends" column as expected, withoutthe --json-cols flag:

```bash
sqlite-utils dogs.db "select * from dogs" | python -mjson.tool
```

[
    {
        "id": 1,
        "name": "Cleo",
        "friends": [
            {"name": "Pancakes"},
            {"name": "Bailey"}
        ]
    }
]


I'm sure there are other ways sqlite-utils can specially handle JSON columns, so I'm keeping this open while I think of more.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/579/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1816997390 I_kwDOCGYnMM5sTS4O 576 Backfill the release notes prior to 0.4 simonw 9599 closed 0     2 2023-07-23T05:41:42Z 2023-07-23T05:49:51Z 2023-07-23T05:48:21Z OWNER  

Currently the changelog starts at 0.4:

https://sqlite-utils.datasette.io/en/3.34/changelog.html#id115

I want the other releases - according to https://pypi.org/project/sqlite-utils/#history there are three missing:

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/576/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1816919568 I_kwDOCGYnMM5sS_4Q 575 Python API ability to opt-out of connection plugins simonw 9599 closed 0     2 2023-07-22T23:01:13Z 2023-07-22T23:17:22Z 2023-07-22T23:08:22Z OWNER  

Plugins affecting the CLI by default makes sense to me.

I'm less confident about them always affecting users of the Python API.

I'm going to have them apply by default, but I'm going to add a mechanism to opt-out on an individual database basis. Basically this:

```python
from sqlite_utils import Database

db = Database(memory=True, execute_plugins=False)
# Anything using db from here on will not execute plugins
```

cc @asg017

Refs: - #567 - #574

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/575/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1816918185 I_kwDOCGYnMM5sS_ip 574 `prepare_connection()` plugin hook simonw 9599 closed 0     3 2023-07-22T22:52:47Z 2023-07-22T23:13:14Z 2023-07-22T22:59:10Z OWNER  

Splitting off an issue for prepare_connection() since Alex got the PR in seconds before I shipped 3.34!

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/567#issuecomment-1646686424

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/574/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1801394744 I_kwDOCGYnMM5rXxo4 567 Plugin system asg017 15178711 closed 0     9 2023-07-12T17:02:14Z 2023-07-22T22:59:37Z 2023-07-22T22:59:36Z CONTRIBUTOR  

I'd like there to be a plugin system for sqlite-utils, similar to the datasette/llm plugins. I'd like to make plugins that would do things like:

  • Register SQLite extensions for more SQL functions + virtual tables
  • Register new subcommands
  • Different input file formats for sqlite-utils memory
  • Different output file formats (in addition to --csv, --tsv, --nl, etc.)

A few real-world use-cases of plugins I'd like to see in sqlite-utils:

  • Register many of my sqlite extensions in sqlite-utils (sqlite-http, sqlite-lines, sqlite-regex, etc.)
  • New subcommands to work with sqlite-vss vector tables
  • Input/output Parquet/Avro/Arrow IPC files with sqlite-arrow
sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/567/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1816876211 I_kwDOCGYnMM5sS1Sz 571 `.transform(keep_table=...)` option simonw 9599 closed 0     1 2023-07-22T19:49:29Z 2023-07-22T22:32:18Z 2023-07-22T22:32:18Z OWNER  

Also need a design for an option for the .transform() method to indicate that the new table should be created with a new name without dropping the old one.

I think keep_table="name_of_table" is good for this.

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/565#issuecomment-1646657324

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/571/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1816877910 I_kwDOCGYnMM5sS1tW 572 Don't test Python 3.7 against textual simonw 9599 closed 0     2 2023-07-22T19:57:03Z 2023-07-22T22:16:50Z 2023-07-22T22:16:50Z OWNER  

Spotted this in the GitHub Actions logs:

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/572/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1786243905 I_kwDOCGYnMM5qd-tB 564 Document that running `db.transform()` tidies up the schema indentation simonw 9599 closed 0     0 2023-07-03T13:59:28Z 2023-07-22T22:15:34Z 2023-07-22T22:15:34Z OWNER  

... and it turns out running .transform() with no arguments still fixes the format of the schema!

```pycon
>>> db["log"].add_column("foo", str)
<Table log (id, name2, age, weight, foo)>
>>> db["log"].add_column("bar", str)
<Table log (id, name2, age, weight, foo, bar)>
>>> db["log"].add_column("baz", str)
<Table log (id, name2, age, weight, foo, bar, baz)>
>>> print(db["log"].schema)
CREATE TABLE "log" (
   [id] INTEGER PRIMARY KEY,
   [name2] TEXT,
   [age] INTEGER,
   [weight] FLOAT
, [foo] TEXT, [bar] TEXT, [baz] TEXT)
>>> db["log"].transform()
<Table log (id, name2, age, weight, foo, bar, baz)>
>>> print(db["log"].schema)
CREATE TABLE "log" (
   [id] INTEGER PRIMARY KEY,
   [name2] TEXT,
   [age] INTEGER,
   [weight] FLOAT,
   [foo] TEXT,
   [bar] TEXT,
   [baz] TEXT
)
```

_Originally posted by @simonw in https://github.com/simonw/llm/issues/65#issuecomment-1618347727_
sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/564/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
  completed
1205687423 I_kwDOCGYnMM5H3VR_ 426 CLI docs should link to Python docs and vice versa simonw 9599 closed 0 simonw 9599   1 2022-04-15T16:05:15Z 2023-07-22T22:13:22Z 2023-07-22T22:13:22Z OWNER  

For every command/API method there should be a link to the equivalent in the other form factor.

Maybe also link to the API and CLI reference pages too.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/426/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1786258502 I_kwDOCGYnMM5qeCRG 565 Table renaming: db.rename_table() and sqlite-utils rename-table simonw 9599 closed 0     6 2023-07-03T14:07:42Z 2023-07-22T22:12:40Z 2023-07-22T22:12:40Z OWNER  

I find myself wanting two new features in sqlite-utils:

  • The ability to have the new transformed table set to a specific name, while keeping the old table around
  • The ability to rename a table (sqlite-utils doesn't have a table rename function at all right now)

Originally posted by @simonw in https://github.com/simonw/llm/issues/65#issuecomment-1618375042

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/565/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1816851056 I_kwDOCGYnMM5sSvJw 568 table.create(..., replace=True) simonw 9599 closed 0     7 2023-07-22T18:12:22Z 2023-07-22T19:25:35Z 2023-07-22T19:15:44Z OWNER  

Found myself using this pattern to quickly prototype a schema:

```python
import sqlite_utils

db = sqlite_utils.Database(memory=True)

print(db["answers_chunks"].create({
    "id": int,
    "content": str,
    "embedding_type_id": int,
    "embedding": bytes,
    "embedding_content_md5": str,
    "source": str,
}, pk="id", transform=True).schema)
```

Using replace=True to drop and then recreate the table would be neat here, and would be consistent with other places that use replace=True.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/568/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1816852402 I_kwDOCGYnMM5sSvey 569 register_command plugin hook simonw 9599 closed 0     3 2023-07-22T18:17:27Z 2023-07-22T19:19:35Z 2023-07-22T19:19:35Z OWNER  

I'm going to start by adding the register_command hook using the exact same pattern as Datasette and LLM.

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/567#issuecomment-1646643450

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/569/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1816857105 I_kwDOCGYnMM5sSwoR 570 `sqlite-utils install -e` option simonw 9599 closed 0     0 2023-07-22T18:32:23Z 2023-07-22T18:55:59Z 2023-07-22T18:32:56Z OWNER  

As seen in LLM.

Needed while working on: - #567

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/570/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1795219865 I_kwDOCGYnMM5rAOGZ 566 `--no-headers` doesn't work on most formats zellyn 33625 open 0     2 2023-07-09T03:43:36Z 2023-07-09T04:13:35Z   NONE  

Version 3.33

sqlite-utils query library.db 'select asin from audible' --fmt plain --no-headers | head -3
asin
0062804006
0062891421

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/566/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1785360409 I_kwDOCGYnMM5qanAZ 563 `--empty-null` option when importing CSV simonw 9599 closed 0     1 2023-07-03T05:23:36Z 2023-07-03T05:44:43Z 2023-07-03T05:42:30Z OWNER  

CSV files with empty cells in (which come through as the empty string) are common and a bit gross.

Having an option that means "if it's an empty string, store null instead" would be cool.

I brainstormed name options here https://chat.openai.com/share/c947b738-ee7d-419c-af90-bc84e90987da
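
Whatever the flag ends up being called, the behaviour boils down to a tiny pre-insert transform. A sketch (not the CLI option itself):

```python
def empty_to_null(row):
    # Replace empty-string cells with None so they are stored as SQL NULL.
    return {k: (None if v == "" else v) for k, v in row.items()}

print(empty_to_null({"name": "Cleo", "nickname": ""}))
# {'name': 'Cleo', 'nickname': None}
```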

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/563/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1784794489 I_kwDOCGYnMM5qYc15 562 Explore the intersection between sqlite-utils and dataclasses simonw 9599 open 0     1 2023-07-02T19:23:08Z 2023-07-02T19:26:39Z   OWNER  

Aside: this makes me think it might be cool if sqlite-utils had a way of working with dataclasses rather than just dicts, and knew how to create a SQLite table to match a dataclass and maybe how to code-generate dataclasses for a specific table schema (dynamically or even using code-generation that can be written to disk, for better editor integrations).

Originally posted by @simonw in https://github.com/simonw/llm/issues/65#issuecomment-1616742529
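
A rough sketch of the dataclass idea, deriving the .create() column definitions from a dataclass's fields (all names here are mine, purely illustrative):

```python
from dataclasses import asdict, dataclass, fields

import sqlite_utils


@dataclass
class Dog:
    id: int
    name: str
    weight: float


db = sqlite_utils.Database(memory=True)

# Build the column dict for create() from the dataclass type hints.
db["dogs"].create({f.name: f.type for f in fields(Dog)}, pk="id")
db["dogs"].insert(asdict(Dog(1, "Cleo", 25.5)))
print(list(db["dogs"].rows))  # [{'id': 1, 'name': 'Cleo', 'weight': 25.5}]
```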

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/562/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1777548699 I_kwDOCGYnMM5p8z2b 561 `--stop-after` option for `insert` and `upsert` commands simonw 9599 closed 0     1 2023-06-27T18:44:15Z 2023-06-27T18:50:09Z 2023-06-27T18:50:08Z OWNER  

I found myself wanting to insert rows from a 849MB CSV file without processing the whole thing: https://huggingface.co/datasets/jerpint-org/HackAPrompt-Playground-Submissions/tree/main
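
Expressed with the Python API, the requested behaviour is essentially itertools.islice over the incoming rows. A sketch with hypothetical file and table names:

```python
import csv
import itertools

import sqlite_utils

db = sqlite_utils.Database("prompts.db")            # hypothetical output database
with open("submissions.csv", newline="") as f:      # hypothetical CSV file
    rows = csv.DictReader(f)
    # Only consume the first 1000 rows instead of the whole 849MB file.
    db["submissions"].insert_all(itertools.islice(rows, 1000))
```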

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/561/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
810618495 MDU6SXNzdWU4MTA2MTg0OTU= 235 Extract columns cannot create foreign key relation: sqlite3.OperationalError: table sqlite_master may not be modified kristomi 6913891 closed 0     18 2021-02-17T23:33:23Z 2023-06-26T01:47:01Z 2023-06-25T23:25:53Z NONE  

Thanks for what seems like a truly great suite of libraries. I wanted to try out Datasette, but never got more than half way through your YouTube video with the SF tree dataset. Whenever I try to extract a column, I get a sqlite3.OperationalError: table sqlite_master may not be modified error from Python. This snippet reproduces the error on my system, Python 3.9.1 and sqlite-utils 3.5 on an M1 Macbook Pro running in rosetta mode:

curl "https://data.nasa.gov/resource/y77d-th95.json" | \
  sqlite-utils insert meteorites.db meteorites - --pk=id
sqlite-utils extract meteorites.db meteorites recclass

I have tried googling the problem, but all I've found is that this might be a problem with the sqlite3 database running in defensive mode, but I definitely can't know for sure. Does the problem seem familiar to you?

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/235/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1773450152 I_kwDOCGYnMM5ptLOo 559 sqlean support simonw 9599 closed 0     0 2023-06-25T19:27:26Z 2023-06-25T23:25:53Z 2023-06-25T23:25:53Z OWNER  

If sqlean is available, use that.

Refs: - https://github.com/nalgeon/sqlean.py/issues/1#issuecomment-1605707788

This will provide a good workaround for: - #235

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/559/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1655860104 I_kwDOCGYnMM5ismuI 535 rows: --transpose or psql extended view-like functionality chapmanjacobd 7908073 closed 0     2 2023-04-05T15:37:33Z 2023-06-15T08:39:49Z 2023-06-14T22:05:28Z CONTRIBUTOR  

It would be nice if the rows subcommand had a flag, perhaps called --transpose which would print in long form instead of wide. Similar to extended display mode in psql (\x)

In other words instead of this:

sqlite-utils rows --limit 5 --fmt github track_metadata.db songs

| track_id | title | song_id | release | artist_id | artist_mbid | artist_name | duration | artist_familiarity | artist_hotttnesss | year | track_7digitalid | shs_perf | shs_work | |--------------------|-------------------|--------------------|--------------------------------------|--------------------|--------------------------------------|------------------|------------|----------------------|---------------------|--------|--------------------|------------|------------| | TRMMMYQ128F932D901 | Silent Night | SOQMMHC12AB0180CB8 | Monster Ballads X-Mas | ARYZTJS1187B98C555 | 357ff05d-848a-44cf-b608-cb34b5701ae5 | Faster Pussy cat | 252.055 | 0.649822 | 0.394032 | 2003 | 7032331 | -1 | 0 | | TRMMMKD128F425225D | Tanssi vaan | SOVFVAK12A8C1350D9 | Karkuteillä | ARMVN3U1187FB3A1EB | 8d7ef530-a6fd-4f8f-b2e2-74aec765e0f9 | Karkkiautomaatti | 156.551 | 0.439604 | 0.356992 | 1995 | 1514808 | -1 | 0 | | TRMMMRX128F93187D9 | No One Could Ever | SOGTUKN12AB017F4F1 | Butter | ARGEKB01187FB50750 | 3d403d44-36ce-465c-ad43-ae877e65adc4 | Hudson Mohawke | 138.971 | 0.643681 | 0.437504 | 2006 | 6945353 | -1 | 0 | | TRMMMCH128F425532C | Si Vos Querés | SOBNYVR12A8C13558C | De Culo | ARNWYLR1187B9B2F9C | 12be7648-7094-495f-90e6-df4189d68615 | Yerba Brava | 145.058 | 0.448501 | 0.372349 | 2003 | 2168257 | -1 | 0 | | TRMMMWA128F426B589 | Tangle Of Aspens | SOHSBXH12A8C13B0DF | Rene Ablaze Presents Winter Sessions | AREQDTE1269FB37231 | | Der Mystic | 514.298 | 0 | 0 | 0 | 2264873 | -1 | 0 |

The output would look something like this:

$ for col in (sqlite-columns track_metadata.db songs) sqlite-utils --fmt github track_metadata.db "select $col from songs order by rowid desc limit 5" end

| track_id | |--------------------| | TRYYYVU12903CD01E3 | | TRYYYDJ128F9310A21 | | TRYYYMG128F4260ECA | | TRYYYJO128F426DA37 | | TRYYYUS12903CD2DF0 | | title | |-------------------------------------| | Fernweh feat. Sektion Kuchikäschtli | | Faraday | | Novemba | | Jago Chhadeo | | O Samba Da Vida | | song_id | |--------------------| | SOWXJXQ12AB0189F43 | | SOLXGOR12A81C21EB7 | | SOHODZI12A8C137BB3 | | SOXQYIQ12A8C137FBB | | SOTXAME12AB018F136 | | release | |---------------------------------| | So Oder So | | The Trance Collection Vol. 2 | | Dub_Connected: electronic music | | Naale Baba Lassi Pee Gya | | Pacha V.I.P. | | artist_id | |--------------------| | AR7PLM21187B990D08 | | ARCMCOK1187B9B1073 | | ARZ3R6M1187B9AF750 | | ART5FZD1187B9A7FCF | | AR7Z4J81187FB3FC59 | | artist_mbid | |--------------------------------------| | 3af2b07e-c91c-4160-9bda-f0b9e3144ed3 | | 4ac5f3de-c5ad-475e-ad50-41f1ef9dba20 | | 8b97e9c8-61f5-4615-9a96-276f24204e34 | | 2357c400-9109-42b6-b3fe-9e2d9f8e3872 | | 9d50cb20-7e42-45cc-b0dd-154c3e92a577 | | artist_name | |----------------| | Texta | | Elude | | Gabriel Le Mar | | Kuldeep Manak | | Kiko Navarro | | duration | |------------| | 295.079 | | 484.519 | | 553.038 | | 244.166 | | 217.443 | | artist_familiarity | |----------------------| | 0.552977 | | 0.403668 | | 0.556918 | | 0.4015 | | 0.528617 | | artist_hotttnesss | |---------------------| | 0.454869 | | 0.256935 | | 0.336914 | | 0.374866 | | 0.411595 | | year | |--------| | 2004 | | 0 | | 0 | | 0 | | 0 | | track_7digitalid | |--------------------| | 8486723 | | 5472456 | | 2219291 | | 1632096 | | 7522478 | | shs_perf | |------------| | -1 | | -1 | | -1 | | -1 | | -1 | | shs_work | |------------| | 0 | | 0 | | 0 | | 0 | | 0 |

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/535/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1581090327 I_kwDOCGYnMM5ePYYX 529 Microsoft line endings chapmanjacobd 7908073 closed 0     1 2023-02-12T02:20:48Z 2023-06-14T23:12:12Z 2023-06-14T23:11:47Z CONTRIBUTOR  

sqlite-utils prints \r\n but it should probably print \n (unless the platform is detected as Windows?)

It has tripped me up a few times when piping the output of sqlite-utils to other programs:

$ sqlite-utils --no-headers --csv ~/lb/fs/d.db 'select path from media limit 1' | cat -A
/mnt/d7/file^M$
$ sqlite-utils --no-headers --csv ~/lb/fs/d.db 'select path from media limit 1' | tr -d '\r' | cat -A
/mnt/d7/file$

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/529/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1383646615 I_kwDOCGYnMM5SeMWX 491 Ability to merge databases and tables sgraaf 8904453 open 0     7 2022-09-23T11:10:55Z 2023-06-14T22:14:24Z   NONE  

Hi! Let me firstly say that I am a big fan of your work -- I follow your tweets and blog posts with great interest 😄.

Now onto the matter at hand: I think it would be great if sqlite-utils included a merge or combine command, with the purpose of combining different SQLite databases into a single SQLite database. This way, the newly "merged" database would contain all differently named tables contained in the databases to be merged as-is, as well as a concatenation of all tables of the same name.

This could look something like this:

bash sqlite-utils merge cats.db dogs.db > animals.db

I imagine this is rather straightforward if all databases involved in the merge contain differently named tables (i.e. no chance of conflicts), but things get slightly more complicated if two or more of the databases to be merged contain tables with the same name. Not only do you have to "do something" with the primary key(s), but these tables could also simply have different schemas (and therefore be incompatible for concatenation to begin with).

Anyhow, I would love your thoughts on this, and, if you are open to it, work together on the design and implementation!
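
One way to picture what a merge command would have to do, sketched with plain sqlite3 and ATTACH. This assumes compatible schemas and sidesteps the primary key and schema conflicts discussed above:

```python
import sqlite3

dest = sqlite3.connect("animals.db")
for source in ("cats.db", "dogs.db"):
    dest.execute("ATTACH DATABASE ? AS src", (source,))
    tables = [r[0] for r in dest.execute(
        "SELECT name FROM src.sqlite_master WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    )]
    for table in tables:
        # Create the table (empty) if this is the first database containing it,
        # then append its rows; conflicting schemas and primary keys are the
        # hard part the issue talks about and are not handled here.
        dest.execute(
            f'CREATE TABLE IF NOT EXISTS "{table}" AS SELECT * FROM src."{table}" WHERE 0'
        )
        dest.execute(f'INSERT INTO "{table}" SELECT * FROM src."{table}"')
    dest.commit()
    dest.execute("DETACH DATABASE src")
```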

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/491/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1733198948 I_kwDOCGYnMM5nToRk 555 Filter table by a large bunch of ids redraw 10843208 open 0     1 2023-05-31T00:29:51Z 2023-06-14T22:01:57Z   NONE  

Hi! this might be a question related to both SQLite & sqlite-utils, and you might be more experienced with them.

I have a large bunch of ids, and I'm wondering which is the best way to query them in terms of performance, and simplicity if possible.

The naive approach would be something like select * from table where rowid in (?, ?, ?...) but that wouldn't scale if ids are >1k.

Another approach might be creating a temp table, or in-memory db table, insert all ids in that table and then join with the target one.

I failed to attach an in-memory db both using sqlite-utils, and plain sql's execute(), so my closest approach is something like,

```python
def filter_existing_video_ids(video_ids):
    db = get_db()  # contains a "videos" table
    db.execute("CREATE TEMPORARY TABLE IF NOT EXISTS tmp (video_id TEXT NOT NULL PRIMARY KEY)")
    db["tmp"].insert_all([{"video_id": video_id} for video_id in video_ids])
    for row in db["tmp"].rows_where("video_id not in (select video_id from videos)"):
        yield row["video_id"]
    db["tmp"].drop()
```

That kinda worked, but I couldn't find an option in sqlite-utils's create_table() to mark a table as temporary. Also, the tmp table is not dropped at the end, not even by .drop(), despite being created with the TEMPORARY keyword - though I read it should be dropped automatically when the connection/session ends.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/555/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1740150327 I_kwDOCGYnMM5nuJY3 557 Aliased ROWID option for tables created from alter=True commands chapmanjacobd 7908073 closed 0     2 2023-06-04T05:29:28Z 2023-06-14T06:09:21Z 2023-06-05T19:26:26Z CONTRIBUTOR  

If you use INTEGER PRIMARY KEY column, the VACUUM does not change the values of that column. However, if you use unaliased rowid, the VACUUM command will reset the rowid values.

ROWID should never be used with foreign keys, but the simple act of aliasing rowid to id (which is what happens when you write id integer primary key in the DDL) makes it OK.

It would be convenient if there were more options to use a string column (eg. filepath) as the PK and be able to use it during upserts, but, when creating a foreign key, to create an integer column which aliases rowid.

I made an attempt to switch to integer primary keys here but it is not going well... In my usecase the path column is a business key. Yes, it should be as simple as including the id column in any select statement where I plan on using upsert but it would be nice if this could be abstracted away somehow https://github.com/chapmanjacobd/library/commit/788cd125be01d76f0fe2153335d9f6b21db1343c

https://github.com/chapmanjacobd/library/actions/runs/5173602136/jobs/9319024777
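Not the option being requested, but possibly a useful stopgap: Table.lookup() already creates an integer id primary key (which aliases rowid) plus a unique constraint on the business-key columns, and returns the integer id to use in foreign keys. A minimal sketch with hypothetical media/captions tables:

```python
import sqlite_utils

db = sqlite_utils.Database(memory=True)

# lookup() creates "media" with an integer "id" primary key and a unique index on "path",
# returning the id for that path (and inserting the row if it does not exist yet)
media_id = db["media"].lookup({"path": "/videos/cat.mp4"})

db["captions"].insert(
    {"media_id": media_id, "text": "a cat"},
    foreign_keys=[("media_id", "media", "id")],
)
```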

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/557/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1740026046 I_kwDOCGYnMM5ntrC- 556 Support storing incrementally piped values mcint 601708 open 0     1 2023-06-04T00:45:23Z 2023-06-04T01:21:15Z   CONTRIBUTOR  

I'm trying to use sqlite-utils to store data generated incrementally. There are a few aspects of this that I don't currently know how to handle. I would like an option to apply writes incrementally, line-by-line as they are received. I would like an option to echo incremental progress. And, it would be nice to have

In particular, I'm using CoreLocationCLI -w -j to generate, newline-delimited JSON.

One variant of the command

stdbuf -oL CoreLocationCLI -w -j | pee 'sqlite-utils insert loc.db loc -' nl

pee, from moreutils, is like tee but spawns and pipes to the processes created by invoking each of its arguments, so, for gratuitous demonstration, pee 'sponge out.log' cat would behave like tee.

It looks like I can get what I want with: stdbuf -oL CoreLocationCLI -w -j | while read line; do <<<"$line" sqlite-utils insert loc.db loc -; echo "$line"; done | nl
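For comparison, a minimal sketch of the line-by-line behaviour written against the Python API directly (reading newline-delimited JSON from stdin, one insert per line, echoing each line for downstream tools) - not a proposal for how the CLI option should look:

```python
import json
import sys

import sqlite_utils

db = sqlite_utils.Database("loc.db")
for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    db["loc"].insert(json.loads(line))  # one write per incoming line
    print(line, flush=True)             # echo incremental progress
```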

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/556/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1720096994 I_kwDOCGYnMM5mhpji 554 `IndexError` when doing `.insert(..., pk='id')` after `insert_all` xavdid 1231935 open 0     1 2023-05-22T17:13:02Z 2023-05-22T17:18:33Z   NONE  

I believe this is related to https://github.com/simonw/sqlite-utils/issues/98.

When pk is specified by table A's insert call, it throws an index error if a different table has written a row with a higher rowid than exists in the first table. Here's a basic example:

```py
from sqlite_utils import Database

def test_pk_for_insert(fresh_db):
    user = {"id": "abc", "name": "david"}

    fresh_db["users"].insert(user, pk="id")

    fresh_db["comments"].insert_all(
        [
            {"id": "def", "text": "ok"},
            {"id": "ghi", "text": "great"},
        ],
    )

    fresh_db["users"].insert(
        user,
        ignore=True,
        # BUG: when specifying pk on the second insert call
        # db.py goes into a block it doesn't expect and we get the error
        pk="id",
    )

if __name__ == "__main__":
    db = Database("bug.db")
    if db["users"].exists():
        raise ValueError(
            "bug only shows on a new database - remove bug.db before running the script"
        )
    test_pk_for_insert(db)
```

The error is:

py File "/Users/david/projects/reddit-to-sqlite/.venv/lib/python3.11/site-packages/sqlite_utils/db.py", line 2960, in insert_chunk row = list(self.rows_where("rowid = ?", [self.last_rowid]))[0] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^ IndexError: list index out of range

The issue is in this block:

https://github.com/simonw/sqlite-utils/blob/2747257a3334d55e890b40ec58fada57ae8cfbfd/sqlite_utils/db.py#L2954-L2958

relevant locals are:

  • pk: 'id'
  • result.lastrowid: 2

What's most interesting is the comment # self.last_rowid will be 0 if a "INSERT OR IGNORE" happened, which doesn't seem to be the case here.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/554/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1718612569 I_kwDOCGYnMM5mb_JZ 552 Document how to setup shell auto-completion simonw 9599 closed 0     1 2023-05-21T19:20:41Z 2023-05-21T21:05:16Z 2023-05-21T21:03:40Z OWNER  

https://click.palletsprojects.com/en/8.1.x/shell-completion/

This works for zsh:

eval "$(_SQLITE_UTILS_COMPLETE=zsh_source sqlite-utils)"

This will probably work for bash:

eval "$(_SQLITE_UTILS_COMPLETE=bash_source sqlite-utils)"

Need to add this to the installation docs here: https://sqlite-utils.datasette.io/en/stable/installation.html - along with the pattern for adding that to .zshrc or whatever.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/552/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1718607907 I_kwDOCGYnMM5mb-Aj 551 Make as many examples in the CLI docs as possible copy-and-pastable simonw 9599 closed 0     6 2023-05-21T19:04:10Z 2023-05-21T21:04:04Z 2023-05-21T20:57:24Z OWNER  

e.g. in this section:

https://sqlite-utils.datasette.io/en/stable/cli.html#running-queries-directly-against-csv-or-json

The little copy button will also copy the $ which breaks the examples when copied.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/551/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1718517882 I_kwDOCGYnMM5mboB6 545 Try out Trogon for a tui interface simonw 9599 closed 0     6 2023-05-21T14:08:25Z 2023-05-21T19:33:13Z 2023-05-21T18:41:58Z OWNER  

https://github.com/Textualize/trogon

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/545/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1718595700 I_kwDOCGYnMM5mb7B0 550 AttributeError: 'EntryPoints' object has no attribute 'get' for flake8 on Python 3.7 simonw 9599 closed 0     3 2023-05-21T18:24:39Z 2023-05-21T18:42:25Z 2023-05-21T18:41:58Z OWNER  

https://github.com/simonw/sqlite-utils/actions/runs/5039064797/jobs/9036965488

```
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.7.16/x64/bin/flake8", line 8, in <module>
    sys.exit(main())
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/main/cli.py", line 22, in main
    app.run(argv)
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/main/application.py", line 363, in run
    self._run(argv)
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/main/application.py", line 350, in _run
    self.initialize(argv)
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/main/application.py", line 330, in initialize
    self.find_plugins(config_finder)
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/main/application.py", line 153, in find_plugins
    self.check_plugins = plugin_manager.Checkers(local_plugins.extension)
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/plugins/manager.py", line 357, in __init__
    self.namespace, local_plugins=local_plugins
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/plugins/manager.py", line 238, in __init__
    self._load_entrypoint_plugins()
  File "/opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/flake8/plugins/manager.py", line 254, in _load_entrypoint_plugins
    eps = importlib_metadata.entry_points().get(self.namespace, ())
AttributeError: 'EntryPoints' object has no attribute 'get'
Error: Process completed with exit code 1.
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/550/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1718576761 I_kwDOCGYnMM5mb2Z5 548 analyze-tables should validate provide --column names simonw 9599 closed 0     1 2023-05-21T17:20:24Z 2023-05-21T17:35:52Z 2023-05-21T17:35:52Z OWNER  

Noticed this while testing: - #547

If you pass a non-existent column to -c/--column you don't get an error message.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/548/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1718572201 I_kwDOCGYnMM5mb1Sp 547 No need to show common values if everything is null simonw 9599 closed 0     1 2023-05-21T17:05:07Z 2023-05-21T17:19:21Z 2023-05-21T17:19:21Z OWNER  

Noticed this:

```
% sqlite-utils analyze-tables content.db repos -c delete_branch_on_merge --common-limit 20 --no-least
repos.delete_branch_on_merge: (1/1)

  Total rows: 158
  Null rows: 158
  Blank rows: 0

  Distinct values: 0

  Most common:
    158: None
```

The 158: None there is duplicate information considering we already know there are 158/158 null rows.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/547/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1718515590 I_kwDOCGYnMM5mbneG 544 New options for analyze-tables --common-limit --no-most and --no-least simonw 9599 closed 0     2 2023-05-21T14:03:19Z 2023-05-21T17:03:06Z 2023-05-21T16:19:31Z OWNER  

The "least common" section is frequently uninteresting, especially for huge tables with a large number of repeated-once values.

sqlite-utils analyze-tables content.db repos --common-limit 20 --no-least
sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/544/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1124731464 I_kwDOCGYnMM5DCgpI 399 Make it easier to insert geometries, with documentation and maybe code simonw 9599 open 0     25 2022-02-05T00:11:26Z 2023-05-16T03:11:52Z   OWNER  

In playing with the new SpatiaLite helpers from #385 I noticed that actually populating geometry columns is still a little bit tricky. Here's what I ended up doing:

```python
import httpx, sqlite_utils

db = sqlite_utils.Database("/tmp/spatial.db")
attractions = httpx.get(
    "https://latest.datasette.io/fixtures/roadside_attractions.json?_shape=array"
).json()
db["attractions"].insert_all(attractions, pk="pk")
```

Schema of that table is now:

```sql
CREATE TABLE [attractions] (
   [pk] INTEGER PRIMARY KEY,
   [name] TEXT,
   [address] TEXT,
   [latitude] FLOAT,
   [longitude] FLOAT
)
```

```python
db.init_spatialite()
db["attractions"].add_geometry_column("point", "POINT")

db.execute("""
    update attractions set point = GeomFromText(
        'POINT(' || longitude || ' ' || latitude || ')', 4326
    )
""")
```

That last line took some figuring out - especially the need for the SRID of `4326`, without which I got this error:

```
IntegrityError: attractions.point violates Geometry constraint [geom-type or SRID not allowed]
```

It would be good to document this in more detail, but ideally also to come up with a more obvious pattern for inserting common types of spatial data.

Also related: - #398 - #79

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/399/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1578790070 I_kwDOCGYnMM5eGmy2 527 `Table.convert()` skips falsey values mcarpenter 167893 closed 0     5 2023-02-10T00:00:52Z 2023-05-09T21:15:05Z 2023-05-08T21:03:24Z CONTRIBUTOR  

Summary

By design, Table.convert() does not attempt conversion of falsey values (None, "", 0, ...). This is surprising (directly contradicts the docstring) and convert() may quietly skip cells where the user assumed a conversion would take place.

Example

Increment a column of integers by one

```python
from sqlite_utils import Database

db = Database(memory=True)
table = db['table']
col = 'x'
table.insert_all([{col: 0}, {col: 1}])
print(table.get(1))  # 0
print(table.get(2))  # 1
print()

table.convert(col, lambda x: x+1)
print(table.get(1))  # got 0, expected 1 ⚠⚠⚠
print(table.get(2))  # got 2, expected 2
```

Another example might be, say, transforming cells containing empty string to NULL.

Discussion

This was, I think, a pragmatic choice so that consumers can skip writing guard clauses for these falsey values (particularly from the CLI). But this surprising undocumented behavior can lead to incorrect data. I don't think this is a good trade-off between convenience and correctness.

In the absence of this convenience users will have to write guard clauses into their conversion expressions (or adapt the called function to do the same), so:

```python
fn(value) if value else value
```

instead of:

```python
fn(value)
```

This is more typing and sometimes I will forget, and there will be errors. (But they will be noisy errors, which is a good thing).

Such a change will certainly inconvenience some existing consumers; there will be some breakage. But I think this is worth it to avoid quietly not converting some values by default, which can lead to quietly bad data.

I have a PR that I will attach, please take a look and see what you think.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/527/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1701018909 I_kwDOCGYnMM5lY30d 543 Tests broken on Windows due to new convert() lambda names simonw 9599 closed 0     0 2023-05-08T22:11:29Z 2023-05-08T22:19:04Z 2023-05-08T22:19:04Z OWNER  

https://github.com/simonw/sqlite-utils/actions/runs/4920084038/jobs/8788501314

```python
sql = 'update [example] set [dt] = lambda_-9223371942137158589([dt]);'
```

From: - #526

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/543/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1516644980 I_kwDOCGYnMM5aZip0 520 rows_from_file() raises confusing error if file-like object is not in binary mode simonw 9599 closed 0     3 2023-01-02T19:00:14Z 2023-05-08T22:08:07Z 2023-05-08T22:08:07Z OWNER  

I got this error:

File "/Users/simon/Dropbox/Development/openai-to-sqlite/openai_to_sqlite/cli.py", line 27, in embeddings rows, _ = rows_from_file(input) ^^^^^^^^^^^^^^^^^^^^^ File "/Users/simon/.local/share/virtualenvs/openai-to-sqlite-jt4obeb2/lib/python3.11/site-packages/sqlite_utils/utils.py", line 305, in rows_from_file first_bytes = buffered.peek(2048).strip() ^^^^^^^^^^^^^^^^^^^ From this code: ```python

@cli.command() @click.argument( "db_path", type=click.Path(file_okay=True, dir_okay=False, allow_dash=False), ) @click.option( "-i", "--input", type=click.File("r"), default="-", ) def embeddings(db_path, input): "Store embeddings for one or more text documents" click.echo("Here is some output") db = sqlite_utils.Database(db_path) rows, _ = rows_from_file(input) print(list(rows)) `` The error went away when I changed it totype=click.File("rb")`.

This should either be called out in the documentation or rows_from_file() should be fixed to handle text-mode files in addition to binary files.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/520/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1279144769 I_kwDOCGYnMM5MPjNB 448 Reading rows from a file => AttributeError: '_io.StringIO' object has no attribute 'readinto' mungewell 236907 closed 0     5 2022-06-21T21:48:27Z 2023-05-08T22:01:00Z 2023-05-08T22:01:00Z NONE  

Attempting to run the example given here (without the extra bracket ;-): https://sqlite-utils.datasette.io/en/stable/python-api.html#reading-rows-from-a-file

```python
from sqlite_utils.utils import rows_from_file
import io

rows, format = rows_from_file(io.StringIO("id,name\n1,Cleo"))
print(list(rows), format)
# Outputs [{'id': '1', 'name': 'Cleo'}] Format.CSV
```

Gives error

```
"c:\Program Files\Python37\python.exe" test2.py
Traceback (most recent call last):
  File "test2.py", line 4, in <module>
    rows, format = rows_from_file(io.StringIO("id,name\n1,Cleo"))
  File "C:\Users\swood\Downloads\sqlite-utils-main-20220621\sqlite-utils-main\sqlite_utils\utils.py", line 300, in rows_from_file
    first_bytes = buffered.peek(2048).strip()
AttributeError: '_io.StringIO' object has no attribute 'readinto'
```

I am running Python on Windows.

```
"c:\Program Files\Python37\python.exe"
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/448/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1575131737 I_kwDOCGYnMM5d4ppZ 525 Repeated calls to `Table.convert()` fail mcarpenter 167893 closed 0     4 2023-02-07T22:40:47Z 2023-05-08T21:59:41Z 2023-05-08T21:54:02Z CONTRIBUTOR  

Summary

When using the API, repeated calls to Table.convert() do not work correctly since all conversions quietly use the callable (function, lambda) from the first call to convert() only. Subsequent invocations with different callables use the callable from the first invocation only.

Example

```python
from sqlite_utils import Database

db = Database(memory=True)
table = db['table']
col = 'x'
table.insert_all([{col: 1}])
print(table.get(1))

table.convert(col, lambda x: x*2)
print(table.get(1))

def zeroize(x):
    return 0

zeroize = lambda x: 0
zeroize.__name__ = 'zeroize'

table.convert(col, zeroize)
print(table.get(1))
```

Output:

```
{'x': 1}
{'x': 2}
{'x': 4}
```

Expected:

```
{'x': 1}
{'x': 2}
{'x': 0}
```

Explanation

This is some relevant documentation.

  • Table.convert() takes a Callable to perform data conversion on a column
  • The Callable is passed to Database.register_function()
  • Database.register_function() uses the callable's __name__ attribute for registration
  • (Aside: all lambdas have a __name__ of <lambda>: I thought this was the problem, and it was close, but not quite)
  • However convert() first wraps the callable by local function convert_value()
  • Consequently register_function() sees name convert_value for all invocations from convert()
  • register_function() silently ignores registrations using the same name, retaining only the first such registration

There's a mismatch between the comments and the code: https://github.com/simonw/sqlite-utils/blob/fc221f9b62ed8624b1d2098e564f525c84497969/sqlite_utils/db.py#L404

but actually the existing function is returned/used instead (as the "registering custom sql functions" doc I linked above says too). Seems like this can be rectified to match the comment?

Suggested fix

I think there are four things:

1. The call to register_function() from convert() should have an explicit name= parameter (to continue using convert_value() and the progress bar).
2. For functions, this name can be the real function name. (I understand the sqlite api needs a name, and it's nice if those are recognizable names where possible). For lambdas, would 'lambda-{uuid}' or similar be acceptable?
3. register_function() really should throw an error on repeated attempts to register a duplicate (function, arity)-pair.
4. A test? I haven't looked at the test framework here but it seems this should be testable.

See also

  • 458

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/525/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1465194249 I_kwDOCGYnMM5XVRcJ 514 upsert of new row with check constraints fails cldellow 193185 closed 0     5 2022-11-26T16:12:23Z 2023-05-08T21:50:52Z 2023-05-08T21:50:51Z NONE  

(I originally opened this in https://github.com/simonw/datasette-insert/issues/20, but I see that that library depends on sqlite-utils)

In the case of a new row, upsert first adds the row, specifying only its pkeys: https://github.com/simonw/sqlite-utils/blob/965ca0d5f5bffe06cc02cd7741344d1ddddf9d56/sqlite_utils/db.py#L2783-L2787

This means that a table with NOT NULL (or other constraint) columns that aren't part of the pkey can't have new rows upserted.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/514/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1044267332 I_kwDOCGYnMM4-PkFE 336 sqlite-util tranform --column-order mangles columns of type "timestamp" fgregg 536941 closed 0     1 2021-11-04T01:15:38Z 2023-05-08T21:13:38Z 2023-05-08T21:13:38Z CONTRIBUTOR  

Reproducible code below:

```bash
echo 'create table bar (baz text, created_at timestamp default CURRENT_TIMESTAMP)' | sqlite3 foo.db
sqlite3 foo.db
SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
sqlite> .schema bar
CREATE TABLE bar (baz text, created_at timestamp default CURRENT_TIMESTAMP);
sqlite> .exit
sqlite-utils transform foo.db bar --column-order baz
sqlite3 foo.db
SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
sqlite> .schema bar
CREATE TABLE IF NOT EXISTS "bar" (
   [baz] TEXT,
   [created_at] FLOAT DEFAULT 'CURRENT_TIMESTAMP'
);
sqlite> .exit
sqlite-utils transform foo.db bar --column-order baz
sqlite3 foo.db
SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
sqlite> .schema bar
CREATE TABLE IF NOT EXISTS "bar" (
   [baz] TEXT,
   [created_at] FLOAT DEFAULT '''CURRENT_TIMESTAMP'''
);
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/336/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1432377191 I_kwDOCGYnMM5VYFdn 509 `sqlite-utils transform` breaks DEFAULT string values and STRFTIME() kennysong 2199875 closed 0     0 2022-11-02T02:32:23Z 2023-05-08T21:13:38Z 2023-05-08T21:13:38Z NONE  

Very nice library! Our team found sqlite-utils through @simonw's comment on the "Simple declarative schema migration for SQLite" article, and we were excited to use it, but unfortunately sqlite-utils transform seems to break our DB.

Running sqlite-utils transform to modify a column mangles their DEFAULT values:

  • Default string values are wrapped in extra single quotes
  • Function expressions such as STRFTIME() are turned into strings!

Here are steps to reproduce:

Original database

``` $ sqlite3 test.db << EOF CREATE TABLE mytable ( col1 TEXT DEFAULT 'foo', col2 TEXT DEFAULT (STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW')) ) EOF

$ sqlite3 test.db "SELECT sql FROM sqlite_master WHERE name = 'mytable';" CREATE TABLE mytable ( col1 TEXT DEFAULT 'foo', col2 TEXT DEFAULT (STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW')) ) ```

Modified database after sqlite-utils

``` $ sqlite3 test.db "INSERT INTO mytable DEFAULT VALUES; SELECT * FROM mytable;" foo|2022-11-02 02:26:58.038

$ sqlite-utils transform test.db mytable --rename col1 renamedcol1

$ sqlite3 test.db "SELECT sql FROM sqlite_master WHERE name = 'mytable';" CREATE TABLE "mytable" ( [renamedcol1] TEXT DEFAULT '''foo''', [col2] TEXT DEFAULT 'STRFTIME(''%Y-%m-%d %H:%M:%f'', ''NOW'')' )

$ sqlite3 test.db "INSERT INTO mytable DEFAULT VALUES; SELECT * FROM mytable;" foo|2022-11-02 02:26:58.038 'foo'|STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW') ```

(Related: #336)

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/509/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1700936245 I_kwDOCGYnMM5lYjo1 542 Remove `skip_false=True` and `--no-skip-false` in `sqlite-utils` 4.0 simonw 9599 open 0   4.0 backwards incomatible changes 9374594 1 2023-05-08T21:04:28Z 2023-05-08T21:07:41Z   OWNER  

Following: - #527

The only reason I didn't fix this mis-feature entirely is that it represents a backwards incompatible change. I'll make that change in 4.0.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/542/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1595340692 I_kwDOCGYnMM5fFveU 530 add ability to configure "on delete" and "on update" attributes of foreign keys: fgregg 536941 open 0     2 2023-02-22T15:44:14Z 2023-05-08T20:39:01Z   CONTRIBUTOR  

sqlite supports these, and it would be quite nice to be able to add them with sqlite-utils.

https://www.sqlite.org/foreignkeys.html#fk_actions
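For reference, a sketch of the kind of DDL this feature would need to generate, executed here as plain SQL through the Python API (authors/books are made-up tables; note SQLite only enforces the actions when the foreign_keys pragma is on):

```python
import sqlite_utils

db = sqlite_utils.Database(memory=True)
db.execute("PRAGMA foreign_keys = ON")
db.execute("create table authors (id integer primary key, name text)")
db.execute(
    """
    create table books (
        id integer primary key,
        title text,
        author_id integer references authors(id)
            on delete cascade
            on update cascade
    )
    """
)
```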

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/530/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1620254998 I_kwDOCGYnMM5gkyEW 532 Show more information when JSON can't be imported with sqlite-utils insert voltagex 83080728 closed 0     2 2023-03-12T06:41:44Z 2023-05-08T20:32:16Z 2023-05-08T20:32:02Z NONE  

I am currently trying to import the JSON export of my data from Discord, specifically activity/reporting/events-*.json

```
sqlite-utils.exe insert test.db reporting events-2023-00000-of-00001.json
  [###################################-]   99%  00:00:00
Error: Invalid JSON - use --csv for CSV or --tsv for TSV files
```

Please show more information as to why this is invalid, if possible.

I am using version 3.30 with Python 3.10 on Windows 11.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/532/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1695428235 I_kwDOCGYnMM5lDi6L 538 `table.upsert_all` fails to write rows when `not_null` is present xavdid 1231935 closed 0     9 2023-05-04T07:30:38Z 2023-05-08T20:06:35Z 2023-05-08T19:27:02Z NONE  

I found an odd bug today, where calls to table.upsert_all don't write rows if you include the not_null kwarg.

Repro Example

```py
from sqlite_utils import Database

db = Database("upsert-test.db")

db["comments"].upsert_all(
    [{"id": 1, "name": "david"}],
    pk="id",
    not_null=["name"],
)

assert list(db["comments"].rows)  # err!
```

The schema is correctly created:

```sql
CREATE TABLE [comments] (
   [id] INTEGER PRIMARY KEY,
   [name] TEXT NOT NULL
)
```

But no rows are created. Removing the not_null kwarg makes it work as expected, as does using an insert_all call instead.

Version Info

  • Python: 3.11.0
  • sqlite-utils: 3.30
  • sqlite: 3.39.5 2022-10-14
sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/538/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1700840265 I_kwDOCGYnMM5lYMNJ 541 Get tests to pass with `pytest -Werror` simonw 9599 open 0     1 2023-05-08T19:57:23Z 2023-05-08T19:59:35Z   OWNER  

Inspired by: - #534

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/541/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1622640374 I_kwDOCGYnMM5gt4b2 534 ResourceWarning: unclosed file djhenderson 1244826 closed 0     1 2023-03-14T03:02:18Z 2023-05-08T19:56:29Z 2023-05-08T19:56:29Z NONE  

Issuing either

```
py -Wdefault -m sqlite_utils insert dogs.db dogs dogs0.csv --csv
  [#############-----------------------]   36%
  [####################################]  100%C:\Users\Doug\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlite_utils\cli.py:1187: ResourceWarning: unclosed file <_io.TextIOWrapper name='dogs0.csv' encoding='utf-8-sig'>
  insert_upsert_implementation(
ResourceWarning: Enable tracemalloc to get the object allocation traceback
```

or set pythonwarnings=default

```
sqlite-utils insert dogs.db dogs dogs0.csv --csv
  [#############-----------------------]   36%
  [####################################]  100%C:\Users\Doug\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlite_utils\cli.py:1187: ResourceWarning: unclosed file <_io.TextIOWrapper name='dogs0.csv' encoding='utf-8-sig'>
  insert_upsert_implementation(
ResourceWarning: Enable tracemalloc to get the object allocation traceback
```

exhibits a ResourceWarning indicating that the CSV file being loaded is not closed.

```
sqlite-utils --version
sqlite-utils, version 3.30

py --version
Python 3.11.2

Windows Version 10.0.19045 Build 19045

SQLite version 3.41.0 2023-02-21 18:09:37
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/534/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1699184583 I_kwDOCGYnMM5lR3_H 540 sphinx.builders.linkcheck build error simonw 9599 closed 0     4 2023-05-07T18:37:09Z 2023-05-08T04:56:13Z 2023-05-07T18:42:36Z OWNER  

https://readthedocs.org/projects/sqlite-utils/builds/20512693/

```
Running Sphinx v6.2.1

Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/latest/lib/python3.8/site-packages/sphinx/registry.py", line 442, in load_extension
    mod = import_module(extname)
  File "/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/latest/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/latest/lib/python3.8/site-packages/sphinx/builders/linkcheck.py", line 20, in <module>
    from requests import Response
  File "/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/latest/lib/python3.8/site-packages/requests/__init__.py", line 43, in <module>
    import urllib3
  File "/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/latest/lib/python3.8/site-packages/urllib3/__init__.py", line 38, in <module>
    raise ImportError(
ImportError: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with OpenSSL 1.0.2n 7 Dec 2017. See: https://github.com/urllib3/urllib3/issues/2168

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/latest/lib/python3.8/site-packages/sphinx/cmd/build.py", line 280, in build_main
    app = Sphinx(args.sourcedir, args.confdir, args.outputdir,
  File "/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/latest/lib/python3.8/site-packages/sphinx/application.py", line 225, in __init__
    self.setup_extension(extension)
  File "/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/latest/lib/python3.8/site-packages/sphinx/application.py", line 404, in setup_extension
    self.registry.load_extension(self, extname)
  File "/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/latest/lib/python3.8/site-packages/sphinx/registry.py", line 445, in load_extension
    raise ExtensionError(__('Could not import extension %s') % extname,
sphinx.errors.ExtensionError: Could not import extension sphinx.builders.linkcheck (exception: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with OpenSSL 1.0.2n 7 Dec 2017. See: https://github.com/urllib3/urllib3/issues/2168)

Extension error:
Could not import extension sphinx.builders.linkcheck (exception: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with OpenSSL 1.0.2n 7 Dec 2017. See: https://github.com/urllib3/urllib3/issues/2168)
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/540/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1699174055 I_kwDOCGYnMM5lR1an 539 `--raw-lines` option, like `--raw` for multiple lines simonw 9599 closed 0     4 2023-05-07T18:07:46Z 2023-05-07T18:43:24Z 2023-05-07T18:26:18Z OWNER  

I wanted newline-separated output of the first column of every row in the results - like --raw but for more than one line.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/539/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1393202060 I_kwDOCGYnMM5TCpOM 496 devrel/python api: Pylance type hinting chapmanjacobd 7908073 open 0     4 2022-10-01T03:03:34Z 2023-05-03T05:53:27Z   CONTRIBUTOR  

Pylance is generally pretty good at figuring out stuff but sqlite-utils has some quirks which make type hinting kinda useless. Maybe you don't care but I thought I would bring it to your attention.

For example:

db["subs"].insert_all(subs, pk="index")

Cannot access member "insert_all" for type "View" Member "insert_all" is unknown

insert_all and all the other methods show up as type issues because the program can't know whether something is a View or a Table. Fair enough. But that basically throws all type checking out the window.

pk="index" also shows up as a type issue:

Argument of type "Literal['index']" cannot be assigned to parameter "pk" of type "Default" in function "insert_all" "Literal['index']" is incompatible with "Default"

I think this is because DEFAULT is an empty class?

maybe a few small changes could be made to make the library more type-friendly

The interim solution is of course to turn off type hints completely for the line db["subs"].insert_all(subs, pk="index") # type: ignore
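A slightly less blunt interim workaround than # type: ignore is casting to Table, since the object really is one at runtime; this quiets the View/Table member-access complaints, though the pk="index" sentinel issue is separate. A sketch (subs.db and the sample row are made up):

```python
from typing import cast

from sqlite_utils import Database
from sqlite_utils.db import Table

db = Database("subs.db")
subs = cast(Table, db["subs"])  # tell the type checker this is a Table, not a View

subs.insert_all([{"index": 1, "title": "example"}], pk="index")
```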

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/496/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1203842656 I_kwDOCGYnMM5HwS5g 425 `sqlite3.NotSupportedError`: deterministic=True requires SQLite 3.8.3 or higher simonw 9599 closed 0     5 2022-04-13T22:16:53Z 2023-04-15T20:14:58Z 2022-04-13T22:48:57Z OWNER  

Got this error while investigating: - #421

Even though I was using the LD_PRELOAD trick from https://til.simonwillison.net/sqlite/ld-preload to use a newer version of SQLite.

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/421#issuecomment-1098531354

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/425/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
907795562 MDU6SXNzdWU5MDc3OTU1NjI= 265 Using enable_fts before search term prabhur 36287 open 0     1 2021-06-01T01:43:34Z 2023-04-01T17:27:18Z   NONE  

Many thanks for the sqlite-utils suite of utilities. Has made my life much much easier. I used this to create a table and enable FTS. All works fine. The datasette utility detects FTS and shows a text box. Searching for a term using that interface works well.

However, when I start to use features by following https://www.sqlite.org/fts5.html section "3. Full-text Query Syntax" I seem to run into issues that I suspect are due to the escape_fts wrapper function.

As an example, if I search for the term "^குகை" in the text box in datasette it produces 140 results. However, when I tweak the query produced by datasette to not use "escape_fts" it produces 5 results.

Similarly, when I try to restrict the search to a single column in FTS using a spec like {title : ^குகை} it returns no rows. The same thing pulls results when used without escape_fts. The text in the table is in Tamil language and the search term is a Tamil word.

... where posts_fts match escape_fts(:search) vs

... where posts_fts match (:search)

Any ideas why? How can I get the benefits of both escaping as well as utilizing different facets of providing / controlling search terms? Thanks.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/265/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
702386948 MDU6SXNzdWU3MDIzODY5NDg= 159 .delete_where() does not auto-commit (unlike .insert() or .upsert()) spdkils 11712349 open 0     9 2020-09-16T01:55:52Z 2023-04-01T17:21:05Z   NONE  

When you use the delete_where() function on a table, it never commits....

Is that intentional?
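In the meantime, a sketch of the obvious workarounds: commit explicitly on the underlying connection, or run the delete inside the sqlite3 connection context manager, which commits on exit (data.db/mytable are placeholders):

```python
import sqlite_utils

db = sqlite_utils.Database("data.db")

# option 1: commit explicitly after the delete
db["mytable"].delete_where("id > ?", [100])
db.conn.commit()

# option 2: the connection context manager commits when the block exits cleanly
with db.conn:
    db["mytable"].delete_where("id > ?", [100])
```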

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/159/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1114543475 I_kwDOCGYnMM5CbpVz 388 Link to stable docs from older versions simonw 9599 closed 0     7 2022-01-26T01:55:46Z 2023-03-26T23:43:12Z 2022-01-26T02:00:22Z OWNER  

https://sqlite-utils.datasette.io/en/2.14.1/ isn't showing a link to the stable release right now.

I should also apply the same fix I used for Datasette in: - https://github.com/simonw/datasette/issues/1608

TIL: https://til.simonwillison.net/readthedocs/link-from-latest-to-stable

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/388/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1620516340 I_kwDOCGYnMM5glx30 533 ReadTheDocs error: not all arguments converted during string formatting simonw 9599 closed 0     2 2023-03-12T21:21:05Z 2023-03-12T21:25:33Z 2023-03-12T21:25:33Z OWNER  

This came up as a failure running tests for: - #531

Traceback on https://readthedocs.org/projects/sqlite-utils/builds/19749348/

``` File "/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/531/lib/python3.8/site-packages/docutils/parsers/rst/states.py", line 889, in interpreted nodes, messages2 = role_fn(role, rawsource, text, lineno, self) File "/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/531/lib/python3.8/site-packages/sphinx/ext/extlinks.py", line 103, in role title = caption % part TypeError: not all arguments converted during string formatting

Exception occurred: File "/home/docs/checkouts/readthedocs.org/user_builds/sqlite-utils/envs/531/lib/python3.8/site-packages/sphinx/ext/extlinks.py", line 103, in role title = caption % part TypeError: not all arguments converted during string formatting ```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/533/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1572766460 I_kwDOCGYnMM5dvoL8 524 Transformation type `--type DATETIME` 4l1fe 21095447 closed 0     15 2023-02-06T15:18:42Z 2023-02-15T12:10:54Z 2023-02-15T12:10:54Z NONE  

Hey. Currently I do the transformation with --type TEXT, but I noticed, using the SQLAlchemy-based library dataset, that reading and writing differ depending on the column types TEXT and DATETIME.

Is it possible to alter a column type to DATETIME somehow using Sqlite-Utils?

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/524/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1560651350 I_kwDOCGYnMM5dBaZW 523 Feature request: trim all leading and trailing white space for all columns for all tables in a database fgregg 536941 open 0     1 2023-01-28T02:40:10Z 2023-01-28T02:41:14Z   CONTRIBUTOR  

It's pretty common that I need to trim leading or trailing white space from lots of columns in a database as part of an initial ETL.

I use the following recipe a lot, and it would be great to include this functionality into sqlite-utils

trimify.sql

```sql
select 'select group_concat(''update [' || name || '] set ['' || name || ''] = trim(['' || name || ''])'', ''; '') || ''; '' as sql_to_run from pragma_table_info('''||name||''');' from sqlite_schema;
```

then something like:

```bash
sqlite3 example.db < scripts/trimify.sql > table_trim.sql && \
sqlite3 $example.db < table_trim.sql > trim.sql && \
sqlite3 $example.db < trim.sql
```
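A rough sketch of the same idea through the Python API, in case it helps shape the feature - this assumes every TEXT column in every table should be trimmed, and relies on convert() skipping null/empty cells (which is fine for trimming):

```python
import sqlite_utils

db = sqlite_utils.Database("example.db")
for table in db.tables:
    text_columns = [column.name for column in table.columns if column.type.upper() == "TEXT"]
    for name in text_columns:
        table.convert(name, lambda value: value.strip())
```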

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/523/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1550536442 I_kwDOCGYnMM5ca076 521 Custom JSON encoder janrito 31504 open 0     0 2023-01-20T09:19:40Z 2023-01-20T09:19:40Z   NONE  

It would be nice if we could specify a custom encoder (and decoder) for types that will need extra deserialisation – e.g., sets, enums or sparse matrices – or even project-specific types

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/521/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1373224657 I_kwDOCGYnMM5R2b7R 488 `sqlite-utils transform` should set empty strings to null when converting text columns to integer/float simonw 9599 open 0     5 2022-09-14T15:51:30Z 2022-12-23T17:38:55Z   OWNER  

/tmp % echo "id,age,weight\n1,3,2.5\n2,," | sqlite-utils insert test.db test - --csv /tmp % sqlite-utils schema test.db CREATE TABLE [test] ( [id] TEXT, [age] TEXT, [weight] TEXT ); /tmp % sqlite-utils transform test.db test --type age integer --type weight float /tmp % sqlite-utils schema test.db CREATE TABLE "test" ( [id] TEXT, [age] INTEGER, [weight] FLOAT ); /tmp % sqlite-utils rows test.db test [{"id": "1", "age": 3, "weight": 2.5}, {"id": "2", "age": "", "weight": ""}] It would be neat if this resulted in the following instead: {"id": "2", "age": null, "weight": null} Related Discord discussion: https://discord.com/channels/823971286308356157/823971286941302908/1019635490833567794

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/488/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1487764628 I_kwDOCGYnMM5YrXyU 518 flake8 ValueError: Error code '#' supplied to 'extend-ignore' option... simonw 9599 closed 0     0 2022-12-10T01:30:24Z 2022-12-10T01:36:46Z 2022-12-10T01:36:46Z OWNER  

Error code '#' supplied to 'extend-ignore' option does not match '^[A-Z]{1,3}[0-9]{0,3}$'

https://github.com/simonw/sqlite-utils/actions/runs/3662011265/jobs/6190770361

I think from this:

https://github.com/simonw/sqlite-utils/blob/e660635cea6c32f4022818380b1e1ee88e7c93a6/setup.cfg#L1-L3

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/518/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1487757143 I_kwDOCGYnMM5YrV9X 517 Drop support for Python 3.6 simonw 9599 closed 0     1 2022-12-10T01:23:31Z 2022-12-10T01:36:36Z 2022-12-10T01:36:36Z OWNER  

CI has started failing for Python 3.6: https://github.com/simonw/sqlite-utils/actions/runs/3576322798

It's fixable by swiching away from ubuntu-latest according to:

  • https://github.com/actions/setup-python/issues/355#issuecomment-1335042510

But https://endoflife.date/python says that 3.6 end of life was almost 6 years ago, and end of security support nearly 1 year ago.

So I'm OK dropping support entirely - Python 3.6 users will still be able to install version 3.30, just not any releases that come next.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/517/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1479914599 I_kwDOCGYnMM5YNbRn 516 Feature request: output number of ignored/replaced rows for insert command simonw 9599 open 0     4 2022-12-06T18:59:21Z 2022-12-06T19:08:14Z   OWNER  

https://hachyderm.io/@briandorsey/109468185742876820

I'm fiddling with piping json to insert -ignore I'd love to see the count of records inserted & ignored, but didn't see a way to do that in the help/docs.

Example: xh "https://hachyderm.io/api/v1/timelines/tag/rust?max_id=109443380308326328" | sqlite-utils insert aoc.db aoc - --pk=id --ignore

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/516/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1434911255 I_kwDOCGYnMM5VhwIX 510 Cannot enable FTS5 despite it being available ar-jan 1176293 closed 0     3 2022-11-03T16:03:49Z 2022-11-18T18:37:52Z 2022-11-17T10:36:28Z NONE  

When I do sqlite-utils enable-fts my.db table_name column_name (with or without --fts5), I get an FTS4 virtual table instead of the expected FTS5.

FTS5 is however available and Python/SQLite versions do not seem to be the issue. I can manually create the FTS5 virtual table, and then Datasette also works with it from this same Python environment.

>>> sqlite3.version 2.6.0 >>> sqlite3.sqlite_version 3.39.4

PRAGMA compile_options; includes ENABLE_FTS5.

sqlite-utils, version 3.30.

Any ideas what's happening and how to fix?
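In case it helps with debugging, this is roughly what the manual workaround looks like through the Python API (table_name/column_name are placeholders; enable-fts --fts5 should be producing an equivalent virtual table):

```python
import sqlite_utils

db = sqlite_utils.Database("my.db")
db.execute(
    "CREATE VIRTUAL TABLE [table_name_fts] USING FTS5 ([column_name], content=[table_name])"
)
db["table_name"].populate_fts(["column_name"])
```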

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/510/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1453134846 I_kwDOCGYnMM5WnRP- 513 Add or document streamlined workflow for importing Datasette csv / json exports henry501 19328961 open 0     0 2022-11-17T10:54:47Z 2022-11-17T10:54:47Z   NONE  

I'm working on some small front-end enhancements to the laion-aesthetic-datasette project, and I wanted to partially populate a database directly using exports from the existing Datasette instance instead of downloading the parquet files and creating my own multi-GB database.

There have been a number of small issues that are certainly related to my relative lack of familiarity with the toolkit, but that are still surprising.

For example: a CSV export of the images table (http://laion-aesthetic.datasette.io/laion-aesthetic-6pls.csv?sql=select+rowid%2C+url%2C+text%2C+domain_id%2C+width%2C+height%2C+similarity%2C+punsafe%2C+pwatermark%2C+aesthetic%2C+hash%2C+index_level_0+from+images+order+by+random%28%29+limit+100) has nested single quotes, double quotes, and commas that aren't handled by rows_from_file. Similarly, the json output has to be manually transformed to add the column names and remove extraneous information before sqlite_utils can import it.

I was able to work through these issues, but as an enhancement it would be really helpful to create or document a clear workflow that avoids the friction of this data transformation.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/513/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1450952393 I_kwDOCGYnMM5We8bJ 512 mypy failures in CI simonw 9599 closed 0     3 2022-11-16T06:22:48Z 2022-11-16T07:49:51Z 2022-11-16T07:49:50Z OWNER  

https://github.com/simonw/sqlite-utils/actions/runs/3472012235 failed on Python 3.11:

Truncated output:

```
sqlite_utils/db.py:2467: note: PEP 484 prohibits implicit Optional. Accordingly, mypy has changed its default to no_implicit_optional=True
sqlite_utils/db.py:2467: note: Use https://github.com/hauntsaninja/no_implicit_optional to automatically upgrade your codebase
sqlite_utils/db.py:2530: error: Incompatible default for argument "where" (default has type "None", argument has type "str")  [assignment]
sqlite_utils/db.py:2530: note: PEP 484 prohibits implicit Optional. Accordingly, mypy has changed its default to no_implicit_optional=True
sqlite_utils/db.py:2530: note: Use https://github.com/hauntsaninja/no_implicit_optional to automatically upgrade your codebase
sqlite_utils/db.py:2658: error: Argument 1 to "count_where" of "Queryable" has incompatible type "Optional[str]"; expected "str"  [arg-type]
Found 23 errors in 1 file (checked 51 source files)
```

Best look at https://github.com/hauntsaninja/no_implicit_optional

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/512/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1436539554 I_kwDOCGYnMM5Vn9qi 511 [insert_all, upsert_all] IntegrityError: constraint failed chapmanjacobd 7908073 closed 0     2 2022-11-04T19:21:48Z 2022-11-04T22:59:54Z 2022-11-04T22:54:09Z CONTRIBUTOR  

My understanding is that INSERT OR IGNORE will ignore rows where inserts would cause duplicate keys, so I'm not sure exactly why the error is raised from sqlite3.

```
import argparse
from pathlib import Path

from xklb import db, utils
from xklb.utils import log


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("database")
    parser.add_argument("dbs", nargs="*")
    parser.add_argument("--upsert")
    parser.add_argument("--db", "-db", help=argparse.SUPPRESS)
    parser.add_argument("--verbose", "-v", action="count", default=0)
    args = parser.parse_args()

    if args.db:
        args.database = args.db
    Path(args.database).touch()
    args.db = db.connect(args)
    log.info(utils.dict_filter_bool(args.__dict__))

    return args


def merge_db(args, source_db):
    source_db = str(Path(source_db).resolve())

    s_db = db.connect(argparse.Namespace(database=source_db, verbose=args.verbose))
    for table in [s for s in s_db.table_names() if not "_fts" in s and not s.startswith("sqlite_")]:
        log.info("[%s]: %s", source_db, table)
        with s_db.conn:
            data = s_db[table].rows

        with args.db.conn:
            if args.upsert:
                args.db[table].upsert_all(data, pk=args.upsert.split(","), alter=True)
            else:
                args.db[table].insert_all(data, alter=True, replace=True)


def merge_dbs():
    args = parse_args()
    for s_db in args.dbs:
        merge_db(args, s_db)


if __name__ == "__main__":
    merge_dbs()
```

```
$ lb-dev merge video.db tube_71.db --upsert path -vv
SQL: INSERT OR IGNORE INTO media VALUES(?); - params: ['https://archive.org/details/088ghostofachanceroygetssackedrevengeofthelivinglunchdvdripxvidphz']
...
File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:3122, in Table.insert_all(self, records, pk, foreign_keys, column_order, not_null, defaults, batch_size, hash_id, hash_id_columns, alter, ignore, replace, truncate, extracts, conversions, columns, upsert, analyze)
   3116     all_columns += [
   3117         column for column in record if column not in all_columns
   3118     ]
   3120     first = False
-> 3122     self.insert_chunk(
   3123         alter,
   3124         extracts,
   3125         chunk,
   3126         all_columns,
   3127         hash_id,
   3128         hash_id_columns,
   3129         upsert,
   3130         pk,
   3131         conversions,
   3132         num_records_processed,
   3133         replace,
   3134         ignore,
   3135     )
   3137     if analyze:
   3138         self.analyze()

File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:2887, in Table.insert_chunk(self, alter, extracts, chunk, all_columns, hash_id, hash_id_columns, upsert, pk, conversions, num_records_processed, replace, ignore)
   2885 for query, params in queries_and_params:
   2886     try:
-> 2887         result = self.db.execute(query, params)
   2888     except OperationalError as e:
   2889         if alter and (" column" in e.args[0]):
   2890             # Attempt to add any missing columns, then try again

File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:484, in Database.execute(self, sql, parameters)
    482     self._tracer(sql, parameters)
    483 if parameters is not None:
--> 484     return self.conn.execute(sql, parameters)
    485 else:
    486     return self.conn.execute(sql)

IntegrityError: constraint failed

/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py(484)execute()
    482     self._tracer(sql, parameters)
    483 if parameters is not None:
--> 484     return self.conn.execute(sql, parameters)
    485 else:
    486     return self.conn.execute(sql)
```

sqlite3 --version 3.36.0 2021-06-18 18:36:39

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/511/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
473083260 MDU6SXNzdWU0NzMwODMyNjA= 50 "Too many SQL variables" on large inserts simonw 9599 closed 0     4 2019-07-25T21:43:31Z 2022-11-04T14:38:36Z 2019-07-28T11:59:33Z OWNER  

Reported here: https://github.com/dogsheep/healthkit-to-sqlite/issues/9

It looks like there's a default limit of 999 variables - we need to be smart about that, maybe dynamically lower the batch size based on the number of columns.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/50/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1429029604 I_kwDOCGYnMM5VLULk 506 Make `cursor.rowcount` accessible (wontfix) simonw 9599 closed 0     3 2022-10-30T21:51:55Z 2022-11-01T17:37:47Z 2022-11-01T17:37:13Z OWNER  

In building this Datasette feature on top of sqlite-utils I thought it might be useful to expose the number of rows that had been affected by a bulk insert or update - the cursor.rowcount:

  • https://github.com/simonw/datasette/issues/1866

This isn't currently exposed by sqlite-utils.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/506/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1430325103 I_kwDOCGYnMM5VQQdv 507 conn.execute: UnicodeEncodeError: 'utf-8' codec can't encode character chapmanjacobd 7908073 closed 0     1 2022-10-31T18:49:51Z 2022-11-01T00:40:17Z 2022-11-01T00:40:16Z CONTRIBUTOR  

I'm not really sure what caused this and it happened in the middle of my program (after running for 35775 seconds).

```
Extracting metadata 49.9% (chunk 9893 of 19831)
...
  File "/home/xk/.local/lib/python3.10/site-packages/xklb/fs_extract.py", line 90, in extract_chunk
    args.db["media"].insert_all(utils.list_dict_filter_bool(media), pk="path", alter=True, replace=True)
  File "/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py", line 3107, in insert_all
    self.insert_chunk(
  File "/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py", line 2872, in insert_chunk
    result = self.db.execute(query, params)
  File "/home/xk/.local/lib/python3.10/site-packages/sqlite_utils/db.py", line 483, in execute
    return self.conn.execute(sql, parameters)
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 62: surrogates not allowed
```

This might be relevant: https://stackoverflow.com/questions/31898353/python-cant-encode-with-surrogateescape

I'm going to try re-running with

```py
def execute(
    self, sql: str, parameters: Optional[Union[Iterable, dict]] = None
) -> sqlite3.Cursor:
    """
    Execute SQL query and return a ``sqlite3.Cursor``.

    :param sql: SQL query to execute
    :param parameters: Parameters to use in that query - an iterable for ``where id = ?``
      parameters, or a dictionary for ``where id = :id``
    """
    try:
        if self._tracer:
            self._tracer(sql, parameters)
        if parameters is not None:
            return self.conn.execute(sql, parameters)
        else:
            return self.conn.execute(sql)
    except UnicodeEncodeError:
        sql = sql.encode('utf-8', 'surrogatepass').decode('utf-8')
        if parameters is not None:
            parameters = parameters.encode('utf-8', 'surrogatepass').decode('utf-8')
        return self.execute(sql, parameters)

```
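In the meantime, one possible workaround (nothing to do with sqlite-utils itself - the scrub helper below is hypothetical) is to strip lone surrogates from string values before they reach `insert_all()`:

```python
def scrub(value):
    # Round-tripping through UTF-8 with errors="replace" swaps any lone
    # surrogate code points (like '\udcc3') for '?' so sqlite3 can store them.
    if isinstance(value, str):
        return value.encode("utf-8", "replace").decode("utf-8")
    return value

record = {"path": "bad\udcc3name"}
clean = {key: scrub(value) for key, value in record.items()}
print(clean)  # {'path': 'bad?name'}
```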

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/507/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1423182778 I_kwDOCGYnMM5U1Au6 505 Release sqlite-utils 3.30 simonw 9599 closed 0     2 2022-10-25T22:20:05Z 2022-10-25T22:41:26Z 2022-10-25T22:41:16Z OWNER  

https://github.com/simonw/sqlite-utils/compare/3.29...defa2974c6d3abc19be28d6b319649b8028dc966

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/505/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
1386562662 I_kwDOCGYnMM5SpURm 493 Tiny typographical error in install/uninstall docs simonw 9599 open 0     3 2022-09-26T19:00:42Z 2022-10-25T21:31:15Z   OWNER  

Added in:

- #483

I don't know how to fix this in Sphinx. On https://sqlite-utils.datasette.io/en/latest/cli.html#cli-install I'm currently getting this:

The insert –convert and query –functions options

But I want it to display insert --convert and not insert –convert there.

Here's the code: https://github.com/simonw/sqlite-utils/blob/85247038f70d7eb2f3e272cfeaa4c44459cafba8/docs/cli.rst#L2125
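One likely culprit is Sphinx's smartquotes transform, which rewrites `--` as an en dash outside literal markup; a possible (untested) fix in `docs/conf.py`:

```python
# docs/conf.py - keep smart quotes and ellipses but leave "--" alone,
# so "insert --convert" renders with both hyphens (untested suggestion):
smartquotes_action = "qe"

# or switch the transform off entirely:
# smartquotes = False
```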

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/493/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1392690202 I_kwDOCGYnMM5TAsQa 495 Support JSON values returned from .convert() functions mhalle 649467 closed 0     3 2022-09-30T16:33:49Z 2022-10-25T21:23:37Z 2022-10-25T21:23:28Z NONE  

When using the convert function on a JSON column, the result of the conversion function must be a string. If the return value is either a dict (object) or a list (array), the convert call will error out with an unhelpful user-defined function exception.

It makes sense that since the original column value was a string and required conversion to data structures, the result should be converted back into a JSON string as well. However, other functions auto-convert to JSON string representation, so the fact that convert doesn't could be surprising.

At least the documentation should note this requirement, because the sqlite error messages won't readily reveal the issue.
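Until that changes, a sketch of the workaround is to serialize the structure back to a JSON string before returning it (table and column names here are invented):

```python
import json
import sqlite_utils

db = sqlite_utils.Database("data.db")
db["items"].insert({"id": 1, "tags": '["b", "a"]'}, pk="id")

# convert() expects a string back, so wrap the transformed structure
# in json.dumps() instead of returning the list itself:
db["items"].convert("tags", lambda value: json.dumps(sorted(json.loads(value))))
```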

If only sqlite's JSON column type meant something :)

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/495/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
, [active_lock_reason] TEXT, [performed_via_github_app] TEXT, [reactions] TEXT, [draft] INTEGER, [state_reason] TEXT);
CREATE INDEX [idx_issues_repo]
                ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
                ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
                ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
                ON [issues] ([user]);