issues


89 rows where repo = 140912432 and state = "open" sorted by updated_at descending


id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association pull_request body repo type active_lock_reason performed_via_github_app reactions draft state_reason
1988525411 I_kwDOCGYnMM52hn1j 603 Python 3.12 Bug report constantinedev 1324252 open 0     1 2023-11-10T22:57:48Z 2023-12-08T05:10:31Z   NONE  

I started with the new Python 3 version 3.12.0 and got an error when connecting to the database:

```
Traceback (most recent call last):
  File "/home/t/Development/python/FKPJ/ClinicSYS/run.py", line 1, in <module>
    import re, os, io, json, sqlite_utils, requests, pytz, logging
  File "/home/t/.local/lib/python3.12/site-packages/sqlite_utils/__init__.py", line 1, in <module>
    from .db import Database
  File "/home/t/.local/lib/python3.12/site-packages/sqlite_utils/db.py", line 277, in <module>
    class Database:
  File "/home/t/.local/lib/python3.12/site-packages/sqlite_utils/db.py", line 306, in Database
    filename_or_conn: Optional[Union[str, pathlib.Path, sqlite3.Connection]] = None,
                                                        ^^^^^^^^^^^^^^^^^^
```

This bug has been present in sqlite-utils since v3.33. Is anyone else seeing the same?

For now my workaround is to pin sqlite-utils to v3.32.1 on Python 3.12 [tested], but the sqlite3.Connection problem remains.

This doesn't happen on Python versions up to 3.11 [tested], only on Python 3.12.0. From my testing the error comes from the sqlite3 connection: the traceback points at sqlite_utils and the sqlite3.Connection type hint. What can I do?

Let's fix this together.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/603/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1978603203 I_kwDOCGYnMM517xbD 602 `sqlite-utils transform` removes the `AUTOINCREMENT` keyword ArsTapatun 4472046 open 0     0 2023-11-06T08:48:43Z 2023-11-06T08:48:43Z   NONE  

Context

We ran into this bug randomly, noticing that deleted ROWID would get reused after migrating the DB. Using transform to change any column in the table will also unexpectedly strip away the AUTOINCREMENT keyword from the primary key definition, even if it was not the transformation target.

Reproducible example

Original database

```sql
$ sqlite3 test.db << EOF
CREATE TABLE mytable (
    col1 INTEGER PRIMARY KEY AUTOINCREMENT,
    col2 TEXT NOT NULL
)
EOF

$ sqlite3 test.db ".schema mytable"
CREATE TABLE mytable (
    col1 INTEGER PRIMARY KEY AUTOINCREMENT,
    col2 TEXT NOT NULL
);
```

Modified database after sqlite-utils

```sql
$ sqlite-utils transform test.db mytable --rename col2 renamedcol2

$ sqlite3 test.db "SELECT sql FROM sqlite_master WHERE name = 'mytable';"
CREATE TABLE IF NOT EXISTS "mytable" (
    [col1] INTEGER PRIMARY KEY,
    [renamedcol2] TEXT NOT NULL
);
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/602/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1977155641 I_kwDOCGYnMM512QA5 601 Move plugin directory into documentation simonw 9599 open 0     0 2023-11-04T04:07:52Z 2023-11-04T04:07:52Z   OWNER  

https://github.com/simonw/sqlite-utils-plugins should be in the official documentation.

I can use the same pattern as https://llm.datasette.io/en/stable/plugins/directory.html

https://til.simonwillison.net/readthedocs/stable-docs

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/601/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1920416843 I_kwDOCGYnMM5ydzxL 597 sqlite-utils insert-files should be able to convert fields grimnight 1737541 open 0     0 2023-09-30T22:20:47Z 2023-09-30T22:20:47Z   NONE  

Currently both insert-files and convert are needed to create sqlar files; it would be more convenient if this could be done with a single command.

```shell
~ ❯ cat test.py
import os

class Example:
    def __init__(self, arg1, arg2):
        self.arg1 = arg1

~ ❯ sqlite-utils insert-files test.sqlar sqlar test.py -c name:name -c data:content -c mode:mode -c mtime:mtime -c sz:size --pk=name
  [####################################]  100%

~ ❯ sqlite-utils convert test.sqlar sqlar data "zlib.compress(value)" --import=zlib --where "name = 'test.py'"
  [####################################]  100%

~ ❯ cat test.py | sqlite-utils convert test.sqlar sqlar data "zlib.compress(sys.stdin.buffer.read())" --import=zlib --import=sys --where "name = 'test.py'" # Alternative way
  [####################################]  100%

~ ❯ sqlite3 test.sqlar "SELECT hex(data) FROM sqlar WHERE name = 'test.py';" | python3 -c "import sys, zlib; sys.stdout.buffer.write(zlib.decompress(bytes.fromhex(sys.stdin.read())))"
import os

class Example:
    def __init__(self, arg1, arg2):
        self.arg1 = arg1

~ ❯ rm test.py

~ ❯ sqlar -l test.sqlar
test.py

~ ❯ sqlar -x test.sqlar

~ ❯ cat test.py
import os

class Example:
    def __init__(self, arg1, arg2):
        self.arg1 = arg1
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/597/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
944846776 MDU6SXNzdWU5NDQ4NDY3NzY= 297 Option for importing CSV data using the SQLite .import mechanism simonw 9599 open 0     23 2021-07-14T22:36:41Z 2023-09-22T20:49:52Z   OWNER  

As seen in https://til.simonwillison.net/sqlite/import-csv - .mode csv and then .import school.csv schools is hugely faster than importing via sqlite-utils insert and doing the work in Python - but it can only be implemented by shelling out to the sqlite3 CLI tool; it's not functionality that is exposed by the Python sqlite3 module.

An option to use this would be useful - maybe something like this:

```
sqlite-utils insert blah.db blah blah.csv --fast
```
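For illustration, here's roughly what that option would have to do under the hood - a sketch that shells out to the sqlite3 binary, reusing the placeholder names from above, and assuming a recent enough sqlite3 CLI that accepts multiple dot-commands as separate arguments:

```python
import subprocess

def fast_csv_import(db_path, table, csv_path):
    # Drive SQLite's own CSV loader via the CLI, since .import is not
    # exposed through the Python sqlite3 module
    subprocess.run(
        ["sqlite3", db_path, ".mode csv", ".import {} {}".format(csv_path, table)],
        check=True,
    )

fast_csv_import("blah.db", "blah", "blah.csv")
```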
sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/297/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1891614971 I_kwDOCGYnMM5wv8D7 594 Represent compound foreign keys in table.foreign_keys output simonw 9599 open 0     2 2023-09-12T03:48:24Z 2023-09-12T03:51:13Z   OWNER  

Given this schema:

```sql
CREATE TABLE departments (
    campus_name TEXT NOT NULL,
    dept_code TEXT NOT NULL,
    dept_name TEXT,
    PRIMARY KEY (campus_name, dept_code)
);
CREATE TABLE courses (
    course_code TEXT PRIMARY KEY,
    course_name TEXT,
    campus_name TEXT NOT NULL,
    dept_code TEXT NOT NULL,
    FOREIGN KEY (campus_name, dept_code) REFERENCES departments(campus_name, dept_code)
);
```

The output of db["courses"].foreign_keys right now is:

```
[ForeignKey(table='courses', column='campus_name', other_table='departments', other_column='campus_name'),
 ForeignKey(table='courses', column='dept_code', other_table='departments', other_column='dept_code')]
```

Which suggests two ordinary foreign keys, not one compound foreign key.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/594/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1879214365 I_kwDOCGYnMM5wAokd 590 Ability to tell if a Database is an in-memory one simonw 9599 open 0     1 2023-09-03T19:50:15Z 2023-09-03T19:50:36Z   OWNER  

Currently the constructor accepts memory=True or memory_name=... and uses those to create a connection, but does not record what those values were:

https://github.com/simonw/sqlite-utils/blob/1260bdc7bfe31c36c272572c6389125f8de6ef71/sqlite_utils/db.py#L307-L349

This makes it hard to tell whether a Database object is connected to an in-memory or a file-based database, which is sometimes useful to know.
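A minimal sketch of what recording that state could look like, using a stripped-down stand-in for the real constructor (the `memory` attribute here is an assumption, not existing API):

```python
import sqlite3

class Database:
    def __init__(self, memory=False, memory_name=None):
        # Remember whether we were asked for an in-memory database
        self.memory = bool(memory or memory_name)
        if memory_name:
            # Named in-memory databases use a shared-cache URI
            self.conn = sqlite3.connect(
                "file:{}?mode=memory&cache=shared".format(memory_name), uri=True
            )
        elif memory:
            self.conn = sqlite3.connect(":memory:")
        else:
            raise NotImplementedError("file-based connections omitted from this sketch")

db = Database(memory=True)
print(db.memory)  # True - so this is an in-memory database
```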

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/590/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1879209560 I_kwDOCGYnMM5wAnZY 589 Mechanism for de-registering registered SQL functions simonw 9599 open 0     3 2023-09-03T19:32:39Z 2023-09-03T19:36:34Z   OWNER  

I used a custom SQL function in a migration script and then realized that it should be de-registered before the end of the script to avoid leaking into the calling code.
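For reference, Python's sqlite3 module already supports removal by passing None as the callable, so the manual version looks like this (a sketch; shout() is an invented example function):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.create_function("shout", 1, lambda value: str(value).upper())
print(conn.execute("select shout('hi')").fetchone())  # ('HI',)

# De-register by passing None as the callable; any further use of shout()
# now raises sqlite3.OperationalError: no such function
conn.create_function("shout", 1, None)
```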

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/589/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1868713944 I_kwDOCGYnMM5vYk_Y 588 `table.get(column=value)` option for retrieving things not by their primary key simonw 9599 open 0     1 2023-08-28T00:41:23Z 2023-08-28T00:41:54Z   OWNER  

This came up working on this feature: - https://github.com/simonw/llm/pull/186

I have a table with this schema:

```sql
CREATE TABLE [collections] (
    [id] INTEGER PRIMARY KEY,
    [name] TEXT,
    [model] TEXT
);
CREATE UNIQUE INDEX [idx_collections_name] ON [collections] ([name]);
```

So the primary key is an integer (because it's going to have a huge number of rows foreign key related to it, and I don't want to store a larger text value thousands of times), but there is a unique constraint on the name - that would be the primary key column if not for all of those foreign keys.

Problem is, fetching the collection by name is actually pretty inconvenient.

Fetch by numeric ID:

Fetch by numeric ID:

```python
try:
    table["collections"].get(1)
except NotFoundError:
    pass  # It doesn't exist
```

Fetching by name:

```python
def get_collection(db, collection):
    rows = db["collections"].rows_where("name = ?", [collection])
    try:
        return next(rows)
    except StopIteration:
        raise NotFoundError("Collection not found: {}".format(collection))
```

It would be neat if, for columns where we know that we should always get zero or one result, we could do this instead:

```python
try:
    collection = table["collections"].get(name="entries")
except NotFoundError:
    pass  # It doesn't exist
```

The existing .get() method doesn't have any non-positional arguments, so using **kwargs like that should work:

https://github.com/simonw/sqlite-utils/blob/1260bdc7bfe31c36c272572c6389125f8de6ef71/sqlite_utils/db.py#L1495

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/588/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1856075668 I_kwDOCGYnMM5uoXeU 586 .transform() fails to drop column if table is part of a view simonw 9599 open 0     3 2023-08-18T05:25:22Z 2023-08-18T06:13:47Z   OWNER  

I got this error trying to drop a column from a table that was part of a SQL view:

error in view plugins: no such table: main.pypi_releases

Upon further investigation I found that this pattern seemed to fix it:

```python
def transform_the_table(conn):
    # Run this in a transaction:
    with conn:
        # We have to read all the views first, because we need to drop and recreate them
        db = sqlite_utils.Database(conn)
        views = {v.name: v.schema for v in db.views if table.lower() in v.schema.lower()}
        for view in views.keys():
            db[view].drop()
        db[table].transform(
            types=types,
            rename=rename,
            drop=drop,
            column_order=[p[0] for p in order_pairs],
        )
        # Now recreate the views
        for name, schema in views.items():
            db.create_view(name, schema)
```

So grab a copy of any view that might reference this table, start a transaction, drop those views, run the transform, then recreate the views.

I wonder if this should become an option in sqlite-utils? Maybe a recreate_views=True argument for table.transform(...)? Should it be opt-in or opt-out?

Originally posted by @simonw in https://github.com/simonw/datasette-edit-schema/issues/35#issuecomment-1683370548

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/586/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1754174496 I_kwDOCGYnMM5ojpQg 558 Ability to define unique columns when creating a table aguinane 1910303 open 0     0 2023-06-13T06:56:19Z 2023-08-18T01:06:03Z   NONE  

When creating a new table, it would be good to have an option to set unique columns similar to how not_null is set.

```python
from sqlite_utils import Database

columns = {"mRID": str, "name": str}
db = Database("example.db")
db["ExampleTable"].create(columns, pk="mRID", not_null=["mRID"], if_not_exists=True)
db["ExampleTable"].create_index(["mRID"], unique=True, if_not_exists=True)
```

So something like this would add the UNIQUE flag to the table definition.

python db["ExampleTable"].create(columns, pk="mRID", not_null=["mRID"], unique=["mRID"], if_not_exists=True)

```sql
CREATE TABLE ExampleTable (
    mRID TEXT PRIMARY KEY NOT NULL UNIQUE,
    name TEXT
);
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/558/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1818838294 I_kwDOCGYnMM5saUUW 578 Plugin hook for adding new output formats simonw 9599 open 0     5 2023-07-24T17:29:18Z 2023-08-07T15:41:49Z   OWNER  

What would it take to add a format hook? I'm still thinking about my GIS workflow, and being able to do sqlite-utils query ... --geojson would be nice. It's the one place my Datasette workflow is messy, having to do datasette . --get /path/to/query.geojson --setting max_rows_returned 10000 --load-extension spatialite. I know the current pattern is --csv, but maybe --format geojson is more future-proof.

https://discord.com/channels/823971286308356157/997738192360964156/1133076679011602432

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/578/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1839344979 I_kwDOCGYnMM5toi1T 582 Handling CSV/file input that contains NUL bytes betatim 1448859 open 0     0 2023-08-07T12:24:14Z 2023-08-07T12:24:14Z   NONE  

I was using sqlite-utils to create a DB from a CSV and it turns out the CSV contains a NUL byte.

When the processing reaches the line that contains the NUL an exception is raised.

I'm wondering if there is something that can be done in sqlite-utils to say "skip lines with encoding errors" or some such. I think it isn't super straightforward though as the exception comes from inside the csv module that does all the parsing.

Concretely the file is the KernelVersions.csv from https://www.kaggle.com/datasets/kaggle/meta-kaggle

This is the command and output:

```
$ sqlite-utils insert --csv kaggle.db kaggle KernelVersions.csv
  [------------------------------------]    0%
  [#####################---------------]   60%  00:04:24
Traceback (most recent call last):
  File "/home/foobar/miniconda/envs/meta-kaggle/bin/sqlite-utils", line 10, in <module>
    sys.exit(cli())
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py", line 1223, in insert
    insert_upsert_implementation(
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py", line 1085, in insert_upsert_implementation
    db[table].insert_all(
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/db.py", line 3198, in insert_all
    chunk = list(chunk)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/db.py", line 3742, in fix_square_braces
    for record in records:
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py", line 1071, in <genexpr>
    docs = (decode_base64_values(doc) for doc in docs)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py", line 1068, in <genexpr>
    docs = (verify_is_dict(doc) for doc in docs)
  File "/home/foobar/miniconda/envs/meta-kaggle/lib/python3.10/site-packages/sqlite_utils/cli.py", line 1003, in <genexpr>
    docs = (dict(zip(headers, row)) for row in reader)
_csv.Error: line contains NUL
```
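In the meantime, a possible workaround is to strip NUL bytes from the stream before the csv module sees them - a sketch, reusing the KernelVersions.csv name from the report:

```python
import csv

with open("KernelVersions.csv", newline="", encoding="utf-8") as f:
    cleaned = (line.replace("\0", "") for line in f)  # drop NUL bytes line by line
    reader = csv.reader(cleaned)
    headers = next(reader)
    for row in reader:
        pass  # hand each row to sqlite-utils / sqlite3 as usual
```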

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/582/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1822918995 I_kwDOCGYnMM5sp4lT 580 Add way to export to a csv file using the Python library kevinlinxc 44324811 open 0     0 2023-07-26T18:09:26Z 2023-07-26T18:09:26Z   NONE  

According to the documentation, we can make a csv output using the CLI tool, but not the Python library. Could we have the latter?
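Until such an API exists, this is only a few lines with the standard library's csv module on top of the documented db.query() method - a sketch with invented database, table and output names (it also assumes the table has at least one row):

```python
import csv
import sqlite_utils

db = sqlite_utils.Database("example.db")   # hypothetical database
rows = db.query("select * from mytable")   # hypothetical table; yields dicts
first = next(rows)
with open("out.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(first.keys()))
    writer.writeheader()
    writer.writerow(first)
    writer.writerows(rows)
```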

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/580/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1821108702 I_kwDOCGYnMM5si-ne 579 Special handling for SQLite column of type `JSON` asg017 15178711 open 0     0 2023-07-25T20:37:23Z 2023-07-25T20:37:23Z   CONTRIBUTOR  

sqlite-utils should detect and have special handling for columns with a declared JSON type. For example:

sql CREATE TABLE "dogs" ( id INTEGER PRIMARY KEY, name TEXT, friends JSON );

Automatic Nesting

According to "Nested JSON Values", sqlite-utils will only expand JSON if the --json-cols flag is passed. It looks like it'll try to json.load all text column to test if its JSON, which can get expensive on non-json columns.

Instead, sqlite-utils should by default (i.e. without the --json-cols flag) apply the maybe_json() operation to columns with a declared JSON type. The above table would then expand the "friends" column as expected, without the --json-cols flag:

```bash
sqlite-utils dogs.db "select * from dogs" | python -mjson.tool
```

```json
[
    {
        "id": 1,
        "name": "Cleo",
        "friends": [
            {"name": "Pancakes"},
            {"name": "Bailey"}
        ]
    }
]
```


I'm sure there are other ways sqlite-utils could specially handle JSON columns, so I'm keeping this open while I think of more.
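Here's a sketch of that rule with plain sqlite3, reusing the dogs schema from above - look up declared types once via PRAGMA table_info, then json.loads() only the JSON-declared columns (this is illustration, not current sqlite-utils behavior):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "dogs" (id INTEGER PRIMARY KEY, name TEXT, friends JSON)')
conn.execute(
    "INSERT INTO dogs VALUES (1, 'Cleo', ?)",
    (json.dumps([{"name": "Pancakes"}, {"name": "Bailey"}]),),
)

# Columns whose *declared* type is JSON - no need to probe every text value
json_cols = {
    row[1] for row in conn.execute("PRAGMA table_info(dogs)") if row[2].upper() == "JSON"
}

cursor = conn.execute("select * from dogs")
names = [d[0] for d in cursor.description]
for row in cursor:
    expanded = {
        col: json.loads(val) if col in json_cols and isinstance(val, str) else val
        for col, val in zip(names, row)
    }
    print(expanded)  # friends comes back as a Python list, not a string
```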

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/579/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1795219865 I_kwDOCGYnMM5rAOGZ 566 `--no-headers` doesn't work on most formats zellyn 33625 open 0     2 2023-07-09T03:43:36Z 2023-07-09T04:13:35Z   NONE  

Version 3.33

```
$ sqlite-utils query library.db 'select asin from audible' --fmt plain --no-headers | head -3
asin
0062804006
0062891421
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/566/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1784794489 I_kwDOCGYnMM5qYc15 562 Explore the intersection between sqlite-utils and dataclasses simonw 9599 open 0     1 2023-07-02T19:23:08Z 2023-07-02T19:26:39Z   OWNER  

Aside: this makes me think it might be cool if sqlite-utils had a way of working with dataclasses rather than just dicts, and knew how to create a SQLite table to match a dataclass and maybe how to code-generate dataclasses for a specific table schema (dynamically or even using code-generation that can be written to disk, for better editor integrations).

Originally posted by @simonw in https://github.com/simonw/llm/issues/65#issuecomment-1616742529
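As a rough illustration of the first half of that idea - deriving a table from a dataclass using only the existing public create()/insert() API (the Collection class and its values here are invented):

```python
from dataclasses import dataclass, fields, asdict

import sqlite_utils

@dataclass
class Collection:
    id: int
    name: str
    model: str

db = sqlite_utils.Database(memory=True)
# Dataclass field annotations map directly onto the column types create() accepts
db["collections"].create({f.name: f.type for f in fields(Collection)}, pk="id")
db["collections"].insert(asdict(Collection(id=1, name="entries", model="demo")))
print(db["collections"].schema)
```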

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/562/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1383646615 I_kwDOCGYnMM5SeMWX 491 Ability to merge databases and tables sgraaf 8904453 open 0     7 2022-09-23T11:10:55Z 2023-06-14T22:14:24Z   NONE  

Hi! Let me firstly say that I am a big fan of your work -- I follow your tweets and blog posts with great interest 😄.

Now onto the matter at hand: I think it would be great if sqlite-utils included a merge or combine command, with the purpose of combining different SQLite databases into a single SQLite database. This way, the newly "merged" database would contain all differently named tables contained in the databases to be merged as-is, as well as a concatenation of all tables of the same name.

This could look something like this:

```bash
sqlite-utils merge cats.db dogs.db > animals.db
```

I imagine this is rather straightforward if all databases involved in the merge contain differently named tables (i.e. no chance of conflicts), but things get slightly more complicated if two or more of the databases to be merged contain tables with the same name. Not only do you have to "do something" with the primary key(s), but these tables could also simply have different schemas (and therefore be incompatible for concatenation to begin with).

Anyhow, I would love your thoughts on this, and, if you are open to it, work together on the design and implementation!
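For the easy case (no table-name conflicts), here's roughly what such a command could do internally - a sketch with plain sqlite3 and ATTACH; note that CREATE TABLE ... AS SELECT drops primary keys and other constraints, so a real implementation would copy schemas properly:

```python
import sqlite3

merged = sqlite3.connect("animals.db")
for source in ("cats.db", "dogs.db"):
    merged.execute("ATTACH DATABASE ? AS src", (source,))
    tables = [r[0] for r in merged.execute(
        "SELECT name FROM src.sqlite_master WHERE type = 'table'"
    )]
    for table in tables:
        # Create an empty copy if needed, then append all rows
        merged.execute(
            'CREATE TABLE IF NOT EXISTS "{t}" AS SELECT * FROM src."{t}" WHERE 0'.format(t=table)
        )
        merged.execute('INSERT INTO "{t}" SELECT * FROM src."{t}"'.format(t=table))
    merged.commit()
    merged.execute("DETACH DATABASE src")
```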

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/491/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1733198948 I_kwDOCGYnMM5nToRk 555 Filter table by a large bunch of ids redraw 10843208 open 0     1 2023-05-31T00:29:51Z 2023-06-14T22:01:57Z   NONE  

Hi! this might be a question related to both SQLite & sqlite-utils, and you might be more experienced with them.

I have a large bunch of ids, and I'm wondering which is the best way to query them in terms of performance, and simplicity if possible.

The naive approach would be something like select * from table where rowid in (?, ?, ?...) but that wouldn't scale if ids are >1k.

Another approach might be creating a temp table, or in-memory db table, insert all ids in that table and then join with the target one.

I failed to attach an in-memory db both using sqlite-utils, and plain sql's execute(), so my closest approach is something like,

```python
def filter_existing_video_ids(video_ids):
    db = get_db()  # contains a "videos" table
    db.execute("CREATE TEMPORARY TABLE IF NOT EXISTS tmp (video_id TEXT NOT NULL PRIMARY KEY)")
    db["tmp"].insert_all([{"video_id": video_id} for video_id in video_ids])
    for row in db["tmp"].rows_where("video_id not in (select video_id from videos)"):
        yield row["video_id"]
    db["tmp"].drop()
```

That kinda worked, but I couldn't find an option in sqlite-utils's create_table() to mark a table as temporary. Also, the tmp table is not dropped at the end, even when calling .drop() and despite being created with the TEMPORARY keyword - though from what I've read it should be dropped automatically when the connection/session ends.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/555/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1740026046 I_kwDOCGYnMM5ntrC- 556 Support storing incrementally piped values mcint 601708 open 0     1 2023-06-04T00:45:23Z 2023-06-04T01:21:15Z   CONTRIBUTOR  

I'm trying to use sqlite-utils to store data generated incrementally. There are a few aspects of this that I don't currently know how to handle. I would like an option to apply writes incrementally, line-by-line as they are received. I would like an option to echo incremental progress. And, it would be nice to have

In particular, I'm using CoreLocationCLI -w -j to generate newline-delimited JSON.

One variant of the command

```bash
stdbuf -oL CoreLocationCLI -w -j | pee 'sqlite-utils insert loc.db loc -' nl
```

pee, from moreutils, is like tee but spawns and pipes to the processes created by invoking each of its arguments, so, for gratuitous demonstration, pee 'sponge out.log' cat would behave like tee.

It looks like I can get what I want with:

```bash
stdbuf -oL CoreLocationCLI -w -j | while read line; do <<<"$line" sqlite-utils insert loc.db loc -; echo "$line"; done | nl
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/556/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1720096994 I_kwDOCGYnMM5mhpji 554 `IndexError` when doing `.insert(..., pk='id')` after `insert_all` xavdid 1231935 open 0     1 2023-05-22T17:13:02Z 2023-05-22T17:18:33Z   NONE  

I believe this is related to https://github.com/simonw/sqlite-utils/issues/98.

When pk is specified by table A's insert call, it throws an index error if a different table has written a row with a higher rowid than exists in the first table. Here's a basic example:

```py
from sqlite_utils import Database

def test_pk_for_insert(fresh_db):
    user = {"id": "abc", "name": "david"}

    fresh_db["users"].insert(user, pk="id")

    fresh_db["comments"].insert_all(
        [
            {"id": "def", "text": "ok"},
            {"id": "ghi", "text": "great"},
        ],
    )

    fresh_db["users"].insert(
        user,
        ignore=True,
        # BUG: when specifying pk on the second insert call
        # db.py goes into a block it doesn't expect and we get the error
        pk="id",
    )

if __name__ == "__main__":
    db = Database("bug.db")
    if db["users"].exists():
        raise ValueError(
            "bug only shows on a new database - remove bug.db before running the script"
        )
    test_pk_for_insert(db)
```

The error is:

py File "/Users/david/projects/reddit-to-sqlite/.venv/lib/python3.11/site-packages/sqlite_utils/db.py", line 2960, in insert_chunk row = list(self.rows_where("rowid = ?", [self.last_rowid]))[0] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^ IndexError: list index out of range

The issue is in this block:

https://github.com/simonw/sqlite-utils/blob/2747257a3334d55e890b40ec58fada57ae8cfbfd/sqlite_utils/db.py#L2954-L2958

relevant locals are:

  • pk: 'id'
  • result.lastrowid: 2

What's most interesting is the comment # self.last_rowid will be 0 if a "INSERT OR IGNORE" happened, which doesn't seem to be the case here.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/554/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1124731464 I_kwDOCGYnMM5DCgpI 399 Make it easier to insert geometries, with documentation and maybe code simonw 9599 open 0     25 2022-02-05T00:11:26Z 2023-05-16T03:11:52Z   OWNER  

In playing with the new SpatiaLite helpers from #385 I noticed that actually populating geometry columns is still a little bit tricky. Here's what I ended up doing:

```python
import httpx, sqlite_utils

db = sqlite_utils.Database("/tmp/spatial.db")
attractions = httpx.get(
    "https://latest.datasette.io/fixtures/roadside_attractions.json?_shape=array"
).json()
db["attractions"].insert_all(attractions, pk="pk")

# Schema of that table is now:
# CREATE TABLE [attractions] (
#    [pk] INTEGER PRIMARY KEY,
#    [name] TEXT,
#    [address] TEXT,
#    [latitude] FLOAT,
#    [longitude] FLOAT
# )

db.init_spatialite()
db["attractions"].add_geometry_column("point", "POINT")

db.execute("""
    update attractions set point = GeomFromText(
        'POINT(' || longitude || ' ' || latitude || ')', 4326
    )
""")
```

That last line took some figuring out - especially the need for the SRID of `4326`, without which I got this error:

```
IntegrityError: attractions.point violates Geometry constraint [geom-type or SRID not allowed]
```

It would be good to both document this in more detail, but ideally also to come up with a more obvious pattern for inserting common types of spatial data.

Also related: - #398 - #79

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/399/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1700936245 I_kwDOCGYnMM5lYjo1 542 Remove `skip_false=True` and `--no-skip-false` in `sqlite-utils` 4.0 simonw 9599 open 0   4.0 backwards incomatible changes 9374594 1 2023-05-08T21:04:28Z 2023-05-08T21:07:41Z   OWNER  

Following: - #527

The only reason I didn't remove this mis-feature entirely is that it represents a backwards incompatible change. I'll make that change in 4.0.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/542/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1595340692 I_kwDOCGYnMM5fFveU 530 add ability to configure "on delete" and "on update" attributes of foreign keys: fgregg 536941 open 0     2 2023-02-22T15:44:14Z 2023-05-08T20:39:01Z   CONTRIBUTOR  

sqlite supports these, and it would be quite nice to be able to add them with sqlite-utils.

https://www.sqlite.org/foreignkeys.html#fk_actions
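For context, this is the raw SQL the feature would need to generate, demonstrated with plain sqlite3 - the authors/books schema is invented, and note that SQLite only enforces these actions when the foreign_keys pragma is on:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books (
    id INTEGER PRIMARY KEY,
    title TEXT,
    author_id INTEGER REFERENCES authors(id)
        ON DELETE CASCADE ON UPDATE CASCADE
);
""")
conn.execute("INSERT INTO authors VALUES (1, 'Ursula')")
conn.execute("INSERT INTO books VALUES (1, 'Left Hand', 1)")
conn.execute("DELETE FROM authors WHERE id = 1")
print(conn.execute("SELECT count(*) FROM books").fetchone())  # (0,) - the delete cascaded
```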

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/530/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1700840265 I_kwDOCGYnMM5lYMNJ 541 Get tests to pass with `pytest -Werror` simonw 9599 open 0     1 2023-05-08T19:57:23Z 2023-05-08T19:59:35Z   OWNER  

Inspired by: - #534

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/541/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1393202060 I_kwDOCGYnMM5TCpOM 496 devrel/python api: Pylance type hinting chapmanjacobd 7908073 open 0     4 2022-10-01T03:03:34Z 2023-05-03T05:53:27Z   CONTRIBUTOR  

Pylance is generally pretty good at figuring out stuff but sqlite-utils has some quirks which make type hinting kinda useless. Maybe you don't care but I thought I would bring it to your attention.

For example:

db["subs"].insert_all(subs, pk="index")

Cannot access member "insert_all" for type "View"
Member "insert_all" is unknown

insert_all and all the other methods show up as type issues because the program can't know whether something is a View or a Table. Fair enough. But that basically throws all type checking out the window.

pk="index" also shows up as a type issue:

Argument of type "Literal['index']" cannot be assigned to parameter "pk" of type "Default" in function "insert_all" "Literal['index']" is incompatible with "Default"

I think this is because DEFAULT is an empty class?

maybe a few small changes could be made to make the library more type-friendly

The interim solution is of course to turn off type hints completely for the line db["subs"].insert_all(subs, pk="index") # type: ignore
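A slightly more targeted stopgap than turning the checker off - a sketch using typing.cast with the concrete Table class, assuming the "subs" object really is a table rather than a view:

```python
from typing import cast

import sqlite_utils
from sqlite_utils.db import Table

db = sqlite_utils.Database(memory=True)
# We know "subs" is a table, not a view, so narrow the type explicitly
subs = cast(Table, db["subs"])
subs.insert_all([{"index": 1}], pk="index")
```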

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/496/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
907795562 MDU6SXNzdWU5MDc3OTU1NjI= 265 Using enable_fts before search term prabhur 36287 open 0     1 2021-06-01T01:43:34Z 2023-04-01T17:27:18Z   NONE  

Many thanks for the sqlite-utils suite of utilities. Has made my life much much easier. I used this to create a table and enable FTS. All works fine. The datasette utility detects FTS and shows a text box. Searching for a term using that interface works well.

However, when I start to use features by following https://www.sqlite.org/fts5.html section "3. Full-text Query Syntax" I seem to run into issues that I suspect are due to the escape_fts wrapper function.

As an example, if I search for the term "^குகை" in the text box in datasette it produces 140 results. However, when I tweak the query produced by datasette to not use "escape_fts" it produces 5 results.

Similarly, when I try to restrict the search to a single column in FTS using a spec like {title : ^குகை} it returns no rows. The same query pulls results when used without escape_fts. The text in the table is in the Tamil language and the search term is a Tamil word.

... where posts_fts match escape_fts(:search) vs

... where posts_fts match (:search)

Any ideas why? How can I get the benefits of both escaping as well as utilizing different facets of providing / controlling search terms? Thanks.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/265/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
702386948 MDU6SXNzdWU3MDIzODY5NDg= 159 .delete_where() does not auto-commit (unlike .insert() or .upsert()) spdkils 11712349 open 0     9 2020-09-16T01:55:52Z 2023-04-01T17:21:05Z   NONE  

When you use the delete_where() function on a table, it never commits....

Is that intentional?

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/159/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1560651350 I_kwDOCGYnMM5dBaZW 523 Feature request: trim all leading and trailing white space for all columns for all tables in a database fgregg 536941 open 0     1 2023-01-28T02:40:10Z 2023-01-28T02:41:14Z   CONTRIBUTOR  

It's pretty common that I need to trim leading or trailing white space from lots of columns in a database as part of an initial ETL.

I use the following recipe a lot, and it would be great to include this functionality into sqlite-utils

trimify.sql:

```sql
select 'select group_concat(''update [' || name || '] set ['' || name || ''] = trim(['' || name || ''])'', ''; '') || ''; '' as sql_to_run from pragma_table_info('''||name||''');' from sqlite_schema;
```

then something like:

```bash
sqlite3 example.db < scripts/trimify.sql > table_trim.sql && \
sqlite3 $example.db < table_trim.sql > trim.sql && \
sqlite3 $example.db < trim.sql
```
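The same idea is reachable from the Python side too - a sketch that assumes the documented table.convert() method and loops over every TEXT column of every table in a hypothetical example.db:

```python
import sqlite_utils

db = sqlite_utils.Database("example.db")
for table in db.tables:
    for column in table.columns:
        if column.type.upper() == "TEXT":
            # Rewrite each value with leading/trailing whitespace trimmed
            table.convert(column.name, lambda value: value.strip() if value else value)
```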

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/523/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
743384829 MDExOlB1bGxSZXF1ZXN0NTIxMjg3OTk0 203 changes to allow for compound foreign keys drkane 1049910 open 0     7 2020-11-16T00:30:10Z 2023-01-25T18:47:18Z   FIRST_TIME_CONTRIBUTOR simonw/sqlite-utils/pulls/203

Add support for compound foreign keys, as per issue #117

Not sure if this is the right approach. In particular I'm unsure about:

  • the new ForeignKey class, which replaces the namedtuple in order to ensure that column and other_column are forced into tuples. The class does the job, but doesn't feel very elegant.
  • I haven't rewritten guess_foreign_table to take account of multiple columns, so it just checks for the first column in the foreign key definition. This isn't ideal.
  • I haven't added any ability to the CLI to add compound foreign keys, it's only in the python API at the moment.

The PR also contains a minor related change that columns and tables are always quoted in foreign key definitions.

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/203/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
1550536442 I_kwDOCGYnMM5ca076 521 Custom JSON encoder janrito 31504 open 0     0 2023-01-20T09:19:40Z 2023-01-20T09:19:40Z   NONE  

It would be nice if we could specify a custom encoder (and decoder) for types that will need extra deserialisation – e.g., sets, enums or sparse matrices – or even project-specific types
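For illustration, this is the kind of encoder the request has in mind, shown with the standard library json module (nothing here is wired into sqlite-utils):

```python
import json

class ProjectEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, set):
            return sorted(obj)  # sets aren't JSON-serializable out of the box
        return super().default(obj)

print(json.dumps({"tags": {"b", "a"}}, cls=ProjectEncoder))
# {"tags": ["a", "b"]}
```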

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/521/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1373224657 I_kwDOCGYnMM5R2b7R 488 `sqlite-utils transform` should set empty strings to null when converting text columns to integer/float simonw 9599 open 0     5 2022-09-14T15:51:30Z 2022-12-23T17:38:55Z   OWNER  

/tmp % echo "id,age,weight\n1,3,2.5\n2,," | sqlite-utils insert test.db test - --csv /tmp % sqlite-utils schema test.db CREATE TABLE [test] ( [id] TEXT, [age] TEXT, [weight] TEXT ); /tmp % sqlite-utils transform test.db test --type age integer --type weight float /tmp % sqlite-utils schema test.db CREATE TABLE "test" ( [id] TEXT, [age] INTEGER, [weight] FLOAT ); /tmp % sqlite-utils rows test.db test [{"id": "1", "age": 3, "weight": 2.5}, {"id": "2", "age": "", "weight": ""}] It would be neat if this resulted in the following instead: {"id": "2", "age": null, "weight": null} Related Discord discussion: https://discord.com/channels/823971286308356157/823971286941302908/1019635490833567794

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/488/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1479914599 I_kwDOCGYnMM5YNbRn 516 Feature request: output number of ignored/replaced rows for insert command simonw 9599 open 0     4 2022-12-06T18:59:21Z 2022-12-06T19:08:14Z   OWNER  

https://hachyderm.io/@briandorsey/109468185742876820

I'm fiddling with piping JSON to insert --ignore. I'd love to see the count of records inserted & ignored, but didn't see a way to do that in the help/docs.

Example: xh "https://hachyderm.io/api/v1/timelines/tag/rust?max_id=109443380308326328" | sqlite-utils insert aoc.db aoc - --pk=id --ignore

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/516/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1453134846 I_kwDOCGYnMM5WnRP- 513 Add or document streamlined workflow for importing Datasette csv / json exports henry501 19328961 open 0     0 2022-11-17T10:54:47Z 2022-11-17T10:54:47Z   NONE  

I'm working on some small front-end enhancements to the laion-aesthetic-datasette project, and I wanted to partially populate a database directly using exports from the existing Datasette instance instead of downloading the parquet files and creating my own multi-GB database.

There have been a number of small issues that are certainly related to my relative lack of familiarity with the toolkit, but that are still surprising.

For example: a CSV export of the images table (http://laion-aesthetic.datasette.io/laion-aesthetic-6pls.csv?sql=select+rowid%2C+url%2C+text%2C+domain_id%2C+width%2C+height%2C+similarity%2C+punsafe%2C+pwatermark%2C+aesthetic%2C+hash%2C+index_level_0+from+images+order+by+random%28%29+limit+100) has nested single quotes, double quotes, and commas that aren't handled by rows_from_file. Similarly, the json output has to be manually transformed to add the column names and remove extraneous information before sqlite_utils can import it.

I was able to work through these issues, but as an enhancement it would be really helpful to create or document a clear workflow that avoids the friction of this data transformation.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/513/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1405196044 PR_kwDOCGYnMM5AmYzy 499 feat: recreate fts triggers after table transform chapmanjacobd 7908073 open 0     2 2022-10-11T20:35:39Z 2022-10-26T17:54:51Z   CONTRIBUTOR simonw/sqlite-utils/pulls/499

https://github.com/simonw/sqlite-utils/pull/498


:books: Documentation preview :books:: https://sqlite-utils--499.org.readthedocs.build/en/499/

alternatively, self.disable_fts()

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/499/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
1386562662 I_kwDOCGYnMM5SpURm 493 Tiny typographical error in install/uninstall docs simonw 9599 open 0     3 2022-09-26T19:00:42Z 2022-10-25T21:31:15Z   OWNER  

Added in: - #483

I don't know how to fix this in Sphinx: I'm getting this: https://sqlite-utils.datasette.io/en/latest/cli.html#cli-install

The insert –convert and query –functions options

But I want it to display insert --convert and not insert –convert there.

Here's the code: https://github.com/simonw/sqlite-utils/blob/85247038f70d7eb2f3e272cfeaa4c44459cafba8/docs/cli.rst#L2125

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/493/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1149661489 I_kwDOCGYnMM5EhnEx 409 `with db:` for transactions simonw 9599 open 0     3 2022-02-24T19:22:06Z 2022-10-01T03:42:50Z   OWNER  

This can be a documented wrapper around with db.conn:.
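For reference, the pattern the wrapper would document already works through the underlying connection - a sketch with an invented table:

```python
import sqlite_utils

db = sqlite_utils.Database(memory=True)
db["counts"].create({"id": int, "n": int}, pk="id")

with db.conn:  # commits on success, rolls back if the block raises
    db["counts"].insert({"id": 1, "n": 0})
```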

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/409/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1386530156 I_kwDOCGYnMM5SpMVs 492 Idea: ability to pass extra variables to `--convert` scripts simonw 9599 open 0     1 2022-09-26T18:30:45Z 2022-09-26T18:33:19Z   OWNER  

Got this idea from this example in https://jeqo.github.io/notes/2022-09-24-ingest-logs-sqlite/

```bash
sqlite-utils insert /tmp/kafka-logs.db logs server.log.2022-09-24-21 --text --convert "
import re
r = re.compile(r'^\[(?P<datetime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] (?P<level>\w+) (?P<log>(.+(\n(?!\[).+|)+))', re.MULTILINE)
def convert(text):
    rows = [m.groupdict() for m in r.finditer(text)]
    for row in rows:
        row.update({'server': 'localhost'})
        row.update({'component': 'broker'})
    return rows
"
```

And the accompanying note:

The row.update allows to label rows as I’m planning to ingest logs from different hosts and potentially different components.

This made me think: it might be neat if you could inject additional variable values into that script with extra command-line options, to make this kind of reuse easier. Something like this:

```bash
sqlite-utils insert /tmp/kafka-logs.db logs server.log.2022-09-24-21 --text --convert "
import re
r = re.compile(r'^\[(?P<datetime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] (?P<level>\w+) (?P<log>(.+(\n(?!\[).+|)+))', re.MULTILINE)
def convert(text):
    rows = [m.groupdict() for m in r.finditer(text)]
    for row in rows:
        row.update({'server': server})
        row.update({'component': component})
    return rows
" --var server "localhost" --var component "broker"
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/492/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1082651698 I_kwDOCGYnMM5Ah_Qy 358 Support for CHECK constraints luxint 11597658 open 0     7 2021-12-16T21:19:45Z 2022-09-25T07:15:59Z   NONE  

Hi,

I noticed the table.transform() method doesn't have an option to add, change, or drop a CHECK constraint (see https://sqlite.org/lang_createtable.html -> 3.7 Check Constraints). It would be great to have this as an option!
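For context, here's the SQL feature in question demonstrated with plain sqlite3 - the products table and its constraint are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL CHECK (price >= 0))")
conn.execute("INSERT INTO products VALUES ('widget', 9.99)")  # passes the check
try:
    conn.execute("INSERT INTO products VALUES ('widget', -1)")
except sqlite3.IntegrityError as e:
    print(e)  # CHECK constraint failed
```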

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/358/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1374939463 I_kwDOCGYnMM5R8-lH 489 Ability to load JSON records held in a file with a single top level key that is a list of objects simonw 9599 open 0     9 2022-09-15T18:46:03Z 2022-09-15T20:56:10Z   OWNER  

It's very common for JSON to look like this:

```json
{
    "Version": "5.5.52.6",
    "List": [
        {
            "Description": "Nonpartisan",
            "Id": 1,
            "ExternalId": ""
        },
        {
            "Description": "Undeclared",
            "Id": 2,
            "ExternalId": ""
        }
    ]
}
```

This example was taken from the records downloaded from https://www.elections.alaska.gov/election-results/e/

Right now you can't import this into sqlite-utils - you need to run it through jq .List first.

But since this is so common, it would be neat if sqlite-utils could have a rule of thumb that says "if it's an object, but it has a single key that is a list of objects, use that instead".
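That rule of thumb is simple enough to sketch as a standalone function (not actual sqlite-utils code):

```python
def unwrap_single_list(data):
    """If data is an object with exactly one list-of-objects value, use that list."""
    if isinstance(data, dict):
        lists = [v for v in data.values() if isinstance(v, list)]
        if len(lists) == 1 and all(isinstance(item, dict) for item in lists[0]):
            return lists[0]
    return data

doc = {"Version": "5.5.52.6", "List": [{"Id": 1}, {"Id": 2}]}
print(unwrap_single_list(doc))  # [{'Id': 1}, {'Id': 2}]
```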

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/489/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1128466114 I_kwDOCGYnMM5DQwbC 406 Creating tables with custom datatypes psychemedia 82988 open 0     5 2022-02-09T12:16:31Z 2022-09-15T18:13:50Z   NONE  

Via https://stackoverflow.com/a/18622264/454773 I note the ability to register custom handlers for novel datatypes that can map into and out of things like sqlite BLOBs.

From a quick look and a quick play, I didn't spot a way to do this in sqlite_utils?

For example:

```python
# Via https://stackoverflow.com/a/18622264/454773
import sqlite3
import numpy as np
import io

def adapt_array(arr):
    """ http://stackoverflow.com/a/31312102/190597 (SoulNibbler) """
    out = io.BytesIO()
    np.save(out, arr)
    out.seek(0)
    return sqlite3.Binary(out.read())

def convert_array(text):
    out = io.BytesIO(text)
    out.seek(0)
    return np.load(out)

# Converts np.array to TEXT when inserting
sqlite3.register_adapter(np.ndarray, adapt_array)

# Converts TEXT to np.array when selecting
sqlite3.register_converter("array", convert_array)
```

```python
from sqlite_utils import Database
db = Database('test.db')

# Reset the database connection to use the parsed datatype
# sqlite_utils doesn't seem to support eg:
# Database('test.db', detect_types=sqlite3.PARSE_DECLTYPES)
db.conn = sqlite3.connect(db_name, detect_types=sqlite3.PARSE_DECLTYPES)

# Create a table the old fashioned way
# but using the new custom data type
vector_table_create = """
CREATE TABLE dummy
    (title TEXT, vector array );
"""

cur = db.conn.cursor()
cur.execute(vector_table_create)

# sqlite_utils doesn't appear to support custom types (yet?!)
# The following errors on the "array" datatype
"""
db["dummy"].create({
    "title": str,
    "vector": "array",
})
"""
```

We can then add / retrieve records from the database where the datatype of the vector field is a custom registered array type (which is to say, a numpy array):

```python
import numpy as np

db["dummy"].insert({'title': "test1", 'vector': np.array([1, 2, 3])})

for row in db.query("SELECT * FROM dummy"):
    print(row['title'], row['vector'], type(row['vector']))

"""
test1 [1 2 3] <class 'numpy.ndarray'>
"""
```

It would be handy to be able to do this idiomatically in sqlite_utils.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/406/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1363766973 I_kwDOCGYnMM5RSW69 484 Expose convert recipes to `sqlite-utils --functions` simonw 9599 open 0     11 2022-09-06T20:15:08Z 2022-09-07T19:09:52Z   OWNER  

--functions was added in: - #471

It would be useful if the r.jsonsplit() and similar recipes for sqlite-utils convert could be used in these blocks of code too: https://sqlite-utils.datasette.io/en/stable/cli.html#sqlite-utils-convert-recipes

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/484/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
816526538 MDU6SXNzdWU4MTY1MjY1Mzg= 239 sqlite-utils extract could handle nested objects simonw 9599 open 0     16 2021-02-25T15:10:28Z 2022-09-03T23:46:02Z   OWNER  

Imagine a table (imported from a nested JSON file) where one of the columns contains values that look like this:

{"email": "anonymous@noreply.airtable.com", "id": "usrROSHARE0000000", "name": "Anonymous"}

The sqlite-utils extract command already uses single text values in a column to populate a new table. It would not be much of a stretch for it to be able to use JSON instead, including specifying which of those values should be used as the primary key in the new table.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/239/reactions",
    "total_count": 6,
    "+1": 5,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1359604075 I_kwDOCGYnMM5RCelr 481 Idea: `sqlite-utils create-table tablename --sql "select ..."` simonw 9599 open 0     0 2022-09-02T01:41:24Z 2022-09-02T01:42:08Z   OWNER  

Could offer syntactic sugar for:

```sql
create table foo as select * from bar
```

```
sqlite-utils create-table data.db foo --sql "select * from bar"
```

https://sqlite-utils.datasette.io/en/stable/cli-reference.html#create-table

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/481/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1353074021 I_kwDOCGYnMM5QpkVl 474 Add an option for specifying column names when inserting CSV data hubgit 14294 open 0     3 2022-08-27T15:29:59Z 2022-08-31T03:42:36Z   NONE  

https://sqlite-utils.datasette.io/en/stable/cli.html#csv-files-without-a-header-row

The first row of any CSV or TSV file is expected to contain the names of the columns in that file.

If your file does not include this row, you can use the --no-headers option to specify that the tool should not use that first row as headers.

If you do this, the table will be created with column names called untitled_1 and untitled_2 and so on. You can then rename them using the sqlite-utils transform ... --rename command.

It would be nice to be able to specify the column names when importing CSV/TSV without a header row, via an extra command line option.

(renaming a column of a large table can take a long time, which makes it an inconvenient workaround)

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/474/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1355193529 I_kwDOCGYnMM5Qxpy5 479 OperationalError: cannot VACUUM from within a transaction chapmanjacobd 7908073 open 0     0 2022-08-30T05:34:24Z 2022-08-30T05:34:24Z   CONTRIBUTOR  

Maybe when calling .vacuum() and other DB-level write-lock operations sqlite_utils could guard against this error message by automatically committing first?

``` 46 db["media"].optimize() # type: ignore ---> 47 db.vacuum()

File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:1047, in Database.vacuum(self) 1045 def vacuum(self): 1046 "Run a SQLite VACUUM against the database." -> 1047 self.execute("VACUUM;")

File ~/.local/lib/python3.10/site-packages/sqlite_utils/db.py:470, in Database.execute(self, sql, parameters) 468 return self.conn.execute(sql, parameters) 469 else: --> 470 return self.conn.execute(sql)

OperationalError: cannot VACUUM from within a transaction ```

It might also be nice to add a sentence or two about how transactions are committed on the docs page. When I was swapping out my sqlite3 code for this library it was nice that everything was pretty much drop-in but I was/am unsure what to do about the places I explicitly call .commit() in my code

Related to https://github.com/simonw/sqlite-utils/issues/121
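A sketch of the guard being suggested - commit whatever transaction is open on the underlying connection before running VACUUM (the media.db file name is invented for illustration):

```python
import sqlite_utils

db = sqlite_utils.Database("media.db")
db.conn.commit()  # end any open transaction first; VACUUM needs a clean slate
db.vacuum()
```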

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/479/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1353481513 I_kwDOCGYnMM5QrH0p 478 `sqlite-utils tables data.db table1 table2` simonw 9599 open 0     1 2022-08-28T22:05:53Z 2022-08-28T22:22:35Z   OWNER  

The sqlite-utils tables command currently lists all tables.

If you have a huge table in there then running it with --counts can get expensive, because of the huge table.

Would be useful if it could accept an optional list of tables that it should execute against, as an alternative to the default of all of them.

This should be a backwards compatible change. Current design is: https://sqlite-utils.datasette.io/en/stable/cli-reference.html#tables

``` Usage: sqlite-utils tables [OPTIONS] PATH

List the tables in the database

Example:

  sqlite-utils tables trees.db

```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/478/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1326349129 I_kwDOCGYnMM5PDntJ 461 Consider including animated SVG console demos simonw 9599 open 0     1 2022-08-02T20:10:04Z 2022-08-02T20:12:14Z   OWNER  

I recorded this one using https://github.com/nbedos/termtosvg - with pipx install termtosvg and then termtosvg - execute demo - exit to save.

json [ { "id": 1, "name": "Catimus" }, { "id": 2, "name": "Feliopia" } ]

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/461/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1324659241 I_kwDOCGYnMM5O9LIp 459 Single quoted transform recipes on Windows do not work as expected shakeel 19921 open 0     0 2022-08-01T16:14:54Z 2022-08-01T16:14:54Z   CONTRIBUTOR  

Trying to follow the tutorial for sqlite-utils and datasette https://datasette.io/tutorials/clean-data on Windows 11 OS Microsoft Windows [Version 10.0.22622.440], with sqlite-utils and datasette installed using pipx.

```
$ pipx list
package datasette 0.61.1, installed using Python 3.10.4
  - datasette.exe
package sqlite-utils 3.28, installed using Python 3.10.4
  - sqlite-utils.exe
```

In the step to transform dates into ISO dates the quoted value 'r.parsedatetime(value)' is copied verbatim into the columns instead of applying the output of the Python recipe.

```
sqlite-utils convert manatees.db locations \
  REPDATE created_date last_edited_date \
  'r.parsedatetime(value)' --dry-run

1975/01/31 00:00:00+00
 --- becomes:
r.parsedatetime(value)

Would affect 13568 rows
```

However, if I change the code from single quotes to double quotes, it works as expected.

```
sqlite-utils convert manatees.db locations \
  REPDATE created_date last_edited_date \
  "r.parsedatetime(value)" --dry-run

1975/01/31 00:00:00+00
 --- becomes:
1975-01-31T00:00:00+00:00

Would affect 13568 rows
```

Specifying the transform code recipe should work with single quotes on Windows.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/459/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1310243385 I_kwDOCGYnMM5OGLo5 456 feature request: pivot command fgregg 536941 open 0     5 2022-07-20T00:58:08Z 2022-07-20T17:50:50Z   CONTRIBUTOR  

Pivoting a long-format table to a wide-format table is pretty common and kind of a pain. Would love to see this feature in sqlite-utils!
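
For anyone needing this today, a sketch of the usual SQL workaround (conditional aggregation), using a hypothetical long-format observations(station, metric, value) table:

```python
import sqlite_utils

db = sqlite_utils.Database("data.db")
# one MAX(CASE ...) per metric turns long rows into wide columns
rows = db.query("""
    SELECT station,
           MAX(CASE WHEN metric = 'temp' THEN value END) AS temp,
           MAX(CASE WHEN metric = 'rain' THEN value END) AS rain
    FROM observations
    GROUP BY station
""")
db["observations_wide"].insert_all(rows)
```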

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/456/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1271426387 I_kwDOCGYnMM5LyG1T 444 CSV `extras_key=` and `ignore_extras=` equivalents for CLI tool simonw 9599 open 0     5 2022-06-14T22:22:47Z 2022-07-07T16:39:18Z   OWNER  

I forgot to add equivalents of extras_key= and ignore_extras= to the CLI tool - will do that in a separate issue.

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/440#issuecomment-1155767915

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/444/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
455486286 MDU6SXNzdWU0NTU0ODYyODY= 26 Mechanism for turning nested JSON into foreign keys / many-to-many simonw 9599 open 0     14 2019-06-13T00:52:06Z 2022-06-29T23:35:29Z   OWNER  

The GitHub JSON APIs have a really interesting convention with respect to related objects.

Consider https://api.github.com/repos/simonw/sqlite-utils/issues - here's a truncated subset:

```json
{
    "id": 449818897,
    "node_id": "MDU6SXNzdWU0NDk4MTg4OTc=",
    "number": 24,
    "title": "Additional Column Constraints?",
    "user": {
        "login": "IgnoredAmbience",
        "id": 98555,
        "node_id": "MDQ6VXNlcjk4NTU1",
        "avatar_url": "https://avatars0.githubusercontent.com/u/98555?v=4",
        "gravatar_id": ""
    },
    "labels": [
        {
            "id": 993377884,
            "node_id": "MDU6TGFiZWw5OTMzNzc4ODQ=",
            "url": "https://api.github.com/repos/simonw/sqlite-utils/labels/enhancement",
            "name": "enhancement",
            "color": "a2eeef",
            "default": true
        }
    ],
    "state": "open"
}
```

The user column lists a complete user. The labels column has a list of labels.

Since both user and label have a populated id field, this is actually enough information for us to create records for them AND set up the corresponding foreign key (for user) and m2m relationships (for labels).

It would be really neat if sqlite-utils had some kind of mechanism for correctly processing these kind of patterns.

Thanks to jq there's not much need for extra customization of the shape here - if we support a narrowly defined structure users can use jq to reshape arbitrary JSON to match.
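
A sketch of what this currently takes by hand, using the existing insert(foreign_keys=...) and .m2m() APIs against the truncated record above (popping the nested objects out first):

```python
import sqlite_utils

db = sqlite_utils.Database("github.db")
issue = {
    "id": 449818897,
    "title": "Additional Column Constraints?",
    "user": {"id": 98555, "login": "IgnoredAmbience"},
    "labels": [{"id": 993377884, "name": "enhancement"}],
    "state": "open",
}
user = issue.pop("user")
labels = issue.pop("labels")
db["users"].insert(user, pk="id", replace=True)
issue["user"] = user["id"]  # replace the nested object with its foreign key
db["issues"].insert(
    issue, pk="id", foreign_keys=[("user", "users", "id")], replace=True
).m2m("labels", labels, pk="id")
```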

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/26/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1227571375 I_kwDOCGYnMM5JK0Cv 431 Allow making m2m relation of a table to itself rafguns 738408 open 0     3 2022-05-06T08:30:43Z 2022-06-23T14:12:51Z   NONE  

I am building a database, in which one of the tables has a many-to-many relationship to itself. As far as I can see, this is not (yet) possible using .m2m() in sqlite-utils. This may be a bit of a niche use case, so feel free to close this issue if you feel it would introduce too much complexity compared to the benefits.

Example: suppose I have a table of people, and I want to store the information that John and Mary have two children, Michael and Suzy. It would be neat if I could do something like this:

```python
from sqlite_utils import Database

db = Database(memory=True)
db["people"].insert({"name": "John"}, pk="name").m2m(
    "people", [{"name": "Michael"}, {"name": "Suzy"}], m2m_table="parent_child", pk="name"
)
db["people"].insert({"name": "Mary"}, pk="name").m2m(
    "people", [{"name": "Michael"}, {"name": "Suzy"}], m2m_table="parent_child", pk="name"
)
```

But if I do that, the many-to-many table parent_child has only one column:

```sql
CREATE TABLE [parent_child] (
   [people_id] TEXT REFERENCES [people]([name]),
   PRIMARY KEY ([people_id], [people_id])
)
```

This could be solved by adding one or two keyword arguments to .m2m(), e.g. .m2m(..., left_name=None, right_name=None) or .m2m(..., names=(None, None)).

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/431/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1250495688 I_kwDOCGYnMM5KiQzI 439 Misleading progress bar against utf-16-le CSV input frafra 4068 open 0     12 2022-05-27T08:34:49Z 2022-06-15T03:53:43Z   NONE  

The program crashes without any error.

```
wget "https://artsdatabanken.no/Fab2018/api/export/csv"
sqlite-utils create-database test.db
sqlite-utils insert --csv --delimiter ";" --encoding "utf-16-le" test test.db csv
[------------------------------------]    0%
[#################-------------------]   49%  00:00:01
```

I would like to highlight various issues:

1. sqlite-utils catches exceptions without printing the stacktrace and/or reraising the exception, so there is no easy way to use pdb or similar to debug the program; solution: add a debug option
2. Silent crash: this is related to (1.), and it happens when there is a catch-all mechanism; solution: let the program fail.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/439/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1224112817 I_kwDOCGYnMM5I9nqx 430 Document how to use `PRAGMA temp_store` to avoid errors when running VACUUM against huge databases rayvoelker 9308268 open 0     2 2022-05-03T13:33:58Z 2022-06-14T23:26:37Z   NONE  

I'm trying to figure out a way to get the table.extract() method to complete successfully on Ubuntu Server 22.04 -- I'm wondering if the cause (and a possible solution) is to adjust some of the PRAGMA values within SQLite itself ... on another Linux system (Pop!_OS), using this method on this same database appears to work just fine.

Here's the bit that's causing the error, and the resulting error output:

```python
# combine these columns into 1 table "bib_properties":
#   best_title
#   bib_level_code
#   mat_type
#   material_code
#   best_author
db["circ_trans"].extract(
    ["best_title", "bib_level_code", "mat_type", "material_code", "best_author"],
    table="bib_properties",
    fk_column="bib_properties_id"
)

db["circ_trans"].extract(
    ["call_number"],
    table="call_number",
    fk_column="call_number_id",
    rename={"call_number": "value"}
)
```

```python
OperationalError                          Traceback (most recent call last)
Input In [17], in <cell line: 7>()
      1 # combine these columns into 1 table "bib_properties" :
      2 # best_title
      3 # bib_level_code
      4 # mat_type
      5 # material_code
      6 # best_author
----> 7 db["circ_trans"].extract(
      8     ["best_title", "bib_level_code", "mat_type", "material_code", "best_author"],
      9     table="bib_properties",
     10     fk_column="bib_properties_id"
     11 )
     13 db["circ_trans"].extract(
     14     ["call_number"],
     15     table="call_number",
     16     fk_column="call_number_id",
     17     rename={"call_number": "value"}
     18 )

File ~/jupyter/venv/lib/python3.10/site-packages/sqlite_utils/db.py:1764, in Table.extract(self, columns, table, fk_column, rename)
   1761     column_order.append(c.name)
   1763 # Drop the unnecessary columns and rename lookup column
-> 1764 self.transform(
   1765     drop=set(columns),
   1766     rename={magic_lookup_column: fk_column},
   1767     column_order=column_order,
   1768 )
   1770 # And add the foreign key constraint
   1771 self.add_foreign_key(fk_column, table, "id")

File ~/jupyter/venv/lib/python3.10/site-packages/sqlite_utils/db.py:1526, in Table.transform(self, types, rename, drop, pk, not_null, defaults, drop_foreign_keys, column_order)
   1524 with self.db.conn:
   1525     for sql in sqls:
-> 1526         self.db.execute(sql)
   1527 # Run the foreign_key_check before we commit
   1528 if pragma_foreign_keys_was_on:

File ~/jupyter/venv/lib/python3.10/site-packages/sqlite_utils/db.py:465, in Database.execute(self, sql, parameters)
    463     return self.conn.execute(sql, parameters)
    464 else:
--> 465     return self.conn.execute(sql)

OperationalError: database or disk is full
```

This database is about 17G in total size, so I'm assuming the error is coming from the vacuum ... where I'm assuming it's trying to do the temp storage in a location that doesn't have sufficient room. Disk space is more than ample on the host in question (1.8T is free in the directory where the sqlite db resides), but the /tmp directory is on a smaller disk associated with the OS.

I'm trying to think if there's a way to set the PRAGMA temp_store (or maybe it's temp_store_directory that I'm after) to use the same local directory as where the database file is located (maybe this is a property of the version of SQLite on the system?)

```python
# SET the temp file store to be a file ...
print(db.execute('PRAGMA temp_store').fetchall())
print(db.execute('PRAGMA temp_store=FILE').fetchall())
print(db.execute('PRAGMA temp_store').fetchall())

# the user's home directory ...
print(db.execute("PRAGMA temp_store_directory='/home/plchuser/'").fetchall())
print(db.execute("PRAGMA sqlite3_temp_directory='/home/plchuser/'").fetchall())
print(db.execute("PRAGMA temp_store_directory").fetchall())
print(db.execute("PRAGMA sqlite3_temp_directory").fetchall())
```

Output:

```text
[(1,)]
[]
[(1,)]
[]
[]
[('/home/plchuser/',)]
[]
```

Here's the docs on the Temporary File Storage Locations https://www.sqlite.org/tempfiles.html
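
A workaround sketch, assuming a /path/to/big/disk with enough room: on Unix builds SQLite also honours the SQLITE_TMPDIR environment variable, which can be set from Python before the extract() call.

```python
import os
import sqlite_utils

os.environ["SQLITE_TMPDIR"] = "/path/to/big/disk"  # hypothetical roomy directory

db = sqlite_utils.Database("data.db")
db.execute("PRAGMA temp_store = FILE")  # keep temp data out of memory
# ... then run db["circ_trans"].extract(...) as above
```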

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/430/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1236693079 I_kwDOCGYnMM5JtnBX 432 Support `rows_where()`, `delete_where()` etc for attached alias databases luxint 11597658 open 0     5 2022-05-16T06:38:58Z 2022-06-14T22:16:48Z   NONE  

Hi,

I noticed rows_where() doesn't return any rows from tables which are from attached databases. The exists() function returns false. As far as I can see this is because the table_names() function only looks for table names in the current database and not in attached (or temp) databases.

Besides rows_where(), insert_all() and delete_where() also didn't do what I was expecting because of this. For the moment I've patched table_names() for myself (see below), but I'm not sure what the total impact is on the other functions like lookup, truncate etc. which all use exists(). Also, view_names() doesn't look for views in attached or temp databases.

```python
def table_names(self, fts4: bool = False, fts5: bool = False) -> List[str]:
    "A list of string table names in this database."
    where = ["type = 'table'"]
    if fts4:
        where.append("sql like '%USING FTS4%'")
    if fts5:
        where.append("sql like '%USING FTS5%'")
    dbs = [x[1] for x in self.execute('pragma database_list').fetchall()]
    lst = []
    for db in dbs:
        sql = "select name from {} where {}".format(db + ".sqlite_master", " AND ".join(where))
        lst.extend(r[0] for r in self.execute(sql).fetchall())
    return lst
```
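
For context, a quick sketch assuming Database.attach() (present in recent sqlite-utils releases; other.db and mytable are placeholders) - raw SQL sees the alias, but the introspection methods currently do not:

```python
import sqlite_utils

db = sqlite_utils.Database("main.db")
db.attach("other", "other.db")
# cross-database queries work via the alias...
rows = list(db.query("select * from other.mytable limit 5"))
# ...but table_names()/exists() only consult the main database
```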

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/432/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1160182768 I_kwDOCGYnMM5FJvvw 412 Optional Pandas integration simonw 9599 open 0     13 2022-03-05T01:49:27Z 2022-06-14T15:36:29Z   OWNER  

It would be neat if there was a way to use this more seamlessly with Pandas, in particular Pandas dataframes - but without making Pandas a required dependency.
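
A sketch of how little glue the round trip needs today (mytable is a placeholder), which is roughly the surface an optional integration could wrap:

```python
import pandas as pd
import sqlite_utils

db = sqlite_utils.Database("data.db")
# reading: .rows yields dicts, which DataFrame() accepts directly
df = pd.DataFrame(list(db["mytable"].rows))
# writing: to_dict("records") produces exactly what insert_all() expects
db["mytable_copy"].insert_all(df.to_dict(orient="records"))
```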

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/412/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1215216249 I_kwDOCGYnMM5Ibrp5 428 Research adding support for savepoints simonw 9599 open 0     1 2022-04-26T01:04:01Z 2022-04-26T01:05:29Z   OWNER  

https://www.sqlite.org/lang_savepoint.html

Savepoints are like regular transactions except they have names and can be nested.

Would there be any value in adding support for them to sqlite-utils, potentially as some kind of context manager? Something like this:

```python
with db.savepoint("name"):
    # do stuff
    with db.savepoint("name2"):
        # do more stuff
        raise Release  # Rolls back to before "name2" savepoint
```

I've never used this feature so I'm not comfortable adding anything like this without a bunch of extra research.
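
The raw SQL version already works through execute(), which is roughly what such a context manager would wrap - a sketch:

```python
import sqlite_utils

db = sqlite_utils.Database(memory=True)
db.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
db.execute("INSERT INTO t VALUES (1)")
db.execute("SAVEPOINT name2")
db.execute("INSERT INT" "O t VALUES (2)".replace("INT" "O", "INTO")) if False else db.execute("INSERT INTO t VALUES (2)")
db.execute("ROLLBACK TO name2")  # undoes only the second insert
db.execute("RELEASE name2")
print(db.execute("SELECT id FROM t").fetchall())  # [(1,)]
```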

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/428/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1181236173 I_kwDOCGYnMM5GaDvN 422 Reconsider not running convert functions against null values simonw 9599 open 0     1 2022-03-25T20:22:40Z 2022-03-25T20:23:21Z   OWNER  

I just got caught out by the fact that None values are not processed by the .convert() mechanism https://github.com/simonw/sqlite-utils/blob/0b7b80bd40fe86e4d66a04c9f607d94991c45c0b/sqlite_utils/db.py#L2504-L2510

I had run this code while working on #420 and I wasn't sure why it didn't work:

```
$ sqlite-utils add-column content.db articles score float
$ sqlite-utils convert content.db articles score '
import random
random.seed(10)

def convert(value):
    global random
    return random.random()
'
```

The reason it didn't work is that the newly added score column was full of null values.

I fixed it by doing this instead:

```
$ sqlite-utils add-column content.db articles score float --not-null-default 1.0
```

But this indicates to me that the design of convert() here may be incorrect.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/422/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
688351054 MDU6SXNzdWU2ODgzNTEwNTQ= 140 Idea: insert-files mechanism for adding extra columns with fixed values simonw 9599 open 0     1 2020-08-28T20:57:36Z 2022-03-20T19:45:45Z   OWNER  

Say for example you want to populate a file_type column with the value gif. That could work like this:

```
sqlite-utils insert-files gifs.db images *.gif \
    -c path -c md5 -c last_modified:mtime \
    -c file_type:text:gif --pk=path
```

So a column is defined as a text column with a fixed value that follows a second colon.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/140/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
675753042 MDU6SXNzdWU2NzU3NTMwNDI= 131 sqlite-utils insert: options for column types simonw 9599 open 0     5 2020-08-09T18:59:11Z 2022-03-15T13:21:42Z   OWNER  

The insert command currently results in string types for every column - at least when used against CSV or TSV inputs.

It would be useful if you could do the following:

  • automatically detect the column types based on e.g. the first 1000 records
  • explicitly state the rule for specific columns

--detect-types could work for the former - or it could do that by default and allow opt-out using --no-detect-types

For specific columns maybe this:

sqlite-utils insert db.db images images.tsv \
  --tsv \
  -c id int \
  -c score float
sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/131/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1160034488 I_kwDOCGYnMM5FJLi4 411 Support for generated columns eyeseast 25778 open 0     8 2022-03-04T20:41:33Z 2022-03-11T22:32:43Z   CONTRIBUTOR  

This is a fairly new feature -- SQLite version 3.31.0 (2020-01-22) -- that I, admittedly, haven't gotten to work yet. But it looks incredibly useful: https://dgl.cx/2020/06/sqlite-json-support

I'm not sure if this is an option on add-column or a separate command like add-generated-column. Either way, it needs an argument to populate it. It could be something like this:

sh sqlite-utils add-column data.db table-name generated --as 'json_extract(data, "$.field")' --virtual

More here: https://www.sqlite.org/gencol.html
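
Pending first-class support, the raw SQL works through execute() - a sketch with a hypothetical documents table (note SQLite only allows VIRTUAL, not STORED, generated columns via ALTER TABLE, and needs 3.31+):

```python
import sqlite_utils

db = sqlite_utils.Database(memory=True)
db["documents"].insert({"data": '{"field": "value"}'})
db.execute(
    "ALTER TABLE documents ADD COLUMN field TEXT "
    "GENERATED ALWAYS AS (json_extract(data, '$.field')) VIRTUAL"
)
print(list(db["documents"].rows))  # the generated column is computed on read
```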

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/411/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1125297737 I_kwDOCGYnMM5DEq5J 402 Advanced class-based `conversions=` mechanism simonw 9599 open 0     14 2022-02-06T19:47:41Z 2022-02-16T10:18:55Z   OWNER  

The conversions= parameter works like this at the moment: https://sqlite-utils.datasette.io/en/3.23/python-api.html#converting-column-values-using-sql-functions

```python
db["places"].insert(
    {"name": "Wales", "geometry": wkt},
    conversions={"geometry": "GeomFromText(?, 4326)"},
)
```

This proposal is to support values in that dictionary that are objects, not strings, which can represent more complex conversions - spun out from #399.

New proposed mechanism:

```python
from sqlite_utils.utils import LongitudeLatitude

db["places"].insert(
    {
        "name": "London",
        "point": (-0.118092, 51.509865)
    },
    conversions={"point": LongitudeLatitude},
)
```

Here LongitudeLatitude is a magical value which does TWO things: it sets up the GeomFromText(?, 4326) SQL function, and it handles converting the (51.509865, -0.118092) tuple into a POINT({} {}) string.

This would involve a change to the conversions= contract - where it usually expects a SQL string fragment, but it can also take an object which combines that SQL string fragment with a Python conversion function.

Best of all... this resolves the lat, lon vs. lon, lat dilemma because you can use from sqlite_utils.utils import LongitudeLatitude OR from sqlite_utils.utils import LatitudeLongitude depending on which you prefer!

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/399#issuecomment-1030739566
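
A sketch of what such an object might look like - purely hypothetical, since LongitudeLatitude does not exist in sqlite_utils.utils yet - pairing the SQL fragment with a Python conversion function:

```python
class LongitudeLatitude:
    # hypothetical shape for the proposed conversions= object
    sql = "GeomFromText(?, 4326)"  # the fragment substituted into the INSERT

    @staticmethod
    def convert(value):
        # turns a (longitude, latitude) tuple into the WKT the SQL expects
        longitude, latitude = value
        return "POINT({} {})".format(longitude, latitude)
```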

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/402/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1072792507 I_kwDOCGYnMM4_8YO7 352 `sqlite-utils insert --extract colname` simonw 9599 open 0     4 2021-12-07T00:55:44Z 2022-02-03T22:59:36Z   OWNER  

Is there a reason I've not added --extract as an option for sqlite-utils insert? There's an extracts= option for the various table.insert() etc. methods - see the last line in this code block:

https://github.com/simonw/sqlite-utils/blob/213a0ff177f23a35f3b235386366ff132eb879f1/sqlite_utils/db.py#L2483-L2495

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/352/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1122446693 I_kwDOCGYnMM5C5y1l 394 Test against Python 3.11-dev simonw 9599 open 0     1 2022-02-02T22:21:03Z 2022-02-03T21:06:35Z   OWNER  

Same as: - https://github.com/simonw/datasette/issues/1621

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/394/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1090798237 I_kwDOCGYnMM5BBEKd 359 Use RETURNING if available to populate last_pk simonw 9599 open 0     0 2021-12-29T23:43:23Z 2021-12-29T23:43:23Z   OWNER  

Inspired by this: https://news.ycombinator.com/item?id=29729283

> Because SQLite is effectively serializing all the writes for us, we have zero locking in our code. We used to have to lock when inserting new items (to get the LastInsertRowId), but the newer version of SQLite supports the RETURNING keyword, so we don't even have to lock on inserts now.
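
A sketch of the mechanism with plain sqlite3 (RETURNING needs SQLite 3.35+, 2021-03-12):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
# the INSERT hands back the new primary key in one round trip,
# with no separate call to last_insert_rowid()
row = conn.execute(
    "INSERT INTO items (name) VALUES (?) RETURNING id", ("example",)
).fetchone()
print(row[0])  # 1
```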

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/359/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
706001517 MDU6SXNzdWU3MDYwMDE1MTc= 163 Idea: conversions= could take Python functions simonw 9599 open 0     4 2020-09-22T00:37:12Z 2021-12-20T00:56:52Z   OWNER  

Right now you use conversions= like this:

```python
db["example"].insert({
    "name": "The Bigfoot Discovery Museum"
}, conversions={"name": "upper(?)"})
```

How about if you could optionally provide a Python function (or a lambda) like this?

```python
db["example"].insert({
    "name": "The Bigfoot Discovery Museum"
}, conversions={"name": lambda s: s.upper()})
```

This would work by creating a random name for that function, registering it (similar to #162), executing the SQL and then un-registering the custom function at the end.
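
Something close to this is already possible by hand via register_function(), which is presumably what the automatic version would do behind the scenes - a sketch:

```python
import sqlite_utils

db = sqlite_utils.Database(memory=True)

@db.register_function
def upper_custom(s):
    return s.upper()

db["example"].insert(
    {"name": "The Bigfoot Discovery Museum"},
    conversions={"name": "upper_custom(?)"},
)
print(list(db["example"].rows))  # name comes back upper-cased
```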

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/163/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1066603133 PR_kwDOCGYnMM4vKAzW 347 Test against pysqlite3 running SQLite 3.37 simonw 9599 open 0     9 2021-11-29T23:17:57Z 2021-12-11T01:02:19Z   OWNER simonw/sqlite-utils/pulls/347

Refs #346 and #344.

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/347/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
0  
1071531082 I_kwDOCGYnMM4_3kRK 349 A way of creating indexes on newly created tables simonw 9599 open 0     3 2021-12-05T18:56:12Z 2021-12-07T01:04:37Z   OWNER  

I'm writing code for https://github.com/simonw/git-history/issues/33 that creates a table inside a loop:

```python
item_pk = db[item_table].lookup(
    {"_item_id": item_id},
    item_to_insert,
    column_order=("_id", "_item_id"),
    pk="_id",
)
```

I need to look things up by _item_id on this table, which means I need an index on that column (the table can get very big).

But there's no mechanism in SQLite utils to detect if the table was created for the first time and add an index to it. And I don't want to run CREATE INDEX IF NOT EXISTS every time through the loop.

This should work like the foreign_keys= mechanism.
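
One current workaround sketch - check table_names() once before the loop (the table and column names follow the snippet above; history.db is a placeholder):

```python
import sqlite_utils

db = sqlite_utils.Database("history.db")
if "items" not in db.table_names():
    db["items"].create({"_id": int, "_item_id": str}, pk="_id")
    db["items"].create_index(["_item_id"])
# ... then run the lookup() loop without touching the index again
```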

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/349/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1072435124 I_kwDOCGYnMM4_7A-0 350 Optional caching mechanism for table.lookup() simonw 9599 open 0     3 2021-12-06T17:54:25Z 2021-12-06T17:56:57Z   OWNER  

Inspired by work on git-history where I used this pattern:

```python
column_name_to_id = {}

def column_id(column):
    if column not in column_name_to_id:
        id = db["columns"].lookup(
            {"namespace": namespace_id, "name": column},
            foreign_keys=(("namespace", "namespaces", "id"),),
        )
        column_name_to_id[column] = id
    return column_name_to_id[column]
```

If you're going to be doing a large number of table.lookup(...) calls and you know that no other script will be modifying the database at the same time, you can presumably get a big speedup using a Python in-memory cache - maybe even an LRU one to avoid memory bloat.
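
For the bounded variant, functools gives this away almost for free - a sketch wrapping the same lookup (db and namespace_id as in the snippet above):

```python
from functools import lru_cache

@lru_cache(maxsize=10_000)  # cap entries to avoid memory bloat
def column_id(column):
    return db["columns"].lookup(
        {"namespace": namespace_id, "name": column},
        foreign_keys=(("namespace", "namespaces", "id"),),
    )
```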

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/350/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
1066563554 I_kwDOCGYnMM4_knfi 346 Way to test SQLite 3.37 (and potentially other versions) in CI simonw 9599 open 0     5 2021-11-29T22:21:06Z 2021-11-29T23:12:49Z   OWNER  

Need to figure out a good pattern for testing this in CI too - it will currently skip the new tests if it doesn't have SQLite 3.37 or higher.

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/issues/344#issuecomment-982076924

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/346/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
836829560 MDU6SXNzdWU4MzY4Mjk1NjA= 248 support for Apache Arrow / parquet files I/O mhalle 649467 open 0     1 2021-03-20T14:59:30Z 2021-10-28T23:46:48Z   NONE  

I just started looking at Apache Arrow using pyarrow for import and export of tabular datasets, and it looks quite compelling. It might be worth looking at for sqlite-utils and/or datasette.

As a test, I took a random jsonl data dump of a dataset I have with floats, strings, and ints and converted it to Arrow's parquet format using the naive pyarrow.parquet.write_table() function, which has automatic type inference. It compressed down to 7% of the original size. Converting a 26MB JSON file and serializing it to parquet was practically instantaneous. Parquet files are portable and can be directly imported into pandas and other analytics software.

The only hangup is the automatic type inference of the naive reader. It's great for general laziness and for parsing JSON columns (it correctly interpreted a table of mine with a JSON array). However, I did get an exception for a string column where most entries looked integer-like but had a couple values that weren't -- the reader tried to coerce all of them for some reason, even though the JSON type is string. Since the writer optionally takes a schema, it shouldn't be too hard to grab the sqlite header types. With some additional hinting, you might get datetime columns and JSON, which are native Arrow types.

Somewhat tangentially, someone even wrote an sqlite vfs extension for Parquet: https://cldellow.com/2018/06/22/sqlite-parquet-vtable.html
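
A sketch of the export direction with pyarrow (mytable is a placeholder; pass an explicit pa.schema(...) to pin down any columns the inference gets wrong):

```python
import pyarrow as pa
import pyarrow.parquet as pq
import sqlite_utils

db = sqlite_utils.Database("data.db")
# .rows yields dicts; from_pylist() infers an Arrow schema from them
table = pa.Table.from_pylist(list(db["mytable"].rows))
pq.write_table(table, "mytable.parquet")
```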

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/248/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
817989436 MDU6SXNzdWU4MTc5ODk0MzY= 242 Async support eyeseast 25778 open 0     13 2021-02-27T18:29:38Z 2021-10-28T14:37:56Z   CONTRIBUTOR  

Following our conversation last week, want to note this here before I forget.

I've had a couple situations where I'd like to do a bunch of updates in an async event loop, but I run into SQLite's issues with concurrent writes. This feels like something sqlite-utils could help with.

PeeWee ORM has a SQLite write queue that might be a good model. It's using threads or gevent, but I think that approach would translate well enough to asyncio.

Happy to help with this, too.
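
A rough asyncio sketch of that single-writer-queue idea - a real version would push the blocking insert onto a thread (e.g. asyncio.to_thread), but the shape is:

```python
import asyncio
import sqlite_utils

async def writer(db, queue):
    # the only task that touches the database, so writes never collide
    while True:
        table, record = await queue.get()
        db[table].insert(record)
        queue.task_done()

async def main():
    db = sqlite_utils.Database("data.db")
    queue = asyncio.Queue()
    task = asyncio.create_task(writer(db, queue))
    for i in range(10):
        await queue.put(("items", {"id": i}))
    await queue.join()
    task.cancel()

asyncio.run(main())
```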

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/242/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
974067156 MDU6SXNzdWU5NzQwNjcxNTY= 318 Research: handle gzipped CSV directly simonw 9599 open 0     2 2021-08-18T21:23:04Z 2021-08-18T21:25:30Z   OWNER  

Would it be worthwhile for the sqlite-utils command-line tool to grow features to efficiently directly interact with gzipped CSV data?

Maybe add --gz options to both insert and to the various commands that output query results.
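
In the meantime the decompression is a one-liner in Python - a sketch (file and table names are placeholders):

```python
import csv
import gzip
import sqlite_utils

db = sqlite_utils.Database("data.db")
# stream the gzipped CSV straight into insert_all() without unpacking to disk
with gzip.open("data.csv.gz", "rt", encoding="utf-8") as fp:
    db["rows"].insert_all(csv.DictReader(fp))
```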

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/318/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
722816436 MDU6SXNzdWU3MjI4MTY0MzY= 186 .extract() shouldn't extract null values simonw 9599 open 0     7 2020-10-16T02:41:08Z 2021-08-12T12:32:14Z   OWNER  

This almost works, but it creates a rogue type record with a value of None.

```python
In [1]: import sqlite_utils

In [2]: db = sqlite_utils.Database(memory=True)

In [5]: db["creatures"].insert_all([
   ...:     {"id": 1, "name": "Simon", "type": None},
   ...:     {"id": 2, "name": "Natalie", "type": None},
   ...:     {"id": 3, "name": "Cleo", "type": "dog"}], pk="id")
Out[5]: <Table creatures (id, name, type)>

In [7]: db["creatures"].extract("type")
Out[7]: <Table creatures (id, name, type_id)>

In [8]: list(db["creatures"].rows)
Out[8]:
[{'id': 1, 'name': 'Simon', 'type_id': None},
 {'id': 2, 'name': 'Natalie', 'type_id': None},
 {'id': 3, 'name': 'Cleo', 'type_id': 2}]

In [9]: db["type"]
Out[9]: <Table type (id, type)>

In [10]: list(db["type"].rows)
Out[10]: [{'id': 1, 'type': None}, {'id': 2, 'type': 'dog'}]
```

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/186/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
961008507 MDU6SXNzdWU5NjEwMDg1MDc= 308 Add an interactive tutorial as a Jupyter notebook simonw 9599 open 0     2 2021-08-04T20:34:22Z 2021-08-04T21:30:59Z   OWNER  

Can show people how to open this up in Binder.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/308/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
915421499 MDU6SXNzdWU5MTU0MjE0OTk= 267 row.update() or row.pk Gravitar64 12721157 open 0     4 2021-06-08T19:56:00Z 2021-06-22T17:27:27Z   NONE  

Hi,

fantastic framework for working with Sqlite3 databases!!!

I tried to update specific rows in a table and used

```python
for row in db[tablename]:
    newValue = row["counter"] * row["prize"]
    row.update({"Fieldname": newValue})
    print(row)
```

This updates the value in the printed row, but not in the database. So I switched to

```python
db[tablename].update(id, {"Fieldname": newValue})
```

This works fine. But row.update would be nicer, because there's no need for the id (it's that row), and no need for the tablename and the db (they're all defined in the for row ... loop).

Thx

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/267/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
818684978 MDU6SXNzdWU4MTg2ODQ5Nzg= 243 How can i use this utils to deal with fts on column meta of tables ? svjack 27874014 open 0     0 2021-03-01T09:45:05Z 2021-03-01T09:45:05Z   NONE  

Thank you for releasing this bravo project. When I use this project on a multi-table db, I want to implement convenient search on column names from different tables. I want to develop a meta table to save the metadata of the different columns of the different tables, and search on this meta table to get rows from the data table (which the meta table describes). Does this project provide some simple function for this?

You can think of it as me having a knowledge graph about the tables in the db, and I save this knowledge graph into the db with FTS enabled.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/243/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
816601354 MDExOlB1bGxSZXF1ZXN0NTgwMjM1NDI3 241 Extract expand - work in progress simonw 9599 open 0     0 2021-02-25T16:36:38Z 2021-02-25T16:36:38Z   OWNER simonw/sqlite-utils/pulls/241

Refs #239. Still needs documentation and CLI implementation.

sqlite-utils 140912432 pull    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/241/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1  
688670158 MDU6SXNzdWU2ODg2NzAxNTg= 147 SQLITE_MAX_VARS maybe hard-coded too low simonwiles 96218 open 0     7 2020-08-30T07:26:45Z 2021-02-15T21:27:55Z   CONTRIBUTOR  

I came across this while about to open an issue and PR against the documentation for batch_size, which is a bit incomplete.

As mentioned in #145, while:

> SQLITE_MAX_VARIABLE_NUMBER ... defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0.

it is common that it is increased at compile time. Debian-based systems, for example, seem to ship with a version of sqlite compiled with SQLITE_MAX_VARIABLE_NUMBER set to 250,000, and I believe this is the case for homebrew installations too.

In working to understand what batch_size was actually doing and why, I realized that by setting SQLITE_MAX_VARS in db.py to match the value my sqlite was compiled with (I'm on Debian), I was able to decrease the time to insert_all() my test data set (~128k records across 7 tables) from ~26.5s to ~3.5s. Given that this is about 0.05% of my total dataset, this is time I am keen to save...

Unfortunately, it seems that sqlite3 in the python standard library doesn't expose the get_limit() C API (even though pysqlite used to), so it's hard to know what value sqlite has been compiled with (note that this could mean, I suppose, that it's less than 999, and even hardcoding SQLITE_MAX_VARS to the conservative default might not be adequate. It can also be lowered -- but not raised -- at runtime). The best I could come up with is echo "" | sqlite3 -cmd ".limits variable_number" (only available in sqlite >= 2015-05-07 (3.8.10)).

Obviously this couldn't be relied upon in sqlite_utils, but I wonder what your opinion would be about exposing SQLITE_MAX_VARS as a user-configurable parameter (with suitable "here be dragons" warnings)? I'm going to go ahead and monkey-patch it for my purposes in any event, but it seems like it might be worth considering.
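
The monkey-patch itself is one line, shown here as a sketch (250,000 being the Debian compile-time value probed with the .limits trick above):

```python
from sqlite_utils import db as db_module

# sqlite_utils.db.SQLITE_MAX_VARS feeds the batch-size calculation
db_module.SQLITE_MAX_VARS = 250_000
```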

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/147/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
472115381 MDU6SXNzdWU0NzIxMTUzODE= 49 extracts= should support multiple-column extracts simonw 9599 open 0     10 2019-07-24T07:06:41Z 2020-10-16T19:18:19Z   OWNER  

Lookup tables can be constructed on compound columns, but the extracts= option doesn't currently support that.

Right now extracts can be defined in two ways:

```python
# Extract these columns into tables with the same name:
dogs = db.table("dogs", extracts=["breed", "most_recent_trophy"])

# Same as above but with custom table names:
dogs = db.table("dogs", extracts={"breed": "Breeds", "most_recent_trophy": "Trophies"})
```

Need some kind of syntax for much more complicated extractions, like when two columns (say "source" and "source_version") are extracted into a single table.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/49/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
573578548 MDU6SXNzdWU1NzM1Nzg1NDg= 89 Ability to customize columns used by extracts= feature simonw 9599 open 0     3 2020-03-01T16:54:48Z 2020-10-16T19:17:50Z   OWNER  

@simonw any thoughts on allowing extracts to specify the lookup column name? If I'm understanding the documentation right, .lookup() allows you to define the "value" column (the documentation uses name), but when you use the extracts keyword as part of .insert(), .upsert() etc. the lookup must be done against a column named "value". I have an existing lookup table that I've populated with columns "id" and "name" as opposed to "id" and "value", and it seems I can't use extracts=, unless I'm missing something...

Initial thought on how to do this would be to allow the dictionary value to be a tuple of table name and column name... so: table = db.table("trees", extracts={"species_id": ("Species", "name")})

I haven't dug too much into the existing code yet, but does this make sense? Worth doing?

Originally posted by @chrishas35 in https://github.com/simonw/sqlite-utils/issues/46#issuecomment-592999503

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/89/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
581795570 MDU6SXNzdWU1ODE3OTU1NzA= 93 Support more string values for types in .add_column() simonw 9599 open 0     0 2020-03-15T19:32:49Z 2020-09-24T20:36:46Z   OWNER  

https://sqlite-utils.readthedocs.io/en/2.4.2/python-api.html#adding-columns says:

> SQLite types you can specify are "TEXT", "INTEGER", "FLOAT" or "BLOB".

As discovered in #92 this isn't the right list of values. I should expand this to match https://www.sqlite.org/datatype3.html

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/93/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
652961907 MDU6SXNzdWU2NTI5NjE5MDc= 121 Improved (and better documented) support for transactions simonw 9599 open 0     3 2020-07-08T04:56:51Z 2020-09-24T20:36:46Z   OWNER  

Originally posted by @simonw in https://github.com/simonw/sqlite-utils/pull/118#issuecomment-655283393

We should put some thought into how this library supports and encourages smart use of transactions.
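
For reference, a sketch of what the underlying sqlite3 connection already offers via its context-manager protocol:

```python
import sqlite_utils

db = sqlite_utils.Database("data.db")
# the plain sqlite3 protocol: commits on success, rolls back if the block raises
with db.conn:
    db.execute("CREATE TABLE IF NOT EXISTS counts (id INTEGER PRIMARY KEY, value INTEGER)")
    db.execute("INSERT OR REPLACE INTO counts VALUES (1, 0)")
```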

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/121/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
688352145 MDU6SXNzdWU2ODgzNTIxNDU= 141 insert-files support for compressed values simonw 9599 open 0     0 2020-08-28T20:59:46Z 2020-09-24T20:36:08Z   OWNER  

The sqlar format supports this, it would be useful if insert-files could support this too.

https://www.sqlite.org/sqlar.html

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/141/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
695441530 MDU6SXNzdWU2OTU0NDE1MzA= 154 OperationalError: cannot change into wal mode from within a transaction simonw 9599 open 0     2 2020-09-07T23:42:44Z 2020-09-07T23:47:10Z   OWNER  

I'm getting this error when running:

sqlite-utils enable-wal beta.db

OperationalError: cannot change into wal mode from within a transaction

I'm worried that maybe that's because of this new code from #152:

https://github.com/simonw/sqlite-utils/blob/deb2eb013ff85bbc828ebc244a9654f0d9c3139e/sqlite_utils/db.py#L128-L129
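
A workaround sketch while this is open: switching the connection to autocommit first means no transaction is active when the PRAGMA runs.

```python
import sqlite3

conn = sqlite3.connect("beta.db")
conn.isolation_level = None  # autocommit: nothing implicitly opens a transaction
conn.execute("PRAGMA journal_mode=wal")
```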

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/154/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
644161221 MDU6SXNzdWU2NDQxNjEyMjE= 117 Support for compound (composite) foreign keys simonw 9599 open 0     3 2020-06-23T21:33:42Z 2020-06-23T21:40:31Z   OWNER  

It turns out SQLite supports composite foreign keys: https://www.sqlite.org/foreignkeys.html#fk_composite

Their example looks like this:

```sql
CREATE TABLE album(
  albumartist TEXT,
  albumname TEXT,
  albumcover BINARY,
  PRIMARY KEY(albumartist, albumname)
);

CREATE TABLE song(
  songid     INTEGER,
  songartist TEXT,
  songalbum  TEXT,
  songname   TEXT,
  FOREIGN KEY(songartist, songalbum) REFERENCES album(albumartist, albumname)
);
```

Here's what that looks like in sqlite-utils:

```
In [1]: import sqlite_utils

In [2]: import sqlite3

In [3]: conn = sqlite3.connect(":memory:")

In [4]: conn
Out[4]: <sqlite3.Connection at 0x1087186c0>

In [5]: conn.executescript("""
   ...: CREATE TABLE album(
   ...:   albumartist TEXT,
   ...:   albumname TEXT,
   ...:   albumcover BINARY,
   ...:   PRIMARY KEY(albumartist, albumname)
   ...: );
   ...:
   ...: CREATE TABLE song(
   ...:   songid INTEGER,
   ...:   songartist TEXT,
   ...:   songalbum TEXT,
   ...:   songname TEXT,
   ...:   FOREIGN KEY(songartist, songalbum) REFERENCES album(albumartist, albumname)
   ...: );
   ...: """)
Out[5]: <sqlite3.Cursor at 0x1088def10>

In [6]: db = sqlite_utils.Database(conn)

In [7]: db.tables
Out[7]:
[<Table album (albumartist, albumname, albumcover)>,
 <Table song (songid, songartist, songalbum, songname)>]

In [8]: db.tables[0].foreign_keys
Out[8]: []

In [9]: db.tables[1].foreign_keys
Out[9]:
[ForeignKey(table='song', column='songartist', other_table='album', other_column='albumartist'),
 ForeignKey(table='song', column='songalbum', other_table='album', other_column='albumname')]
```

The table appears to have two separate foreign keys, when actually it has a single composite foreign key.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/117/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
539204432 MDU6SXNzdWU1MzkyMDQ0MzI= 70 Implement ON DELETE and ON UPDATE actions for foreign keys LucasElArruda 26292069 open 0     2 2019-12-17T17:19:10Z 2020-02-27T04:18:53Z   NONE  

Hi! I did not find any mention in the library's documentation of ON DELETE and ON UPDATE actions for foreign keys. Are those expected to be implemented? If not, it would be a nice thing to include!
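
They are not exposed yet, but the raw SQL form works through execute() - a sketch (parent/child are placeholder tables):

```python
import sqlite_utils

db = sqlite_utils.Database(memory=True)
db["parent"].insert({"id": 1}, pk="id")
# the actions live in the column definition, so they must be declared at
# CREATE TABLE time (and need PRAGMA foreign_keys=ON to be enforced)
db.execute("""
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES parent(id)
            ON DELETE CASCADE ON UPDATE CASCADE
    )
""")
```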

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/70/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
546073980 MDU6SXNzdWU1NDYwNzM5ODA= 74 Test failures on openSUSE 15.1: AssertionError: Explicit other_table and other_column jayvdb 15092 open 0     3 2020-01-07T04:35:50Z 2020-01-12T07:21:17Z   CONTRIBUTOR  

openSUSE 15.1 is using Python 3.6.5 and click 7.0; however it has test failures, while openSUSE Tumbleweed on py37 passes.

Most fail on the CLI exit code, like:

```py
[   74s] =================================== FAILURES ===================================
[   74s] _________________________________ test_tables __________________________________
[   74s]
[   74s] db_path = '/tmp/pytest-of-abuild/pytest-0/test_tables0/test.db'
[   74s]
[   74s]     def test_tables(db_path):
[   74s]         result = CliRunner().invoke(cli.cli, ["tables", db_path])
[   74s] >       assert '[{"table": "Gosh"},\n {"table": "Gosh2"}]' == result.output.strip()
[   74s] E       assert '[{"table": "...e": "Gosh2"}]' == ''
[   74s] E         - [{"table": "Gosh"},
[   74s] E         - {"table": "Gosh2"}]
[   74s]
[   74s] tests/test_cli.py:28: AssertionError
```

packaging project at https://build.opensuse.org/package/show/home:jayvdb:py-new/python-sqlite-utils

I'll keep digging into this after I have github-to-sqlite working on Tumbleweed, as I'll need openSUSE Leap 15.1 working before I can submit this into the main python repo.

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/74/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
, [active_lock_reason] TEXT, [performed_via_github_app] TEXT, [reactions] TEXT, [draft] INTEGER, [state_reason] TEXT);
CREATE INDEX [idx_issues_repo]
                ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
                ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
                ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
                ON [issues] ([user]);