html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006219956,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006219956,IC_kwDOCGYnMM47-bK0,22429695,2022-01-06T01:51:54Z,2022-01-06T06:22:25Z,NONE,"# [Codecov](https://codecov.io/gh/simonw/sqlite-utils/pull/361?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) Report
> Merging [#361](https://codecov.io/gh/simonw/sqlite-utils/pull/361?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (b7f0b88) into [main](https://codecov.io/gh/simonw/sqlite-utils/commit/f3fd8613113d21d44238a6ec54b375f5aa72c4e0?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) (f3fd861) will **decrease** coverage by `0.05%`.
> The diff coverage is `92.85%`.
[![Impacted file tree graph](https://codecov.io/gh/simonw/sqlite-utils/pull/361/graphs/tree.svg?width=650&height=150&src=pr&token=O0X3703L9P&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)](https://codecov.io/gh/simonw/sqlite-utils/pull/361?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)
```diff
@@            Coverage Diff             @@
##             main     #361      +/-   ##
==========================================
- Coverage   96.49%   96.44%   -0.06%
==========================================
  Files           5        5
  Lines        2283     2306      +23
==========================================
+ Hits         2203     2224      +21
- Misses         80       82       +2
```
| [Impacted Files](https://codecov.io/gh/simonw/sqlite-utils/pull/361?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison) | Coverage Δ | |
|---|---|---|
| [sqlite\_utils/cli.py](https://codecov.io/gh/simonw/sqlite-utils/pull/361/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison#diff-c3FsaXRlX3V0aWxzL2NsaS5weQ==) | `95.49% <92.00%> (-0.11%)` | :arrow_down: |
| [sqlite\_utils/utils.py](https://codecov.io/gh/simonw/sqlite-utils/pull/361/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison#diff-c3FsaXRlX3V0aWxzL3V0aWxzLnB5) | `94.23% <100.00%> (ø)` | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/simonw/sqlite-utils/pull/361?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/simonw/sqlite-utils/pull/361?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Last update [f3fd861...b7f0b88](https://codecov.io/gh/simonw/sqlite-utils/pull/361?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Simon+Willison).
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006315145,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006315145,IC_kwDOCGYnMM47-yaJ,9599,2022-01-06T06:20:51Z,2022-01-06T06:20:51Z,OWNER,This is all documented. I'm going to rebase-merge it to keep the individual commits.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006311742,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006311742,IC_kwDOCGYnMM47-xk-,9599,2022-01-06T06:12:19Z,2022-01-06T06:12:19Z,OWNER,"Got that working:
```
% echo 'This is cool' | sqlite-utils insert words.db words - --text --convert '({""word"": w} for w in text.split())'
% sqlite-utils dump words.db
BEGIN TRANSACTION;
CREATE TABLE [words] (
[word] TEXT
);
INSERT INTO ""words"" VALUES('This');
INSERT INTO ""words"" VALUES('is');
INSERT INTO ""words"" VALUES('cool');
COMMIT;
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006309834,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006309834,IC_kwDOCGYnMM47-xHK,9599,2022-01-06T06:08:01Z,2022-01-06T06:08:01Z,OWNER,"For `--text` the conversion function should be allowed to return an iterable instead of a dictionary, in which case it will be treated as the full list of records to be inserted.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006301546,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006301546,IC_kwDOCGYnMM47-vFq,9599,2022-01-06T05:44:47Z,2022-01-06T05:44:47Z,OWNER,Just need documentation for `--convert` now against the various different types of input.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006300280,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006300280,IC_kwDOCGYnMM47-ux4,9599,2022-01-06T05:40:45Z,2022-01-06T05:40:45Z,OWNER,"I'm going to rename `--all` to `--text`:
> - Use `--text` to write the entire input to a column called ""text""
To avoid that clash with Python's `all()` function.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006299778,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006299778,IC_kwDOCGYnMM47-uqC,9599,2022-01-06T05:39:10Z,2022-01-06T05:39:10Z,OWNER,`all` is a bad variable name because it clashes with the Python `all()` built-in function.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006295276,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006295276,IC_kwDOCGYnMM47-tjs,9599,2022-01-06T05:26:11Z,2022-01-06T05:26:11Z,OWNER,"Here's the traceback if your `--convert` function doesn't return a dict right now:
```
% sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert 'all.upper()' --all
Traceback (most recent call last):
  File ""/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/bin/sqlite-utils"", line 33, in <module>
    sys.exit(load_entry_point('sqlite-utils', 'console_scripts', 'sqlite-utils')())
  File ""/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py"", line 1137, in __call__
    return self.main(*args, **kwargs)
  File ""/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py"", line 1062, in main
    rv = self.invoke(ctx)
  File ""/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py"", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File ""/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py"", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File ""/Users/simon/.local/share/virtualenvs/sqlite-utils-C4Ilevlm/lib/python3.8/site-packages/click/core.py"", line 763, in invoke
    return __callback(*args, **kwargs)
  File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py"", line 949, in insert
    insert_upsert_implementation(
  File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py"", line 834, in insert_upsert_implementation
    db[table].insert_all(
  File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py"", line 2602, in insert_all
    first_record = next(records)
  File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/db.py"", line 3044, in fix_square_braces
    for record in records:
  File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/cli.py"", line 831, in <genexpr>
    docs = (decode_base64_values(doc) for doc in docs)
  File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/utils.py"", line 86, in decode_base64_values
    to_fix = [
  File ""/Users/simon/Dropbox/Development/sqlite-utils/sqlite_utils/utils.py"", line 89, in <listcomp>
    if isinstance(doc[k], dict)
TypeError: string indices must be integers
```
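The final `TypeError` is what Python raises when a string is indexed with a string key. A minimal sketch reproducing it, assuming a simplified stand-in for the check in `decode_base64_values` (the helper name `keys_with_dict_values` is hypothetical and not part of sqlite-utils):

```python
def keys_with_dict_values(doc):
    # Hypothetical, simplified stand-in for the list comprehension in the
    # traceback: it indexes the record by each key it iterates over.
    return [k for k in doc if isinstance(doc[k], dict)]


# A dict record works: iterating yields keys, and doc[k] looks each one up.
ok_result = keys_with_dict_values({'all': 'INFO: ...'})

# A str record (what 'all.upper()' returns) fails: iterating yields single
# characters, and indexing a str with a str raises TypeError.
try:
    keys_with_dict_values('INFO: ...')
    error_message = None
except TypeError as ex:
    error_message = str(ex)
```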
I can live with that for the moment.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006294777,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006294777,IC_kwDOCGYnMM47-tb5,9599,2022-01-06T05:24:54Z,2022-01-06T05:24:54Z,OWNER,"> I added a custom error message for if the user's `--convert` code doesn't return a dict.
That turned out to be a bad idea because it meant exhausting the iterator early for the check - before we got to the `.insert_all()` code that breaks the iterator up into chunks. I tried fixing that with `itertools.tee()` to run the generator twice but that's grossly memory-inefficient for large imports.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006288444,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006288444,IC_kwDOCGYnMM47-r48,9599,2022-01-06T05:07:10Z,2022-01-06T05:07:10Z,OWNER,"And here's a demo of `--convert` used with `--all` - I added a custom error message for if the user's `--convert` code doesn't return a dict.
```
% sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert 'all.upper()' --all
Error: Records returned by your --convert function must be dicts
% sqlite-utils insert /tmp/all.db blah /tmp/log.log --convert '{""all"": all.upper()}' --all
% sqlite-utils dump /tmp/all.db
BEGIN TRANSACTION;
CREATE TABLE [blah] (
[all] TEXT
);
INSERT INTO ""blah"" VALUES('INFO: 127.0.0.1:60581 - ""GET / HTTP/1.1"" 200 OK
INFO: 127.0.0.1:60581 - ""GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1"" 200 OK
INFO: 127.0.0.1:60581 - ""GET /FAVICON.ICO HTTP/1.1"" 200 OK
INFO: 127.0.0.1:60581 - ""GET /FOO/TIDDLYWIKI HTTP/1.1"" 200 OK
INFO: 127.0.0.1:60581 - ""GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1"" 200 OK
INFO: 127.0.0.1:60584 - ""GET /FOO/-/STATIC/SQL-FORMATTER-2.3.3.MIN.JS HTTP/1.1"" 200 OK
INFO: 127.0.0.1:60586 - ""GET /FOO/-/STATIC/CODEMIRROR-5.57.0.MIN.JS HTTP/1.1"" 200 OK
INFO: 127.0.0.1:60585 - ""GET /FOO/-/STATIC/CODEMIRROR-5.57.0.MIN.CSS HTTP/1.1"" 200 OK
INFO: 127.0.0.1:60588 - ""GET /FOO/-/STATIC/CODEMIRROR-5.57.0-SQL.MIN.JS HTTP/1.1"" 200 OK
INFO: 127.0.0.1:60587 - ""GET /FOO/-/STATIC/CM-RESIZE-1.0.1.MIN.JS HTTP/1.1"" 200 OK
INFO: 127.0.0.1:60586 - ""GET /FOO/TIDDLYWIKI/TIDDLERS HTTP/1.1"" 200 OK
INFO: 127.0.0.1:60586 - ""GET /FOO/-/STATIC/APP.CSS?CEAD5A HTTP/1.1"" 200 OK
INFO: 127.0.0.1:60584 - ""GET /FOO/-/STATIC/TABLE.JS HTTP/1.1"" 200 OK
');
COMMIT;
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006284673,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006284673,IC_kwDOCGYnMM47-q-B,9599,2022-01-06T04:55:52Z,2022-01-06T04:55:52Z,OWNER,"Test code that just worked for me:
```
sqlite-utils insert /tmp/blah.db blah /tmp/log.log --convert '
bits = line.split()
return dict([(""b_{}"".format(i), bit) for i, bit in enumerate(bits)])' --lines
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006232013,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006232013,IC_kwDOCGYnMM47-eHN,9599,2022-01-06T02:21:35Z,2022-01-06T02:21:35Z,OWNER,"I'm having second thoughts about this bit:
> Your Python code will be passed a ""row"" variable representing the imported row, and can return a modified row.
>
> If you are using `--lines` your code will be passed a ""line"" variable, and for `--all` an ""all"" variable.
The code in question is this:
https://github.com/simonw/sqlite-utils/blob/500a35ad4d91c8a6232134ce9406efec11bedff8/sqlite_utils/utils.py#L296-L303
Do I really want to add the complexity of supporting different variable names there? I think always using `value` might be better.
Except... `value` made sense for the existing `sqlite-utils convert` command where you are running a conversion function against the value for the column in the current row - is it confusing if applied to lines or documents or `all`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006230411,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006230411,IC_kwDOCGYnMM47-duL,9599,2022-01-06T02:17:35Z,2022-01-06T02:17:35Z,OWNER,"Documentation: https://github.com/simonw/sqlite-utils/blob/33223856ff7fe746b7b77750fbe5b218531d0545/docs/cli.rst#inserting-unstructured-data-with---lines-and---all - I went with a single section titled ""Inserting unstructured data with --lines and --all""","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006220129,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006220129,IC_kwDOCGYnMM47-bNh,9599,2022-01-06T01:52:26Z,2022-01-06T01:52:26Z,OWNER,I'm going to refactor all of the tests for `sqlite-utils insert` into a new `test_cli_insert.py` module.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,
https://github.com/simonw/sqlite-utils/pull/361#issuecomment-1006219848,https://api.github.com/repos/simonw/sqlite-utils/issues/361,1006219848,IC_kwDOCGYnMM47-bJI,9599,2022-01-06T01:51:36Z,2022-01-06T01:51:36Z,OWNER,"So far I've just implemented the new help:
```
% sqlite-utils insert --help
Usage: sqlite-utils insert [OPTIONS] PATH TABLE FILE
Insert records from FILE into a table, creating the table if it does not
already exist.
By default the input is expected to be a JSON array of objects. Or:
- Use --nl for newline-delimited JSON objects
- Use --csv or --tsv for comma-separated or tab-separated input
- Use --lines to write each incoming line to a column called ""line""
- Use --all to write the entire input to a column called ""all""
You can also use --convert to pass a fragment of Python code that will be
used to convert each input.
Your Python code will be passed a ""row"" variable representing the imported
row, and can return a modified row.
If you are using --lines your code will be passed a ""line"" variable, and for
--all an ""all"" variable.
Options:
--pk TEXT Columns to use as the primary key, e.g. id
--flatten Flatten nested JSON objects, so {""a"": {""b"": 1}}
becomes {""a_b"": 1}
--nl Expect newline-delimited JSON
-c, --csv Expect CSV input
--tsv Expect TSV input
--lines Treat each line as a single value called 'line'
--all Treat input as a single value called 'all'
--convert TEXT Python code to convert each item
--import TEXT Python modules to import
--delimiter TEXT Delimiter to use for CSV files
--quotechar TEXT Quote character to use for CSV/TSV
--sniff Detect delimiter and quote character
--no-headers CSV file has no header row
--batch-size INTEGER Commit every X records
--alter Alter existing table to add any missing columns
--not-null TEXT Columns that should be created as NOT NULL
--default <TEXT TEXT>... Default value that should be set for a column
--encoding TEXT Character encoding for input, defaults to utf-8
-d, --detect-types Detect types for columns in CSV/TSV data
--load-extension TEXT SQLite extensions to load
--silent Do not show progress bar
--ignore Ignore records if pk already exists
--replace Replace records if pk already exists
--truncate Truncate table before inserting records, if table
already exists
-h, --help Show this message and exit.
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1094890366,