
issue_comments


16 rows where issue = 1095570074 (`--batch-size 1` doesn't seem to commit for every item), sorted by updated_at descending. All 16 comments are by simonw (OWNER).
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1008557414 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008557414 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48HV1m simonw 9599 2022-01-10T05:36:19Z 2022-01-10T05:36:19Z OWNER

That did the trick.

1008546573 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008546573 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48HTMN simonw 9599 2022-01-10T05:05:15Z 2022-01-10T05:05:15Z OWNER

Bit nasty but it might work:

```python
def try_until(expected):
    tries = 0
    while True:
        rows = list(Database(db_path)["rows"].rows)
        if rows == expected:
            return
        tries += 1
        if tries > 10:
            assert False, "Expected {}, got {}".format(expected, rows)
        time.sleep(tries * 0.1)


try_until([{"name": "Azi"}])
proc.stdin.write(b'{"name": "Suna"}\n')
proc.stdin.flush()
try_until([{"name": "Azi"}, {"name": "Suna"}])
```

1008545140 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008545140 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48HS10 simonw 9599 2022-01-10T05:01:34Z 2022-01-10T05:01:34Z OWNER

Urgh, tests are still failing intermittently - for example:

```
    time.sleep(0.4)
>   assert list(Database(db_path)["rows"].rows) == [{"name": "Azi"}]
E   AssertionError: assert [] == [{'name': 'Azi'}]
E   Right contains one more item: {'name': 'Azi'}
E   Full diff:
E   - [{'name': 'Azi'}]
E   + []
```

I'm going to change this code to keep on trying for up to 10 seconds - that should get the tests to pass faster on most machines.

1008537194 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008537194 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48HQ5q simonw 9599 2022-01-10T04:29:53Z 2022-01-10T04:31:29Z OWNER

After a bunch of debugging with print() statements it's clear that the problem isn't with when things are committed or the size of the batches - it's that the data sent to standard input is all being processed in one go, not a line at a time.

I think that's because it is being buffered by this: https://github.com/simonw/sqlite-utils/blob/d2a79d200f9071a86027365fa2a576865b71064f/sqlite_utils/cli.py#L759-L770

The buffering is there so that we can sniff the first few bytes to detect if it's a CSV file - added in 99ff0a288c08ec2071139c6031eb880fa9c95310 for #230. So maybe for non-CSV inputs we should disable buffering?
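One possible approach, as a minimal sketch (this is not the actual sqlite-utils code - `sniff_format` and the 2048-byte window are made up here): wrap the raw stream in an `io.BufferedReader` and use `peek()`, which returns already-buffered bytes without advancing the stream, so the caller can still consume the input line by line as it arrives.

```python
import io


def sniff_format(raw):
    # Hypothetical helper (not the real sqlite-utils code): peek at the
    # first bytes without consuming them. Note peek() may return fewer
    # bytes than requested - it only exposes what is in the buffer.
    buffered = io.BufferedReader(raw)
    first_line = buffered.peek(2048).split(b"\n", 1)[0]
    looks_like_csv = b"," in first_line and not first_line.startswith(b"{")
    return looks_like_csv, buffered


raw = io.BytesIO(b'{"id": 1}\n{"id": 2}\n')
is_csv, stream = sniff_format(raw)
lines = list(stream)  # peek() consumed nothing, so every line is still here
```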

1008526736 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008526736 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48HOWQ simonw 9599 2022-01-10T04:07:29Z 2022-01-10T04:07:29Z OWNER

I think this test is right:

```python
def test_insert_streaming_batch_size_1(db_path):
    # https://github.com/simonw/sqlite-utils/issues/364
    # Streaming with --batch-size 1 should commit on each record
    # Can't use CliRunner().invoke() here because we need to
    # run assertions in between writing to process stdin
    proc = subprocess.Popen(
        [
            sys.executable,
            "-m",
            "sqlite_utils",
            "insert",
            db_path,
            "rows",
            "-",
            "--nl",
            "--batch-size",
            "1",
        ],
        stdin=subprocess.PIPE,
    )
    proc.stdin.write(b'{"name": "Azi"}')
    proc.stdin.flush()
    assert list(Database(db_path)["rows"].rows) == [{"name": "Azi"}]
    proc.stdin.write(b'{"name": "Suna"}')
    proc.stdin.flush()
    assert list(Database(db_path)["rows"].rows) == [
        {"name": "Azi"},
        {"name": "Suna"},
    ]
    proc.stdin.close()
    proc.wait()
```

1008234293 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008234293 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48GG81 simonw 9599 2022-01-09T05:37:02Z 2022-01-09T05:37:02Z OWNER

Calling `p.stdin.close()` and then `p.wait()` terminates the subprocess.
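That shutdown pattern can be sketched in a self-contained way (a trivial stand-in child is used here instead of the real sqlite-utils subprocess): closing stdin delivers EOF to the child, and `wait()` reaps its exit status.

```python
import subprocess
import sys

# Stand-in child process: reads stdin until EOF, then echoes it back
p = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)
p.stdin.write(b'{"name": "Azi"}\n')
p.stdin.close()        # sends EOF - the child's stdin.read() returns
returncode = p.wait()  # reaps the exit status
echoed = p.stdout.read()
```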

1008233910 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008233910 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48GG22 simonw 9599 2022-01-09T05:32:53Z 2022-01-09T05:35:45Z OWNER

This is strange. The following:

```pycon
>>> import subprocess
>>> p = subprocess.Popen(
...     ["sqlite-utils", "insert", "/tmp/stream.db", "stream", "-", "--nl"],
...     stdin=subprocess.PIPE,
... )
>>> p.stdin.write(b"\n".join(b'{"id": %s}' % str(i).encode("utf-8") for i in range(1000)))
11889
```

At this point /tmp/stream.db is still 0 bytes - but if I then run this:

```pycon
>>> p.stdin.close()
```

/tmp/stream.db is now 20K and contains the written data.

No wait, mystery solved - I can add `p.stdin.flush()` instead of `p.stdin.close()` and the file suddenly jumps up in size.
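That behaviour can be sketched as follows (a stand-in child copies stdin into a temp file, standing in for the SQLite database): `p.stdin` is a `BufferedWriter`, so written bytes sit in the parent's buffer until `flush()` - or `close()`, which flushes - actually pushes them down the pipe.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "received.txt")
    # Stand-in child: copies everything it reads on stdin into a file
    p = subprocess.Popen(
        [
            sys.executable,
            "-c",
            f"import sys; open({path!r}, 'wb').write(sys.stdin.buffer.read())",
        ],
        stdin=subprocess.PIPE,
    )
    p.stdin.write(b'{"id": 1}\n')  # may sit in the parent's BufferedWriter...
    p.stdin.flush()                # ...until flush() sends it to the child
    p.stdin.close()
    p.wait()
    received = open(path, "rb").read()
```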

1008216201 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008216201 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48GCiJ simonw 9599 2022-01-09T02:34:12Z 2022-01-09T02:34:12Z OWNER

I can now write tests that look like this: https://github.com/simonw/sqlite-utils/blob/539f5ccd90371fa87f946018f8b77d55929e06db/tests/test_cli.py#L2024-L2030

Which means I can write a test that exercises this bug.

1008214998 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008214998 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48GCPW simonw 9599 2022-01-09T02:23:20Z 2022-01-09T02:23:20Z OWNER

Possible way of running the test: add this to `sqlite_utils/cli.py`:

```python
if __name__ == "__main__":
    cli()
```

Now the tool can be run using `python -m sqlite_utils.cli --help`

Then in the test use `subprocess` to call `sys.executable` (the path to the current Python interpreter) and pass it `-m sqlite_utils.cli` to run the script!
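The pattern looks something like this - a stdlib module (`json.tool`) stands in for `sqlite_utils.cli` here so the sketch runs anywhere:

```python
import subprocess
import sys

# sys.executable is the current interpreter; -m runs a module as a script.
# json.tool is a stand-in for sqlite_utils.cli in this sketch.
result = subprocess.run(
    [sys.executable, "-m", "json.tool"],
    input=b'{"name": "Azi"}',
    capture_output=True,
)
```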

1008214406 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008214406 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48GCGG simonw 9599 2022-01-09T02:18:21Z 2022-01-09T02:18:21Z OWNER

I'm having trouble figuring out the best way to write a unit test for this. Filed a relevant feature request for Click here: - https://github.com/pallets/click/issues/2171

1008155916 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008155916 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48Fz0M simonw 9599 2022-01-08T21:16:46Z 2022-01-08T21:16:46Z OWNER

No, `chunks()` seems to work OK in the test I just added.

1008154873 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008154873 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48Fzj5 simonw 9599 2022-01-08T21:11:55Z 2022-01-08T21:11:55Z OWNER

I'm suspicious that the `chunks()` utility function may not be working correctly:

```pycon
In [10]: [list(d) for d in list(chunks('abc', 5))]
Out[10]: [['a'], ['b'], ['c']]

In [11]: [list(d) for d in list(chunks('abcdefghi', 5))]
Out[11]: [['a'], ['b'], ['c'], ['d'], ['e'], ['f'], ['g'], ['h'], ['i']]

In [12]: [list(d) for d in list(chunks('abcdefghi', 3))]
Out[12]: [['a'], ['b'], ['c'], ['d'], ['e'], ['f'], ['g'], ['h'], ['i']]
```
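For reference, an `islice`-based batching helper with the intended behaviour might look like this (a sketch only - the actual `chunks()` implementation in sqlite-utils isn't shown in this thread):

```python
import itertools


def chunks(sequence, size):
    # Sketch of the intended behaviour, not the sqlite-utils source:
    # yield successive lists of up to `size` items from any iterable.
    iterator = iter(sequence)
    while True:
        batch = list(itertools.islice(iterator, size))
        if not batch:
            return
        yield batch


grouped = [list(d) for d in chunks("abcdefghi", 3)]
# intended grouping: three chunks of three, not nine chunks of one
```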

1008153586 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008153586 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48FzPy simonw 9599 2022-01-08T21:06:15Z 2022-01-08T21:06:15Z OWNER

I added a print statement after `for query, params in queries_and_params` and confirmed that something in the code is waiting until 16 records are available to be inserted and then executing the inserts, even with `--batch-size 1`.

1008151884 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008151884 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48Fy1M simonw 9599 2022-01-08T20:59:21Z 2022-01-08T20:59:21Z OWNER

(That Heroku example doesn't record the timestamp, which limits its usefulness)

1008143248 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008143248 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48FwuQ simonw 9599 2022-01-08T20:34:12Z 2022-01-08T20:34:12Z OWNER

Built that tool: https://github.com/simonw/stream-delay and https://pypi.org/project/stream-delay/

1008129841 https://github.com/simonw/sqlite-utils/issues/364#issuecomment-1008129841 https://api.github.com/repos/simonw/sqlite-utils/issues/364 IC_kwDOCGYnMM48Ftcx simonw 9599 2022-01-08T20:04:42Z 2022-01-08T20:04:42Z OWNER

It would be easier to test this if I had a utility for streaming out a file one line at a time.

A few recipes for this in https://superuser.com/questions/526242/cat-file-to-terminal-at-particular-speed-of-lines-per-second - I'm going to build a quick stream-delay tool though.


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 34.197ms · About: github-to-sqlite