
issues


6 rows where repo = 140912432 and user = 649467 sorted by updated_at descending


state 2

  • closed 5
  • open 1

type 1

  • issue 6

repo 1

  • sqlite-utils 6
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association pull_request body repo type active_lock_reason performed_via_github_app reactions draft state_reason
1392690202 I_kwDOCGYnMM5TAsQa 495 Support JSON values returned from .convert() functions mhalle 649467 closed 0     3 2022-09-30T16:33:49Z 2022-10-25T21:23:37Z 2022-10-25T21:23:28Z NONE  

When using the convert function on a JSON column, the result of the conversion function must be a string. If the return value is either a dict (object) or a list (array), the convert call will error out with an unhelpful user-defined function exception.

It makes sense that since the original column value was a string and required conversion to data structures, the result should be converted back into a JSON string as well. However, other functions auto-convert to JSON string representation, so the fact that convert doesn't could be surprising.

At least the documentation should note this requirement, because the sqlite error messages won't readily reveal the issue.

If only sqlite's JSON column type meant something :)
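
The failure mode described above can be reproduced with a minimal sketch that uses only the standard-library sqlite3 module rather than sqlite-utils itself. It assumes the conversion runs as a SQLite user-defined function, which is what the error message in the report suggests. The function names reverse_list and reverse_list_as_json are hypothetical, chosen for illustration.

```python
import json
import sqlite3


def reverse_list(value):
    # Hypothetical conversion: returns a Python list, which SQLite cannot store.
    return list(reversed(json.loads(value)))


def reverse_list_as_json(value):
    # Same conversion, but serialized back to a JSON string before returning.
    return json.dumps(reverse_list(value))


conn = sqlite3.connect(":memory:")
conn.execute("create table t (tags text)")
conn.execute("insert into t values (?)", ('["a", "b", "c"]',))
conn.create_function("reverse_list", 1, reverse_list)
conn.create_function("reverse_list_as_json", 1, reverse_list_as_json)

try:
    conn.execute("update t set tags = reverse_list(tags)")
    raw_error = None
except sqlite3.Error as exc:
    # The message does not mention that the return type was the problem.
    raw_error = str(exc)

# Returning a JSON string instead of a list works.
conn.execute("update t set tags = reverse_list_as_json(tags)")
stored = conn.execute("select tags from t").fetchone()[0]
print(raw_error)
print(stored)
```

Wrapping the return value in json.dumps() is the workaround the report implies; whether a library should do that automatically is the design question the issue raises.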

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/495/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
836829560 MDU6SXNzdWU4MzY4Mjk1NjA= 248 support for Apache Arrow / parquet files I/O mhalle 649467 open 0     1 2021-03-20T14:59:30Z 2021-10-28T23:46:48Z   NONE  

I just started looking at Apache Arrow using pyarrow for import and export of tabular datasets, and it looks quite compelling. It might be worth looking at for sqlite-utils and/or datasette.

As a test, I took a random jsonl data dump of a dataset I have with floats, strings, and ints and converted it to arrow's parquet format using the naive pyarrow.parquet.write_table() command, which has automatic type inference. It compressed down to 7% of the original size. Conversion of a 26MB JSON file and serializing it to parquet was eyeblink instantaneous. Parquet files are portable and can be directly imported into pandas and other analytics software.

The only hangup is the automatic type inference of the naive reader. It's great for general laziness and for parsing JSON columns (it correctly interpreted a table of mine with a JSON array). However, I did get an exception for a string column where most entries looked integer-like but had a couple values that weren't -- the reader tried to coerce all of them for some reason, even though the JSON type is string. Since the writer optionally takes a schema, it shouldn't be too hard to grab the sqlite header types. With some additional hinting, you might get datetime columns and JSON, which are native Arrow types.
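
As a rough sketch of the "grab the sqlite header types" idea, the declared column types can be read with PRAGMA table_info using only the standard library. The mapping SQLITE_TO_ARROW and the helper arrow_type_names are hypothetical; the Arrow types are represented here as plain strings, and turning them into a real pyarrow schema is deliberately left out.

```python
import sqlite3

# Hypothetical mapping from SQLite declared types to Arrow type names.
SQLITE_TO_ARROW = {
    "INTEGER": "int64",
    "REAL": "float64",
    "FLOAT": "float64",
    "TEXT": "string",
    "BLOB": "binary",
}


def arrow_type_names(conn, table):
    """Return {column name: Arrow type name} from the table's declared types."""
    rows = conn.execute(f"PRAGMA table_info([{table}])").fetchall()
    # Each row is (cid, name, declared type, notnull, default, pk).
    # Unknown or missing declared types fall back to "string" (an assumption).
    return {
        name: SQLITE_TO_ARROW.get(decl.upper(), "string")
        for _cid, name, decl, *_rest in rows
    }


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table items (id integer primary key, price float, label text, payload)"
)
schema_hint = arrow_type_names(conn, "items")
print(schema_hint)
```

Passing explicit types like these to the writer would avoid the inference step that tried to coerce the integer-looking string column.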

Somewhat tangentially, someone even wrote an sqlite virtual table extension for Parquet: https://cldellow.com/2018/06/22/sqlite-parquet-vtable.html

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/248/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
   
815554385 MDU6SXNzdWU4MTU1NTQzODU= 237 db["my_table"].drop(ignore=True) parameter, plus sqlite-utils drop-table --ignore and drop-view --ignore mhalle 649467 closed 0     3 2021-02-24T14:55:06Z 2021-02-25T17:11:41Z 2021-02-25T17:11:41Z NONE  

When I'm generating a derived table in python, I often drop the table and create it from scratch. However, the first time I generate the table, it doesn't exist, so the drop raises an exception. That means more boilerplate.

I was going to submit a pull request that adds an "if_exists" option to the drop method of tables and views.

However, for a utility like sqlite_utils, perhaps the "IF EXISTS" SQL semantics is what you want most of the time, and thus should be the default.
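
The difference between the two semantics can be seen directly in SQL. This is a minimal sketch using the standard-library sqlite3 module, not the sqlite-utils API; the table name derived is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A plain DROP TABLE fails the first time, because the table does not exist yet.
try:
    conn.execute("drop table derived")
    plain_drop_error = None
except sqlite3.OperationalError as exc:
    plain_drop_error = str(exc)

# DROP TABLE IF EXISTS is a no-op when the table is missing...
conn.execute("drop table if exists derived")

# ...and drops it when it is present, so the same code works on every run.
conn.execute("create table derived (id integer primary key, total integer)")
conn.execute("drop table if exists derived")
remaining = conn.execute(
    "select count(*) from sqlite_master where name = 'derived'"
).fetchone()[0]
print(plain_drop_error, remaining)
```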

What do you think?

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/237/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
783778672 MDU6SXNzdWU3ODM3Nzg2NzI= 220 Better error message for *_fts methods against views mhalle 649467 closed 0     3 2021-01-11T23:24:00Z 2021-02-22T20:44:51Z 2021-02-14T22:34:26Z NONE  

enable_fts and its related methods only work on tables, not views.

Could those methods and possibly others move up to the Queryable superclass?
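
One way a clearer error could be produced is to check sqlite_master before attempting to enable full-text search. The sketch below uses the standard-library sqlite3 module; require_table is a hypothetical helper and not part of sqlite-utils.

```python
import sqlite3


def require_table(conn, name):
    """Raise a descriptive error unless `name` refers to a real table."""
    row = conn.execute(
        "select type from sqlite_master where name = ?", (name,)
    ).fetchone()
    if row is None:
        raise ValueError(f"{name!r} does not exist")
    if row[0] != "table":
        raise TypeError(
            f"{name!r} is a {row[0]}; full-text search can only be enabled on tables"
        )


conn = sqlite3.connect(":memory:")
conn.execute("create table docs (title text)")
conn.execute("create view docs_view as select title from docs")

require_table(conn, "docs")  # passes silently for a table

try:
    require_table(conn, "docs_view")
    message = None
except TypeError as exc:
    message = str(exc)
print(message)
```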

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/220/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
707407567 MDU6SXNzdWU3MDc0MDc1Njc= 171 Idea: transitive closure tables for tree structures mhalle 649467 closed 0     2 2020-09-23T14:17:33Z 2020-10-22T04:38:35Z 2020-10-22T04:07:14Z NONE  

I just read that sqlite has a transitive closure table extension using a virtual table in order to represent trees:

https://charlesleifer.com/blog/querying-tree-structures-in-sqlite-using-python-and-the-transitive-closure-extension/

Even without this extension, though, a util to build a transitive closure table would allow trees to be queried easily. Since it relies on self-referential foreign keys, the relationships could probably be detected automatically.
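
A closure table can be built without the extension by using a recursive common table expression. The sketch below is one possible approach with the standard-library sqlite3 module; the category and category_closure tables and their columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table category (
        id integer primary key,
        name text,
        parent_id integer references category(id)
    );
    insert into category values
        (1, 'root', null),
        (2, 'child', 1),
        (3, 'grandchild', 2),
        (4, 'sibling', 1);
    create table category_closure (
        ancestor integer,
        descendant integer,
        depth integer,
        primary key (ancestor, descendant)
    );
    """
)

# Start with every node as its own ancestor at depth 0, then repeatedly
# extend each (ancestor, descendant) pair by one parent -> child step.
conn.execute(
    """
    with recursive closure(ancestor, descendant, depth) as (
        select id, id, 0 from category
        union all
        select closure.ancestor, category.id, closure.depth + 1
        from closure
        join category on category.parent_id = closure.descendant
    )
    insert into category_closure
    select ancestor, descendant, depth from closure
    """
)

# With the closure table in place, "all descendants" is a flat lookup.
descendants_of_root = [
    row[0]
    for row in conn.execute(
        "select descendant from category_closure "
        "where ancestor = 1 and depth > 0 order by descendant"
    )
]
closure_rows = conn.execute("select count(*) from category_closure").fetchone()[0]
print(descendants_of_root, closure_rows)
```

The self-referential column could in principle be discovered with PRAGMA foreign_key_list instead of being hard-coded, which is the automatic detection the issue hints at.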

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/171/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
432727685 MDU6SXNzdWU0MzI3Mjc2ODU= 20 JSON column values get extraneously quoted mhalle 649467 closed 0   1.0 4348046 1 2019-04-12T20:15:30Z 2019-05-25T00:57:19Z 2019-05-25T00:57:19Z NONE  

If the input to sqlite-utils insert includes a column that is a JSON array or object, sqlite-utils query will introduce an extra level of quoting on output:

```
echo '[{"key": ["one", "two", "three"]}]' | sqlite-utils insert t.db t -

sqlite-utils t.db 'select * from t'
[{"key": "[\"one\", \"two\", \"three\"]"}]

sqlite3 t.db 'select * from t'
["one", "two", "three"]
```

This might require an imperfect solution, since sqlite3 doesn't have a JSON type. Perhaps fields that start with [" or {" and end with "] or "} could be detected, with a flag to turn off that behavior for weird text fields (or vice versa).
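
A sketch of the detection idea follows. Rather than matching the exact prefixes and suffixes suggested above, it checks only the first character and then attempts a real JSON parse, falling back to the original text on failure. The helper maybe_json and the detect flag are hypothetical names, not sqlite-utils options.

```python
import json


def maybe_json(value, detect=True):
    """Decode text that looks like a JSON array or object; otherwise return it unchanged."""
    if not detect or not isinstance(value, str):
        return value
    if value[:1] not in ("[", "{"):
        return value
    try:
        return json.loads(value)
    except ValueError:
        # Looked like JSON but was not: keep the original text.
        return value


row = {"key": '["one", "two", "three"]', "note": "[not json", "n": 5}
decoded = {column: maybe_json(value) for column, value in row.items()}
print(json.dumps(decoded))
```

Passing detect=False keeps every value as stored, which corresponds to the opt-out flag for unusual text fields mentioned above.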

sqlite-utils 140912432 issue    
{
    "url": "https://api.github.com/repos/simonw/sqlite-utils/issues/20/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT,
   [active_lock_reason] TEXT,
   [performed_via_github_app] TEXT,
   [reactions] TEXT,
   [draft] INTEGER,
   [state_reason] TEXT
);
CREATE INDEX [idx_issues_repo]
   ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
   ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
   ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
   ON [issues] ([user]);
Powered by Datasette · Queries took 58.438ms · About: github-to-sqlite