issue_comments

16 rows where issue = 816526538 sorted by updated_at descending

user 3

  • simonw 13
  • tmaier 2
  • hubgit 1

author_association 2

  • OWNER 13
  • NONE 3

issue 1

  • sqlite-utils extract could handle nested objects · 16
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1236214402 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-1236214402 https://api.github.com/repos/simonw/sqlite-utils/issues/239 IC_kwDOCGYnMM5JryKC simonw 9599 2022-09-03T23:46:02Z 2022-09-03T23:46:02Z OWNER

Yeah, having a version of this that can set up m2m relationships would definitely be interesting.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
1236200834 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-1236200834 https://api.github.com/repos/simonw/sqlite-utils/issues/239 IC_kwDOCGYnMM5Jru2C hubgit 14294 2022-09-03T21:26:32Z 2022-09-03T21:26:32Z NONE

I was looking for something like this today, for extracting columns containing objects (and arrays of objects) into separate tables.

Would it make sense (especially for the fields containing arrays of objects) to create a one-to-many relationship, where each row of the newly created table would contain the id of the row that originally contained it?

If the extracted objects have a unique id and are repeated, it could even create a many-to-many relationship, with a third table for the joins.
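
The one-to-many idea above can be sketched with nothing but the standard library. This is not sqlite-utils code: `extract_one_to_many` and the `reports`, `tags` and `report_tags` names are hypothetical, and the sketch assumes every array element is a flat object whose values SQLite can store directly.

```python
import json
import sqlite3


def extract_one_to_many(conn, table, pk, column, child_table, fk_column):
    """Hypothetical helper: copy each object in a JSON array column into
    child_table, tagged with the primary key of the row it came from."""
    rows = conn.execute(
        f'SELECT "{pk}", "{column}" FROM "{table}" ORDER BY "{pk}"'
    ).fetchall()
    children = [
        (parent_id, obj)
        for parent_id, raw in rows
        for obj in json.loads(raw or "[]")
    ]
    # Column names are only known once every object has been parsed.
    keys = []
    for _, obj in children:
        for key in obj:
            if key not in keys:
                keys.append(key)
    columns = [fk_column] + keys
    column_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE "{child_table}" ({column_sql})')
    conn.executemany(
        f'INSERT INTO "{child_table}" ({column_sql}) VALUES ({placeholders})',
        [[parent_id] + [obj.get(k) for k in keys] for parent_id, obj in children],
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (id INTEGER PRIMARY KEY, title TEXT, tags TEXT)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?)",
    [
        (1, "First", json.dumps([{"name": "bug"}, {"name": "cli"}])),
        (2, "Second", json.dumps([{"name": "bug"}])),
    ],
)
extract_one_to_many(conn, "reports", "id", "tags", "report_tags", "report_id")
child_rows = conn.execute(
    "SELECT report_id, name FROM report_tags ORDER BY rowid"
).fetchall()
```

Each child row carries the id of its parent, so repeated objects (here the "bug" tag) are stored once per parent rather than shared.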

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
960295228 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-960295228 https://api.github.com/repos/simonw/sqlite-utils/issues/239 IC_kwDOCGYnMM45PPE8 tmaier 350038 2021-11-03T23:35:37Z 2021-11-03T23:36:50Z NONE

I think I only wonder how I would parse the JSON value within such a lambda...

My naive approach would have been:

$ sqlite-utils convert demo.db statuses statuses 'return value' --multi

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
960292442 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-960292442 https://api.github.com/repos/simonw/sqlite-utils/issues/239 IC_kwDOCGYnMM45POZa tmaier 350038 2021-11-03T23:28:55Z 2021-11-03T23:28:55Z NONE

I am super interested in this feature.

After reading the other issues you referenced, I think the right way would be to use the current extract feature and then use sqlite-utils convert to extract the JSON object into individual columns.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
786830832 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786830832 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NjgzMDgzMg== simonw 9599 2021-02-26T18:52:40Z 2021-02-26T18:52:40Z OWNER

Could this handle lists of objects too? That would be pretty amazing - if the column has a [{...}, {...}] list in it, this could turn that into a many-to-many relationship.
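
A rough sketch of what turning a list column into a many-to-many relationship could involve, using only the standard library. Everything here is an assumption for illustration: `extract_many_to_many`, the table names, and the choice to store each distinct object as canonical JSON text in a single `data` column rather than expanding it into columns.

```python
import json
import sqlite3


def extract_many_to_many(conn, table, pk, column, other_table, join_table):
    """Hypothetical helper: store each distinct object once in other_table
    and record (parent id, object id) pairs in join_table."""
    rows = conn.execute(
        f'SELECT "{pk}", "{column}" FROM "{table}" ORDER BY "{pk}"'
    ).fetchall()
    conn.execute(
        f'CREATE TABLE "{other_table}" (id INTEGER PRIMARY KEY, data TEXT UNIQUE)'
    )
    conn.execute(
        f'CREATE TABLE "{join_table}" '
        "(left_id INTEGER, right_id INTEGER, PRIMARY KEY (left_id, right_id))"
    )
    for parent_id, raw in rows:
        for obj in json.loads(raw or "[]"):
            # sort_keys makes equal objects serialize identically.
            canonical = json.dumps(obj, sort_keys=True)
            conn.execute(
                f'INSERT OR IGNORE INTO "{other_table}" (data) VALUES (?)',
                (canonical,),
            )
            (other_id,) = conn.execute(
                f'SELECT id FROM "{other_table}" WHERE data = ?', (canonical,)
            ).fetchone()
            conn.execute(
                f'INSERT OR IGNORE INTO "{join_table}" VALUES (?, ?)',
                (parent_id, other_id),
            )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (id INTEGER PRIMARY KEY, tags TEXT)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?)",
    [
        (1, json.dumps([{"name": "bug"}, {"name": "cli"}])),
        (2, json.dumps([{"name": "bug"}])),
    ],
)
extract_many_to_many(conn, "reports", "id", "tags", "tags", "reports_tags")
tag_count = conn.execute("SELECT count(*) FROM tags").fetchone()[0]
pairs = conn.execute(
    "SELECT left_id, right_id FROM reports_tags ORDER BY left_id, right_id"
).fetchall()
```

Unlike a one-to-many extraction, the shared "bug" object is stored once and linked from both reports.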

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
786795132 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786795132 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4Njc5NTEzMg== simonw 9599 2021-02-26T17:45:53Z 2021-02-26T17:45:53Z OWNER

If there's no primary key in the JSON, this could use the hash_id mechanism.
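
sqlite-utils lets an insert derive a row's primary key from a hash of its content via `hash_id`. The exact hashing it performs is not shown in this thread, so the following is only an approximation of the idea; `content_hash_id` is a hypothetical name.

```python
import hashlib
import json


def content_hash_id(record):
    """Derive a stable ID from a record's content.

    Keys are sorted before hashing, so key order does not affect the ID.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf8")).hexdigest()


a = content_hash_id({"name": "Cleo", "email": "cleo@example.com"})
b = content_hash_id({"email": "cleo@example.com", "name": "Cleo"})
c = content_hash_id({"name": "Pancakes", "email": "cleo@example.com"})
```

Here `a` and `b` are equal because the two records hold the same content, while `c` differs, so identical extracted objects would collapse into a single row.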

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
786794435 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786794435 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4Njc5NDQzNQ== simonw 9599 2021-02-26T17:44:38Z 2021-02-26T17:44:38Z OWNER

This came up in office hours!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
786035142 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-786035142 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NjAzNTE0Mg== simonw 9599 2021-02-25T16:36:17Z 2021-02-25T16:36:17Z OWNER

WIP in a pull request.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785992158 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785992158 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk5MjE1OA== simonw 9599 2021-02-25T15:37:04Z 2021-02-25T15:37:04Z OWNER

Here's the current implementation of .extract(): https://github.com/simonw/sqlite-utils/blob/806c21044ac8d31da35f4c90600e98115aade7c6/sqlite_utils/db.py#L1049-L1074

Tricky detail here: I create the lookup table first, based on the types of the columns that are being extracted.

I need to do this because extraction currently uses unique tuples of values, so the table has to be created in advance.

But if I'm using these new expand functions to figure out what's going to be extracted, I don't know the names of the columns and their types in advance. I'm only going to find those out during the transformation.

This may turn out to be incompatible with how .extract() works at the moment. I may need a new method, .extract_expand() perhaps? It could be simpler - work only against a single column for example.

I can still use the existing sqlite-utils extract CLI command though, with a --json flag and a rule that you can't run it against multiple columns.
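
To make the ordering problem concrete, here is a standalone sketch of a single-column variant written against the standard `sqlite3` module. It is not the sqlite-utils implementation: the `extract_expand` function below only borrows the name floated above, the `reporters` and `reporter_id` names are made up, and it assumes every value is non-null and expands to a flat dictionary with at least one key. It also leaves the original column in place rather than dropping it.

```python
import json
import sqlite3


def extract_expand(conn, table, column, lookup_table, fk_column, expand=json.loads):
    """Hypothetical helper: expand one column into a lookup table and
    point each source row at its lookup row."""
    rows = conn.execute(
        f'SELECT rowid, "{column}" FROM "{table}" ORDER BY rowid'
    ).fetchall()
    # First pass: run the expand function so the column names become known.
    expanded = [(rowid, expand(value)) for rowid, value in rows]
    keys = []
    for _, record in expanded:
        for key in record:
            if key not in keys:
                keys.append(key)
    column_sql = ", ".join(f'"{k}"' for k in keys)
    placeholders = ", ".join("?" for _ in keys)
    # Only now can the lookup table be created.
    conn.execute(
        f'CREATE TABLE "{lookup_table}" (id INTEGER PRIMARY KEY, {column_sql})'
    )
    conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{fk_column}" INTEGER')
    ids = {}  # unique tuple of expanded values -> lookup id
    for rowid, record in expanded:
        values = tuple(record.get(k) for k in keys)
        if values not in ids:
            cursor = conn.execute(
                f'INSERT INTO "{lookup_table}" ({column_sql}) VALUES ({placeholders})',
                values,
            )
            ids[values] = cursor.lastrowid
        conn.execute(
            f'UPDATE "{table}" SET "{fk_column}" = ? WHERE rowid = ?',
            (ids[values], rowid),
        )


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Reports" (title TEXT, "Reported by" TEXT)')
conn.executemany(
    'INSERT INTO "Reports" VALUES (?, ?)',
    [
        ("Leak", json.dumps({"name": "Cleo", "team": "ops"})),
        ("Crash", json.dumps({"name": "Cleo", "team": "ops"})),
        ("Typo", json.dumps({"name": "Pancakes"})),
    ],
)
extract_expand(conn, "Reports", "Reported by", "reporters", "reporter_id")
reporters = conn.execute("SELECT id, name, team FROM reporters ORDER BY id").fetchall()
links = conn.execute('SELECT title, reporter_id FROM "Reports" ORDER BY rowid').fetchall()
```

The two-pass structure is the point: the lookup table cannot be created until the expand function has run over every row, which is the reverse of the create-first approach described above.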

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785983837 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785983837 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk4MzgzNw== simonw 9599 2021-02-25T15:25:21Z 2021-02-25T15:28:57Z OWNER

Problem with calling this argument transform= is that the term "transform" already means something else in this library.

I could use convert= instead.

... but that doesn't instantly make me think of turning a value into multiple columns.

How about expand=? I've not used that term anywhere yet.

db["Reports"].extract(["Reported by"], expand={"Reported by": json.loads})

I think that works. You're expanding a single value into several columns of information.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785983070 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785983070 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk4MzA3MA== simonw 9599 2021-02-25T15:24:17Z 2021-02-25T15:24:17Z OWNER

I'm going to go with last-wins - so if multiple transform functions return the same key, the last one will overwrite the others.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785980813 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785980813 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk4MDgxMw== simonw 9599 2021-02-25T15:21:02Z 2021-02-25T15:23:47Z OWNER

Maybe the Python version takes an optional dictionary mapping column names to transformation functions? It could then merge all of those results together - and maybe throw an error if the same key is produced by more than one column.

db["Reports"].extract(["Reported by"], transform={"Reported by": json.loads})

Or it could have an option for different strategies if keys collide: first wins, last wins, throw exception, add a prefix to the new column name. That feels a bit too complex for an edge-case though.
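
The collision strategies listed above can be compared in a few lines of plain Python. `merge_expanded` and its `on_collision` values are invented for this illustration and are not part of sqlite-utils.

```python
def merge_expanded(results, on_collision="last"):
    """Merge per-column dictionaries into one record.

    results is a list of (source_column, dict) pairs in column order.
    on_collision is one of "first", "last", "prefix" or "error".
    """
    if on_collision not in {"first", "last", "prefix", "error"}:
        raise ValueError(f"Unknown strategy: {on_collision}")
    merged = {}
    for source_column, record in results:
        for key, value in record.items():
            if key not in merged:
                merged[key] = value
            elif on_collision == "last":
                merged[key] = value
            elif on_collision == "prefix":
                # Note: the prefixed name could itself collide; not handled here.
                merged[f"{source_column}_{key}"] = value
            elif on_collision == "error":
                raise ValueError(f"Key {key!r} produced by more than one column")
            # "first": keep the existing value and ignore the new one.
    return merged


results = [("a", {"id": 1, "name": "x"}), ("b", {"id": 2})]
last = merge_expanded(results, "last")
first = merge_expanded(results, "first")
prefixed = merge_expanded(results, "prefix")
```

With "last" the second column's `id` replaces the first, with "first" it is ignored, and with "prefix" it is kept under the name `b_id`.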

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785980083 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785980083 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk4MDA4Mw== simonw 9599 2021-02-25T15:20:02Z 2021-02-25T15:20:02Z OWNER

It would be OK if the CLI version only allows you to specify a single column if you are using the --json option.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785979769 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785979769 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk3OTc2OQ== simonw 9599 2021-02-25T15:19:37Z 2021-02-25T15:19:37Z OWNER

For the Python version I'd like to be able to provide a transformation callback function - which can be json.loads but could also be anything else which accepts the value of the current column and returns a Python dictionary of columns and their values to use in the new table.
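
As an illustration of that callback contract (take the column's value, return a dictionary of new columns), here is a non-JSON example. `split_name_email` and the "Name <email>" input format are assumptions made up for this sketch.

```python
import json


def split_name_email(value):
    """Turn 'Cleo <cleo@example.com>' into {'name': ..., 'email': ...}."""
    name, _, rest = value.partition("<")
    email = rest.strip().rstrip(">").strip()
    return {"name": name.strip(), "email": email or None}


# json.loads satisfies the same contract when the value is a JSON object.
from_json = json.loads('{"name": "Cleo", "email": "cleo@example.com"}')
from_text = split_name_email("Cleo <cleo@example.com>")
no_email = split_name_email("Pancakes")
```

Both callables produce the same dictionary for equivalent input, so either could be passed wherever such a transformation function is accepted.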

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785979192 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785979192 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk3OTE5Mg== simonw 9599 2021-02-25T15:18:46Z 2021-02-25T15:18:46Z OWNER

Likewise the sqlite-utils extract command takes one or more columns:

```
Usage: sqlite-utils extract [OPTIONS] PATH TABLE COLUMNS...

  Extract one or more columns into a separate table

Options:
  --table TEXT             Name of the other table to extract columns to
  --fk-column TEXT         Name of the foreign key column to add to the table
  --rename <TEXT TEXT>...  Rename this column in extracted table
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  
785978689 https://github.com/simonw/sqlite-utils/issues/239#issuecomment-785978689 https://api.github.com/repos/simonw/sqlite-utils/issues/239 MDEyOklzc3VlQ29tbWVudDc4NTk3ODY4OQ== simonw 9599 2021-02-25T15:18:03Z 2021-02-25T15:18:03Z OWNER

The Python .extract() method currently starts like this:

    def extract(self, columns, table=None, fk_column=None, rename=None):
        rename = rename or {}
        if isinstance(columns, str):
            columns = [columns]
        if not set(columns).issubset(self.columns_dict.keys()):
            raise InvalidColumns(
                "Invalid columns {} for table with columns {}".format(
                    columns, list(self.columns_dict.keys())
                )
            )
        ...

Note that it takes a list of columns (and treats a string as a single item list). That's because it can be called with a list of columns and it will use them to populate another table of unique tuples of those column values.

So a new mechanism that can instead read JSON values from a single column needs to be compatible with that existing design.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
sqlite-utils extract could handle nested objects 816526538  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 24.771ms · About: github-to-sqlite