
issue_comments


3 rows where issue = 1039037439 ("Add functionality to read Parquet files.") sorted by updated_at descending

Comment 979442854 · simonw (OWNER) · 2021-11-25T19:47:26Z
https://github.com/simonw/sqlite-utils/pull/333#issuecomment-979442854

I just remembered that there's one other place that this could fit: as a Datasette "insert" plugin.

This is vaporware at the moment, but the idea is that Datasette itself could grow a mechanism for importing data that's driven by plugins.

Out of the box Datasette would be able to import CSV and TSV files, similar to sqlite-utils insert ... --csv - but plugins would then be able to add support for additional formats such as GeoJSON or - in this case - Parquet.

The neat thing about having it as a Datasette plugin is that one plugin would enable three different ways of importing data:

  1. Via a new datasette insert ... CLI option (similar to sqlite-utils)
  2. Via a web form upload interface, where authenticated Datasette users would be able to upload files
  3. Via an API interface, where files could be programmatically submitted to a running Datasette server

I started fleshing out this idea quite a while ago but didn't make much concrete progress; maybe I should revisit it:

  • https://github.com/simonw/datasette/issues/1160
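The plugin-driven import idea above can be sketched as a format-dispatch table. This is a hypothetical illustration, not Datasette's actual plugin API: all names here (`register_format`, `parse_rows`, `FORMAT_PARSERS`) are invented, and the point is only that the CLI, web-upload, and API routes could all funnel through one registry that plugins extend.

```python
import csv
import io
import json

# Invented sketch of the plugin-driven import idea: core formats
# register parsers in a table, and a plugin could add Parquet or
# GeoJSON by registering another entry.
FORMAT_PARSERS = {}

def register_format(name):
    """Decorator a format plugin would use to register itself."""
    def wrapper(fn):
        FORMAT_PARSERS[name] = fn
        return fn
    return wrapper

@register_format("csv")
def parse_csv(text):
    return list(csv.DictReader(io.StringIO(text)))

@register_format("json")
def parse_json(text):
    return json.loads(text)

def parse_rows(fmt, text):
    # The CLI option, the web upload form, and the API route described
    # above could all funnel through one entry point like this.
    if fmt not in FORMAT_PARSERS:
        raise ValueError(f"unsupported format: {fmt}")
    return FORMAT_PARSERS[fmt](text)
```

With this shape, supporting a new format means registering one more parser; none of the three entry points need to change.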
Reactions: none · Issue: Add functionality to read Parquet files. (1039037439)
Comment 979345527 · Florents-Tselai (NONE) · 2021-11-25T16:31:47Z
https://github.com/simonw/sqlite-utils/pull/333#issuecomment-979345527

Thanks for your reply @simonw. Tbh, my first attempt was actually the parquet-to-sqlite package, but I already had Makefiles that relied on sqlite-utils and it was less intrusive to my workflow. Maybe I'll revisit that decision. FYI: there's a [sqlite-parquet-vtable](https://github.com/cldellow/sqlite-parquet-vtable).

I don't think plugins make much sense either; that probably defeats the purpose of simplicity: a simple database along with a pip-installable package.

Reactions: none · Issue: 1039037439
Comment 974754412 · simonw (OWNER) · 2021-11-21T04:35:32Z
https://github.com/simonw/sqlite-utils/pull/333#issuecomment-974754412

Some other recent projects (like trying to get this library to work in JupyterLite) have made me much more cautious about adding new dependencies, especially dependencies like pyarrow which require custom C/Rust extensions.

There are a few ways this could work though:

  • Have this as an optional dependency feature - so it only works if the user installs pyarrow as well
  • Implement this as a separate tool, parquet-to-sqlite - which could itself depend on sqlite-utils
  • Add a concept of "plugins" to sqlite-utils, similar to how those work in Datasette: https://docs.datasette.io/en/stable/plugins.html

My favourite option is parquet-to-sqlite because that can be built without any additional changes to sqlite-utils at all!

I find the concept of plugins for sqlite-utils interesting. I've so far not had quite enough potential use-cases to convince me it's worthwhile (especially since it should be very easy to build out separate tools entirely), but I'm ready to be convinced that a plugin mechanism would be worth adding.
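The first option above, an optional dependency, is a common Python pattern: keep pyarrow out of the required install and only import it at call time, with a clear error if it's missing. A minimal sketch follows; the helper name `require_optional` and the error wording are invented here, not sqlite-utils API, and `rows_from_parquet` assumes pyarrow's `read_table(...).to_pylist()` interface.

```python
import importlib
import importlib.util

def require_optional(module_name, feature):
    """Raise a helpful error when an optional dependency is missing.

    Invented sketch of the "optional dependency" option above, not
    actual sqlite-utils API.
    """
    top_level = module_name.split(".")[0]
    if importlib.util.find_spec(top_level) is None:
        raise ImportError(
            f"{feature} support requires {top_level!r}; "
            f"install it separately, e.g. pip install {top_level}"
        )
    return importlib.import_module(module_name)

def rows_from_parquet(path):
    """Import pyarrow only at call time, so everything else keeps
    working without it (hypothetical usage)."""
    pq = require_optional("pyarrow.parquet", "Parquet")
    return pq.read_table(path).to_pylist()
```

The core tool stays pure-Python; only users who actually want Parquet pay the cost of the C/Rust-extension dependency.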

Reactions: none · Issue: 1039037439

Table schema:
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
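The schema above can be recreated locally with Python's standard sqlite3 module; a minimal sketch (schema only — the foreign-key targets [users] and [issues] are not created, which SQLite permits at DDL time):

```python
import sqlite3

# The issue_comments DDL from above, verbatim apart from layout.
SCHEMA = """
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)

# Confirm the table and both indexes exist.
names = {row[0] for row in db.execute(
    "SELECT name FROM sqlite_master WHERE type IN ('table', 'index')")}
```

An in-memory database keeps the sketch self-contained; pointing `connect()` at a file path would persist it instead.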
Powered by Datasette · Queries took 27.871ms · About: github-to-sqlite