home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

2 rows where author_association = "NONE" and issue = 703246031 sorted by updated_at descending

✖
✖
✖

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 2

  • chapmanjacobd 1
  • hydrosquall 1

issue 1

  • github-to-sqlite should handle rate limits better · 2 ✖

author_association 1

  • NONE · 2 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1279224780 https://github.com/dogsheep/github-to-sqlite/issues/51#issuecomment-1279224780 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/51 IC_kwDODFdgUs5MP2vM chapmanjacobd 7908073 2022-10-14T16:34:07Z 2022-10-14T16:34:07Z NONE

also, it says that authenticated requests have a much higher "rate limit". Unauthenticated requests only get 60 req/hour ?? seems more like a quota than a "rate limit" (although I guess that is semantic equivalence)

You would want to use x-ratelimit-reset

time.sleep(r['x-ratelimit-reset'] + 1 - time.time())

But a more complete solution would bring authenticated requests to the other subcommands. I'm surprised only github-to-sqlite get is using the --auth= CLI flag

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
github-to-sqlite should handle rate limits better 703246031  
1208757153 https://github.com/dogsheep/github-to-sqlite/issues/51#issuecomment-1208757153 https://api.github.com/repos/dogsheep/github-to-sqlite/issues/51 IC_kwDODFdgUs5IDCuh hydrosquall 9020979 2022-08-09T00:29:44Z 2022-08-09T00:29:44Z NONE

I've been looking into how to to get this data out of Github (especially now there are "secondary rate limits" without an advertised allowance separate from the regular rate limits.

I've had decent success with the Airbyte github extractor (aside from one data quality issue https://github.com/airbytehq/airbyte/pull/15420 ). Airbyte splits data extraction between the GraphQL and REST endpoints depending on the resource type, but they're very comprehensive.

https://github.com/airbytehq/airbyte/blob/306a75ef5370728e0912cf52a1a898a530db0c90/airbyte-integrations/connectors/source-github/source_github/streams.py#L22-L122

Before this, I tried a few solutions in my own custom wrapper mentioned in this thread + its children https://github.com/PyGithub/PyGithub/issues/1989 , but they weren't working as expected.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
github-to-sqlite should handle rate limits better 703246031  

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
Powered by Datasette · Queries took 21.229ms · About: github-to-sqlite
  • Sort ascending
  • Sort descending
  • Facet by this
  • Hide this column
  • Show all columns
  • Show not-blank rows