issues

4 rows where repo = 248903544 and type = "issue" sorted by updated_at descending

state 2

  • open 3
  • closed 1

type 1

  • issue · 4

repo 1

  • hacker-news-to-sqlite · 4
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association pull_request body repo type active_lock_reason performed_via_github_app reactions draft state_reason
1205867842 I_kwDODtX3eM5H4BVC 4 Retrieve the top-level story for a comment telotortium 1755789 open 0     0 2022-04-15T20:25:39Z 2022-04-15T20:25:39Z   NONE  

I think that each comment inserted into the database should include a column onstory that contains the ID of the story on which the comment was made. This is exactly equivalent to the link after "on:" at the top of an HN comment page (example). We could do this either by retrieving the HTML page directly and using Beautiful Soup to find that link, or by recursing up the tree in the Firebase API using the parent field (probably with functools.lru_cache, in case a person has commented many times on the same story).

hacker-news-to-sqlite 248903544 issue    
{
    "url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/4/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
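
The second option proposed in issue 4, walking up the parent chain, might look roughly like the sketch below. It assumes the official Hacker News Firebase endpoint (https://hacker-news.firebaseio.com/v0/item/ID.json) and its parent field. The names make_story_finder and find_story_id are hypothetical and not part of hacker-news-to-sqlite.

```python
import json
import urllib.request
from functools import lru_cache

# Assumption: the public HN Firebase API item endpoint.
ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{}.json"


def fetch_item(item_id):
    """Fetch a single item as a dict (makes a network request)."""
    with urllib.request.urlopen(ITEM_URL.format(item_id)) as response:
        return json.load(response)


def make_story_finder(fetch=fetch_item):
    """Return a cached function mapping any item ID to its top-level ID.

    The fetch argument is injectable so the lookup can be tested offline.
    """

    @lru_cache(maxsize=None)
    def find_story_id(item_id):
        item = fetch(item_id)
        if item is None:
            # The API returns null for items that do not exist.
            return None
        parent = item.get("parent")
        if parent is None:
            # No parent means this is the top-level item.
            return item["id"]
        # Recursing through the cached wrapper also caches every ancestor.
        return find_story_id(parent)

    return find_story_id
```

Because every ancestor is cached along the way, a second comment on the same story only costs requests for the part of the chain not seen before.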
   
952189173 MDU6SXNzdWU5NTIxODkxNzM= 3 Use HN algolia endpoint to retrieve trees simonw 9599 open 0     3 2021-07-25T03:35:27Z 2021-07-25T18:41:17Z   MEMBER  

The trees command currently has to make a request for every single comment. Algolia have an endpoint that bundles the entire thread together into a single request.

https://hn.algolia.com/api/v1/items/ID

Here's an example that loads quickly, with about 50 comments: https://hn.algolia.com/api/v1/items/27941108

It doesn't appear to use pagination at all - if a thread is big then the response is big.

I ran this search to find some stories with more than 1000 comments: https://hn.algolia.com/api/v1/search?tags=story&numericFilters=num_comments%3E=1000

Here's one: https://news.ycombinator.com/item?id=25015967 with 4759 comments. Hitting the API takes 41s and returns 3.7 MB of JSON!

    wget 'https://hn.algolia.com/api/v1/items/25015967'  0.03s user 0.04s system 0% cpu 41.368 total
    /tmp % ls -lah 25015967
    -rw-r--r-- 1 simon wheel 3.7M Jul 24 20:31 25015967

hacker-news-to-sqlite 248903544 issue    
{
    "url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/3/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
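
As a rough illustration of issue 3, the sketch below fetches a whole thread from the Algolia items endpoint in one request and flattens it into individual rows. It assumes the response nests replies under a children key. fetch_thread and flatten_thread are hypothetical names, not the project's actual code.

```python
import json
import urllib.request

# Endpoint taken from the issue text above.
ALGOLIA_ITEM_URL = "https://hn.algolia.com/api/v1/items/{}"


def fetch_thread(item_id):
    """Fetch the full nested thread in a single request (network call)."""
    with urllib.request.urlopen(ALGOLIA_ITEM_URL.format(item_id)) as response:
        return json.load(response)


def flatten_thread(node):
    """Yield every item in the thread, each parent before its replies.

    Assumption: replies are nested under a "children" list on each item.
    An explicit stack avoids recursion limits on very deep threads.
    """
    stack = [node]
    while stack:
        current = stack.pop()
        children = current.get("children") or []
        # Emit the item itself without the nested replies.
        yield {key: value for key, value in current.items() if key != "children"}
        # Reverse so the first child is processed first.
        stack.extend(reversed(children))
```

A caller could then pass the result of flatten_thread(fetch_thread(27941108)) to whatever code inserts rows into the database.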
   
952179830 MDU6SXNzdWU5NTIxNzk4MzA= 2 Command for fetching Hacker News threads from the search API simonw 9599 open 0     4 2021-07-25T02:00:45Z 2021-07-25T03:12:57Z   MEMBER  

I want to be able to fetch every item for a domain, e.g. https://news.ycombinator.com/from?site=simonwillison.net

hacker-news-to-sqlite 248903544 issue    
{
    "url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/2/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
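
One possible shape for the domain fetch in issue 2 is sketched below, using the Algolia search endpoint that appears elsewhere on this page. The restrictSearchableAttributes and hitsPerPage parameters and the hits and nbPages response keys are assumptions about that API. The function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode

SEARCH_URL = "https://hn.algolia.com/api/v1/search"


def domain_search_url(domain, page=0, hits_per_page=100):
    """Build a search URL for stories whose URL matches the domain."""
    params = {
        "query": domain,
        # Assumption: limits matching to the url attribute.
        "restrictSearchableAttributes": "url",
        "tags": "story",
        "page": page,
        "hitsPerPage": hits_per_page,
    }
    return SEARCH_URL + "?" + urlencode(params)


def fetch_json(url):
    """Fetch and decode a JSON document (network call)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def iter_domain_hits(domain, fetch=fetch_json):
    """Yield every hit for the domain, following result pages in order."""
    page = 0
    while True:
        result = fetch(domain_search_url(domain, page=page))
        yield from result["hits"]
        page += 1
        # Assumption: nbPages reports the total number of result pages.
        if page >= result["nbPages"]:
            break
```

Passing fetch as a parameter keeps the paging logic testable without touching the network.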
   
585526292 MDU6SXNzdWU1ODU1MjYyOTI= 1 Set up full text search simonw 9599 closed 0     1 2020-03-21T15:57:35Z 2020-03-21T19:47:46Z 2020-03-21T19:45:52Z MEMBER  

Should run against title and text in items, and about and id in users.

hacker-news-to-sqlite 248903544 issue    
{
    "url": "https://api.github.com/repos/dogsheep/hacker-news-to-sqlite/issues/1/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed
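
For issue 1, a minimal sketch of full text search over title and text in items follows. It uses Python's built-in sqlite3 module with an FTS5 external-content table, which is not necessarily how the project implemented it. The items columns shown (id, title, text) are an assumed subset of the real schema, and FTS5 must be compiled into the local SQLite.

```python
import sqlite3


def build_items_fts(conn):
    """Create an FTS5 index over items.title and items.text, then fill it."""
    conn.executescript(
        """
        CREATE VIRTUAL TABLE IF NOT EXISTS items_fts
        USING fts5(title, text, content='items', content_rowid='id');
        -- Populate the index from the rows already in items.
        INSERT INTO items_fts(items_fts) VALUES ('rebuild');
        """
    )


def search_items(conn, query):
    """Return matching item IDs, best match first."""
    rows = conn.execute(
        "SELECT rowid FROM items_fts WHERE items_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]


# Small in-memory demonstration with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO items (id, title, text) VALUES (?, ?, ?)",
    [
        (1, "Datasette launch", "A tool for exploring data"),
        (2, "Ask HN", "Which embedded database do you prefer?"),
    ],
)
build_items_fts(conn)
```

The same pattern would apply to about and id in users. Since id there is presumably a text username, content_rowid would need to point at an integer rowid instead.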

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT,
   [active_lock_reason] TEXT,
   [performed_via_github_app] TEXT,
   [reactions] TEXT,
   [draft] INTEGER,
   [state_reason] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 32.935ms · About: github-to-sqlite