issue_comments

7 rows where issue = 954546309 sorted by updated_at descending

user 3

  • maxhawkins 4
  • iloveitaly 2
  • Btibert3 1

issue 1

  • Add Gmail takeout mbox import (v2) · 7

author_association 1

  • NONE 7
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions issue performed_via_github_app
1710950671 https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1710950671 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 IC_kwDODFE5qs5l-wkP iloveitaly 150855 2023-09-08T01:22:49Z 2023-09-08T01:22:49Z NONE

Makes sense, thanks for explaining!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add Gmail takeout mbox import (v2) 954546309  
1710380941 https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1710380941 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 IC_kwDODFE5qs5l8leN maxhawkins 28565 2023-09-07T15:39:59Z 2023-09-07T15:39:59Z NONE

> @maxhawkins curious why you didn't use the stdlib mailbox to parse the mbox files?

Mailbox parses the entire mbox into memory. Using the lower level library lets us stream the emails in one at a time to support larger archives. Both libraries are in the stdlib.
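A minimal sketch of the streaming idea described above, assuming the lower-level library is the stdlib `email` package (the PR's actual code is not shown here, and `iter_mbox_messages` is a hypothetical name). It simplifies mbox handling: any line starting with `From ` is treated as a separator, and `>From ` escaping in bodies is not undone.

```python
import email
import io
from email import policy


def iter_mbox_messages(fp):
    """Yield one parsed message at a time from a binary mbox stream.

    Only the lines of the current message are held in memory.
    """
    buffer = []
    for line in fp:
        if line.startswith(b"From "):
            # Separator line: flush the previous message, if any.
            if buffer:
                yield email.message_from_bytes(b"".join(buffer), policy=policy.default)
                buffer = []
            continue  # the separator itself is not part of the message
        buffer.append(line)
    if buffer:
        yield email.message_from_bytes(b"".join(buffer), policy=policy.default)


# Hypothetical two-message archive, used only for illustration.
SAMPLE = (
    b"From a@example.com Sat Aug  7 00:00:00 2021\n"
    b"Subject: one\n\nbody one\n\n"
    b"From b@example.com Sat Aug  7 00:00:01 2021\n"
    b"Subject: two\n\nbody two\n"
)

for message in iter_mbox_messages(io.BytesIO(SAMPLE)):
    print(message["Subject"])
```

With a real archive, the file would be opened in binary mode and passed in place of the `io.BytesIO` wrapper.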

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add Gmail takeout mbox import (v2) 954546309  
1708945716 https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1708945716 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 IC_kwDODFE5qs5l3HE0 iloveitaly 150855 2023-09-06T19:12:33Z 2023-09-06T19:12:33Z NONE

@maxhawkins curious why you didn't use the stdlib mailbox to parse the mbox files?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add Gmail takeout mbox import (v2) 954546309  
1003437288 https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1003437288 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 IC_kwDODFE5qs47zzzo maxhawkins 28565 2021-12-31T19:06:20Z 2021-12-31T19:06:20Z NONE

> @maxhawkins how hard would it be to add an entry to the table that includes the HTML version of the email, if it exists? I just tried your PR branch on a very small mbox file, and it worked great. My use case is a research project and I need to access more than just the plain-text body.

Shouldn't be hard. The easiest way is probably to remove the if body.content_type == "text/html" clause from utils.py:254 and just return content directly without parsing.
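As a rough illustration of returning the HTML content directly, here is a sketch that uses only the stdlib `email` API. It does not reproduce the code at utils.py:254; `raw_html_body` is a hypothetical helper name.

```python
from email.message import EmailMessage


def raw_html_body(msg):
    """Return the unparsed text/html body of a message, or None if there is none."""
    part = msg.get_body(preferencelist=("html",))
    return part.get_content() if part is not None else None


# Hypothetical message with both a plain-text and an HTML alternative.
example = EmailMessage()
example.set_content("plain text")
example.add_alternative("<p>rich</p>", subtype="html")
print(raw_html_body(example))
```

The returned string could then be stored in its own column next to the plain-text body.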

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add Gmail takeout mbox import (v2) 954546309  
1002735370 https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1002735370 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 IC_kwDODFE5qs47xIcK Btibert3 203343 2021-12-29T18:58:23Z 2021-12-29T18:58:23Z NONE

@maxhawkins how hard would it be to add an entry to the table that includes the HTML version of the email, if it exists? I just tried your PR branch on a very small mbox file, and it worked great. My use case is a research project and I need to access more than just the plain-text body.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add Gmail takeout mbox import (v2) 954546309  
896378525 https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-896378525 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 IC_kwDODFE5qs41baad maxhawkins 28565 2021-08-10T23:28:45Z 2021-08-10T23:28:45Z NONE

I added parsing of text/html emails using BeautifulSoup.

Around half of the emails in my archive don't include a text/plain payload, so adding HTML parsing makes a good chunk of them searchable.
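A sketch of the fallback described above: prefer the text/plain part and otherwise reduce the text/html part to searchable text. To stay self-contained it swaps in the stdlib `html.parser` for BeautifulSoup, so it is not the PR's implementation; `searchable_body` and `html_to_text` are hypothetical names.

```python
from email import message_from_string, policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Join the text nodes and collapse runs of whitespace.
    return " ".join(" ".join(parser.chunks).split())


def searchable_body(msg):
    """Return plain text for indexing: text/plain if present, else stripped HTML."""
    plain = msg.get_body(preferencelist=("plain",))
    if plain is not None:
        return plain.get_content()
    html = msg.get_body(preferencelist=("html",))
    if html is not None:
        return html_to_text(html.get_content())
    return ""


# Hypothetical HTML-only message, used only for illustration.
RAW = (
    "Subject: hi\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/html; charset="utf-8"\n'
    "\n"
    "<html><body><p>Hello <b>world</b></p></body></html>\n"
)
print(searchable_body(message_from_string(RAW, policy=policy.default)))
```

This extractor is cruder than BeautifulSoup (for example, it keeps the contents of `<script>` and `<style>` elements), which is acceptable for a sketch but worth handling in real use.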

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add Gmail takeout mbox import (v2) 954546309  
894581223 https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-894581223 https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 IC_kwDODFE5qs41Ujnn maxhawkins 28565 2021-08-07T00:57:48Z 2021-08-07T00:57:48Z NONE

Just added two more fixes:

  • Added parsing for RFC 2047-encoded Unicode headers
  • Body is now stored as TEXT rather than a BLOB regardless of what order the messages are parsed in.

I was able to run this on my Takeout export and everything seems to work fine. @simonw let me know if this looks good to merge.
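For readers unfamiliar with RFC 2047, here is a small sketch of decoding such headers with the stdlib `email.header` module. It illustrates the technique only; the PR may decode headers differently, and `decode_rfc2047` is a hypothetical name.

```python
from email.header import decode_header, make_header


def decode_rfc2047(value):
    """Decode RFC 2047 encoded-words (=?charset?q?...?= or =?charset?b?...?=) to str."""
    return str(make_header(decode_header(value)))


print(decode_rfc2047("=?utf-8?q?Caf=C3=A9?="))
```

Headers without encoded-words pass through unchanged, so the helper can be applied to every header value.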

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Add Gmail takeout mbox import (v2) 954546309  

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
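
The row filter shown at the top of this page (rows where issue = 954546309, sorted by updated_at descending) can be reproduced against a table with this schema using Python's `sqlite3` module. The sketch below builds a simplified in-memory copy of the table (foreign keys and several columns omitted, which is an assumption made for brevity) and inserts two of the rows listed above; `comments_for_issue` is a hypothetical helper.

```python
import sqlite3

# Simplified stand-in for the schema above: fewer columns, no foreign keys.
SCHEMA = """
CREATE TABLE issue_comments (
    id INTEGER PRIMARY KEY,
    user INTEGER,
    updated_at TEXT,
    body TEXT,
    issue INTEGER
);
CREATE INDEX idx_issue_comments_issue ON issue_comments (issue);
"""


def comments_for_issue(conn, issue_id):
    """Return (id, user, updated_at, body) rows for one issue, newest first."""
    return conn.execute(
        "SELECT id, user, updated_at, body FROM issue_comments "
        "WHERE issue = ? ORDER BY updated_at DESC",
        (issue_id,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO issue_comments (id, user, updated_at, body, issue) VALUES (?, ?, ?, ?, ?)",
    [
        (894581223, 28565, "2021-08-07T00:57:48Z", "Just added two more fixes", 954546309),
        (1710950671, 150855, "2023-09-08T01:22:49Z", "Makes sense, thanks for explaining!", 954546309),
    ],
)
for row in comments_for_issue(conn, 954546309):
    print(row)
```

Because the timestamps are ISO 8601 strings in UTC, sorting them as text gives chronological order.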
Powered by Datasette · Queries took 26.053ms · About: github-to-sqlite