issue_comments
10 rows where author_association = "NONE" and user = 28565 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | issue | performed_via_github_app |
---|---|---|---|---|---|---|---|---|---|---|---|
1710380941 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1710380941 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 | IC_kwDODFE5qs5l8leN | maxhawkins 28565 | 2023-09-07T15:39:59Z | 2023-09-07T15:39:59Z | NONE |
Mailbox parses the entire mbox into memory. Using the lower level library lets us stream the emails in one at a time to support larger archives. Both libraries are in the stdlib. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add Gmail takeout mbox import (v2) 954546309 | |
1003437288 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-1003437288 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 | IC_kwDODFE5qs47zzzo | maxhawkins 28565 | 2021-12-31T19:06:20Z | 2021-12-31T19:06:20Z | NONE |
Shouldn't be hard. The easiest way is probably to remove the |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add Gmail takeout mbox import (v2) 954546309 | |
896378525 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-896378525 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 | IC_kwDODFE5qs41baad | maxhawkins 28565 | 2021-08-10T23:28:45Z | 2021-08-10T23:28:45Z | NONE | I added parsing of text/html emails using BeautifulSoup. Around half of the emails in my archive don't include a text/plain payload so adding html parsing makes a good chunk of them searchable. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add Gmail takeout mbox import (v2) 954546309 | |
894581223 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/8#issuecomment-894581223 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/8 | IC_kwDODFE5qs41Ujnn | maxhawkins 28565 | 2021-08-07T00:57:48Z | 2021-08-07T00:57:48Z | NONE | Just added two more fixes:
I was able to run this on my Takeout export and everything seems to work fine. @simonw let me know if this looks good to merge. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Add Gmail takeout mbox import (v2) 954546309 | |
888075098 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-888075098 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | IC_kwDODFE5qs407vNa | maxhawkins 28565 | 2021-07-28T07:18:56Z | 2021-07-28T07:18:56Z | NONE |
I did some investigation into this issue and made a fix here. The problem was that some messages (like gchat logs) don't have a @simonw While looking into this I found something unexpected about how sqlite_utils handles upserts if the pkey column is |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Add Gmail takeout mbox import 813880401 | |
885094284 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-885094284 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | IC_kwDODFE5qs40wXeM | maxhawkins 28565 | 2021-07-22T17:41:32Z | 2021-07-22T17:41:32Z | NONE | I added a follow-up commit that deals with emails that don't have a |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Add Gmail takeout mbox import 813880401 | |
885022230 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-885022230 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | IC_kwDODFE5qs40wF4W | maxhawkins 28565 | 2021-07-22T15:51:46Z | 2021-07-22T15:51:46Z | NONE | One thing I noticed is this importer doesn't save attachments along with the body of the emails. It would be nice if those got stored as blobs in a separate attachments table so attachments can be included while fetching search results. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Add Gmail takeout mbox import 813880401 | |
884672647 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-884672647 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | IC_kwDODFE5qs40uwiH | maxhawkins 28565 | 2021-07-22T05:56:31Z | 2021-07-22T14:03:08Z | NONE | How does this commit look? https://github.com/maxhawkins/google-takeout-to-sqlite/commit/72802a83fee282eb5d02d388567731ba4301050d It seems that Takeout's mbox format is pretty simple, so we can get away with just splitting the file on lines beginning with I was able to load a 12GB takeout mbox without the program using more than a couple hundred MB of memory during the import process. It does make us lose the progress bar, but maybe I can add that back in a later commit. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Add Gmail takeout mbox import 813880401 | |
849708617 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-849708617 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDg0OTcwODYxNw== | maxhawkins 28565 | 2021-05-27T15:01:42Z | 2021-05-27T15:01:42Z | NONE | Any updates? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Add Gmail takeout mbox import 813880401 | |
791089881 | https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-791089881 | https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5 | MDEyOklzc3VlQ29tbWVudDc5MTA4OTg4MQ== | maxhawkins 28565 | 2021-03-05T02:03:19Z | 2021-03-05T02:03:19Z | NONE | I just tried to run this on a small VPS instance with 2GB of memory and it ran out of memory and crashed while processing a 12GB mbox from Takeout. Is it possible to stream the emails to sqlite instead of loading them all into memory and upserting at once? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
WIP: Add Gmail takeout mbox import 813880401 |
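Several comments in the table above discuss streaming a large mbox one message at a time instead of loading it all into memory. The following is a minimal Python sketch of that idea, not the code from the pull request. It assumes the conventional mbox layout in which a line starting with `From ` opens each message, parses each chunk with the stdlib `email` package, and writes rows in small batches with the stdlib `sqlite3` module. The function names, the `emails` table and its two columns are hypothetical, chosen only for illustration.

```python
import email
import io
import sqlite3


def iter_mbox_messages(fileobj):
    """Yield parsed messages one at a time from a binary mbox-style stream."""
    buffer = []
    for line in fileobj:
        # Assumption: a line starting with b"From " separates messages.
        # Escaped body lines (">From ") are left untouched by this sketch.
        if line.startswith(b"From "):
            if buffer:
                yield email.message_from_bytes(b"".join(buffer))
                buffer = []
            continue  # the separator line is not part of the message
        buffer.append(line)
    if buffer:
        yield email.message_from_bytes(b"".join(buffer))


def _text(value):
    # Header values may be Header objects; store them as plain strings.
    return None if value is None else str(value)


def import_mbox(fileobj, conn, batch_size=100):
    """Insert messages in batches so only one batch is held in memory."""
    # Hypothetical table layout, for illustration only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS emails (message_id TEXT, subject TEXT)"
    )
    batch, total = [], 0
    for msg in iter_mbox_messages(fileobj):
        batch.append((_text(msg.get("Message-Id")), _text(msg.get("Subject"))))
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO emails VALUES (?, ?)", batch)
            conn.commit()
            total += len(batch)
            batch = []
    if batch:
        conn.executemany("INSERT INTO emails VALUES (?, ?)", batch)
        conn.commit()
        total += len(batch)
    return total


# Tiny made-up mbox used only to exercise the sketch.
SAMPLE_MBOX = (
    b"From alice@example.com Thu Jul 22 05:56:31 2021\n"
    b"Message-Id: <1@example.com>\n"
    b"Subject: first\n"
    b"\n"
    b"hello\n"
    b"\n"
    b"From bob@example.com Thu Jul 22 06:00:00 2021\n"
    b"Message-Id: <2@example.com>\n"
    b"Subject: second\n"
    b"\n"
    b"world\n"
)

if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(import_mbox(io.BytesIO(SAMPLE_MBOX), connection, batch_size=1))
```

Because the generator only keeps the lines of the current message in `buffer`, memory use depends on the largest single message plus one batch of rows, not on the size of the archive.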
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id]),
   [performed_via_github_app] TEXT
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
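The filter described at the top of this page (author_association = "NONE", user = 28565, sorted by updated_at descending) can be reproduced against this schema with the stdlib `sqlite3` module. The sketch below is illustrative only: it builds a trimmed in-memory copy of the table (fewer columns, no foreign keys), loads three rows whose ids and timestamps come from the table above, and adds one made-up row for a different user so the filter has something to exclude. The helper names are hypothetical.

```python
import sqlite3

QUERY = """
SELECT id, updated_at
FROM issue_comments
WHERE author_association = :assoc AND [user] = :user
ORDER BY updated_at DESC
"""


def comments_by_user(conn, user_id, assoc="NONE"):
    """Return (id, updated_at) tuples for one user, newest update first."""
    return conn.execute(QUERY, {"assoc": assoc, "user": user_id}).fetchall()


def build_demo_db():
    """Create a trimmed in-memory copy of issue_comments for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE [issue_comments] ("
        "[id] INTEGER PRIMARY KEY, [user] INTEGER, "
        "[created_at] TEXT, [updated_at] TEXT, [author_association] TEXT)"
    )
    rows = [
        # id, user, created_at, updated_at, author_association
        (791089881, 28565, "2021-03-05T02:03:19Z", "2021-03-05T02:03:19Z", "NONE"),
        (884672647, 28565, "2021-07-22T05:56:31Z", "2021-07-22T14:03:08Z", "NONE"),
        (1710380941, 28565, "2023-09-07T15:39:59Z", "2023-09-07T15:39:59Z", "NONE"),
        # Made-up row for a different user; it should be filtered out.
        (1, 1, "2021-01-01T00:00:00Z", "2021-01-01T00:00:00Z", "OWNER"),
    ]
    conn.executemany("INSERT INTO issue_comments VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    for comment_id, updated_at in comments_by_user(build_demo_db(), 28565):
        print(comment_id, updated_at)
```

Sorting the TEXT column works here because the timestamps are ISO 8601 strings in the same timezone, so lexicographic order matches chronological order.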