html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-888075098,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,888075098,IC_kwDODFE5qs407vNa,28565,2021-07-28T07:18:56Z,2021-07-28T07:18:56Z,NONE,"> I'm not sure why but my most recent import, when displayed in Datasette, looks like this: > > I did some investigation into this issue and made a fix [here](https://github.com/dogsheep/google-takeout-to-sqlite/pull/8/commits/8ee555c2889a38ff42b95664ee074b4a01a82f06). The problem was that some messages (like gchat logs) don't have a `Message-Id` and we need to use `X-GM-THRID` as the pkey instead. @simonw While looking into this I found something unexpected about how sqlite_utils handles upserts if the pkey column is `None`. When the pkey is NULL I'd expect the function to either use rowid or throw an exception. Instead, it seems upsert_all creates a row where all columns are NULL instead of using the values provided as parameters.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401, https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-885094284,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,885094284,IC_kwDODFE5qs40wXeM,28565,2021-07-22T17:41:32Z,2021-07-22T17:41:32Z,NONE,I added a follow-up commit that deals with emails that don't have a `Date` header: https://github.com/maxhawkins/google-takeout-to-sqlite/commit/4bc70103582c10802c85a523ef1e99a8a2154aa9,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401, https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-885022230,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,885022230,IC_kwDODFE5qs40wF4W,28565,2021-07-22T15:51:46Z,2021-07-22T15:51:46Z,NONE,One thing I noticed is this importer doesn't save attachments along with the body of the emails. It would be nice if those got stored as blobs in a separate attachments table so attachments can be included while fetching search results.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401, https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-884672647,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,884672647,IC_kwDODFE5qs40uwiH,28565,2021-07-22T05:56:31Z,2021-07-22T14:03:08Z,NONE,"How does this commit look? https://github.com/maxhawkins/google-takeout-to-sqlite/commit/72802a83fee282eb5d02d388567731ba4301050d It seems that Takeout's mbox format is pretty simple, so we can get away with just splitting the file on lines begining with `From `. My commit just splits the file every time a line starts with `From ` and uses `email.message_from_bytes` to parse each chunk. I was able to load a 12GB takeout mbox without the program using more than a couple hundred MB of memory during the import process. It does make us lose the progress bar, but maybe I can add that back in a later commit.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401, https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-849708617,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,849708617,MDEyOklzc3VlQ29tbWVudDg0OTcwODYxNw==,28565,2021-05-27T15:01:42Z,2021-05-27T15:01:42Z,NONE,Any updates?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401, https://github.com/dogsheep/google-takeout-to-sqlite/pull/5#issuecomment-791089881,https://api.github.com/repos/dogsheep/google-takeout-to-sqlite/issues/5,791089881,MDEyOklzc3VlQ29tbWVudDc5MTA4OTg4MQ==,28565,2021-03-05T02:03:19Z,2021-03-05T02:03:19Z,NONE,"I just tried to run this on a small VPS instance with 2GB of memory and it crashed out of memory while processing a 12GB mbox from Takeout. Is it possible to stream the emails to sqlite instead of loading it all into memory and upserting at once?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",813880401,