html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app
https://github.com/simonw/sqlite-utils/issues/555#issuecomment-1592047502,https://api.github.com/repos/simonw/sqlite-utils/issues/555,1592047502,IC_kwDOCGYnMM5e5LeO,7908073,2023-06-14T22:00:10Z,2023-06-14T22:01:57Z,CONTRIBUTOR,"You may want to try doing a performance comparison between this and just selecting all the ids with few constraints and then doing the filtering within python.
That might seem like a lazy-programmer, inefficient way but queries with large resultsets are a different profile than what databases like SQLITE are designed for. That is not to say that SQLITE is slow or that python is always faster but when you start reading >20% of an index there is an equilibrium that is reached. Especially when adding in writing extra temp tables and stuff to memory/disk. And especially given the `NOT IN` style of query...
You may also try chunking like this:
```py
def chunks(lst, n) -> Generator:
for i in range(0, len(lst), n):
yield lst[i : i + n]
SQLITE_PARAM_LIMIT = 32765
data = []
chunked = chunks(video_ids, consts.SQLITE_PARAM_LIMIT)
for ids in chunked:
data.expand(
list(
db.query(
f""""""SELECT * from videos
WHERE id in (""""""
+ "","".join([""?""] * len(ids))
+ "")"",
(*ids,),
)
)
)
```
but that actually won't work with your `NOT IN` requirements. You need to query the full resultset to check any row.
Since you are doing stuff with files/videos in SQLITE you might be interested in my side project: https://github.com/chapmanjacobd/library","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1733198948,