html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app https://github.com/simonw/datasette/issues/1384#issuecomment-1066222323,https://api.github.com/repos/simonw/datasette/issues/1384,1066222323,IC_kwDOBm6k_c4_jULz,2670795,2022-03-14T00:36:42Z,2022-03-14T00:36:42Z,CONTRIBUTOR,"> Ah, sorry, I didn't get what you were saying the first time. Using _metadata_local in that way makes total sense -- I agree, refreshing metadata each cell was seeming quite excessive. Now I'm on the same page! :) All good. Report back any issues you find with this stuff. Metadata/dynamic config hasn't been tested widely outside of what I've done AFAIK. If you find a strong use case for async meta, it's going to be better to know sooner rather than later!","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-1066194130,https://api.github.com/repos/simonw/datasette/issues/1384,1066194130,IC_kwDOBm6k_c4_jNTS,167160,2022-03-13T22:23:04Z,2022-03-13T22:23:04Z,NONE,"Ah, sorry, I didn't get what you were saying the first time. Using _metadata_local in that way makes total sense -- I agree, refreshing metadata each cell was seeming quite excessive. Now I'm on the same page! :)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-1066169718,https://api.github.com/repos/simonw/datasette/issues/1384,1066169718,IC_kwDOBm6k_c4_jHV2,2670795,2022-03-13T19:48:49Z,2022-03-13T19:48:49Z,CONTRIBUTOR,"> For my reference, did you include a `render_cell` plugin calling `get_metadata` in those tests? You shouldn't need to do this, as I mentioned previously. 
The code inside `render_cell` hook already has access to the most recently sync'd metadata via `datasette._metadata_local`. Refreshing the metadata for every cell seems ... excessive.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-1066143991,https://api.github.com/repos/simonw/datasette/issues/1384,1066143991,IC_kwDOBm6k_c4_jBD3,167160,2022-03-13T17:13:09Z,2022-03-13T17:13:09Z,NONE,"Thanks for taking the time to reply @brandonrobertz , this is really helpful info. > See ""Many small queries are efficient in sqlite"" for more information on the rationale here. Also note that in the datasette-live-config reference plugin, the DB connection is cached, so that eliminated most of the performance worries we had. Ah, that's nifty! Yeah, then caching on the python side is likely a waste :) I'm new to working with sqlite so this is super good to know the many-small-queries is a common pattern > I tested on very large Datasette deployments (hundreds of DBs, millions of rows). For my reference, did you include a `render_cell` plugin calling `get_metadata` in those tests? I'm less concerned now that I know a little more about sqlite's caching, but that special situation will jump you to a few orders of magnitude above what the sqlite article describes (e.g. 200 vs 20,000 queries+metadata merges for a page displaying 100 rows of a 200 column table). It wouldn't scale with db size as much as # of visible cells being rendered on the page, although they would be identical queries I suppose so will cache well. (If you didn't test this specific situation, no worries -- I'm just trying to calibrate my intuition on this and can do my own benchmarks at some point.) 
> Simon talked about eventually making something like this a standard feature of Datasette Yeah, getting metadata (and static pages as well for that matter) from internal tables definitely has my vote for including as a standard feature! Its really nice to be able to distribute a single *.db with all the metadata and static pages bundled. My metadata are sufficiently complex/domain specific that it makes sense to continue on my own plugin for now, but I'll be thinking about more general parts I can spin off as possible contributions to liveconfig (if you're open to them) or other plugins in this ecosystem.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-1066006292,https://api.github.com/repos/simonw/datasette/issues/1384,1066006292,IC_kwDOBm6k_c4_ifcU,2670795,2022-03-13T02:09:44Z,2022-03-13T02:09:44Z,CONTRIBUTOR,"> If I'm understanding your plugin code correctly, you query the db using the sync handle every time `get_metdata` is called, right? Won't this become a pretty big bottleneck if a hook into `render_cell` is trying to read metadata / plugin config? Reading from sqlite DBs is pretty quick and I didn't notice significant performance issues when I was benchmarking. I tested on very large Datasette deployments (hundreds of DBs, millions of rows). See [""Many small queries are efficient in sqlite""](https://sqlite.org/np1queryprob.html) for more information on the rationale here. Also note that in the [datasette-live-config](https://github.com/next-LI/datasette-live-config) reference plugin, the DB connection is cached, so that eliminated most of the performance worries we had. If you need to ensure fresh metadata is being read inside of a `render_cell` hook specifically, you don't need to do anything further! 
`get_metadata` gets called before `render_cell` on every request, so it already has access to the synced meta. There shouldn't be a need to call `get_metadata(...)` or `metadata(...)` inside `render_cell`; you can just use `datasette._metadata_local` if you're really worried about performance. > The plugin is close, but looks like it only grabs remote metadata, is that right? Instead what I'm wanting is to grab metadata embedded in the attached databases. Yes correct, the datasette-remote-metadata plugin doesn't do that. But the datasette-live-config plugin does. [It supports a `__metadata` table](https://github.com/next-LI/datasette-live-config/blob/main/datasette_live_config/__init__.py#L107-L138) that, when it exists on an attached DB, gets pulled into the Datasette internal `_metadata` and is also accessible via `get_metadata`. Updating is instantaneous so there are no gotchas for users or security issues for users relying on the metadata-based permissions. Simon talked about eventually making something like this a standard feature of Datasette, but I'm not sure what the status is on that! Good luck!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-1065951744,https://api.github.com/repos/simonw/datasette/issues/1384,1065951744,IC_kwDOBm6k_c4_iSIA,167160,2022-03-12T19:47:17Z,2022-03-12T19:47:17Z,NONE,"Awesome, thanks @brandonrobertz ! The plugin is close, but looks like it only grabs remote metadata, is that right? Instead what I'm wanting is to grab metadata embedded in the attached databases. Rather than extending that plugin, at this point I've realized I need a lot more flexibility in metadata for my data model (esp around formatting cell values and custom file exports) so rather than extending that I'll continue working on a plugin specific to my app. 
If I'm understanding your plugin code correctly, you query the db using the sync handle every time `get_metadata` is called, right? Won't this become a pretty big bottleneck if a hook into `render_cell` is trying to read metadata / plugin config? > Making the get_metadata async won't improve the situation by itself as only some of the code paths accessing metadata use that hook. The other paths use the internal metadata dict. I agree -- because things like `render_cell` will potentially want to read metadata/config, `get_metadata` should really remain sync and lightweight, which we can do with something like the remote-metadata plugin that could also poll metadata tables in attached databases. That leaves your app, where it sounds like you want changes made by the user in the browser to be immediately reflected, rather than having to wait for the next metadata refresh. In this case I wonder if you could have your app make a sync write to the datasette object so the change would have the immediate effect, but then have a separate async polling mechanism to eventually write that change out to the database for long-term persistence. Then you'd have the best of both worlds, I think? But probably not worth the trouble if your use cases are small (and/or you're not reading metadata/config from tight loops like render_cell).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-1065940779,https://api.github.com/repos/simonw/datasette/issues/1384,1065940779,IC_kwDOBm6k_c4_iPcr,2670795,2022-03-12T18:49:29Z,2022-03-12T18:50:07Z,CONTRIBUTOR,"Hello! Just wanted to chime in and note that there's a plugin to have Datasette [watch for updates to an external metadata.yaml/json and update the internal settings accordingly](https://datasette.io/plugins/datasette-remote-metadata), so I think the cache/poll use case is already covered. 
@khusmann If you don't need truly dynamic metadata then what you've come up with or the plugin ought to work fine. Making the get_metadata async won't improve the situation by itself as only some of the code paths accessing metadata use that hook. The other paths use the internal metadata dict. Trying to force all paths through an async hook would have performance ramifications and making everything use the internal meta will cause problems for users that need changes to take effect immediately. This is why I came to the non-async solution as it was the path of least change within Datasette. As always, open to new ideas, etc!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-1065929510,https://api.github.com/repos/simonw/datasette/issues/1384,1065929510,IC_kwDOBm6k_c4_iMsm,167160,2022-03-12T17:49:59Z,2022-03-12T17:49:59Z,NONE,"Ok, I'm taking a slightly different approach, which I think is sort of close to the in-memory _metadata table idea. I'm using a startup hook to load metadata / other info from the database, which I store in the datasette object for later:
```
@hookimpl
def startup(datasette):
    async def inner():
        datasette._mypluginmetadata = ...  # await db query
    return inner
```
Then, I can use this in other plugins:
```
@hookimpl
def render_cell(value, column, table, database, datasette):
    ...  # use datasette._mypluginmetadata
```
For my app I don't need anything to update dynamically so it's fine to pre-populate everything on startup. It's also good to have things precached especially for a hook like render_cell, which would otherwise require a ton of redundant db queries. Makes me wonder if we could take a sort of similar caching approach with the internal _metadata table. 
Like have a little watchdog that could query all of the attached dbs for their _metadata tables every 5min or so, which then could be merged into the in memory _metadata table which then could be accessed sync by the plugins, or something like that. For most of the use cases I can think of, live updates don't need to take effect immediately; refreshing a cache every 5min or on some other trigger (adjustable w a config setting) would be just fine. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-1062124485,https://api.github.com/repos/simonw/datasette/issues/1384,1062124485,IC_kwDOBm6k_c4_TrvF,167160,2022-03-08T19:26:32Z,2022-03-08T19:26:32Z,NONE,"Looks like I'm late to the party here, but wanted to join the convo if there's still time before this interface is solidified in v1.0. My plugin use case is for education / social science data, which is meta-data heavy in the documentation of measurement scales, instruments, collection procedures, etc. that I want to connect to columns, tables, and dbs (and render in static pages, but looks like I can do that with the jinja plugin hook). I'm still digging in and I think @brandonrobertz 's approach will work for me at least for now, but I want to bump this thread in the meantime -- are there still plans for an async metadata hook at some point in the future? 
(or are you considering other directions?)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869075395,https://api.github.com/repos/simonw/datasette/issues/1384,869075395,MDEyOklzc3VlQ29tbWVudDg2OTA3NTM5NQ==,9599,2021-06-26T23:54:21Z,2021-06-26T23:59:21Z,OWNER,(It may well be that implementing #1168 involves a switch to async metadata),"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869075368,https://api.github.com/repos/simonw/datasette/issues/1384,869075368,MDEyOklzc3VlQ29tbWVudDg2OTA3NTM2OA==,9599,2021-06-26T23:53:55Z,2021-06-26T23:53:55Z,OWNER,"Great, let's drop fallback then. My instinct at the moment is to ship this plugin hook as-is but with a warning that it may change before Datasette 1.0 - then before 1.0 either figure out an async variant or finish the database-backed metadata concept from #1168 and recommend that as an alternative.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869074701,https://api.github.com/repos/simonw/datasette/issues/1384,869074701,MDEyOklzc3VlQ29tbWVudDg2OTA3NDcwMQ==,2670795,2021-06-26T23:45:18Z,2021-06-26T23:45:37Z,CONTRIBUTOR,"> Here's where the plugin hook is called, demonstrating the `fallback=` argument: > > https://github.com/simonw/datasette/blob/05a312caf3debb51aa1069939923a49e21cd2bd1/datasette/app.py#L426-L472 > > I'm not convinced of the use-case for passing `fallback=` to the hook here - is there a reason a plugin might care whether fallback is `True` or `False`, seeing as the `metadata()` method already respects that fallback logic on line 459? 
I think you're right. I can't think of a reason why the plugin would care about the `fallback` parameter since plugins are currently mandated to return a full, global metadata dict.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869074182,https://api.github.com/repos/simonw/datasette/issues/1384,869074182,MDEyOklzc3VlQ29tbWVudDg2OTA3NDE4Mg==,2670795,2021-06-26T23:37:42Z,2021-06-26T23:37:42Z,CONTRIBUTOR,"> > Hmmm... that's tricky, since one of the most obvious ways to use this hook is to load metadata from database tables using SQL queries. > > @brandonrobertz do you have a working example of using this hook to populate metadata from database tables I can try? > > Answering my own question: here's how Brandon implements it in his `datasette-live-config` plugin: https://github.com/next-LI/datasette-live-config/blob/72e335e887f1c69c54c6c2441e07148955b0fc9f/datasette_live_config/__init__.py#L50-L160 > > That's using a completely separate SQLite connection (actually wrapped in `sqlite-utils`) and making blocking synchronous calls to it. > > This is a pragmatic solution, which works - and likely performs just fine, because SQL queries like this against a small database are so fast that not running them asynchronously isn't actually a problem. > > But... it's weird. Everywhere else in Datasette land uses `await db.execute(...)` - but here's an example where users are encouraged to use blocking calls instead. _Ideally_ this hook would be asynchronous, but when I started down that path I quickly realized how large of a change this would be, since metadata gets used synchronously across the entire Datasette codebase. (And calling async code from sync is non-trivial.) In my live-configuration implementation I use synchronous reads using a persistent sqlite connection. 
This works pretty well in practice, but I agree it's limiting. My thinking around this was to go with the path of least change as `Datasette.metadata()` is a critical core function.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869071790,https://api.github.com/repos/simonw/datasette/issues/1384,869071790,MDEyOklzc3VlQ29tbWVudDg2OTA3MTc5MA==,9599,2021-06-26T23:04:12Z,2021-06-26T23:04:12Z,OWNER,"> Hmmm... that's tricky, since one of the most obvious ways to use this hook is to load metadata from database tables using SQL queries. > > @brandonrobertz do you have a working example of using this hook to populate metadata from database tables I can try? Answering my own question: here's how Brandon implements it in his `datasette-live-config` plugin: https://github.com/next-LI/datasette-live-config/blob/72e335e887f1c69c54c6c2441e07148955b0fc9f/datasette_live_config/__init__.py#L50-L160 That's using a completely separate SQLite connection (actually wrapped in `sqlite-utils`) and making blocking synchronous calls to it. This is a pragmatic solution, which works - and likely performs just fine, because SQL queries like this against a small database are so fast that not running them asynchronously isn't actually a problem. But... it's weird. 
Everywhere else in Datasette land uses `await db.execute(...)` - but here's an example where users are encouraged to use blocking calls instead.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869071435,https://api.github.com/repos/simonw/datasette/issues/1384,869071435,MDEyOklzc3VlQ29tbWVudDg2OTA3MTQzNQ==,9599,2021-06-26T22:59:26Z,2021-06-26T22:59:26Z,OWNER,"The other alternative is to finish the work to build a `_metadata` internal table, see #1168. The idea there was that if we want to support efficient pagination and search across the metadata for thousands of attached tables powering it with a plugin hook doesn't work well - we don't want to call the hook once for every one of 1,000+ tables just to implement the homepage. So instead, all metadata for all attached databases would be loaded into an in-memory database called `_metadata`. Plugins that want to modify stored metadata could then do so by directly writing to that table.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869071167,https://api.github.com/repos/simonw/datasette/issues/1384,869071167,MDEyOklzc3VlQ29tbWVudDg2OTA3MTE2Nw==,9599,2021-06-26T22:55:36Z,2021-06-26T22:55:36Z,OWNER,"Just realized I already have an issue open for this, at #860. 
I'm going to close that and continue work on this in this issue.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869070941,https://api.github.com/repos/simonw/datasette/issues/1384,869070941,MDEyOklzc3VlQ29tbWVudDg2OTA3MDk0MQ==,9599,2021-06-26T22:53:34Z,2021-06-26T22:53:34Z,OWNER,"The `await` thing is worrying me a lot - it feels like this plugin hook is massively less useful if it can't make its own DB queries and generally do asynchronous stuff - but I'd also like not to break every existing plugin that calls `datasette.metadata(...)`. One solution that could work: introduce a new method, maybe `await datasette.get_metadata(...)`, which uses this plugin hook - and keep the existing `datasette.metadata()` method (which doesn't call the hook) around. This would ensure existing plugins keep on working. Then, upgrade those plugins separately - with the goal of deprecating and removing `.metadata()` entirely in Datasette 1.0 - having upgraded the plugins in the meantime.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869070348,https://api.github.com/repos/simonw/datasette/issues/1384,869070348,MDEyOklzc3VlQ29tbWVudDg2OTA3MDM0OA==,9599,2021-06-26T22:46:18Z,2021-06-26T22:46:18Z,OWNER,"Here's where the plugin hook is called, demonstrating the `fallback=` argument: https://github.com/simonw/datasette/blob/05a312caf3debb51aa1069939923a49e21cd2bd1/datasette/app.py#L426-L472 I'm not convinced of the use-case for passing `fallback=` to the hook here - is there a reason a plugin might care whether fallback is `True` or `False`, seeing as the `metadata()` method already respects that fallback logic on line 459?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 
0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869070076,https://api.github.com/repos/simonw/datasette/issues/1384,869070076,MDEyOklzc3VlQ29tbWVudDg2OTA3MDA3Ng==,9599,2021-06-26T22:42:21Z,2021-06-26T22:42:21Z,OWNER,"Hmmm... that's tricky, since one of the most obvious ways to use this hook is to load metadata from database tables using SQL queries. @brandonrobertz do you have a working example of using this hook to populate metadata from database tables I can try?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869069926,https://api.github.com/repos/simonw/datasette/issues/1384,869069926,MDEyOklzc3VlQ29tbWVudDg2OTA2OTkyNg==,9599,2021-06-26T22:40:15Z,2021-06-26T22:40:53Z,OWNER,"The documentation says: > **datasette**: You can use this to access plugin configuration options via `datasette.plugin_config(your_plugin_name)`, or to execute SQL queries. That's not accurate: since the plugin hook is a regular function, not an awaitable, you can't use it to run `await db.execute(...)` so you can't execute SQL queries. I can fix this with the await-me-maybe pattern, used for other plugin hooks: https://simonwillison.net/2020/Sep/2/await-me-maybe/ BUT... 
that requires changing the `ds.metadata()` function to be awaitable, which will affect every existing plugin that uses that documented internal method!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869069768,https://api.github.com/repos/simonw/datasette/issues/1384,869069768,MDEyOklzc3VlQ29tbWVudDg2OTA2OTc2OA==,9599,2021-06-26T22:37:53Z,2021-06-26T22:37:53Z,OWNER,The documentation doesn't describe the ``fallback`` argument at the moment.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135, https://github.com/simonw/datasette/issues/1384#issuecomment-869069655,https://api.github.com/repos/simonw/datasette/issues/1384,869069655,MDEyOklzc3VlQ29tbWVudDg2OTA2OTY1NQ==,9599,2021-06-26T22:36:14Z,2021-06-26T22:37:37Z,OWNER,"Documentation for the new hook is now live at https://docs.datasette.io/en/latest/plugin_hooks.html#get-metadata-datasette-key-database-table-fallback Link to the current snapshot of that documentation: https://github.com/simonw/datasette/blob/05a312caf3debb51aa1069939923a49e21cd2bd1/docs/plugin_hooks.rst#get-metadata-datasette-key-database-table-fallback","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",930807135,