issue_comments

9 rows where issue = 964322136 sorted by updated_at descending

user (3 distinct values)

  • simonw 7
  • tannewt 1
  • knowledgecamp12 1

author_association (2 distinct values)

  • OWNER 7
  • NONE 2

issue (1 distinct value)

  • Manage /robots.txt in Datasette core, block robots by default · 9
Columns, in the order they appear in each row below: id, html_url, issue_url, node_id, user, created_at, updated_at, author_association, body, reactions, issue, performed_via_github_app
985982668 https://github.com/simonw/datasette/issues/1426#issuecomment-985982668 https://api.github.com/repos/simonw/datasette/issues/1426 IC_kwDOBm6k_c46xObM knowledgecamp12 95520595 2021-12-04T07:11:29Z 2021-12-04T07:11:29Z NONE

You can generate an XML sitemap using online tools such as https://tools4seo.site/xml-sitemap-generator.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Manage /robots.txt in Datasette core, block robots by default 964322136  
974711959 https://github.com/simonw/datasette/issues/1426#issuecomment-974711959 https://api.github.com/repos/simonw/datasette/issues/1426 IC_kwDOBm6k_c46GOyX tannewt 52649 2021-11-20T21:11:51Z 2021-11-20T21:11:51Z NONE

I think another thing would be to make /pages/robots.txt work. That way you can use Jinja to generate whatever robots.txt you want. I'm using it to allow the main index and the pages it links to to be crawled (but not the database pages directly).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Manage /robots.txt in Datasette core, block robots by default 964322136  
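As a quick illustration of the Jinja idea in the comment above (the template text, variable name and disallow prefixes are made up for the example, and this is not a Datasette API):

```python
# Illustrative only: render a robots.txt body from a Jinja template, as the
# comment above suggests. The disallow prefixes are placeholders.
from jinja2 import Template

ROBOTS_TEMPLATE = Template(
    "User-agent: *\n"
    "{% for prefix in disallow %}Disallow: {{ prefix }}\n{% endfor %}"
)

print(ROBOTS_TEMPLATE.render(disallow=["/content/", "/fixtures/"]))
```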
902263367 https://github.com/simonw/datasette/issues/1426#issuecomment-902263367 https://api.github.com/repos/simonw/datasette/issues/1426 IC_kwDOBm6k_c41x3JH simonw 9599 2021-08-19T21:33:51Z 2021-08-19T21:36:28Z OWNER

I was worried about whether it's possible to allow access to /fixtures but deny access to /fixtures?sql=...

From various answers on Stack Overflow it looks like this should handle that:

User-agent: *
Disallow: /fixtures?

I could use this for tables too - it may well be OK to access table index pages while still avoiding pagination, facets etc. I think this should block both query strings and row pages while allowing the table page itself:

User-agent: *
Disallow: /fixtures/searchable?
Disallow: /fixtures/searchable/*

Could even accompany that with a sitemap.xml that explicitly lists all of the tables - which would mean adding sitemaps to Datasette core too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Manage /robots.txt in Datasette core, block robots by default 964322136  
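A rough way to sanity-check which paths those rules would block: robots.txt Disallow values are prefix matches against the path plus query string (so the trailing * adds nothing), which a plain startswith() check approximates. This is a simplified checker, not a full robots.txt parser:

```python
# Simplified check of the rules proposed above: robots.txt Disallow rules are
# prefix matches against the URL path (plus query string), so startswith()
# is enough for these particular patterns.
DISALLOW_PREFIXES = [
    "/fixtures?",             # block arbitrary SQL queries against the database
    "/fixtures/searchable?",  # block query strings (facets, pagination) on the table
    "/fixtures/searchable/",  # block individual row pages under the table
]

def blocked(path_and_query):
    return any(path_and_query.startswith(prefix) for prefix in DISALLOW_PREFIXES)

for example in [
    "/fixtures",                         # database index page: allowed
    "/fixtures?sql=select+1",            # arbitrary query: blocked
    "/fixtures/searchable",              # table page itself: allowed
    "/fixtures/searchable?_facet=text",  # facets / pagination: blocked
    "/fixtures/searchable/1",            # row page: blocked
]:
    print(example, "blocked" if blocked(example) else "allowed")
```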
902260338 https://github.com/simonw/datasette/issues/1426#issuecomment-902260338 https://api.github.com/repos/simonw/datasette/issues/1426 IC_kwDOBm6k_c41x2Zy simonw 9599 2021-08-19T21:28:25Z 2021-08-19T21:29:40Z OWNER

Actually it looks like you can send a sitemap.xml to Google using an unauthenticated GET request to:

https://www.google.com/ping?sitemap=FULL_URL_OF_SITEMAP

According to https://developers.google.com/search/docs/advanced/sitemaps/build-sitemap

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Manage /robots.txt in Datasette core, block robots by default 964322136  
902260799 https://github.com/simonw/datasette/issues/1426#issuecomment-902260799 https://api.github.com/repos/simonw/datasette/issues/1426 IC_kwDOBm6k_c41x2g_ simonw 9599 2021-08-19T21:29:13Z 2021-08-19T21:29:13Z OWNER

Bing's equivalent is: https://www.bing.com/webmasters/help/Sitemaps-3b5cf6ed

http://www.bing.com/ping?sitemap=FULL_URL_OF_SITEMAP

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Manage /robots.txt in Datasette core, block robots by default 964322136  
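A minimal sketch of those two pings from Python, using only the standard library; the sitemap URL is a placeholder, and the endpoints are the ones quoted in the two comments above:

```python
# Send the unauthenticated sitemap "ping" described above to Google and Bing.
# SITEMAP_URL is a placeholder for wherever your sitemap.xml is deployed.
from urllib.parse import quote
from urllib.request import urlopen

SITEMAP_URL = "https://example.com/sitemap.xml"

for endpoint in (
    "https://www.google.com/ping?sitemap=",
    "http://www.bing.com/ping?sitemap=",
):
    with urlopen(endpoint + quote(SITEMAP_URL, safe="")) as response:
        print(endpoint, response.status)
```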
895522818 https://github.com/simonw/datasette/issues/1426#issuecomment-895522818 https://api.github.com/repos/simonw/datasette/issues/1426 IC_kwDOBm6k_c41YJgC simonw 9599 2021-08-09T20:34:10Z 2021-08-09T20:34:10Z OWNER

At the very least Datasette should serve a blank /robots.txt by default - I'm seeing a ton of 404s for it in the logs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Manage /robots.txt in Datasette core, block robots by default 964322136  
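Serving a robots.txt without touching Datasette core is already possible from a plugin; a minimal sketch assuming Datasette's register_routes plugin hook (the rules returned here are just an example, not a recommendation for the default):

```python
# Sketch of a one-file plugin that serves /robots.txt so crawlers stop
# getting 404s. Uses Datasette's register_routes plugin hook.
from datasette import hookimpl
from datasette.utils.asgi import Response

async def robots_txt(request):
    # An empty Disallow line would allow everything; "Disallow: /" blocks all crawling.
    return Response.text("User-agent: *\nDisallow: /")

@hookimpl
def register_routes():
    return [(r"^/robots\.txt$", robots_txt)]
```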
895510773 https://github.com/simonw/datasette/issues/1426#issuecomment-895510773 https://api.github.com/repos/simonw/datasette/issues/1426 IC_kwDOBm6k_c41YGj1 simonw 9599 2021-08-09T20:14:50Z 2021-08-09T20:19:22Z OWNER

https://twitter.com/mal/status/1424825895139876870

> True pinging google should be part of the build process on a static site :)

That's another aspect of this: if you DO want your site crawled, teaching the datasette publish command how to ping Google when a deploy has gone out could be a nice improvement.

Annoyingly it looks like you need to configure an auth token of some sort in order to use their API though, which is likely too much hassle to be worth building into Datasette itself: https://developers.google.com/search/apis/indexing-api/v3/using-api

```
curl -X POST https://indexing.googleapis.com/v3/urlNotifications:publish \
  -d '{"url": "https://careers.google.com/jobs/google/technical-writer", "type": "URL_UPDATED"}' \
  -H "Content-Type: application/json"

{
  "error": {
    "code": 401,
    "message": "Request is missing required authentication credential. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.",
    "status": "UNAUTHENTICATED"
  }
}
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Manage /robots.txt in Datasette core, block robots by default 964322136  
895509536 https://github.com/simonw/datasette/issues/1426#issuecomment-895509536 https://api.github.com/repos/simonw/datasette/issues/1426 IC_kwDOBm6k_c41YGQg simonw 9599 2021-08-09T20:12:57Z 2021-08-09T20:12:57Z OWNER

I could try out the X-Robots-Tag HTTP header too: https://developers.google.com/search/docs/advanced/robots/robots_meta_tag#xrobotstag

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Manage /robots.txt in Datasette core, block robots by default 964322136  
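For the header approach, a plugin could inject it into every response; here is a sketch using Datasette's asgi_wrapper plugin hook (the header value, and applying it to every response, are just for illustration):

```python
# Sketch: add "X-Robots-Tag: noindex" to every response via an ASGI wrapper.
# A real plugin would likely make the value configurable and skip some paths.
from datasette import hookimpl

@hookimpl
def asgi_wrapper(datasette):
    def wrap(app):
        async def add_x_robots_tag(scope, receive, send):
            async def wrapped_send(event):
                if event["type"] == "http.response.start":
                    headers = list(event.get("headers", []))
                    headers.append((b"x-robots-tag", b"noindex"))
                    event = dict(event, headers=headers)
                await send(event)
            await app(scope, receive, wrapped_send)
        return add_x_robots_tag
    return wrap
```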
895500565 https://github.com/simonw/datasette/issues/1426#issuecomment-895500565 https://api.github.com/repos/simonw/datasette/issues/1426 IC_kwDOBm6k_c41YEEV simonw 9599 2021-08-09T20:00:04Z 2021-08-09T20:00:04Z OWNER

A few options for how this would work:

  • datasette ... --robots allow
  • datasette ... --setting robots allow

Options could be:

  • allow - allow all crawling
  • deny - deny all crawling
  • limited - allow access to the homepage and the index pages for each database and each table, but disallow crawling any further than that

The "limited" mode is particularly interesting. Could even make it the default, but I think that may be a bit too confusing. Idea would be to get the key pages indexed but use nofollow to discourage crawlers from indexing individual row pages or deep pages like https://datasette.io/content/repos?_facet=owner&_facet=language&_facet_array=topics&topics__arraycontains=sqlite#facet-owner.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Manage /robots.txt in Datasette core, block robots by default 964322136  
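To make the three options concrete, here is a sketch of the robots.txt each mode might produce; none of these names are Datasette APIs, and the "limited" rules simply reuse the Disallow pattern from the earlier comment:

```python
# Hypothetical sketch of what the proposed allow / deny / limited modes could
# emit. table_paths would come from Datasette's own knowledge of its tables.
def robots_txt(mode, table_paths=()):
    if mode == "allow":
        return "User-agent: *\nDisallow:\n"    # allow all crawling
    if mode == "deny":
        return "User-agent: *\nDisallow: /\n"  # deny all crawling
    # "limited": index pages stay crawlable, but query strings and row pages
    # underneath each table are disallowed.
    lines = ["User-agent: *"]
    for path in table_paths:
        lines.append(f"Disallow: {path}?")
        lines.append(f"Disallow: {path}/")
    return "\n".join(lines) + "\n"

print(robots_txt("limited", ["/content/repos", "/fixtures/searchable"]))
```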

Table schema:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
, [performed_via_github_app] TEXT);
CREATE INDEX [idx_issue_comments_issue]
                ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
                ON [issue_comments] ([user]);
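This page is just a query against that table; the equivalent against a local github-to-sqlite database (the github.db filename is an assumption) would be:

```python
# Reproduce this page's rows: comments on issue 964322136, newest update first.
import sqlite3

conn = sqlite3.connect("github.db")  # assumed github-to-sqlite output file
rows = conn.execute(
    """
    select id, user, created_at, updated_at, author_association
    from issue_comments
    where issue = ?
    order by updated_at desc
    """,
    (964322136,),
)
for comment_id, user_id, created_at, updated_at, association in rows:
    print(comment_id, user_id, created_at, updated_at, association)
```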