issues

9 rows where repo = 107914493, state = "closed" and user = 536941 sorted by updated_at descending


Facets: type (issue 5, pull 4) · state (closed 9) · repo (datasette 9)
Columns: id, node_id, number, title, user, state, locked, assignee, milestone, comments, created_at, updated_at (sorted descending), closed_at, author_association, pull_request, body, repo, type, active_lock_reason, performed_via_github_app, reactions, draft, state_reason
#1890 · Autocomplete text entry for filter values that correspond to facets · id 1448143294 · node_id I_kwDOBm6k_c5WUOm- · user fgregg (536941) · state closed · locked 0 · comments 16 · created 2022-11-14T14:11:31Z · updated 2022-11-17T00:47:36Z · closed 2022-11-16T03:23:01Z · author_association CONTRIBUTOR

Datasette allows users to enter values for named parameters into a free-text form field.

I think it would add a lot of usability if the form field could be a drop-down of options when the query value is a column that is already faceted.
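
For context, a rough sketch of where such a drop-down could get its options: recent Datasette versions expose facet values through the table JSON API via the _facet parameter. The URL, table, and column below are illustrative, and the exact JSON shape is an assumption to verify against the running Datasette version.

    import json
    from urllib.request import urlopen

    # Illustrative only: fetch facet values for a column from a table's
    # JSON endpoint, to populate a drop-down of suggested filter values.
    url = "https://latest.datasette.io/fixtures/facetable.json?_facet=state"
    data = json.load(urlopen(url))

    # Assumed response shape: facet_results -> {column: {"results": [{"value": ...}, ...]}}
    suggestions = [f["value"] for f in data["facet_results"]["state"]["results"]]
    print(suggestions)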

repo datasette (107914493) · type issue
{
    "url": "https://api.github.com/repos/simonw/datasette/issues/1890/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason completed
#1835 · use inspect data for hash and file size · id 1400121355 · node_id PR_kwDOBm6k_c5AVujU · user fgregg (536941) · state closed · locked 0 · comments 3 · created 2022-10-06T18:25:24Z · updated 2022-10-27T20:51:30Z · closed 2022-10-06T20:06:07Z · author_association CONTRIBUTOR · pull_request simonw/datasette/pulls/1835

inspect_data should already include the hash and the db file size, so this PR takes advantage of those values instead of always recalculating them. This should help a lot on startup with large DBs.

closes #1834
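
A rough sketch of the idea, not the actual diff: read the precomputed values from inspect-data and hand them to the database object so it never re-hashes on startup. The constructor arguments and JSON keys shown are assumptions for illustration.

    import json

    # inspect-data (as written by `datasette inspect ... > inspect-data.json`)
    # records per-database metadata, so hashing large files can be skipped.
    with open("inspect-data.json") as fp:
        inspect_data = json.load(fp)

    for name, info in inspect_data.items():
        precomputed_hash = info.get("hash")   # assumed key
        precomputed_size = info.get("size")   # assumed key
        # e.g. Database(path, hash=precomputed_hash, size=precomputed_size)
        print(name, precomputed_hash, precomputed_size)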

repo datasette (107914493) · type pull
{
    "url": "https://api.github.com/repos/simonw/datasette/issues/1835/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
draft 0
#1837 · Make hash and size a lazy property · id 1400431789 · node_id PR_kwDOBm6k_c5AWyQK · user fgregg (536941) · state closed · locked 0 · comments 2 · created 2022-10-06T23:51:22Z · updated 2022-10-27T20:51:21Z · closed 2022-10-27T20:51:20Z · author_association CONTRIBUTOR · pull_request simonw/datasette/pulls/1837

Many apologies, @simonw. My previous PR #1835 did not really solve the problem, because the name of the database is often not known to the database object in its init method.

I took a cue from how you dealt with this issue and made hash a lazy property and did something similar with size.


:books: Documentation preview :books:: https://datasette--1837.org.readthedocs.build/en/1837/
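
A minimal sketch of the lazy-property pattern described here, so the hash is only computed on first access rather than in init; the class and attribute names are illustrative, not the actual Datasette code.

    import hashlib
    import os

    class DatabaseFile:
        # Illustrative stand-in for a database wrapper with lazy hash/size.
        def __init__(self, path):
            self.path = path
            self._hash = None
            self._size = None

        @property
        def hash(self):
            # Computed on first access, then cached; avoids hashing a large
            # file during __init__ when the value may never be needed.
            if self._hash is None:
                with open(self.path, "rb") as fp:
                    self._hash = hashlib.sha256(fp.read()).hexdigest()
            return self._hash

        @property
        def size(self):
            if self._size is None:
                self._size = os.path.getsize(self.path)
            return self._size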

repo datasette (107914493) · type pull
{
    "url": "https://api.github.com/repos/simonw/datasette/issues/1837/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
draft 0
#1820 · [SPIKE] Don't truncate query CSVs · id 1386456717 · node_id PR_kwDOBm6k_c4_oHI4 · user fgregg (536941) · state closed · locked 0 · comments 2 · created 2022-09-26T17:27:01Z · updated 2022-10-07T16:12:17Z · closed 2022-10-07T16:12:17Z · author_association CONTRIBUTOR · pull_request simonw/datasette/pulls/1820

Relates to #526

This is a minimal set of changes needed to have query CSVs attempt to download all the rows.

What's good about it is the minimalism.

What's bad about it:

  1. We are abusing the _size argument to indicate we don't want truncation, which isn't the most obvious thing. Additionally, there are various checks that make sure the "_size" URL parameter is a positive integer, which we are relying on to prevent overloading.
  2. The default CSV on a table page will use the max_returned_rows argument. Changing this could be a breaking change, since that's currently a place that has some facilities for pagination. Additionally, I think there's a limit under the hood somewhere which, if removed, could lead to SQL timeouts.
  3. There are similar reasons for leaving the current streaming method alone, as the current methods could allow for downloading very large files that could hit a SQL timeout if we tried to get them in one go.

:books: Documentation preview :books:: https://datasette--1820.org.readthedocs.build/en/1820/
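
A rough sketch of the underlying idea, streaming every row of a query as CSV instead of stopping at a row cap; sqlite3 and csv stand in here for Datasette's own async plumbing, and the function name is illustrative.

    import csv
    import io
    import sqlite3

    def stream_query_csv(db_path, sql):
        # Yields CSV text chunk by chunk for all rows of the query,
        # rather than truncating at a fixed row limit.
        conn = sqlite3.connect(db_path)
        cursor = conn.execute(sql)
        buffer = io.StringIO()
        writer = csv.writer(buffer)

        def flush():
            chunk = buffer.getvalue()
            buffer.seek(0)
            buffer.truncate(0)
            return chunk

        writer.writerow([col[0] for col in cursor.description])
        yield flush()
        for row in cursor:
            writer.writerow(row)
            yield flush()
        conn.close()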

repo datasette (107914493) · type pull
{
    "url": "https://api.github.com/repos/simonw/datasette/issues/1820/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
draft 1
#1834 · inspect data is not used for caching database hash · id 1400083043 · node_id I_kwDOBm6k_c5Tc5Jj · user fgregg (536941) · state closed · locked 0 · comments 0 · created 2022-10-06T17:52:01Z · updated 2022-10-06T20:06:21Z · closed 2022-10-06T20:06:08Z · author_association CONTRIBUTOR

When databases are loaded,

https://github.com/simonw/datasette/blob/cb1e093fd361b758120aefc1a444df02462389a3/datasette/app.py#L257-L260

there is nothing preventing immutable databases from being re-hashed.

https://github.com/simonw/datasette/blob/cb1e093fd361b758120aefc1a444df02462389a3/datasette/database.py#L50-L53

What I would expect is that the relevant values from inspect_data get passed to the Database class to prevent re-hashing.

With databases that are many gigabytes in size, this adds significant startup time.

repo datasette (107914493) · type issue
{
    "url": "https://api.github.com/repos/simonw/datasette/issues/1834/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason completed
#1779 · google cloudrun updated their limits on maxscale based on memory and cpu count · id 1334628400 · node_id I_kwDOBm6k_c5PjNAw · user fgregg (536941) · state closed · locked 0 · milestone Datasette 0.62 (8303187) · comments 13 · created 2022-08-10T13:27:21Z · updated 2022-08-14T19:42:59Z · closed 2022-08-14T17:07:34Z · author_association CONTRIBUTOR

If you don't set an explicit limit on container scaling, then Google defaults to 100.

Google recently updated the limits on container scaling, such that if you set up Datasette to use more memory or CPU, you need to set the maxScale argument to much less than 100.

It would be nice if datasette publish could do this math for you and set the right maxScale.

Log of a failing publish run:

ERROR: (gcloud.run.deploy) spec.template.spec.containers[0].resources.limits.cpu: Invalid value specified for cpu. For the specified value, maxScale may not exceed 15. Consider running your workload in a region with greater capacity, decreasing your requested cpu-per-instance, or requesting an increase in quota for this region if you are seeing sustained usage near this limit, see https://cloud.google.com/run/quotas. Your project may gain access to further scaling by adding billing information to your account.

Traceback (most recent call last):
  File "/home/runner/.local/bin/datasette", line 8, in <module>
    sys.exit(cli())
  File "/home/runner/.local/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/runner/.local/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/runner/.local/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/runner/.local/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/runner/.local/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/runner/.local/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/runner/.local/lib/python3.8/site-packages/datasette/publish/cloudrun.py", line 160, in cloudrun
    check_call(
  File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'gcloud run deploy --allow-unauthenticated --platform=managed --image gcr.io/labordata/datasette warehouse --memory 8Gi --cpu 2' returned non-zero exit status 1.
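
A rough sketch of the kind of calculation datasette publish could do; the limits table below is purely illustrative (the real per-CPU caps come from Google's Cloud Run quotas page), and suggested_max_scale is a hypothetical helper, though --max-instances is a real gcloud run deploy flag.

    # Hypothetical helper: pick a max-instances value that stays within
    # Cloud Run's per-CPU scaling limits. The mapping here is illustrative
    # only; real values should come from https://cloud.google.com/run/quotas.
    ILLUSTRATIVE_MAX_SCALE_BY_CPU = {1: 100, 2: 15, 4: 10}

    def suggested_max_scale(cpu, default=100):
        return min(default, ILLUSTRATIVE_MAX_SCALE_BY_CPU.get(cpu, default))

    # The computed value could then be appended to the deploy command, e.g.
    #   gcloud run deploy ... --cpu 2 --memory 8Gi --max-instances 15
    print(suggested_max_scale(2))  # 15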

repo datasette (107914493) · type issue
{
    "url": "https://api.github.com/repos/simonw/datasette/issues/1779/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason completed
#1581 · when hashed urls are turned on, the _memory db has improperly long-lived cache expiry · id 1089529555 · node_id I_kwDOBm6k_c5A8ObT · user fgregg (536941) · state closed · locked 0 · comments 1 · created 2021-12-28T00:05:48Z · updated 2022-03-24T04:08:18Z · closed 2022-03-24T04:08:18Z · author_association CONTRIBUTOR

If hashed URLs are on, then a -000 suffix is added to the _memory database, and the cache settings are set just as if it were a normal hashed database.

In particular, this header is set:

cache-control: max-age=31536000

This is not appropriate because the _memory-000 database isn't actually hashed based on the contents of the databases (see #1561).

Either the cache-control header should be changed, or the _memory db should have a hash suffix that does depend on the contents of the databases.
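
A minimal sketch of the first option, in the spirit of the fix in #1582 below: skip the far-future expiry whenever the hash suffix is the '000' placeholder. The function name and the short fallback TTL are illustrative, not Datasette's actual internals.

    FAR_FUTURE_MAX_AGE = 31536000  # one year, as in the header above

    def cache_control_header(db_hash):
        # Only send a far-future expiry when the URL is truly content-addressed;
        # the "000" placeholder for _memory is not derived from database contents.
        if db_hash and db_hash != "000":
            return "max-age={}".format(FAR_FUTURE_MAX_AGE)
        return "max-age=5"  # illustrative short TTL

    print(cache_control_header("000"))      # max-age=5
    print(cache_control_header("9a8b7c1"))  # max-age=31536000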

repo datasette (107914493) · type issue
{
    "url": "https://api.github.com/repos/simonw/datasette/issues/1581/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason completed
#1582 · don't set far expiry if hash is '000' · id 1090055810 · node_id PR_kwDOBm6k_c4wWDxH · user fgregg (536941) · state closed · locked 0 · comments 1 · created 2021-12-28T18:16:13Z · updated 2022-03-24T04:07:58Z · closed 2022-03-24T04:07:58Z · author_association CONTRIBUTOR · pull_request simonw/datasette/pulls/1582

This will close #1581.

I couldn't find any unit tests related to testing hashed URLs, and I know that you want to break that code out of the core application (#1561), so I'm not quite sure what you would like me to do for testing.

repo datasette (107914493) · type pull
{
    "url": "https://api.github.com/repos/simonw/datasette/issues/1582/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
draft 0
#1561 · add hash id to "_memory" url if hashed url mode is turned on and crossdb is also turned on · id 1082765654 · node_id I_kwDOBm6k_c5AibFW · user fgregg (536941) · state closed · locked 0 · comments 3 · created 2021-12-17T00:45:12Z · updated 2022-03-19T04:45:40Z · closed 2022-03-19T04:45:40Z · author_association CONTRIBUTOR

If hashed_url mode is turned on and crossdb is also turned on, then queries to _memory should have a hash_id.

One way that it could work is to have the _memory hash be a hash of all the individual databases.

Otherwise, crossdb queries can get quite out of date if aggressive caching is used.
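
A minimal sketch of the combined-hash idea above: derive the _memory suffix from the content hashes of all attached databases, so it changes whenever any of them changes. The function name and input shape are illustrative.

    import hashlib

    def memory_db_hash(db_hashes):
        # db_hashes: mapping of database name -> content hash (illustrative input).
        combined = hashlib.sha256()
        for name, db_hash in sorted(db_hashes.items()):
            combined.update("{}:{}".format(name, db_hash).encode("utf-8"))
        return combined.hexdigest()[:7]

    print(memory_db_hash({"content": "3fe2a9c", "fixtures": "aa7318b"}))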

repo datasette (107914493) · type issue
{
    "url": "https://api.github.com/repos/simonw/datasette/issues/1561/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason completed

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [pull_request] TEXT,
   [body] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
, [active_lock_reason] TEXT, [performed_via_github_app] TEXT, [reactions] TEXT, [draft] INTEGER, [state_reason] TEXT);
CREATE INDEX [idx_issues_repo]
                ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
                ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
                ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
                ON [issues] ([user]);
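
For reference, a short sketch of reproducing the filtered listing at the top of this page (repo = 107914493, state = closed, user = 536941, newest updates first) directly against the issues table defined above; the github.db filename is hypothetical.

    import sqlite3

    conn = sqlite3.connect("github.db")  # hypothetical database file
    rows = conn.execute(
        """
        select number, type, title, created_at, updated_at, closed_at
        from issues
        where repo = 107914493 and state = 'closed' and [user] = 536941
        order by updated_at desc
        """
    ).fetchall()
    for row in rows:
        print(row)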