Zenodo: Support simple download URL for archives

Created on 23 Oct 2018  ·  4 Comments  ·  Source: zenodo/zenodo

Currently, the download URLs for archives appear to be indirect and do not end in the file name and extension, so command-line tools (e.g. wget) do not save the archive under its original name. Ideally, the filename of the archive should be preserved when downloading with a command-line tool. Is it possible to change this?

All 4 comments

Thanks for reporting. AFAIK this is already supported. Example:

$ wget https://zenodo.org/api/files/4f53dd1f-df5f-4a9c-8b46-6eacfc4b8840/results.zip
--2018-10-24 08:33:10--  https://zenodo.org/api/files/4f53dd1f-df5f-4a9c-8b46-6eacfc4b8840/results.zip
Resolving zenodo.org... 137.138.76.77
Connecting to zenodo.org|137.138.76.77|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 309980977 (296M) [application/octet-stream]
Saving to: 'results.zip'

Do you have an example where it's not the case?

Hi @lnielsen,

https://zenodo.org/record/51405
The files in this record, for example, have URLs of the form
https://zenodo.org/record/<record number>/files/<filename>?download=1,
which differs from the
https://zenodo.org/api/files/<UID>/<filename>
form you give above. With reference to @bbarker's request, using wget on the former isn't possible.

This may be ignorance on my part. Is there a method I'm missing for getting the API-style permalink?

@benjaminhwilliams In both cases the same piece of code serves the file. The only difference is that https://zenodo.org/record/<record number>/files/<filename>?download=1 returns a human-readable error page for errors such as 404.

As for wget and ?download=1, in my opinion wget is the one misbehaving. You can, however, simply remove the ?download=1 to satisfy wget.
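
For scripted use, dropping ?download=1 can be done programmatically. The following is only a minimal Python sketch of that workaround; the function names strip_download_param and filename_from_url are made up for illustration and are not part of Zenodo or wget.

```python
import posixpath
from urllib.parse import parse_qsl, unquote, urlencode, urlsplit, urlunsplit


def strip_download_param(url):
    """Return the URL without its "download" query parameter (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep every query parameter except "download".
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != "download"]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def filename_from_url(url):
    """Return the last path segment of the URL, which is the name a tool would save to."""
    return unquote(posixpath.basename(urlsplit(url).path))


url = "https://zenodo.org/record/51405/files/l-cyst_01.tar.gz?download=1"
clean = strip_download_param(url)
print(clean)
print(filename_from_url(clean))
```

The cleaned URL ends in the file name, so a tool that names its output after the last part of the URL saves the archive under the expected name.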

wget is misbehaving because we are in fact sending the correct filename in the HTTP headers. See below:

$ curl -I "https://zenodo.org/record/51405/files/l-cyst_01.tar.gz?download=1"
HTTP/1.1 200 OK
...
Content-Disposition: attachment; filename=l-cyst_01.tar.gz
...
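
A client that wants to honour this header can read the filename out of it. Below is a minimal Python sketch using only the standard library; the function name filename_from_content_disposition is hypothetical, and the header value is the one from the curl output above.

```python
from email.message import Message


def filename_from_content_disposition(header_value):
    """Extract the filename parameter from a Content-Disposition header value.

    Returns None when the header carries no filename.
    """
    msg = Message()
    msg["Content-Disposition"] = header_value
    # get_filename() reads the "filename" parameter of Content-Disposition.
    return msg.get_filename()


print(filename_from_content_disposition("attachment; filename=l-cyst_01.tar.gz"))
```

A download script could call this on the response header and fall back to the last path segment of the URL when it returns None.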

That said, if you need automated downloads, it is better to use our REST API, which gives you direct file links:

$ curl https://zenodo.org/api/records/51405
{
  ...
  "files": [
    {
      "bucket": "cbc7d513-2359-47fe-a9c6-f826de7776c5",
      "checksum": "md5:780a7b23320307ae8b6cf2d6e99ade1f",
      "key": "l-cyst_fast_04.tar.gz",
      "links": {
        "self": "https://zenodo.org/api/files/cbc7d513-2359-47fe-a9c6-f826de7776c5/l-cyst_fast_04.tar.gz"
      },
      "size": 140654635,
      "type": "gz"
    },
    {
      "bucket": "cbc7d513-2359-47fe-a9c6-f826de7776c5",
      "checksum": "md5:c04800ec8ffaaad867ee54a3a1688ac5",
      "key": "l-cyst_very_fast_01.tar.gz",
      "links": {
        "self": "https://zenodo.org/api/files/cbc7d513-2359-47fe-a9c6-f826de7776c5/l-cyst_very_fast_01.tar.gz"
      },
      "size": 63254814,
      "type": "gz"
    },
  ...
}
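
For automation, the direct links can be pulled out of that response with a few lines of Python. This is a rough sketch that assumes the response has the layout shown above (a "files" list whose entries carry "key" and "links.self"); the helper names fetch_record and direct_file_links are made up here, and the live API may differ from this 2018 output.

```python
import json
from urllib.request import urlopen

# Endpoint taken from the curl example above.
API_ROOT = "https://zenodo.org/api/records"


def fetch_record(record_id, timeout=30):
    """Fetch the record JSON from the REST API (performs a network request)."""
    with urlopen(f"{API_ROOT}/{record_id}", timeout=timeout) as response:
        return json.load(response)


def direct_file_links(record):
    """Map each file's "key" to its "links.self" URL (assumed response layout)."""
    return {entry["key"]: entry["links"]["self"] for entry in record.get("files", [])}


# Offline example with one entry copied from the response above.
sample_record = {
    "files": [
        {
            "bucket": "cbc7d513-2359-47fe-a9c6-f826de7776c5",
            "key": "l-cyst_fast_04.tar.gz",
            "links": {
                "self": "https://zenodo.org/api/files/cbc7d513-2359-47fe-a9c6-f826de7776c5/l-cyst_fast_04.tar.gz"
            },
        }
    ]
}

for name, url in direct_file_links(sample_record).items():
    print(name, url)
```

With network access, direct_file_links(fetch_record(51405)) would return the same kind of mapping for the live record, and each URL can then be passed to wget as-is.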

Thanks @lnielsen. Curling api/records/<record_id> worked for fetching wget-able URLs. It would be helpful to surface these wget-able URLs directly on the site, if possible.
