Zenodo: Support of "HTTP/1.1 byte range request" in file retrieval

Created on 9 Sep 2018  ·  10 Comments  ·  Source: zenodo/zenodo

I have a feature request for Zenodo: can the Zenodo server support HTTP/1.1 byte range requests (https://tools.ietf.org/html/rfc7233)?

The Zenodo platform is already incredible, and support for byte range requests would further increase the value of deposited data, since some applications rely on them, in particular when dealing with large files.

I'd like to add an example of how byte range requests work, to make my point clear. GitHub (raw.githubusercontent.com), for example, supports them as shown below:

###
### The entire README file is retrieved, then processed locally
###
$ curl  https://raw.githubusercontent.com/zenodo/zenodo/master/README.rst |head -5 | tail -1
    Zenodo is free software; you can redistribute it

###
### Only the specified bytes of the file are retrieved, which requires no local processing
###
$ curl -H "range: bytes=72-125"  https://raw.githubusercontent.com/zenodo/zenodo/master/README.rst 
    Zenodo is free software; you can redistribute it

However, byte range requests are ignored on zenodo.org:

###
### The entire file is retrieved
###
$ curl   https://zenodo.org/record/1407145/files/DOI_Test.txt
This is a test of the Zenodo DOI functionality for GitLab. 

###
### Only two bytes are requested, but the entire file is retrieved
###
$ curl -H "range: bytes=6-7"  https://zenodo.org/record/1407145/files/DOI_Test.txt
This is a test of the Zenodo DOI functionality for GitLab.
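To make the contrast concrete, here is a minimal sketch of the single-range handling RFC 7233 describes (illustration only: the function name and return shape are invented for this example, not Zenodo's code). A server that ignores the header, as zenodo.org does above, would always take the 200 branch:

```python
def serve_range(body, range_header):
    """Sketch of RFC 7233 single-range handling.

    Returns (status, headers, payload) for a request on `body`.
    """
    total = len(body)
    if not range_header or not range_header.lower().startswith("bytes="):
        # No (or unsupported) Range header: send the whole representation.
        return 200, {"Content-Length": str(total)}, body
    start_s, _, end_s = range_header.split("=", 1)[1].partition("-")
    if start_s:                      # "bytes=6-7" or open-ended "bytes=6-"
        start = int(start_s)
        end = int(end_s) if end_s else total - 1
    else:                            # suffix form "bytes=-5": the last 5 bytes
        start = max(total - int(end_s), 0)
        end = total - 1
    if start >= total:
        # Range not satisfiable for this representation.
        return 416, {"Content-Range": "bytes */%d" % total}, b""
    end = min(end, total - 1)
    headers = {"Content-Range": "bytes %d-%d/%d" % (start, end, total)}
    return 206, headers, body[start:end + 1]
```

With the file above, `serve_range(data, "bytes=6-7")` answers 206 Partial Content with just two bytes instead of the whole document.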
Labels: Enhancement, Needs investigation, Accepted

All 10 comments

I'll second this. It would be very useful, e.g., for genomics datasets to be accessed directly with tabix. It seems to require only a configuration change in the Zenodo web server, setting `max_ranges` to a positive number.
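`max_ranges` is an nginx directive (by default nginx serves ranges without limit; `max_ranges 0;` disables them). Assuming Zenodo's frontend is nginx, which the comment implies but does not confirm, the change would be a one-liner in the relevant `location` block:

```nginx
location /record/ {
    # A positive value caps the number of ranges honored per request;
    # 1 is enough for the single-range curl examples above.
    # (max_ranges 0 is what would make the server ignore Range entirely.)
    max_ranges 1;
}
```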

Is there some technical reason not to do that?

Our file storage backend at the moment is not optimized to serve HTTP range requests (meaning that enabling this feature would potentially lead to significant slowdowns for the file upload/download API). Of course, there are people working on making it possible, though we can't give an accurate ETA on it...

I just wanted to add my :+1: to state that enabling range requests would be very useful for geospatial data formats. Cloud Optimized GeoTIFF in particular would benefit a lot from this. Allowing range requests could really reduce the bandwidth needed from zenodo.

> Our file storage backend at the moment is not optimized to serve HTTP range requests (meaning that enabling this feature would potentially lead to significant slowdowns for the file upload/download API). Of course, there are people working on making it possible, though we can't give an accurate ETA on it...

Many people cannot download large genetics files (several GB), e.g.:
https://github.com/zenodo/zenodo/issues/460#issuecomment-546623751

Some have to retry many times, and that actually wastes your bandwidth...

For our project it is also important that we can use Cloud-Optimized GeoTIFFs (see e.g. https://zenodo.org/record/4483227) directly from Zenodo. Figshare apparently works with COGs; why doesn't Zenodo? We wrote a tutorial showing users how to fetch small chunks of data from COG files.

Could you please support this?

We need it to serve large image files (in Zarr format) in chunks, which allows us to visualize the files in the browser instantly. It is not feasible for the browser to download a whole file of, e.g., 10GB and then display it.

Just noting the value for the Zarr use case. Thanks all for your work on Zenodo!

For Zarr, we could hypothetically get Zenodo working today, without any changes. Zenodo does not support directories, but if we could map a regular Zarr directory store to some sort of flat hierarchy via a special character, we could make it work. For example, if the special character is `__`:

.zgroup
foo__.zarray
foo__.zattrs
foo__0.0
foo__0.1

etc.
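The mapping sketched above can be written down in a few lines (the `__` separator and function names here are the hypothetical ones from this comment, not an agreed spec):

```python
SEP = "__"  # hypothetical separator; must never occur in real store keys

def flatten_key(key):
    """Map a nested Zarr store key, e.g. 'foo/0.0', to a flat Zenodo filename."""
    return key.replace("/", SEP)

def unflatten_name(name):
    """Inverse mapping: recover the store key from the flat filename."""
    return name.replace(SEP, "/")
```

For instance, `flatten_key("foo/0.1")` gives `"foo__0.1"`, matching the listing above; the scheme only round-trips if `__` is reserved and never appears in genuine key names.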

Could you please raise an issue here ( https://github.com/zarr-developers/zarr-specs/issues )?

@rabernat I'm afraid that won't scale, because Zenodo only allows 100 files at maximum.

> Total files size limit per record is 50GB (max 100 files). One-time 100GB quota can be requested and granted on a case-by-case basis.

source: https://www.openaire.eu/technical-requirements
