Registry: GrSciColl Number of specimens perhaps should be optional, or derived from GBIF count?

Created on 6 Oct 2021  ·  9 Comments  ·  Source: gbif/registry

I would suggest that the Number of specimens field registered for each GRSciColl institution should either be optional or be generated automatically from the occurrence count on each page load. Otherwise, if you publish new data daily, it becomes tedious to go into GRSciColl and manually update the specimen count every day as well.

For example, on https://www.gbif.org/grscicoll/institution/41930fa9-e6da-4351-bc6e-59dfaca3be7d, the green occurrence count at the top right should always equal the Number of specimens field. Alternatively, the Number of specimens field should be optional (currently, if you enter an empty string it is saved as '0').
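
To illustrate the empty-string behaviour mentioned above, here is a minimal sketch of how a submitted value could be normalised so that a blank entry means "no estimate provided" instead of zero. The function name `parse_specimen_count` is hypothetical and is not taken from the GBIF registry code; it only shows the distinction being asked for.

```python
from typing import Optional


def parse_specimen_count(raw: Optional[str]) -> Optional[int]:
    """Convert a form value to a specimen count, keeping 'blank' distinct from 0.

    Hypothetical helper for illustration only.
    """
    if raw is None or raw.strip() == "":
        return None  # field left empty: no estimate provided
    value = int(raw.strip())
    if value < 0:
        raise ValueError("specimen count cannot be negative")
    return value
```

With this approach an explicit "0" is still stored as zero, while an empty field stays unset.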

GRSciColl

All 9 comments

I agree that this field should be optional. I don't know whether this is an API or a UI issue. @marcos-lg, does the API save 0 for specimen counts by default? Could that be changed?

But the specimen count doesn't have to be the number of specimens published on GBIF. In fact, I think it is very useful to be able to advertise undigitized specimens. And the difference between the estimated number of specimens and the actual number of records available on GBIF can be a good tool to prioritise data mobilisation efforts.

> I agree that this field should be optional. I don't know if this is an API or UI thing, @marcos-lg does the API save 0 for specimen counts by default? Could it be changed?

I'll change it to be optional.

I am not sure about this. For me, this is an estimate of the total number of specimens held by an institution. The number at the upper right is the number of specimen records that have been mobilized to GBIF. The difference between the estimated total and the realised number of digital specimen records at GBIF can drive future digitisation on demand.
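
As a rough sketch of the comparison described here, the two numbers could be combined as follows. The function names are hypothetical and the figures in the usage note are purely illustrative; this assumes the estimate may be missing once the field is optional.

```python
from typing import Optional


def digitisation_progress(estimated_total: Optional[int], gbif_records: int) -> Optional[float]:
    """Share of the estimated holdings that has records on GBIF.

    Returns None when no usable estimate is available. The result can exceed
    1.0 if more records are published than the estimate accounts for.
    """
    if estimated_total is None or estimated_total <= 0:
        return None
    return gbif_records / estimated_total


def undigitised_backlog(estimated_total: Optional[int], gbif_records: int) -> Optional[int]:
    """Estimated number of specimens not yet mobilised to GBIF (never negative)."""
    if estimated_total is None:
        return None
    return max(estimated_total - gbif_records, 0)
```

For an institution estimating 1000 specimens with 250 records on GBIF, `digitisation_progress(1000, 250)` gives 0.25 and `undigitised_backlog(1000, 250)` gives 750.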

Yes, good point. But then perhaps it's OK for it to be optional?

> But the specimen count doesn't have to be the number of specimens published on GBIF. In fact, I think it is very useful to be able to advertise undigitized specimens. And the difference between the estimated number of specimen and the actual number of records available on GBIF can be a good tool to prioritise data mobilisation efforts.

Greetings all, I would also consult with the TDWG CD convenors here (Matt Woodburn, Janeen Jones, Sharon Grant, Kate Webbink). What @ManonGros writes here is exactly what we are trying to support with these standards.

  1. We need "denominators." We need these numbers so that we can do data visualizations as well as programmatic calculations.
  2. We clearly need counts of physical objects (for a denominator) and counts of digital objects.
  3. With the digital objects, preferably there's a way to distinguish between individual objects and lots (such as for wet collections).
  4. With these data, in the appropriate standard buckets, the CD group has a data model that shows what calculations are possible, based on what is or is not provided by the data user.
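
To make the points above concrete, here is a small sketch of how the available calculations could depend on which counts are supplied. This is not the CD group's data model; the class, field names, and result keys are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CollectionCounts:
    """Hypothetical container; every count is optional."""
    physical_objects: Optional[int] = None  # estimated physical holdings (the denominator)
    digital_objects: Optional[int] = None   # digital records for individual objects
    digital_lots: Optional[int] = None      # digital records that each represent a lot


def possible_calculations(c: CollectionCounts) -> Dict[str, float]:
    """Return only the metrics that the supplied counts allow."""
    results: Dict[str, float] = {}
    provided = [n for n in (c.digital_objects, c.digital_lots) if n is not None]
    if provided:
        results["digital_records_total"] = float(sum(provided))
    # A digitised share needs a non-zero denominator of physical objects.
    if c.physical_objects and c.digital_objects is not None:
        results["individual_digitised_share"] = c.digital_objects / c.physical_objects
    # The lot share needs both kinds of digital record to be reported.
    if c.digital_objects is not None and c.digital_lots is not None:
        total = c.digital_objects + c.digital_lots
        if total > 0:
            results["lot_share_of_records"] = c.digital_lots / total
    return results
```

If no counts are supplied, the function returns an empty dictionary, which mirrors the idea that missing inputs simply reduce the set of possible calculations.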

@timrobertson100 has seen the above data models ... and may be able to offer more insights.

It is exciting to see this conversation taking place -- and noting that we are moving forward with our ability to better share these data so that we can better understand what we have, what makes us each unique, and what we need.

Hi @debpaul, integrating the TDWG CDs is part of our long-term plans for GRSciColl! Although we are not there yet, we are certainly keeping an eye on the evolution of the standard.

Does it actually make more sense for the number of specimens to be recorded per GRSciColl collection, rather than per institution?

Just deployed to PROD the change to make the number of specimens optional.

@rukayaj, it would make sense, and I believe that this is being taken into account in the TDWG CDs. I wouldn't want to start fiddling with the current model given that we will want to integrate the TDWG CDs anyway. I think that this will be incorporated at that point.
