Import volumes from Pangaea #95

Open
mzur opened this issue Jun 27, 2017 · 8 comments

Comments

@mzur
Member

mzur commented Jun 27, 2017

We could provide a function to easily import volumes from Pangaea as remote volumes. These volumes can keep a reference to their source and to all the metadata stored in Pangaea (as we don't want to store all that in BIIGLE ourselves).

@mzur mzur added idea labels Jun 27, 2017
@mzur
Member Author

mzur commented Sep 8, 2017

We noticed that some datasets in Pangaea can easily be used as a remote volume. Take this one for example. You can download a CSV with all image filenames and the volume URL. There is even location data for each image. Another example is this, where the URL is also usable for a remote volume. These images are loaded from tape, so the first request is redirected to a "please wait" page before the download starts. This works automatically, too: when I request an image with cURL, I get the HTML response first. If I wait a few seconds and request the same URL again, I get the image file.
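
The cURL behavior described above could be handled by a small polling loop. The following is only a sketch under assumptions: the function name, the `fetch` callable (returning a content type and a body), the wait time and the attempt limit are all hypothetical, and it assumes that any non-image response is the "please wait" page.

```python
import time
from typing import Callable, Tuple

# A fetcher returns (content_type, body). It is injected so the sketch needs no network.
Fetcher = Callable[[str], Tuple[str, bytes]]


def fetch_image_when_ready(
    url: str,
    fetch: Fetcher,
    wait_seconds: float = 5.0,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Request `url` repeatedly until the response is an image, not the HTML wait page."""
    for attempt in range(1, max_attempts + 1):
        content_type, body = fetch(url)
        if content_type.lower().startswith("image/"):
            return body
        # Assumption: anything that is not an image is the "please wait" page.
        if attempt < max_attempts:
            sleep(wait_seconds)
    raise TimeoutError(f"No image after {max_attempts} attempts: {url}")
```

In real use, `fetch` would wrap an HTTP client call. The attempt limit keeps a file that never arrives from blocking the caller forever.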

We can probably implement a dialog where users can create new volumes from Pangaea. They only have to insert the dataset URL and BIIGLE does the rest.
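
As a rough illustration of what "BIIGLE does the rest" might involve, the sketch below turns a CSV export into a remote volume description. Everything here is an assumption: the column names (`filename`, `url`, `latitude`, `longitude`), the `RemoteVolume` structure and the function name are hypothetical and do not reflect the actual Pangaea export format or the BIIGLE data model.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RemoteVolume:
    url: str     # base URL shared by all images
    source: str  # dataset URL entered by the user, kept as a reference
    filenames: List[str] = field(default_factory=list)
    # filename -> (latitude, longitude), only for rows that have both values
    locations: Dict[str, Tuple[float, float]] = field(default_factory=dict)


def volume_from_csv(csv_text: str, dataset_url: str) -> RemoteVolume:
    """Build a remote volume from a CSV export. Column names are hypothetical."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("The CSV contains no image rows")
    # Assumption: every row carries the same base URL in a "url" column.
    volume = RemoteVolume(url=rows[0]["url"].rstrip("/"), source=dataset_url)
    for row in rows:
        name = row["filename"]
        volume.filenames.append(name)
        lat, lng = row.get("latitude"), row.get("longitude")
        if lat and lng:
            volume.locations[name] = (float(lat), float(lng))
    return volume
```

Keeping the dataset URL in `source` matches the idea of volumes that reference their origin instead of copying all metadata.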

@mzur mzur removed idea labels Oct 12, 2017
@mzur
Member Author

mzur commented Nov 9, 2017

Make sure to import the DOI of the dataset as well (biigle/volumes#38).

@tschoeni

This would be a very good feature with high application potential!

@mzur
Member Author

mzur commented Oct 8, 2018

Also import these image metadata fields.

@mzur
Member Author

mzur commented Jun 3, 2019

I just sent my second message to the PANGAEA guys via their contact form. Hopefully they'll answer at some point.

@mzur
Member Author

mzur commented Jul 15, 2019

The PANGAEA people finally answered. They said that they can't change the existing behavior: the server returns a 503 status code and the client has to retry periodically until the image has been fetched from tape. If we want to make BIIGLE compatible with this, we would need to handle URLs from PANGAEA as a special case, both in the (video) annotation tool and in the file cache package.
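
Treating PANGAEA URLs as a special case first requires recognizing them. One possible check is sketched below. It assumes that the relevant files are served from `pangaea.de` or one of its subdomains, which is not confirmed in this thread, and the function name is hypothetical.

```python
from urllib.parse import urlparse

# Assumption: PANGAEA file URLs live on pangaea.de or one of its subdomains.
PANGAEA_DOMAIN = "pangaea.de"


def is_pangaea_url(url: str) -> bool:
    """Return True if the host of `url` is the assumed PANGAEA domain or a subdomain."""
    host = (urlparse(url).hostname or "").lower()
    return host == PANGAEA_DOMAIN or host.endswith("." + PANGAEA_DOMAIN)
```

Comparing the parsed hostname instead of searching the whole URL string avoids false matches on look-alike domains or on paths that merely contain the domain name.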

@mzur
Member Author

mzur commented Dec 11, 2019

Apparently PANGAEA is not interested in becoming a central image and video repository. Continued in #207.

@mzur mzur closed this as completed Dec 11, 2019
mzur added a commit that referenced this issue Jul 9, 2020
@mzur
Member Author

mzur commented Jun 25, 2021

We had another discussion with the people of PANGAEA. The possibility of receiving a 503 response remains, but it should be possible to make BIIGLE compatible in the following places:

  • The annotation tool can show the loading animation and retry fetching the image (based on the timeout in the response header) for as long as there are 503 responses. The loading animation is only shown for the current image. Previous/next images are requested only once and will show the loading animation again when the user switches to the image. This change has to be propagated to:

    • video annotation tool
    • ananas
    • MAIA
    • ???
  • The FileCache can also retry fetching images based on the timeout in the response header. There needs to be an upper limit for the retry count/duration (ask PANGAEA staff?).

  • The create volume action also needs to handle possible 503 responses. As it uses the FileCache for the checks, it would hang for as long as there are 503 responses. Maybe it's sufficient to display a message like "Validation of your data may take a while" once the request takes more than 10 s to complete. However, this could run into the 30 s execution timeout. Maybe we should regard 503 responses as "the image exists" and just accept the volume?
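
The FileCache and create volume points above can be sketched as follows. This is not the actual FileCache code. It assumes that the "response header timeout" is the standard `Retry-After` header given in seconds, and the function names, the default delay and the 60 s limit are hypothetical placeholders (the real limit is still an open question).

```python
import time
from typing import Callable, Mapping, Optional, Tuple

# (status code, response headers, body); injected so the sketch needs no network.
Response = Tuple[int, Mapping[str, str], bytes]
Fetcher = Callable[[str], Response]


def retry_delay(headers: Mapping[str, str], default: float = 5.0) -> float:
    """Read the wait time in seconds from a Retry-After header, if present."""
    value = headers.get("Retry-After", "").strip()
    # Retry-After may also be an HTTP date; this sketch then uses the default.
    return float(value) if value.isdecimal() else default


def fetch_with_retry(
    url: str,
    fetch: Fetcher,
    max_wait: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Fetch `url`, retrying on 503 until the next delay would exceed `max_wait`.

    Returns the body on 200, or None if the file was still unavailable at the limit.
    """
    waited = 0.0
    while True:
        status, headers, body = fetch(url)
        if status == 200:
            return body
        if status != 503:
            raise RuntimeError(f"Unexpected status {status} for {url}")
        delay = retry_delay(headers)
        if waited + delay > max_wait:
            return None
        sleep(delay)
        waited += delay


def file_exists(url: str, fetch: Fetcher) -> bool:
    """Volume validation: one request only, and a 503 counts as "the image exists"."""
    status, _, _ = fetch(url)
    return status in (200, 503)
```

With `file_exists`, volume creation never waits for the tape and so stays clear of the 30 s execution timeout. The trade-off is that a missing file that happens to answer with 503 would only be noticed later.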

@mzur mzur reopened this Jun 25, 2021
@mzur mzur added the 2021 label Jun 25, 2021
@mzur mzur removed the MI2 label Feb 9, 2023
@mzur mzur removed the 2021 label Feb 17, 2023