SSH_filtering_data #141

Open
Hamsterrrrrrrrr opened this issue Aug 8, 2024 · 3 comments

@Hamsterrrrrrrrr

Dataset Name

ssh_train_aug.zarr ubm_train_aug.zarr ssh_train.zarr ssh_val.zarr ssh_test.zarr bm_train.zarr bm_val.zarr bm_test.zarr ubm_train.zarr ubm_val.zarr ubm_test.zarr

Dataset URL

https://zenodo.org/records/6574307

Description

We used the data from https://zenodo.org/records/6574307 and applied augmentation to it: we randomly generated patches in the spatial domain and applied some augmentation, which gave us a lot more data for training.
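
For readers unfamiliar with patch-based augmentation, here is a minimal sketch of what "randomly generated patches in the spatial domain" could look like. It is not the code used to produce the files above. The function name `random_patches`, the toy field, and the horizontal flip are all illustrative assumptions, since the issue does not say which augmentations were applied. Plain Python lists are used so the sketch runs without dependencies; real SSH data would more likely be handled with NumPy or xarray.

```python
import random


def random_patches(field, patch_h, patch_w, n_patches, seed=0):
    """Cut n_patches random patch_h x patch_w windows out of a 2D field.

    `field` is a list of equal-length rows. Hypothetical helper, not taken
    from the dataset's actual processing code.
    """
    rng = random.Random(seed)  # seeded so the sampling is reproducible
    n_rows, n_cols = len(field), len(field[0])
    if patch_h > n_rows or patch_w > n_cols:
        raise ValueError("patch is larger than the field")

    patches = []
    for _ in range(n_patches):
        # randint is inclusive on both ends, so the window always fits
        top = rng.randint(0, n_rows - patch_h)
        left = rng.randint(0, n_cols - patch_w)
        patch = [row[left:left + patch_w] for row in field[top:top + patch_h]]

        # Assumed augmentation: flip the patch left-to-right half of the time
        if rng.random() < 0.5:
            patch = [row[::-1] for row in patch]
        patches.append(patch)
    return patches


# Toy 6 x 8 field standing in for one SSH snapshot
field = [[10 * i + j for j in range(8)] for i in range(6)]
patches = random_patches(field, patch_h=3, patch_w=4, n_patches=5, seed=42)
print(len(patches), "patches of shape", len(patches[0]), "x", len(patches[0][0]))
```

Sampling many overlapping windows from each snapshot is one way a training set can grow well beyond the size of the raw data.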

Size

100G, it is split into many files

License

Unknown

Data Format

Zarr

Data Format (other)

No response

Access protocol

HTTP(S)

Source File Organization

No response

Example URLs

No response

Authorization

No; data are fully public

Transformation / Processing

No response

Target Format

Zarr

Comments

No response

@jbusecke
Contributor

jbusecke commented Aug 8, 2024

Hi @Hamsterrrrrrrrr, thanks for submitting a dataset request!

I think I need some further clarification on what exactly to ingest here.
Following the link you sent I see two files:
[image]

Do you want these to be converted to zarr in the cloud? Or are there more as you indicated with:

> 100G, it is split into many files

Happy to work on this once we are clear on the details.

@Hamsterrrrrrrrr
Author

Hi Jbusecke,

The data link contains the raw data, on which I did some processing. The data files uploaded are the processed data, not the raw data. I have already converted them into .zarr before uploading. I hope this clarifies things.

Best regards,
Yue

@jbusecke
Contributor

Hi @Hamsterrrrrrrrr,

> The data link contains the raw data

Is the raw data at https://zenodo.org/records/6574307?

> , on which I did some processing.

What sort of processing did you do, and where is the output located?

We currently use the dataset ingestion to get officially archived/published datasets into an analysis-ready, cloud-optimized format (e.g. zarr). If this data was processed by you, we need to figure out how to host/ingest it, see the note box here. Maybe we should schedule a call to discuss the details?
