read goes-r data from https aws, gcp or azure (it uses h5netcdf) #1424

Contributor

@jhbravo jhbravo commented Nov 5, 2020

@djhoese, this is the second method. It uses h5netcdf and I think it is the better option, but it raises an error that I have tried to track down without success.
It is adapted from this: pydata/xarray#1075
When I load the file directly over https using xarray it works fine, but in satpy's "abi_base.py" it gets stuck.
The advantage of this method is that you can also read directly from S3 using boto3: pydata/xarray#1075 (comment)

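import io
import xarray as xr
# s3_object is assumed to be the response of a boto3 S3 client's get_object(...) call
netcdf_bytes = s3_object['Body'].read()
netcdf_bytes_io = io.BytesIO(netcdf_bytes)  # wrap the raw bytes in a file-like object
ds = xr.open_dataset(netcdf_bytes_io)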

and from Google Cloud Storage:

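import io
import xarray as xr
from google.cloud import storage

storage_client = storage.Client.create_anonymous_client()  # assumption: anonymous access to the public bucket
bucket = storage_client.bucket('gcp-public-data-goes-16')
objt = 'ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C13_G16_s20202822056149_e20202822058534_c20202822059042.nc'
blob = bucket.get_blob(objt)
netcdf_bytes_io = io.BytesIO(blob.download_as_string())  # whole object is read into memory
ds = xr.open_dataset(netcdf_bytes_io)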

I tried the latter in Google Colab and it worked.

I ran the test with this data on my local machine:

from satpy import Scene
from satpy.utils import debug_on
debug_on()

list_files = ['https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C01_G16_s20202822031149_e20202822033522_c20202822033567.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C02_G16_s20202822031149_e20202822033522_c20202822033556.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C03_G16_s20202822031149_e20202822033522_c20202822033572.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C04_G16_s20202822031149_e20202822033522_c20202822034046.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C05_G16_s20202822031149_e20202822033522_c20202822034051.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C06_G16_s20202822031149_e20202822033528_c20202822034065.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C07_G16_s20202822031149_e20202822033534_c20202822033585.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C08_G16_s20202822031149_e20202822033522_c20202822034031.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C09_G16_s20202822031149_e20202822033528_c20202822034008.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C10_G16_s20202822031149_e20202822033534_c20202822033595.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C11_G16_s20202822031149_e20202822033522_c20202822033577.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C12_G16_s20202822031149_e20202822033528_c20202822034057.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C13_G16_s20202822031149_e20202822033534_c20202822034038.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C14_G16_s20202822031149_e20202822033522_c20202822034014.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C15_G16_s20202822031149_e20202822033528_c20202822034022.nc',
 'https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/OR_ABI-L1b-RadC-M6C16_G16_s20202822031149_e20202822033534_c20202822034001.nc']

gscn = Scene(reader="abi_l1b", filenames=list_files,)
gscn.load(['C{:02d}'.format(13)])
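
The test above passes all 16 channel URLs but loads only C13. As a minimal sketch, the list could be narrowed to the channels actually needed before creating the Scene. The helper names `channel_of` and `select_channels` are hypothetical, and the regular expression is an assumption based only on the file names shown in this thread.

```python
import re

# Assumed pattern, derived from names like
# OR_ABI-L1b-RadC-M6C13_G16_s..._e..._c....nc
_ABI_NAME = re.compile(r"OR_ABI-L1b-Rad[CFM]\d?-M\dC(?P<channel>\d{2})_G(?P<sat>\d{2})_")


def channel_of(url):
    """Return the ABI channel number encoded in a file name or URL."""
    match = _ABI_NAME.search(url)
    if match is None:
        raise ValueError(f"not an ABI L1b file name: {url!r}")
    return int(match.group("channel"))


def select_channels(urls, channels):
    """Keep only the URLs whose channel number is in `channels`."""
    wanted = set(channels)
    return [url for url in urls if channel_of(url) in wanted]


base = "https://gcp-public-data-goes-16.storage.googleapis.com/ABI-L1b-RadC/2020/282/20/"
url_c01 = base + "OR_ABI-L1b-RadC-M6C01_G16_s20202822031149_e20202822033522_c20202822033567.nc"
url_c13 = base + "OR_ABI-L1b-RadC-M6C13_G16_s20202822031149_e20202822033534_c20202822034038.nc"
selected = select_channels([url_c01, url_c13], [13])
```

With this, `Scene(reader="abi_l1b", filenames=select_channels(list_files, [13]))` would open a single remote file instead of sixteen.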

And this is the error I get when reading the data:

/home/jhbravo/Software/miniconda3/envs/pytroll/lib/python3.8/site-packages/pyproj/crs/crs.py:543: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. See: https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems
proj_string = self.to_proj4()
[DEBUG: 2020-11-04 20:04:51 : satpy.readers.abi_l1b] Reading in get_dataset C13.
[DEBUG: 2020-11-04 20:04:51 : satpy.readers.abi_l1b] Calibrating to brightness temperatures
[ERROR: 2020-11-04 20:04:51 : satpy.readers.yaml_reader] Could not load dataset 'DataID(name='C13', wavelength=WavelengthRange(min=10.1, central=10.35, max=10.6, unit='µm'), resolution=2000, calibration=<calibration.brightness_temperature>, modifiers=())': dimensions () must have the same length as the number of data dimensions, ndim=1
Traceback (most recent call last):
File "/home/jhbravo/Software/miniconda3/envs/pytroll/lib/python3.8/site-packages/satpy/readers/yaml_reader.py", line 828, in _load_dataset_with_area
ds = self._load_dataset_data(file_handlers, dsid, **kwargs)
File "/home/jhbravo/Software/miniconda3/envs/pytroll/lib/python3.8/site-packages/satpy/readers/yaml_reader.py", line 711, in _load_dataset_data
proj = self._load_dataset(dsid, ds_info, file_handlers, **kwargs)
File "/home/jhbravo/Software/miniconda3/envs/pytroll/lib/python3.8/site-packages/satpy/readers/yaml_reader.py", line 687, in _load_dataset
projectable = fh.get_dataset(dsid, ds_info)
File "/home/jhbravo/Software/miniconda3/envs/pytroll/lib/python3.8/site-packages/satpy/readers/abi_l1b.py", line 48, in get_dataset
res = self._ir_calibrate(radiances)
File "/home/jhbravo/Software/miniconda3/envs/pytroll/lib/python3.8/site-packages/satpy/readers/abi_l1b.py", line 112, in _ir_calibrate
fk1 = float(self["planck_fk1"])
File "/home/jhbravo/Software/miniconda3/envs/pytroll/lib/python3.8/site-packages/satpy/readers/abi_base.py", line 117, in __getitem__
data = data.where(data != fill, new_fill)
File "/home/jhbravo/Software/miniconda3/envs/pytroll/lib/python3.8/site-packages/xarray/core/dataarray.py", line 2765, in func
f(self.variable, other_variable)
File "/home/jhbravo/Software/miniconda3/envs/pytroll/lib/python3.8/site-packages/xarray/core/nputils.py", line 79, in array_ne
return _ensure_bool_is_ndarray(self != other, self, other)
File "/home/jhbravo/Software/miniconda3/envs/pytroll/lib/python3.8/site-packages/xarray/core/variable.py", line 2134, in func
result = Variable(dims, new_data, attrs=attrs)
File "/home/jhbravo/Software/miniconda3/envs/pytroll/lib/python3.8/site-packages/xarray/core/variable.py", line 327, in __init__
self._dims = self._parse_dimensions(dims)
File "/home/jhbravo/Software/miniconda3/envs/pytroll/lib/python3.8/site-packages/xarray/core/variable.py", line 559, in _parse_dimensions
raise ValueError(
ValueError: dimensions () must have the same length as the number of data dimensions, ndim=1
[WARNING: 2020-11-04 20:04:51 : satpy.scene] The following datasets were not created and may require resampling to be generated: DataID(name='C13', wavelength=WavelengthRange(min=10.1, central=10.35, max=10.6, unit='µm'), resolution=2000, calibration=<calibration.brightness_temperature>, modifiers=())
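
One possible cause of the ValueError above, which is not confirmed anywhere in this thread, is that the fill value arrives as a 1-element array instead of a scalar when the file is opened through h5netcdf. Comparing a scalar variable such as planck_fk1 against a 1-element array yields a 1-D result, which no longer fits a variable with no dimensions. The sketch below shows the shape mismatch with NumPy only; `as_scalar` is a hypothetical helper, not part of satpy or xarray.

```python
import numpy as np


def as_scalar(value):
    """Collapse a 1-element array (or a plain scalar) to a NumPy scalar."""
    arr = np.asarray(value)
    if arr.size != 1:
        raise ValueError(f"expected a single value, got shape {arr.shape}")
    return arr.reshape(())[()]


data = np.float32(1.5)                       # stands in for a scalar variable like planck_fk1
fill = np.array([-999.0], dtype=np.float32)  # assumption: attribute read back as a 1-element array

raw_mask = data != fill               # broadcasts to shape (1,)
clean_mask = data != as_scalar(fill)  # stays 0-dimensional

print(np.ndim(raw_mask), np.ndim(clean_mask))
```

If this is indeed the cause, normalising the fill value before the `data.where(data != fill, new_fill)` call in abi_base.py would be one place to try a fix.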
  • Closes #xxxx
  • Tests added
  • Passes flake8 satpy
  • Fully documented
  • Add your name to AUTHORS.md if not there already

@ghost

ghost commented Nov 5, 2020

DeepCode's analysis on #b2cc26 found:

  • ℹ️ 1 minor issue. 👇

Top issues

Description: standard import "import io" should be placed before "import numpy as np"


@codecov

codecov bot commented Nov 5, 2020

Codecov Report

Merging #1424 into master will decrease coverage by 0.00%.
The diff coverage is 66.66%.


@@            Coverage Diff             @@
##           master    #1424      +/-   ##
==========================================
- Coverage   90.58%   90.58%   -0.01%     
==========================================
  Files         236      236              
  Lines       33797    33804       +7     
==========================================
+ Hits        30615    30620       +5     
- Misses       3182     3184       +2     
Impacted Files Coverage Δ
satpy/readers/abi_base.py 91.48% <66.66%> (-1.05%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9ccddbf...b2cc261.

@coveralls

Coverage Status

Coverage decreased (-0.03%) to 90.551% when pulling b2cc261 on jhbravo:read_https_2 into 9ccddbf on pytroll:master.

@mraspaud
Member

@djhoese any opinion on this?

@djhoese
Member

djhoese commented Dec 16, 2020

@mraspaud Does the base reader support your FSFile stuff yet? I'd say this PR is an alternative that is specific to this reader so I'd prefer a documented example of how to use the FSFile stuff rather than this.

@mraspaud
Member

Yes, it does.
@jhbravo do you mind checking if the following solution would work for you?
https://github.com/pytroll/satpy/blob/master/satpy/readers/__init__.py#L550-L561
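
For readers following along, here is a rough sketch of what the FSFile approach might look like, assuming the linked lines refer to the `FSFile` helper in `satpy.readers` and that fsspec and s3fs are installed. The bucket name "noaa-goes16" and the helpers `goes_glob` and `open_remote_scene` are assumptions for illustration; none of them appear in this thread.

```python
def goes_glob(bucket, product, year, day_of_year, hour, pattern="*"):
    """Build an object-store glob of the form bucket/product/YYYY/DDD/HH/pattern."""
    return f"{bucket}/{product}/{year:04d}/{day_of_year:03d}/{hour:02d}/{pattern}"


def open_remote_scene(glob_path, reader="abi_l1b"):
    """Open matching S3 objects through fsspec and pass them to satpy as FSFile objects."""
    # Imported lazily so the path helper above works without satpy, fsspec or s3fs.
    import fsspec
    from satpy import Scene
    from satpy.readers import FSFile

    # "simplecache::" keeps a local copy of each object; anon=True asks for anonymous access.
    open_files = fsspec.open_files("simplecache::s3://" + glob_path, s3={"anon": True})
    return Scene(filenames=[FSFile(open_file) for open_file in open_files], reader=reader)


# Assumed public AWS bucket name; the date path mirrors the one used in the test above.
path = goes_glob("noaa-goes16", "ABI-L1b-RadC", 2020, 282, 20, "*M6C13_G16_s2020282203*")
```

Calling `scn = open_remote_scene(path)` followed by `scn.load(["C13"])` needs network access, so it is left out of the block. Compared with the changes in this PR, this route keeps the remote-access logic outside the ABI reader.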
