SHIFT Data Quickstart Guide

This a quickstart guide for working with SHIFT data on the SMCE. The guide covers how to

  • Read in data with the SHIFT Python Utilities Library

  • Orthorectify data

  • Clip data with a shapefile

  • Write data to disk

[1]:
import sys
sys.path.append('/efs/SHIFT-Python-Utilities/')
from shift_python_utilities.intake_shift import shift_catalog
import rioxarray as rxr
import rasterio as rio
import geopandas as gpd
from shapely.geometry import Polygon

# Intialize an instance of the catalog
cat = shift_catalog()

Working with SHIFT Gridded Data

Read a shapefile using the Geopandas library

[2]:
geodf = gpd.read_file("/efs/edlang1/SHIFT-Python-Utilities/shift_python_utilities/tests/test_data/quick_start_shp/quick_start_shp.shp")
geodf
[2]:
FID geometry
0 0 POLYGON ((-120.49103 34.49217, -120.48940 34.4...
1 1 POLYGON ((-120.48611 34.49070, -120.48595 34.4...
2 2 POLYGON ((-120.48925 34.48861, -120.48823 34.4...

Read in the gridded data using the shift python utilities library and assign the appropiate CRS

[3]:
ds = cat.aviris_v1_gridded.read_chunked()

# assign the crs from the metadata to the xarray dataset
ds.rio.write_crs(rio.CRS.from_wkt(",".join(ds.attrs['coordinate system string'])), inplace=True)
ds
[3]:
<xarray.Dataset>
Dimensions:      (time: 13, y: 12023, wavelength: 425, x: 13739)
Coordinates:
    spatial_ref  int64 0
  * time         (time) datetime64[us] 2022-02-24 2022-02-28 ... 2022-05-29
  * wavelength   (wavelength) float32 377.2 382.2 387.2 ... 2.496e+03 2.501e+03
  * x            (x) float64 7.177e+05 7.177e+05 ... 7.864e+05 7.864e+05
  * y            (y) float64 3.866e+06 3.866e+06 ... 3.806e+06 3.806e+06
Data variables:
    reflectance  (time, y, wavelength, x) float32 dask.array<chunksize=(1, 1, 425, 13739), meta=np.ndarray>
Attributes: (12/13)
    description:               flight_products/20220224/box_mosaics/box_rfl_p...
    samples:                   13739
    lines:                     12023
    bands:                     425
    header offset:             0
    file type:                 ENVI Standard
    ...                        ...
    interleave:                bil
    byte order:                0
    map info:                  ['UTM', '1', '1', '717720', '3865865', '5', '5...
    coordinate system string:  ['PROJCS["WGS_1984_UTM_Zone_10N"', 'GEOGCS["GC...
    wavelength:                ['377.1956495', '382.20564950000005', '387.215...
    fwhm:                      ['5.57', '5.58', '5.58', '5.58', '5.5900000000...

Clip the data using the Geopandas dataframe. Make sure the dataframe and the gridded data have the same CRS.

[4]:
clipped = ds.rio.clip(geodf.to_crs(ds.rio.crs).geometry.values, all_touched=True)
clipped
[4]:
<xarray.Dataset>
Dimensions:      (y: 98, x: 97, time: 13, wavelength: 425)
Coordinates:
  * y            (y) float64 3.82e+06 3.82e+06 3.82e+06 ... 3.819e+06 3.819e+06
  * x            (x) float64 7.304e+05 7.304e+05 ... 7.309e+05 7.309e+05
  * time         (time) datetime64[us] 2022-02-24 2022-02-28 ... 2022-05-29
  * wavelength   (wavelength) float32 377.2 382.2 387.2 ... 2.496e+03 2.501e+03
    spatial_ref  int64 0
Data variables:
    reflectance  (time, y, wavelength, x) float32 dask.array<chunksize=(13, 98, 425, 97), meta=np.ndarray>
Attributes: (12/13)
    description:               flight_products/20220224/box_mosaics/box_rfl_p...
    samples:                   13739
    lines:                     12023
    bands:                     425
    header offset:             0
    file type:                 ENVI Standard
    ...                        ...
    interleave:                bil
    byte order:                0
    map info:                  ['UTM', '1', '1', '717720', '3865865', '5', '5...
    coordinate system string:  ['PROJCS["WGS_1984_UTM_Zone_10N"', 'GEOGCS["GC...
    wavelength:                ['377.1956495', '382.20564950000005', '387.215...
    fwhm:                      ['5.57', '5.58', '5.58', '5.58', '5.5900000000...

Write the result as a GeoTIFF

To make the data compatable with rioxarray’s to raster function you must

  • Reduce the dimensionality so the data being written is 2D or 3D. In this case I am reducing the dimensionality along the time dimension by writting a file for each date

  • Select the data variable you would like to write (reflectance)

  • Transpose the data to the dimensional ordering rioxarray requires (band, y_dim, x_dim)

[6]:
# Only 2D and 3D data can be written so here we select time to reduce the dimensionality
clipped.sel(time='2022-02-24').reflectance.transpose('wavelength', 'y', 'x').rio.to_raster('outpath_2022_02_24.tif', driver="GTIFF")
clipped.sel(time='2022-05-29').reflectance.transpose('wavelength', 'y', 'x').rio.to_raster('outpath_2022_05_29.tif', driver="GTIFF")

Working with the Raw SHIFT Data

Create a Geopandas Dataframe from coordinates, or read a shapefile using the Geopandas library. Verify your shapefile is using the appropriate CRS

[21]:
shp = Polygon([
    (-119.8853015 , 34.42277795),
    (-119.86975941, 34.42312643),
    (-119.86921817, 34.4066284 ),
    (-119.88476322, 34.40623869),
    (-119.8853015 , 34.42277795)]
)
geodf = gpd.GeoDataFrame(geometry=[shp], crs=4326)
geodf = geodf.to_crs(geodf.estimate_utm_crs())
geodf
[21]:
geometry
0 POLYGON ((234835.191 3812810.517, 236265.070 3...

Using the shift python utilities library you can pass you shapefile data along with the date and time of the flight and retrieve the data for you area of interest

[22]:
ds = cat.L2a(date=20220224, time=200332, ortho=True, subset=geodf ).read_chunked()
ds
[22]:
<xarray.Dataset>
Dimensions:      (lat: 383, lon: 298, wavelength: 425)
Coordinates:
  * lat          (lat) float64 3.813e+06 3.813e+06 ... 3.811e+06 3.811e+06
  * lon          (lon) float64 2.348e+05 2.348e+05 ... 2.363e+05 2.363e+05
  * wavelength   (wavelength) float64 377.2 382.2 387.2 ... 2.496e+03 2.501e+03
    spatial_ref  int64 0
Data variables:
    reflectance  (lat, lon, wavelength) float32 -0.007509 -4.117e-05 ... nan nan
    elevation    (lat, lon) float32 -23.59 -23.19 -22.79 -22.54 ... nan nan nan
Attributes: (12/14)
    description:                 AVIRIS-NG Measured Radiances in uW nm-1 cm-2...
    lines:                     2112
    samples:                   6135
    is_tiled :                 0
    bands:                     425
    interleave:                bil
    ...                        ...
    res:                       (4.8, 4.8)
    map_info:                  UTM, 1, 1, 224764.8, 3815937.6, 4.8, 4.8, 11, ...
    coordinate_system_string:  PROJCS["WGS 84 / UTM zone 11N",GEOGCS["WGS 84"...
    nodatavals:                (nan, nan, nan, nan, nan, nan, nan, nan, nan, ...
    descriptions:              [ 377.1956495  382.2056495  387.2156495  392.2...
    scales:                    (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, ...

Write the data as a GeoTIFF

[27]:
ds.reflectance.transpose('wavelength', 'lat', 'lon').rio.to_raster(raster_path="outpath.tif",  driver="GTIFF")