{ "cells": [ { "cell_type": "markdown", "id": "baad4598-9d74-4f43-a083-7bdb94e6bb58", "metadata": {}, "source": [ "# Aligning Data with Different Resolutions" ] }, { "cell_type": "markdown", "id": "180991a4-ef3d-4ceb-b6b9-e56a22ed59fd", "metadata": {}, "source": [ "This example demonstrates how to align data from different sources (EMIT, HLS, SHIFT) for analysis. Using rioxarray we can reproject the different data sources to the same CRS and shape or resolution.\n", "\n", "This example uses code from the [EMIT-Data-Resources repository](https://github.com/nasa/EMIT-Data-Resources)." ] }, { "cell_type": "code", "execution_count": 2, "id": "792350be-ddde-4b2b-8409-686f99729b45", "metadata": {}, "outputs": [], "source": [ "import requests\n", "import s3fs\n", "import sys\n", "import rasterio as rio\n", "import xarray as xr\n", "sys.path.append('/efs/edlang1/EMIT-Data-Resources/modules/')\n", "from emit_tools import emit_xarray\n", "sys.path.append('/efs/SHIFT-Python-Utilities/')\n", "from shift_python_utilities.intake_shift import shift_catalog\n", "import geopandas as gpd\n", "from shapely.geometry import Polygon\n", "\n", "\n", "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "id": "45b3aba7-e8e7-44d6-b79a-2874402dd869", "metadata": {}, "source": [ "Generate temporary credentials for Earth Data Search S3 access" ] }, { "cell_type": "code", "execution_count": 3, "id": "32f20580-e7be-4947-9762-3b43dc6b9590", "metadata": {}, "outputs": [], "source": [ "s3_cred_endpoint = {\n", "'podaac':'https://archive.podaac.earthdata.nasa.gov/s3credentials',\n", "'gesdisc': 'https://data.gesdisc.earthdata.nasa.gov/s3credentials',\n", "'lpdaac':'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials',\n", "'ornldaac': 'https://data.ornldaac.earthdata.nasa.gov/s3credentials',\n", "'ghrcdaac': 'https://data.ghrc.earthdata.nasa.gov/s3credentials'\n", "}\n", "\n", "\n", "def get_temp_creds(provider):\n", " return requests.get(s3_cred_endpoint[provider]).json()\n", "\n", "temp_creds_req = get_temp_creds('lpdaac')\n", "\n", "fs_s3 = s3fs.S3FileSystem(anon=False,\n", " key=temp_creds_req['accessKeyId'],\n", " secret=temp_creds_req['secretAccessKey'],\n", " token=temp_creds_req['sessionToken'])" ] }, { "cell_type": "markdown", "id": "76d0a144-c1f2-46e8-b1cd-5d19e00fc0a3", "metadata": {}, "source": [ "Create a Geodataframe with a polygon" ] }, { "cell_type": "code", "execution_count": 4, "id": "c3b93e50-1e03-4c12-b8d1-9ecbd5143ede", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
geometry
0POLYGON ((-120.44686 34.44239, -120.44686 34.4...
\n", "
" ], "text/plain": [ " geometry\n", "0 POLYGON ((-120.44686 34.44239, -120.44686 34.4..." ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "shp = Polygon([(-120.44686132950059, 34.44238828271541),\n", " (-120.44686132950059, 34.48046721892582),\n", " (-120.41425043549059, 34.48046721892582), \n", " (-120.41425043549059, 34.44238828271541)])\n", "\n", "geodf = gpd.GeoDataFrame(geometry=[shp], crs=4326)\n", "geodf" ] }, { "cell_type": "markdown", "id": "94cf8ae7-7a3c-4197-a021-c0a1058665ea", "metadata": {}, "source": [ "- Using Earth Data Search find the S3 link associated with the imagery of interest\n", "- Retrieve orthorectified the data using fsspec and the emit_xarray module and reorder the dimensions to (band, y, x)\n", " - Note: If you are getting an error related to not being able to identify the file type you may need to update the version of xarray you are using. The EMIT documentation recommends xarray 2022.12.0 or newer. In python running\n", " - `xr.__version__` will give you your xarray version\n", " - running `pip install xarray==2022.12.0` will install the minimum required version\n", " - Another work around is to rerun the cell and see if it works the second time. This has been reported to work. \n", " \n", "- Clip the orthorectified data using the Geodataframe" ] }, { "cell_type": "code", "execution_count": 6, "id": "72455841-430d-490a-ab28-5ee070e50798", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset>\n",
       "Dimensions:           (latitude: 71, longitude: 61, wavelengths: 285)\n",
       "Coordinates:\n",
       "  * latitude          (latitude) float64 34.48 34.48 34.48 ... 34.44 34.44 34.44\n",
       "  * longitude         (longitude) float64 -120.4 -120.4 -120.4 ... -120.4 -120.4\n",
       "    fwhm              (wavelengths) float32 8.415 8.415 8.415 ... 8.807 8.809\n",
       "    good_wavelengths  (wavelengths) float32 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0\n",
       "  * wavelengths       (wavelengths) float32 381.0 388.4 ... 2.486e+03 2.493e+03\n",
       "    spatial_ref       int64 0\n",
       "Data variables:\n",
       "    reflectance       (wavelengths, latitude, longitude) float32 0.0259 ... 0...\n",
       "Attributes: (12/38)\n",
       "    ncei_template_version:             NCEI_NetCDF_Swath_Template_v2.0\n",
       "    summary:                           The Earth Surface Mineral Dust Source ...\n",
       "    keywords:                          Imaging Spectroscopy, minerals, EMIT, ...\n",
       "    Conventions:                       CF-1.63\n",
       "    sensor:                            EMIT (Earth Surface Mineral Dust Sourc...\n",
       "    instrument:                        EMIT\n",
       "    ...                                ...\n",
       "    southernmost_latitude:             33.98409945295017\n",
       "    spatialResolution:                 0.000542232520256367\n",
       "    spatial_ref:                       GEOGCS["WGS 84",DATUM["WGS_1984",SPHER...\n",
       "    geotransform:                      [-1.20995667e+02  5.42232520e-04 -0.00...\n",
       "    day_night_flag:                    Day\n",
       "    title:                             EMIT L2A Estimated Surface Reflectance...
" ], "text/plain": [ "\n", "Dimensions: (latitude: 71, longitude: 61, wavelengths: 285)\n", "Coordinates:\n", " * latitude (latitude) float64 34.48 34.48 34.48 ... 34.44 34.44 34.44\n", " * longitude (longitude) float64 -120.4 -120.4 -120.4 ... -120.4 -120.4\n", " fwhm (wavelengths) float32 8.415 8.415 8.415 ... 8.807 8.809\n", " good_wavelengths (wavelengths) float32 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0\n", " * wavelengths (wavelengths) float32 381.0 388.4 ... 2.486e+03 2.493e+03\n", " spatial_ref int64 0\n", "Data variables:\n", " reflectance (wavelengths, latitude, longitude) float32 0.0259 ... 0...\n", "Attributes: (12/38)\n", " ncei_template_version: NCEI_NetCDF_Swath_Template_v2.0\n", " summary: The Earth Surface Mineral Dust Source ...\n", " keywords: Imaging Spectroscopy, minerals, EMIT, ...\n", " Conventions: CF-1.63\n", " sensor: EMIT (Earth Surface Mineral Dust Sourc...\n", " instrument: EMIT\n", " ... ...\n", " southernmost_latitude: 33.98409945295017\n", " spatialResolution: 0.000542232520256367\n", " spatial_ref: GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHER...\n", " geotransform: [-1.20995667e+02 5.42232520e-04 -0.00...\n", " day_night_flag: Day\n", " title: EMIT L2A Estimated Surface Reflectance..." ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3_url = \"s3://lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230422T195924_2311213_002/EMIT_L2A_RFL_001_20230422T195924_2311213_002.nc\"\n", "# Open s3 url\n", "fp = fs_s3.open(s3_url, mode='rb')\n", "# Open dataset with xarray\n", "ds_emit = emit_xarray(fp, ortho=True).swap_dims({\"bands\":\"wavelengths\"}).transpose(\"wavelengths\",\"latitude\",\"longitude\")\n", "ds_emit = ds_emit.rio.clip(geodf.to_crs(ds_emit.rio.crs).geometry.values, all_touched=True)\n", "ds_emit" ] }, { "cell_type": "markdown", "id": "e195b86a-08ed-4602-aef8-ca1e84327b5d", "metadata": {}, "source": [ "- Retrieve the HLS bands of interest and concatenate them into one Xaray Dataset\n", "- Clip the data with the Geodataframe" ] }, { "cell_type": "code", "execution_count": 7, "id": "e19147d2-b504-44a3-bb98-88594007549f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset>\n",
       "Dimensions:      (y: 145, x: 104, band: 3)\n",
       "Coordinates:\n",
       "  * y            (y) float64 3.818e+06 3.818e+06 ... 3.814e+06 3.814e+06\n",
       "  * x            (x) float64 7.345e+05 7.345e+05 ... 7.375e+05 7.376e+05\n",
       "  * band         (band) int64 1 1 1\n",
       "    spatial_ref  int64 0\n",
       "Data variables:\n",
       "    band_data    (band, y, x) float32 nan nan nan nan nan ... nan nan nan nan
" ], "text/plain": [ "\n", "Dimensions: (y: 145, x: 104, band: 3)\n", "Coordinates:\n", " * y (y) float64 3.818e+06 3.818e+06 ... 3.814e+06 3.814e+06\n", " * x (x) float64 7.345e+05 7.345e+05 ... 7.375e+05 7.376e+05\n", " * band (band) int64 1 1 1\n", " spatial_ref int64 0\n", "Data variables:\n", " band_data (band, y, x) float32 nan nan nan nan nan ... nan nan nan nan" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s3_urls = [\n", " \"s3://lp-prod-protected/HLSS30.020/HLS.S30.T10SGD.2023121T183919.v2.0/HLS.S30.T10SGD.2023121T183919.v2.0.B04.tif\",\n", " \"s3://lp-prod-protected/HLSS30.020/HLS.S30.T10SGD.2023121T183919.v2.0/HLS.S30.T10SGD.2023121T183919.v2.0.B03.tif\",\n", " \"s3://lp-prod-protected/HLSS30.020/HLS.S30.T10SGD.2023121T183919.v2.0/HLS.S30.T10SGD.2023121T183919.v2.0.B02.tif\",\n", "]\n", "\n", "ds_hls = []\n", "for url in s3_urls:\n", " fp = fs_s3.open(url, mode='rb')\n", " ds_hls += [xr.open_dataset(fp, engine='rasterio')]\n", "\n", "ds_hls = xr.concat(ds_hls, 'band')\n", "ds_hls = ds_hls.rio.clip(geodf.to_crs(ds_hls.rio.crs).geometry.values, all_touched=True) \n", "ds_hls" ] }, { "cell_type": "markdown", "id": "2478b98f-71cd-4266-acfb-806b3ca26ece", "metadata": {}, "source": [ "- Using the SHIFT Python Utilites library retireve the gridded data, select a date and reorder the dimensions (band, y, x)\n", "- Assign the CRS from the metadata to the dataset\n", "- Clip the data with the Geodataframe" ] }, { "cell_type": "code", "execution_count": 8, "id": "38d9ace0-7882-4e24-a8a7-179c3f51dcd1", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset>\n",
       "Dimensions:      (y: 861, x: 622, wavelength: 425)\n",
       "Coordinates:\n",
       "  * y            (y) float64 3.818e+06 3.818e+06 ... 3.814e+06 3.814e+06\n",
       "  * x            (x) float64 7.345e+05 7.345e+05 ... 7.376e+05 7.376e+05\n",
       "    time         datetime64[ns] 2022-04-29\n",
       "  * wavelength   (wavelength) float32 377.2 382.2 387.2 ... 2.496e+03 2.501e+03\n",
       "    spatial_ref  int64 0\n",
       "Data variables:\n",
       "    reflectance  (wavelength, y, x) float32 dask.array<chunksize=(425, 1, 622), meta=np.ndarray>\n",
       "Attributes: (12/13)\n",
       "    description:               flight_products/20220224/box_mosaics/box_rfl_p...\n",
       "    samples:                   13739\n",
       "    lines:                     12023\n",
       "    bands:                     425\n",
       "    header offset:             0\n",
       "    file type:                 ENVI Standard\n",
       "    ...                        ...\n",
       "    interleave:                bil\n",
       "    byte order:                0\n",
       "    map info:                  ['UTM', '1', '1', '717720', '3865865', '5', '5...\n",
       "    coordinate system string:  ['PROJCS["WGS_1984_UTM_Zone_10N"', 'GEOGCS["GC...\n",
       "    wavelength:                ['377.1956495', '382.20564950000005', '387.215...\n",
       "    fwhm:                      ['5.57', '5.58', '5.58', '5.58', '5.5900000000...
" ], "text/plain": [ "\n", "Dimensions: (y: 861, x: 622, wavelength: 425)\n", "Coordinates:\n", " * y (y) float64 3.818e+06 3.818e+06 ... 3.814e+06 3.814e+06\n", " * x (x) float64 7.345e+05 7.345e+05 ... 7.376e+05 7.376e+05\n", " time datetime64[ns] 2022-04-29\n", " * wavelength (wavelength) float32 377.2 382.2 387.2 ... 2.496e+03 2.501e+03\n", " spatial_ref int64 0\n", "Data variables:\n", " reflectance (wavelength, y, x) float32 dask.array\n", "Attributes: (12/13)\n", " description: flight_products/20220224/box_mosaics/box_rfl_p...\n", " samples: 13739\n", " lines: 12023\n", " bands: 425\n", " header offset: 0\n", " file type: ENVI Standard\n", " ... ...\n", " interleave: bil\n", " byte order: 0\n", " map info: ['UTM', '1', '1', '717720', '3865865', '5', '5...\n", " coordinate system string: ['PROJCS[\"WGS_1984_UTM_Zone_10N\"', 'GEOGCS[\"GC...\n", " wavelength: ['377.1956495', '382.20564950000005', '387.215...\n", " fwhm: ['5.57', '5.58', '5.58', '5.58', '5.5900000000..." ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cat = shift_catalog()\n", "ds_shift = cat.aviris_v1_gridded.read_chunked().sel(time='2022-04-29').transpose('wavelength', 'y', 'x')\n", "ds_shift.rio.write_crs(rio.CRS.from_wkt(\",\".join(ds_shift.attrs['coordinate system string'])), inplace=True)\n", "ds_shift = ds_shift.rio.clip(geodf.to_crs(ds_shift.rio.crs).geometry.values, all_touched=True) \n", "ds_shift" ] }, { "cell_type": "markdown", "id": "8c799300-f91d-4486-bedc-afe555093650", "metadata": {}, "source": [ "Reproject the HLS dataset and the SHIFT data to the same CRS and resolution as the EMIT data" ] }, { "cell_type": "code", "execution_count": 9, "id": "4f0881e2-e151-42dc-8d4b-e5ef603dc64a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset>\n",
       "Dimensions:      (x: 65, y: 73, wavelength: 425)\n",
       "Coordinates:\n",
       "  * x            (x) float64 -120.4 -120.4 -120.4 ... -120.4 -120.4 -120.4\n",
       "  * y            (y) float64 34.48 34.48 34.48 34.48 ... 34.44 34.44 34.44 34.44\n",
       "  * wavelength   (wavelength) float32 377.2 382.2 387.2 ... 2.496e+03 2.501e+03\n",
       "    time         datetime64[ns] 2022-04-29\n",
       "    spatial_ref  int64 0\n",
       "Data variables:\n",
       "    reflectance  (wavelength, y, x) float32 3.403e+38 3.403e+38 ... 3.403e+38\n",
       "Attributes: (12/13)\n",
       "    description:               flight_products/20220224/box_mosaics/box_rfl_p...\n",
       "    samples:                   13739\n",
       "    lines:                     12023\n",
       "    bands:                     425\n",
       "    header offset:             0\n",
       "    file type:                 ENVI Standard\n",
       "    ...                        ...\n",
       "    interleave:                bil\n",
       "    byte order:                0\n",
       "    map info:                  ['UTM', '1', '1', '717720', '3865865', '5', '5...\n",
       "    coordinate system string:  ['PROJCS["WGS_1984_UTM_Zone_10N"', 'GEOGCS["GC...\n",
       "    wavelength:                ['377.1956495', '382.20564950000005', '387.215...\n",
       "    fwhm:                      ['5.57', '5.58', '5.58', '5.58', '5.5900000000...
" ], "text/plain": [ "\n", "Dimensions: (x: 65, y: 73, wavelength: 425)\n", "Coordinates:\n", " * x (x) float64 -120.4 -120.4 -120.4 ... -120.4 -120.4 -120.4\n", " * y (y) float64 34.48 34.48 34.48 34.48 ... 34.44 34.44 34.44 34.44\n", " * wavelength (wavelength) float32 377.2 382.2 387.2 ... 2.496e+03 2.501e+03\n", " time datetime64[ns] 2022-04-29\n", " spatial_ref int64 0\n", "Data variables:\n", " reflectance (wavelength, y, x) float32 3.403e+38 3.403e+38 ... 3.403e+38\n", "Attributes: (12/13)\n", " description: flight_products/20220224/box_mosaics/box_rfl_p...\n", " samples: 13739\n", " lines: 12023\n", " bands: 425\n", " header offset: 0\n", " file type: ENVI Standard\n", " ... ...\n", " interleave: bil\n", " byte order: 0\n", " map info: ['UTM', '1', '1', '717720', '3865865', '5', '5...\n", " coordinate system string: ['PROJCS[\"WGS_1984_UTM_Zone_10N\"', 'GEOGCS[\"GC...\n", " wavelength: ['377.1956495', '382.20564950000005', '387.215...\n", " fwhm: ['5.57', '5.58', '5.58', '5.58', '5.5900000000..." ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds_hls = ds_hls.rio.reproject(dst_crs=ds_emit.rio.crs, resolution=ds_emit.rio.transform()._scaling)\n", "ds_shift = ds_shift.rio.reproject(dst_crs=ds_emit.rio.crs, resolution=ds_emit.rio.transform()._scaling)\n", "ds_shift" ] }, { "cell_type": "markdown", "id": "92310508-fd26-418d-973f-33ab5302c1d6", "metadata": {}, "source": [ "Reproject the HLS dataset and the SHIFT data to the same CRS and shape as the EMIT data" ] }, { "cell_type": "code", "execution_count": 10, "id": "df702123-dad1-4cd2-989d-014a406b8033", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset>\n",
       "Dimensions:      (x: 71, y: 61, wavelength: 425)\n",
       "Coordinates:\n",
       "  * x            (x) float64 -120.4 -120.4 -120.4 ... -120.4 -120.4 -120.4\n",
       "  * y            (y) float64 34.48 34.48 34.48 34.48 ... 34.44 34.44 34.44 34.44\n",
       "  * wavelength   (wavelength) float32 377.2 382.2 387.2 ... 2.496e+03 2.501e+03\n",
       "    time         datetime64[ns] 2022-04-29\n",
       "    spatial_ref  int64 0\n",
       "Data variables:\n",
       "    reflectance  (wavelength, y, x) float32 3.403e+38 3.403e+38 ... 3.403e+38\n",
       "Attributes: (12/13)\n",
       "    description:               flight_products/20220224/box_mosaics/box_rfl_p...\n",
       "    samples:                   13739\n",
       "    lines:                     12023\n",
       "    bands:                     425\n",
       "    header offset:             0\n",
       "    file type:                 ENVI Standard\n",
       "    ...                        ...\n",
       "    interleave:                bil\n",
       "    byte order:                0\n",
       "    map info:                  ['UTM', '1', '1', '717720', '3865865', '5', '5...\n",
       "    coordinate system string:  ['PROJCS["WGS_1984_UTM_Zone_10N"', 'GEOGCS["GC...\n",
       "    wavelength:                ['377.1956495', '382.20564950000005', '387.215...\n",
       "    fwhm:                      ['5.57', '5.58', '5.58', '5.58', '5.5900000000...
" ], "text/plain": [ "\n", "Dimensions: (x: 71, y: 61, wavelength: 425)\n", "Coordinates:\n", " * x (x) float64 -120.4 -120.4 -120.4 ... -120.4 -120.4 -120.4\n", " * y (y) float64 34.48 34.48 34.48 34.48 ... 34.44 34.44 34.44 34.44\n", " * wavelength (wavelength) float32 377.2 382.2 387.2 ... 2.496e+03 2.501e+03\n", " time datetime64[ns] 2022-04-29\n", " spatial_ref int64 0\n", "Data variables:\n", " reflectance (wavelength, y, x) float32 3.403e+38 3.403e+38 ... 3.403e+38\n", "Attributes: (12/13)\n", " description: flight_products/20220224/box_mosaics/box_rfl_p...\n", " samples: 13739\n", " lines: 12023\n", " bands: 425\n", " header offset: 0\n", " file type: ENVI Standard\n", " ... ...\n", " interleave: bil\n", " byte order: 0\n", " map info: ['UTM', '1', '1', '717720', '3865865', '5', '5...\n", " coordinate system string: ['PROJCS[\"WGS_1984_UTM_Zone_10N\"', 'GEOGCS[\"GC...\n", " wavelength: ['377.1956495', '382.20564950000005', '387.215...\n", " fwhm: ['5.57', '5.58', '5.58', '5.58', '5.5900000000..." ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "shape = ds_emit.dims[ds_emit.rio.x_dim], ds_emit.dims[ds_emit.rio.y_dim]\n", "ds_hls = ds_hls.rio.reproject(dst_crs=ds_emit.rio.crs, shape=shape)\n", "ds_shift = ds_shift.rio.reproject(dst_crs=ds_emit.rio.crs, shape=shape)\n", "ds_shift" ] } ], "metadata": { "kernelspec": { "display_name": "shift_utils", "language": "python", "name": "shift_utils" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 5 }