This notebook is divided into two parts, each demonstrating the functionality of the VEDA EOAPI:

1. Reading and visualizing one of the datasets from the VEDA data catalog
2. Using the EOAPI to generate a time series of OMI (Ozone Monitoring Instrument) NO₂ and SO₂ datasets
Author: Slesa Adhikari
1. Reading and visualizing one of the datasets from the VEDA data catalog
Import all the necessary libraries.
Make sure you install these first using:
pip install pystac_client folium seaborn pandas

# imports
import requests
from pystac_client import Client
import folium
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

Define the API endpoints
The EOAPI is a combination of two components:

Data catalog - the SpatioTemporal Asset Catalog (STAC) specification is used to catalog the available datasets
Dynamic tile server - TiTiler is used to dynamically serve Cloud Optimized GeoTIFF (COG) raster files
STAC_API_URL = "https://staging-stac.delta-backend.com/"
RASTER_API_URL = "https://staging-raster.delta-backend.com"

Use the pystac_client library to interact with the STAC data catalog
catalog = Client.open(STAC_API_URL)

List all the datasets (collections) in the catalog
for collection in catalog.get_collections():
    print(f"{collection.id} - {collection.title}")

Choose a collection to work with
Search all the items in the collection
collection_id = "no2-monthly"
search = catalog.search(collections=[collection_id])
items = list(search.items())
items

Load and inspect one of the items
s3_uri = items[0].assets["cog_default"].href

stats = requests.get(
    f"{RASTER_API_URL}/cog/statistics",
    params={"url": s3_uri}
).json()
stats

Display the COG in a map
Get the tiles endpoint for the file
rescale = f"{stats['b1']['min']},{stats['b1']['max']}"

tiles = requests.get(
    f"{RASTER_API_URL}/stac/tilejson.json?collection={collection_id}&item={items[0].id}&assets=cog_default&colormap_name=rdbu_r&rescale={rescale}"
).json()

tiles

Use the tiles to visualize the file in a map
m = folium.Map(
    zoom_start=6,
    scroll_wheel_zoom=True,
    tiles=tiles["tiles"][0],
    attr="VEDA",
    min_zoom=0,
    max_zoom=18,
)

m

Make this map slightly more insightful
Using the raw minimum and maximum values for the color scale may not be the most useful. Dynamic tiling to the rescue! TiTiler comes with a large number of options for styling the map. Here we’ll do a quick and dirty adjustment of the max value.
rescale_limited_max = f"{stats['b1']['min']},{stats['b1']['max']/2}"

tiles_limited_max = requests.get(
    f"{RASTER_API_URL}/stac/tilejson.json?collection={collection_id}&item={items[0].id}&assets=cog_default&colormap_name=rdbu_r&rescale={rescale_limited_max}"
).json()

tiles_limited_max

m_limited_max = folium.Map(
    zoom_start=6,
    scroll_wheel_zoom=True,
    tiles=tiles_limited_max["tiles"][0],
    attr="VEDA",
    min_zoom=0,
    max_zoom=18,
)
m_limited_max

2. Using the EOAPI to generate a time series of OMI (Ozone Monitoring Instrument) NO₂ and SO₂ datasets
There’s a story published in the EODashboard with the following title:

“Air pollution in India, China and the U.S. have changed significantly over the past two decades”

Link: https://

The story discusses the trend of air pollution in India, China, and the U.S. using NO₂ and SO₂ readings from the OMI instrument.
Here, we’ll recreate that analysis using the EOAPI.
# Here, we find the relevant collection ID for the dataset
collections = {
    "no2": "OMI_trno2-COG",
    "so2": "OMSO2PCA-COG",
}

Define a roughly similar Area of Interest (AOI) for each country, as seen in the story
aois = {
    "india": {
        "type": "Feature",
        "properties": {},
        "geometry": {
            "coordinates": [[
                [84.44227371801799, 25.276852788244952],
                [81.73331688510166, 25.379576397063317],
                [81.40290450746915, 20.640781701865322],
                [84.09079123546121, 20.59296261766137],
                [84.44227371801799, 25.276852788244952]
            ]],
            "type": "Polygon"
        }
    },
    "china": {
        "type": "Feature",
        "properties": {},
        "geometry": {
            "coordinates": [[
                [118.14487188674968, 40.38237805885373],
                [112.59679754686567, 40.39197699341523],
                [112.78712023622006, 32.015052150835814],
                [117.937454307721, 32.102440507249895],
                [118.14487188674968, 40.38237805885373]
            ]],
            "type": "Polygon"
        }
    },
    "usa": {
        "type": "Feature",
        "properties": {},
        "geometry": {
            "coordinates": [[
                [-80.16702521343733, 41.73420113945659],
                [-83.56446680395005, 38.599369254919566],
                [-82.00280661075571, 37.54658260550103],
                [-78.28140359718638, 40.450899619800595],
                [-80.16702521343733, 41.73420113945659]
            ]],
            "type": "Polygon"
        }
    },
}

Define a function that takes the following params:
item: a STAC item
geojson: the GeoJSON of the AOI
Using the /cog/statistics endpoint of the raster API, we get back the statistics of the item (which corresponds to one COG file) within the given GeoJSON AOI.
The statistics include min, max, mean, std, etc.
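The response is a GeoJSON Feature echoing the input geometry, with the per-band statistics nested under `properties.statistics`. The sketch below is a trimmed, illustrative response with made-up numbers; only the nesting mirrors what the function relies on:

```python
# Trimmed, illustrative /cog/statistics response for one AOI Feature.
# All values are placeholders; the structure is what matters here.
result = {
    "type": "Feature",
    "geometry": {"type": "Polygon", "coordinates": []},  # echoes the input AOI
    "properties": {
        "statistics": {
            "b1": {  # one entry per band
                "min": 0.0,
                "max": 4.1e15,
                "mean": 1.2e15,
                "std": 0.6e15,
            }
        }
    },
}

# The function described above keeps only result["properties"], so the
# band statistics end up under "statistics.b1.*" keys after flattening.
band_stats = result["properties"]["statistics"]["b1"]
print(band_stats["mean"])
```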
# the bounding box should be passed to the geojson param as a geojson Feature or FeatureCollection
def generate_stats(item, geojson):
    result = requests.post(
        f"{RASTER_API_URL}/cog/statistics",
        params={
            "url": item.assets["cog_default"].href
        },
        json=geojson
    ).json()
    return {
        **result["properties"],
        "start_datetime": str(item.properties.get("datetime", item.properties.get("start_datetime")))[:4],
        "collection": item.collection_id
    }

Let’s start out with the US 🇺🇸!
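One detail worth noting in generate_stats above: the year label comes from slicing the first four characters of the item's ISO datetime string rather than parsing it. A minimal illustration, with a hypothetical timestamp:

```python
# Hypothetical ISO datetime as carried in a STAC item's properties
sample_datetime = "2005-01-01T00:00:00Z"
# Keeping only the first four characters yields the year label
year_label = str(sample_datetime)[:4]
print(year_label)  # → 2005
```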
We’ll get all the items in the NO2 and SO2 collections and generate the statistics from them for the Ohio River Valley region of the United States.
usa_aoi = aois["usa"]
items = list(catalog.search(collections=[collections["no2"], collections["so2"]]).items())
usa_stats = [
    generate_stats(item, usa_aoi)
    for item in items
]

Create a function that takes the statistics (which is JSON) and converts it to a pandas DataFrame in a format that’ll make it easy to read and visualize.
We’re only concerned with the mean statistic for this example, specifically its percentage change from the year 2005. We’ll use pandas to calculate this percentage change and assign it to the change column.
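Here is how that pipeline behaves on a tiny made-up input (values and dates are placeholders): json_normalize flattens the nested dicts into dotted column names, the `statistics.b1.` prefix is stripped, and each collection's mean is divided by its own first (2005) value to get a percent change. This mirrors the clean_stats function defined below.

```python
import pandas as pd

# Made-up inputs in the shape generate_stats returns: per-band statistics
# nested under "statistics", plus the year label and collection ID.
stats_json = [
    {"statistics": {"b1": {"mean": 4.0}}, "start_datetime": "2005", "collection": "no2"},
    {"statistics": {"b1": {"mean": 5.0}}, "start_datetime": "2006", "collection": "no2"},
    {"statistics": {"b1": {"mean": 10.0}}, "start_datetime": "2005", "collection": "so2"},
    {"statistics": {"b1": {"mean": 15.0}}, "start_datetime": "2006", "collection": "so2"},
]

# Flatten the nesting into columns like "statistics.b1.mean"
df = pd.json_normalize(stats_json)
# Strip the prefix so the column is simply "mean"
df.columns = [col.replace("statistics.b1.", "") for col in df.columns]

df["date"] = pd.to_datetime(df["start_datetime"], format="%Y")
df = df.sort_values(by=["date"])

# Each collection's mean divided by its own first (2005) value,
# expressed as a percent change
df["change"] = df.groupby("collection", group_keys=False)["mean"].apply(
    lambda x: x.div(x.iloc[0]).sub(1).mul(100)
)
print(df[["collection", "start_datetime", "change"]])
# no2: 0%, +25%   so2: 0%, +50%
```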
def clean_stats(stats_json) -> pd.DataFrame:
    # convert the stats_json as-is to a pandas dataframe
    df = pd.json_normalize(stats_json)
    # simple renaming for readability
    df.columns = [col.replace("statistics.b1.", "") for col in df.columns]
    # create a date column from the start_datetime column
    df["date"] = pd.to_datetime(df["start_datetime"], format="%Y")
    # sort the dataframe by the date column
    df = df.sort_values(by=["date"])
    # create a change column that calculates the change of the mean values from the value in 2005
    df["change"] = df.groupby("collection", group_keys=False)["mean"].apply(
        lambda x: x.div(x.iloc[0]).subtract(1).mul(100)
    )
    return df

cleaned_stats_df = clean_stats(usa_stats)

We’ll now create a time series of the change in mean values for NO₂ and SO₂ for the area in the US.
plt.xticks([i for i in range(0, 21, 2)])
sns.set_style("darkgrid")
ax = sns.lineplot(
    x="start_datetime",
    y="change",
    hue="collection",
    data=cleaned_stats_df,
    palette=["#2196f3", "#ff5722"],
    style="collection",
    markers=["*", "d"]
)
ax.set_title("US - Ohio River Valley")
ax.set_xlabel("Years")
ax.set_ylabel("Change from 2005 (%)")
plt.legend(frameon=False, ncol=3)
plt.show()

Now, let’s create a function that creates this trend graph, given the country.
def create_chart(country):
    stats = [
        generate_stats(item, aois[country])
        for item in catalog.search(collections=[collections["no2"], collections["so2"]]).items()
    ]
    df = clean_stats(stats)
    # Create a chart using Seaborn
    plt.xticks([i for i in range(0, 21, 2)])
    sns.set_style("darkgrid")
    ax = sns.lineplot(
        x="start_datetime",
        y="change",
        hue="collection",
        data=df,
        palette=["#2196f3", "#ff5722"],
        style="collection",
        markers=["*", "d"]
    )
    ax.set_title(country.title())
    ax.set_xlabel("Years")
    ax.set_ylabel("Change from 2005 (%)")
    plt.legend(frameon=False, ncol=3)
    plt.show()

We can use this function to create charts for the rest of the AOIs.
India 🇮🇳
create_chart("india")

China 🇨🇳
create_chart("china")