Skip to content

Overview

AIAS - ARLAS Item and Asset Services

AIAS groups a set of microservices in order to offer functions for ingestion, access and download of archives, STAC Items and Assets. AIAS and ARLAS makes a fully functional catalog.

Functions for ingestion:

  • Register a STAC item with its assets : ARLAS Item Registration Services (AIRS)
  • Asynchronously register one archive (/processes/ingest) or a directory containing archives (/processes/directory_ingest) : ARLAS Processing (APROC)
  • List files and archives from a directory: File and Archive Management (FAM)

Note

Some STAC synchronisation scrips are provided. See STAC Synchronisation

Functions for download:

  • Asynchronously download one or several archives (/processes/download) : ARLAS Processing (APROC)

Functions for access:

  • Access control on the assets with ARLAS Gateway for Assets (AGATE)

Running the stack

To start a standalone stack for testing:

./test/start_stack.sh
At least 8Go of RAM is needed.

This stack relies on the docker compose configuration files. The available endpoints are:

Health checks are available for all non-third party services:

ARLAS Item Registration Services (AIRS)

ARLAS Item Registration Services offers registration services for Spatio-temporal assets. It manages Items as well as Assets (e.g. raster files, cogs, etc.). The service exposes the STAC-T methods (https://github.com/stac-api-extensions/transaction) as well as a set of methods for handling the assets.

By default, the service manages the assets. When an item is registered, the service checks that the managed asset exists: the asset must be added before the item. Deleting an item is cascaded on the managed assets. An asset can be unmanaged by setting asset.airs:managed=False (or asset.airs__managed=False).

AIRS Data model

The AIRS Model is based on the STAC specifications. It supports the following extensions:

  • view
  • storage
  • eo
  • processing
  • dc3 (ARLAS Datacube Builder)
  • cube
  • sar
  • proj

Also, metadata are enriched by the service and placed in the generated namespace.

Namespaces are prefixes in the key names of the JSON. The : is used for separating the namespace and the field name. Since ARLAS does not support the : in field names, the character is replaced by __ for storage and indexation.

For more details, see the model documentation

Prerequisites

  • minio
  • elasticsearch
  • docker

See here for the available versions of airs.

AIRS Configuration

The following environment variables must be set to run AIRS:

Variable
AIRS_HOST
AIRS_PORT
AIRS_CORS_ORIGINS
AIRS_CORS_METHODS
AIRS_CORS_HEADERS
AIRS_ARLAS_COLLECTION_NAME
AIRS_ARLAS_URL
AIRS_INDEX_ENDPOINT_URL
AIRS_INDEX_COLLECTION_PREFIX
AIRS_INDEX_LOGIN
AIRS_INDEX_PWD
AIRS_S3_BUCKET
AIRS_S3_ACCESS_KEY_ID
AIRS_S3_SECRET_ACCESS_KEY
AIRS_S3_REGION
AIRS_S3_TIER
AIRS_S3_PLATFORM
AIRS_S3_ASSET_HTTP_ENDPOINT_URL
AIRS_S3_ENDPOINT_URL
ARLASEO_MAPPING_URL
AIRS_COLLECTION_URL
AIRS_PREFIX
AIRS_LOGGER_LEVEL

Using AIRS

In the following examples, we will:

  • add an asset
  • check that it exists
  • add an item
  • get the item
  • delete the item and its asset
Add an asset

curl -X POST \
    "http://127.0.0.1:8000/arlas/airs/collections/digitalearth.africa/items/077cb463-1f68-5532-aa8b-8df0b510231a/assets/classification?content_type=image/tiff" \
    -F file=@test/inputs/ESA_WorldCover_10m_2021_v200_N15E000_Map.tif
Result:
{"msg":"Object has been uploaded to bucket successfully"}

Check that the asset exists

curl -I \
    "http://127.0.0.1:8000/arlas/airs/collections/digitalearth.africa/items/077cb463-1f68-5532-aa8b-8df0b510231a/assets/classification"
Result:

HTTP/1.1 204 No Content
Add an item
curl -X POST \
    -H "Content-Type: application/json" \
    "http://127.0.0.1:8000/arlas/airs/collections/digitalearth.africa/items" \
    -d @test/inputs/077cb463-1f68-5532-aa8b-8df0b510231a.json

Result:

{
   "collection":"digitalearth.africa",
   "catalog":"snow",
   "id":"077cb463-1f68-5532-aa8b-8df0b510231a",
   "geometry":{...},
   "bbox": ...,
   "assets":{
      "classification":{...},
      "arlas_eo_item":{...}
   },
   "properties":{
      "datetime":1640908800.0,
      "start_datetime":1609459200.0,
      "end_datetime":1640908800.0,
      "eo:bands":[
         {
            "name":"classification"
         }
      ],
      "proj:epsg":4326,
      "proj:shape":[
         36000.0,
         36000.0
      ],
      "generated:day_of_week":4,
      "generated:day_of_year":365,
      "generated:hour_of_day":1,
      "generated:minute_of_day":60,
      ...
   }
}

Check the item exists
curl -X GET \
    -H "Content-Type: application/json" \
    "http://127.0.0.1:8000/arlas/airs/collections/digitalearth.africa/items/077cb463-1f68-5532-aa8b-8df0b510231a"

Result: same as previous call (registration).

Delete the item and its assets
curl -X DELETE \
    -H "Content-Type: application/json" \
    "http://127.0.0.1:8000/arlas/airs/collections/digitalearth.africa/items/077cb463-1f68-5532-aa8b-8df0b510231a"

ARLAS Processes (APROC)

ARLAS Processes (APROC) exposes an OGC API Processes compliant API.

List of processes:

  • ingest : it ingests an archive.
  • directory_ingest : it ingests archives found in a directory.
  • download : it ingests an archive.
  • enrich : it enriches an item (like adding a cog).

Ingest process

The ingest process takes a url pointing at an archive. The process runs the following steps:

  • identify the driver for ingestion
  • identify the assets to fetch (done by the driver)
  • fetch the assets (e.g. copy/download) (done by the driver)
  • transform the assets if necessary (e.g. create cog) (done by the driver)
  • upload the assets
  • register the item in AIRS

As mentioned, the process is "driver" based. Each data source must have a compliant driver in order to be ingested in AIRS. A driver has to

  • say whether it supports a given archive or not
  • identify the archive's assets to be fetched
  • fetch the assets
  • transform the assets
  • create an AIRS Item

A driver must implement the abstract class Driver.

Warning

The name of the class within the module must be Driver.

The following drivers are available in the extensions directory:

  • ast_dem
  • digitalglobe
  • dimap
  • geoeye
  • rapideye
  • spot5
  • terrasarx
  • geotif and jpeg2000

The drivers are configured in drivers.yaml

Enrich process

The enrich process takes a list of tuple collection/item id. The process runs the following step for each item:

  • say whether it supports a given archive or not
  • create the asset for the given item (done by the driver)
  • upload the asset
  • update the item

A driver must implement the abstract class Driver. The following drivers are available in the extensions directory:

  • safe for sentinel 2 products

The drivers are configured in enrich_drivers.yaml

Prerequisites

  • python 3.10
  • AIRS
  • celery backend (redis)
  • celery brocker (rabbitmq)
  • docker

See here for the available versions of aproc-service and here for the available versions of aproc-processes

APROC Configuration

The following environment variables must be set to run aproc-service and aproc-proc:

Variable
APROC_ENDPOINT_FROM_APROC
APROC_CONFIGURATION_FILE
APROC_HOST
APROC_PORT
CELERY_BROKER_URL
CELERY_RESULT_BACKEND
AIRS_ENDPOINT
APROC_PREFIX
APROC_LOGGER_LEVEL
ARLAS_URL_SEARCH
APROC_CORS_ORIGINS
APROC_CORS_METHODS
APROC_CORS_HEADERS
AIRS_INDEX_COLLECTION_PREFIX
ARLAS_SMTP_ACTIVATED
ARLAS_SMTP_HOST
ARLAS_SMTP_PORT
ARLAS_SMTP_USERNAME
ARLAS_SMTP_PASSWORD
ARLAS_SMTP_FROM
APROC_DOWNLOAD_ADMIN_EMAILS
APROC_DOWNLOAD_OUTBOX_DIR
APROC_DOWNLOAD_CONTENT_USER
APROC_DOWNLOAD_SUBJECT_USER
APROC_DOWNLOAD_CONTENT_ERROR
APROC_DOWNLOAD_SUBJECT_ERROR
APROC_DOWNLOAD_CONTENT_ADMIN
APROC_DOWNLOAD_SUBJECT_ADMIN
APROC_EMAIL_PATH_PREFIX_ADD
APROC_PATH_TO_WINDOWS
APROC_DOWNLOAD_REQUEST_SUBJECT_USER
APROC_DOWNLOAD_REQUEST_CONTENT_USER
APROC_DOWNLOAD_REQUEST_SUBJECT_ADMIN
APROC_DOWNLOAD_REQUEST_CONTENT_ADMIN
APROC_INDEX_ENDPOINT_URL
APROC_INDEX_NAME
APROC_INDEX_LOGIN
APROC_INDEX_PWD
APROC_RESOURCE_ID_HASH_STARTS_AT

Using ingest and directory_ingest

Add an archive
curl -X 'POST' \
  'http://localhost:8001/arlas/aproc/processes/ingest/execution' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"inputs": {"collection": "digitalearth.africa", "catalog": "spot6", "url": "/inputs/DIMAP/PROD_SPOT6_001/VOL_SPOT6_001_A/IMG_SPOT6_MS_001_A/", "annotations":""}, "outputs": null, "response": "raw", "subscriber": null}'

Result:

{
  "processID": "ingest",
  "type": "process",
  "jobID": "c3300fd2-aed6-4887-b2e9-d5db8ce02ced",
  "status": "accepted",
  "message": "",
  "created": 1698153197,
  "started": null,
  "finished": null,
  "updated": 1698153197,
  "progress": null,
  "links": null,
  "resourceID": "inputs-DIMAP-PROD_SPOT6_001-VOL_SPOT6_001_A-IMG_SPOT6_MS_001_A-"
}

Add archives contained in a directory

curl -X 'POST' \
  'http://localhost:8001/arlas/aproc/processes/directory_ingest/execution' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"inputs": {"collection": "digitalearth.africa", "catalog": "dimap", "directory": "DIMAP"}, "outputs": null, "response": "raw", "subscriber": null}'
Result:
{
  "processID": "directory_ingest",
  "type": "process",
  "jobID": "d288ff3e-e880-43d3-880e-f2725f5f55b2",
  "status": "accepted",
  "message": "",
  "created": 1698153396,
  "started": null,
  "finished": null,
  "updated": 1698153396,
  "progress": null,
  "links": null,
  "resourceID": "DIMAP"
}

Download process

The download and directory_download relies on a driver mechanism. A driver must implement the abstract class Driver.

Available drivers are

  • dimap
  • tif_file

Using download

curl -X 'POST' \
  'http://localhost:8001/arlas/aproc/processes/download/execution' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"inputs": {"requests": [{"collection": "digitalearth.africa", "item_id": "inputs-DIMAP-PROD_SPOT6_001-VOL_SPOT6_001_A-IMG_SPOT6_MS_001_A"}], "crop_wkt": "", "target_projection": "epsg:4326", "target_format": "Geotiff"}}'
Result:
{
  "processID": "download",
  "type": "process",
  "jobID": "40154302-4ca7-468c-854d-c09b245e3e64",
  "status": "accepted",
  "message": "",
  "created": 1698154319,
  "started": null,
  "finished": null,
  "updated": 1698154319,
  "progress": null,
  "links": null,
  "resourceID": "db6bd405d357b8f6420bfe6797bbbec1e6430afe"
}

Getting the status

To get the status of one running process (job):

curl -X 'GET' \
  'http://localhost:8001/arlas/aproc/jobs/40154302-4ca7-468c-854d-c09b245e3e64' \
  -H 'accept: application/json'
Result:
{
  "processID": "download",
  "type": "process",
  "jobID": "40154302-4ca7-468c-854d-c09b245e3e64",
  "status": "successful",
  "message": "{'download_location': '/outbox/anonymous/inputs-DIMAP-PROD_SPOT6_001-VOL_SPOT6_001_A-IMG_SPOT6_MS_001_A/inputs_DIMAP_PROD_SPOT6_001_VOL_SPOT6_001_A_IMG_SPOT6_MS_001_A.Geotiff'}",
  "created": 1698154319,
  "started": 1698154319,
  "finished": 1698154440,
  "updated": 1698154440,
  "progress": null,
  "links": null,
  "resourceID": "db6bd405d357b8f6420bfe6797bbbec1e6430afe"
}

To get the status of the process for one resource (item id for an ingest):

curl -X 'GET' \
  'http://localhost:8001/arlas/aproc/jobs/resources/inputs-DIMAP-PROD_SPOT6_001-VOL_SPOT6_001_A-IMG_SPOT6_MS_001_A' \
  -H 'accept: application/json'
Result:
[
  {
    "processID": "ingest",
    "type": "process",
    "jobID": "efd65a52-78c3-4fbd-9f2b-40bb726de1ca",
    "status": "failed",
    "message": "",
    "created": 1698153257,
    "started": 1698153257,
    "finished": 1698153257,
    "updated": 1698153257,
    "progress": null,
    "links": null,
    "resourceID": "inputs-DIMAP-PROD_SPOT6_001-VOL_SPOT6_001_A-IMG_SPOT6_MS_001_A"
  },
  ...
  {
    "processID": "ingest",
    "type": "process",
    "jobID": "d79ae63b-79dd-4a6c-a93a-ee2924575d1e",
    "status": "successful",
    "message": "{'item_location': 'http://airs-server:8000/arlas/airs/collections/digitalearth.africa/items/inputs-DIMAP-PROD_SPOT6_001-VOL_SPOT6_001_A-IMG_SPOT6_MS_001_A'}",
    "created": 1698153397,
    "started": 1698153397,
    "finished": 1698153397,
    "updated": 1698153397,
    "progress": null,
    "links": null,
    "resourceID": "inputs-DIMAP-PROD_SPOT6_001-VOL_SPOT6_001_A-IMG_SPOT6_MS_001_A"
  }
]

To get the status of the most recent processes for ingest and which are running:

curl -X 'GET' \
  'http://localhost:8001/arlas/aproc/jobs?offset=0&limit=10&process_id=ingest&status=running' \
  -H 'accept: application/json'
Result:
{
    "status_list": [
        {
            "processID": "ingest",
            "type": "process",
            "jobID": "c8c79f8e-51c9-4a99-96e2-527cc365cb1d",
            "status": "successful",
            "message": "{'item_location': 'http://airs-server:8000/arlas/airs/collections/digitalearth.africa/items/SENTINEL2A_20230604-105902-526_L2A_T31TCJ_D'}",
            "created": 1698400307,
            "started": 1698400320,
            "finished": 1698400327,
            "updated": 1698400327,
            "resourceID": "SENTINEL2A_20230604-105902-526_L2A_T31TCJ_D"
        },
        ...
    ],
    "total": 41
}

AGATE

AGATE is ARLAS Asset Gateway. It is a service for protecting assets from an object store such as minio. It must be used as a forward authorisation service.

Variable
ARLAS_URL_SEARCH
AGATE_PREFIX
AGATE_HOST
AGATE_PORT
AGATE_ENDPOINT
AGATE_URL_HEADER
AGATE_URL_HEADER_PREFIX
AGATE_LOGGER_LEVEL
AGATE_CORS_ORIGINS
AGATE_CORS_METHODS
AGATE_CORS_HEADERS

FAM

FAM is ARLAS File and Archive Management service. The endpoint lists files in a directory and can list contained archives.

Variable
INGESTED_FOLDER
FAM_LOGGER_LEVEL
FAM_PREFIX

Example for listing files in DIMAP:

curl -X 'POST' \
  'http://localhost:8005/arlas/fam/files' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "path": "DIMAP",
  "size": 10
}'
Returns
[
  {
    "name": ".DS_Store",
    "path": "DIMAP/.DS_Store",
    "is_dir": false,
    "last_modification_date": "2023-09-29T19:14:05.037229",
    "creation_date": "2023-09-29T19:14:05.037771"
  },
  {
    "name": "PROD_SPOT6_001",
    "path": "DIMAP/PROD_SPOT6_001",
    "is_dir": true,
    "last_modification_date": "2023-09-29T19:14:05.082518",
    "creation_date": "2023-09-29T19:14:05.082518"
  }
]

Example for listing archives in DIMAP:

curl -X 'POST' \
  'http://localhost:8005/arlas/fam/archives' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "path": "DIMAP",
  "size": 10
}'
Returns
[
  {
    "name": "IMG_SPOT6_MS_001_A",
    "path": "DIMAP/PROD_SPOT6_001/VOL_SPOT6_001_A/IMG_SPOT6_MS_001_A",
    "is_dir": true,
    "last_modification_date": "2023-09-29T19:15:20.930201",
    "creation_date": "2023-09-29T19:15:20.930201",
    "id": "inputs-DIMAP-PROD_SPOT6_001-VOL_SPOT6_001_A-IMG_SPOT6_MS_001_A",
    "driver_name": "dimap"
  }
]

STAC Synchronisation

The following synchronisations are available:

GEODES

To ingest products from the GEODES catalogue into AIRS, the process needs to access the AIRS service. The simplest way is to run the docker container within the same network as AIRS. Below is an example:

docker run --rm \
  -v `pwd`:/app/  \
  --network arlas-net aias/stac-geodes:latest \
  add https://geodes-portal.cnes.fr/api/stac/items http://airs-server:8000/airs geodes S2L1C \
  --data-type PEPS_S2_L1C \
  --data-type MUSCATE_SENTINEL2_SENTINEL2_L2A \
  --data-type MUSCATE_Snow_SENTINEL2_L2B-SNOW \
  --data-type MUSCATE_WaterQual_SENTINEL2_L2B-WATER \
  --data-type MUSCATE_SENTINEL2_SENTINEL2_L3A \
  --product-level L1C \
  --max 1000

To get some help, simply run docker run --rm --network arlas-net gisaia/stac-geodes:latest add https://geodes-portal.cnes.fr/api/stac/items --help