Create and Run a Processor

This guide is here to help you get up and running with Intellect and its processors. Whether you're brand new or just need a refresher, we’ve got you covered with clear steps, simple explanations, and a real example to pull it all together.

This guide can be seen as a simplified extract of what is already described here, so it may be particularly useful for users who are not familiar with Docker.

We recommend reading this guide in the order in which it is presented.

What Is a Processor

A processor in Intellect is a service that runs in batch mode: it processes one input and produces an output.

How To Access Intellect

Intellect can be accessed from the ESA PAL main page by clicking on "Processing (Intellect)". You will then be redirected to a login page.

Required Files (Sources)

To build your own processor in Intellect, you will need four main sources or files:

A Dockerfile
An entrypoint
The script you want to run within the processor
A requirements file (not always needed)

As mentioned above, this guide assumes that the user has limited knowledge of Docker. For this reason, some pieces of code are treated as "default" parts, i.e. they can be left unchanged when you create your own processor.

Dockerfile

The Dockerfile defines the skeleton of your processor. Here you define the OS version, all the needed directories and how the other three main files interact with each other. Below is an example Dockerfile that you can reuse.

FROM ubuntu:24.10

LABEL maintainer="ASCEND"

ENV TZ=Etc/UTC

RUN echo $TZ > /etc/timezone

RUN apt-get update && apt-get install --yes --no-install-recommends \
    jq zip unzip gdal-bin python3-gdal python3-venv \
    && rm -rf /var/lib/apt/lists/*

# Prepare processor script
ARG WORKERDIR=/home/worker
ARG INDIR="$WORKERDIR/workDir/inDir"
ARG OUTDIR="$WORKERDIR/workDir/outDir"
ARG PROCDIR="$WORKERDIR/procDir"
ARG WPS_PROPS="$WORKERDIR/workDir/WPS-INPUT.properties"

RUN mkdir -p $INDIR
RUN mkdir -p $OUTDIR
RUN mkdir -p $PROCDIR

ENV IN_DIR="$INDIR"
ENV OUT_DIR="$OUTDIR"
ENV PROC_DIR="$PROCDIR"
ENV WORKERDIR="$WORKERDIR"
ENV WPS_PROPS="$WPS_PROPS"

ADD requirements.txt ${PROCDIR}/requirements.txt

# Enable venv
RUN python3 -m venv $PROCDIR/venv
RUN . $PROCDIR/venv/bin/activate \
    && pip3 install -r $PROC_DIR/requirements.txt

# The COPY here allows fast builds when only code changes
COPY * ${PROCDIR}/

RUN ls -l $WORKERDIR
RUN chmod +x $PROCDIR/basic_entrypoint.py
ENTRYPOINT ["/home/worker/procDir/venv/bin/python3","/home/worker/procDir/basic_entrypoint.py"]

The first line (FROM ubuntu:24.10) indicates that Ubuntu 24.10 is used as the base OS image. You can change the version if some dependencies require a different one.

The lines:

ARG WORKERDIR=/home/worker
ARG INDIR="$WORKERDIR/workDir/inDir"
ARG OUTDIR="$WORKERDIR/workDir/outDir"
ARG PROCDIR="$WORKERDIR/procDir"
ARG WPS_PROPS="$WORKERDIR/workDir/WPS-INPUT.properties"

define the directories from which your processor reads its inputs and in which it saves its outputs (INDIR and OUTDIR), while WPS_PROPS defines the location of the configuration file. It is recommended to leave these lines (WORKERDIR, INDIR, OUTDIR, PROCDIR and WPS_PROPS) as they are, since the tool expects input and output data to be located in subdirectories named "inDir" and "outDir".
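To try a script outside the container, the same directory layout can be recreated locally. The following is a minimal sketch that uses only the paths from the Dockerfile above; the helper name make_layout is hypothetical and is not part of Intellect.

```python
import os
import tempfile
from pathlib import Path


def make_layout(root: Path) -> dict:
    """Create the inDir, outDir and procDir tree under root.

    Returns the values that the Dockerfile exports as environment variables.
    """
    worker = root / "home" / "worker"
    layout = {
        "WORKERDIR": worker,
        "IN_DIR": worker / "workDir" / "inDir",
        "OUT_DIR": worker / "workDir" / "outDir",
        "PROC_DIR": worker / "procDir",
        "WPS_PROPS": worker / "workDir" / "WPS-INPUT.properties",
    }
    # Only the three directories are created, as in the Dockerfile.
    for key in ("IN_DIR", "OUT_DIR", "PROC_DIR"):
        layout[key].mkdir(parents=True, exist_ok=True)
    return {name: str(path) for name, path in layout.items()}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = make_layout(Path(tmp))
        os.environ.update(env)  # mimics the ENV lines of the Dockerfile
        for name, value in env.items():
            print(f"{name}={value}")
```

With these variables set, the entrypoint shown later in this guide can read the same names through os.environ.get.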

The line:

ADD requirements.txt ${PROCDIR}/requirements.txt

adds the requirements file (requirements.txt) to the PROCDIR directory. You can name your requirements file as you want. If you decide to rename it, make sure to change the name in the following lines as well:

RUN . $PROCDIR/venv/bin/activate \
    && pip3 install -r $PROC_DIR/requirements.txt

Finally, the entrypoint file is called:

RUN chmod +x $PROCDIR/basic_entrypoint.py
ENTRYPOINT ["/home/worker/procDir/venv/bin/python3","/home/worker/procDir/basic_entrypoint.py"]

In this case too, you can rename your entrypoint as you prefer, but remember to change the name in the lines above.

Entrypoint

Here too, you can use the entrypoint file shown below as a template for your processor.

#!/home/worker/procDir/venv/bin/python3

import os
import logging
import subprocess
import zipfile
import glob
from jproperties import Properties

logging.basicConfig(
    encoding="utf-8",
    level=logging.INFO,
    format="%(asctime)s %(levelname)-8s %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S",
)

PROC_DIR = os.environ.get("PROC_DIR")
WORKERDIR = os.environ.get("WORKERDIR")
IN_DIR = os.environ.get("IN_DIR")
OUT_DIR = os.environ.get("OUT_DIR")
WPS_PROPS = os.environ.get("WPS_PROPS")


def check_prop_file(file:str)->bool:
    '''
    Check that the properties file is not empty.
    Args:
        file = path to the properties file
    Return:
        True if at least one property is defined, False otherwise
    '''
    configs = Properties()

    with open(file, 'rb') as read_prop:
        configs.load(read_prop)
    if not len(configs)>0:
        logging.info("No input parameters given")
    return len(configs)>0

def get_parameter(name:str,file:str)->str:
    '''
    Extract an input parameter as defined
    in the GUI when launching the code.
    Args:
        name = name of the parameter, e.g. bbox
        file = path to the input parameter file
    Returns:
        a string that defines the given parameter,
        e.g. "[44.265, 12.470, 43.220, 13.913]"
    '''
    configs = Properties()
    with open(file, 'rb') as read_prop:
        configs.load(read_prop)

    definition=configs.get(name).data
    return definition

def prepare_args_et0(WPS_PROPS):
    '''
    Create the arguments to launch the et0 calculation.
    Returns the value of the "input" parameter, or None
    if the properties file is empty.
    '''
    if check_prop_file(WPS_PROPS):
        # create a class Parameters and test it directly with unittest
        input_value = get_parameter("input", WPS_PROPS)
        print(input_value)
        # aoi=get_parameter("aoi",WPS_PROPS)
        # startdate=get_parameter("startdate",WPS_PROPS)
        # enddate=get_parameter("enddate",WPS_PROPS)
        return input_value
    return None

def main():

    python_exec = os.path.join(PROC_DIR, "venv", "bin", "python3") 

    script_path = os.path.join(PROC_DIR, "fastcopier.py")

    args_et0 = prepare_args_et0(WPS_PROPS)

    print(f"Properties file {WPS_PROPS} contains: {args_et0}")
    # Optional: Check if files exist

    if not os.path.exists(python_exec):
        logging.error(f"Python executable not found at {python_exec}")
        return
    if not os.path.exists(script_path):
        logging.error(f"Script not found at {script_path}")
        return
    
    # # prepare input

    # Run the script
    cmd = [python_exec, script_path, f"{IN_DIR}/input", f"{OUT_DIR}/output"]
    logging.info(f"Running command: {' '.join(cmd)}")

    try:
        subprocess.run(cmd, check=True)
        logging.info("Script executed successfully.")
    except subprocess.CalledProcessError as e:
        logging.error(f"Script execution failed with code {e.returncode}")

if __name__ == "__main__":
    main()

In the following lines:

PROC_DIR = os.environ.get("PROC_DIR")
WORKERDIR = os.environ.get("WORKERDIR")
IN_DIR = os.environ.get("IN_DIR")
OUT_DIR = os.environ.get("OUT_DIR")
WPS_PROPS = os.environ.get("WPS_PROPS")

the paths that the Dockerfile exports as environment variables are read into Python variables for use within the entrypoint file. Then, the functions check_prop_file, get_parameter and prepare_args_et0 read the input parameters from the properties file (WPS_PROPS) and check that it is not empty. The path of the script you want to run as a processor (in this case "fastcopier.py") is defined in the line:

script_path = os.path.join(PROC_DIR, "fastcopier.py")
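The parameter lookup performed by the entrypoint can be illustrated without jproperties. The sketch below is a simplified stand-in, not the library's implementation: it assumes the properties file contains plain key=value lines, and the names read_properties and get_parameter_or_default, as well as the file content, are hypothetical. Unlike the template, it returns a default value when a parameter is absent.

```python
import tempfile
from pathlib import Path


def read_properties(path: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and comments."""
    params = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no "=" separator
            params[key.strip()] = value.strip()
    return params


def get_parameter_or_default(name: str, path: str, default=None):
    """Return the value of a parameter, or default if it is not defined."""
    return read_properties(path).get(name, default)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        props = Path(tmp) / "WPS-INPUT.properties"
        # Hypothetical content, for illustration only.
        props.write_text("# example\ninput=scene_001.tif\n", encoding="utf-8")
        print(get_parameter_or_default("input", str(props)))
        print(get_parameter_or_default("aoi", str(props), default="not set"))
```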

One line that deserves particular attention is the following:

cmd = [python_exec, script_path, f"{IN_DIR}/input", f"{OUT_DIR}/output"]

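This list defines the interpreter used to run your script (first entry), the path of the script to run as a processor (second entry) and, finally, the input and output paths (third and fourth entries). The last component of each path ("input" and "output") is the ID of the corresponding input or output (see the Inputs and Outputs Definition section).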
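The script itself is not shown in this guide. As an illustration only, a script invoked with this command could look like the sketch below: it receives the input and output paths as its two command-line arguments and copies one to the other. The function name copy_input and the copying behaviour are assumptions; the actual fastcopier.py may differ.

```python
import shutil
import sys
from pathlib import Path


def copy_input(src: str, dst: str) -> Path:
    """Copy a file or a directory tree from src to dst and return dst."""
    source, target = Path(src), Path(dst)
    if not source.exists():
        raise FileNotFoundError(f"input not found: {source}")
    target.parent.mkdir(parents=True, exist_ok=True)
    if source.is_dir():
        shutil.copytree(source, target, dirs_exist_ok=True)
    else:
        shutil.copy2(source, target)
    return target


if __name__ == "__main__":
    # sys.argv[1] and sys.argv[2] are the third and fourth entries of cmd.
    if len(sys.argv) == 3:
        copy_input(sys.argv[1], sys.argv[2])
    else:
        print(f"usage: {sys.argv[0]} <input_path> <output_path>", file=sys.stderr)
```

Because a missing input raises an exception, the script ends with a non-zero exit code in that case, which the entrypoint detects through subprocess.run(cmd, check=True).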

Requirements File

The requirements file is a .txt file in which you list all the packages that need to be installed in order to run your code. It can be thought of as a list of the libraries or packages that you would normally install using pip. For example, let's suppose that your algorithm needs the following packages to run:

GDAL
rasterio
pandas
numpy

You can write a simple .txt file with one package per line, as shown below:

GDAL
rasterio
pandas
numpy
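Once the image is built, it can be useful to verify that every listed package was actually installed in the virtual environment. The sketch below uses only the standard library; the function name missing_packages is hypothetical, and the parsing covers plain names and common version specifiers only, not the full requirements syntax.

```python
from importlib import metadata


def missing_packages(requirement_lines):
    """Return the names from a requirements list that are not installed."""
    missing = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        # Keep only the distribution name, dropping specifiers such as "==1.2".
        name = line
        for sep in ("==", ">=", "<=", "~=", "!=", ">", "<", "[", ";"):
            name = name.split(sep, 1)[0]
        name = name.strip()
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # The four packages from the example above.
    print(missing_packages(["GDAL", "rasterio", "pandas", "numpy"]))
```

An empty list means that all the listed packages were found in the current environment.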

Inputs and Outputs Definition

Your input and output types should be defined in the dedicated sections, i.e. under Input definitions and Output definitions. You can define more than one input or output by clicking on the green "plus" symbol.

Here you can assign a name (ID) and a title to your input. As mentioned before, the ID that you choose must match the one used in:

cmd = [python_exec, script_path, f"{IN_DIR}/input", f"{OUT_DIR}/output"]

The same applies to the output ID. The title will then appear above the data selection bar when you run your processor.

You can choose different types of input:

String
Number
Enum
Catalogue Product
AOI (Area Of Interest): WKT format, e.g. POLYGON ((37.5 65.75, 42.5 65.75, 44.5 68.25, 38.5 68.25, 37.5 65.75))
Date: in ISO 8601 format, e.g. "2022-10-01T00:00:00.000Z"
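Inside your script, AOI and Date values usually need to be converted from strings into something usable. The sketch below assumes that the values reach the script exactly in the forms shown in the list above; the function names parse_date and polygon_bbox are hypothetical. The WKT handling covers only a single-ring POLYGON, so a dedicated geometry library would be more robust for real use.

```python
import re
from datetime import datetime, timezone


def parse_date(value: str) -> datetime:
    """Parse a timestamp such as "2022-10-01T00:00:00.000Z" as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def polygon_bbox(wkt: str) -> tuple:
    """Return (min_x, min_y, max_x, max_y) of a single-ring WKT POLYGON."""
    match = re.fullmatch(
        r"\s*POLYGON\s*\(\(([^()]+)\)\)\s*", wkt, flags=re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"not a single-ring POLYGON: {wkt!r}")
    # Each vertex is written as "x y"; vertices are separated by commas.
    points = [tuple(map(float, pair.split())) for pair in match.group(1).split(",")]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    print(parse_date("2022-10-01T00:00:00.000Z"))
    print(polygon_bbox(
        "POLYGON ((37.5 65.75, 42.5 65.75, 44.5 68.25, 38.5 68.25, 37.5 65.75))"
    ))
```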

Similarly, for the output you can choose a defined format:

GeoTiff: a raster image file that includes geographic metadata. The file extension should be .tif
ShapeFile: a vector data format for geographic information systems
Other

You can find a more detailed explanation of the supported input and output types here.

Run a Processor

To run your processor, go to "Process some data" and choose the type of processing you want to perform (depending on the type of processor you implemented). Then select your processor from the service list on the right part of the screen; to find it more easily, you can filter the results by service type or owner. Once you have selected your processor, insert the input you need to process and select the folder in which you want to save the output. Finally, click on "Run service" to start the processing and wait until your algorithm completes its tasks.

Practical Examples

This section provides usage examples of how to create a processor on Intellect. First, open the "Develop" tab on the top bar or, equivalently, click on "Integrate your own algorithm" and then on "Processing services".

Example 1

Example 2