Module Algebra
See Also
opals::IAlgebra

Aim of module

Provides a generic raster algebra calculator featuring mathematical, statistical and conditional operators.

General description

Module Algebra is a powerful tool as it allows the combination of multiple input grid datasets by means of a user defined algebraic formula. Basically, Module Algebra can be used either as a selector, as a calculator or, in all generality, as a mixture of both. Module Algebra can be applied to derive:

  • difference models: by subtracting two grid models (->calculator)
  • grid masks: by evaluating one or more grid models and returning either true (transparent) or false (opaque) as result (->calculator)
  • erosion hazard models: by combining the pixels of multiple grids in an algebraic way (e.g. Wischmeier formula, ->calculator)
  • landscape dependent DSM models: by choosing the output pixel values from multiple input grid sources according to certain criteria (->selector)
  • DTM mosaics: by combining multiple input grids (->selector)
  • point density models: by summing up the point densities of multiple input grids (->selector+calculator)
  • ...

In general, the output grid (parameter outFile) is derived by applying an algebraic formula (parameter formula) combining the values of one ore more input grid models (parameter inFile). The formula can either be specified as a generic filter or in python syntax. In the latter case the python interpreter is used evaluate the formula.

By default, the output grid is created with the same pixel size as the first input grid, but an arbitrary pixel size can be specified using parameter gridSize. The extents of the resulting grid can either be specified by the user (parameter limit (left,lower,right,upper)) or they are calculated from of the outermost pixel centers of all input grid datasets involved. If no user defined limits are specified, either the common area (parameter limit intersect), the overall area (parameter limit union) or the extents of the i'th dataset (parameter limit dataset:i) are applied (c.f. section Parameter description for more details).

Please note that Module Algebra can deal with non-coinciding models simultaneously. If necessary, the input grids/rasters are re-sampled according to the output grid structure prior to evaluating the formula (parameter resampling). To strictly avoid resampling, please specify resampling=never. This would lead to a program termination if any of the contributing input grid files would require resampling.

Formula syntax

The following sections describe in detail how arbitrary formulae can be specified in Module Algebra using either Generic filter or Python syntax.

Generic filter syntax

This section describes how to define the algebraic formula using the generic filter syntax. Due to the better performance, it is highly recommended to use the generic filter syntax whenever possible.

In addition to the main purpose of sub-selecting data, generic filters also act as a calculator. Within Algebra the grid location and the respective grid values can be used as variables in the formula. The following well-known variables are defined:

Variable name Description
x x-coordinate (Easting) of the pixel location
y y-coordinate (Northing) of the pixel location
r[i] pixel value of i-th input grid model (index starting with 0)
invalid general void indicator

Using the above variable names we can now combine one ore more input grid values in an algebraic expression to compute a scalar value representing the respective output grid value. For example, to compute a slope raster in % based on a given input gradient raster, the formula is:

r[0]*100

Please note, that in the above formula only a single input grid is required which is accessed by r[0]. In general, the grid/raster values are accessed by their index starting form 0 (first input grid) to n99 (last input grid). The result of the expression (i.e. a scalar value) is assigned to the corresponding output grid cell.

Please note that if no algebraic expression is specified, a logical result is returned rather than the value. Thus,

r[0]

returns true (1) if r[0] exists, otherwise false (0). Consequently, if we simply want to return the value of the first input grid, e.g. using Algebra as a raster format converter, we would have to write:

1*r[0]

Another simple example is to subtract two input grids:

r[1]-r[0]

In this case, the difference is returned for all those pixels where, both, r[0] and r[1] exist. If either of the operands is invalid (i.e. nodata) the resulting output pixel will be invalid (nodata) as the expression cannot be evaluated correctly.

A typical application of module Algebra is the derivation of (binary) grid masks. In this case either 0 or 1 is returned depending on a certain condition. Conditions can be specified in generic filters using the ternary operator:

r[0]*r[1]<1.0 ? : 0 : 1

In the above example, first the condition (r[0]*r[1]<1.0) is evaluated and depending on the binary result of the conditional expression 0 or 1, respectively, is returned as result of the entire formula. Please note that if one of the operands in the above condition does not exist, the entire condition evaluates to false, i.e. the else branch of the ternary operator is returned.

Within conditional expressions it is often desired to check the existence of certain grid values as well as their respective value. Consider a formula returning the ratio between two grids:

r[0]/r[1]

For this formula we might want to check, if the grid value of the second input grid (r[1]) exists and is not equal to zero (otherwise the division will fail). Generic filters allow to query the existence of grid values by internally comparing the actual grid value with the grids (inherent) nodata value. For the example at hand we can simply write:

r[1] && r[1]!=0 ? r[0]/r[1] : invalid

Please note the following details:

  • existence check: r[1] checks the existence of the grid value; i.e, the grid contains a valid value at the respective location.
  • concatenation of logical expressions: the existence (r[1]) and not-equal-null (r[1]!=0) checks are concatenated with a logical "and" (&&)
  • invalid is returned if the (entire) conditional expression is "false", which results in an invalid (nodata) output pixels.

To support the creation of grid/raster mosaics, the generic filters provide statistical operators (min, max, mean, ...) for the entire stack of grid/raster values. Consider a set of stripwise DSM grid models. In order to create a mosaic of all strip DSMs, we can simply write...

mean(r)

... returning the mean of all n valid input grids cells. Pleas not, that invalid grids cells (i.e., cell value = nodata value) are not considered for the statistical calculation.

Please note, that in the context of opalsAlgebra the following restrictions w.r.t generic filters apply:

  • Only raster values (r[i]) and the grid location coordinates (x,y) can be used as input variables but no additional attributes.
  • Assignments are only supported to store intermediate results but don't take an effect beyond the (local) filter operation itself.
  • If assignments are used, the formula returns the first non-assignment expression. If only assignments are provided, the result (i.e., right side of the assignment) of the first assignment is returned.

The only reason for using assignments in the context of an opalsAlgebra formula is to increase the readability of the formula. In the following example a "raster cell density" is calculated by first defining the number of involved rasters, secondly counting the number of valid cells and, finally, returning the cell density.

n=10;cnt=count(r);cnt/n

According to the above rules, the formula returns cnt/n, where n and cnt are (locally) calculated within the first two assignments.

Generic filters are a very powerful tool providing a set of arithmetic, logical and conditions operators and other features. A detailed documentation can be found here.

Python syntax

An alternative way of specifying formulae in Module Algebra is to use Python syntax. In this case an internal Python interpreter is used to evaluate the formula. Although the performance is generally worse, the Python syntax offers more flexibility and should be used whenever the formula is too complex to be written in generic filter syntax. The basic syntax of Python style formulae is:

return ...

where the return value represents the output grid value. Normally, the output grid value is a function of one or more input grid values. Let's assume that our input grid contains surface slope values (tangent of the steepest slope). In order to compute a grid containing the values of steepest slope in % we need to multiply the original slope values by 100. In this case, the formula looks as follows:

return 100 * r[0]

where r[0] represents the grid value of the input grid. Of course, the combination formula can be much more complex, incorporating multiple grids, conditional statements and the like. A simple example based on two input grid models, which are used to create a grid mask (0|1) based on a certain threshold is:

return 0 if r[0]*r[1]<1.0 else 1

The formula must be specified as a string in Python syntax. In fact, the Python interpreter is used to parse and evaluate the condition. Please refer to the Python documentation for a detailed description of the Python syntax. In the following, important information concerning the construction of formulas is summarized:

To access certain information of the considered grid models (pixel values, no data indicators, etc.), the following well-known variables are predefined:

Variable name Description
x x-coordinate (Easting) of the pixel location
y y-coordinate (Northing) of the pixel location
r[i] pixel value of i-th input grid model (index starting with 0)
None general void indicator (c.f. below for details).

None is a constant representing a null-object in Python. It can be used to express empty or invalid objects of any type. In Module Algebra in particular, None is used to handle invalid input or output pixels. E.g., to verify if all input pixels are valid, one can write (for better readability in multi-line style):

if r[0]==None and r[1]==None and ... :
return None
else:
return ....

The very same statement can be written in more compact format, without the None keyword:

if r[0] and r[1] and ... :
return
else:
return ....

Thus, in the first line the statement r[0]==None is equivalent to not r[0] (i.e. r[0] is invalid), and in the second line it is even allowed to skip the return value, as an empty return is equivalent to returning an empty object.

To combine the specific pixel values, a series of logical and mathematical operations are provided by Python. For a complete reference please refer to the Python language reference. The most important operations are listed below:

Boolean operations (or,and,not) ordered by ascending priority):

Operation Result
x or yif x is false, then y, else x
x and yif x is false, then x, else y
not xif x is false, then True, else False

Comparison operations:

Operation Meaning
<strictly less than
<=less than or equal
>strictly greater than
>=greater than or equal
==equal
!=not equal
isobject identity
is notnegated object identity

Built-in arithmetic operations:

Operation Result
x + ysum of x and y
x - ydifference of x and y
x * yproduct of x and y
x / yquotient of x and y
x // y (floored) quotient of x and y
x % yremainder of x / y
-xx negated
+xx unchanged
abs(x)absolute value or magnitude of x
int(x)x converted to integer
long(x)x converted to long integer
float(x)x converted to floating point
complex(re,im)a complex number with real part re, imaginary part im. im defaults to zero.
c.conjugate()conjugate of the complex number c. (Identity on real numbers)
divmod(x, y)the pair (x // y, x % y)
pow(x, y)x to the power y
x ** yx to the power y

Apart from these basic operations, conditional statements are of crucial importance for the construction of complex formulas. Conditions can be specified either as single-line or as multi-line python statements:

Single-line conditional statement:

return ... if condition else ...

Multi-line conditional statement:

if condition1:
return ...
elif condition2:
return ...
else:
return ...

However, since the formula must be passed as a single string argument, line-breaks have to be entered via the special \n character sequence. The correct formula string corresponding to the example above is:

if condition1:\n return ...\nelif condition2:\n return ...\nelse:\n return

Please note, that since Python is an indent-oriented language (in opposition to languages using parentheses for grouping), the blank characters before the return statements are mandatory. The same applies to the \nelif... sequence, where no blank is allowed in order to ensure the correct indentation.

The formulas are not restricted to conditional statements, in fact, all control statements (for-loops, while-loops, etc.) are supported. The only preconditions are:

  • the formula string must be provided in correct Python syntax
  • all control paths of the formula must return a value (if a control path does not return a value, the respective pixel will be void.)

Parameter description

-inFileinput grid files
Type: opals::Vector<opals::Path>
Remarks: mandatory
The grid model files are expected in GDAL supported format. The grid values of the respective models are referred to as r[0], r[1] ... in the calculation formula.
-outFileoutput grid file name
Type: opals::Path
Remarks: estimable
Path of output grid file in GDAL supported format.
Estimation rule: The current path and the name (body) of the input file are used as file name basis. Additionally, the default postfix '_algebra' and the extension corresponding to the output format are appended
-oFormatoutput file format [tif,asc,dem,dtm ...]
Type: opals::String
Remarks: estimable
Grid file format, use GDAL driver names like GTiff, AAIGrid, USGSDEM, SCOP ... .
Estimation rule: The output format is estimated based on the extension of the output file. If neither outFile nor oFormat are specified, GeoTiff is used as default.
-formulaalgebraic formula for combining input grids
Type: opals::String
Remarks: mandatory
Formula to compute a single grid value as a string in in Generic Filter or Python syntax.
A formula must contain return a value (e.g. generic filter: r[0]-r[r1] or python: return r[0]-r[1])). For more details please refer to the module documentation.
-gridSizegrid width x/y
Type: opals::Vector<double>
Remarks: estimable
Size of output grid cell in x- and y-direction. Estimation rule: output grid size = grid size of 1st inFile
-noDatanodata value indicating void pixels
Type: double
Remarks: default=9999
Value representing an undefined value in the output grid model.
-limit2D clipping window
Type: opals::MultiGridLimit
Remarks: optional
If -limit is skipped, the common area (intersection) of all input datasets is used.
-resamplinggrid resampling method.
Type: opals::ResamplingMethod
Remarks: default=bilinear

Possible values:  
  nearestNeighbour ... nearest neighbour
  bilinear ........... bi-linear interpolation
  bicubic ............ bi-cubic interpolation
  never .............. indicator avoiding resampling 

Method to be used for resampling the input grids to the specified grid structure of the output grid.

-maxMemorymaximum RAM memory [MB]
Type: unsigned int
Remarks: default=500
The memory consumption scales with the number of input grids in order to provide good reading performance. An upper memory limit can be specified in order to avoid overflows in case of numerous input grids (e.g. large mosaic),
-skipVoidAreasskip void input datatset parts
Type: bool
Remarks: estimable
Estimation rule: In opalsAlgebra, the output raster is processed block-wise. In general, for eachblock the pixels of all input datasets are queried, even if they only contain void pixels. The latter is typically the case in merging/mosaicing applications where the input datasets are either disjunct or only partially overlapping. To improve the run-time performance in such cases activating the skipVoidAreas flag allows ignoring blocks of all-void input pixels. Please note, that in this case the order of the input datasets is not strictly maintained which might lead to invalid results if the calculation formula explicitely uses input dataset indices (e.g. -formula r[0]?r[1]:r[2]). However, for all formulae which aggregate only the valid pixels (e.g. -formula mean(r)), activating skipVoidAreas substantially speeds up processing.

Examples

The data used in the subsequent examples are located in the $OPALS_ROOT/demo/ directory and the following preprocessing steps (data import and surface grid calculation) are required:

opalsImport -inFile strip31.laz
opalsImport -inFile strip21.laz
opalsImport -inFile strip11.laz
opalsImport -inFile fullwave2.fwf -iFormat fwf
opalsGrid -inFile strip31.odm -outFile strip31.tif -interpolation movingPlanes
opalsGrid -inFile strip21.odm -outFile strip21.tif -interpolation movingPlanes -feature sigmaZ excen
opalsGrid -inFile strip11.odm -outFile strip11.tif -interpolation movingPlanes
opalsGrid -inFile fullwave2.odm -outFile dsm_mp.tif -interpolation movingPlanes -feature sigma0
opalsCell -inFile fullwave2.odm -outFIle dsm_max.tif -cellSize 1 -feature max -limit center

Example 1: Roof model

In the first example, the roofs of a digital surface model are extracted by applying a simple height threshold. All other values are set to void (no data).

Generic filter syntax:

opalsAlgebra -inFile strip21.tif -outFile strip21_roofs.tif -formula "r[0]>275. ? r[0] : invalid"

Python syntax:

opalsAlgebra -inFile strip21.tif -outFile strip21_roofs_py.tif -formula "return r[0] if r[0]>275. else None"

The output is written to a separate grid file in GeoTiff format (strip21_roofs[_py].tif) and a standard color coded visualization of it is shown in the following figure:

algebra_example1.png
Figure 1: roof model derived by Module Algebra

Example 2: Grid mask

In this example, two grid models describing roughness (sigmaZ) and shadowing effects (excen), both by-products of the moving planes surface interpolation with /ref ModuleGrid, are used to create a binary grid mask (0..invalid,1..valid pixel).

Generic filter syntax:

opalsAlgebra -inFile strip21_sigmaZ.tif -inFile strip21_excen.tif -outFile strip21_mask.tif -formula " r[0]<0.1 && r[1]<0.8 ? 1 : 0"

Python syntax:

opalsAlgebra -inFile strip21_sigmaZ.tif -inFile strip21_excen.tif -outFile strip21_mask_py.tif -formula "return 1 if r[0]<0.1 and r[1]<0.8 else 0"

The very same condition can be written more compact as a single (binary) return statement:

Generic filter syntax:

-formula "r[0]<0.1 && r[1]<0.8"

Python syntax:

-formula "return r[0]<0.1 and r[1]<0.8"

Example 3: DTM Mosaic I

A simple way of creating a DTM mosaic based on multiple input grids (e.g. each grid covering a single map sheet) is by selecting the first valid grid value from the list of input grids for each grid position using Python's for .. in .. : statement:

Python syntax:

opalsAlgebra -inFile strip31.tif strip21.tif strip11.tif -limit union -outFile mosaic_py.tif -formula "for x in r[:]:\n if x:\n return x"

This commands takes three input grids (strip31.tif strip21.tif strip11.tif), calculates the xy-extents of the output grid as the union of all input datasets (-limit union) and, for each grid post, returns the first grid point with a valid height (-formula "... x:\n return x") within a loop over all input grids (-formula "for x in r[:]: ...").

To calculate a DTM mosaic based on the mean value of all valid pixels the following command in generic filter sysntax can be used:

opalsAlgebra -inFile strip31.tif strip21.tif strip11.tif -limit union -outFile mosaic.tif -formula "mean(r)"

The same result can be achieved using the wild card syntax for specifying the input files:

opalsAlgebra -inFile strip?1.tif -limit union -outFile mosaic_wc.tif -formula "mean(r)"

Example 4: DTM Mosaic II

In the following example a (global) point density model is created based on strip-wise point density models. The strip-wise point density models can be created with the following command:

opalsCell -inFile strip31.odm -feature pdens -cellSize 5
opalsCell -inFile strip21.odm -feature pdens -cellSize 5
opalsCell -inFile strip11.odm -feature pdens -cellSize 5

Now, we can calculate the global point density model as:

Generic filter syntax:

opalsAlgebra -inFile strip31_z_pdens.tif strip21_z_pdens.tif strip11_z_pdens.tif -outFile pdens.tif -limit union -formula "count(r)>0 ? sum(r) : invalid"

Python syntax:

opalsAlgebra -inFile strip31_z_pdens.tif strip21_z_pdens.tif strip11_z_pdens.tif -outFile pdens_py.tif -limit union -formula "sum=0\nfor x in r[:]:\n if x:\n sum+=x\nreturn sum if sum>0 else None"

Contrary to the previous example, all valid point densities values are summed up and the sum is returned (if the sum is greater 0).

Example 5: Landscape dependent DSM

Finally, opalsAlgebra is used for the derivation of a landscape dependent surface model as described in Hollaus et al (2010). Hereby, the single DSM pixels are either selected from a raster map containing the highest elevation within a cell (dsm_max.tif) or from a grid model calculated via moving planes interpolation (dsm_mp.tif) depending on a roughness map (dsm_mp_sigma0.tif, by-product of moving planes interpolation).

Generic filter syntax:

opalsAlgebra -inFile dsm_max.tif dsm_mp.tif dsm_mp_sigma0.tif -outFile DSM.tif -formula " r[2] < 0.2 || !r[0] ? r[1] : r[0]"

Python syntax:

opalsAlgebra -inFile dsm_max.tif dsm_mp.tif dsm_mp_sigma0.tif -outFile DSM_py.tif -formula "return r[1] if r[2] < 0.2 or not r[0] else r[0]"

Figure 2 shows hill shadings of all three DSM variants derived with the following commands:

opalsShade -inFile dsm_max.tif
opalsShade -inFile dsm_mp.tif
opalsShade -inFile DSM.tif
algebra_dsm_lowRes.png
Figure 2: Hill shadings; DSM max (left), DSM moving planes (center), landspace dependent DSM (right)

References

M. Hollaus, G. Mandlburger, N. Pfeifer, W. Mücke: Land cover dependent derivation of digital surface models from Airborne laser Scanning data; in: "IAPRS Volume XXXVIII Part 3A", (2010), ISSN: 168299750; 221 - 226.

Author
gm
Date
26.05.2016