Producing GeoPackages with massivegeopackage

From Maria GDK Wiki
Revision as of 09:52, 25 March 2021 by Mbu (talk | contribs) ()

Massivegeopackage is a Python package for building a GeoPackage from very large raster or elevation datasets.

The core GDAL tools gdalwarp, gdal_translate and gdaladdo struggle when mosaicking very large raster datasets. Once the source dataset exceeds roughly 15-20 GB, processing time appears to grow far faster than the data size.

This Python package converts each source file to its own GeoPackage, all sharing a common tile matrix. The conversions run in a user-configurable number of parallel processes. Each completed file is immediately queued for merging into a base GeoPackage. Merging is very efficient because it consists only of SQL INSERT statements, with no geoprocessing.
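The merge step can be illustrated with a minimal sketch using Python's built-in sqlite3 module. It is not the package's actual implementation: the function name merge_tiles and the table name tiles are hypothetical, the tile table is assumed to use the standard GeoPackage tile columns (zoom_level, tile_column, tile_row, tile_data), and the two files are assumed to contain no overlapping tiles. GeoPackage metadata tables such as gpkg_contents are not touched.

```python
import sqlite3


def merge_tiles(base_path, part_path, table="tiles"):
    """Copy every tile row from part_path into base_path; return the row count.

    Assumption: both files hold a tile table with the same name and the
    standard GeoPackage tile columns, and no tile exists in both files.
    """
    con = sqlite3.connect(base_path)
    try:
        # Make the finished per-file GeoPackage visible as schema "part".
        con.execute("ATTACH DATABASE ? AS part", (part_path,))
        with con:  # commits on success, rolls back on error
            cur = con.execute(
                f'INSERT INTO main."{table}" '
                "(zoom_level, tile_column, tile_row, tile_data) "
                "SELECT zoom_level, tile_column, tile_row, tile_data "
                f'FROM part."{table}"'
            )
            copied = cur.rowcount
        con.execute("DETACH DATABASE part")
        return copied
    finally:
        con.close()
```

Because the tiles already share a tile matrix, the copy is a plain row transfer; no pixel is decoded or resampled.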

Dependencies

  • numpy
  • Pillow
  • GDAL >= 3.1.4

The easiest way to get a Python environment that meets these dependencies is to install QGIS 3.16 or newer. It includes the OSGeo4W shell, where you can install and use the package.
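To confirm that an environment meets the requirements listed above, a short check along these lines may help. The helper names (version_tuple, gdal_is_new_enough, report_versions) are made up for this sketch; only the GDAL >= 3.1.4 minimum comes from the dependency list.

```python
import re

MIN_GDAL = (3, 1, 4)  # minimum GDAL version from the dependency list


def version_tuple(text):
    """Turn a version string such as '3.1.4' into a tuple of up to three ints."""
    return tuple(int(number) for number in re.findall(r"\d+", text)[:3])


def gdal_is_new_enough(version_text, minimum=MIN_GDAL):
    """Return True if the given GDAL version string meets the minimum."""
    return version_tuple(version_text) >= minimum


def report_versions():
    """Print the versions found; raises ImportError if a dependency is missing."""
    import numpy
    import PIL
    from osgeo import gdal

    gdal_version = gdal.VersionInfo("RELEASE_NAME")
    print("numpy ", numpy.__version__)
    print("Pillow", PIL.__version__)
    status = "OK" if gdal_is_new_enough(gdal_version) else "too old"
    print("GDAL  ", gdal_version, status)
```

Calling report_versions() from the OSGeo4W shell's Python should list all three packages.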

Installation

The package is distributed as a Python wheel file. Install it using pip:

python -m pip install c:\path\massivegeopackage-1.3.0-py3-none-any.whl

General usage

The package consists of two separate modules: raster and elevation. After installation, the modules can be run with the commands:

python -m massivegeopackage.raster

python -m massivegeopackage.elevation

For version information, use either of these commands:

python -m pip show massivegeopackage

python -m massivegeopackage
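The installed version can also be read from a script. This is a small sketch using the standard library's importlib.metadata (Python 3.8 or newer); the function name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package="massivegeopackage"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```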

From version 1.4.0, parameters can be supplied on the command line. Run either of the modules with --help to list all parameters. If no parameters are given at the command line, the program will use a series of input prompts instead.

Examples:

python -m massivegeopackage.raster -srcfolder c:\data\geotiff -targetfolder c:\data\gpkg_output

python -m massivegeopackage.elevation -srcfolder c:\data\50_dtm -areapath c:\norway\counties -targetfolder c:\data\gpkg_output -targetdatatype 32
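When the modules are driven from another Python script, the same command lines can be assembled programmatically. The sketch below assumes that every parameter takes the single-dash "-name value" form shown in the examples above; flags without a value (for instance recursive) are not handled. The helper names build_command and run_module are made up for illustration.

```python
import subprocess
import sys


def build_command(module, **params):
    """Build the argument list for 'python -m massivegeopackage.<module>'.

    Assumption: each parameter is passed as '-name value', as in the
    examples above.
    """
    command = [sys.executable, "-m", f"massivegeopackage.{module}"]
    for name, value in params.items():
        command += [f"-{name}", str(value)]
    return command


def run_module(module, **params):
    """Run the module and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_command(module, **params), check=True)
```

For example, run_module("raster", srcfolder=r"c:\data\geotiff", targetfolder=r"c:\data\gpkg_output") corresponds to the first example command.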

If the package cannot be installed, you can instead unzip the .whl file and run the script files directly:

python c:\massivegeopackage-1.3.0-py3-none-any\massivegeopackage\raster.py

python c:\massivegeopackage-1.3.0-py3-none-any\massivegeopackage\elevation.py
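A wheel is an ordinary zip archive, so it can be unpacked with any zip tool. As a sketch, Python's zipfile module does the same; the function name unpack_wheel is hypothetical, and the output folder is named after the wheel file, matching the paths used above.

```python
import zipfile
from pathlib import Path


def unpack_wheel(wheel_path, dest_dir):
    """Extract a .whl file into dest_dir/<wheel name without .whl>."""
    wheel_path = Path(wheel_path)
    target = Path(dest_dir) / wheel_path.stem  # drops only the .whl suffix
    with zipfile.ZipFile(wheel_path) as archive:
        archive.extractall(target)
    return target
```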

massivegeopackage.raster

Input files should be homogeneous (same projection, dimensions, bands, pixel size).
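Checking a folder for homogeneity before a long run can save time. The following is a rough sketch, not part of the package: describe reads the four properties with the GDAL Python bindings, and find_outliers reports the files that differ from the most common description. Both function names are made up, and pixel sizes are compared for exact equality, which may be too strict for some datasets.

```python
from collections import Counter


def describe(path):
    """Return (projection, width, height, band count, pixel width, pixel height)."""
    from osgeo import gdal  # imported here so find_outliers works without GDAL

    dataset = gdal.Open(path)
    if dataset is None:
        raise OSError(f"GDAL could not open {path}")
    transform = dataset.GetGeoTransform()
    return (
        dataset.GetProjection(),
        dataset.RasterXSize,
        dataset.RasterYSize,
        dataset.RasterCount,
        transform[1],  # pixel width
        transform[5],  # pixel height (normally negative)
    )


def find_outliers(descriptions):
    """Given {path: description}, return paths that differ from the majority."""
    if not descriptions:
        return []
    reference, _count = Counter(descriptions.values()).most_common(1)[0]
    return sorted(path for path, desc in descriptions.items() if desc != reference)
```

A typical use would be find_outliers({p: describe(p) for p in paths}), where paths is the list of source files.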

Supported input:

Format                               Band configuration
Any raster format supported by GDAL  1 band color index (8 bit)
                                     1 band grayscale (8 bit)
                                     3 band RGB (24 bit)
                                     4 band RGBA (32 bit)

Parameters:

Parameter        Required?  Description
srcfolder        Yes        Full path to a folder containing source raster files.
areapath         No         Full path to a folder containing shapefiles. Each shapefile should contain a single polygon feature. One clipped GeoPackage will be created for each shapefile.
targetfolder     Yes        Empty folder for temporary files, logs and the completed GeoPackage.
input_file_type  No         File extension of the input files. Default is tif.
nodata           No         Pixel value to make transparent in the completed GeoPackage. Example: 0 0 0
co               No         Creation options for the GDAL GeoPackage driver. Example: tile_format=jpeg,quality=50
num_processes    No         Number of parallel processes to use. Most common storage devices will become a bottleneck with 20 or more parallel processes.
recursive        No         Search for source files in subfolders.
debug            No         Log debug messages in addition to info messages.
nocleanup        No         Do not clean up temporary GeoPackage files.
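The co value is a comma-separated list of name=value pairs. How the package parses it internally is not documented here; the sketch below only shows one plausible way to turn such a string into the NAME=VALUE list form that GDAL's Python bindings accept for creation options. The function name parse_creation_options is hypothetical.

```python
def parse_creation_options(text):
    """Split 'tile_format=jpeg,quality=50' into ['TILE_FORMAT=jpeg', 'QUALITY=50'].

    Assumption: pairs are separated by commas and values contain no commas.
    Option names are upper-cased to match GDAL's documented spelling.
    """
    options = []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        name, separator, value = item.partition("=")
        if not separator:
            raise ValueError(f"expected name=value, got {item!r}")
        options.append(f"{name.strip().upper()}={value.strip()}")
    return options
```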

massivegeopackage.elevation

Input files should be homogeneous (same projection, dimensions, bands, pixel size).

If the input files are Float32 and the target is set to 16 bit, a scale and offset are computed for each tile. These are applied to each pixel, and the result is rounded to the nearest integer. This stretches the tile's value range to use the full range of a 16-bit unsigned integer (0-65534). The scale and offset are reversed when an application reads from the output file. In this way, an effective precision of around 0.01 to 0.001 meters is achieved (less varied source data results in higher precision).
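The arithmetic can be sketched in a few lines. The formulas here are an assumption based on the description above (offset = tile minimum, scale = tile value range / 65534), not taken from the package source, and the function names are made up. Plain Python lists are used to keep the example dependency-free; real tiles would more likely be numpy arrays.

```python
MAX_CODE = 65534  # top of the 16-bit range quoted above


def encode_tile(values):
    """Map float elevations onto integers 0..65534 using a per-tile scale/offset."""
    lowest, highest = min(values), max(values)
    offset = lowest
    # A completely flat tile has no range; any non-zero scale works then.
    scale = (highest - lowest) / MAX_CODE if highest > lowest else 1.0
    encoded = [round((value - offset) / scale) for value in values]
    return encoded, scale, offset


def decode_tile(encoded, scale, offset):
    """Reverse the mapping, as a reading application would."""
    return [code * scale + offset for code in encoded]
```

Under these assumptions, a tile whose elevations span 655.34 m gets a step of 0.01 m, and a tile spanning 65.534 m gets a step of 0.001 m, which matches the precision range given above. The rounding error is at most half a step.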

Supported input and corresponding output:

Format                               Input configuration  Output
Any raster format supported by GDAL  1 band (Int16)       1 band (UInt16)
                                     1 band (UInt16)      1 band (UInt16)
                                     1 band (Float32)     1 band (UInt16 or Float32)

Parameters:

Parameter        Required?  Description
srcfolder        Yes        Full path to a folder containing source raster files.
areapath         No         Full path to a folder containing shapefiles. Each shapefile should contain a single polygon feature. One clipped GeoPackage will be created for each shapefile.
targetfolder     Yes        Empty folder for temporary files, logs and the completed GeoPackage.
input_file_type  No         File extension of the input files. Default is tif.
targetdatatype   No         Data type for the output GeoPackage: 32 or 16 bits.
num_processes    No         Number of parallel processes to use. Most common storage devices will become a bottleneck with 20 or more parallel processes.
recursive        No         Search for source files in subfolders.
debug            No         Log debug messages in addition to info messages.
nocleanup        No         Do not clean up temporary GeoPackage files.