Producing GeoPackages with massivegeopackage
Massivegeopackage is a Python package for building a GeoPackage from very large raster or elevation datasets.
The core GDAL tools gdalwarp, gdal_translate and gdaladdo seem to struggle with mosaicing very large raster datasets. When the size of the source dataset is larger than 15-20 GB, processing time seems to increase drastically.
This Python package converts each source file to an individual GeoPackage, all sharing a common tile matrix. The conversion runs in a user-configurable number of parallel processes. Each completed file is immediately queued for merging into a base GeoPackage. The merging is very efficient because it consists only of SQL INSERT statements, with no geoprocessing.
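The merge step described above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. It is not the package's actual code: the table name `tiles` and the helpers `create_tile_db` and `merge_tiles` are hypothetical, and the table is a simplified stand-in for a GeoPackage tile pyramid table (a real GeoPackage also carries metadata tables such as `gpkg_contents`, which are left out here). It assumes both files use the same tile matrix, so tile coordinates can be copied unchanged.

```python
import sqlite3
import tempfile
from pathlib import Path

TILE_TABLE = "tiles"  # hypothetical table name, for illustration only
COLUMNS = "zoom_level, tile_column, tile_row, tile_data"


def create_tile_db(path, tiles):
    """Create a simplified tile table and fill it with (zoom, col, row, blob) rows."""
    con = sqlite3.connect(str(path))
    con.execute(
        f"CREATE TABLE {TILE_TABLE} ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "zoom_level INTEGER NOT NULL, "
        "tile_column INTEGER NOT NULL, "
        "tile_row INTEGER NOT NULL, "
        "tile_data BLOB NOT NULL, "
        "UNIQUE (zoom_level, tile_column, tile_row))"
    )
    con.executemany(
        f"INSERT INTO {TILE_TABLE} ({COLUMNS}) VALUES (?, ?, ?, ?)", tiles
    )
    con.commit()
    con.close()


def merge_tiles(base_path, part_path):
    """Copy every tile from part_path into base_path; return the number of rows copied."""
    con = sqlite3.connect(str(base_path))
    try:
        # Attach the per-file database so one INSERT ... SELECT can read from it.
        con.execute("ATTACH DATABASE ? AS part", (str(part_path),))
        before = con.total_changes
        con.execute(
            f"INSERT INTO main.{TILE_TABLE} ({COLUMNS}) "
            f"SELECT {COLUMNS} FROM part.{TILE_TABLE}"
        )
        con.commit()
        copied = con.total_changes - before
        con.execute("DETACH DATABASE part")
        return copied
    finally:
        con.close()


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    base, part = workdir / "base.gpkg", workdir / "part.gpkg"
    create_tile_db(base, [(0, 0, 0, b"tile-a")])
    create_tile_db(part, [(0, 1, 0, b"tile-b"), (0, 1, 1, b"tile-c")])
    print(merge_tiles(base, part), "tiles merged")
```

Because the tile blobs are copied as-is, the cost of a merge in this sketch is dominated by disk I/O rather than by any raster processing.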
Dependencies
- numpy
- Pillow
- GDAL >= 3.1.4
The easiest way to get a Python environment that meets these dependencies is to install QGIS 3.16 or newer. It includes the OSGeo4W shell, in which you can install and use the package.
Installation
The package is distributed as a Python wheel file. Install it using pip:
python -m pip install c:\path\massivegeopackage-1.3.0-py3-none-any.whl
Usage
From version 1.4.0, input arguments can be supplied on the command line. Run the module with --help to list all arguments. The first argument should be raster or elevation, depending on your source data.
Examples:
python -m massivegeopackage --help
python -m massivegeopackage raster --srcfolder c:\data\geotiff --targetfolder c:\data\gpkg_output
python -m massivegeopackage elevation --srcfolder c:\data\50_dtm --areapath c:\norway\counties --targetfolder c:\data\gpkg_output --targetdatatype 32
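When the tool is driven from a script rather than typed by hand, the same commands can be assembled in Python. The sketch below is only an illustration: `build_command` is a hypothetical helper, not part of massivegeopackage. Only the flags shown in the examples above (--srcfolder, --targetfolder, --areapath, --targetdatatype) are taken from this document. Passing other parameters as `--name value` is an assumption, and switches that take no value (such as recursive or debug) are not handled.

```python
import sys


def build_command(mode, srcfolder, targetfolder, **options):
    """Build an argument list for `python -m massivegeopackage` (hypothetical helper)."""
    if mode not in ("raster", "elevation"):
        raise ValueError("mode must be 'raster' or 'elevation'")
    cmd = [
        sys.executable, "-m", "massivegeopackage", mode,
        "--srcfolder", str(srcfolder),
        "--targetfolder", str(targetfolder),
    ]
    # Assumption: every extra option is passed as "--name value".
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "elevation",
        r"c:\data\50_dtm",
        r"c:\data\gpkg_output",
        areapath=r"c:\norway\counties",
        targetdatatype=32,
    )
    # Print the command; pass the list to subprocess.run(cmd, check=True) to execute it.
    print(" ".join(cmd))
```

Building the command as a list avoids shell quoting problems with Windows paths that contain spaces.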
For version information, use this command:
python -m pip show massivegeopackage
raster
Input files should be homogeneous (same projection, dimensions, bands, pixel size).
Supported input:
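Format | Band configuration |
---|---|
Any raster format supported by GDAL | 1 band color index (8 bit) |
Any raster format supported by GDAL | 1 band grayscale (8 bit) |
Any raster format supported by GDAL | 3 band RGB (24 bit) |
Any raster format supported by GDAL | 4 band RGBA (32 bit) |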
Parameters:
Parameter | Required? | Description |
---|---|---|
srcfolder | Yes | Full path to a folder containing source raster files. |
areapath | No | Full path to a vector dataset. Each layer in the dataset should contain a single polygon feature. One clipped GeoPackage will be created for each layer. |
targetfolder | Yes | Empty folder for temporary files, logs and the completed GeoPackage. |
input_file_type | No | File extension on input files. Default is tif |
nodata | No | Pixel value to make transparent in the completed GeoPackage. Example: 0 0 0 |
co | No | Creation options for the GDAL GeoPackage driver. Example: tile_format=jpeg,quality=50 |
num_processes | No | Number of parallel processes to use. Most common storage devices will become a bottleneck with 20 or more parallel processes. |
recursive | No | Search for source files in subfolders |
debug | No | Log debug messages in addition to info |
nocleanup | No | Do not clean up temporary GeoPackage files |
elevation
Input files should be homogeneous (same projection, dimensions, bands, pixel size).
If the input files are Float32 and the target is set to 16 bit, a scale and offset are computed for each tile. These are applied to each pixel, and the result is rounded to the nearest integer. This stretches the tile's value range to use the full range of a 16 bit unsigned integer (0-65534). The scale and offset are reversed when an application reads from the output file. This way, an effective precision of around 0.01 - 0.001 meters is achieved (less varied source data results in higher precision).
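The per-tile scale and offset can be illustrated with a small sketch in plain Python. It is not the package's implementation: the function names are hypothetical, and nodata handling and the place where scale and offset are stored are not covered. The upper code value of 65534 is taken from the range stated above. The offset is assumed to be the tile minimum and the scale to be the value range divided by 65534.

```python
def quantize_tile(values, max_code=65534):
    """Map float values onto integer codes 0..max_code; return (codes, scale, offset)."""
    lo, hi = min(values), max(values)
    offset = lo
    # A flat tile has no range to stretch; use a scale of 1.0 to avoid dividing by zero.
    scale = (hi - lo) / max_code if hi > lo else 1.0
    codes = [round((v - offset) / scale) for v in values]
    return codes, scale, offset


def restore_tile(codes, scale, offset):
    """Reverse the quantization, as a reading application would."""
    return [c * scale + offset for c in codes]


if __name__ == "__main__":
    heights = [100.0, 250.25, 755.34]  # value range of 655.34 m within one tile
    codes, scale, offset = quantize_tile(heights)
    restored = restore_tile(codes, scale, offset)
    worst = max(abs(a - b) for a, b in zip(heights, restored))
    print(f"step={scale:.5f} m, worst error={worst:.5f} m")
```

Under these assumptions the step between adjacent codes equals the tile's value range divided by 65534: a tile spanning about 655 m gets steps of about 0.01 m, and a tile spanning about 65 m gets steps of about 0.001 m, which matches the precision range quoted above.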
Supported input and corresponding output:
Format | Input configuration | Output |
---|---|---|
Any raster format supported by GDAL | 1 band (Int16) | 1 band (UInt16) |
Any raster format supported by GDAL | 1 band (UInt16) | 1 band (UInt16) |
Any raster format supported by GDAL | 1 band (Float32) | 1 band (UInt16 or Float32) |
Parameters:
Parameter | Required? | Description |
---|---|---|
srcfolder | Yes | Full path to a folder containing source raster files. |
areapath | No | Full path to a vector dataset. Each layer in the dataset should contain a single polygon feature. One clipped GeoPackage will be created for each layer. |
targetfolder | Yes | Empty folder for temporary files, logs and the completed GeoPackage. |
input_file_type | No | File extension on input files. Default is tif |
targetdatatype | No | Datatype for output GeoPackage - 32 or 16 bits. |
num_processes | No | Number of parallel processes to use. Most common storage devices will become a bottleneck with 20 or more parallel processes. |
recursive | No | Search for source files in subfolders |
debug | No | Log debug messages in addition to info |
nocleanup | No | Do not clean up temporary GeoPackage files |