Geolocation

From Maria GDK Wiki

Revision as of 16:19, 1 August 2019

The Maria GDK Geolocation service allows fast, faceted free-text searches for partial or full placenames. General placename searches are supported, as well as street address searches.

A separate conversion step is required to convert source data into a specialized SQLite database. Converters exist for GNS and Geonames. For simple formats (i.e. CSV-based ones), writing a new converter is relatively straightforward.

Converting placename data

When converting a file containing placename information to a Maria2012 location database, you need a reader for that specific file format. See the chapter on [[./readers|creating readers]] for details.

Readers are executed from the command line. Usage: <input file> <input format> <sqlite database file> [/clear]

Input arguments:
[1] - Path to file containing placename information. Must be in a format supported by the LocationServiceSqliteLoader.
[2] - Input format string. Currently supported: "ssr", "geonames", "gns", "gns_us" and "matrikkel".
[3] - Path to output sqlite file.
[4] - Optional argument /clear. If the output file already exists, placename data is added to it. Use /clear to force a fresh database.

When adding to an existing database, tables might end up with duplicate entries.

Example:

TPG.GeoFramework.LocationServiceSqliteLoader.exe c:\LocationData\GNS\no.txt gns c:\ServiceTest\LocationData\gns_no.location.sqlite /clear

Support data

The readers use a small number of csv/txt files to map information in the SQLite databases (e.g. mapping from country codes (NO) to country names (Norway)). If these files are not present when converting a database, an error is reported, but the conversion will still work in some cases. The files should be placed in the same folder as the source data files.

Default data files can be found in the Maria GDK source repo at \Src\Layers\Location\TPG.GeoFramework.GeolocSqliteLoader\SupportData

Feature classes and codes

Dsg-files should map feature codes and feature classes to descriptive texts.

The file should contain comma-separated strings in the format code,name,text,fea_class, e.g. 'E,"mosque","a building for public Islamic worship","S - Spot Features",'. The first three entries are the feature code, feature name and feature code description. The last entry (fea_class) combines the feature class code and the feature class text, separated by '-'.

Example:

CODE,NAME,TEXT,FEA_CLASS,
"MND","mound(s)","a low, isolated, rounded hill","T - Hypsographic",
"MNDU","undersea mound","a low, isolated, rounded hill","U - Undersea",
"MNFE","iron mine(s)","a mine where iron ore is extracted","S - Spot Features",
"MNMT","monument","a commemorative structure or statue","S - Spot Features",
"MNQ","abandoned mine","abandoned mine","S - Spot Features",

Used by:

Reader Filename
gns dsg.csv
geonames dsg.csv
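
The dsg format described above can be read with a few lines of Python. This is a minimal sketch, not the loader's actual implementation: the function name parse_dsg and the in-memory sample are assumptions made for illustration, and a real run would open dsg.csv instead.

```python
import csv
import io

# Sample rows copied from the example above (illustration only).
SAMPLE = '''CODE,NAME,TEXT,FEA_CLASS,
"MND","mound(s)","a low, isolated, rounded hill","T - Hypsographic",
"MNFE","iron mine(s)","a mine where iron ore is extracted","S - Spot Features",
'''

def parse_dsg(stream):
    """Return {feature code: (name, text, class code, class text)}.

    parse_dsg is a hypothetical helper, not part of the Maria GDK API.
    """
    reader = csv.reader(stream)
    next(reader)  # skip the CODE,NAME,TEXT,FEA_CLASS header row
    features = {}
    for row in reader:
        if len(row) < 4:
            continue  # ignore blank or malformed lines
        # The trailing comma on each line yields an empty fifth field; drop it.
        code, name, text, fea_class = row[:4]
        # fea_class combines class code and class text, separated by '-'.
        class_code, _, class_text = fea_class.partition("-")
        features[code] = (name, text, class_code.strip(), class_text.strip())
    return features

features = parse_dsg(io.StringIO(SAMPLE))
```

Using a CSV parser rather than splitting on commas matters here, because descriptions such as "a low, isolated, rounded hill" contain commas inside the quotes.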

Country codes

Cc-files should map country codes to country names.
The file should contain comma-separated strings in the format code,name, e.g. 'ZI,"Zimbabwe",'.

Example:

CODE,NAME,
UV,"Burkina Faso",
CM,"Cameroon",
CJ,"Cayman Islands",
IP,"Clipperton Island",
CG,"Congo, Democratic Republic of the",
CY,"Cyprus",
DR,"Dominican Republic",

Used by:

Reader Filename
gns cc.csv
gns_us cc.csv
geonames cc.csv
ssr cc.csv

Administrative data

Adm-files should map administrative codes (numbers and/or letters) to descriptive text.
Use Adm1 for "Administrative division level 1, US state, Norwegian fylke", Adm2 for "Administrative division level 2, US county, Norwegian kommune" and Adm3 for "Administrative division level 3, US ?, Norwegian poststed".
The files should contain comma-separated strings in the format code,name, e.g. '1662,"Klæbu kommune",'.

Example:

CODE,NAME,
1662,"Klæbu kommune",
0604,"Kongsberg kommune",
0402,"Kongsvinger kommune",
0815,"Kragerø kommune",
1001,"Kristiansand kommune",

Used by:

Reader Filename
gns See Adm data for gns reader
matrikkel matrikkel_adm.csv
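
As a rough sketch of how a code,name file like the one above might be read, the snippet below keeps every code as a string. The helper name parse_code_name is an assumption for illustration. The point is that codes such as 0604 carry a leading zero, which would be lost if the code were converted to an integer.

```python
import csv
import io

# Sample rows copied from the example above (illustration only).
SAMPLE = '''CODE,NAME,
1662,"Klæbu kommune",
0604,"Kongsberg kommune",
'''

def parse_code_name(stream):
    """Return {code: name}; parse_code_name is a hypothetical helper."""
    reader = csv.reader(stream)
    next(reader)  # skip the CODE,NAME header row
    # Codes stay strings so that "0604" is not collapsed to 604.
    return {row[0]: row[1] for row in reader if len(row) >= 2}

adm_names = parse_code_name(io.StringIO(SAMPLE))
```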

Adm data for gns reader

The file should contain comma-separated strings in the format adm1_cd,country_cd,adm1_name, e.g. '"04","NO","Buskerud",'.

Example:

ADM1,COUNTRY_CD,ADM1_NAME,
"04","QA","Al Khawr wa adh Dhakhīrah",
"34","WA","Okavango",
"35","AF","Laghmān",
"35","AG","Aïn Defla",
"35","BF","San Salvador",

Used by:

Reader Filename
gns gns_adm1.csv
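
The example rows show that the same ADM1 code ("35") occurs for several countries, so a lookup needs both the country code and the ADM1 code. A minimal sketch under that assumption follows; the helper name parse_gns_adm1 is hypothetical.

```python
import csv
import io

# Sample rows copied from the example above (illustration only).
SAMPLE = '''ADM1,COUNTRY_CD,ADM1_NAME,
"35","AF","Laghmān",
"35","AG","Aïn Defla",
'''

def parse_gns_adm1(stream):
    """Return {(country code, adm1 code): adm1 name}."""
    reader = csv.reader(stream)
    next(reader)  # skip the ADM1,COUNTRY_CD,ADM1_NAME header row
    # Key on the pair, since an ADM1 code is only unique within a country.
    return {(row[1], row[0]): row[2] for row in reader if len(row) >= 3}

adm1_names = parse_gns_adm1(io.StringIO(SAMPLE))
```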

Navnetyper

Navnetyper.txt should map feature classes and feature codes to descriptive texts.

The file should contain comma-separated strings in the format classcode,classtext,code,codetext, followed by a fifth field describing the code, e.g. 'FC1,"Terrengformer",N1,"Berg","Mindre fjell"'.

Example:

FC6,"Samferdsel",N165,"Landingsstripe","Landingsplass for privat flygning"
FC6,"Samferdsel",N232,"Plass/torg","I tettsted eller by"
FC7,"Administrative områder",N180,"Nasjon","Selvstendig stat / land (offisielt navn)"
FC7,"Administrative områder",N181,"Fylke","Offisielt navn"
FC7,"Administrative områder",N182,"Kommune","Offisielt navn"

Used by:

Reader Filename
ssr navnetyper.txt

Creating readers for location data

A Maria2012 GeoLoc placename data reader must be able to convert files with placename information to a Maria2012-readable SQLite database. Teleplan Globe provides readers for e.g. GNS, GeoNames and SSR.

Readers should implement the interface IPlaceNameDataInterfacer.

Each reader must implement the functions CreateTables and LoadData. CreateTables is responsible for creating the tables used by the GeoLoc service when running placename searches. The mandatory tables are the RTree table for spatial searches, the FTS table for free-text search and the main table for available placename information. The GeoLoc service also uses the following tables when available: feature class, feature code, country code and metadata. LoadData reads data from the source files into the database tables.

IPlaceNameDB is available for database creation helper functions.

Mandatory tables

Main table

The main table placenames_main should contain all placename information extracted from a datasource. Use the placename_alt column for alternative versions of the main placename data if available. Country code, feature code/class and administrative information columns are used for facets and metadata searches.

Column         Type  Description
lat            real  latitude, WGS84 decimal degrees
long           real  longitude, WGS84 decimal degrees
feature_class  text  feature class based on raw data, e.g. 'H' for hydrography for GNS or 'Samferdsel' for SSR
feature_code   text  feature code, unique code
cc1            text  Primary country code
cc2            text  Secondary country codes, comma separated
placename      text  Placename (reading order with diacritics)
placename_alt  text  Alternate spellings of placename, comma separated
adm1           text  Administrative division level 1, US state, Norwegian fylke
adm2           text  Administrative division level 2, US county, Norwegian kommune
adm3           text  Administrative division level 3, US ?, Norwegian poststed
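
To make the column list concrete, here is a minimal sketch of creating and filling such a table with Python's sqlite3 module. It only mirrors the columns listed above. Constraints, indexes and key columns used by the real readers are not documented here and are left out, and the coordinates are approximate sample values.

```python
import sqlite3

# In-memory database for illustration; a real reader writes to the output file.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE placenames_main (
        lat REAL,
        long REAL,
        feature_class TEXT,
        feature_code TEXT,
        cc1 TEXT,
        cc2 TEXT,
        placename TEXT,
        placename_alt TEXT,
        adm1 TEXT,
        adm2 TEXT,
        adm3 TEXT
    )
""")

# Approximate sample values, not taken from any source dataset.
sample_rows = [
    (59.67, 9.65, "NO", "Kongsberg"),
    (60.19, 12.00, "NO", "Kongsvinger"),
]
conn.executemany(
    "INSERT INTO placenames_main (lat, long, cc1, placename) VALUES (?, ?, ?, ?)",
    sample_rows,
)

# A facet is a count per distinct value, here per primary country code.
facets = conn.execute(
    "SELECT cc1, COUNT(*) FROM placenames_main GROUP BY cc1"
).fetchall()
```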

FTS table

The Maria2012 location service uses the FTS4 extension in SQLite to create a table (placenames_fts) with a built-in full-text index. This index allows us to efficiently query the database for all rows that contain one or more words/tokens.
All strings meant to be searchable must be added to the FTS table. Using tagged metadata gives more exact searches. E.g. searching for <placename> cc:NO will return placenames with the related country code metadata "NO", while <placename> NO will match all metadata containing "NO".

Column         Description                   Type
names          searchable placenames         text
meta_tagged    searchable tagged metadata    text
meta_untagged  searchable untagged metadata  text
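
The following sketch shows the general FTS4 mechanism with Python's sqlite3 module, assuming the bundled SQLite was built with FTS4 support. It creates a table with the three columns above and runs a prefix search and an untagged metadata search. The storage format of tagged metadata is not specified here, so meta_tagged is left empty, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE placenames_fts "
    "USING fts4(names, meta_tagged, meta_untagged)"
)

# Invented sample rows; docid would normally link back to the main table.
conn.executemany(
    "INSERT INTO placenames_fts (docid, names, meta_tagged, meta_untagged) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "Kongsberg", "", "NO"),
        (2, "Kongsvinger", "", "NO"),
        (3, "Cameroon", "", "CM"),
    ],
)

# Partial placename search: a trailing * makes the term a prefix query.
prefix_hits = conn.execute(
    "SELECT docid, names FROM placenames_fts WHERE names MATCH ? ORDER BY docid",
    ("kongs*",),
).fetchall()

# Untagged search: both terms must occur somewhere in the row, in any column.
untagged_hits = conn.execute(
    "SELECT docid FROM placenames_fts WHERE placenames_fts MATCH ? ORDER BY docid",
    ("kongsberg NO",),
).fetchall()
```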

RTree table

The RTree table (spatial_index) is a mandatory table used for spatial searches. The Maria GDK location database rtree table should have a primary key and two column pairs representing the minimum and maximum values (bounding box) for a 2-dimensional object.

Column   Description        Type
id       primary key        integer
minlat   minimum latitude   float
maxlat   maximum latitude   float
minlong  minimum longitude  float
maxlong  maximum longitude  float
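
A bounding-box lookup against such a table can be sketched as follows, assuming the bundled SQLite includes the R*Tree module. Point features are stored with identical minimum and maximum values. The helper ids_in_box and the approximate coordinates are assumptions for illustration, not the service's actual query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE spatial_index "
    "USING rtree(id, minlat, maxlat, minlong, maxlong)"
)

# (id, lat, long) with approximate sample coordinates.
places = [(1, 59.67, 9.65), (2, 60.19, 12.00), (3, 58.15, 8.00)]
conn.executemany(
    "INSERT INTO spatial_index VALUES (?, ?, ?, ?, ?)",
    # A point's bounding box has min == max on both axes.
    [(pid, lat, lat, lon, lon) for pid, lat, lon in places],
)

def ids_in_box(conn, south, north, west, east):
    """Return ids of entries fully contained in the given box (hypothetical helper)."""
    rows = conn.execute(
        "SELECT id FROM spatial_index "
        "WHERE minlat >= ? AND maxlat <= ? AND minlong >= ? AND maxlong <= ? "
        "ORDER BY id",
        (south, north, west, east),
    )
    return [row[0] for row in rows]

hits = ids_in_box(conn, 59.0, 61.0, 9.0, 13.0)
```

The ids returned by the spatial query can then be used to fetch the matching rows from the main table.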

Optional tables

Feature class table

Optional table (fclass) containing feature class/theme information extracted from datasource.
Example feature classes can be 'H' for hydrography features in GNS or 'Samferdsel' for Norwegian SSR data.

Column  Description                         Type
code    feature class                       text
name    feature name                        text
desc    description of feature class entry  text

Feature code table

Optional table (fcode) containing feature code/object type information extracted from datasource.
An example feature is 'lake' in GNS.

Column  Description                        Type
code    feature code                       text
name    feature name                       text
desc    description of feature code entry  text

Country code table

Optional table (cc) listing all represented country codes found in the datasource, e.g. "NO"/"Norway".

Column  Description                 Type
code    country code (primary key)  char(2)
name    country name                text

Administration code tables

Optional tables (admin1, admin2, admin3) used for collecting administrative information, e.g. state, county, fylke etc.

Column  Description          Type
code    administration code  text
name    administration name  text

Metadata table

Optional table (metadata) used for additional data (e.g. data source producer, source dataset etc.) collected from the datasource.

Column  Description                    Type
name    metadata name                  text
value   metadata value                 text
desc    description of metadata entry  text

Sample reader code

See [[./core_geolocation_readers_samplecode.html|sample reader code]].