Spatial databases - spatial analysis(site selection)
What is a database?
A database is systematic collection of data. These data can be manipulated by adding, deleting, updating and performing analysis. Examples of existing databases include; MySQL, PostgreSQL, MongoDB, Oracle, Microsoft SQL Server, Redis, ElasticSearch , Cassandra etc. A database can either be a SQL or NoSQL. SQL database has its data well structured, the data is arranged in tables, columns and rows, these tables can be related to each other by foreign keys, examples are Oracle, MySQL and PostgreSQL. NoSQL databases on the other hand are non-relational, these databases have dynamic schemas for unstructured data, they can be documents, graphs, key-value or wide-column stores. Examples are MongoDB, Cassandra and Redis.
Spatial Databases
Spatial databases unlike normal databases are designed to specifically support geometry or geography data. The tables are optimized with a geometry or geography column that holds coordinate values of the data. The data takes different forms of vector, raster and topology. These databases thus support the accessibility, management, modification, updating, controlling and organizing of geographical information systems data. Such databases are PostgreSQL should have a PostGIS extension installed, ESRI geodatabases and IBM Db2 Family.
To get started with spatial databases a basic understanding of SQL is necessary. In this article we learn about getting started with spatial databases using PostGIS and learn how to perform spatial relations, we shall use the Transnzoia County data in Kenya. For data visualization and editing several desktop applications can be used such us, uDig, OpenJUMP, QGIS and gvSIG. In this article we shall make use of QGIS, I prefer QGIS because it has a user friendly interface and it is easy to use.
Creating Spatial Databases
It is pretty simple and straight forward. We create databases using psql or GUI, psql is strictly a command line tool in case you have no GUI installed such as PgAdmin, DBeaver among others. The commands to create a database and add postGIS extension are;
CREATE DATABASE suitability_analysis
CREATE EXTENSION postgis
Adding Data to Spatial Database
We add data to the database in various ways. First we can create tables with columns and insert data into the tables remember our tables should have a geometry column e.g.
CREATE TABLE towns(
id serial PRIMARY KEY,
name VARCHAR (50),
geom geometry(point, 4326))
Our table’s name is towns and it has three columns, id, name and geom. Varchar and geometry are data types of each column. With this option we will have to generate the spatial index with CREATE INDEX command then insert data by INSERT INTO.
CREATE INDEX idx_towns_geom ON towns USING gist(geom)
INSERT INTO towns(name, geom) VALUES ('Kitale', ST_GeomFromText('POINT(1,2)',4326))
Secondly we make use of postGIS shapefile import/exporter manager and lastly any of the desktop tools, we settled on using QGIS. We simply connect QGIS to our database using PostGIS in the database tool, we load data into QGIS then drag and drop it into the database.
Demonstration in QGIS — Identification of suitable places for land purchase for greenhouse projects in Trans Nzoia County
We shall perform suitability analysis to identify places to buy and setup a green house in Trans Nzoia County in Kenya for practice. This will be a simple project to show what you can do with PostGIS, the results will not depict the accurate results on ground since I will not factor in all the required criteria.
While conducting my research on factors to be considered when selecting site for greenhouse projects, I learnt that location factors to be considered include;
- Nearby to roads and quality of the road surface — Nearness to good roads enables easy access to the greenhouses and easy transportation of the farm produce to the market.
- Proximity to Market — Nearness to market depicts ready and available consumers.
- Water availability- Ample and inexpensive water is needed since a lot of irrigation is needed in the farm. Water should be easily available to cut on costs
From this I went ahead and downloaded Towns, Rivers, Forests and Roads data that I will use in this project, then randomly came up with these distances as my parameter;
- Distance to rivers <1000m
- Distance to roads< 1000m
- Distance to towns<50,000m
- Distance to Special Protected areas(forests, swamps and woodlands)<5000m
The image below shows our starting point, with all the necessary data loaded in a new QGIS project.
Next we perform the intersection and buffer functionalities. In PostGIS we use the ST_Buffer and ST_Intersection functions, they both work as the buffer and intersection geoprocessing tools. It is with the ST_Buffer function that we put into consideration our criteria above while ST_Intersection outputs the sites that meet all our criteria. The image below shows how our SQL query written in DB manager looks like.
WITH intersect_one AS(SELECT ST_Intersection(ST_Buffer(t.geom, 0.5), ST_Buffer(rds.geom, 0.01))as geom FROM towns t roads rds),
intersect_two AS(SELECT ST_Intersection(intersect_one.geom, ST_Buffer(rds.geom, 0.01))as geom FROM intersect_one, rivers rvs),
site AS(ST_Intersection(intersect_two.geom, ST_Buffer(f.geom,0.05)) FROM intersect_two, forests f)
SELECT * FROM sites
The above code generates all the sites that meet our criteria. To get a specific point location for site viewing you can generate random points within the sites to lead you.
This is a simple analysis to show the potential of PostGIS, its effectiveness and ease. Suitability modelling can solve a variety of problems , Site selection for wind farms, where to site new house, school, hospital development projects, where to locate firefighting crews , where to deploy troops in a military's operation, identifying potential new dumping sites in urban areas, potential wildlife habitat areas as well as optimal routes to destinations.
Advantages of using spatial databases
Some of the main pros of using spatial databases is because they support big data, they also save one from creating nonessential data layers. The advantages of postGIS are that; it is free and open Source, postGIS has lots of spatial functions (1000+), it supports both vector(points, line, polygon, multi-polygon, etc.) and raster operations, has both geometry and geography, enables Spatial Indexing, leverage PostgreSQL capabilities e.g. Parallel queries, roles, etc., supports 3D and 4D and easy Integration with both open source and commercial software.
Explore more about PostGIS from the references below.
References and more learning resources
Dominik Mikiewicz, Michal Mackiewicz, Tomasz Nycz.. 2017 Mastering PostGIS — covers PostGIS 2.2
PostGIS documentation:
POSTGRESQL:
PostgreSQL Tutorial:
https://www.postgresqltutorial.com/
R. Obe and L. Hsu. 2015. PostGIS in Action, 2nd Edition 2nd Edition Manning Publications Co.