GIS data acquisition

Geographic Information Systems

A Geographic Information System (GIS) is an integrated set of hardware and software tools designed to capture, store, manipulate, analyse, manage, and digitally present spatial (or geographic) data and related attribute information. GIS can relate information from different sources using two key index variables: space (or location) and time. Common GIS data types (models) include:

Spatial Data: Describe the absolute and relative location of geographic features.

  • Vectors

    • Arcs (Polylines): Line segments forming individual linear features
    • Polygons: Areas enclosed by arcs
    • Points: Single coordinate pairs
  • Rasters

    • Grid-Cells: single column/row positions
    • Cell size: The resolution of the data, i.e. the level of spatial detail it can represent

Attribute data: Describe characteristics of the spatial features. These characteristics can be quantitative and/or qualitative in nature. Attribute data is often referred to as tabular data.

The selection of a particular data model, vector or raster, is dependent on the source and type of data, as well as the intended use of the data. Certain analytical procedures require raster data while others are better suited to vector data.
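As a minimal illustration of the two data models, the sketch below stores vector features as coordinate lists and looks up the raster cell (row, column) that contains a point. The coordinates, the grid origin, the cell size and the helper name cell_index are hypothetical and chosen only for illustration; a north-up grid with square cells is assumed.

```python
import math

# Vector model: features are stored as coordinates.
point = (36.82, -1.29)                                  # a single (x, y) coordinate pair
polyline = [(36.0, -1.0), (36.5, -1.2), (37.0, -1.1)]   # ordered vertices of a linear feature
polygon = polyline + [polyline[0]]                      # closed ring: first vertex repeated at the end


def cell_index(x, y, x_min, y_max, cell_size):
    """Return the (row, column) of the raster cell containing (x, y).

    Assumes a north-up grid whose top-left corner is (x_min, y_max) and
    whose cells are squares of side cell_size (same units as x and y).
    """
    col = math.floor((x - x_min) / cell_size)
    row = math.floor((y_max - y) / cell_size)
    return row, col


# Hypothetical grid: top-left corner at (30.0, 5.0), 0.5-degree cells.
row, col = cell_index(point[0], point[1], x_min=30.0, y_max=5.0, cell_size=0.5)
print(row, col)
```

Halving the cell size doubles the number of rows and columns, which is why the cell size controls how much spatial detail a raster can hold.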

GIS data sources

Every day, governments, the private sector and development aid organizations collect data to inform, prepare and implement policies and investments. Yet, while elaborate reports are made public, the data underpinning the analysis remain locked in a computer, out of reach. Because of this, the tremendous value they could bring to public and private actors in data-poor environments is too often lost. An open data platform launched recently by The World Bank Group and several partners is trying to change this energy data paucity. It has been developed as a public good available to governments, development organizations, non-governmental organizations, academia, civil society and individuals to share data and analytics that can help achieve universal access to modern energy services. The database covers a variety of open geospatial datasets of various context and granularity. KTH Division of Energy Systems Analysis contributes on a continuous basis by providing relevant datasets for electrification planning.


Indicative open libraries of GIS data

Over the past few years, KTH dESA has been actively involved in the field of geospatial analysis. The following table presents a list of libraries and directories that provide access to open GIS data.

Source | Type | Link
Penn | World per region
MIT | World per region
EDEnextdata | World per region
Stanford | World per region
GIS Lounge | Finding GIS data
dragons8mycat | Different countries
rtwilson | Different types
Planet OSM | Different types
Berkeley | Different types
Kings College | Different types
CSRC | Different types
Data Discovery Center | Different types
Spatial Hydrology | Different types
Africa Information Highway | Different types

Country specific databases

With geospatial analysis gaining momentum in many research areas, many countries have set up their own geo-databases in an effort to facilitate interdisciplinary research activities in a geospatial context. Here are a few examples:

Country Source
East Timor

GIS data in OnSSET

OnSSET is a GIS-based tool and therefore requires data in a geographical format. In the context of the power sector, the necessary data cover current and planned infrastructure (electric grid networks, road networks, power plants, industry, public facilities), population characteristics (distribution, location), economic and industrial activity, and local renewable energy flows. The table below lists all layers required for an OnSSET analysis.

# | Dataset | Type | Description
1 | Population density & distribution | Raster | Spatial identification and quantification of the current (base year) population. This dataset forms the basis of the OnSSET analysis, as it is directly connected with the electricity demand and the assignment of energy access goals.
2 | Administrative boundaries | Polygon | Delineates the boundaries of the analysis.
3 | Existing grid network | Line shapefile | Used to identify and spatially calibrate the currently electrified/non-electrified population.
4 | Substations | Point shapefile | Current substation infrastructure, used to identify and spatially calibrate the currently electrified/non-electrified population. It is also used to specify grid extension suitability.
5 | Roads | Line shapefile | Current road infrastructure, used to identify and spatially calibrate the currently electrified/non-electrified population. It is also used to specify grid extension suitability.
6 | Planned grid network | Point shapefile | Represents the future plans for the extension of the national electric grid. It also includes extensions to current/future substations, power plants, mines and quarries.
7 | Nighttime lights | Raster | Dataset used to identify and spatially calibrate the currently electrified/non-electrified population.
8 | GHI | Raster | Provides information about the Global Horizontal Irradiation (kWh/m2/year) over an area. This is later used to identify the availability/suitability of photovoltaic systems.
9 | Wind speed | Raster | Provides information about the wind velocity (m/sec) over an area. This is later used to identify the availability/suitability of wind power (using capacity factors).
10 | Hydro power potential | Point shapefile | Points showing potential mini/small hydropower sites. The dataset was developed by KTH dESA, includes environmental, social and topological restrictions, and provides the power availability at each identified point. Other sources can be used but should provide the same information to ensure that the model functions properly.
11 | Travel time | Raster | Shows the travel time required to reach, from any individual cell, the closest town with a population of more than 50,000 people.
12 | Elevation Map | Raster | Filled DEM maps are used in a number of processes in the analysis (energy potentials, restriction zones, grid extension suitability map etc.).
13 | Slope | Raster | A by-product of the DEM, used in forming restriction zones and to specify grid extension suitability.
14 | Land Cover | Raster | Land cover maps are used in a number of processes in the analysis (energy potentials, restriction zones, grid extension suitability map etc.).
15 | Solar Restriction | Raster | Solar restriction maps are used to determine areas in which the use of PV technologies is prohibited.


  • Before a model can be built, one must acquire the layers of data outlined above. More often than not, each layer must be acquired on its own. The final outcome is a multilayer map conveying all the information necessary to initiate an OnSSET electrification analysis.
  • The spatial resolution of the final map depends on the availability of input data and on the targeted level of accuracy. OnSSET can handle various levels of input data, with typical resolutions ranging from 1x1 kilometers (km) to 10x10 km. The selection of inputs usually involves a trade-off between the time needed for computation and the desired level of detail. The modeler has to decide which resolution best fits the purpose of the analysis.
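The trade-off between resolution and computation time can be made concrete with a rough count of grid cells. The sketch below is only an approximation under stated assumptions: the study area of 600,000 km2 is a hypothetical value, the function name cell_count is introduced here for illustration, and cells along irregular borders are ignored.

```python
def cell_count(area_km2, cell_size_km):
    """Approximate number of square grid cells needed to cover an area."""
    return round(area_km2 / cell_size_km ** 2)


study_area_km2 = 600_000  # hypothetical study area, not taken from any dataset

for size_km in (1, 5, 10):
    print(f"{size_km} x {size_km} km cells: {cell_count(study_area_km2, size_km):,}")
```

Moving from 10x10 km to 1x1 km cells multiplies the number of cells by 100, so every per-cell calculation is repeated roughly 100 times more often.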

GIS basic datasets

Administrative boundaries

Population data

Coverage | Type | Resolution | Year | Source | Link
Africa, Asia, America | Raster | 100 m grid cells | (depending on country) | Worldpop
World | grid | 2.5 arc-minute grid cells | 1990/1995/2000 | SEDAC
World | shapefile, raster (grid) | 2.5 arc-minute grid cells | 2000 | UNEP
Europe | shapefile, csv | 1 km grid cells | 2006, 2011 | GEOSTAT
Ghana, Haiti, Malawi, South Africa, Sri Lanka | raster (grid) | 1 arc-second | 2015 | CIESIN
World | Various | Various | 2016 | dhsprogram
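Several datasets state their resolution in angular units (arc-seconds or arc-minutes) rather than in metres. The sketch below gives an approximate conversion to ground distance, assuming a spherical Earth with a mean radius of 6371 km. The function name arcsec_to_km is introduced here for illustration, and the results are indicative only, since the east-west size of a cell shrinks with latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def arcsec_to_km(arcsec, latitude_deg=0.0):
    """Approximate (north-south, east-west) ground size in km of an angular cell."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360
    north_south = arcsec / 3600 * km_per_degree
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# 1 arc-second, 30 arc-seconds and 2.5 arc-minutes (= 150 arc-seconds) at the equator
for arcsec in (1, 30, 150):
    ns, ew = arcsec_to_km(arcsec)
    print(f"{arcsec} arc-sec: about {ns:.3f} km north-south, {ew:.3f} km east-west")
```

Under these assumptions, 30 arc-seconds corresponds to a little under 1 km and 2.5 arc-minutes to roughly 4.6 km at the equator.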

Transmission lines data

UK shapefile Power transmission lines, underground cables, stations etc. na National Grid
US raster 100 m grid cells 2015 ArcGIS online
World OSM potential points or polylines 2015 OSM of various mirrors  
World From Vmap level 0 Power lines and utilities na Can be downloaded from:

Power plants location data

Coverage | Type | Resolution | Year | Source | Link
World | shapefile (4 levels) | Generators, substations, masts | 2009 | Vmap level 0
World | shapefile | Generators (power source included) | 2015 | Geofabrik | Available from KTH-dESA upon request


Coverage | Type | Resolution | Year | Source | Link
World | GeoTIFF | 30 m spatial resolution | 2009 | METI Japan, NASA
World | GeoTIFF | 30 m posting, 1x1 degree tiles | 2009, 2011 | METI Japan, NASA
World | ASCII, GeoTIFF | 3 arc sec (approx. 90 m resolution) | 2003 | CGIAR CSI
Different countries | GeoTIFF | 1 to 30 arc sec | 2014 | Global Land Cover Facility
Different DEM sources | various | various | various | GIS 4 Geomorphology
World | .bil and/or .tif | 15 arcseconds/30 arcseconds | various | ISCGM
World | GeoTIFF | 16 arcseconds/30 arcseconds | various | NOAA
World | GeoTIFF | 17 arcseconds/30 arcseconds | various | DGADV
World + Arctic areas | GeoTIFF | 30 arcseconds | various | WebGIS

Travel time to major cities

Coverage | Type | Resolution | Year | Source | Link
World | ESRI grid | 30 arc sec | 2008 (data from 2000) | Joint Research Center EU
Africa (sub-Saharan) | csv, ESRI ASCII raster, GeoTIFF | 5 arc sec | 2010 | Harvest Choice
World | Raster, GeoTIFF | 5 arc sec | 2015 | University of Oxford

Mining and Quarrying

Coverage Type Resolution Year Source Link
USA Shapefile, csv, KML, KMZ Active mines and mineral plants in the US 2003 USGS
World Shapefile, dBase, HTML, Tab text,csv, Google earth points 2012-2013

Land cover

Coverage | Type | Resolution | Year | Source | Link
World | Bioenergy potential | 1 km | na | IRENA
World | CI Land cover - raster | 300 m | time series from 1992 to 2015 | ESA
World | GeoTIFF, Google Earth, jpeg, png | 1-0.1 degrees | 2001-2010 | NASA-NEO
World | HDF-EOS | 0.5 degrees | 2001-2012 | NASA-MODIS
World | Raster, csv | 0.0028 - 0.0083 degrees | 2000, 2005, 2010 | ESA-ENVISAT
World/Protected areas | Shapefile, KML, csv | na | 2014 | Protected planet
World | various | various | 2015 | Global Land Cover Facility
World | Rasters for: Coastal areas, Cultivated areas, Forests, Mountains, Islands, Inland waters etc. | 0.00833 degrees | 2000 | SEDAC
World | Raster for croplands | 0.0833 degrees | 2000 | SEDAC
World | Various rasters on land use | various | 1990-2010 | Nelson Institute
World | Soil type | various | na | Worldmap.Harvard
World | Various rasters on land use | various | 1980-2014 | EarthStat

The model classifies the land cover in order to calculate the grid extension penalties. The default classification values are based on the MODIS dataset found here, whose legend ranges from 0 to 16; the values and the corresponding land cover types are listed below. If land cover data are retrieved from other sources with different classification values, they should be reclassified in GIS (using the Reclassify tool in ArcGIS or r.reclass in QGIS) to match those below. Alternatively, the changes can be made in the Python code instead. If this reclassification is not performed, the result may be an incorrect grid penalty factor or, if the highest values are above 16, an error message while running the code.

Value Label
0 Water
1 Evergreen Needleleaf forest
2 Evergreen Broadleaf forest
3 Deciduous Needleleaf forest
4 Deciduous Broadleaf forest
5 Mixed forest
6 Closed shrublands
7 Open shrublands
8 Woody savannas
9 Savannas
10 Grasslands
11 Permanent wetlands
12 Croplands
13 Urban and built-up
14 Cropland/Natural vegetation mosaic
15 Snow and ice
16 Barren or sparsely vegetated
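The logic of such a reclassification can be sketched in a few lines of Python. This is not the OnSSET code itself: the mapping below uses invented source codes (100, 110, ...) purely for illustration, and the functions reclassify and check_legend are hypothetical helpers operating on a small in-memory grid. For real rasters, the GIS tools mentioned above or a raster library would be used instead.

```python
# Hypothetical mapping from another product's class codes (left) to the
# 0-16 legend above (right). The source codes are invented for illustration
# and must be replaced with the legend of the dataset actually used.
RECLASS = {100: 0, 110: 2, 120: 7, 130: 10, 140: 12, 150: 13, 160: 16}


def reclassify(grid, mapping, nodata=None):
    """Return a new grid with every cell value translated through mapping."""
    out = []
    for row in grid:
        new_row = []
        for value in row:
            if value == nodata:
                new_row.append(nodata)          # leave nodata cells untouched
            elif value in mapping:
                new_row.append(mapping[value])
            else:
                # Fail early instead of silently keeping an unknown class
                raise ValueError(f"no target class defined for value {value}")
        out.append(new_row)
    return out


def check_legend(grid, nodata=None):
    """Return True if all non-nodata values fall inside the 0-16 legend."""
    return all(0 <= v <= 16 for row in grid for v in row if v != nodata)


source = [[100, 140, 140], [110, 150, 160]]
target = reclassify(source, RECLASS)
print(target)
print(check_legend(target))
```

Raising an error for unmapped values is a deliberate choice in this sketch: a class that slips through unchanged is exactly the situation that would produce a wrong penalty factor or an out-of-range value later on.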


Coverage Type Resolution Year Source Link
World Coast Lines, oceans Physical vectors, ESRI shapefiles, GeoTIFF (1:10, 1:50 and 1:110 m) 2015 Natural Earth
World Climate data 30 arc seconds and 2.5/5/10 arc minutes na WorldClim
World/USA Climate change scenarios various na na
World/Australia Water and Landscape Dynamics 0.05 to 1 degrees 1979-2012 Australian National University
Open Street Map (OSM) - Osmosis osm.pbf depending on mirror source up to date NOAA
Nighttime lights Raster file 0.0083 degrees 1992-2013 na
Africa information Highway various vectors various AfDB
World Climate data various various Oregon State University

Methodology for Open Street Map data and Osmosis


  • Open Street Map (OSM) is a collaborative project that intends to provide free and open access data used in mapping the world. This document briefly describes the methodology used to obtain OSM data and transform them into compatible and useful information with the use of Osmosis and QGIS.
  • To begin with, a bulk download of updated OSM data can be performed through the Planet OSM:
  • The files can be downloaded in .xml and .pbf format. However, due to the large volume of data, there are various mirrors/extracts that provide access to masked data for different regions of the planet. More information can be found here: In previous cases, Geofabrik and BBBike were used successfully.
  • It should be mentioned at this point that an interesting tool is the Overpass API. More specifically, using the query and convert forms and redirecting to Overpass Turbo, it is possible to utilize the wizard function and obtain the required data for a defined area. The area is delineated by the map shown on the screen, while data types include nodes, ways and relations. The data can be exported in various formats, with .kml (amongst others) being compatible with the latest versions of QGIS. (As an example, type the word power in the wizard function and the power-related information will be depicted on the map.) A disadvantage of this method is the restriction on the area size, which is limited to 100 square km.
  • Coming back to the other sources (Geofabrik, BBBike), data can be downloaded per region in .pbf format. In the latest version of QGIS it is possible to insert this data directly by simply dragging the file onto the QGIS window. However, since the files are usually very large, it is recommended to transform the .pbf into a spatialite database.
  • To do this transformation, open the OSGeo shell that comes with your installation, navigate to the folder in which you have your .pbf file (by typing cd [folder path]) and enter the following line: ogr2ogr -f SQLite X.sqlite Y.pbf (note: change X to the name you want to use for your spatialite database and Y to the name of your downloaded .pbf file)
  • Once this transformation is finished (it may take some time), drag the new file into QGIS and work with it instead of the .pbf file.
  • OSM data provide access to a tremendous amount of information of various types. Feel free to explore the potential and share the results with an enthusiastic community.
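When several regional extracts have to be converted, the ogr2ogr call described above can be wrapped in a small Python script. The sketch below assumes that ogr2ogr is available on the PATH (for example when run from the OSGeo shell). The function names and the file names X.sqlite and Y.pbf are placeholders, and the command is exactly the one quoted in the steps above with no additional options.

```python
import shutil
import subprocess
from pathlib import Path


def build_ogr2ogr_command(pbf_path, sqlite_path):
    """Build the command quoted above: ogr2ogr -f SQLite X.sqlite Y.pbf"""
    return ["ogr2ogr", "-f", "SQLite", str(sqlite_path), str(pbf_path)]


def convert_pbf(pbf_path, sqlite_path=None):
    """Convert an OSM .pbf extract to an SQLite file by calling ogr2ogr."""
    pbf_path = Path(pbf_path)
    if sqlite_path is None:
        # Default output name: same folder and stem, .sqlite extension
        sqlite_path = pbf_path.with_suffix(".sqlite")
    if shutil.which("ogr2ogr") is None:
        raise RuntimeError("ogr2ogr not found on PATH; run this from the OSGeo shell")
    # check=True raises CalledProcessError if the conversion fails
    subprocess.run(build_ogr2ogr_command(pbf_path, sqlite_path), check=True)
    return Path(sqlite_path)


# Show the command that would be executed for the placeholder file names
print(" ".join(build_ogr2ogr_command("Y.pbf", "X.sqlite")))
```

Calling convert_pbf on each downloaded file in a loop would then produce one database per region, which can be dragged into QGIS as described.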

Datasets that require further processing

Solar GHI

Coverage | Type | Resolution | Year | Source | Link
World | csv | Local - Regional - World | 1993-2006 | NASA
World | tiff | Regional - country | 2016 | World Bank
South America | shapefile, csv | 40 km | 2015 | NREL
Europe | ESRI ascii grid | 1 km | 1981-1990 | JRC
Europe and Africa | ESRI ascii grid | 1.5 arc-minute | 1998-2011 | JRC
World (-66 to 66 both long, lat) | csv | 0.2 degrees (20 km) | 1985-2005 | SoDa
Solar Radiation resources | various types | Various areas and resolutions

Raster Preparation Methodology using NASA datasets

Documentation on solar power assessment is available here.


Coverage | Type | Resolution | Year | Source | Link
World | xls, csv | 1 degree spatial resolution | 1993-2006 | NASA
World | xls, csv | 0.5x0.667 degrees spatial resolution | 1979-2015 | EarthData - NASA
World | na | na | na | ADM-Aeolus ESA
World | Raster | 1x1 km spatial resolution | | IRENA
Afghanistan, Pakistan, Armenia, Bhutan, Central America, Chile, China, Cuba, Dominican Republic, Ghana, Indonesia, Mexico, Mongolia, Russia, Sri Lanka, United Arab Emirates, Philippines | shapefile | Wind speed 50m | 2009 | NREL

Raster Preparation Methodology using NASA datasets

Additional documentation on wind power assessment is available here.


Hydro data Type Link Remarks
Vmap level 0 World shapefiles No permission to access
Shapefiles (4 levels)  
World shapefiles page not displayed
GRDC database: River basins, watersheds and gauged stations Permission required for GIS layers
HydroSHED Watersheds, River Networks etc.  
USGS StreamStats (estimation of ungauged rivers) Only for the US
ArcSWAT Hydrological model - calculates run-off for rivers Integrated with ArcGIS. Requires calibration with data from at least one gauged point of the river
VAPIDRO-ASTE Calculates best available location for hydro, Developed in Visual basic, integrated with ArcGIS Requires at least one gauged point of the river
RIVIDS Tabular discharge data (3,500 stations)  
GSCD Global Streamflow Characteristics Dataset 17 streamflow characteristics (0.125 degrees spatial resolution)
EEA European catchments and rivers network system (Ecrins)  
WCI Water Cycle Integrator  
NCAR Global River Flow and Continental Discharge Dataset long-term mean flow rates for 925 rivers
WWDRII World Water Development Report II Annual runoff (mm/yr per grid cell), Annual river discharge (blended, km3/yr per grid cell)
River Threat 23 layers of river threats  
HEC-GeoHMS Hydrologic Engineering Center  

Raster Preparation Methodology

Documentation on hydropower assessment, together with a GIS-based assessment tool, is available here.