Geospatial Data
Geospatial data is data that includes a geographic component describing location, such as coordinates, an address, or a ZIP code.
Analysis and visualization of geospatial data is one of the newer branches of data science. It has many uses: coffee chains like Starbucks, Dunkin' or CCD use it to decide where to open their next outlet; countries like Japan use it to identify earthquake-prone areas that need extra reinforcement; it can also help find the most polluted areas, or areas where certain wildlife can find natural shelter. At present, it is also used to track the spread of COVID-19 and to mark red zones or containment zones.
Representation of Geospatial Data
There are many file formats for storing geospatial data, such as shapefile, GeoJSON, KML, and GPKG.
Shapefile is one of the most common formats for storing geospatial data. It represents the data as points, lines, and polygons, along with attributes like name, temperature, etc. Together, these shapes and attributes describe geographic features like rivers, roads, landmarks, etc. A shapefile is really a set of files, three of which are mandatory: .shp holds the shapes (the geometry), .shx holds an index of the shapes for faster navigation, and .dbf stores the attributes associated with the shapes. Along with these mandatory files, some optional files like .prj, .ixs, .mxs, etc. can also be present.
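As a rough illustration of the three mandatory files, the sketch below checks whether a shapefile's companion files are all present. The helper name `missing_shapefile_parts` and the file name `roads.shp` are made up for this example, and the code only looks at file names; it does not read any geometry.

```python
from pathlib import Path
import tempfile

# The three files a shapefile needs, as described above.
MANDATORY_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the mandatory extensions that have no file next to shp_path."""
    base = Path(shp_path)
    return [ext for ext in MANDATORY_EXTENSIONS
            if not base.with_suffix(ext).exists()]


# Demo in a throwaway directory: create only .shp and .dbf, leave out .shx.
with tempfile.TemporaryDirectory() as tmp:
    for ext in (".shp", ".dbf"):
        (Path(tmp) / "roads").with_suffix(ext).touch()
    missing = missing_shapefile_parts(Path(tmp) / "roads.shp")
    print(missing)
```

A check like this can be handy before handing a path to a plotting library, since a shapefile copied without its .shx or .dbf is a common source of confusing errors.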
GeoJSON represents geospatial data in a JSON-based format, with each position written as [longitude, latitude], e.g.
{
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [92.7, 11.7]
},
"properties": {
"name": "Andaman Islands"
}
}
It can represent data in the following geometry types: Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection.
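Because GeoJSON is plain JSON, Python's built-in `json` module is enough to read and write it. Here is a minimal sketch; the feature, its name, and its coordinates are made-up sample values.

```python
import json

# A made-up Point feature; GeoJSON positions are [longitude, latitude].
raw = """
{
  "type": "Feature",
  "geometry": {"type": "Point", "coordinates": [92.7, 11.7]},
  "properties": {"name": "Example place"}
}
"""

feature = json.loads(raw)
lon, lat = feature["geometry"]["coordinates"]
print(feature["properties"]["name"], lon, lat)

# Several features are usually bundled into a FeatureCollection.
collection = {"type": "FeatureCollection", "features": [feature]}
text = json.dumps(collection, indent=2)
```

The `text` string can then be written to a `.geojson` file and loaded by most mapping tools.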
KML stands for Keyhole Markup Language and is used to represent data in Google Earth. It is based on XML. KML files are often distributed in zipped form with a .kmz extension.
GPKG, or GeoPackage, is an open format that can store geospatial data in both raster and vector form. It is an extended SQLite database, so everything is contained in a single, lightweight, ready-to-use file.
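To make the KML/KMZ relationship concrete, here is a small sketch using only the standard library. It builds a tiny KML document, zips it into an in-memory KMZ archive, and reads the placemark back. The placemark name and coordinates are made-up sample values, and `doc.kml` is the conventional name for the main file inside a KMZ.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

# A minimal KML document with one placemark; coordinates are "lon,lat,alt".
kml_text = f"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="{KML_NS}">
  <Placemark>
    <name>Example place</name>
    <Point><coordinates>92.7,11.7,0</coordinates></Point>
  </Placemark>
</kml>"""

# A KMZ is just a zip archive holding the KML file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as kmz:
    kmz.writestr("doc.kml", kml_text)

# Read the archive back and parse the XML inside it.
with zipfile.ZipFile(buffer) as kmz:
    root = ET.fromstring(kmz.read("doc.kml"))

ns = {"kml": KML_NS}
name = root.find("kml:Placemark/kml:name", ns).text
coords = root.find(".//kml:coordinates", ns).text.strip()
lon, lat, alt = (float(v) for v in coords.split(","))
print(name, lon, lat)
```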
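Since a GeoPackage is a SQLite database, Python's built-in `sqlite3` module can peek inside one. The sketch below lists the layers recorded in the `gpkg_contents` table, which the GeoPackage standard uses as its table of contents. The helper name `list_layers` is made up, and the in-memory database here is only a stand-in with two of the real columns, not a valid GeoPackage; for a real file you would connect to its path instead.

```python
import sqlite3
from contextlib import closing


def list_layers(conn):
    """Return (table_name, data_type) pairs listed in gpkg_contents."""
    rows = conn.execute(
        "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
    )
    return rows.fetchall()


# Stand-in database: NOT a valid GeoPackage, just enough to run the query.
with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE gpkg_contents (table_name TEXT, data_type TEXT)")
    conn.executemany(
        "INSERT INTO gpkg_contents VALUES (?, ?)",
        [("roads", "features"), ("imagery", "tiles")],
    )
    layers = list_layers(conn)
    print(layers)
```

In `gpkg_contents`, a `data_type` of `features` marks a vector layer and `tiles` marks a raster tile layer, which matches the raster and vector support mentioned above.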
Visualization of Geospatial Data
There are many Python libraries for visualizing geospatial data and drawing interesting maps. Some of the most popular are:
Folium
Folium is built on Leaflet.js. You manipulate your data in Python, and Folium renders it on an interactive Leaflet map. It supports the GeoJSON file type along with some other file types.
GeoPandas
GeoPandas extends the data types used by pandas to represent geospatial data. It relies on Shapely for geometric operations and on matplotlib for plotting (older versions also needed descartes to plot polygons). It supports the GPKG, shapefile, and GeoJSON data formats.
Basemap
Basemap is a matplotlib toolkit and a great tool for drawing data on map projections. The maps it produces are static matplotlib figures rather than interactive ones. It supports shapefile data.
GeoViews
GeoViews is built on the HoloViews library. It uses the Cartopy Python package for map projections and renders with either matplotlib or Bokeh.
KeplerGL
Kepler.gl is built on WebGL. It is also a React component that uses Redux to manage its state and data flow, so it can be embedded into other React-Redux applications and is highly customizable.
IpyLeaflet
ipyleaflet is used to embed interactive maps in Jupyter notebooks. It acts as a bridge between Leaflet.js and Jupyter.
Cartopy
Cartopy is a Python package for geospatial data processing, map production, and geospatial data analysis. It builds on the PROJ (formerly PROJ.4), NumPy, and Shapely libraries and uses Matplotlib to draw maps.