forked from IBM/db2-samples
Add spatial location jupyter notebook sample (IBM#38)
Showing 6 changed files with 985 additions and 5 deletions.
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,69 @@
README file for Db2 Spatial Analytics Location Sample

*
*
* (C) COPYRIGHT INTERNATIONAL BUSINESS MACHINES CORPORATION 2021.
*
ALL RIGHTS RESERVED.
*


File: samples/spatial/location/README.txt

The Db2 Spatial Analytics sample consists of a Jupyter notebook with
supporting SQL script init_env.sql.
This file briefly introduces each demo and indicates where to look for
further information.


= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
The demo is implemented as a Jupyter notebook that creates and uses a local
database. It uses the Db2 command line processor (CLP)
and the IBM Python driver to interact with the database and local instance.
You can use the demo and scripts as a tutorial to work with a Jupyter notebook,
perform queries and display data on a map.
The data used is located in samples/spatial/data and consists of two tables:
- a customer table containing (fake) customer information.
- a county table containing all US counties with census information.
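
As a rough sketch of how a notebook cell might reach the database through
the IBM Python driver: the ibm_db package and its ibm_db.connect call belong
to the driver, but the helper names and every connection value (database,
host, port, user, password) are placeholders and are not taken from this sample.

```python
def build_conn_str(database, host, port, user, password):
    """Assemble a Db2 connection string; all values are caller-supplied placeholders."""
    parts = {
        "DATABASE": database,
        "HOSTNAME": host,
        "PORT": str(port),
        "PROTOCOL": "TCPIP",
        "UID": user,
        "PWD": password,
    }
    # Each keyword=value pair is terminated by a semicolon.
    return "".join(f"{key}={value};" for key, value in parts.items())


def connect(conn_str):
    """Open a connection; requires the ibm_db package to be installed."""
    import ibm_db  # imported lazily so build_conn_str works without the driver

    # With a full connection string, user and password arguments stay empty.
    return ibm_db.connect(conn_str, "", "")
```

Keeping the string assembly separate from the connect call makes it possible
to check the connection settings without a running Db2 instance.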

The script load_data.sql assumes that the CSV files with the data are co-located
with the notebook and scripts. Thus, before running the script in the notebook,
extract the data from
spatial/data/geo_customer.zip
spatial/data/geo_county.zip
and then either copy the data to the directory of the script and notebook
or change the SQL script to point to the appropriate path for the files.
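
The extraction step described above could be scripted as follows. This is a
minimal sketch using only the Python standard library; the function name and
the directory arguments are hypothetical, and the archive names are the two
listed above.

```python
import zipfile
from pathlib import Path


def extract_archives(archive_dir, target_dir,
                     names=("geo_customer.zip", "geo_county.zip")):
    """Extract each named archive from archive_dir into target_dir.

    Returns the names of all extracted archive members.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    for name in names:
        with zipfile.ZipFile(Path(archive_dir) / name) as archive:
            archive.extractall(target)
            extracted.extend(archive.namelist())
    return extracted
```

Pointing target_dir at the directory that holds the notebook and the SQL
script avoids having to edit the paths inside the script.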

The following excerpt from that file gives an introduction to the demo:
*****************************************************************************
This demo illustrates adding a spatial dimension to an existing information system.
The existing system did not contain any explicit location (spatial) data.
However, the existing system did contain implicit location data in the
form of addresses. By spatially enabling the existing database,
the user expands the business analysis capabilities of the system.

This demo is a Jupyter notebook version of
https://www.ibm.com/blogs/cloud-archive/2015/08/location-location-location/

In this scenario, a small company (MYCO) has two offices, but business has been growing and there are
now customers across the country. Many of the customers have expressed a preference to meet company
representatives in person. The company owners want to explore where to open a new office.

Some of the questions the MYCO company owners want to answer are:

We already have some ideas where to open a new office.
- How can we find out which of these potential locations can serve the most customers?
- How can we reach the customers with the highest business volume?
- Are there other locations that should be considered?

Spatial analysis functions can help find the answers.
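
To make the first question concrete, here is a plain-Python sketch of the
underlying computation: count the customers within a given distance of each
candidate site and rank the sites. This is not how the notebook does it (the
demo relies on Db2 spatial functions); the haversine formula, the mean Earth
radius, and all function names are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def customers_within(site, customers, radius_km):
    """Count customers (lon, lat) no farther than radius_km from site (lon, lat)."""
    return sum(
        1 for lon, lat in customers
        if haversine_km(site[0], site[1], lon, lat) <= radius_km
    )


def rank_sites(sites, customers, radius_km):
    """Return (name, count) pairs sorted by descending customer count."""
    counts = [
        (name, customers_within(position, customers, radius_km))
        for name, position in sites.items()
    ]
    return sorted(counts, key=lambda item: item[1], reverse=True)
```

Weighting each customer by business volume instead of counting heads would
address the second question in the same way.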

On Db2 Warehouse on Cloud, the geospatial data used to bring this example to life can be found in the SAMPLE schema.
It contains data about customers in the GEO_CUSTOMER table and county data in the GEO_COUNTY table.
These tables are in the Spatial Extender format and need to be converted into the Spatial Analytics format first.
However, this notebook also works with Spatial Extender. In that case, the spatial functions used in
queries only need to be qualified with the DB2GSE schema.
You can use the Tables menu to view the structure and browse the content of these tables.

For more information on Spatial Analytics, visit the documentation:
https://www.ibm.com/docs/en/db2/11.5?topic=data-db2-spatial-analytics
@@ -0,0 +1,62 @@
--Prep the database for the location demo

-- connect to bludb;

-- Drop objects left over from a previous run. On a first run these
-- statements report errors that can be ignored.
DROP TABLE GEO_TEMP;
DROP TABLE GEO_COUNTY;
DROP TABLE GEO_CUSTOMER;

--------------------------------------
-- Create the SRS needed for the demo.
--------------------------------------

CALL ST_DROP_SRS('SAMPLE_GCS_WGS_1984');
CALL ST_CREATE_SRS('SAMPLE_GCS_WGS_1984', 1005, -400, -400, 1.111948722222222E9,-100000,10000,-100000,10000,'GCS_WGS_1984',NULL, NULL,'GEOGCS[\"GCS_WGS_1984\",DATUM[\"D_WGS_1984\",SPHEROID[\"WGS_1984\",6378137.0,298.257223563]],PRIMEM[\"Greenwich\",0.0],UNIT[\"Degree\",0.0174532925199433]]', 'location demo srs');

------------------------
-- Load GEO_COUNTY table
------------------------

-- Load raw data.
CREATE TABLE GEO_TEMP (OBJECTID INTEGER NOT NULL PRIMARY KEY, WKT CLOB, STATEFP VARCHAR(2), COUNTYFP VARCHAR(3), COUNTYNS VARCHAR(8), NAME VARCHAR(100), GEOID VARCHAR(5), NAMELSAD VARCHAR(100), LSAD VARCHAR(2), CLASSFP VARCHAR(2), MTFCC VARCHAR(5), CSAFP VARCHAR(3), CBSAFP VARCHAR(5), METDIVFP VARCHAR(5), FUNCSTAT VARCHAR(1), ALAND DECIMAL(14,0), AWATER DECIMAL(14,0), INTPTLAT VARCHAR(11), INTPTLON VARCHAR(12)) ORGANIZE BY ROW;

-- Adjust the source path and the message path as necessary.
LOAD FROM ../data/county.del OF DEL LOBS FROM ./ MODIFIED BY COLDEL| MESSAGES /tmp/county_load.log INSERT INTO GEO_TEMP(OBJECTID, WKT, STATEFP, COUNTYFP, COUNTYNS, NAME, GEOID, NAMELSAD, LSAD, CLASSFP, MTFCC, CSAFP, CBSAFP, METDIVFP, FUNCSTAT, ALAND, AWATER, INTPTLAT, INTPTLON);

-- Create and load county table.
CREATE TABLE GEO_COUNTY (OBJECTID INTEGER NOT NULL PRIMARY KEY, Shape SYSIBM.ST_MultiPolygon INLINE LENGTH 32300, STATEFP VARCHAR(2), COUNTYFP VARCHAR(3), COUNTYNS VARCHAR(8), NAME VARCHAR(100), GEOID VARCHAR(5), NAMELSAD VARCHAR(100), LSAD VARCHAR(2), CLASSFP VARCHAR(2), MTFCC VARCHAR(5), CSAFP VARCHAR(3), CBSAFP VARCHAR(5), METDIVFP VARCHAR(5), FUNCSTAT VARCHAR(1), ALAND DECIMAL(14,0), AWATER DECIMAL(14,0), INTPTLAT VARCHAR(11), INTPTLON VARCHAR(12), xmin double generated as (st_minx(shape)), xmax double generated as (st_maxx(shape)), ymin double generated as (st_miny(shape)), ymax double generated as (st_maxy(shape)) ) ORGANIZE BY COLUMN NOT LOGGED INITIALLY;

INSERT INTO GEO_COUNTY (OBJECTID, SHAPE, STATEFP, COUNTYFP, COUNTYNS, NAME, GEOID, NAMELSAD, LSAD, CLASSFP, MTFCC, CSAFP, CBSAFP, METDIVFP, FUNCSTAT, ALAND, AWATER, INTPTLAT, INTPTLON) ( SELECT OBJECTID, ST_MPolyFromText(WKT, 1005), STATEFP, COUNTYFP, COUNTYNS, NAME, GEOID, NAMELSAD, LSAD, CLASSFP, MTFCC, CSAFP, CBSAFP, METDIVFP, FUNCSTAT, ALAND, AWATER, INTPTLAT, INTPTLON FROM GEO_TEMP);

-- Create a regular index on the boxfilter columns.
CREATE INDEX GEO_COUNTY_BF_IDX ON GEO_COUNTY (xmin, ymin, xmax, ymax);

COMMIT;
DROP TABLE GEO_TEMP;

--------------------------
-- Load GEO_CUSTOMER table
--------------------------
-- Load raw data.
CREATE TABLE GEO_TEMP (OBJECTID INTEGER NOT NULL PRIMARY KEY, WKT VARCHAR(256), NAME VARCHAR(254), INSURANCE_VALUE INTEGER) ORGANIZE BY ROW;

-- Adjust the source path and message path as necessary.
LOAD FROM ../data/customer.del OF DEL MODIFIED BY COLDEL| MESSAGES /tmp/customer_load.log INSERT INTO GEO_TEMP(OBJECTID, WKT, NAME, INSURANCE_VALUE);

CREATE TABLE GEO_CUSTOMER (OBJECTID INTEGER NOT NULL PRIMARY KEY, SHAPE SYSIBM.ST_POINT, NAME VARCHAR(254), INSURANCE_VALUE INTEGER, xmin double generated as (st_minx(shape)), xmax double generated as (st_maxx(shape)), ymin double generated as (st_miny(shape)), ymax double generated as (st_maxy(shape))) ORGANIZE BY ROW NOT LOGGED INITIALLY;
INSERT INTO GEO_CUSTOMER (OBJECTID, SHAPE, NAME, INSURANCE_VALUE) ( SELECT OBJECTID ,ST_POINTFROMTEXT(WKT, 1005), NAME ,INSURANCE_VALUE FROM GEO_TEMP );

-- Create a regular index on the boxfilter columns.
CREATE INDEX GEO_CUSTOMER_BF_IDX ON GEO_CUSTOMER (xmin, ymin, xmax, ymax);

-- Create a regular index on the INSURANCE_VALUE column.
CREATE INDEX GEO_CUSTOMER_insurance_value_idx ON GEO_CUSTOMER(INSURANCE_VALUE);

COMMIT;
DROP TABLE GEO_TEMP;
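
The generated xmin/xmax/ymin/ymax columns and the regular indexes on them in
the script above suggest a box-filter pattern: compare bounding boxes first,
and evaluate the exact spatial predicate only for rows that survive. The
following Python sketch illustrates that idea only; the function names and
the dictionary row layout are hypothetical, and it does not reproduce how
Db2 evaluates such predicates.

```python
def bounding_box(points):
    """Return (xmin, ymin, xmax, ymax) for a non-empty sequence of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_overlap(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_rows(rows, query_box):
    """Keep only rows whose stored bounding box overlaps query_box.

    Each row is a dict with xmin, ymin, xmax and ymax keys, mirroring the
    generated columns of GEO_COUNTY and GEO_CUSTOMER.
    """
    return [
        row for row in rows
        if boxes_overlap(
            (row["xmin"], row["ymin"], row["xmax"], row["ymax"]), query_box
        )
    ]
```

Because the comparison uses plain numeric columns, an ordinary index can
narrow the candidate set before any geometry is examined.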