Data Preparation

Goals

  • Provide an overview of data needed by UrbanFootprint
  • Describe the required fields for base data
  • Walk through the steps used for SACOG’s base data preparation.
  • Work through an example of preparing data for a county.

Data Needs

UrbanFootprint is a data-intensive application. The effort that goes into data collection, preparation, and review should not be underestimated.

The base data, also called the base canvas or existing conditions dataset, requires extensive data collection, processing, and review before it can be used. UrbanFootprint requires that base conditions data adhere strictly to the data schema, and that you have a detailed understanding of existing conditions at the parcel level. If you are working in a geographic area that has not had a prior installation of UrbanFootprint, it is unlikely that a dataset exists that you can use without substantial effort in creating a base conditions dataset.

Scenario development has looser data requirements, but will require that you have an understanding of regional and local plans for the future, and have planned out goals for the scenario that can be translated onto a map at a parcel scale.

Environmental constraint layers influence the intensity of development that is possible in locations where these constraints are present.

Reference layers provide a visual reference to UrbanFootprint users while editing or visualizing scenarios.

The transportation module (and any other modules that build on its results) requires substantial additional data derived from both regional transportation infrastructure GIS data and a travel demand model.

Some of the other analytical modules also require climate data to run.

Data Types and Sources

Data for the Base Canvas

Data for Scenario Development

Environmental Constraints

Base Reference Layers

Transportation Data

Analysis Reference Data

Base Data Schema: SACOG

  • The structure and field names are critical.
  • There is a single table, which will be uploaded to PostGIS.
  • For convenience, the discussion of fields is divided into groups:
      • Metadata and Geography
      • Paint Configuration
      • Parcel Area/Type
      • Residential and Housing
      • Employment
      • Building Square Footage
      • Outdoor Irrigated Area

Metadata and Geography

Paint Configuration

These fields are not used in the base features dataset, but are included to maintain an identical structure to the End State data.

Parcel Area/Type

Residential and Housing

Employment

Building Square Footage

Outdoor Irrigated Area

Base Data Preparation: SACOG

Input Data

graphics/GenericPic1.png

  • SACOG parcel data
  • SACOG Land Use
  • Dwelling Units
  • SACOG TAZ
  • Census 2010 Blockgroups
  • Census 2010 Tracts

Data Preparation: Topology

graphics/ExistingConditions_smalll.png

  • Parcels must not overlap
  • Clip the dataset to the county border
  • Remove roads and waterbodies

Dwelling Units

  • Total DU = SACOG Parcel DU
  • Controlled to TAZ totals
  • Assign DU type using crosswalk (right), and assign DU totals to du_detsf
  • du_detsf_sl and du_detsf_ll are assigned based on a square feet per dwelling unit (sf/du) calculation.
  • ACS rates for Attached SF, MF 2-4, and MF 5 plus are applied to all parcels with MF units
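
As a minimal sketch of the "controlled to TAZ totals" step, the following assumes that controlling means proportional scaling: every parcel in a TAZ is multiplied by the same factor so that the parcel sum matches the TAZ control total. The source does not state the exact method, and the key names `taz` and `du` and the function name are hypothetical.

```python
from collections import defaultdict


def control_to_taz(parcels, taz_totals):
    """Scale parcel DU so each TAZ sums to its control total.

    Assumption: proportional scaling within each TAZ.
    parcels: list of dicts with hypothetical keys 'taz' and 'du'.
    taz_totals: dict mapping TAZ id to the control total of dwelling units.
    """
    # Sum the uncontrolled parcel DU within each TAZ.
    raw = defaultdict(float)
    for p in parcels:
        raw[p["taz"]] += p["du"]

    controlled = []
    for p in parcels:
        total = raw[p["taz"]]
        # A TAZ with no parcel DU cannot be scaled; leave its parcels at zero.
        factor = taz_totals[p["taz"]] / total if total else 0.0
        controlled.append({**p, "du": p["du"] * factor})
    return controlled


# Illustrative values only, not SACOG data.
parcels = [
    {"id": 1, "taz": "A", "du": 10},
    {"id": 2, "taz": "A", "du": 30},
    {"id": 3, "taz": "B", "du": 5},
]
controlled = control_to_taz(parcels, {"A": 50, "B": 4})
```

Scaling produces fractional dwelling units. Whether to round them, and how to keep the rounded values consistent with the TAZ total, is a separate decision that this sketch does not make.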

Households

graphics/ExistingConditions_smalll.png

  • HH from SACOG 2008
  • DU from Parcel Data
  • Occupancy rate = HH/DU
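
The occupancy calculation can be sketched as below. The source does not say at which geography the rate is computed, so this assumes a single rate per area that is then applied to each parcel's DU. The function names and sample numbers are illustrative only.

```python
def occupancy_rate(households, dwelling_units):
    """Return HH/DU, or None when there are no dwelling units."""
    if not dwelling_units:
        return None
    return households / dwelling_units


def parcel_households(parcel_du, rate):
    """Estimate households on a parcel from its DU and an occupancy rate."""
    return parcel_du * rate


# Illustrative values only: 90 households in an area with 100 dwelling units.
rate = occupancy_rate(90, 100)
hh = parcel_households(20, rate)
```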

Population

  • Calculate the average HH size by block group from census data
  • Ave. HH size = pop/hh
  • Then multiply the HH count in each parcel by the Ave. HH size.
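
The population steps above can be sketched as follows. The key names (`pop`, `hh`, `bg`) and the sample values are hypothetical, and treating a block group with no households as having an average size of zero is an assumption.

```python
def average_hh_size(block_groups):
    """Map each block group id to pop/hh (0.0 when it has no households)."""
    sizes = {}
    for bg_id, bg in block_groups.items():
        sizes[bg_id] = bg["pop"] / bg["hh"] if bg["hh"] else 0.0
    return sizes


# Illustrative values only, not census data.
block_groups = {"bg1": {"pop": 2500, "hh": 1000}}
parcels = [{"id": 1, "bg": "bg1", "hh": 4.0}]

sizes = average_hh_size(block_groups)
for p in parcels:
    # Parcel population = parcel households * average size in its block group.
    p["pop"] = p["hh"] * sizes[p["bg"]]
```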

Employment

  • Parcel employment from SACOG 2008
  • Crosswalk using the table
  • Use LEHD to disaggregate where needed (see the Disaggregation section).
  • Accommodation extracted using SACOG Employment Inventory

Employment Processing and Source

Disaggregation

  • This technique is used several times during data preparation.
  • Calculate the proportion of each SACOG category that goes into each UF Employment Category.
  • Use the LEHD 2010 near imputed rate dataset as the basis for the disaggregation.

For example: %emp_entrec = 100 * emp_entrec / (emp_entrec + emp_other_services + emp_accomodation)

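Dataset 1 (higher accuracy): 95 employees in total.

Dataset 2: 50 retail, 30 service, and 20 industrial employees.

The category shares from Dataset 2 are applied to the total from Dataset 1:

=========  ======  ======  ======  ======  ======  ======
Total Emp  Ret. %  Ser. %  Ind. %  # Ret.  # Ser.  # Ind.
=========  ======  ======  ======  ======  ======  ======
95         50      30      20      47.5    28.5    19
=========  ======  ======  ======  ======  ======  ======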
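
The worked example above can be reproduced with a short sketch. The function name and dictionary keys are hypothetical. The logic is simply to take shares from the less accurate dataset and apply them to the more accurate total.

```python
def disaggregate(total, category_counts):
    """Split `total` across categories in proportion to `category_counts`."""
    basis = sum(category_counts.values())
    return {
        name: total * count / basis
        for name, count in category_counts.items()
    }


# Dataset 1 supplies the total (95); Dataset 2 supplies the category mix.
result = disaggregate(95, {"retail": 50, "service": 30, "industrial": 20})
```

The results sum back to the Dataset 1 total, so the more accurate count is preserved while the category split comes from Dataset 2.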

Concerns: Zeros and Nulls
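
One way to guard the share calculation against zeros and nulls is sketched below. Treating a null as zero, and returning no shares at all when the denominator is zero, are assumptions rather than documented UrbanFootprint behavior. The function name is hypothetical.

```python
def safe_shares(category_counts):
    """Return each category's share of the total, or None if the total is 0.

    Assumption: a null (None) count is treated as zero.
    """
    cleaned = {name: (count or 0) for name, count in category_counts.items()}
    basis = sum(cleaned.values())
    if basis == 0:
        # No basis for disaggregation; the caller must choose a fallback.
        return None
    return {name: count / basis for name, count in cleaned.items()}


# Illustrative values only.
shares = safe_shares(
    {"emp_entrec": 10, "emp_other_services": None, "emp_accomodation": 30}
)
empty = safe_shares(
    {"emp_entrec": None, "emp_other_services": 0, "emp_accomodation": 0}
)
```

Returning None instead of raising lets the caller decide on a fallback, such as borrowing shares from a larger geography.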

Building Square Footage

Need info

Irrigated Square Footage

Need info

Developability

Need info

Alternate Method: SANDAG

  • Base Schema
      • Expanded compared to SACOG
      • Includes HH income
      • Population Educational Attainment
  • Data Sources
      • 2012 parcels, which have DU and land use
      • 2012 EDD employment points with 2-4 digit NAICS codes
      • MGRA with Pop (by gender and age), and Households by income category
      • ACS Data (5 year block group and 1 year PUMS)

Loading Base Data into UrbanFootprint

  1. Upload via FTP
  2. Create new geographic area in Django
  3. Create new schema in database
  4. Load data to schema
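
Steps 3 and 4 might look roughly like the sketch below, which only builds the SQL statement and the shell command as strings without running them. The source does not say which loader is used. This assumes the data is a shapefile loaded with PostGIS's `shp2pgsql` piped into `psql`, and every schema, table, file, and database name is hypothetical.

```python
import shlex


def create_schema_sql(schema):
    """Build the SQL for step 3 (create a new schema)."""
    # Restrict to simple lowercase identifiers so the name is safe to embed.
    if not (schema.isidentifier() and schema.islower()):
        raise ValueError(f"unsupported schema name: {schema!r}")
    return f"CREATE SCHEMA IF NOT EXISTS {schema};"


def load_command(shapefile, schema, table, srid, database):
    """Build a shell pipeline for step 4 (load data into the schema).

    Assumption: shapefile input loaded with shp2pgsql
    (-s sets the SRID, -I creates a spatial index).
    """
    loader = ["shp2pgsql", "-s", str(srid), "-I", shapefile, f"{schema}.{table}"]
    client = ["psql", "-d", database]
    quote = lambda args: " ".join(shlex.quote(a) for a in args)
    return f"{quote(loader)} | {quote(client)}"


# Hypothetical names throughout.
sql = create_schema_sql("sacog")
cmd = load_command("parcels.shp", "sacog", "base_parcels", 4326, "urbanfootprint")
```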

Keep the Goal in Mind

  • Data for your region will be unique
  • This process should serve as a starting point for developing your data, not a fixed recipe.

Exercise

  • Download XXXX
  • And unzip it into a folder.
  • Inside the folder there will be an mxd file and folders with data and scripts
  • We’re going to step through the scripts.