Advanced GIS

Data
a collection of facts or figures that pertain to places, people, things, events, concepts
Information
processed or value added data that have certain perceived values to a user or a community of users
Knowledge
transformation of data into information is an explicit act by a user to raise their ___ to a level appropriate for specific decision making purposes
Intelligence
When a user deploys knowledge to perceive relationships, formulate principles, and introduce personal values
System

1. formed or constructed to achieve certain basic objectives or functions

2. their continuing existence depends on their ability to satisfy the intended objectives; if this ability fails or starts to decline, the systems concerned must be upgraded or replaced

3. individual system is composed of many interrelated parts which may be operational systems by themselves

4. these parts operate individually and interact with one another according to certain rules of conduct such as procedures, laws, contractual agreements, and accepted behavior

Information Systems

1. is set up to achieve the specific objectives of collecting, storing, analyzing, and presenting information in a systematic manner.

2. structurally is made up of interrelated components that include a combination of data and technical and human resources

3. being made up of input, processing, and output systems, all working according to a well-defined set of operational procedures and protocols.

4. can be operated independently and at the same time linked with other information systems

Information System Network

  • The data and technical and human resources used to form the system
  • communications protocol and data management procedure
  • functional subsystems
  • stand-alone or network mode of system configurations

Nonspatial Information systems
those designed for processing data that are not referenced to any position in geographic space. For example, a system for accounting.
Spatial Information Systems
those designed for processing data pertaining to real-world features or phenomena that are described in terms of locations

True/False

 

 

All Spatial information systems can be regarded as GIS

False

 

CAD systems, CAM systems are also spatial information systems

True/False

 

 

 

Only those spatial information systems that are used for processing and analyzing geospatial data or geographically referenced data can be labeled as a GIS

True
Geospatial Data

  • Characterized by geographic space
  • Representation at geographic scale

Geographic Space
data are registered to an accepted geographic coordinate system of the Earth's surface so that data from different sources can be spatially cross-referenced
Geographic Scale
data are normally recorded at relatively small scales and must be generalized and symbolized

Land Information Systems (LIS)

 

Land Related Information Systems (LRIS)

focus of the system on land related activities such as ownership, tax assessment, land and water resources
Geographic Information Science

set of basic research issues raised by the handling of geographic information, including geographic data, problem solving, and the impact of using geographic information on society

 

Aims to provide the theoretical and organizational coherence for the scientific study of geographic information

True/False

 

 

Canada produced the first ever GIS

True

True/False

 

 

 

GIS was application-driven during the 1960s and 1970s, built to meet the specific information needs of individual organizations

True
Topology
refers to the spatial relationships of adjacency, connectivity, and containment among topographic features
Impacts of Topology

  • solved data representation problem that hindered GIS in the early years
  • reduced the complexity of applying geographic data for spatial analysis, thus making GIS easier to use
  • geospatial data can be stored in a simple structure that is capable of representing their attributes

ESRI

  • created ArcInfo, released in 1982
  • vector-based GIS using a georelational data model that employed a hybrid approach to geographic data processing
  • graphical data stored using topological data structure, while attribute data are stored using relational or tabular data structure

GIS 1980s-1990s

  • the shift to a data-oriented approach to GIS development fundamentally changed the way GIS technology had been developed
  • designed to meet demand of corporate business goals

Enterprise computing environment
integration of geographic data with other types of business data
Information infrastructure

  • emerged in the early 1990s when the U.S. proposed the National Information Infrastructure (NII) initiative
  • intended to provide citizens with access to information affecting their lives, pertaining to government, health care, education, and community development

Location Based Service (LBS)
LBS makes use of the information about the location of the mobile computer to deliver personalized, localized, and real time geographic services to the user.

Safety Services

 

enhanced 911 capable of identifying the location of the user in case of emergency
information services
traffic information, navigation assistance, yellow pages, and travel/tourism information
Enterprise services
vehicle tracking, logistic planning, utility asset inspection and management
consumer portal services
delivery of local news, weather report, and forecast, driving directions
Telematics services
using GPS technology to obtain real time location information to provide driving directions and vehicle tracking functionality in fleet management
triggered location services
including location-sensitive advertising, billing, and logistics
Location Awareness
obtaining location based information by means of coordinates
Components of GIS

  • data
  • technology
  • application
  • people

Geospatial Data

can be categorized into

  • geodetic control network
  • topographic base
  • graphical overlays

Geodetic Control Network

  • foundation of all geographic data
  • provides the geographical framework by which different sets of geospatial data can be spatially cross-referenced with one another
  • established by high-precision surveying methods and rigorous computation at the national or continental level.

Topographic Base

created as the result of a basic mapping program by national, state/provincial, and local government mapping agencies

 

land surveying and photogrammetry

Graphical Overlays

  • thematic data pertaining to specific GIS applications
  • overlays of physical features can be derived directly from the topographic base
  • can be collected by site investigation

Vector
data that depict the real world by means of discrete points, lines, and polygons
Raster

Data depict the real world by means of a grid of cells with spectral or attribute values

 

not good for representing individually identifiable features but is ideal for a variety of spatial analysis functions

Surface Data
depict the real world by means of a set of selected points or continuous lines of equal values; can be analyzed and displayed in two or three dimensions and are best suited to natural phenomena
GIS Database
can be formed by aggregating one or more geodatabases, usually organized into broad categories such as land use, transportation, hydrology, environment, and utility
Hardware
made up of a configuration of core and peripheral equipment that is used for the acquisition, storage, analysis, and display of geographic information.
Central Processing Unit (CPU)
performs all the data processing and analysis tasks and controls the input/output connectivity with data acquisition, storage, and display systems
Conventionally GIS
were developed as stand-alone applications that ran on a single host computer, but today's GIS are mostly implemented in a network environment using the client/server model.
Server
the computer on which the data and software are stored
Client
computer by which users access the server
Software
GIS were conventionally developed using a hybrid approach that handled graphical and descriptive components of geospatial data separately
geographical data engine
a proprietary component of GIS software that handled graphical data
Georelational data model
connection between the graphical data engine and the database management system (DBMS)
data mining
allows users to identify facts about the real world and transform these facts into geographic objects useful for geospatial data processing and analysis
The use of object oriented technologies
has transformed GIS from automated filing cabinets of maps into smart machines for geographic knowledge
toolbox approach
allows for users to customize their applications by using scripting language to build software extensions
component software
software engineering methodology that has been evolving since the early 1990s; addresses the integration of separate computer-based applications such as document imaging, optical character recognition, and database query.
GIS as a field of academic study

  • Cartography
  • remote sensing
  • mathematics
  • statistics
  • computer science
  • information technology
  • geography
  • urban planning 
  • resource management

GIS as a branch of information technology

  • cartography
  • remote sensing
  • computer programming
  • software-specific training
  • workshops
  • laboratories

GIS as a data institution

  • information technology
  • law
  • sociology
  • anthropology
  • cognitive science
  • economics
  • political science
  • public administration

Geographic Scale
Scales of interest to human activity/interaction. Refers to spaces concerning the Earth’s surface or near-surface
Spatial Scale
Can refer to a theoretical or mathematical space and can thus be applied to all spaces and scales, real and otherwise

 

Geographic Space

The commonality of both the data and the problems that the systems are developed to solve is geography, i.e. location, distribution, pattern, and relationship within a specific geographical reference framework

 

Geographic Scale

 

data usually recorded at relatively small scales and must thus be generalized and symbolized
Define GIS
GIS is a collection of hardware, software, data, policies, procedures, and people for the input, storage and retrieval, manipulation and analysis, output and modeling of spatially referenced data
Parcel-Based LIS
emphasis on land ownership and other cadastral applications. Land is divided into parcels that have legal descriptions
Non-Parcel-Based LIS
natural resources IS, used for habitat/wildlife evaluation and management, flood hazard mitigation, conservation easement, etc.
What is not a GIS?

  • CAD, CAM, CAC
  • automated mapping
  • Database Management Systems (DBMS)

Types of GIS Software Packages

 

  • Grass
  • ArcInfo
  • ArcMap
  • IDRISI

Computing planar distance using distance theorem

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

where $x_2 - x_1$ = difference in longitude

where $y_2 - y_1$ = difference in latitude

$d$ = distance between two points
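
As a minimal sketch of the planar distance calculation on this card, the Python function below takes two coordinate pairs and returns the straight-line distance between them. The function name `planar_distance` is hypothetical, and the sketch assumes both points are already expressed in the same planar units.

```python
import math

def planar_distance(x1, y1, x2, y2):
    """Straight-line distance between (x1, y1) and (x2, y2) on a plane."""
    dx = x2 - x1  # difference along the x axis
    dy = y2 - y1  # difference along the y axis
    return math.sqrt(dx ** 2 + dy ** 2)

print(planar_distance(0.0, 0.0, 3.0, 4.0))  # 5.0
```

The result is in the same units as the inputs, so differences given in degrees yield a distance in degrees, not in meters.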

Communication Paradigm
assumed that map itself was a final product designed to communicate a spatial pattern via symbols, class limit determinations etc.

Analytical paradigm

 

maintains raw data in a computer storage device for subsequent analysis (GIS)
Map
generalized view of an area as seen from above typically reduced in size
cartography
subfield of geography that focuses on map-making
scale
ratio of map units to ground units
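
A small sketch of how the scale ratio is used in practice: multiplying a distance measured on the map by the scale denominator gives the ground distance in the same units. The function name `ground_distance` and the 1:50,000 example are illustrative assumptions, not taken from the notes.

```python
def ground_distance(map_distance, scale_denominator):
    """Ground distance for a map distance at scale 1:scale_denominator.

    The result is in the same units as map_distance.
    """
    return map_distance * scale_denominator

# 2.5 cm measured on a 1:50,000 map
ground_cm = ground_distance(2.5, 50_000)
print(ground_cm / 100_000, "km")  # 1.25 km
```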
projection
process of transforming spherical Earth to flat map
Map Projections
can be perspective or nonperspective
Perspective Map projection
strictly geometric; uses a point of origin, or viewpoint, and a surface of projection. The viewpoint is selected based on certain criteria
Nonperspective Map Projection
modifying the perspective projection to maintain desired properties – Ex. Mercator Map
Map projections
can be expressed according to generalized functional relationships between geographical coordinates of a point on the Earth’s surface and the coordinates on the plane
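
The functional relationship can be written as x = f1(latitude, longitude) and y = f2(latitude, longitude). As one hedged illustration, the sketch below uses the spherical form of the Mercator projection; the function name `mercator` and the default unit-sphere radius are assumptions made for the example.

```python
import math

def mercator(lat_deg, lon_deg, radius=1.0):
    """Spherical Mercator: map (latitude, longitude) in degrees to plane (x, y)."""
    phi = math.radians(lat_deg)   # latitude in radians
    lam = math.radians(lon_deg)   # longitude in radians
    x = radius * lam                                        # x depends only on longitude
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))  # y depends only on latitude
    return x, y

# Equal 30-degree steps in latitude map to growing steps in y toward the pole.
for lat in (0, 30, 60):
    print(lat, mercator(lat, 0.0))
```

The growing spacing of y values with latitude is the poleward stretching that the Mercator cards describe.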
Properties of Globes
parallels are always parallel to each other; meridians converge at the poles and are evenly spaced along any parallel; the distance between meridians decreases toward the poles
Map projection properties

  • area
  • shape
  • distance
  • direction

Shape

  • to maintain shape, make the scale along the meridian and the parallel the same in both directions
  • preserving shape distorts distances

Meridians
intersect parallels at right angles
Area

  • shows spatial distributions and relative sizes of spatial features
  • the tradeoff of preserving area is that shape, distance, and occasionally direction are distorted

true direction
inherent property of azimuthal class projections since all meridians pass through the pole
Conformal Projection
naturally preserves shapes and true direction
Mercator Projection

  • Gerardus Mercator 
  • Cylindrical projection
  • true shape
  • meridians equally spaced
  • rhumb lines enable plotting direction (good for navigation)

Disadvantages of Mercator Projection

midlatitude and poleward landmasses are dramatically stretched

 

Classes of Projection

  • Cylindrical
  • conical
  • planar

Cylindrical Projection

  • cylinder assumed to circumscribe a transparent globe such that a cylinder touches the equator throughout its circumference
  • meridians are vertical and parallel lines
  • parallels are horizontal straight lines

Conical Projections

  • cone placed over the globe such that the apex of the cone is exactly over the polar axis
  • cone must touch globe along a parallel of latitude
  • along standard parallel, scale is correct
  • parallels are arcs of circles

Planar or Azimuthal Projections

  • Plane Touches the globe at North or South Pole
  • Like a cone flattened until vertex reaches a limit of 180 degrees
  • shape is circular
  • meridians look like straight lines emanating from the center of the circle
  • parallels are complete circles centered at the pole

Universal Transverse Mercator (UTM)

  • used to define horizontal positions worldwide by dividing the Earth into 6-degree zones, each mapped by a Transverse Mercator projection

Fuller Transformation

  • preserves land masses
  • very little distortion in both shape and area
  • no more than 2% error

Defining Projections in Arc/Info
for coverages, TINs, and grids, projection information is stored as a PRJ file within their subdirectory and stored with the file name .prj
Cover

projection information is written to the PRJ of a coverage

 

Grid
projection information is written to the PRJ of a grid
File
projection information is written to the PRJ of an ASCII text file
TIN
projection information is written to the PRJ of a tin
<target>
data set for which the projection information is being defined
Georeferencing
the representation of the location of real world features within the spatial framework of a particular coordinate system
objective
provide a rigid spatial framework by which the positions of real-world features are measured, computed, recorded, and analyzed
geoid and ellipsoid
deal with representing the physical shape of the Earth via a mathematical surface
Ellipsoid

  • the reference surface for horizontal positions
  • physical shape of the real earth is closely approximated by the rotational ellipsoid
  • reference surface for horizontal coordinates (lat/long)

$$f = \frac{a - b}{a}$$

Amount of polar flattening

where $a$ and $b$ are the lengths of the semi-major and semi-minor axes of the ellipse

Eccentricity
$$e^2 = \frac{a^2 - b^2}{a^2}$$
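
As a worked illustration of the flattening and eccentricity formulas, the sketch below evaluates both for one ellipsoid. The semi-axis values are those commonly quoted for WGS 84 and are an assumption brought in for the example; they do not come from these notes. The function names are hypothetical.

```python
def flattening(a, b):
    """Polar flattening f = (a - b) / a."""
    return (a - b) / a

def eccentricity_squared(a, b):
    """First eccentricity squared, e^2 = (a^2 - b^2) / a^2."""
    return (a ** 2 - b ** 2) / a ** 2

# Assumed example values (WGS 84 semi-major and semi-minor axes, in meters)
A = 6378137.0
B = 6356752.314245

f = flattening(A, B)
e2 = eccentricity_squared(A, B)
print("inverse flattening 1/f:", 1 / f)
print("eccentricity squared e^2:", e2)
```

The two quantities are linked by e^2 = 2f - f^2, which gives a quick consistency check on any pair of values.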
Geoid

  • means earthlike
  • shape of the earth if oceans were allowed to flow freely under the continents
  • geoid rises over the continents and is depressed over the oceans
  • coincides with mean sea level
  • reference surface for vertical coordinates

equipotential surface
surface on which the gravity potential is constant everywhere
Datum
model that describes the position, direction, and scale relationships of a reference surface to positions on the surface of Earth
Geodetic datums

  • established to provide positional control that supports surveying and mapping projects covering large geographic areas

vertical datum

  • zero surface from which all elevations or heights are measured
  • mean sea level (MSL) is used as a vertical datum because it is available worldwide

Geocentric Datum

  • referenced by 3-D cartesian coordinate systems (x,y,z) with the origin coincident with the center of the ellipsoid

Relation b/w Coordinate Systems and Map Projections

  • coordinate systems are formed on the basis of map projections, but they are not map projections themselves
  • coordinate systems and map projections are different concepts with different purposes
  • map projections define how positions on Earth's surface are transformed onto a flat map surface
  • the coordinate system is then superimposed on the surface to provide a referencing framework by which positions are measured and computed

Universal Transverse Mercator
used to define horizontal positions worldwide by dividing the Earth's surface into 6-degree zones, each mapped by the Transverse Mercator projection with a central meridian in the center of the zone
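
A minimal sketch of the 6-degree zone arithmetic, assuming the usual convention that zone 1 starts at 180 degrees west and zones are numbered eastward up to 60. The function names are hypothetical, and the sketch ignores the special-case zones defined around Norway and Svalbard.

```python
def utm_zone(lon_deg):
    """UTM zone number (1..60) for a longitude in degrees, east positive."""
    shifted = (lon_deg + 180.0) % 360.0  # 0 at 180 degrees west, wraps at 180 east
    return int(shifted // 6.0) + 1

def central_meridian(zone):
    """Longitude (degrees) of the central meridian of a UTM zone."""
    return zone * 6 - 183

for lon in (-75.0, 0.0, 139.7):
    zone = utm_zone(lon)
    print(lon, "-> zone", zone, "central meridian", central_meridian(zone))
```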
Spatial Relationships

  • Topological Relationships
  • contiguity: next to, sharing a border
  • connectivity: connected to, allowing flow from one to the other
  • proximity: near
  • closure: contained

Spatial attributes
always interested in location

Non-Spatial Attributes

 

easily handled by non-spatial databases

 

 

spatial attributes and relationships are not handled well by non-spatial databases

Spatial Data Acquisition

Land Surveying

Remote Sensing

Global Positioning System

Geographic Data Collection

  • ground surveys
  • tape measures
  • LIDAR

Global Positioning System (GPS)

  • U.S. DoD project devised in 1970s
  • constellation of 24 satellites
  • 6 orbital planes
  • 55 degree inclination
  • receiver computes position, velocity, and time; four GPS satellites are used to calculate position in three dimensions and to offset receiver clock error

Radio Frequency Identification (RFID)

  • Can be passive or active
  • can range from few cms to 20-30 ft
  • high-powered tags can transmit into low orbit
  • point-of-sale terminals used for customer tracking

RFID

  • Can be virtually undetectable
  • can be woven into materials
  • uses a 96-bit code as a string of 96 zeroes and ones

Examples of GIS

  • Digital Raster Graphic (DRG)
  • Digital Line Graph (DLG)
  • Digital Elevation Model (DEM)
  • LiDAR derived elevation data

Non spatial Data for Thematic Mapping

  • Geographic referencing for attribute data uses the hierarchical referencing system
  • discrete, often parcel-based
  • municipal addresses

Non Spatial Levels of Measurement

usually determined by the valid operations

  • numerical levels

Stevens's Levels of Measurement

  • Binary Scale
  • Nominal Scale
  • Ordinal Scale
  • Interval Scale
  • Ratio Scale

Binary Scale
presence or absence. frequently used to represent discrete data in raster systems
Nominal Scale
making distinctions in kind/class category
Ordinal Scale

  • Difference in Rank
  • can go beyond just equal or not equal
  • sorting is possible
  • non parametric/ qualitative

Interval Scale

 

uses a relative scale, zero is arbitrary and can have negative values
Ratio Scale
uses an absolute zero. Zero represents the absence of the phenomenon
Criticism of Stevens's levels of measurement

  • not adequate to address all possible circumstances in cartography and GIS
  • ratio is not the highest level of measurement
  • cyclic measures do not fit the concept of scale well

Data Dictionary
defines our terms and levels of measurement, and translates the computer representation of attributes.
Data entry

  • Manual coordinate capture
  • attribute capture
  • digital coordinate capture
  • data import

Editing

  • manual point, line, and area feature editing
  • manual attribute editing
  • automated error detection and editing

data management

  • copy, subset, merge data
  • versioning
  • data registration and projection
  • summarization, data reduction
  • documentation

Analysis

  • spatial query
  • attribute query
  • interpolation
  • connectivity
  • proximity and adjacency
  • buffering
  • terrain analysis
  • boundary dissolve
  • spatial data overlay
  • moving window analysis
  • map algebra

Output

  • map design and layout
  • hardcopy map printing
  • digital graphic production
  • export format generation
  • metadata output
  • digital map serving

Cartography

  • historically, cartographic products from GIS were not good
  • Digital Maps (soft copy)

human computer interaction
optimization of communication and presentation of geographic information
Computational Steering
user views intermediate results of spatial analysis and based on those results, the algorithmic parameters may be interactively changed
Isarithmic Mapping

  • mapping a real or conceptual 3-D geographical volume with quantitative line symbols. It is a planimetric representation of a 3-D volume

Two forms of Isarithmic Maps

  • Isometric
  • Isoplethic

 

Both involve the planimetric mapping of the traces of the intersections of horizontal planes with the 3-D surface

Isometric Maps

generated from point data

 

Actual – measure with instruments or other point sampling techniques

 

Derived: statistical measures and magnitudes

Isoplethic Maps
generated from mapping data that occurred over geographic areas called unit series
Isarithmic Map Construction

  • the magnitude or value of an isarithmic line represents its vertical distance from the datum
  • planes are constructed parallel to the datum, so each isarithm has a constant magnitude (distance from the datum)

When to use Isarithmic Maps

  • mapped data must be a geographical volume and must have a surface that bounds the volume
  • data must be continuous, not discrete
  • must be familiar with the phenomenon being mapped

Nearest Neighbor Interpolation (Thiessen Polygons)

  • Assumption is that the value at each given location is the same as the value of the nearest observation
  • Uses the concept of Thiessen Polygons
  • defined around each observation 
  • only one observation is contained in each polygon
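
The assumption above can be sketched in a few lines of Python: the estimate at any location is simply the value of the closest observation, which is equivalent to looking up the Thiessen polygon that contains the location. The function name `nearest_neighbor` and the sample observations are hypothetical.

```python
import math

def nearest_neighbor(x, y, observations):
    """Return the value of the observation closest to (x, y).

    observations is a list of (x, y, value) tuples.
    """
    nearest = min(observations, key=lambda o: math.hypot(x - o[0], y - o[1]))
    return nearest[2]

# Hypothetical nominal-scale observations (land cover classes)
obs = [(0.0, 0.0, "forest"), (10.0, 0.0, "urban"), (0.0, 10.0, "water")]
print(nearest_neighbor(2.0, 1.0, obs))  # forest
print(nearest_neighbor(8.0, 3.0, obs))  # urban
```

Because the function returns an observed value unchanged, it works for class labels as well as numbers, which is why the method suits nominal data.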

Advantages of Nearest Neighbour Interpolation
most appropriate for qualitative data (nominal in scale)
Disadvantages

  • assumes values only change at the borders
  • does not perform well with interval or ratio scale data because such data form a continuous surface
  • resulting surface is discontinuous without smooth graduations between observed values

Inverse Distance Weighted Interpolation

  • Assumption:
  • the value of an attribute z at each unvisited grid cell location is a distance-weighted average of the values at nearby observation points
  • Original data points are located on a regular grid or irregularly distributed and interpolated to locations on a denser regular grid

Inverse Distance Weighted Method

d = distance between the given point and an observation point

z = observation point

N = number of observations

i = iteration index

z1 = weighted value at a given point

r = weighting parameter

What is the relationship between r and nearby points in the Inverse Distance Weighted Method?
An increase in r means that distance is more heavily weighted; thus, the predicted value will be more like the values of closer points
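
To make the role of r concrete, here is a hedged sketch of a common form of inverse distance weighting, in which each observation is weighted by 1 / d^r and the weighted values are divided by the sum of the weights. This particular formula is an assumption, since the notes list the symbols but not the equation, and the function name `idw` is hypothetical.

```python
import math

def idw(x, y, observations, r=2.0):
    """Inverse distance weighted estimate at (x, y).

    observations is a list of (x, y, z) tuples; r is the weighting parameter.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for ox, oy, z in observations:
        d = math.hypot(x - ox, y - oy)
        if d == 0.0:
            return z  # the surface passes through observed values
        w = 1.0 / d ** r  # closer observations receive larger weights
        weighted_sum += w * z
        weight_total += w
    return weighted_sum / weight_total

obs = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
for r in (1.0, 2.0, 4.0):
    print(r, idw(2.0, 0.0, obs, r))
```

At the point (2, 0), which lies nearer the observation with value 10, the estimate moves toward 10 as r increases. The estimate also always stays between the smallest and largest observed values.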

Inverse Distance Weighted IDW Interpolation summary

 

  • based on the principle in geography that things closer together are more similar
  • if a location has no measured value, IDW looks within a specified neighborhood around the point of interest and identifies the measured values there
  • the closer observations will influence the predicted value more than observations measured farther away
  • closer points are thus more heavily weighted than points farther away; as the distance from the point to be predicted increases, the weight given to a value decreases

Inverse Distance Weighted Interpolation

Advantages

  • results in a smooth and continuous surface that changes between observations
  • derived surface passes through observed values

Disadvantages

  • requires subjective selection of parameters 
  • does not interpolate beyond min and max values in observation sets

Design Process

  • Graphic design produces visual forms
  • assign qualitative and quantitative meanings to the distinctive marks: relate graphic characteristics of the marks to attributes of the data
  • arrange the marks in a total composition that enhances map communication with the user, where the intended information is conveyed
  • skill and artistry important for map design

Criteria for a Good Design

  • should be suited to the needs of map users
  • should be easy to use
  • should be accurate, present information without error and distortion
  • should be clear, legible, and aesthetically pleasing
  • symbols, color, layout, and typographic appearance
  • should be thought provoking and communicative

Design of Symbols

 

  • Class of symbols
  1. point, line and area polygon symbols are the basic elements used to create all visual design

Point
conveys a position
Line
exhibits direction as well as position
Area
exhibits extent, direction, and position; graphically uniform over the area, with even color or uniform repetition of point or line symbols
Shape
regular or geometric shapes, irregular shapes as well
Size
different geometric dimensions
Color
hue, value, and saturation
Pattern
combination of basic repetitive graphic elements that produces an areal graphic effect
Pattern
exhibits the characteristics of arrangement, texture/spacing, and orientation
Arrangement
shape and configuration of component marks that make up a pattern
Texture
size and spacing of component marks that make up a pattern; a fine texture has closely spaced small marks
Design Principles

  1. Clarity and Legibility
  2. Contents, Information, and Map Space
  3. Visual Contrast
  4. Figure-Ground Organization
  5. Visual Hierarchy

Map Composition and Layout

  1. Map Elements
     • title
     • map legend
     • map scale
     • map symbols
     • credits
  2. Aesthetics of Map Composition
  3. Visual Balance

Golden Section

  • it is widely accepted that a rectangle with sides in a proportion of about 3:5 is the most aesthetically pleasing format
  • in mathematics, a geometric proportion in which a line is divided such that the ratio of the length of the longer line segment to the length of the entire line is equal to the ratio of the length of the shorter line segment to the length of the longer line segment
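
The proportion described above can be checked numerically. This sketch divides a line of a given length at the golden section and compares the two ratios; the constant `PHI` and the function name `split_golden` are names chosen for the example.

```python
import math

# Golden ratio: longer segment divided by shorter segment
PHI = (1 + math.sqrt(5)) / 2

def split_golden(total):
    """Divide a line of length total into (longer, shorter) golden-section parts."""
    longer = total / PHI
    shorter = total - longer
    return longer, shorter

longer, shorter = split_golden(1.0)
print("longer / whole:  ", longer / 1.0)
print("shorter / longer:", shorter / longer)
print("5 / 3 compared with PHI:", 5 / 3, PHI)
```

The two printed ratios agree, and 5/3 is close to the golden ratio, which is why a 3:5 rectangle is a convenient approximation.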

Typography/Lettering

  • used to name places; identify or label objects, provides titles, legends, and other explanatory elements
  • letterform characteristics, size, letter spacing typeface personalities and legibility are important aspects of lettering.

Image Draping
established technique in GIS. Draping a topographic or thematic map onto a 3-D terrain surface is effective but relies on abstract colors, shading, and symbols. Draping a satellite image such as a digital orthophoto, results in good surface texture and can produce visualizations suitable for depicting landscape-scale vegetation patterns
Geometric Video Imaging
combines video imaging techniques with geometric registration typically undertaken within a GIS. It is rarely used on a production basis due to the difficulty of accurately georeferencing the photographic video image with the 3-D perspective
VRML (Virtual Reality Modeling Language)
standard file format for representing 3-D interactive vector graphics designed particularly for use with the world wide web
Moore’s Law
the number of transistors incorporated in a chip will approximately double every 24 months.
Digital Representation of Geographic Data

  • Geographic Databases are dynamic, not static; allows interactive data analysis
  • data models; methods of data representation

Object
representation of reality, not the thing itself, it is a model
Feature
an entity being represented by an object. A defined entity and its object representation. This term does not make a distinction between the real thing and the model
Entity
A real-world phenomenon; entities have relationships and attributes, which can be spatial or non-spatial
Object
discrete and definite
phenomena
distributed continuously over a large area
Representing Geographic Space

approaches to representation of real world in geographic databases; 

 

  • Object based model
  • field based model

Spatial Relationships

  1. contiguity – next to sharing a border
  2. connectivity- connected to
  3. proximity
  4. closure/ containment

Spatial Attributes
always interested in location
Time
component of Berry’s geographic matrix
Temporal Relationships
time, algebra, measurement, relationship
Measurement of Time

  • Time measurement for objects can be made at an instant, which is a measurement of existence
  • can also be made for a duration, which is a measurement of evolution, occurrence, and permanence

Relationship between Time and Space

  • spatial changes refer to geometric transformation of an object and includes the change in location, size, orientation and form
  • spatial relationships among objects may change as a function of geometric transformations

Temporal Attributes of Geographic Process

  1. Generation Time: time at which object is created 
  2. Duration Time: time during which an object is in existence or is observed
  3. Temporal Significance: importance of a given event
  4. Temporal Scale: analogous to map scale adapted to time

Logical Organization
how data are classified and feature coded to facilitate identification of relationships between data items
physical organization
method by which data items are stored on a computer
Data Classification/classification scheme
purpose of classification scheme: provide an a priori standard with which individual observations can be observed and recorded during data collection process
Two components of Classification Schemes

  • Descriptive Names of classes and subclasses
  • Definitions of classes and subclasses

Entity
spatial object that has specific properties that categorically separate it from other entities. These properties are known as attributes
Entity class, entity type, or feature class
collectively, entities that share common attributes
Feature Codes
process of encoding the values of the entities and attributes to graphical elements during the data collection process
Feature Codes comprised of two components

  1. Major Code
  2. Minor Code

Major Code
identifies the entity type to which a particular entity belongs
Minor Code
Identifies the attributes that an entity has; also referred to as an attribute code
Feature codes may be
alphabetic, numeric, or alphanumeric
Precision
function of the number of bits used
Byte
smallest addressable unit in the computer, an 8-bit data item
Data Item
basic building block of data organization in the computer; an occurrence or instance of a certain characteristic pertaining to an entity
Related Data Items
data items are occurrences of different characteristics pertaining to the same entity
Record
a stored record, or tuple
Data File
formed by grouping records together

True/False

 

 

Computer processing is based on databases rather than data files

 

True
Raster Data Representation
Grid Cells representing areas of the same entity type have identical values or patterns of the values tend to be spatially clumped
Raster Data
run-length encoding is a raster data compression algorithm
Raster Data Representation Encoding
Adjacent cells along a row with the same values are treated as a group. The value is stored once, together with the number of cells that comprise the run
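
The encoding just described can be sketched as follows: each run along a raster row is stored as a (value, count) pair, and decoding expands the pairs back into cells. The function names `rle_encode` and `rle_decode` and the sample row are hypothetical.

```python
def rle_encode(row):
    """Run-length encode one raster row into a list of (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [tuple(run) for run in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original row."""
    row = []
    for value, count in runs:
        row.extend([value] * count)
    return row

row = [1, 1, 1, 2, 2, 3]
encoded = rle_encode(row)
print(encoded)  # [(1, 3), (2, 2), (3, 1)]
```

The saving depends on how spatially clumped the values are: a row with no repeated neighbors produces one pair per cell and gains nothing.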
Clustering
goal is to reduce seek and latency time in answering common large queries. For spatial databases this implies that objects that are adjacent in space and are commonly requested jointly by queries should be stored physically together in secondary memory
Three types of Clustering for SDBMS

  1. Internal Clustering
  2. Local clustering; and
  3. Global Clustering

Internal Clustering
to speed access to a single object, the complete representation of one object is stored in one disk page, assuming its size is smaller than the free space on the page. Otherwise the object is stored on multiple, physically consecutive pages
Local Clustering
to speed access to several objects, a set of spatial objects is grouped onto one page
Global clustering
a set of spatially adjacent objects is stored not on one but on several physically consecutive pages that can be accessed by a single read request
Clustering
Design of spatial clustering techniques is more difficult than traditional clustering since there is no natural order in the multi-dimensional space where spatial data reside. Also, disk storage is logically only a 1-D device
Clustering
regarding addressing systems, procedures exist to represent relative locations of the 2-D or 3-D kind in 1-D systems
Clustering

  • The mapping from the higher dimensional space onto the line should be distance preserving
  • No two points in the space are mapped onto the same point on the line, so the mapping should be one to one

Row order (TV)
longer total path than the row prime sequence (in which every other line is traversed in the reverse direction), and has several places in which neighbors on the path are not adjacent in space
Diagonal and spiral orders
like the row prime sequence in possessing the property of immediate adjacency
Diagonal Sequence
mixes up corner and side joins
Spiral Order
Terminates in the middle, making it impossible to connect other blocks of space
Comparison of Paths

  1. Total length of the path
  2. variability in unit lengths, where unit length is the distance from one point on the path to the next sequence;
  3. the average distance on the path from the tiles to their neighbors in space
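As a rough illustration of the first two comparison criteria, the sketch below builds a row order path and a row prime path over an n x n grid of tiles and measures them. It assumes unit-spaced tile centers; the function names are made up for this example.

```python
import math


def row_order(n):
    """Row order (TV scan): every row is traversed left to right."""
    return [(x, y) for y in range(n) for x in range(n)]


def row_prime_order(n):
    """Row prime: every other row is traversed in the reverse direction."""
    path = []
    for y in range(n):
        xs = range(n) if y % 2 == 0 else range(n - 1, -1, -1)
        path.extend((x, y) for x in xs)
    return path


def summarize(path):
    """Total path length plus the shortest and longest single step."""
    steps = [math.dist(a, b) for a, b in zip(path, path[1:])]
    return {"total": sum(steps), "min_step": min(steps), "max_step": max(steps)}


print(summarize(row_order(4)))
print(summarize(row_prime_order(4)))
```

On a 4 x 4 grid every row prime step has unit length, while row order includes a long jump back to the start of each new row, which is what makes its total path longer.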

Space Filling curves

  • Special fractal curves which have characteristics of completely covering an area or volume
  • topological dimension of 2
  • if a point is 0-D, it is not possible to define a 1-D curve passing through the infinity of points
  • if a point is conceptualized as a 2-D square, the side of which tends toward zero, it is possible to find a curve filling a 2-D space
  • In 3-D a point can then be defined as a small cube for which the side length tends toward zero, and the curve as a 3-D curve

Paths as Space Filling Curves

  1. curve must pass only once through every point in multidimensional space
  2. two points that are neighbors in space must be neighbors on the curve
  3. two points that are neighbors on the curve must be neighbors in space
  4. it should be easy to retrieve neighbors at any point
  5. curve corresponds to a mapping from a multi- to a 1-D space
  6. curve should be able to be used for variable spatial resolution
  7. curve should be stable

Peano or N ordering
facilitates retrieving neighbors; while neighboring points in space are not always neighbors on the curve, they usually are
Hilbert Curve
passes through all points in a set by means of single length steps only. It meets many of the criteria of an ideal curve, but does not enable the easy retrieval of neighbors. Unstable
Algorithm for the Z curve (Peano)

  1. Read the binary representation of the x and y coordinates
  2. Interleave the bits of the binary numbers into one string
  3. Calculate the decimal value of the resulting binary string
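The three steps above can be sketched as follows. One assumption not stated in the notes: within each interleaved pair the x bit is placed before (more significant than) the y bit. Swapping that choice gives the mirrored curve. `z_value` is an illustrative name.

```python
def z_value(x, y, n_bits):
    """Interleave the n_bits-bit binary forms of x and y into one integer key."""
    key = 0
    for i in range(n_bits - 1, -1, -1):       # most significant bit first
        key = (key << 1) | ((x >> i) & 1)     # assumed order: x bit first
        key = (key << 1) | ((y >> i) & 1)     # then the matching y bit
    return key


# x = 2 is binary 10 and y = 3 is binary 11; interleaved: 1 1 0 1 = 13
print(z_value(2, 3, n_bits=2))  # 13
```

Sorting cells by this key lays a 2-D grid out along a 1-D address range, which is the property a spatial index relies on.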

Algorithm for the Hilbert Curve

  1. Read in the n-bit binary representation of the x and y coordinates
  2. interleave bits of the two binary numbers into one string
  3. divide the string from left to right into 2-bit strings
  4. give a decimal value, d, for each 2-bit string
  5. for each number j in the array i
  6. convert each number in the array to its binary representation, concatenate all the strings in order from left to right, and calculate the decimal value
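For comparison, here is a sketch of a Hilbert index computed with the widely used rotate-and-flip formulation. This is a swapped-in technique, not the bit-string procedure listed above, and it may orient the curve differently from the one these notes have in mind. It assumes a square grid whose side `n` is a power of two; `hilbert_index` is an illustrative name.

```python
def hilbert_index(n, x, y):
    """Distance along a Hilbert curve of cell (x, y) on an n x n grid (n = 2**k)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)   # which quadrant, in curve order
        # Rotate/flip so the sub-quadrant is seen in canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


order = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda c: hilbert_index(4, *c))
print(order[:4])  # [(0, 0), (1, 0), (1, 1), (0, 1)]
```

Every step between consecutive cells in `order` is a move to a side neighbor, which is the single length step property noted for the Hilbert curve.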

Disk access
Hilbert curve method is slightly better than the Z-curve because it does not have any diagonal lines
Block
square region that is the result of one or more quadtree subdivisions of the original image. a quadtree recursively subdivides space into four equal parts
Handling regions
Each object can be uniquely represented by the Z-values of its blocks. Each such Z-value can be treated as a primary key of a record of the form
Space Filling curves for GIS systems

  1. Make scanning operations more efficient (hardware devices or scanning thru datafiles)
  2. They are used as spatial indexes, simplifying 2-D addressing as 1-D addressing

Vector Data Representation

  • Vector Data model is object based
  • best utilized to represent discrete objects
  • spatial objects are represented individually and represented mathematically, via coordinates
  • Vector data model is more complex than raster data model
  • Wide variety of formats

Vector Data Representation Formats

  1. Decomposing spatial objects into basic graphical elements
  2. use of topology (spatial relationships) as well as geometry coordinates

Vector Terminology

  • Lines (arcs)- begin and end with a node
  • Polygons (areas) – closed loop of coordinates
  • points

Spaghetti data model

  • not structured vector data from map digitizers, CAD- stores graphical elements, not graphical entities
  • redundant- stored twice
  • must be structured for use in GIS

Topological Data Model

  • Structured vector data
  • many variants of model – most common is the arc-node data model
  • Arc = line segments
  • node = end points of line segments
  • stores graphical elements rather than graphical entities
  • stored topological relation allows graphic entities to be constructed

Spatial Data Transfer Standard (SDTS)

robust way of transferring earth-referenced spatial data between dissimilar computer systems with the potential for no information loss. It is a transfer standard that embraces the philosophy of self-contained transfers: spatial data, attributes, georeferencing, data quality report, and data dictionary

 

Classification and intended use of objects in SDTS

  • Geometry only – for drawing, display and geometrically defined operations on raster and vector data structures
  • geometry and topology – for vector data structures that use geometric drawing and topological operations
  • topology only – for certain analytical operations

Point

  • a zero dimensional object that specifies geometric locations
  • Entity point – point used for identifying the location of point features such as towers and buoys
  • Label Point – a reference point used for displaying map and chart text to assist in feature identification
  • Area Point- representative point within an area usually carrying attribute information about that area

Node
zero dimensional object that is a topological junction of two or more links or chains, or an end point of a link or chain
Line and Line Segment

  • A line is a generic term for a one dimensional object
  • Line segment- direct line between two points

String
connected nonbranching sequence of line segments specified as the ordered sequence of points between those line segments. A string may intersect itself or other strings
Arc
a locus of points that forms a curve that is defined by a mathematical expression
Link
a topological connection between two nodes. A link may be directed by ordering its nodes
Chain

  • a directed nonbranching sequence of nonintersecting line segments and arcs bounded by nodes, not necessarily distinct at each end

Complete Chain
chain that explicitly references left and right polygons and start and end nodes. It is a component of a two dimensional manifold
Network Chain
chain that explicitly references start and end nodes and not left and right polygons
Ring
sequence of nonintersecting chains or strings and/or arcs with closure. A ring represents a closed boundary, but not the interior area inside the closed boundary
Interior Area
Area not including its boundary
G-Polygon
area consisting of an interior area, one outer G ring and zero or more non intersecting nonnested inner G-rings. No ring, inner or outer, must be collinear with or intersect any other ring of the same G-polygon
GT polygon
area that is an atomic two dimensional component of one and only one two dimensional manifold
Universe polygon
defines the part of the universe that is outside the perimeter of the area covered by other GT-polygons. This polygon completes the adjacency relationships of the perimeter links
Pixel
two dimensional picture element that is the smallest nondivisible element of a digital image
Grid Cell
two dimensional object that represents the smallest nondivisible element of a grid
Two dimensional aggregate spatial objects
Certain two dimensional aggregate spatial objects must be defined to provide context for many of the simple objects defined above. these aggregate objects are necessary for the definition of raster objects, topology
Digital Image
Two dimensional array of regularly spaced picture elements (pixels) constituting a picture
Grid
two dimensional set of grid cells forming a regular tessellation of a surface
Rectangle variant Grid
Each row and column of the Grid may have independent thickness or width
Layer
an areally distributed set of spatial data representing entity instances within one theme, or having one common attribute or attribute value in an association of spatial objects. A layer is specifically a two, three, or N-dimensional array of attribute values associated with all or part of a grid, image, voxel space or any other type of raster data.
Raster
one or more related overlapping layers for the same grid, digital image, voxel space, or any other type of raster data. The corresponding cells between layers are registered to the same raster object scan reference system. The layers overlap but need not be of the same spatial extent
Graph
set of topologically interrelated zero dimensional, one dimensional, and sometimes two dimensional objects that conform to a set of defined constraint rules
Planar Graph
node and link or chain objects of the graph occur or can be represented as though they occur upon a planar surface. Not more than one node may exist at any given point on the surface
Network
a graph without two dimensional objects. If projected onto a two-dimensional surface, a network can have either more than one node at a point and/or intersecting links or chains without corresponding nodes
Voxel
a three dimensional object that represents the smallest nondivisible unit of a voxel space (volume) (think cube)
Voxel Space
3 D array of voxels in which the volumetric dataset resides. The volume represents some measurable properties or independent variables of a real object or phenomenon.
MetaData/MetaInformation

  • documents knowledge of data accuracy, provenance, and age necessary for good decision making
  • can build a GIS catalog, internal or external portals allow others to search, find, and access the GIS resources

FGDC
Federal Geographic Data Committee, with the goal of providing a complete description of a data source
ISO

  • International Organization for Standardization
  • attempts to satisfy the requirements of all existing metadata standards; flexible, allowing general or detailed descriptions

XML

  • eXtensible Markup Language was developed by the World Wide Web Consortium
  • standard for designing text formats

GIS
A GIS is a collection of hardware, software, data, policies, procedures, and people for the input, storage and retrieval, manipulation and analysis, output, and modeling of spatially referenced data
Relational DBMS

  • object oriented database systems
  • Small World

GIS Fact
A GIS is sometimes referred to as a scaleless system, but this is not true because you are not changing the scale at which the data were originally collected
Cognitive GIS
deals with non-euclidean coordinates
Error Propagation
can amplify or cancel out your operations. Understanding how errors propagate is extremely important
Digital Raster Graphic (DRG)
scanned hardcopy map that has been georeferenced
Line
Graphical
Chain/arc
topological
Differential Post Processing
Most accurate form of GPS
Projection Surface

  1. Plane (flat surfaces) – azimuthal projection family
  2. cylinder – cylindrical projection family
  3. Cone – Conic projection family

Light Source Position

  1. Gnomonic: light source is at the center of the globe (a flashlight shining from the center)
  2. Stereographic: light source is at the point exactly opposite the point of tangency of the projection surface (from the poles)
  3. Orthographic: light source is at a considerable distance (infinite point), so light rays are parallel. Often used for perspective views, which means the map is perspective but neither conformal nor equal in area.


Normal (regular Projection)

  • normal orientation for a plane is tangent at the pole (polar azimuthal)
  • a cylinder is normally oriented so that it is tangent along the equator (equatorial)
  • a cone is normally oriented so that it is tangent along a parallel with its apex over the pole, in alignment with the axis of rotation

Transverse Projection

  • projection is turned 90 degrees from normal
  • for a plane, the plane is tangent at the equator
  • for a transverse cylindrical projection, the cylinder is tangent along a meridian
  • transverse conic: not frequently seen

Oblique Projection
Projection surface lies at an angle somewhere between the normal and transverse position
Tangent Projection

  • The projection surface is tangent to the globe
  • a planar surface is tangent to the globe only at one point
  • tangent cones and cylinders contact the globe along a line

Secant Projection

  • projection surface intersects the globe instead of merely touching the surface
  • a planar surface intersects the globe forming a small circle along the intersection line
  • A cone or cylinder intersects the globe resulting in two small circles along intersecting lines

Standard Point

  • point at which a planar surface touches the globe
  • only the one standard point exists for a planar tangent projection
  • directions from the point are accurate; great circles passing through it are represented as straight lines
  • Distortions of area and angle have a circular pattern and increase with distance from the standard point

Standard line (line of true scale)

  • line along which projection surface touches or intersects the globe
  • there is one standard line when a planar surface intersects the globe or a cone or cylinder is tangent to the globe
  • there are two standard lines when cones or cylinders intersect the globe
  • along a standard line a map has no distortion and the map scale is identical to that of the nominal globe
  • Geometric distortion generally increases with distance

Geodemographics
demographic information about a population that is spatially consolidated
Exact Interpolators

  • honor the data points upon which interpolation is based
  •  surface passes through all points whose value is known
  • proximal interpolation, B-splines, and Kriging methods all honor given data points
  • Kriging may incorporate a nugget effect; if so, it is not an exact interpolator
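To make "honoring the data points" concrete, here is a minimal sketch of proximal (nearest neighbor) interpolation, the simplest of the exact interpolators listed above. The sample values and the function name are made up for illustration.

```python
import math


def proximal_interpolate(samples, x, y):
    """Return the value of the known sample closest to (x, y).

    samples is a list of (x, y, value) tuples for points whose value is known.
    """
    nearest = min(samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))
    return nearest[2]


# Hypothetical elevation samples.
samples = [(0.0, 0.0, 10.0), (1.0, 0.0, 14.0), (0.0, 1.0, 12.0)]
print(proximal_interpolate(samples, 0.9, 0.1))  # 14.0
```

Evaluated at any sample location the function returns exactly that sample's value, so the surface passes through every known point. An approximate interpolator would instead smooth across them.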

Approximate Interpolators

  • used when uncertainty exists about given surface values
  • utilize the belief that in many data sets there are global trends which vary slowly, overlain by local fluctuations which vary rapidly and produce uncertainty (error)
  • Effect of smoothing will therefore reduce the effects of error on the resulting surface

Stochastic Interpolators

  • incorporate concept of randomness
  • interpolated surface is conceptualized as one of many that might be observed all of which could have been produced with known data points
  • Kriging is stochastic because it allows the statistical significance of the surface and the uncertainty of predicted values to be computed

Deterministic Interpolators

  • do not use probability theory
  • example: TIN linear

Fuzzy Math
fuze up, precision up, knowledge up, and is easier to work with
Tobler's Law of Geography
Things closer together in space tend to be more similar than things farther apart
Kriging Interpolation Method

  • geostatistical
  • basis of the method is the rate at which the variance between points changes over space; this is expressed in the variogram, which shows how the average difference between values at points changes with the distance between points
  • requires characterization of spatial data to set parameters
  • usually more accurate
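The variogram idea can be sketched with an empirical semivariogram: for every pair of sample points, take half the squared difference in value and average it within distance classes (lags). This is a simplified sketch under assumed conventions (fixed-width lag bins, no directional component, no model fitting), not a full Kriging workflow. The function name is illustrative.

```python
import math
from collections import defaultdict


def empirical_semivariogram(samples, lag_width):
    """Map lag bin k (distances in [k*lag_width, (k+1)*lag_width)) to semivariance.

    samples is a list of (x, y, value) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            xi, yi, zi = samples[i]
            xj, yj, zj = samples[j]
            lag = int(math.hypot(xi - xj, yi - yj) // lag_width)
            sums[lag] += (zi - zj) ** 2
            counts[lag] += 1
    # Semivariance is half the mean squared difference within each bin.
    return {lag: sums[lag] / (2 * counts[lag]) for lag in sorted(sums)}


samples = [(0, 0, 1.0), (1, 0, 3.0), (2, 0, 6.0)]
print(empirical_semivariogram(samples, lag_width=1.0))  # {1: 3.25, 2: 12.5}
```

Semivariance rising with lag, as in this toy data, is the pattern Tobler's law predicts: nearby values differ less than distant ones.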

Color Models

RGB

  • most applicable to computer display devices
  • additive based on a mixture of Red, green, blue
  • color system viewed as a cube with red, green, and blue axes

 

Red + Green = Yellow

Red + Blue = Magenta

Green + Blue = Cyan

Red + Green + Blue = white

None = black

HSV Model

  • HSV system developed by Tektronix to specify the selection of color progressions from tints to shades
  • double cone with the central axis forming a lightness progression identical to black and white diagonal line through RGB cube
  • Many color graphic programs specify color based on hue, lightness value, and saturation

HSV Color Model

  • Hue is given as an angle counter clockwise between 0 and 360
  • lightness value is given as an integer between 0 (black) and 100 (white) and saturation between 0 (gray) and 100 (pure color)
  • triangular slice for each hue can also be viewed as a plane cut from the RGB cube and deformed into the HSV triangles
  • The transformation is linear: simple equations can be used to transform HSV specifications into RGB and vice versa

RGB to HSV

  • colors in the HSV are defined with respect to normalized red, green, and blue values
  • $r = R/(R+G+B)$
  • $g = G/(R+G+B)$
  • $b = B/(R+G+B)$
  • $r + g + b = 1$

Intensity
$I = \frac{1}{3}(R+G+B)$
Hue
$H = \cos^{-1}\left\{ \frac{\frac{1}{2}\left[(r-g)+(r-b)\right]}{\left[(r-g)^2+(r-b)(g-b)\right]^{1/2}} \right\}$
Saturation
$S = 1 - \frac{3}{R+G+B}\min(R,G,B) = 1 - 3\min(r,g,b)$
Range of calculated saturation and intensity
S and I lie in [0,1] but can be rescaled to [0,100]
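A minimal sketch of the RGB to hue, saturation, intensity formulas above. Two details are assumptions added here rather than taken from the notes: the inverse cosine only yields angles from 0 to 180 degrees, so the hue is reflected to 360 - H when b > g (the usual convention), and gray pixels, whose hue is undefined, are given a hue of 0. Intensity is divided by an assumed 8-bit maximum of 255 so that it lands in [0,1].

```python
import math


def rgb_to_hsi(R, G, B, max_value=255):
    """Return (hue in degrees, saturation in [0,1], intensity in [0,1])."""
    total = R + G + B
    intensity = total / (3 * max_value)
    if total == 0:
        return 0.0, 0.0, 0.0                     # black: hue and saturation undefined
    r, g, b = R / total, G / total, B / total    # normalized, r + g + b = 1
    saturation = 1 - 3 * min(r, g, b)
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        hue = 0.0                                # gray: hue undefined (assumed 0)
    else:
        ratio = max(-1.0, min(1.0, num / den))   # guard against rounding error
        hue = math.degrees(math.acos(ratio))
        if b > g:
            hue = 360.0 - hue                    # assumed convention for the lower half
    return hue, saturation, intensity


print(rgb_to_hsi(255, 0, 0))   # hue 0, saturation 1, intensity 1/3
```

Pure green and pure blue come out near 120 and 240 degrees respectively, matching the counterclockwise hue angle described for the model.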
HSV to RGB

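  • normalize the saturation and intensity to the range [0,1] and H to the range [0,360]
  • $r = \frac{1}{3}(1-S)$
  • $g = \frac{1}{3}\left[1 + \frac{S\cos H}{\cos(60^\circ - H)}\right]$
  • $b = 1 - (r+g)$
  • these expressions hold for hues in the sector from 120 to 240 degrees, with 120 first subtracted from H; the other sectors use the same expressions with r, g, and b permuted
  • range of computed r, g, b is [0,1] but can be rescaled to [0,255]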

.tfw
world file that contains the coordinate information for a TIFF image
Spatial Frequency
same occurrence in a certain feature
Spatial Data in Raster Structure

  • almost no coordinates are stored; in fact, there is only one coordinate pair that is stored, the origin
  • Block orientation
  • Storage: coordinate pair, cell size, orientation angle, matrix size (number of rows and columns), projection info, scale

  • Disadvantages of Raster

  • Geometry and topology are limited but implicit in data structure
  • there is no connectivity
  • have contiguity built into arrangement of cells just need to know the numbering scheme and can do proximity analysis
  • Index schemes – the order in which we store data in the system (Z scan, back and forth)
  • Problem: it takes time to access the data we want, because positions that are close in space are not necessarily close in storage
  • type of index scheme is application dependent
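Because only the origin is stored, every other cell position has to be computed. The sketch below shows one way that could work, under assumptions not stated in the notes: the origin is the upper-left corner of the grid, there is no rotation, rows are numbered downward, and cells are stored in plain row order. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RasterGrid:
    origin_x: float   # map x of the upper-left corner (assumed convention)
    origin_y: float   # map y of the upper-left corner
    cell_size: float
    n_rows: int
    n_cols: int

    def cell_center(self, row, col):
        """Map coordinates of the center of cell (row, col)."""
        x = self.origin_x + (col + 0.5) * self.cell_size
        y = self.origin_y - (row + 0.5) * self.cell_size
        return x, y

    def locate(self, x, y):
        """Row and column of the cell containing map point (x, y)."""
        col = int((x - self.origin_x) // self.cell_size)
        row = int((self.origin_y - y) // self.cell_size)
        if not (0 <= row < self.n_rows and 0 <= col < self.n_cols):
            raise ValueError("point lies outside the grid")
        return row, col

    def storage_index(self, row, col):
        """Position of the cell in a row order (row-major) file."""
        return row * self.n_cols + col


grid = RasterGrid(500000.0, 4100000.0, 30.0, n_rows=100, n_cols=200)
print(grid.cell_center(0, 0))    # (500015.0, 4099985.0)
print(grid.storage_index(2, 3))  # 403
```

With row order storage, vertical neighbors sit `n_cols` positions apart in the file, which illustrates why the choice of index scheme is application dependent.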

 

Vector Data Organization

  • Basic Unit – Chain
  • 3 Topological Objects
  • Nodes – 0 dimensional connection between chains
  • Chains – 1 D connection between nodes
  • Polygons – 2-D polygons share chain boundaries

Vector Facts

  • nodes bound chains
  • chains cobound nodes
  • polygons co bound chains
  • chain is basic unit since it relates to both nodes and polygons

Fully specified chain

  • From Node (left poly)
  • To Node (Right poly)
  • Chain is given a sense of direction

Nodes
Id, Point ID, List of Chains
Chains
ID, from node, to node – topological information, deep structure
Line ID, or point poly – graphical shape info element – surface structure

Polygon
ID, Label Point ID, Chain List
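A minimal sketch of how fully specified chains let topological questions be answered without touching coordinates. The record layout follows the fields listed above (ID, from node, to node, left and right polygon). The class name, the use of `None` for the area outside all polygons, and the sample IDs are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chain:
    chain_id: int
    from_node: int
    to_node: int
    left_poly: Optional[int]    # None stands for "outside all polygons" here
    right_poly: Optional[int]


def adjacent_polygons(chains):
    """Pairs of polygons that share a chain boundary (contiguity)."""
    pairs = set()
    for c in chains:
        if None not in (c.left_poly, c.right_poly) and c.left_poly != c.right_poly:
            pairs.add(frozenset((c.left_poly, c.right_poly)))
    return pairs


def chains_at_node(chains, node_id):
    """IDs of the chains that meet at a node (connectivity)."""
    return sorted(c.chain_id for c in chains if node_id in (c.from_node, c.to_node))


# Two polygons (1 and 2) that share chain 11 between nodes 1 and 2.
chains = [
    Chain(10, 1, 2, left_poly=None, right_poly=1),
    Chain(11, 1, 2, left_poly=1, right_poly=2),
    Chain(12, 1, 2, left_poly=2, right_poly=None),
]
print(adjacent_polygons(chains))   # {frozenset({1, 2})}
print(chains_at_node(chains, 1))   # [10, 11, 12]
```

Both answers come purely from the stored relations, which is why the chain serves as the basic unit: it ties nodes and polygons together.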
SQL Structured Query Language
syntax for defining and manipulating data from a relational database management system
ArcInfo

  • does not support SQL
  • Fields- smallest named unit – represents a characteristic

Entity – Record, File, Database

Record
collection of individual fields, an entry in the database
file
collection of records
Database
collection of files
Numeric Fields
Stores Numbers
Alphanumeric fields
stores numbers and letters
Kind of Number Fields

  • date fields
  • money fields – allow for only two decimal places
  • integers – do not have a decimal or fractional part
  • reals – do have a decimal or fractional part

Key fields

  • Primary Index into the Database
  • First Field that data are stored on

Database Architecture

  • flat file
  • hierarchical 
  • Network
  • relational
  • object oriented

Flat File

  • a table or spreadsheet
  • one file to hold all the data
  • row records – number of objects
  • columns – fields
  • for 1 kind of object number of attributes = number of columns
  • 2 kinds of objects MxN = land parcels and soil polygons

Disadvantages of Flat Files

 

  • Dead Space
  • Redundancy

Relational Database

  • Looks like a flat file
  • tables or files called relations
  • records are called tuples
  • fields are referred to as domains

Organize data into multiple relations

  • minimizes the redundancy
  • link separate relations together by redundancy
  • eliminates dead space
  • process of creating a unique table/relations from redundant tables/relations called normalization
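A small sketch of the idea, using made-up land parcel and soil data held in plain Python structures rather than a real DBMS. The flat file repeats the soil name on every parcel row; splitting it into two relations stores each soil name once and keeps only the shared `soil_code` as the link.

```python
# Hypothetical flat file: one row per parcel, soil description repeated.
flat = [
    {"parcel_id": 1, "owner": "Lee", "soil_code": "S1", "soil_name": "Sandy loam"},
    {"parcel_id": 2, "owner": "Kim", "soil_code": "S1", "soil_name": "Sandy loam"},
    {"parcel_id": 3, "owner": "Diaz", "soil_code": "S2", "soil_name": "Clay"},
]

# Relation 1: soils, keyed by soil_code (each name stored once).
soils = {row["soil_code"]: row["soil_name"] for row in flat}

# Relation 2: parcels, carrying only the key that links to the soils relation.
parcels = [
    {"parcel_id": row["parcel_id"], "owner": row["owner"], "soil_code": row["soil_code"]}
    for row in flat
]


def join(parcels, soils):
    """Rebuild the flat view by following soil_code into the soils relation."""
    return [dict(p, soil_name=soils[p["soil_code"]]) for p in parcels]


print(soils)                          # {'S1': 'Sandy loam', 'S2': 'Clay'}
print(join(parcels, soils) == flat)   # True
```

The only redundancy left is the key itself, and the join shows that no information was lost by splitting the table.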

Databases Need

  • A way to add data to a table without having to change the program used to access it
  • to create a system that allows multiple user access while maintaining security

Database Management System (DBMS)
system for providing efficient, convenient, and safe multi-user storage of and access to massive amounts of persistent data
Persistent Data
Data that outlives the software programs that were used to generate them
DBMS

  • Includes data, software/programs for storing and accessing the data as well as the security measures
  • User- Query – Query Processor – transaction manager- storage manager- data metadata

Transaction Manager
Enables entry of new data or allows changes to the database
Data Model
describes the conceptual structuring of the data stored in the database
DBMS History

  • 1970s – early systems were mostly hierarchical
  • 1980s – relational DB model developed, with an algebra that describes the model and a calculus for data retrieval
  • 1990s – object oriented DBs invented
  • 1997 – relational database usage surpassed hierarchical; it took 17 years to surpass hierarchical usage
  • 2000 – relational held 60% of the data in the market, object 5-7%, and hierarchical the rest of the market

2 main types of languages

  1. Data Manipulation Language
  2. Data Definition Language

 

Data Manipulation Language
Commands such as SELECT, INSERT, DELETE
Data Definition Language
Commands for creating the schema stored as metadata
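The two kinds of commands can be tried with Python's built-in `sqlite3` module and an in-memory database. The `parcel` table and its rows are made up for illustration. SQLite is used only because it ships with Python, not because these notes refer to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data Definition Language: create the schema.
conn.execute(
    "CREATE TABLE parcel (parcel_id INTEGER PRIMARY KEY, owner TEXT, area_ha REAL)"
)

# Data Manipulation Language: INSERT, DELETE, SELECT.
conn.executemany(
    "INSERT INTO parcel VALUES (?, ?, ?)",
    [(1, "Lee", 2.5), (2, "Kim", 4.0), (3, "Diaz", 1.2)],
)
conn.execute("DELETE FROM parcel WHERE parcel_id = ?", (3,))
rows = conn.execute(
    "SELECT owner, area_ha FROM parcel WHERE area_ha > ? ORDER BY parcel_id", (2.0,)
).fetchall()
print(rows)  # [('Lee', 2.5), ('Kim', 4.0)]

# The schema itself is kept as metadata; SQLite exposes it in sqlite_master.
tables = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
print(tables)  # [('parcel',)]
```

The last query shows the point made about DDL: the table definition is stored by the system as data about the data, separate from the rows.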
Levels at which DBs can be described

  1. Logical Level
  2. Conceptual Level
  3. Research Area

Logical Level
level at which users view the DB
Conceptual Level
concerned with information design
Research Area

Middleware

  • connects individual DBs to other application programs; reconciles the different terms used among the various DBs
  • Allows users to see, at the logical level, one DB and perform just one query

Disk Drives are Rated by

 

  • Seek Times
  • Read Times
  • DBs are concerned with where data are stored on the disk drive and affects retrieval time – good DBs have their own drivers to efficiently store data

ArcSDE

  • Arc Spatial Database Engine
  • Relational Model 
  • Examples 

Entity
like an object or thing
Entity Set

  • like a class = set of similar entities/objects
  • represented by a rectangle

Attributes

  • properties of entities in an entity set
  • Represented by an Oval
  • So entity sets turn into tables and attributes turn into columns in the table

Relationship

  • connect two or more entity sets
  • Represented by diamonds
  • taking represents a relationship set
  • think of this as a table with one column for each connected entity set, and one row for each list of entities that are connected

 

Multiway (n-ary) relationships

  • Consider a relationship between students, courses, and TAs
  • if we want to know which TAs are assisting with which students this design will not tell you that

Multiplicity of Relationships

  • Many to Many
  • Many to One
  • One to One
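A relationship set can be held as a table of pairs, and its multiplicity checked from the rows. The sketch below classifies one instance of a binary relationship. Note that multiplicity is properly a design constraint, so sample data can only show which constraints are not violated. The data and names are made up.

```python
from collections import defaultdict


def multiplicity(pairs):
    """Classify a binary relationship, read from the left entity set to the right."""
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)
    left_unique = all(len(v) <= 1 for v in rights_per_left.values())
    right_unique = all(len(v) <= 1 for v in lefts_per_right.values())
    if left_unique and right_unique:
        return "one-to-one"
    if left_unique:
        return "many-to-one"
    if right_unique:
        return "one-to-many"
    return "many-to-many"


# Students taking courses: one row per connected (student, course) pair.
taking = {("ann", "gis"), ("ann", "databases"), ("bob", "gis")}
# Each drinker has a single favorite beer, but a beer can have many fans.
favorite = {("ann", "ale"), ("bob", "ale"), ("cho", "stout")}

print(multiplicity(taking))    # many-to-many
print(multiplicity(favorite))  # many-to-one
```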

Beer Diagram

All of these relationships are many to many (use a round arrow to denote that drinkers have exactly one”

  • many drinkers might like the same beer (many to One)
