Pentaho:Contact Verify:GeoCode

GeoCode determines the longitude and latitude of an address. This can be based on the centroid of the five-digit ZIP Code or the nine-digit ZIP + 4, but in many cases it can be accurate to rooftop level when there is sufficient address data available.

Input Address

There are four options that control how the GeoCoder feature of the CVC acquires the data it needs for geocoding.

Not GeoCoding
Disables the geocoding feature completely.
Use Results of Address Process
If you are using the Address Verify features of the CVC, you can use the results based on those input columns as input for the GeoCoder. To do this, you must meet the minimum requirements for Address Verify: Address and ZIP Code, or address and city/state.
Address Key Column
If you have previously used the Address Verify feature on the input data, or used a similar product, such as Melissa Data's Address Object or WebSmart Address Verifier service, you may have an Address Key available. You can use this column as the input for the GeoCoder feature.
Address Components
If you are GeoCoding new data and are not using the Address Verify features of the CVC, you can map the following columns here instead of the Address tab: Address, Address 2, City, State, ZIP/Postal Code.
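
The minimum input rule described under Use Results of Address Process (Address plus ZIP Code, or Address plus City and State) can be checked before rows reach the component. The following is a minimal sketch only: the dictionary keys `Address`, `City`, `State`, and `ZIP` are hypothetical column names chosen for illustration, not names defined by the CVC.

```python
def has_minimum_address_input(record):
    """Return True if a row has Address plus ZIP, or Address plus City and State.

    The keys used here are hypothetical column names; substitute whatever
    names your input stream actually uses.
    """
    def filled(key):
        # Treat missing, non-string, or whitespace-only values as empty.
        value = record.get(key)
        return isinstance(value, str) and value.strip() != ""

    if not filled("Address"):
        return False
    return filled("ZIP") or (filled("City") and filled("State"))
```

Rows that fail a check like this could be routed to a separate stream rather than submitted for geocoding.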


Output Columns

Latitude
This column returns a string value containing the latitude for the centroid of the location described by the submitted address. Latitude is the geographic coordinate of a point measured in degrees north or south of the equator. The Web service uses the WGS-84 standard for determining latitude. Since all North American latitude coordinates are north of the equator, this value will always be positive.
Longitude
This column returns a string value containing the longitude for the centroid of the location described by the submitted address. Longitude is the geographic coordinate of a point measured in degrees east or west of the Greenwich meridian. The Web service uses the WGS-84 standard for determining longitude. Since all North American longitude coordinates are west of the Greenwich meridian, this value will always be negative.
Additional Columns
To map additional GeoCoder results, click the Additional Output Columns... button.
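
Because Latitude and Longitude are returned as strings, downstream steps usually need to convert them to numbers. The sketch below shows one way to do that in Python. It assumes that a record that could not be geocoded may carry blank or non-numeric values (an assumption, not documented behavior), and the function names are hypothetical.

```python
def parse_coordinates(latitude_text, longitude_text):
    """Convert Latitude/Longitude strings to a (lat, lon) tuple of floats.

    Returns None if either value is missing, non-numeric, or outside the
    valid WGS-84 range. Treating blanks as "not geocoded" is an assumption.
    """
    try:
        lat = float(latitude_text)
        lon = float(longitude_text)
    except (TypeError, ValueError):
        return None
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return lat, lon


def looks_north_american(lat, lon):
    """Sign check from the text above: positive latitude, negative longitude."""
    return lat > 0 and lon < 0
```

The sign check is only a sanity test; it does not confirm that a point actually lies in North America.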


GeoCoding Resolution

The available level of GeoCoding precision is based on your subscription level to the GeoCoder Object or the WebSmart GeoCoding Service, depending on your processing type (On-Premise or Off-Premise).

ZipCode
If you do not subscribe to the necessary GeoCoding product, GeoCoding will be limited to the five-digit ZIP Code level. Place Code and Census information will not be returned.
Street
If you subscribe to a GeoCoder product without GeoPoints, the CVC will return information at the street (ZIP + 4) level.
Rooftop
If you subscribe to a GeoCoder product with GeoPoints, the CVC will return information at the rooftop (ZIP + 4 + 2) level.


Output GeoCode Columns

Geographic Information

County Name
This column returns the name of the county where the input address is located.
County FIPS
This column returns the five-digit Federal Information Processing Standard (FIPS) code for the county where the input address is located. The first two digits identify the state and the last three identify the county within that state.
Place Code
This column returns the Census Bureau Place Code for the physical location of the input address. This information is useful when the boundaries of the ZIP + 4 overlap city limits.
Place Name
This column returns the official Census Bureau name for the location indicated by the Place Code.
Time Zone
This column returns the name of the time zone where the verified input address is located.
Time Zone Code
This column returns a one- or two-digit numeric code for the time zone where the verified input address is located. The number also indicates how many hours the time zone is behind UTC/GMT. For example, Eastern Standard Time has a time zone code of 5, indicating that the Eastern time zone is five hours behind UTC/GMT.
This number does not account for daylight saving time.
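
Since the Time Zone Code is the number of hours behind UTC/GMT, it can be turned into a fixed UTC offset. The following sketch uses Python's standard `datetime` module; the function name is hypothetical, and, like the code itself, the result reflects standard time only.

```python
from datetime import timedelta, timezone


def standard_utc_offset(time_zone_code):
    """Build a fixed-offset tzinfo from a Time Zone Code such as "5" or "10".

    The code counts hours *behind* UTC, so the offset is negated.
    Daylight saving time is not applied.
    """
    hours_behind = int(time_zone_code)
    if not 0 <= hours_behind <= 23:
        raise ValueError(f"unexpected time zone code: {time_zone_code!r}")
    return timezone(timedelta(hours=-hours_behind))
```

For an Eastern address (code 5), `standard_utc_offset("5")` yields a fixed offset of UTC-05:00.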

Census Information

The following columns return information useful in determining the demographic characteristics for the location of the input address.

CBSA Code
Metropolitan and micropolitan statistical areas (metro and micro areas) are geographic entities defined by the U.S. Office of Management and Budget (OMB) for use by Federal statistical agencies in collecting, tabulating, and publishing Federal statistics. The term “Core Based Statistical Area” (CBSA) is a collective term for both metro and micro areas. A metro area contains a core urban area of 50,000 or more population, and a micro area contains an urban core of at least 10,000 (but less than 50,000) population. Each metro or micro area consists of one or more counties and includes the counties containing the core urban area, as well as any adjacent counties that have a high degree of social and economic integration (as measured by commuting to work) with the urban core.
The CBSA Code is a five-digit code for the specific CBSA of the input address.
CBSA Level
This column returns the level of the CBSA for the submitted address: micropolitan or metropolitan.
CBSA Title
This column returns the official U.S. Census Bureau name for the Core Based Statistical Area (CBSA) of the input address.
CBSA Division Code
This column returns the numeric code for the division within the Core Based Statistical Area (CBSA), if any. Some CBSAs are broken into parts known as divisions, each with its own Code, Level, and Title. When the input address falls within a division, the CBSA Division columns are populated; otherwise, these columns are empty.
CBSA Division Level
This column returns the level of the CBSA division for the submitted address: micropolitan or metropolitan.
CBSA Division Title
This column returns the official U.S. Census Bureau name for the CBSA division of the input address.
Census Block
This column returns the Census Block number for the input data.

Census blocks, the smallest geographic area for which the Bureau of the Census collects and tabulates decennial census data, are formed by streets, roads, railroads, streams, and other bodies of water, other visible physical and cultural features, and the legal boundaries shown on Census Bureau maps.

A Census Block Group is a cluster of blocks having the same first digit of their 3-digit identifying numbers within a Census Tract or Block Numbering Area (BNA). For example, Census Block Group 3 within a Census Tract or BNA includes all blocks numbered between 301 and 397. In most cases, the numbering involves substantially fewer than 97 blocks. Census Block Groups never cross Census Tract or BNA boundaries; however, they may cross the boundaries of county subdivisions, places, American Indian and Alaskan Native areas, urbanized areas, voting districts, and congressional districts. Census Block Groups generally contain between 250 and 550 housing units, with the ideal size being 400 housing units.

Census Blocks are small areas bordered on all sides by visible features such as streets, roads, streams, and railroad tracks, and by invisible boundaries such as city, town, township, county limits, property lines, and short, imaginary extensions of streets and roads.

The Census Block column returns a 4-character string. The first digit is the Block Group and the last three characters (if any) are the Block Number.
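
The layout of the Census Block value (first character is the Block Group, remaining characters are the Block Number) can be split as shown in this minimal sketch. The function name is hypothetical, and returning `None` for a blank value is an assumption about how ungeocoded records might appear.

```python
def split_census_block(census_block):
    """Split a Census Block string into (block_group, block_number).

    The first character is the Block Group; the remaining characters,
    if any, are the Block Number. Returns None for a blank value.
    """
    text = census_block.strip()
    if not text:
        return None
    block_group = text[0]
    block_number = text[1:]  # may be an empty string
    return block_group, block_number
```

Keeping both parts as strings preserves any leading zeros in the Block Number.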

Census Tract
This column returns the Census Tract number for the input data.

Census Tracts are small, relatively permanent statistical subdivisions of a county. Census Tracts are delineated for all metropolitan areas (MAs) and other densely populated counties by local census statistical areas committees following Census Bureau guidelines (more than 3,000 Census Tracts have been established in 221 counties outside MAs).

This column returns a four- or six-character string value.

The Census Tract is usually returned as a 4-digit number. However, in areas that experience substantial growth, a Census Tract may be split to keep the population level even. When this happens, a 6-digit number will be returned.
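
When tract values are joined against other Census data, it can help to bring the 4-digit and 6-digit forms to a single width. The sketch below follows the Census Bureau convention that a tract with no suffix is written with a "00" suffix in its six-digit form. Whether the values returned by this column follow that convention exactly is an assumption, so verify against your own output before relying on it. The function name is hypothetical.

```python
def normalize_census_tract(tract):
    """Return a six-digit tract code from a four- or six-digit string.

    Assumes a four-digit tract is an unsplit tract whose implied
    two-digit suffix is "00" (Census Bureau convention).
    """
    text = tract.strip()
    if text.isdigit() and len(text) == 4:
        return text + "00"
    if text.isdigit() and len(text) == 6:
        return text
    raise ValueError(f"unexpected Census Tract value: {tract!r}")
```

Values that are neither four nor six digits raise an error rather than being silently padded.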