Geo*Data:FAQ
Instructions
The following unsupported and untested scripts can be used with Windows PowerShell to split the new US.txt file found in GeoDAT_202310 into individual state files.
This may be useful for GeoData users who still want the legacy format, with one file per state, instead of the updated all-in-one file (US.txt).
Download GeoData Update (GeoDAT_YYYYMM)
Download and extract the GeoDAT_202310 folder to your local desktop.
Download and Extract GeoData_SplitState_Test.zip
Download and extract GeoData_SplitState_Test.zip below, then save Split_All_States_20231114.ps1 and Split_By_State_Test.ps1 to the GeoDAT_202310 folder.
For example: C:\Users\Roxanne\Desktop\GeoDAT_202310\
Open Windows PowerShell as Administrator
Open "Windows PowerShell" as Administrator.
Change Drive to GeoDAT_YYYYMM file path
Change directory to the GeoDAT_202310 folder on your desktop and press Enter.
For example:
PS C:\WINDOWS\system32> cd "C:\Users\Roxanne\Desktop\GeoDAT_202310\"
PS C:\Users\Roxanne\Desktop\GeoDAT_202310>
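Before running either script, it can help to confirm that the input files are where the scripts expect them. The following is a minimal sketch, not part of the downloaded zip: the function name Test-GeoDataInputs is hypothetical, and the TXT subfolder and file names are assumed from the default paths inside the scripts.

```powershell
# Hypothetical helper (not shipped with GeoData): reports whether the input
# files the split scripts read are present under the given folder.
function Test-GeoDataInputs {
    param(
        [Parameter(Mandatory)] [string] $GeoDataFolder
    )

    # Assumption: US.txt and US.idx live in a TXT subfolder, as in the
    # default paths of the split scripts.
    $txtFolder = Join-Path -Path $GeoDataFolder -ChildPath 'TXT'

    foreach ($name in 'US.txt', 'US.idx') {
        $path = Join-Path -Path $txtFolder -ChildPath $name
        [pscustomobject]@{
            File   = $name
            Path   = $path
            Exists = Test-Path -Path $path -PathType Leaf
        }
    }
}
```

From inside the GeoDAT_202310 folder, Test-GeoDataInputs -GeoDataFolder (Get-Location).Path should list both files with Exists set to True if the download was extracted as described.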
Determine if you want to Split the US.txt file
Decide whether to split the US.txt file into all state files at once or to extract a single desired state file.
All States at One Time
To split the entire US.txt file into individual state files at one time, perform the following steps:
Open Split_All_States_Test.ps1 in Notepad and update the following file paths before running the script in PowerShell:
$txtFilePath
$idxFilePath
$outputFolderPath
Once the file paths are updated, save and close the file.
Type .\Split_All_States_Test.ps1 in Windows PowerShell and press Enter to split all states at once.
NOTE: This writes each state's records to its own file, as in the previous format, and may take a while. Press CTRL+C to stop the script.
File Contents: Split_All_States_Test.ps1
# Split GeoData - "US.txt" file by All States at once using the "US.idx" file.
# PLEASE NOTE:
# This script has NOT been tested and is NOT supported.
# This script may take some time. If using Windows PowerShell, type "CTRL + C" to Stop script from running.
# The following paths must be updated before running:
# $txtFilePath
# $idxFilePath
# $outputFolderPath
# Last Updated: 2023-11-14

# Define the paths to the US.txt file and the US.idx file
$txtFilePath = "C:\Users\Roxanne\Desktop\GeoDAT_202310\TXT\US.txt"
$idxFilePath = "C:\Users\Roxanne\Desktop\GeoDAT_202310\TXT\US.idx"
$outputFolderPath = "C:\Users\Roxanne\Desktop\GeoDAT_202310\TXT"

# Read the index file
$indexLines = Get-Content $idxFilePath

# Create the output folder if it doesn't exist
if (!(Test-Path -Path $outputFolderPath -PathType Container)) {
    New-Item -Path $outputFolderPath -ItemType Directory
}

# Initialize variable to keep track of the cumulative record count
$recordOffset = 0

foreach ($indexLine in $indexLines) {
    # Parse the index line to get the state abbreviation and count
    $stateAbbreviation, $stateFIPS, $count = $indexLine -split ','

    # Convert the count to an integer
    $count = [int]$count

    if ($stateAbbreviation -and $count -gt 0) {
        # Create a new output file with the state abbreviation
        $outputFileName = "${stateAbbreviation}.txt"
        $outputFilePath = Join-Path -Path $outputFolderPath -ChildPath $outputFileName

        # Read and append lines from the US.txt file to the current output file
        Get-Content $txtFilePath | Select-Object -Skip $recordOffset -First $count | Add-Content -Path $outputFilePath

        # Update the cumulative record count for the next state
        $recordOffset += $count

        Write-Host "Splitting complete for $stateAbbreviation."
    }
}

Write-Host "Splitting complete for all states."
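The script above reads US.txt from the top once per state, which is a likely reason it takes a while. The sketch below is an untested alternative, not the supplied script, that reads US.txt once and writes each state's block as it goes. The function and parameter names are hypothetical, and it assumes the same index layout the script parses (abbreviation, FIPS code, record count, separated by commas). It writes UTF-8 output through .NET, which may differ from the encoding Add-Content produces in Windows PowerShell.

```powershell
# Hypothetical single-pass variant of the all-states split.
# Assumption: each US.idx line is "abbreviation,FIPS,count", as parsed above.
function Split-GeoDataByIndex {
    param(
        [Parameter(Mandatory)] [string] $TxtFilePath,
        [Parameter(Mandatory)] [string] $IdxFilePath,
        [Parameter(Mandatory)] [string] $OutputFolderPath
    )

    if (-not (Test-Path -Path $OutputFolderPath -PathType Container)) {
        New-Item -Path $OutputFolderPath -ItemType Directory | Out-Null
    }

    # Open US.txt once; each state continues from where the previous one ended.
    $reader = [System.IO.StreamReader]::new($TxtFilePath)
    try {
        foreach ($indexLine in Get-Content -Path $IdxFilePath) {
            $stateAbbreviation, $stateFIPS, $count = $indexLine -split ','
            $count = [int]$count
            if (-not $stateAbbreviation -or $count -le 0) { continue }

            $outputFilePath = Join-Path -Path $OutputFolderPath -ChildPath "${stateAbbreviation}.txt"

            # $false = overwrite an existing state file instead of appending to it
            $writer = [System.IO.StreamWriter]::new($outputFilePath, $false)
            try {
                for ($i = 0; $i -lt $count; $i++) {
                    $line = $reader.ReadLine()
                    if ($null -eq $line) { break }   # US.txt ended before the index did
                    $writer.WriteLine($line)
                }
            }
            finally {
                $writer.Dispose()
            }

            Write-Host "Splitting complete for $stateAbbreviation."
        }
    }
    finally {
        $reader.Dispose()
    }
}
```

Pass full paths rather than relative ones, because the .NET file classes resolve relative paths against the process working directory, which is not always the current PowerShell location.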
One Desired State File
To split a single state file out of US.txt, perform the following steps:
Open Split_By_State_Test.ps1 in Notepad and update the following values before running the script in PowerShell:
$txtFilePath
$idxFilePath
$outputFolderPath
$desiredState
Once the values are updated, save and close the file.
Type .\Split_By_State_Test.ps1 in Windows PowerShell and press Enter to split out the desired state.
NOTE: The new state file may take a moment to appear in the TXT folder, depending on its size.
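To see which abbreviations are valid for $desiredState, and where each state's records begin in US.txt, the index file can be listed first. This is a minimal sketch under the assumption that each US.idx line holds an abbreviation, a FIPS code, and a record count separated by commas. The function name Get-GeoDataIndex is hypothetical.

```powershell
# Hypothetical helper: lists each index entry with its starting offset in US.txt.
# Assumption: each US.idx line is "abbreviation,FIPS,count".
function Get-GeoDataIndex {
    param(
        [Parameter(Mandatory)] [string] $IdxFilePath
    )

    $offset = 0
    foreach ($indexLine in Get-Content -Path $IdxFilePath) {
        if (-not $indexLine.Trim()) { continue }   # ignore blank lines

        $state, $fips, $count = $indexLine -split ','
        $count = [int]$count

        [pscustomobject]@{
            State  = $state
            FIPS   = $fips
            Count  = $count
            Offset = $offset   # records to skip in US.txt before this state starts
        }

        # The next state starts right after this state's records.
        $offset += $count
    }
}
```

For example, Get-GeoDataIndex -IdxFilePath "C:\Users\Roxanne\Desktop\GeoDAT_202310\TXT\US.idx" | Format-Table prints one row per state. The Offset column is the running total that the split scripts pass to Select-Object -Skip.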
File Contents: Split_By_State_Test.ps1
# Split GeoData - "US.txt" file by Individual State using the "US.idx" file.
# PLEASE NOTE:
# This script has NOT been tested and is NOT supported.
# The following paths must be updated before running:
# $txtFilePath
# $idxFilePath
# $outputFolderPath
# $desiredState
# Last Updated: 2023-11-14

# Define the paths to the US.txt file and the US.idx file
$txtFilePath = "C:\Users\Roxanne\Desktop\GeoDAT_202310\TXT\US.txt"
$idxFilePath = "C:\Users\Roxanne\Desktop\GeoDAT_202310\TXT\US.idx"
$outputFolderPath = "C:\Users\Roxanne\Desktop\GeoDAT_202310\TXT"

# Read the index file
$indexLines = Get-Content $idxFilePath

# Create the output folder if it doesn't exist
if (!(Test-Path -Path $outputFolderPath -PathType Container)) {
    New-Item -Path $outputFolderPath -ItemType Directory
}

# Initialize variables to keep track of line counts
$currentState = $null
$recordOffset = 0

# Define the state you want to parse (e.g., "AL" for Alabama)
$desiredState = "AL"

# Loop through the index lines
foreach ($indexLine in $indexLines) {
    # Parse the index line to get the state abbreviation and count
    $stateAbbreviation, $stateFIPS, $count = $indexLine -split ','

    # Convert the count to an integer
    $count = [int]$count

    # Check if the current state matches the desired state
    if ($stateAbbreviation -eq $desiredState -and $count -gt 0) {
        # Create a new output file with the state abbreviation
        $outputFileName = "${stateAbbreviation}.txt"
        $outputFilePath = Join-Path -Path $outputFolderPath -ChildPath $outputFileName

        # Read and write lines from the US.txt file to the current output file
        Get-Content $txtFilePath | Select-Object -Skip $recordOffset -First $count | Set-Content -Path $outputFilePath

        Write-Host "Splitting complete for $stateAbbreviation."
        break # Exit the loop after processing the desired state
    }

    # Update the record offset for the next state
    $recordOffset += $count
}

Write-Host "Splitting complete for $desiredState."
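Because both scripts are untested, it may be worth checking the result after a split. The sketch below compares the number of lines in each state file that exists with the record count listed in US.idx. It is not part of the supplied scripts: the function name Compare-GeoDataSplit is hypothetical, and it assumes one record per line and the same comma-separated index layout the scripts parse.

```powershell
# Hypothetical check: compares each existing state file's line count
# with the count recorded for that state in US.idx.
function Compare-GeoDataSplit {
    param(
        [Parameter(Mandatory)] [string] $IdxFilePath,
        [Parameter(Mandatory)] [string] $OutputFolderPath
    )

    foreach ($indexLine in Get-Content -Path $IdxFilePath) {
        $stateAbbreviation, $stateFIPS, $count = $indexLine -split ','
        if (-not $stateAbbreviation) { continue }

        $statePath = Join-Path -Path $OutputFolderPath -ChildPath "${stateAbbreviation}.txt"
        if (-not (Test-Path -Path $statePath -PathType Leaf)) { continue }   # not split yet

        $expected = [int]$count
        $actual = @(Get-Content -Path $statePath).Count

        [pscustomobject]@{
            State    = $stateAbbreviation
            Expected = $expected
            Actual   = $actual
            Match    = ($actual -eq $expected)
        }
    }
}
```

A row with Match set to False suggests the state file is incomplete or was written more than once. The all-states script uses Add-Content, which appends, so running it twice against the same output folder would double the line counts.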