Saturday, September 7, 2019

Special Topics in GIS - Module 1.2: Data Quality - Standards

Hello Everyone!

This week's lab is an extension of Spatial Data Quality. For this week's lab, I did my data quality assessment according to the National Standard for Spatial Data Accuracy (NSSDA). According to the NSSDA, certain criteria need to be met when selecting test points. For this lab, I was given two road datasets. One dataset is Albuquerque streets from the city of Albuquerque. The second is Albuquerque streets from StreetMap USA, which is distributed by ESRI. Finally, I was provided several aerial satellite images of the study area portion of Albuquerque divided into quadrangles. When comparing the road datasets to the aerial imagery, it was evident that, on the surface, the two datasets differed noticeably in positional accuracy.

For my positional accuracy analysis, I chose 20 randomly selected intersection points within one of the provided aerial image quadrangles of Albuquerque. Proper intersections for analysis were cross (+) intersections and right-angle (90-degree) 'T' intersections. Per the NSSDA standards, my test points had a distribution of at least 20 percent of points in each quadrant of my aerial quadrangle and were spaced at least 10 percent of the diagonal length of the quadrangle apart (at least 370 feet). To select these points, I created intersection points for both road datasets using a geoprocessing tool within ArcGIS Pro. I then selected the random test points at the appropriate type of intersection, making sure to select the same intersection in both road datasets and following the aforementioned NSSDA distribution/spacing rules. My test points can be seen below for one of the road datasets:


Once my test points had been selected, I digitized reference points so I could compare positional accuracy based on the aerial satellite imagery location of each intersection. Once the test points and reference points were created, the test points were assigned Point IDs matching the reference points so their coordinate values could easily be analyzed. After assigning XY coordinate values to both sets of test points and my reference points, I exported them as DBF files and then plugged them into a positional accuracy spreadsheet that follows the NSSDA methodology and calculates the positional accuracy at the 95% confidence level. Essentially, the table compares the XY position of each test point to its matching reference point (hence the importance of matching Point IDs for the test and reference points). The sheet calculates the following values: Sum, Mean, Root Mean Square Error (RMSE), and the National Standard for Spatial Data Accuracy statistic, which multiplies the RMSE by 1.7308 to yield the horizontal positional accuracy at the 95% confidence level. My formal accuracy statements, which follow the NSSDA guidelines, can be found below:

ABQ Streets Test Points:
Horizontal Positional Accuracy: Tested 14.106 feet horizontal accuracy at 95% confidence level.
Vertical Positional Accuracy: Not applicable

Street Map USA Test Points:
Horizontal Positional Accuracy: Tested 258.682 feet horizontal accuracy at 95% confidence level.
Vertical Positional Accuracy: Not applicable
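
For anyone curious about what the accuracy spreadsheet is doing behind the scenes, here is a minimal Python sketch of the NSSDA calculation, assuming the test and reference coordinates have already been exported and paired by Point ID (the coordinate values below are made up purely for illustration):

import math

# Hypothetical paired coordinates (test vs. reference), matched by Point ID
test_pts = [(1537201.2, 1473455.8), (1538990.4, 1471822.1)]   # (x, y) in feet
ref_pts  = [(1537195.6, 1473449.3), (1538981.7, 1471830.9)]

# Squared horizontal error for each test/reference pair
sq_errors = [(tx - rx) ** 2 + (ty - ry) ** 2
             for (tx, ty), (rx, ry) in zip(test_pts, ref_pts)]

sum_sq  = sum(sq_errors)                # Sum of squared errors
mean_sq = sum_sq / len(sq_errors)       # Mean squared error
rmse    = math.sqrt(mean_sq)            # Root Mean Square Error
nssda   = rmse * 1.7308                 # NSSDA horizontal accuracy at 95% confidence

print("Tested {:.3f} feet horizontal accuracy at 95% confidence level.".format(nssda))

The only NSSDA-specific step is the final multiplication of the RMSE by 1.7308, which converts the RMSE into the horizontal accuracy value reported at the 95% confidence level.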

I genuinely enjoyed working through this week's lab and look forward to sharing more special topics with you. As always... ~Map On!

Tuesday, September 3, 2019

Special Topics in GIS - Module 1.1: Calculating Metrics for Spatial Data Quality

Hello Everyone! 

It's hard to believe that we are already going full speed in the fall semester of 2019. Soon 2020 will be upon us and I will have completed my first year of graduate school here at UWF. Last semester, I focused primarily on GIS Programming and Spatial Data Management using SQL. This semester, I'll be focusing on both special and advanced topics in GIS as well as Remote Sensing and Aerial Imagery. Let's jump right in for this phenomenal first week!

For this week's lab in Special Topics in GIS, I was tasked with calculating metrics for spatial data quality. In this lab, I analyzed spatial data quality for a set of waypoints gathered by a handheld GPS unit in two separate ways. The first was via a map with buffer zones (below) showing three precision percentiles. The second, which will not be discussed in this post, is a root-mean-square error analysis and a cumulative distribution function graph.

Before I delve into my findings, accuracy and precision need to be explained in the realm of GIS. For the purpose of this post, I am assessing horizontal accuracy and precision. To derive the horizontal precision, which is the closeness of the recorded points to one another, I calculated an average projected waypoint. I then created three buffers around it, each representing a precision percentile that contains the corresponding share of the points. The buffers I created were at 50%, 68%, and 95%. For horizontal accuracy, which is how close the measured values are to the actual (reference) point, I measured the distance from the average projected waypoint used in my precision calculation to the actual reference point.
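
As a rough sketch of those two calculations, the snippet below computes an average waypoint, an approximate 68th-percentile precision distance, and an accuracy distance, assuming the waypoints and reference point are already projected to a metric coordinate system (the coordinates are made up):

import math

# Hypothetical projected GPS waypoints (easting, northing) in meters
waypoints = [(476001.2, 3358002.5), (476003.8, 3357998.1),
             (475999.6, 3358004.9), (476005.1, 3358001.0)]
reference = (476000.0, 3358000.0)   # surveyed "true" location (also hypothetical)

# Average waypoint = mean easting / mean northing
avg_x = sum(x for x, y in waypoints) / len(waypoints)
avg_y = sum(y for x, y in waypoints) / len(waypoints)

# Horizontal precision: distance of each waypoint from the average waypoint,
# then take (approximately) the 68th percentile of those distances
dists = sorted(math.hypot(x - avg_x, y - avg_y) for x, y in waypoints)
idx_68 = max(0, int(round(0.68 * len(dists))) - 1)
precision_68 = dists[idx_68]

# Horizontal accuracy: distance from the average waypoint to the reference point
accuracy = math.hypot(avg_x - reference[0], avg_y - reference[1])

print("68% precision: {:.2f} m, accuracy: {:.2f} m".format(precision_68, accuracy))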

Now that my methods of determining horizontal precision and accuracy have been explained, I would like to share my results with you.


For horizontal precision, I got a value of 4.5 meters when measuring precision at the 68th percentile. Judged against that 68th-percentile threshold, these results are reasonably precise. For my horizontal accuracy, the distance from the average waypoint (blue) to the reference (actual) point was 3.24 meters. Compared to the precision value, these results have fairly high accuracy. Overall, after assessing the horizontal accuracy and precision, it can be observed that the GPS waypoints collected in this test are more accurate than precise. Judging accuracy and precision is, of course, relative to the application. If these measurements had been taken by a surveying company, the resulting precision and accuracy values would be considered a failure by survey standards. However, if these waypoints were referencing an object such as the location of a fire hydrant or an electrical unit box, they would be much more suitable. Finally, in terms of bias, many factors impact the results. How good is the GPS unit? Are there any satellite connection interference variables such as buildings or weather? Is the user holding the unit consistently in one position? These can all play a role in how data is collected.

I look forward to sharing my future work with you all and as always...

~Map On!

Friday, July 5, 2019

GIS Programming - Module 7 Lab

Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)] on win32

>>> print ("Hello Everyone!") 

Hello Everyone!

I cannot believe it, but this was the final lab of my GIS Programming class! For this week's lab, I worked completely with raster image data, which is hands down my favorite type of data in GIS (sorry, vector...). This week I was tasked with creating a raster dataset from two existing rasters. The final output raster needed to meet the following conditions and include the following features:

1. Reclassified forest landcover values (41, 42, and 43) so that only forested land is shown.
2. Highlight elevation areas with a slope value greater than 5 degrees and less than 20 degrees.
3. Highlight elevation areas with an aspect-value greater than 150 degrees and less than 270 degrees.

Once the parameters of my script had been defined, I needed to lay out the overall structure of how the script would be written. This is how I designed my script:

Start
>>>import necessary modules
>>>import arcpy.sa (spatial analyst module)
>>>set outputs and overwrite parameter to true (print success message)
>>>conditional if statement that won't run if spatial analyst extension is not enabled
     >>>reclassify values of 41, 42, and 43 to a value of '1'
     >>>set good slope value condition
     >>>set good aspect value condition
     >>>combine 3 rasters
     >>>save final combined raster (print success message)
>>>Else portion of statement that prints a message saying Spatial Analyst isn't enabled.
Stop
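
The graded script itself isn't posted here, but a condensed arcpy sketch of that outline might look something like the following (the workspace path and raster file names are placeholders I made up):

import arcpy
from arcpy.sa import Reclassify, RemapValue, Slope, Aspect, Con

arcpy.env.workspace = r"C:\GISProgramming\Module7\Data"   # placeholder path
arcpy.env.overwriteOutput = True

if arcpy.CheckExtension("Spatial") == "Available":
    arcpy.CheckOutExtension("Spatial")

    # Reclassify forest landcover classes 41, 42, 43 to a value of 1 (everything else to NoData)
    forest = Reclassify("landcover.tif", "Value",
                        RemapValue([[41, 1], [42, 1], [43, 1]]), "NODATA")

    # Derive slope and aspect from the elevation raster, then flag the "good" ranges
    slope = Slope("elevation.tif", "DEGREE")
    aspect = Aspect("elevation.tif")
    good_slope = Con((slope > 5) & (slope < 20), 1)
    good_aspect = Con((aspect > 150) & (aspect < 270), 1)

    # Combine the three rasters; a cell must satisfy all conditions to remain
    final = forest * good_slope * good_aspect
    final.save("final_suitability.tif")
    print("Final raster saved successfully.")
else:
    print("Spatial Analyst extension is not enabled.")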


As you can see from the results above, my script ran correctly, with print messages spread throughout to give the user progress updates as the script runs. If the Spatial Analyst extension had not been enabled, the else statement message would have printed instead of the script running through to completion. The final raster also turned out successfully. The areas in red are those that are suitable according to the parameters I was given: all are forested areas that have a slope between 5 and 20 degrees and an aspect between 150 and 270 degrees. I hope you have all enjoyed this journey with me through this course; I have learned so much! Thank you for taking the time to keep updated on my pursuits, and until next time...

~Map On!
 


Wednesday, June 26, 2019

GIS Programming - Module 6 Lab

Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)] on win32

>>> print ("Hello Everyone!") 

Hello Everyone!


It's so hard to believe that I just completed my second-to-last GIS Programming module lab! Time sure has flown by in this course. This week's lab, in addition to some other life events, really had me stumped in the beginning, but after the dust settled (and about six cups of coffee), the results proved successful! For this week's lab and lecture, I learned all about geometries. When looking at the geometries of features in GIS, there is a hierarchy that can really help your understanding. The first and highest level is a feature; this is essentially each row in the attribute table. In this week's lab, I worked with a rivers shapefile from a Hawaii dataset, so for this example each feature would be a stream/river in its entirety. The next level of the hierarchy is an array, which is the collection of points/vertices that make up a feature. For example, a specific feature might have an array of 15 vertices. Finally, the last level of the hierarchy is the individual point/vertex. These are usually expressed in the (X, Y) vertex format. Essentially, the structure is as follows:

    Feature > Array > Vertex

For this week's lab, I was tasked with working with the aforementioned geometries. I was given a shapefile containing river features from Hawaii and had to write the geometries of each feature to a newly created TXT file. For my text file, I needed individual lines that provided the following information: Feature ID, Vertex ID, X Point, Y Point, and Feature Name. In total, there were 25 features in my data and 247 vertices that I had to list with their respective X and Y points and feature names. Before I get to the results, I would like to share the basis of my code so you can understand how I got my results.

~To Do:
1.     Set my environment parameters
2.     Create a new TXT file that’s writable
3.     Create a search cursor that calls on OID (Object ID), SHAPE, and NAME
4.     3 nested for loops:
      a.      First: Iterates Rows
      b.      Second: Iterates Array
      c.      Third: Iterates Vertices
5.     Print and Write my results in the script and new TXT file
      a.      Feature #, Vertex #, X Point, Y Point, Feature Name
6.     Delete row and cursor variables
7.     Close file access

Start
>>>import statements
>>>environment statements (workspace and overwrite)

>>>define feature class
>>>open new writable file
>>>define the cursor
>>>for loop to retrieve rows
>>>for loop to retrieve arrays
>>>for loop to retrieve vertices
>>>print and write to the newly created TXT file
>>>delete row and cursor variables
>>>close file
End
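
A stripped-down arcpy sketch of that outline could look like the following (the shapefile name, the NAME field, the output file path, and the workspace path are assumptions on my part):

import arcpy

arcpy.env.workspace = r"C:\GISProgramming\Module6\Data"   # placeholder path
arcpy.env.overwriteOutput = True

fc = "rivers.shp"                                                       # assumed shapefile name
output = open(r"C:\GISProgramming\Module6\Results\rivers_geometry.txt", "w")  # new writable TXT file

# OID@ returns the Object ID, SHAPE@ returns the full geometry object, NAME is the river name field
with arcpy.da.SearchCursor(fc, ["OID@", "SHAPE@", "NAME"]) as cursor:
    for row in cursor:                    # first loop: each feature (row)
        vertex_id = 0
        for part in row[1]:               # second loop: each array (part) in the geometry
            for point in part:            # third loop: each vertex in the array
                vertex_id += 1
                line = "{} {} {} {} {}\n".format(row[0], vertex_id, point.X, point.Y, row[2])
                print(line, end="")
                output.write(line)

output.close()

Using arcpy.da.SearchCursor inside a with block takes care of the cursor cleanup that step 6 of my to-do list handles explicitly with delete statements.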

My results turned out better than expected (below):


As you can see, new lines were written to my TXT file starting with Feature 0 and then iterating through each vertex in the array, providing the X point, Y point, and the name. Once Feature 0's (Honokahua Stream) vertex array had been iterated through, Feature 1 (Honokowai Stream) was iterated through next, and so on until all 247 vertices were complete for the 25 total features. Overall, implementing the nested for loops in my script was the toughest part and caused the biggest hang-up for me. The final module will be one of my favorites in this class, as it pertains to raster data!



Wednesday, June 19, 2019

GIS Programming - Module 5 Lab

Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)] on win32

>>> print ("Hello Everyone!") 

Hello Everyone!

For this week's lab in GIS Programming, we have been working with data exploration and manipulation using Python in both ArcGIS Pro and Spyder. For this week's lab, I was assigned to create a multi-part script that carried out several data tasks. First, I was to create a new file geodatabase. Then, I had to copy the original data into the new file geodatabase. Once the data had been copied over, I was tasked with creating a search cursor using a SQL 'where' clause that returned and printed three pieces of information (the name, feature type, and population in the year 2000) for the cities with a feature type of 'County Seat'. Finally, I took the information from my search cursor and constructed a Python dictionary that, when printed, shows the cities that met the SQL query criteria in 'City' : Population format. To break things down further, I have created a pseudocode block to conceptualize my process.

Start
>>> Create new file geodatabase (fGDB) named mod5.gdb for data storage. Print result messages.
>>> Create a feature class list and copy features from data to new fGDB (use basename property to remove the .shp extension for compatibility). Print result messages.
>>> Create a search cursor with a SQL 'where' clause and for loop to select cities that have a feature type of 'county seat'. Print results in the form of clustered messages (City name, Feature Type, and Population from 2000).
>>> Create a new dictionary with a for loop that when printed returns a compiled dictionary list of all the cities and their populations that meet the search cursor criteria (ex. 'Townsville : 5000'). Print result messages and newly created dictionary.
Stop
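
The full lab script isn't reproduced here, but a condensed arcpy sketch of that pseudocode might look something like this (the folder paths, the 'cities' feature class, and the field names are assumptions on my part):

import arcpy

arcpy.env.workspace = r"C:\GISProgramming\Module5\Data"   # placeholder path
arcpy.env.overwriteOutput = True

# Create the new file geodatabase
out_folder = r"C:\GISProgramming\Module5\Results"         # placeholder path
arcpy.CreateFileGDB_management(out_folder, "mod5.gdb")
out_gdb = out_folder + "\\mod5.gdb"

# Copy every shapefile into the fGDB, dropping the .shp extension via the basename property
for fc in arcpy.ListFeatureClasses():
    desc = arcpy.Describe(fc)
    arcpy.CopyFeatures_management(fc, out_gdb + "\\" + desc.basename)

# Search cursor with a SQL 'where' clause for county seats, building the dictionary as we go
county_seats = {}
fields = ["NAME", "FEATURE", "POP_2000"]                  # assumed field names
where = "FEATURE = 'County Seat'"
with arcpy.da.SearchCursor(out_gdb + "\\cities", fields, where) as cursor:
    for row in cursor:
        print("City: {}, Type: {}, Population (2000): {}".format(row[0], row[1], row[2]))
        county_seats[row[0]] = row[2]

print(county_seats)

Populating the dictionary inside the same cursor loop that prints the rows means the cursor only needs to be read once.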

To see my code in action, I have clustered the results together and will explain each section below. I decided that separating each running portion with a new line of '*' would make the results easier to read.


The picture above shows the results of my script being run. The first section of the code successfully creates a new file geodatabase (fGDB) and lets the user know that the process started and completed successfully, with the start and finish times. The second portion of the code copies the eight shapefiles over to the newly created fGDB. For this portion of the code, it was essential that I used the basename property when copying the features over, since the .shp extension is not valid inside a file geodatabase. By using the basename property, 'airports.shp' becomes 'airports' and can be copied to the new fGDB as a feature class. As you can see, the script prints start and finish messages after each feature is copied and lets the user know the task has been completed successfully. The third portion of the code is where I implement my search cursor. As you can see, a message displays showing that the search cursor process has started. By incorporating a SQL 'where' clause, the search cursor returned only the name, feature type, and population for the year 2000 of the cities with a feature type of 'County Seat', as shown in the results. Once again, a message prints informing the user the task has been completed successfully. Finally, the script informs the user that a new dictionary of county seats is being created and, upon successful completion, prints the runtime message and the completed dictionary in the {'City' : Population} format, with city names as the dictionary keys and population numbers as the dictionary values. At the end of the script, a message prints informing the user that the script has finished running in its entirety.

Overall, I really enjoyed this week's lab and am so happy with the progress I have been making. One hang-up this lab gave me was with the creation of my dictionary. I found that whenever I tried to use the required loop to populate my empty dictionary, it would only ever print the empty dictionary. To solve this issue, I called my search cursor a second time, and the dictionary populated perfectly. Thank you for tuning in this week, and I look forward to sharing my future work with you!

Wednesday, June 12, 2019

GIS Programming - Module 4 Lab

Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)] on win32

>>> print ("Hello Everyone!") 

Hello Everyone!

This week's lab focuses on using geoprocessing tools through two streamlined methods: Model Builder in ArcGIS Pro and Python scripting through Spyder. For this lab, I was tasked with creating two projects. The first project was to determine suitable farmland in a study area using Model Builder. To accomplish this task, I first clipped the soil data to my basin (study area). I then selected all soil features that were not suitable for farming based on an attribute value. Finally, I used the Erase tool to remove the features in my soil data that were not suitable for farming. It is worth noting that Model Builder lets users string together multiple tools and bypass running each tool individually. My results (below) show that the soil features not suitable for farming were removed.


Initial Clipping

Final Results With Unusable Soils Removed

The second part of the lab required me to run various geoprocessing tools in the Python environment. For this project, I was tasked with three primary objectives. I first had to add XY coordinate data to a shapefile containing hospital data. I then had to add a 1,000-meter buffer around the hospital features. Finally, I had to dissolve the buffers in the new hospital shapefile to create one feature. This portion of my project was completed entirely in Spyder. The results below show that each tool in my script ran successfully, and with the addition of the GetMessages() function, I was able to print the run-time messages after each portion was run.
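
A bare-bones arcpy sketch of those three steps might look like this (the workspace path and file names are placeholders, not the lab's actual data):

import arcpy

arcpy.env.workspace = r"C:\GISProgramming\Module4\Data"   # placeholder path
arcpy.env.overwriteOutput = True

# Add XY coordinate fields to the hospitals shapefile
arcpy.AddXY_management("hospitals.shp")
print(arcpy.GetMessages())

# Buffer the hospital points by 1000 meters
arcpy.Buffer_analysis("hospitals.shp", "hospitals_buffer.shp", "1000 Meters")
print(arcpy.GetMessages())

# Dissolve the buffers into a single feature
arcpy.Dissolve_management("hospitals_buffer.shp", "hospitals_buffer_dissolved.shp")
print(arcpy.GetMessages())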


Wednesday, June 5, 2019

GIS Programming - Module 3 Lab

Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)] on win32

>>> print ("Hello Everyone!") 

Hello Everyone!

This week's lab was all about debugging! Debugging is probably one of the most important aspects of working in Python, as it allows you to catch the errors in your code that are causing it to fail. For this week's lab, I was given three template scripts that contained errors. Two of these scripts required me to fix the errors within them, and the third required me to write a try-except statement that bypassed the error in the first part of the script and then ran the second part. In Spyder, the debugging tool is invaluable. Being able to step through your code line by line and find errors is essential to ensure you're catching and proofing every line of code. For the first two scripting examples, this is the flow methodology I used:

Start
>>>Look over the code to find individual errors

->>>Run the code and find any errors missed individually
or
->>>Run the debugging tool line by line and fix errors.

>>>Run the successfully fixed code and get desired results
Stop

Once the debugging process had been completed, the scripts ran successfully with these results:



The first two examples contained multiple case errors and attribute name errors that caused the interpreter to get hung up, as Python is very case sensitive.

For the third script, I had to write a try-except statement that would run a part of code even with an apparent error/exception. For this portion my methodology was as follows:

Start
>>>Look over the code to find errors within Part A
>>>Run the code using the debugging tool to confirm where the error occurs
>>>Write a try-except statement that will print the error exception message without hang up
>>>Ensure Part B of script runs
>>> Run both parts, getting the exception message from Part A and a successful Part B.
Stop
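
The lab's actual Part A error isn't shown here, but the general try-except pattern is simple; in the sketch below, the division by zero is just a stand-in for whatever raises the exception:

# Part A: wrap the error-prone code so the script keeps running
try:
    result = 1 / 0          # stand-in for the code that raises the error in Part A
except Exception as e:
    print("Part A raised an error: {}".format(e))

# Part B: runs regardless of what happened in Part A
print("Part B ran successfully.")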

As you can see in the results below, the script ran as intended, printing the Exception/Error message of Part A without hanging up the script and successfully running Part B.