Wednesday, September 18, 2019

Special Topics in GIS - Module 1.3: Data Quality - Assessment

Hello Everyone!

For this week's lab, we focused on the assessment of data. The lab that I will be sharing with you this week was all about comparing the completeness of two different road shapefiles in the state of Oregon. The first level of assessment compared the overall completeness of each road dataset using the total length of its roads. Of the two road datasets I was given (TIGER Lines and Jackson County, OR Centerlines), the TIGER Lines are more complete by over 500 kilometers.

Once the overall completeness assessment was done, I was tasked with assessing completeness within a grid. This task takes a grid of 297 cells and compares the total length of roads within each cell. To find these values, I used a tool in ArcGIS Pro called 'Summarize Within'. This tool summarizes features that fall within other features. In this case, I was looking at the total length of roads from both datasets within each grid cell.

Before I could run my analysis, I needed to clean up the data a little. The TIGER Lines were in a different projection system than the rest of my data, so I reprojected them to match. I also needed to clip my road features to the extent of my grid so that no roads would fall outside my study area. I then ran Summarize Within to get the total length of road segments in each cell. Finally, I needed to find the percentage of completion for each cell. To achieve this, I used a standard percent difference calculation, which gives a positive or a negative percentage depending on which dataset has more road length in the cell. Below is the map of my data without the roads to avoid excess map clutter:


As you can see on my map, the grid cells have both positive and negative percentage values. Cells with positive percentages, shown along the blue portion of the color ramp, are cells where the total length of the County Centerline roads is greater than that of the TIGER Lines (County more complete than TIGER). Cells with negative percentages, shown along the red portion, are cells where the total length of the TIGER Lines is greater than that of the County Centerline roads (TIGER more complete than County). It should also be noted that two cells are special cases. First, there is one cell that contains no roads from either dataset; it is marked gray with a 'No Data' attribute. Second, there is one cell (darkest red) where the TIGER Lines are at 100% completion because there is no County road data within that cell.
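For anyone curious how the per-cell percentages and the two special cases could be computed, here is a minimal Python sketch. It is not the exact calculation from the lab: I am assuming a normalized difference, (County - TIGER) / (County + TIGER) x 100, only because it reproduces the behavior described above (positive where County is longer, negative where TIGER is longer, -100% where there is no County data, and no value where a cell has no roads at all). The function name and the sample lengths are hypothetical.

```python
from typing import Optional


def percent_difference(county_km: float, tiger_km: float) -> Optional[float]:
    """Signed percent difference between two road lengths in one grid cell.

    Assumed formula (not confirmed by the lab): (county - tiger) / (county + tiger) * 100.
    Returns None when the cell has no roads in either dataset ('No Data').
    """
    total = county_km + tiger_km
    if total == 0:
        return None  # no roads from either dataset in this cell
    return (county_km - tiger_km) / total * 100.0


# Hypothetical cells: (County Centerline length, TIGER length) in kilometers
cells = {
    "A": (15.0, 5.0),  # County longer, so the value is positive (blue)
    "B": (0.0, 5.0),   # no County roads, so the value is -100 (darkest red)
    "C": (0.0, 0.0),   # no roads at all, so the cell is 'No Data' (gray)
}

for cell_id, (county_km, tiger_km) in cells.items():
    value = percent_difference(county_km, tiger_km)
    label = "No Data" if value is None else f"{value:+.1f}%"
    print(cell_id, label)
```

In practice the two length inputs would come from the Summarize Within output for each dataset, joined on the grid cell ID.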
