GIS – How to Check a Shapefile’s Data in C#

Recently I needed to learn how to read data from ESRI shapefiles in C#. The goal was to read in a shapefile, read in a list of points, and see if any of the points fell within any of the shapes. I’ve never done anything like this and it felt like every piece of information was hard fought. The resulting 175 lines of code took me about 20 hours to write. Considering that, hopefully I can help someone else out. I felt like the result was extremely cool and plenty useful.

Globe

Step 1 – Learning about ESRI shapefiles:
If you are interested, here’s the wikipedia link: http://en.wikipedia.org/wiki/Shapefile

Essentially a shapefile is not composed of a single file, but is a collection of files including a list of shapes, a db2 database, and potentially a projection file that explains how to convert latitude and longitude to the coordinate system used in the shapefile.

Typically the coordinate system is not in latitude and longitude, but universal transverse mercator or UTM. More insight into the coordinate system does not fall within what I want to cover in this post.

Step 2 – Get a shapefile:
For this example, I’m using a shapefile for New York City: NYC Shapefiles

Specifically I used the shapefile of the boroughs. In the end this would allow me to extend the functionality of the program to report which borough the point was in instead of just knowing that it fell within New York City.

Step 3 – Download C# libraries:
This is a complicated area, and it makes more sense to use prebuilt libraries instead of spending a lot of time reinventing the wheel on this one.

Here are the libraries I used.
DotSpatial – This library parses shapefiles and provides controls for displaying them overlaid on a map.
Proj.Net – This library provides easy projection tools to project coordinates from one coordinate system to another, such as lat/long to UTM.

Step 4 – Read in Point Data
We need to be able to read in point data in order to check points against the shapefile.

I used a CSV file where each row contained a study ID, latitude, and longitude.

Here is the class I used to store the point data:

public class StudyInfo
{
    public int StudyID { get; set; }
    public double Latitude { get; set; }
    public double Longitude { get; set; }

    public StudyInfo(int studyID, double latitude, double longitude)
    {
        StudyID = studyID;
        Latitude = latitude;
        Longitude = longitude;
    }

    public static StudyInfo CreateFromStrings(String[] tokens)
    {
        int studyID = Int32.Parse(tokens[0]);
        double latitude = Double.Parse(tokens[1]);
        double longitude = Double.Parse(tokens[2]);
        return new StudyInfo(studyID, latitude, longitude);
    }
}

There is a constructor allowing the developer to manually set the point data, but since I was reading them from a CSV file (exported from a DB query) I also created a method to parse a string array containing study ID, lat, and long, and return a new StudyInfo object.

The CSV reading was relatively simple:

var studyData = new List<StudyInfo>();

using (StreamReader sr = new StreamReader(csvFile))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        var tokens = line.Split(',');
        var study = StudyInfo.CreateFromStrings(tokens);
        if (study.Latitude > 40 && study.Latitude < 42
             && study.Longitude > -75 && study.Longitude < -70)
                studyData.Add(StudyInfo.CreateFromStrings(tokens));
    }
}

The check here is a check against a large bounding box for NYC. The input file contained approximately 60,000 records. If the later comparison was done against all of these points then the program slowed down significantly. This is a basic check which was able to cut the actual search down to approximately 5,000 records.

After this code is run, all of the points are contained within studyData and are ready to be tested.

Step 5 – Convert points to UTM
Now the points need to be converted to UTM. This is done by reading the Well Known Text (WKT) from the projection file in order to figure out how to properly transform the coordinates.

var wktstring = "PROJCS[\"NAD_1983_StatePlane_New_York_Long_Island_FIPS_3104_Feet\",GEOGCS[\"GCS_North_American_1983\",DATUM[\"D_North_American_1983\",SPHEROID[\"GRS_1980\",6378137.0,298.257222101]],PRIMEM[\"Greenwich\",0.0],UNIT[\"Degree\",0.0174532925199433]],PROJECTION[\"Lambert_Conformal_Conic\"],PARAMETER[\"False_Easting\",984250.0],PARAMETER[\"False_Northing\",0.0],PARAMETER[\"Central_Meridian\",-74.0],PARAMETER[\"Standard_Parallel_1\",40.66666666666666],PARAMETER[\"Standard_Parallel_2\",41.03333333333333],PARAMETER[\"Latitude_Of_Origin\",40.16666666666666],UNIT[\"Foot_US\",0.3048006096012192]]";

var latlongwkt = "GEOGCS [\"Longitude / Latitude (NAD 83)\",DATUM [\"NAD 83\",SPHEROID [\"GRS 80\",6378137,298.257222101]],PRIMEM [\"Greenwich\",0.000000],UNIT [\"Decimal Degree\",0.01745329251994330]]";

ICoordinateSystem nycCS =
    CoordinateSystemWktReader.Parse(wktstring) as ICoordinateSystem;

ICoordinateSystem baseCS =
    CoordinateSystemWktReader.Parse(latlongwkt) as ICoordinateSystem;

var ctFactory = new CoordinateTransformationFactory();
var transformer = ctFactory.CreateFromCoordinateSystems(baseCS, nycCS);

Rather than read in the projection file, I copied the text out of it and placed it directly in the code. This means there is no need to check the file, and there is no reliance on the file existing.

The second string is a coordinate system for latitude and longitude based on a standard called GRS 80. This coordinate system is listed in the WKT provided by the projection file and can be found in the wktstring field.

ICoordinateSystem is provided by Proj.Net and stores a coordinate system. We are parsing the WKT in order to create the two systems. This is significantly easier than creating the programmatically without using WKT.

After that we need to create a transformer — again provided by Proj.Net. This transformer transforms from the first provided coordinate system to the second — in this case baseCS to nycCS. This is because it makes more sense to transform individual lat/long points to UTM than it does to convert shapes, which contain upwards of 1000 points, into lat/long coordinates.

Finally, we iterate through our list of studies, perform the transformation, create a new Coordinate (provided by DotSpatial), and test it with a point in polygon algorithm. If the point is within the shapes provided by the shapefile add it to the list of NYC studies.

nycBoroughs is the shapefile object provided by DotSpatial. nycBoroughs.Features is a collection of shapes within the shapefile. In this case it contains shapes for each of NYC’s five boroughs.

var nycStudies = new List<StudyInfo>();

foreach (var study in studyData)
{
    double[] fromPoint = { study.Longitude, study.Latitude };
    double[] toPoint = transformer.MathTransform.Transform(fromPoint);
    var studyLocation = new Coordinate(toPoint[0], toPoint[1]);
    if (IsPointInShape(studyLocation, nycBoroughs.Features))
        nycStudies.Add(study);
}

This results in a list of all the studies that took place within NYC. Mission accomplished.

Odd Bits
You might be thinking, “Wait! How did you open and parse the shapefile?” DotSpatial provides that functionality with a single line of code.

var nycBoroughs = Shapefile.OpenFile(@"C:\NYC Shape\nybb_13a\nybb.shp");

You also might want more details on the check to see if the point is within the shape.

private static bool IsPointInShape(Coordinate point, IFeatureList features)
{
    foreach (var feature in features)
    {
        if (!PointInBoundingBox(point, feature.Coordinates))
            continue;
        if (PointInPolygon(point, feature.Coordinates.ToArray()))
            return true;
    }

    return false;
}

We iterate through all of the shapes provided by the shapefile, check the provided point against a smaller bounding box, and then perform the point in polygon test if it lies within the bounding box.

Bounding Box Check:

private static bool PointInBoundingBox(Coordinate point, IList<Coordinate> vertices)
{
    double minX = vertices.Min(v => v.X);
    double maxX = vertices.Max(v => v.X);
    double minY = vertices.Min(v => v.Y);
    double maxY = vertices.Max(v => v.Y);

    double x = point.X;
    double y = point.Y;

    bool outsideBox = x > maxX || x < minX || y > maxY || y < minY;

    return !outsideBox;
}

And the point in polygon check — which I took from a stackoverflow answer.

Point in Poly Check:

private static bool PointInPolygon(Coordinate p, Coordinate[] poly)
{
    Point p1, p2;

    bool inside = false;

    if (poly.Length < 3)
    {
        return inside;
    }

    Point oldPoint = new Point(
    poly[poly.Length - 1].X, poly[poly.Length - 1].Y);

    for (int i = 0; i < poly.Length; i++)
    {
        Point newPoint = new Point(poly[i].X, poly[i].Y);

        if (newPoint.X > oldPoint.X)
        {
            p1 = oldPoint;
            p2 = newPoint;
        }
        else
        {
            p1 = newPoint;
            p2 = oldPoint;
        }

        if ((newPoint.X < p.X) == (p.X <= oldPoint.X)
        && ((long)p.Y - (long)p1.Y) * (long)(p2.X - p1.X)
         < ((long)p2.Y - (long)p1.Y) * (long)(p.X - p1.X))
        {
            inside = !inside;
        }

        oldPoint = newPoint;
    }

    return inside;
}

Full source code here: GeoFence

Summary:
This was a really fun problem to work on. I hope someone is helped by this. I know that finding a resource like this before tackling the problem myself would have saved me a lot of time. Most information online is regarding overlaying a shapefile on a map, but often there’s a lot that can be done with the data without needing a map.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s