Supervised vs. Unsupervised Classification of HANA By: Rachel Bratek


Post on 07-Aug-2020



Background

High Acre Nature Area (HANA) is located in Fairport, NY. The land holds a series of created, natural, and restored wetlands that are owned by Waste Management of New England and New York, LLC. Waste Management obtained this land to expand their landfill, but once they determined they did not need the extra space, New York State required them to mitigate the wetlands they had filled in; HANA became that mitigation area. The created wetlands are managed according to the requirements of the Clean Water Act of 1972. Their progress is checked periodically to see whether the wetlands’ ecosystem services match the quality of a natural wetland. The work of the RIT Capstone class is to help the volunteers manage these areas so that they meet those requirements, and to create a guide on how to do so.

This project is important because Waste Management and the volunteers of HANA do not know how many of the property's 378 acres are wetland. They have also not mapped the other types of land cover within HANA. Past capstone students left this kind of data incomplete, so there is currently no record of how large each land cover type within HANA is.

My portion of the project is to look at how a supervised classification varies from an unsupervised classification and then determine which process is better for the task. It is important to see which approach produces a more accurate map of the land cover classes in the area. Waste Management and the HANA volunteers can use the results to know how much land they are managing and which types of land cover they most need knowledge of for the upkeep of the area. HANA currently has a problem with invasive plant species that are endangering the health of the wetlands on the property. This type of data can be used to determine how large the areas of concern are. It can give a better understanding of how large an area will need treatment so that they can budget for the treatment processes.

Methods

Before any work could be done in ENVI, the images and data had to be obtained. The imagery was obtained from the NYS GIS Clearinghouse as NYS orthoimagery. The data spans Monroe and Wayne Counties, and the images came as separate files covering different segments of the area. The files were loaded into ENVI, and the mosaic feature was used to fuse them into one larger image of the whole area of interest. The first method of classification performed was a supervised classification within ENVI.

The first step of the supervised classification was to determine what land cover classes exist within HANA and to assign a number to each. Six classes were used: open water, forested, upland, impervious surface, wetland, and baseball field/open field. The ROI tool was used to enter the class number for each class type, and training sites were drawn for each. The training sites consisted of at least ten polygons per class to ensure that there was enough variation for the program to distinguish between the classes. The only exception was the baseball field/open field class, which had only two training sites because it covers such a small area within HANA. Next, a maximum likelihood classification was run using the ROIs created for the classes. The resulting image was then inspected for visual errors. An accuracy assessment of the maximum likelihood classification was done to find the percentage accuracy of the created map. The hope was to create an image with more than 65% accuracy.
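To make the idea behind a maximum likelihood classification concrete, here is a minimal sketch in Python with NumPy. It is not ENVI's implementation. It assumes each class follows a multivariate Gaussian distribution with equal prior probability, and the function names, the two classes, and the synthetic three-band pixel values are all invented for illustration.

```python
import numpy as np

def train_gaussian_classes(pixels, labels):
    """Estimate a mean vector and covariance matrix per class from training pixels."""
    stats = {}
    for c in np.unique(labels):
        x = pixels[labels == c]
        stats[int(c)] = (x.mean(axis=0), np.cov(x, rowvar=False))
    return stats

def classify_max_likelihood(pixels, stats):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    class_ids = sorted(stats)
    scores = []
    for c in class_ids:
        mean, cov = stats[c]
        diff = pixels - mean
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every pixel to this class mean
        mahal = np.einsum("ij,jk,ik->i", diff, inv, diff)
        # Log-likelihood up to a constant shared by all classes (equal priors assumed)
        scores.append(-0.5 * (logdet + mahal))
    best = np.argmax(np.stack(scores, axis=1), axis=1)
    return np.array(class_ids)[best]

# Synthetic training pixels standing in for ROI polygons (hypothetical values)
rng = np.random.default_rng(0)
water = rng.normal([30, 40, 60], 5, size=(200, 3))    # class 1
forest = rng.normal([40, 90, 35], 5, size=(200, 3))   # class 2
train_pixels = np.vstack([water, forest])
train_labels = np.array([1] * 200 + [2] * 200)

stats = train_gaussian_classes(train_pixels, train_labels)
new_pixels = np.array([[31.0, 42.0, 58.0], [41.0, 88.0, 36.0]])
predicted = classify_max_likelihood(new_pixels, stats)
print(predicted)
```

The sketch also suggests why a class with only two training sites can be a weak point: its mean and covariance are estimated from very few, possibly unrepresentative, pixels.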

The first step for the unsupervised classification was to run ISODATA on the original map image. The parameters were adjusted multiple times to try to create a map that best represented the classes that exist. In the final settings, the minimum number of classes was set to five and the maximum to seven. The maximum iterations were set to eight and the maximum number of pixels in a class was set to one hundred. All other settings were left at their defaults. ENVI then produced a color-coded map showing the different classes found. The image was examined to determine which land cover each class represented. Since some of the output classes described the same land cover type, the combine classes function was used to merge the duplicates. Next, colors were assigned to the classes they represented. Once that was completed, an accuracy assessment of the image was carried out.
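The cluster-then-combine workflow described above can be sketched in Python with NumPy. This is a simplified stand-in, not ENVI's ISODATA: it uses plain k-means, which lacks the class splitting and merging that ISODATA adds. The function names, the four clusters, the synthetic pixels, and the cluster-to-class mapping are all hypothetical; only the cap of eight iterations comes from the settings reported here.

```python
import numpy as np

def assign(pixels, centers):
    """Label each pixel with the index of its nearest cluster center."""
    dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(pixels, n_clusters, max_iterations=8):
    """Plain k-means clustering (no ISODATA split/merge steps)."""
    # Deterministic start: spread initial centers across pixels sorted by brightness
    order = np.argsort(pixels.sum(axis=1))
    idx = np.linspace(0, len(pixels) - 1, n_clusters).astype(int)
    centers = pixels[order[idx]].astype(float)
    for _ in range(max_iterations):
        labels = assign(pixels, centers)
        new_centers = centers.copy()
        for k in range(n_clusters):
            members = pixels[labels == k]
            if len(members):
                new_centers[k] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(pixels, centers), centers

def combine_classes(labels, mapping):
    """Relabel clusters so that duplicates of one land cover share a class id."""
    lookup = np.array([mapping[k] for k in range(len(mapping))])
    return lookup[labels]

# Two synthetic land covers, deliberately over-clustered into four groups
rng = np.random.default_rng(1)
dark = rng.normal(20, 5, size=(50, 3))
bright = rng.normal(200, 5, size=(50, 3))
pixels = np.vstack([dark, bright])

labels, centers = kmeans(pixels, n_clusters=4)
# Hypothetical analyst decision: clusters 0 and 1 are one cover, 2 and 3 another
merged = combine_classes(labels, {0: 0, 1: 0, 2: 1, 3: 1})
print(np.unique(merged, return_counts=True))
```

The mapping passed to `combine_classes` is the manual step: the algorithm groups pixels by spectral similarity only, and the analyst decides afterward which groups represent the same land cover.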

The two accuracy assessments were then compared to determine which method created a more accurate map. Finally, the measurement option in the ROI tools menu was used to measure the area of each class. The area unit was set to acres to match the other data already known about HANA.

Results

Figure 1: The image on the left is the Supervised Classification map. On the right is the Unsupervised Classification map. Note that the supervised classification has fewer classes than the unsupervised classification.


Figure 2: This image is the accuracy assessment for the supervised classification. Note that the overall accuracy was 85.17%.

Figure 3: This image is the accuracy assessment for the unsupervised classification. Note that the overall accuracy was 11.97%.


Figure 4: This image is of the separability of the supervised classification.

Figure 5: This image is of the measurements of each classification in acres.

Discussion

As can be seen in Figure 1, the supervised classification has fewer classes than the unsupervised classification. The supervised classification is more pleasing to the eye because it is easier to distinguish the different land cover types from each other, while the unsupervised classification has the classes mixed together in some areas, making it harder to tell what is what. In addition, the unsupervised method misclassifies land cover: some land cover types appear in multiple colors, and some colors are used for more than one land cover type. From examining the unsupervised method’s map, the classifications can be seen as:

Open water: Green
Forested: Yellow, Cyan, Blue
Upland: Blue, Yellow
Impervious Surface: Cyan, Magenta, Green
Wetland: Cyan, Magenta
Open Field: Yellow

The map is very confusing to use when classifying what land cover exists in HANA. As seen in Figure 3, the overall accuracy of the unsupervised method was only 11.97%, which is nowhere near the 65% that was hoped for. Even after six attempts at changing the parameters, the map did not become more accurate. This map is not a good representation of the site.

As seen in Figure 2, the supervised method had an overall accuracy of 85.17%, which is well above the 65% that was hoped for. Figure 1 shows how easy it is to tell apart the six classes found at HANA: open water, forested, upland, impervious surface, wetland, and baseball field/open field. Figure 2 also shows a kappa coefficient of 0.8193, which indicates good agreement between the map and the reference data. A kappa coefficient close to one is what the user wants to achieve. As seen in Figure 4, the separability of the training sites was calculated, and the results show that class number six and class number five were the two classes most often confused with each other within the supervised classification method.
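For readers unfamiliar with how overall accuracy and the kappa coefficient relate, the following Python sketch computes both from a confusion matrix using the standard definition of Cohen's kappa. The function name and the 2x2 matrix are made up for illustration; the numbers are not the values from Figure 2 or Figure 3.

```python
import numpy as np

def accuracy_and_kappa(confusion):
    """Return (overall accuracy, kappa) for a square confusion matrix.

    Rows and columns index the same classes; diagonal cells are the
    pixels where the map and the reference data agree.
    """
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    observed = np.trace(confusion) / total
    # Agreement expected by chance, from the row and column totals
    expected = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / total**2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical two-class confusion matrix (not data from this project)
example = [[45, 5],
           [10, 40]]
overall, kappa = accuracy_and_kappa(example)
print(f"overall accuracy = {overall:.2%}, kappa = {kappa:.2f}")
```

In this made-up example the overall accuracy is 85% while kappa is 0.70, because kappa discounts the agreement that would be expected by chance alone. That is why the two figures are reported side by side.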

As shown in Figure 5, the size of each class in acres was calculated. The measurements appear to be quite small given that all of HANA sits on 378 acres of land. It is unknown what caused the measured acreages to be so small compared to the known size of the property.
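One way to cross-check acreages like these is to count the pixels in each class of the classified image and convert to acres directly. The Python sketch below does that under stated assumptions: the function name, the tiny synthetic class map, and the pixel sizes are hypothetical, and the actual ground resolution of the imagery would have to be read from its metadata. It does not explain the small values in Figure 5. It only shows a check whose class totals can be compared against the known 378 acres.

```python
import numpy as np

SQUARE_FEET_PER_ACRE = 43_560.0

def class_acres(class_map, pixel_size_ft):
    """Convert per-class pixel counts to acres for square pixels of the given size."""
    ids, counts = np.unique(class_map, return_counts=True)
    pixel_area = pixel_size_ft ** 2
    return {int(i): float(n) * pixel_area / SQUARE_FEET_PER_ACRE
            for i, n in zip(ids, counts)}

# Synthetic 330 x 264 class map: top half class 1, bottom half class 2
class_map = np.ones((330, 264), dtype=int)
class_map[165:, :] = 2

acres_1ft = class_acres(class_map, pixel_size_ft=1.0)   # assumed 1 ft pixels
acres_2ft = class_acres(class_map, pixel_size_ft=2.0)   # assumed 2 ft pixels
print(acres_1ft, sum(acres_1ft.values()))
print(acres_2ft, sum(acres_2ft.values()))
```

Doubling the assumed pixel size quadruples every acreage, so a check like this is only as good as the pixel size and units fed into it.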

Conclusions

In conclusion, the supervised classification worked better than the unsupervised method.

It is unknown what caused the accuracy of the unsupervised method to be so low. The steps were redone six separate times with no better result. It may have been that the vegetation in some areas appeared very similar, without much spectral variation between types, making it hard for the program to differentiate between the classes. There was also a lot of shadow on some parts of the images that could have caused the error. While the supervised classification was the better of the two, adding more training sites for the baseball field/open field class might have given a better representation of what those areas look like and improved the accuracy further. It may have also helped to create a class for the clipped areas that appear in black so that they were not confused with the impervious surfaces in the images. Finally, it may have been possible to get better acreage measurements for each class by using a method other than the tool in the ROI menu.