This is the 4th tutorial. Image classification is a common satellite imagery analysis. Satellite imagery records enormous information on the ground, however, to utilize the ground information, we need to transform it to vector format for bounding attribute information such as type of land, owner of the land, area, perimeter. In the past time, people need to manually identify the parcel on the imagery and then digitalize it. It takes a lot of time to process the data, so the scientists integrate the use of statistics to mitigate human resource in this transformation process. In the previous tutorial, we mentioned each object in the imagery has different spectral characteristics. By identifying the spectral pattern in the satellite imagery, we can generate classes and transform them into the vector format. Look at the example in the figure. It is a scatter plot of red and infrared spectral for each land cover. Each class shows the cluster in the scatter plot. The statistics utilize this characteristic to cluster the spectral and come up with the land cover classification.

Land cover and spectral scatter plot

Fig 1. Land cover and spectral scatter plot

The workflow of image classification describes in the figure below. Firstly, extracting the training dataset by supervised or unsupervised approaches. The trained dataset is the answer for the machine, then it assigns pixels into classifications based on spectral similarity.

Land cover and spectral scatter plot

Fig 2. Classification workflow

In this turtorial, we will deliver:

Common Classification Approaches

Unsupervised Classification

In unsupervised classification, it groups pixel based on their specturm signiture value, and come out with multiple clusters which refer to a type of land cover. The common method includes:

Supervised Classification

Minimum Distance
Land cover and spectral scatter plot
The minimum distance calculates the difference between a pixel and trained sets, then assigns the pixel to the classification with the smallest difference. However, it doesn't consider the distribution of an object and result some absurd classification result.
Maximum Likelihood Classification
Land cover and spectral scatter plot
The maximum likelihood classification assume the spectral is a normal distribution, so it takes the mean value and the standard deviation from trained sets and generate a prosibility distribution. The pixel will be assigned to the classification with most similar spectral distribution.

Implementation of image classification

First of all, download and extract the raster image which we have added in a Zip folder form, called “Clip area”. This is the pre-processed image which you are going to work with in this tutorial. We will show you that how we have pre-processed this Sentinel-2 satellite imagery. You can follow these steps on your personal interests as in this tutorial we don’t have much time to discuss each and everything in such detail. But we will discuss how we did the Pre processing to get ready this imagery for classification. Just go through it for your understanding as we have already provided you the pre-processed imagery.

Before any implementation,we will need QGIS Plugin,If you have not installed the Semi Automatic Classification Plugin then watch our 3rd tutorial video about QGIS Plugins. Another option is to click on the Plugins in QGIS and write a Semi Automatic Classification in the search box, install that plugin if its not already installed in your systems. After installing the plugin, you can see the main menu bar of QGIS “SCP” tab.

To sucessfully implement the image classification, you will need to follow the sections below and operate the QGIS:

  1. Load data
  2. Data preprocessing - Extract area and bands
    Extended sections for understanding how we process the data and no need operation in the QGIS
  3. Data preprocessing - Band Composite
  4. Image Classification
Now, you have to just load the pre-processed image which we have provided to you. You can follow the steps after the pre-processing.

Loading the raster image to QGIS

Click on the raster in the left

Then click on the brows data where your raster data set is. Add your data

Setting the coordinate system of the input data and QGIS
If the loaded imagery is not showed in the QGIS window, it is because of the difference in the coordinate system of the imagery and QGIS. For this purpose, follow the steps below to set the same coordinate system in QGIS as of imagery. First of all, go to setting and then options:

Click on the CRS and make sure that your system has the exact same setting as below, click OK and restart your QGIS and load the same data. Now it will work

This is how your data looks like after adding it to the QGIS

Data preprocessing - Extract area and bands
*This is an extended section for understanding data extraction. Please read the section and nooperation is needed

The imagery that you download often comes with a huge file size, but we don't always need to use the whole imagery to classify. Therefore, we need some steps to extract the certain parts only for classify and analysis. By doing so, we can reduce the burden of computation and enhance the efficiency of workflow.

To do this we need to click on the SCP > click Band set > here you will see menu of the Semi Automatic Classification window.

Click on the refresh band sat and you will see the list of band set loaded to the QGIS as seen below

Now select the bands which you want to clip and work with in QGIS. Here we have selected the first four bands (RGB) because for land cover classification these bands are enough to classify the image. Then click on the plus sign (Add band to Band set) It will create “Band set 1”, as seen below.

After this click on the Preprocessing tool in the left and click on the Clip Multiple Raster. Here you can see the select input band set “1”. It is the same band set which we have created before. You can see the clip coordinates below in which we can enter the coordinates of our area of interest to be clipped or we can directly draw our area of interest inside the window. Here we will draw our study area as we don´t have the coordinates of our area of interest.

By clicking on one side of the image and then right click on other side which you want to clip a rectangular shape of area will be show that defines coordinates, which we can clip as below. Click run and it will clip the study area

The area of interest with the selected bands has been clipped and can be seen below and new layers of this clip area had been added automatically to QGIS layers window

Data preprocessing - Band Composite

Now we are going to convert the band set into the Surface Reflectance. For this purpose, we will click on SCP > click on Preprocessing > click on Sentinel-2, because we are working with Sentinel-2 data

In the input directory select the directory which your clipped data is. In the Meta data select the directory where your meta data of the imagery is (it is in the same folder as an XML Document which you have extracted after downloading). Click on the Apply DOS1 Atmospheric Correction. Uncheck the create band set and use band tools. Click Run and give the output location. Running will be started

The output can be seen as RT file

Remove the rest of the bands and the Clipped layers and just select the RT layers to define the Band set in which we have to define the combination of band set in order to classify this imagery.

Click on SCP > click Band set > click on Refresh list

Add these to the refreshed bands to the Band set 1. But first delete the Band set one bands which we have added during clipping the study area. Go to the right-side menu below and click on the reset. It will delete the previous bands from QGIS

Now select all the bands and add it to the Band set 1, as shown below

Click on the Band set tools at the bottom and select Sentinel-2 as shown below

The centre wavelength has been defined here and can be seen which is useful for the spectral signature. That’s it for the band set

Now we have defined the band set and we can hide the existed layers or bands in the QGIS by unchecking it.

Go to the SCP bar above which is RGB, to see the colour composite. Select the band set 3-2-1 which created a Virtual Band Set 1, a colour composite image as shown below. We can change the colour composite such as 2-3-1 to see the false colour

Image Classification
Create Training samples

Now its time to create training input. To do so click on Training Input > click Create a new training input as shown below. Give a name and an output location to save it

Now we will create training samples for different classes in this imagery. We will create four classes in this image namely:

  1. Water
  2. Vegetation
  3. Baren Soil
  4. Settlements
Here we have two types of classes - Macro class and Class. The difference between them is that Macro class is the head class of the class.

So, lets start with creating the training samples for the Region of Interests ROI. Click on the Create a ROI polygon. There are two ways to do so this, one is to create it manually by drawing polygons for each class and the other is to create it automatically just by playing with the Spectral Distance Value

First, we will do it manually

It will activate the creating ROI manual polygon. Left click to start the polygon and right click to end it. Below you can see we have a created our first polygon for one class

Now save this training sample by setting the Macro class and Class information and save it

By saving it will save the information such as color, ID and type. In the image below we can see that Forest is the Class of Macro Class (Vegetation). Try to take as much training samples you can as it will make better the accuracy of your classification, at least take 10 to 20 training samples. You can create more training samples of the same class by using the same method except by not changing its Macro Class ID (MC ID) and Class ID (C ID). These should be the same for one class no matter how many training samples you are creating

We can see the information about the Macro Class and Class. In the image below Vegetation is the Macro Class of the Class (Forest)

Second method of creating training samples of the ROI by using the ROI algorithm. It is much easier and quicker then drawing a polygon for each ROI. This algorithm selects all the pixel as a ROI in an image with the spectral distance less than the specified limit such as 0.1 in this case

Spectral Signature:

It is the amount of reflected energy measure by the remote sensing technology. Different objects have different spectral response or values such as Water bodies have different spectral values then Vegetation and urban areas. So, by using this spectral signature values we can classify an image into different classes.

By using the ROI algorithm select the spectral distance of your choice for example in this case we set it 0.08. After setting the Spectral distance click on any where in the image as shown below. It will make polygons of the region with the spectral distance less then 0.08 in that limited area

Set its Macro class and Class name along with their ID´s. ROI has been created for the Built up class in the image as a black region can be seen below. So it quickly create the ROI´s on the bases of the spectral distance values

The spectral distance has to be set on by doing your own observation. Just play with its value increase it or decrease it and observe that which area it is taking in ROI`s and which is skipping it. Try to set its value as fit as possible so that it can take the exact area of ROI which it should. By changing its spectral distance value, you can see the result for the water bodies as below

Again, take at least 10 to 20 training samples for this class by following the same instructions above.

Follow the same steps and create another class with the Macro class name (Vegetation) and Class name (Agriculture)

Create a Macro class with the name Barren Soil and Class with the name Low Vegetation

Take 10 to 20 training samples for this class too by not changing its Macro and Class ID.

That’s all we have finished the training samples for the ROIs by creating these four Macro Classes.

Now we can see each Class spectral values by selecting all the Classes and click on the Add highlighted signatures to spectral signature plot

After clicking on the highlighted signature, you can see each class signature in the plot. We can show and hide spectral signature plot of each class by checking and unchecking the box. We can compare and assess the spectral distance by observing the plot

We can also display the whole statistics of the spectral distance by clicking on Calculate spectral distances

We can change colour for each class by double clicking on the colour of the class

After changing the colours, we can see each class

All is set now we are going to use the classification algorithm. There are three algorithms here from which we can select. Here we will select the Maximum Likelihood Classification.

Maximum Likelihood Classification

This algorithm used to predict the class label y that maximizes the likelihood of our observed data x (Training samples of the ROI).
By activating the Classification preview we can see our image classification classes. Click on any area in the image and it will show you the classification preview. Classes can be seen on the left side. We have used Class ID which is the subclass of the Macro Class. Classification preview will help us in the verification of the classified image. If we are satisfied with the classification preview, we can run it and the image will be classified otherwise select the ROIs again for a better and accurate classification. Once we are satisfied with the classification, we can create an output of the classification by clicking on the Run tool and give output location and name

We can also select the Macro class ID to show the classification preview of the Macro class.

Once we are satisfied with the classification, we can create an output of the classification by clicking on the Run tool and give output location and name

It is saved as a tiff format which can be open in QGIS. In this classification there could be some accuracy limitation such as assigning the wrong class to an area. As you can see that Built up and Water class is appearing too much in the classified image. It is because that we have only taken one ROIs and training sample for each class. We did so because it would take much time for you to follow each ROI creation. Therefore, we have just showed you how to do it accurately by creating a greater number of ROIs. Remember the more the number of training samples the better is the classification.

You can save it in many formats

Click on the format and you will see a list of formats which you can export this output

Congradulation!
You've finished the tutoriol. We hope that we have added some new skills to your existed knowledge. If you have any question regarding our tutorials please do not hesitate to ask. You can ask question in general also if you are interested in our topic. We will try our best to answer to your question. Thank you and best of luck for your bright future.

Reference