Misanthrope's Thoughts

What is the cause of environmental degradation?
Capitalism, corruption, consumer society? - OVERPOPULATION!
Please, save the Planet - kill yourself...

Wednesday, October 1, 2014

Interactive Visualisation of the Profitable Amount of Waste to Dispose Illegally

"Wow!" - I said to myself after reading R Helps With Employee Churn post - "I can create interactive plots in R?!!! I have to try it out!"

I quickly came up with the idea of creating an interactive plot for my simple model for assessing the profitable ratio between the volume of waste that could be illegally disposed of and the costs of illegal disposal [Ryabov Y. (2013) Rationale of mechanisms for the land protection from illegal dumping (an example from the St.-Petersburg and Leningrad region). Regional Researches. №1 (39), p. 49-56]. The condition for profitable illegal dumping can be described as follows:

E × V + k × P < T × V
Here: k - the probability of being fined for illegal disposal of waste;
P - maximum fine for illegal disposal of waste (illegal dumping);
V - volume of waste to be [illegally] disposed of by the waste owner;
E - costs of illegal disposal of waste per unit;
T - official tax for waste disposal per unit.

The condition for profitable illegal landfilling can be described as follows:

Tc × V1 > E1 × V1 + k × P1
Here: V1 - total volume of waste that is supposed to be disposed of at the illegal landfill;
Tc - tax for disposal of waste at illegal landfill per unit;
P1 - maximum fine for illegal landfilling;
E1 - expenditures of the illegal landfill owner for disposal of waste per unit.
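
To make these thresholds concrete, here is a minimal sketch (not part of the original model) that computes the break-even volumes for both conditions; it uses the fines from this post, while the unit costs E, Tc and E1 are purely hypothetical values chosen for illustration:

k <- 0.5        # probability of being fined
P <- 300000     # maximum fine for illegal dumping, RUB
P1 <- 500000    # maximum fine for illegal landfilling, RUB
Tax <- 600      # official waste disposal tax per cubic metre, RUB
E <- 300        # hypothetical cost of illegal disposal per cubic metre, RUB
Tc <- 450       # hypothetical fee charged by the illegal landfill per cubic metre, RUB
E1 <- 100       # hypothetical landfill owner's cost per cubic metre, RUB

# waste owner: E*V + k*P < Tax*V, so illegal dumping pays off when V > k*P / (Tax - E)
V_owner <- k * P / (Tax - E)        # 500 cubic metres

# landfill owner: Tc*V1 > E1*V1 + k*P1, so the landfill pays off when V1 > k*P1 / (Tc - E1)
V_landfill <- k * P1 / (Tc - E1)    # ~714 cubic metres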

Let's plot the graphs (with some arbitrary numbers, except for the fines, chosen just for a nice-looking representation) to get an idea of what it looks like.


Note that there is a footnote (this post provides nice examples of how to do it) with the values used for plotting. It is important to have this kind of indication if we want to create a series of plots.

Now I will show you the result and then will provide the code and some tips.

Playing with the plot

Tips and Tricks

Before I show you the code, I want to share my hard-earned knowledge about the nuances of the manipulate package. There are several ways to get a static plot like that using ggplot, but some of them will fail to be interactive with manipulate.

  1. All the data for the plot must be stored in one data frame.
  2. All the data for the plot must be derived from that data frame (avoid passing standalone variables to ggplot).
  3. Do not use geom_hline() for the horizontal line: generate the values for this line, store them in the data frame and draw it as a regular line (see the skeleton after this list).
  4. To create a footnote (so you know exactly which parameters were used for the current graph), use the arrangeGrob() function from the gridExtra library.
  5. Always use $ inside aes() to address the columns of your data frame if you want the plots to stay interactive.
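
To show what these tips mean in practice, here is a minimal, self-contained skeleton (an illustrative sketch, not the full code from the next section; the data and slider ranges are just examples):

library(ggplot2)
library(manipulate)

plotSketch <- function(k, P, Ta){
    # one data frame holds every series, including the horizontal tax line (tips 1-3)
    df <- data.frame(V = 1:2000)
    df$threshold <- Ta - k * P / df$V
    df$tax <- Ta
    p <- ggplot(df, aes(x = df$V)) +                           # $ inside aes() (tip 5)
        geom_line(aes(y = df$threshold), colour = "#00BFC4") +
        geom_line(aes(y = df$tax), colour = "#66CD00") +
        labs(x = "Waste volume", y = "Cost per unit")
    print(p)
}

manipulate(plotSketch(k, P, Ta),
           k = slider(0, 1, initial = 0.5, step = 0.01, label = "Fining probability"),
           P = slider(0, 1000000, initial = 300000, step = 10000, label = "Dumping fine"),
           Ta = slider(0, 3000, initial = 600, step = 50, label = "Official tax"))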

The Code

library(ggplot2)
library(grid)
library(gridExtra)
library(manipulate)
library(scales)
library(reshape2)

## Ta --- official tax for waste utilisation per tonne or cubic metre.
## k --- probability of the waste owner getting fined for illegal dumping (0 < k < 1); x <=> V, y <=> E

max_waste_volume <- 2000 
Illegal_dumping_fine_P <- 300000
Illigal_landfilling_fine_P1 <- 500000
Fine_probability_k <- 0.5
Official_tax_Ta <- 600

# mwv = max_waste_volume
# P = Illegal_dumping_fine_P
# P1 = Illigal_landfilling_fine_P1
# k = Fine_probability_k
# Ta = Official_tax_Ta

updateData <- function(mwv, k, P1, P, Ta){
    
    # creates and(or) updates global data frame to provide data for the plot
    
    new_data <<- NULL
    new_data <<- as.data.frame(0:mwv)
    names(new_data) <<- 'V'
    new_data$IlD <<- k*P1/new_data$V
    new_data$IlD_fill <<- new_data$IlD
    new_data$IlD_fill[new_data$IlD_fill > Ta] <<- NA # we don't want ribbon to fill area above Official tax 
    new_data$IlL <<- Ta-k*P/new_data$V
    
    new_data$Ta <<- Ta
    new_data$zero <<- 0
    dta <<- melt(new_data, id.vars="V", measure.vars=c("IlD", "IlL", "Ta"))
    dta.lower <<- melt(new_data, id.vars="V", measure.vars=c("IlD_fill", "zero", "Ta"))
    dta.upper <<- melt(new_data, id.vars="V", measure.vars=c("Ta", "IlL", "Ta"))
    dta <<- cbind(dta, lower=dta.lower$value, upper=dta.upper$value)
    dta$name <<- factor(NA, levels=c("Illegal landfill owner's\nprofitable ratio",
                                    "Waste owner's\nprofitable ratio", 
                                    "Official tax"))
    dta$name[dta$variable=="IlD"] <<- "Illegal landfill owner's\nprofitable ratio"
    dta$name[dta$variable=="IlL"] <<- "Waste owner's\nprofitable ratio"
    dta$name[dta$variable=="Ta"] <<- "Official tax"
}

updateLabels <- function(k, P1, P, Ta){
    
    ### creates footnote caption for the plot
    
    prob <- paste('Fining probability = ', k, sep = '')
    landfilling_fine <- paste('Illegal landfilling fine = ', P1, sep = '')
    dumping_fine <- paste('Illegal dumping fine = ', P, sep = '')
    tax <- paste('Official tax = ', Ta, sep = '')
    note <<- paste(prob, landfilling_fine, sep = '; ')
    note <<- paste(note, dumping_fine, sep = '; ')
    note <<- paste(note, tax, sep = '; ')
    note
}


plotDumping <- function(mwv, P, P1, k, Ta){
    
    ### this function draws the plot
    
   # initialise plot data
    updateData(mwv, k, P1, P, Ta)
    updateLabels(k, P1, P, Ta)
    
    # draw the plot
    profit <- ggplot(dta, aes(x=dta$V, y=dta$value, ymin=dta$lower, ymax=dta$upper, 
                              color=dta$name, fill=dta$name, linetype=dta$name)) +
        geom_line(size=1.2) + 
        geom_ribbon(alpha=.25, linetype=0) +
        theme(axis.text.x = element_text(angle=0, hjust = 0),
              axis.title = element_text(face = 'bold', size = 14),
              title = element_text(face = 'bold', size = 16),
              legend.position = 'right',
              legend.title = element_blank(),
              legend.text = element_text(size = 12),
              legend.key.width = unit(2, 'cm'),
              legend.key.height = unit(1.2, 'cm'))+
        scale_linetype_manual(values=c(4, 5, 1)) +
        scale_fill_manual(values = c("#F8766D","#00BFC4",NA)) +
        scale_color_manual(values = c("#F8766D","#00BFC4", '#66CD00')) + 
        labs(title="Profitable ratio between the volume \nof illegally disposed waste \nand costs of illegal disposal of waste",
             x="Waste volume, cubic meters",
             y="Cost per cubic meter, RUB") +
        xlim(c(0, max(new_data$V)))+
        ylim(c(0, Ta*1.5))
        
    
    # add a footnote about the parameters used for the current plot
    profit <- arrangeGrob(profit, 
                          sub = textGrob(note, 
                                         x = 0, 
                                         hjust = -0.1, 
                                         vjust=0.1, 
                                         gp = gpar(fontface = "italic", fontsize = 12)))
    
    # show plot
    print(profit)

}


simDumping <- function(max_waste_volume = 2000, 
                       Illegal_dumping_fine_P = 300000,
                       Illigal_landfilling_fine_P1 = 500000,
                       Fine_probability_k = 0.5,
                       Official_tax_Ta = 600) {
    
    ### this function creates interactive plot
    
    max_waste_volume <<- max_waste_volume 
    Illegal_dumping_fine_P <<- Illegal_dumping_fine_P
    Illigal_landfilling_fine_P1 <<- Illigal_landfilling_fine_P1
    Fine_probability_k <<- Fine_probability_k
    Official_tax_Ta <<- Official_tax_Ta
    
    manipulate(suppressWarnings(plotDumping(max_waste_volume, 
                           Illegal_dumping_fine_P,
                           Illigal_landfilling_fine_P1,
                           Fining_probability_k,
                           Official_tax_Ta)
                           ),
               
               # set up sliders
               max_waste_volume = slider(0, 50000, 
                                         initial = max_waste_volume, 
                                         step = 100,
                                         label = 'X axis range'),
               Illegal_dumping_fine_P = slider(0, 5000000, 
                                               initial = Illegal_dumping_fine_P,
                                               step = 10000, 
                                               label = 'Illegal dumping fine (P)'),
               Illigal_landfilling_fine_P1 = slider(0, 5000000, 
                                                    initial = Illigal_landfilling_fine_P1, 
                                                    step = 10000,
                                                    label = 'Illegal landfilling fine (P1)'),
               Fining_probability_k = slider(0, 1, 
                                             initial = 0.5,
                                             step = 0.01,
                                             label = 'Probability of being fined (k)'),
               Official_tax_Ta = slider(0, 3000, 
                                        initial = Official_tax_Ta, 
                                        step = 50,
                                        label = 'Official waste disposal tax (T)')
    )
}

simDumping() # for reasons unknown, I have to run this function twice to get a proper interactive plot

Saturday, September 13, 2014

The Most Hilarious Rendering "Bug" I've Ever Seen

Can you guess the projection I used to display the world map in the picture below (the EPSG code is left there because I doubt anyone would recall it anyway)?


It is the World Bonne projection in QGIS (the original shp's CRS is EPSG:4326). I already had some fun with Bonne in this post. The world gets torn apart at the 1:75 000 000 scale. At larger scales the map renders normally:



Fortunately this issue can be easily resolved by disabling "feature simplification by default for newly added layers" under Settings -> Options -> Rendering. 

Monday, September 8, 2014

Schema of the Conservation Areas in Leningradskaya Region: Some Notes for Beginner Mappers

Schema of the Conservation Areas in Leningradskaya Region

About a year ago I was asked to create a small (B5-size) and simple schema of the conservation areas in Leningradskaya region. I did it using QGIS. Here is the author's version of the schema and several notes that might be helpful for a beginner map-maker.

There was a huge disproportion between the areas of different objects, and both polygon and point markers were needed to show them. I decided to use the Centroid Fill option in the polygon style so that I could use only one layer (polygon) instead of two (polygon and point). With Centroid Fill, the points in the centres of the small polygons overlap and mask these tiny polygons.

All the administrative borders were stored in one layer (and there are far more borders than one can see here). They are drawn using rule-based symbology, so I didn't even need to subset this layer to get rid of the rest of its polygons.

All the names of the surrounding countries, regions, the city and the water bodies are not labels derived from layers, but labels created inside the map composer. It was quicker, and their positioning was more precise, which is crucial for such a small schema.

There was a lack of space for the legend, so I had to utilise every bit of canvas to place it. I had to use 3 legend items here. One of them actually overlaps another, and setting a transparent background for the legends helped with that.

Finally, the labels for the conservation areas (numbers) were outlined in white to be perfectly readable. Some of them were moved manually (with the coordinates stored in dedicated columns of the layer) to prevent overlap with other labels and data.

P.S. Don't be afraid to argue with the client about the workflow. Initially I was asked to manually digitise a 20 000 x 15 000 pixel raster map of the Leningrad region to extract the most precise borders of the conservation areas (and districts of the region). Of course I could have done it and charged more money for the job, but what's the point if some of those borders can't even be seen on such a small schema? So I convinced the client to use data from OSM and saved myself a lot of time.

Sunday, August 31, 2014

Donetsk Label at OpenStreetMap

Now that Donetsk has become the hot spot of the civil war in Ukraine, I decided to take a glance at the map of the war zone. Of course I used OSM for this. But even though I knew the approximate location of the city, I couldn't find it for several minutes. I was confused... The reason was that the "Donetsk" label (the label of the city which, by the way, is the administrative centre of the Donetsk region) is suppressed at scales smaller than 10 km by the label of the adjacent town "Makeevka". WTF?!

"Donetsk" ("Донецьк") label at OSM is suppressed by the label of adjusted town

Thursday, May 29, 2014

Random vs Pseudo-Random PROC, or Why Axe's Win Rate Dropped in the 6.81 Patch (Dota 2)

Introduction

Here is an interesting post about the influence of the 6.81 patch on hero win rates in Dota 2. One point caught my attention: a drastic change in Axe's win rate from 52% to 48%; see the image below.


Axe's win rate before and after the 6.81 patch (29th of April)
The only change Axe got in this patch was the implementation of a pseudo-random approach (instead of true random) for determining whether the Counter Helix skill triggers. As was guessed in the article:
"...where you manage to Berserker's Call several enemies, you only have 3.2 seconds of being unconditionally hit by the enemy. In this time frame, the amount of successful Counter Helixes that use PRD is probably lower than with a true random. Hence the decrease in damage output and Win Rate." 
OK, nice guess! Let's test it!

Create a simulation to test the hypothesis

Import needed modules
import random
import numpy
from tabulate import tabulate
import collections
Let's create functions to calculate the number of Counter Helix PROCs for a given number of hits using the random and pseudo-random approaches. The first one uses the random approach (notice that random.randrange() actually generates a pseudo-random set of values anyway).

def randHelixProc(hits):
  '''
  checks if Counter Helix PROCs (chance = 17%) using random approach
  @param hits: number of times Axe was hit
  @return procs: number of times Counter Helix PROCs
  '''
  procs = 0

  for i in xrange(hits):
    chance = random.randrange(1, 101)
    if chance < 18:
      procs += 1
    else:
      continue

  return procs

I don't know exactly how pseudo-random PROCs are implemented for Axe, or for Dota 2 in general, but they say that a pseudo-random event in Dota 2 starts with a lower chance to proc than the stated probability for the event, and each time the event fails to trigger when it could, the chance for it to occur is increased. Let's say some event triggers with a 50% chance. The suggested pseudo-random approach can be realised with 3 consecutive checks that have different chances: 0%, 50% and 100% (the average is 50%). When the event finally triggers, its chance is reset to the initial value (0% in this case) and it all starts over again.
def pseudoRandHelixProc(hits):
  '''
  checks if Counter Helix PROCs (chance = 17%) using pseudo random approach
  @param hits: number of times Axe was hit
  @return procs: number of times Counter Helix PROCs
  '''
  treshold = 0
  prob_list = [2, 5, 9, 14, 23, 47] # ~17% on average
  procs = 0
  for i in xrange(hits):
    chance = random.randrange(1, 101)
    try:
      success = prob_list[treshold]
    except IndexError:
      success = prob_list[5]
    if chance >= success:
      treshold += 1
    else:
      procs += 1
      treshold = 0
  return procs
Let's check that the chances for PROCs are the same for both functions. We should get about 1700 Counter Helix procs for 10000 attacks on Axe. Launch 100 simulations of 10000 hits for each function and compute the average number of procs for the random and pseudo-random approaches.
rhelix_list = []
for i in xrange(100):
  value = randHelixProc(10000)
  rhelix_list.append(value)
numpy.mean(rhelix_list)

>>>1702.79

p_rhelix_list = []
for i in xrange(100):
  value = pseudoRandHelixProc(10000)
  p_rhelix_list.append(value)
numpy.mean(p_rhelix_list)

>>>1702.3
The difference in output between the random and pseudo-random implementations is negligible.
Now it's time to create a simulation function.
def simulation(times, hits):
  '''
  Computes average number of PROCs for given hits for random and pseudo random
  approaches and difference between them.

  @param times: number of times simulation will run
  @param hits: number of hits to simulate
  @return: average difference in procs between random and pseudo random approaches;
           table of simul results
  '''
  # create lists of results
  diff_list = []
  rand_mean_list = []
  p_rand_mean_list = []

  # run simulation
  for hit in xrange(1, hits + 1):

    rand_list = []
    pseudo_rand_list = []

    for t in xrange(times):
      rand = randHelixProc(hit)
      p_rand = pseudoRandHelixProc(hit)
      rand_list.append(rand)
      pseudo_rand_list.append(p_rand)

    # compute statistics and populate lists of results
    rand_mean = numpy.mean(rand_list)
    rand_mean_list.append(rand_mean)
    p_rand_mean = numpy.mean(pseudo_rand_list)
    p_rand_mean_list.append(p_rand_mean)
    diff = rand_mean - p_rand_mean
    diff = round(diff, 2)
    diff_list.append(diff)

  # print average difference in PROCs
  total_diff = sum(diff_list)
  l = len(diff_list)
  print 'average difference:', total_diff/l
  print '#######################################################################################################'

  # create table output for simulation results
  out_dict = {}
  out_dict['(1) cumulative hits'] = range(1, l + 1)
  out_dict['(2) random mean PROCs'] = rand_mean_list
  out_dict['(3) pseudo random mean PROCs'] = p_rand_mean_list
  out_dict['(4) difference'] = diff_list
  out_dict = collections.OrderedDict(sorted(out_dict.items()))
  print tabulate(out_dict, headers = 'keys', tablefmt="orgtbl")
Let's run 100 simulations of 100 consecutive hits. Of course it is possible to run 10000 hits, but it is unnecessary since every time Counter Helix is triggered we jump back to the first row of the table below (the result of the simulation). As was already mentioned:
"[if] ...you manage to Berserker's Call several enemies, you only have 3.2 seconds of being unconditionally hit by the enemy"
If Axe manages to catch 3 enemies, together they will hit him about 9-12 times. This means that with the pseudo-random approach he will spin about 0.3-0.4 times less than with the random approach, which will cause him to deal roughly (205*0.35)*3 = 215.25 less damage per engagement (or 71.75 per hero).

So the hypothesis was correct: the pseudo-random approach caused a decrease in damage output over a short time interval even though the total number of PROCs during the game stayed the same.
average difference: 0.34
#######################################################################################################
|   (1) cumulative hits |   (2) random mean PROCs |   (3) pseudo random mean PROCs |   (4) difference |
|-----------------------+-------------------------+--------------------------------+------------------|
|                     1 |                    0.17 |                           0    |             0.17 |
|                     2 |                    0.45 |                           0.09 |             0.36 |
|                     3 |                    0.56 |                           0.12 |             0.44 |
|                     4 |                    0.71 |                           0.26 |             0.45 |
|                     5 |                    0.81 |                           0.36 |             0.45 |
|                     6 |                    1.06 |                           0.75 |             0.31 |
|                     7 |                    1.37 |                           0.88 |             0.49 |
|                     8 |                    1.35 |                           0.97 |             0.38 |
|                     9 |                    1.47 |                           1.22 |             0.25 |
|                    10 |                    1.64 |                           1.33 |             0.31 |
|                    11 |                    1.95 |                           1.65 |             0.3  |
|                    12 |                    2.1  |                           1.68 |             0.42 |
|                    13 |                    2.36 |                           1.79 |             0.57 |
|                    14 |                    2.56 |                           2.16 |             0.4  |
|                    15 |                    2.61 |                           2.26 |             0.35 |
|                    16 |                    2.61 |                           2.37 |             0.24 |
|                    17 |                    2.87 |                           2.43 |             0.44 |
|                    18 |                    2.77 |                           2.83 |            -0.06 |
|                    19 |                    3.22 |                           2.84 |             0.38 |
|                    20 |                    3.03 |                           2.84 |             0.19 |
|                    21 |                    3.6  |                           3.22 |             0.38 |
|                    22 |                    3.71 |                           3.32 |             0.39 |
|                    23 |                    3.68 |                           3.68 |             0    |
|                    24 |                    3.93 |                           3.72 |             0.21 |
|                    25 |                    4.54 |                           3.92 |             0.62 |
|                    26 |                    4.65 |                           4.04 |             0.61 |
|                    27 |                    4.47 |                           4.27 |             0.2  |
|                    28 |                    4.83 |                           4.39 |             0.44 |
|                    29 |                    4.78 |                           4.48 |             0.3  |
|                    30 |                    4.93 |                           4.8  |             0.13 |
|                    31 |                    5.3  |                           4.92 |             0.38 |
|                    32 |                    5.1  |                           5.04 |             0.06 |
|                    33 |                    5.79 |                           5.34 |             0.45 |
|                    34 |                    5.82 |                           5.54 |             0.28 |
|                    35 |                    6.04 |                           5.52 |             0.52 |
|                    36 |                    5.67 |                           5.7  |            -0.03 |
|                    37 |                    6.64 |                           5.98 |             0.66 |
|                    38 |                    6.4  |                           6.03 |             0.37 |
|                    39 |                    6.72 |                           6.41 |             0.31 |
|                    40 |                    7.07 |                           6.46 |             0.61 |
|                    41 |                    7.04 |                           6.59 |             0.45 |
|                    42 |                    7.18 |                           6.81 |             0.37 |
|                    43 |                    7.08 |                           6.9  |             0.18 |
|                    44 |                    7.61 |                           7.09 |             0.52 |
|                    45 |                    7.72 |                           7.21 |             0.51 |
|                    46 |                    7.73 |                           7.66 |             0.07 |
|                    47 |                    8    |                           7.68 |             0.32 |
|                    48 |                    8.41 |                           7.7  |             0.71 |
|                    49 |                    8.57 |                           7.93 |             0.64 |
|                    50 |                    8.54 |                           8.11 |             0.43 |
|                    51 |                    8.16 |                           8.31 |            -0.15 |
|                    52 |                    9    |                           8.4  |             0.6  |
|                    53 |                    9.01 |                           8.79 |             0.22 |
|                    54 |                    8.98 |                           8.88 |             0.1  |
|                    55 |                    9.54 |                           8.99 |             0.55 |
|                    56 |                    9.4  |                           9.13 |             0.27 |
|                    57 |                    9.75 |                           9.08 |             0.67 |
|                    58 |                   10.1  |                           9.42 |             0.68 |
|                    59 |                    9.71 |                           9.64 |             0.07 |
|                    60 |                   10.31 |                           9.87 |             0.44 |
|                    61 |                   10.19 |                          10.31 |            -0.12 |
|                    62 |                   10.29 |                          10.21 |             0.08 |
|                    63 |                   10.76 |                          10.55 |             0.21 |
|                    64 |                   10.82 |                          10.48 |             0.34 |
|                    65 |                   10.7  |                          10.77 |            -0.07 |
|                    66 |                   11.27 |                          10.94 |             0.33 |
|                    67 |                   11.81 |                          11.06 |             0.75 |
|                    68 |                   11.54 |                          11.34 |             0.2  |
|                    69 |                   11.98 |                          11.26 |             0.72 |
|                    70 |                   12.26 |                          11.66 |             0.6  |
|                    71 |                   11.35 |                          12.01 |            -0.66 |
|                    72 |                   12.03 |                          11.64 |             0.39 |
|                    73 |                   12.37 |                          11.94 |             0.43 |
|                    74 |                   12.74 |                          12.16 |             0.58 |
|                    75 |                   13.16 |                          12.66 |             0.5  |
|                    76 |                   12.77 |                          12.59 |             0.18 |
|                    77 |                   13.1  |                          12.65 |             0.45 |
|                    78 |                   13.2  |                          12.95 |             0.25 |
|                    79 |                   13.59 |                          13.06 |             0.53 |
|                    80 |                   13.32 |                          13.27 |             0.05 |
|                    81 |                   13.74 |                          13.25 |             0.49 |
|                    82 |                   13.98 |                          13.47 |             0.51 |
|                    83 |                   14.86 |                          13.74 |             1.12 |
|                    84 |                   14.53 |                          13.8  |             0.73 |
|                    85 |                   14.54 |                          14.29 |             0.25 |
|                    86 |                   14.49 |                          14.22 |             0.27 |
|                    87 |                   15.03 |                          14.46 |             0.57 |
|                    88 |                   15.49 |                          14.55 |             0.94 |
|                    89 |                   15.51 |                          14.8  |             0.71 |
|                    90 |                   15.58 |                          15    |             0.58 |
|                    91 |                   16.17 |                          15.01 |             1.16 |
|                    92 |                   14.89 |                          15.43 |            -0.54 |
|                    93 |                   15.73 |                          15.4  |             0.33 |
|                    94 |                   16.16 |                          15.67 |             0.49 |
|                    95 |                   16.11 |                          16.17 |            -0.06 |
|                    96 |                   16.03 |                          16.17 |            -0.14 |
|                    97 |                   16.41 |                          16.07 |             0.34 |
|                    98 |                   17.26 |                          16.27 |             0.99 |
|                    99 |                   15.65 |                          16.7  |            -1.05 |
|                   100 |                   16.15 |                          16.96 |            -0.81 |

Thursday, May 8, 2014

Classification of the Hyper-Spectral and LiDAR Imagery using R (mostly). Part 3: Shadow Removal

This is my third post in this series; here are the links to Part 1 and Part 2. Sorry for the delay, but at that time I got completely consumed by my PhD thesis and this story just slipped my mind.

To the point. There were 2 types of shadows I had to deal with: cloud shadows and cast shadows (from buildings and trees). In the scientific literature you can find several relatively complicated algorithms to deal with them (including the use of a DEM for cast shadow detection), and I definitely should have used some of them, but... I was too lazy to implement them when a couple of quick'n'dirty workarounds were available.

Cloud shadow removal

To remove cloud shadows I decided to calculate, for each band, the ratio between the mean pixel values of the sunlit and shadowed parts of the raster, and to multiply the shadowed pixels by this ratio. I manually created a mask for the shadowed areas. There were 3 such areas and their borders were quite distinct in the NIR channels:

Cloud shadows in one of the NIR bands

I drew several polygons over the shadows in QGIS and converted them to a raster (Raster -> Conversion -> Rasterise). Now everything was ready for patching:

library(raster)
library(rgdal)

# load rasters
image <- stack('2013_IEEE_GRSS_DF_Contest_CASI.tif') # hyper spectral imagery
s_mask <- raster('shadow_mask.tif') # mask for cloud shadow

# extract shadowed area
m_cloud <- s_mask
m_cloud[m_cloud < 1] <- NA
shadowed <- mask(image, m_cloud)

# calculate zonal statistics
zd <- zonal(image, s_mask, fun = 'mean', na.rm = T)
zd <- as.data.frame(zd)

# calculate ratio between lighted up and shadowed zones
zd[3,] <- zd[1,] / zd[2,] 

# recalculate data in the shadows
multiplyer <- zd[3,]
multiplyer <- multiplyer[,2:ncol(zd)]
multiplyer <- as.numeric(multiplyer)
enlight <- shadowed*multiplyer # patch layer
restored <- cover(enlight, image, progress = 'text')

# save result
wr <- writeRaster(restored, 
                  filename = 'restored.tif', 
                  format = 'GTiff', overwrite = TRUE)


Now it's time to check out results:

The same NIR band after cloud shadow removal
Yes, we can easily spot the patched areas, but the overall result is good for such a simple approach.

Cast shadow removal

Shadows from buildings can be processed the same way as the cloud shadows shown above, except that the mask creation approach is different. Cast shadows have to be detected automatically (there are a lot more of them than the 3 cloud shadows). Using one of the NIR bands, I created a training sample of points and passed it to the Random Forest algorithm to classify the imagery into shadowed and non-shadowed areas. I will not describe this classification process here, since Random Forest was used for the overall classification of the imagery, and that is the subject of the next post in this series.
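
The classification itself is the subject of the next post, but the general idea can be sketched roughly like this (an illustrative sketch only, not the actual code from that post; the training layer name and its 'shadow' field are invented for the example):

library(raster)
library(rgdal)
library(randomForest)

# hyperspectral imagery and a hand-digitised training point layer (hypothetical names)
image <- stack('2013_IEEE_GRSS_DF_Contest_CASI.tif')
train <- readOGR('.', 'shadow_training')   # points with a 0/1 'shadow' attribute

# sample the bands at the training points
samples <- extract(image, train, df = TRUE)
samples$shadow <- as.factor(train$shadow)

# fit a Random Forest classifier on the sampled spectra (drop the ID column)
rf <- randomForest(shadow ~ ., data = samples[ , -1], ntree = 500)

# predict a shadow / non-shadow raster for the whole scene
cast_shadow_mask <- predict(image, rf, progress = 'text')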

Monday, April 21, 2014

Unify Extent and Resolution Script Updated

The script for unifying extent and resolution is now able to work with multi-band rasters. I also fixed an error that caused output rasters to have different resolutions in some cases.
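
I haven't reproduced the script here, but the core of such a workflow with the raster package can be sketched like this (an illustration only, not the actual script; the file names are placeholders):

library(raster)

# two hypothetical rasters with different extents and resolutions
a <- raster('raster_a.tif')
b <- stack('raster_b.tif')      # multi-band rasters are handled the same way

# common extent = the intersection of the two extents
common <- intersect(extent(a), extent(b))
a_crop <- crop(a, common)
b_crop <- crop(b, common)

# bring the second raster onto the grid (resolution and origin) of the first
b_unified <- resample(b_crop, a_crop, method = 'bilinear')

writeRaster(a_crop, filename = 'a_unified.tif', format = 'GTiff', overwrite = TRUE)
writeRaster(b_unified, filename = 'b_unified.tif', format = 'GTiff', overwrite = TRUE)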