Misanthrope's Thoughts

What is the cause of the degradation of the environment?
Capitalism, corruption, consumer society? - OVERPOPULATION!
Please, save the Planet - kill yourself...

Sunday, August 31, 2014

Donetsk Label at OpenStreetMap

Now that Donetsk has become a hot spot of the civil war in Ukraine, I decided to take a glance at the map of the war zone. Of course I used OSM for this. But although I knew the approximate location of the city, I couldn't find it for several minutes. I was confused... The reason was that the "Donetsk" label (the label of the city that is the administrative centre of the Donetsk region, BTW) at scales smaller than 10 km is suppressed by the label of the adjacent town of Makeevka. WTF?!

"Donetsk" ("Донецьк") label at OSM is suppressed by the label of adjusted town

Thursday, May 29, 2014

Random VS Pseudo Random PROC, or Why Axe's Win Rate Dropped in the 6.81 Patch (Dota 2)

Introduction

Here is an interesting post about the influence of the 6.81 patch on hero win rates in Dota 2. One point caught my attention: a drastic change in Axe's win rate, from 52% to 48%; see the image below.


Axe's win rate before and after the 6.81 patch (29th of April)
The only change Axe got in this patch was the switch from a true random to a pseudo random approach for determining whether the Counter Helix skill triggers. As was guessed in the article:
"...where you manage to Berserker's Call several enemies, you only have 3.2 seconds of being unconditionally hit by the enemy. In this time frame, the amount of successful Counter Helixes that use PRD is probably lower than with a true random. Hence the decrease in damage output and Win Rate." 
OK, nice guess! Let's test it!

Create a simulation to test the hypothesis

Import needed modules
import random
import numpy
from tabulate import tabulate
import collections
Let's create functions to calculate the number of Counter Helix PROCs for a given number of hits using the random and the pseudo random approaches. The first one uses the random approach (notice that random.randrange() actually generates a pseudo random set of values anyway).

def randHelixProc(hits):
  '''
  checks if Counter Helix PROCs (chance = 17%) using random approach
  @param hits: number of times Axe was hit
  @return procs: number of times Counter Helix PROCs
  '''
  procs = 0
  for i in xrange(hits):
    chance = random.randrange(1, 101) # uniform integer on 1..100
    if chance < 18: # 17 outcomes out of 100 = 17%
      procs += 1
  return procs

I don't know how exactly pseudo random PROCs are designed for Axe or for Dota 2 in general, but they say that a pseudo random event in Dota 2 starts with a lower chance to proc than the stated probability, and each time the event fails to trigger when it could, the chance of it occurring is increased. Let's say some event triggers with a 50% chance. The suggested pseudo random approach can be realised with 3 consecutive checks that have different chances: 0%, 50% and 100% (the average is 50%). When the event finally triggers, the chance is reset to its initial value (0% in this case) and it all starts over again.
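To make the toy example concrete, here is a tiny sketch of such an escalating-chance check (my own illustration of the idea above, not Valve's actual implementation):

def toyPRDCheck(state):
  '''
  toy pseudo random check for a nominal 50% event
  @param state: number of consecutive failed checks so far
  @return: (procced, new_state)
  '''
  chances = [0, 50, 100] # average is 50%, but no two procs in a row
  if random.randrange(100) < chances[state]:
    return True, 0 # proc: reset to the lowest chance
  return False, state + 1 # miss: escalate the chance for the next check

The real function for Counter Helix is the same idea with a longer list of escalating chances: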
def pseudoRandHelixProc(hits):
  '''
  checks if Counter Helix PROCs (chance = 17%) using pseudo random approach
  @param hits: number of times Axe was hit
  @return procs: number of times Counter Helix PROCs
  '''
  threshold = 0
  prob_list = [2, 5, 9, 14, 23, 47] # escalating thresholds, ~17% on average
  procs = 0
  for i in xrange(hits):
    chance = random.randrange(1, 101)
    try:
      success = prob_list[threshold]
    except IndexError: # past the end of the list: stay at the highest chance
      success = prob_list[-1]
    if chance >= success:
      threshold += 1 # miss: escalate the chance for the next hit
    else:
      procs += 1
      threshold = 0 # proc: reset to the initial (lowest) chance
  return procs
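Before running any simulations we can sanity-check the '~17% on average' comment in prob_list analytically. A hit procs when chance < success, i.e. with probability (success - 1)/100, so the long-run proc rate is one over the expected number of hits per proc. A quick back-of-the-envelope check (my own addition, using the numbers above):

# analytic long-run proc rate implied by prob_list: 1 / E[hits per proc]
p_list = [(s - 1) / 100.0 for s in [2, 5, 9, 14, 23, 47]]
survive = 1.0  # probability that no proc has happened yet
expected = 0.0 # E[hits per proc] = sum over k of P(no proc in the first k hits)
for k in xrange(200): # the tail is geometric and dies out long before 200 hits
  expected += survive
  survive *= 1.0 - p_list[min(k, len(p_list) - 1)]
print 'analytic proc rate:', 1.0 / expected # ~0.17, as stated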
Let's also check empirically that the proc chances are the same for both functions. We should get about 1700 procs of Counter Helix for 10000 attacks on Axe. Launch 100 simulations of 10000 hits for each function and compute the average number of procs for the random and pseudo random approaches.
rhelix_list = []
for i in xrange(100):
  value = randHelixProc(10000)
  rhelix_list.append(value)
numpy.mean(rhelix_list)

>>>1702.79

p_rhelix_list = []
for i in xrange(100):
  value = pseudoRandHelixProc(10000)
  p_rhelix_list.append(value)
numpy.mean(p_rhelix_list)

>>>1702.3
The output difference between the random and pseudo random implementations is negligible.
Now it's time to create a simulation function.
def simulation(times, hits):
  '''
  Computes average number of PROCs for given hits for random and pseudo random
  approaches and difference between them.

  @param times: number of times simulation will run
  @param hits: number of hits to simulate
  @return: average difference in procs between random and pseudo random approaches;
           table of simulation results
  '''
  # create lists of results
  diff_list = []
  rand_mean_list = []
  p_rand_mean_list = []

  # run simulation
  for hit in xrange(1, hits + 1):

    rand_list = []
    pseudo_rand_list = []

    for t in xrange(times):
      rand = randHelixProc(hit)
      p_rand = pseudoRandHelixProc(hit)
      rand_list.append(rand)
      pseudo_rand_list.append(p_rand)

    # compute statistics and populate lists of results
    rand_mean = numpy.mean(rand_list)
    rand_mean_list.append(rand_mean)
    p_rand_mean = numpy.mean(pseudo_rand_list)
    p_rand_mean_list.append(p_rand_mean)
    diff = rand_mean - p_rand_mean
    diff = round(diff, 2)
    diff_list.append(diff)

  # print average difference in PROCs
  total_diff = sum(diff_list)
  l = len(diff_list)
  print 'average difference:', total_diff/l
  print '#######################################################################################################'

  # create table output for simulation results
  out_dict = {}
  out_dict['(1) cumulative hits'] = range(1, l + 1)
  out_dict['(2) random mean PROCs'] = rand_mean_list
  out_dict['(3) pseudo random mean PROCs'] = p_rand_mean_list
  out_dict['(4) difference'] = diff_list
  out_dict = collections.OrderedDict(sorted(out_dict.items()))
  print tabulate(out_dict, headers = 'keys', tablefmt="orgtbl")
Let's run 100 simulations of 100 consecutive hits. Of course, it is possible to run 10000 hits, but that is unnecessary: every time Counter Helix triggers we jump back to the first row of the table below (the result of the simulation). As was already mentioned:
"[if] ...you manage to Berserker's Call several enemies, you only have 3.2 seconds of being unconditionally hit by the enemy"
If Axe manages to catch 3 enemies, together they will hit him about 9-12 times. This means that with the pseudo random approach he will spin 0.3-0.4 fewer times than with the random approach, which will cause him to deal roughly (205 * 0.35) * 3 = 215.25 less damage per engagement (or 71.75 per hero).

So the hypothesis was true: the pseudo random approach causes a decrease in damage output over short time intervals even though the total number of PROCs during the game stays the same.
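For reference, the output below was produced by a call along the lines of:

simulation(100, 100)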
average difference: 0.34
#######################################################################################################
|   (1) cumulative hits |   (2) random mean PROCs |   (3) pseudo random mean PROCs |   (4) difference |
|-----------------------+-------------------------+--------------------------------+------------------|
|                     1 |                    0.17 |                           0    |             0.17 |
|                     2 |                    0.45 |                           0.09 |             0.36 |
|                     3 |                    0.56 |                           0.12 |             0.44 |
|                     4 |                    0.71 |                           0.26 |             0.45 |
|                     5 |                    0.81 |                           0.36 |             0.45 |
|                     6 |                    1.06 |                           0.75 |             0.31 |
|                     7 |                    1.37 |                           0.88 |             0.49 |
|                     8 |                    1.35 |                           0.97 |             0.38 |
|                     9 |                    1.47 |                           1.22 |             0.25 |
|                    10 |                    1.64 |                           1.33 |             0.31 |
|                    11 |                    1.95 |                           1.65 |             0.3  |
|                    12 |                    2.1  |                           1.68 |             0.42 |
|                    13 |                    2.36 |                           1.79 |             0.57 |
|                    14 |                    2.56 |                           2.16 |             0.4  |
|                    15 |                    2.61 |                           2.26 |             0.35 |
|                    16 |                    2.61 |                           2.37 |             0.24 |
|                    17 |                    2.87 |                           2.43 |             0.44 |
|                    18 |                    2.77 |                           2.83 |            -0.06 |
|                    19 |                    3.22 |                           2.84 |             0.38 |
|                    20 |                    3.03 |                           2.84 |             0.19 |
|                    21 |                    3.6  |                           3.22 |             0.38 |
|                    22 |                    3.71 |                           3.32 |             0.39 |
|                    23 |                    3.68 |                           3.68 |             0    |
|                    24 |                    3.93 |                           3.72 |             0.21 |
|                    25 |                    4.54 |                           3.92 |             0.62 |
|                    26 |                    4.65 |                           4.04 |             0.61 |
|                    27 |                    4.47 |                           4.27 |             0.2  |
|                    28 |                    4.83 |                           4.39 |             0.44 |
|                    29 |                    4.78 |                           4.48 |             0.3  |
|                    30 |                    4.93 |                           4.8  |             0.13 |
|                    31 |                    5.3  |                           4.92 |             0.38 |
|                    32 |                    5.1  |                           5.04 |             0.06 |
|                    33 |                    5.79 |                           5.34 |             0.45 |
|                    34 |                    5.82 |                           5.54 |             0.28 |
|                    35 |                    6.04 |                           5.52 |             0.52 |
|                    36 |                    5.67 |                           5.7  |            -0.03 |
|                    37 |                    6.64 |                           5.98 |             0.66 |
|                    38 |                    6.4  |                           6.03 |             0.37 |
|                    39 |                    6.72 |                           6.41 |             0.31 |
|                    40 |                    7.07 |                           6.46 |             0.61 |
|                    41 |                    7.04 |                           6.59 |             0.45 |
|                    42 |                    7.18 |                           6.81 |             0.37 |
|                    43 |                    7.08 |                           6.9  |             0.18 |
|                    44 |                    7.61 |                           7.09 |             0.52 |
|                    45 |                    7.72 |                           7.21 |             0.51 |
|                    46 |                    7.73 |                           7.66 |             0.07 |
|                    47 |                    8    |                           7.68 |             0.32 |
|                    48 |                    8.41 |                           7.7  |             0.71 |
|                    49 |                    8.57 |                           7.93 |             0.64 |
|                    50 |                    8.54 |                           8.11 |             0.43 |
|                    51 |                    8.16 |                           8.31 |            -0.15 |
|                    52 |                    9    |                           8.4  |             0.6  |
|                    53 |                    9.01 |                           8.79 |             0.22 |
|                    54 |                    8.98 |                           8.88 |             0.1  |
|                    55 |                    9.54 |                           8.99 |             0.55 |
|                    56 |                    9.4  |                           9.13 |             0.27 |
|                    57 |                    9.75 |                           9.08 |             0.67 |
|                    58 |                   10.1  |                           9.42 |             0.68 |
|                    59 |                    9.71 |                           9.64 |             0.07 |
|                    60 |                   10.31 |                           9.87 |             0.44 |
|                    61 |                   10.19 |                          10.31 |            -0.12 |
|                    62 |                   10.29 |                          10.21 |             0.08 |
|                    63 |                   10.76 |                          10.55 |             0.21 |
|                    64 |                   10.82 |                          10.48 |             0.34 |
|                    65 |                   10.7  |                          10.77 |            -0.07 |
|                    66 |                   11.27 |                          10.94 |             0.33 |
|                    67 |                   11.81 |                          11.06 |             0.75 |
|                    68 |                   11.54 |                          11.34 |             0.2  |
|                    69 |                   11.98 |                          11.26 |             0.72 |
|                    70 |                   12.26 |                          11.66 |             0.6  |
|                    71 |                   11.35 |                          12.01 |            -0.66 |
|                    72 |                   12.03 |                          11.64 |             0.39 |
|                    73 |                   12.37 |                          11.94 |             0.43 |
|                    74 |                   12.74 |                          12.16 |             0.58 |
|                    75 |                   13.16 |                          12.66 |             0.5  |
|                    76 |                   12.77 |                          12.59 |             0.18 |
|                    77 |                   13.1  |                          12.65 |             0.45 |
|                    78 |                   13.2  |                          12.95 |             0.25 |
|                    79 |                   13.59 |                          13.06 |             0.53 |
|                    80 |                   13.32 |                          13.27 |             0.05 |
|                    81 |                   13.74 |                          13.25 |             0.49 |
|                    82 |                   13.98 |                          13.47 |             0.51 |
|                    83 |                   14.86 |                          13.74 |             1.12 |
|                    84 |                   14.53 |                          13.8  |             0.73 |
|                    85 |                   14.54 |                          14.29 |             0.25 |
|                    86 |                   14.49 |                          14.22 |             0.27 |
|                    87 |                   15.03 |                          14.46 |             0.57 |
|                    88 |                   15.49 |                          14.55 |             0.94 |
|                    89 |                   15.51 |                          14.8  |             0.71 |
|                    90 |                   15.58 |                          15    |             0.58 |
|                    91 |                   16.17 |                          15.01 |             1.16 |
|                    92 |                   14.89 |                          15.43 |            -0.54 |
|                    93 |                   15.73 |                          15.4  |             0.33 |
|                    94 |                   16.16 |                          15.67 |             0.49 |
|                    95 |                   16.11 |                          16.17 |            -0.06 |
|                    96 |                   16.03 |                          16.17 |            -0.14 |
|                    97 |                   16.41 |                          16.07 |             0.34 |
|                    98 |                   17.26 |                          16.27 |             0.99 |
|                    99 |                   15.65 |                          16.7  |            -1.05 |
|                   100 |                   16.15 |                          16.96 |            -0.81 |

Thursday, May 8, 2014

Classification of the Hyper-Spectral and LiDAR Imagery using R (mostly). Part 3: Shadow Removal

This is the third post in this series; here are the links to Part 1 and Part 2. Sorry for the delay: at that time I got completely consumed by my PhD thesis and this story just slipped my mind.

To the point. There were 2 types of shadows I had to deal with: cloud shadows and cast shadows (from buildings and trees). In the scientific literature you can find several relatively complicated algorithms to deal with them (including the use of a DEM for cast shadow detection) and I definitely should have used some of them, but... I was too lazy to implement them when there were a couple of quick'n'dirty workarounds available.

Cloud shadow removal

To remove cloud shadows I decided to calculate, for each band, the ratio between the mean pixel values of the sunlit and shadowed parts of the raster and multiply the shadowed pixels by this ratio. I manually created a mask for the shadowed areas. There were 3 of them, and their borders were quite distinct in the NIR channels:

Cloud shadows in one of the NIR bands

I drew several polygons over the shadows in QGIS and converted them to a raster (Raster -> Conversion -> Rasterise). Now everything was ready for patching:

library(raster)
library(rgdal)

# load rasters
image <- stack('2013_IEEE_GRSS_DF_Contest_CASI.tif') # hyper spectral imagery
s_mask <- raster('shadow_mask.tif') # mask for cloud shadow

# extract shadowed area
m_cloud <- s_mask
m_cloud[m_cloud < 1] <- NA
shadowed <- mask(image, m_cloud)

# calculate zonal statistics
zd <- zonal(image, s_mask, fun = 'mean', na.rm = T)
zd <- as.data.frame(zd)

# calculate ratio between sunlit and shadowed zones
zd[3,] <- zd[1,] / zd[2,]

# recalculate data in shadows
multiplier <- zd[3,]
multiplier <- multiplier[, 2:ncol(zd)] # drop the zone column
multiplier <- as.numeric(multiplier)
enlight <- shadowed * multiplier # patch layer
restored <- cover(enlight, image, progress = 'text')

# save result
wr <- writeRaster(restored,
                  filename = 'restored.tif',
                  format = 'GTiff', overwrite = TRUE)


Now it's time to check out results:

The same NIR band after cloud shadow removal
Yes, we can easily spot the patched areas, but the overall result is good for such a simple approach.

Cast shadow removal

Shadows from buildings can be processed the same way as cloud shadows, except that the mask creation approach is different: cast shadows have to be detected automatically (there are far more of them than the 3 areas we had with the clouds). Using one of the NIR bands, I created a training sample of points and passed it to the Random Forest algorithm to classify the imagery into shadowed and non-shadowed areas. I will not describe the classification process here, since Random Forest was used for the overall classification of the imagery and is the subject of the next post in this series.

Monday, April 21, 2014

Unify Extent and Resolution Script Updated

The script for unifying extent and resolution is now able to work with multi-band rasters. I also fixed an error that caused output rasters to have different resolutions in some cases.

Sunday, March 16, 2014

Analysis of the Time Aspect of the Matches at The International 3 (Dota 2 Tournament)

It's been a while since I analysed something with R. I decided to dig into Dota 2 statistics. This game is a real eSport now, with players fighting for respectable prize pools. And the main Dota 2 event is The International, where the best teams compete with each other in astonishing battles.

So I grabbed some stats on The International 3 from here and started to think about what I could do with it... One major aspect of the game that was actively discussed prior to the tournament was that the Dire side has an advantage over the Radiant by having the Roshan pit in their possession. Many people still think that the Dire side is preferable in competitive games (they say that Roshan is killed by the Dire in 70% of cases). In fact, The Alliance (the TI3 winner) played something like 90% of their games on the Dire side at the tournament. But is that valid proof of the Dire side's advantage? I doubt it. I think that, on the contrary, the Radiant side has an advantage over the Dire, and I will add my arguments after I prove I'm right.

OK, here is my hypothesis. There is no time limit for the game: a match in Dota 2 lasts until the main building of one of the sides is destroyed (or one of the teams gives up). So if one of the sides has an advantage (all other things being equal), the median time needed to win a game for this side should be lower than for the other (and vice versa for the time to lose). So let's take a look at the boxplot below:

Dire vs Radiant winning time comparison

Code:
library(ggplot2)

TI3 <- read.csv("~/R/Dota2_Stats/The_International-3")

# create winning time per side plot (column 5 = winning side, column 8 = duration)
p <- ggplot(TI3, aes(TI3[,5], TI3[,8])) +
  geom_boxplot(aes(fill = TI3[,5])) +
  labs(title = 'Winning time per side at TI-3',
       x = 'Side',
       y = 'Time, min.',
       fill = 'Side')
  
print(p)
Clearly, the Radiant side generally wins slightly more quickly than the Dire (and has a higher number of wins: 82 against 76). This means that not the Dire but the Radiant team has an advantage in the game. But why? (You may skip the rest of the paragraph if you have never played Dota.) There are several reasons. The Radiant advantages are: easier rune control, the ability to deny creeps at the hard lane, and the camera angle (it is not parallel to the terrain surface and faces north, towards the Dire side). The camera angle was never discussed as an advantage/disadvantage because most people just got used to it, but the Radiant has a slight but sure vision advantage. It seems Roshan accessibility and a safer ancient camp do not help the Dire that much.

What else can we do with time analysis? We can compare win and loss times for all the teams that competed at TI3:
Teams' winning and losing time comparison
Code:
# get the list of teams
Teams <- unique(TI3[,3])

# create dataset to store data about winning/losing times
A_time <- data.frame('Team'= character(),
                     'Result' = character(),
                     'Time' = numeric(),
                     stringsAsFactors = FALSE)

# extract time data and write it to the A_time data frame
for (team in Teams) {
  A <- subset(TI3, TI3[,3] == team | TI3[,4] == team)

  for (i in 1:nrow(A)) {
    winner <- A[i, 5]   # winning side
    dire <- A[i, 4]     # team playing Dire
    radiant <- A[i, 3]  # team playing Radiant
    time <- A[i, 8]     # match duration, min.
    if ((winner == 'DIRE' & dire == team) | (winner == 'RADIANT' & radiant == team)) {
      result <- paste(team, 'WIN')
    }
    else {
      result <- paste(team, 'LOSS')
    }
    A_time[nrow(A_time) + 1, ] <- c(team, result, time)
  }
}

# create plot for winning time per team
p <- ggplot(A_time, aes(A_time[,2], A_time[,3])) +
     geom_boxplot(aes(fill = A_time[,1]))+
     theme(axis.text.x = element_text(angle=90, hjust = 0),
          axis.title = element_text(face = 'bold', size = 14),
          title = element_text(face = 'bold', size = 16),
          legend.position = 'right'
          ) +
     labs(title = 'Win and loss times at The International 3',
          x = 'Results',
          y = 'Time, min.',
          fill = 'Teams')
  
print(p)

Generally my assumption was correct: better teams win quicker. Alliance and NaVi (1st and 2nd places) are a nice confirmation of it (DK and IG (the TI-2 champion) show a similar pattern as well, despite sharing 5th place). But Orange and TongFu (3rd and 4th places) tended to lose quicker than they won. This could be explained by the general playstyle of these two Asian teams, which often aims at the late game. This infamous style, with its prolonged no-action farming stage, is often referred to as 'Chinese Dota'. But DK and IG are Chinese teams too. It seems that both TongFu and Orange were able to overcome the odds and jump over their heads at this tournament, taking the places that DK and IG should have got (DK and IG were rated higher than Orange and TongFu before the tournament).

Monday, February 3, 2014

About Corruption and Economical Health of Countries

A quite interesting article about corruption levels in EU countries was recently published by the BBC. Of course, the map is the most interesting part.


The thing that I noticed the very first moment I looked at it is that the countries with the highest corruption levels have the lowest credit ratings (see this interactive map).


When will all these bastards understand that corruption hurts everyone?

Saturday, January 4, 2014

Unifying Extent and Resolution of Rasters Using Processing Framework in QGIS

My post about the modification of extent and resolution of rasters drew quite a bit of attention, so I decided to make a small New Year's present to the community and create a QGIS Processing script to automate the process.

The script was designed to be used within the Processing module of QGIS. It makes two or more rasters of your choice have the same spatial extent and pixel resolution, so you will be able to use them simultaneously in the raster calculator. No interpolation is performed: new pixels get a predefined value. Here is a simple illustration of what it does:
Modifications to rasters A and B
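For the curious, the core idea can be sketched in a few lines of NumPy. This is a conceptual illustration only (the actual script goes through the QGIS/GDAL machinery); the function name, the (x, y) top-left origins and the fill value are made up for the example, and it assumes both rasters already share one pixel size:

import numpy as np

def unifyPair(a, a_origin, b, b_origin, pixel, fill):
  '''
  pads arrays a and b to their common (union) extent; added pixels get fill
  @param a_origin, b_origin: (x, y) map coordinates of each raster's top-left corner
  @param pixel: pixel size, assumed equal for both rasters (no resampling here)
  @return: list of two arrays covering the same extent
  '''
  # the union extent: leftmost/topmost corner to rightmost/bottommost corner
  left = min(a_origin[0], b_origin[0])
  top = max(a_origin[1], b_origin[1])
  right = max(a_origin[0] + a.shape[1] * pixel, b_origin[0] + b.shape[1] * pixel)
  bottom = min(a_origin[1] - a.shape[0] * pixel, b_origin[1] - b.shape[0] * pixel)
  rows = int(round((top - bottom) / pixel))
  cols = int(round((right - left) / pixel))

  result = []
  for data, (x, y) in ((a, a_origin), (b, b_origin)):
    r = int(round((top - y) / pixel))  # row offset inside the union extent
    c = int(round((x - left) / pixel)) # column offset
    canvas = np.full((rows, cols), fill, dtype=data.dtype)
    canvas[r:r + data.shape[0], c:c + data.shape[1]] = data
    result.append(canvas)
  return result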


To use my humble gift, simply download this archive and unpack the files from the 'scripts' folder to your .../.qgis2/processing/scripts folder (or whatever folder you configured in the Processing settings). At the next start of QGIS you will find the 'Unify extent and resolution' script in the 'Processing Toolbox', in 'Scripts' under the 'Raster processing' category:

If you launch it you will see this dialogue:
Main window


Note that 'Help' is available:
Help tab
Let's describe the parameters. 'rasters' are the rasters that you want to unify. They must have the same CRS. Note that the output rasters will have the same pixel resolution as the first raster in the input.
Raster selection window
'replace No Data values with' provides the value for the pixels that will be added to the rasters; original No Data values are replaced with this value as well. Note that in the output rasters this value will not be saved as a No Data value, but as a regular one. This is done to ease future calculations involving both of these rasters, but I'm open to suggestions, and if you think a No Data attribute should be assigned I can add the corresponding functionality.

Finally, in the 'output directory' field you must provide a path to the folder where the output rasters will be stored. A '_unified' postfix will be added to the derived rasters' file names: 'raster_1.tif' -> 'raster_1_unified.tif'.

If the CRSs of the rasters don't match (and you bypass the built-in check) or an invalid path is provided, a warning message will pop up and the execution will be cancelled:
Example of the warning message
When the execution is finished you will be notified and asked whether the rasters should be added to the TOC:


Happy New Year!

P.S. I would like to thank the Australian government for making the code they create open source. Their kindness saved me a couple of hours.