Wednesday, 25 February 2015

Count Occurrences of Factor in R

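# Example data: 'type' has three levels, but level 'C' never occurs
events <- data.frame(type = factor(c('A', 'A', 'B'), levels = c('A', 'B', 'C')),
                     quantity = c(1, 2, 1))

# Method 1: count rows per level (unused level 'C' is reported as 0)
table(events$type)

# Method 2: sum quantity per level (unused level 'C' is reported as 0)
xtabs(quantity ~ type, events)

# Method 3: sum quantity per level (unused level 'C' is omitted)
aggregate(quantity ~ type, events, FUN = sum)

# Method 4: sum quantity per level with plyr; .drop = FALSE keeps level 'C'
library(plyr)
ddply(events, .(type), summarise, quantity = sum(quantity), .drop = FALSE)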
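
Method 1 counts rows while the other methods sum quantity, so it may help to see both results side by side. The following is a minimal base R sketch on the same example data; the names counts, sums and the column name n are illustrative and not part of the original post.

```r
# Same example data as above; level 'C' is declared but never observed
events <- data.frame(type = factor(c('A', 'A', 'B'), levels = c('A', 'B', 'C')),
                     quantity = c(1, 2, 1))

# Occurrence counts per level, returned as a data frame
# (the column name "n" is an illustrative choice)
counts <- as.data.frame(table(type = events$type), responseName = "n")

# Quantity sums per level, for comparison with the counts
sums <- xtabs(quantity ~ type, events)

print(counts)
print(sums)
```

Level 'A' occurs twice but its quantities add up to 3, which is the difference between counting occurrences and summing a column.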


Thursday, 19 February 2015

Adjacency Matrix is an amazing Tool for Analysing Networks

I work a lot with adjacency matrices, a way of representing a relationship network. An adjacency matrix turns a network graph into a well-developed mathematical form, which opens the door to analysing networks with rigorous mathematical tools. The more I use it, the more useful I find it.

1. Beyond its most fundamental function of indicating which nodes link to which, with or without direction, the matrix can also record how important each link is, by storing weight values in the corresponding elements.

2. Elementary transformations of the adjacency matrix produce corresponding changes in the network, such as reordering the nodes or merging the relationships of two nodes.

3. One can assign attributes to nodes by multiplying the matrix (by row or column) with a vector storing the attributes.
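
The three points above can be sketched in base R. The network, node names, weights and attribute values below are all hypothetical. Reading point 3 as element-wise scaling of rows or columns is an assumption about what the multiplication means.

```r
# Hypothetical 3-node directed network: A[i, j] is the weight of the link i -> j
nodes <- c("a", "b", "c")
A <- matrix(c(0, 2, 0,
              0, 0, 1,
              3, 0, 0),
            nrow = 3, byrow = TRUE, dimnames = list(nodes, nodes))

# 1. Weights: a non-zero entry marks a link, its value marks the importance
A["a", "b"]

# 2a. Reorder nodes: apply the same permutation to rows and columns
ord <- c("c", "a", "b")
A_reordered <- A[ord, ord]

# 2b. Merge nodes b and c: M maps each original node to its merged group
M <- matrix(c(1, 0,
              0, 1,
              0, 1),
            nrow = 3, byrow = TRUE, dimnames = list(nodes, c("a", "bc")))
A_merged <- t(M) %*% A %*% M   # the b -> c link becomes a self-loop on "bc"

# 3. Node attributes (assumed reading): scale rows or columns by a vector
attr_vec <- c(10, 1, 0.5)                 # one value per node, in the order a, b, c
A_by_row <- A * attr_vec                  # row i multiplied by attr_vec[i]
A_by_col <- sweep(A, 2, attr_vec, "*")    # column j multiplied by attr_vec[j]
```

Scaling by row weights each node's outgoing links by its attribute, and scaling by column weights its incoming links. The same results can be written as diag(attr_vec) %*% A and A %*% diag(attr_vec).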

Tuesday, 3 February 2015

Dropping Factor Levels in R

  • drop.levels {gdata} # Drop unused factor levels
  • droplevels {base} # The function droplevels is used to drop unused levels from a factor or, more commonly, from factors in a data frame.
  • When creating a data frame or importing data into one (with read.table, read.csv or the like), character vectors are converted to factors by default in R versions before 4.0.0. To avoid this, pass stringsAsFactors = FALSE to the function, or set options(stringsAsFactors = FALSE). Since R 4.0.0 the default is already FALSE.
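
A minimal base R sketch of the points above, using made-up data; the variable names are illustrative only.

```r
# Hypothetical factor with three observed levels
f <- factor(c('A', 'A', 'B', 'C'))

# Subsetting removes the 'C' values but keeps 'C' as an unused level
sub <- f[f != 'C']
levels(sub)

# droplevels() removes levels that no longer occur
levels(droplevels(sub))

# On a data frame, droplevels() cleans every factor column at once
df <- data.frame(type = f, quantity = 1:4)
df_sub <- droplevels(df[df$type != 'C', ])
levels(df_sub$type)

# Keep strings as character vectors when building a data frame
chars <- data.frame(name = c('x', 'y'), stringsAsFactors = FALSE)
class(chars$name)
```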