Hang's Blog

Monday, 23 June 2014

Transforming NetLogo Simulated Networks into UCINET for Analysis

NetLogo is an open-source agent-based modeling program. It's awesome, although you have to spend some time with it. A library of example models is installed along with the program, and several of them are network simulators (they'll simulate a virus in a network, diffusion on a directed network, the formation of giant components, preferential attachment, etc.).

UCINET is a social network analysis program that works with a freeware program called NetDraw, which lets you visualize networks. UCINET only runs on Windows, and you can get a free trial for 60 days. With it, you can take a network generated in NetLogo, run all sorts of network diagnostics, and alter your network image in whatever ways you like (I have yet to figure out whether there is a way to keep track of the time variable, perhaps as a link or turtle attribute, which could then be input as a node attribute in NetDraw).

They should probably be able to talk to each other, right? Then you could create rules for behavior in NetLogo, simulate a network from those rules, analyze that network or a set of random simulations, and see whether your rules create the types of networks you see in reality. That would be pretty cool, no? In the absence of knowing much about programming, here's a semi-labor-intensive way to make this happen (doubtless R or some other program can do a lot of this already):

(1) In NetLogo:
File > Models Library > Networks > Preferential Attachment > Open
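Select the Procedures tab (called the Code tab in newer NetLogo versions)
At the very top of Procedures, above the setup procedures, add "turtles-own [ my-list-partners ]" (no semicolon inside the brackets: in NetLogo a semicolon starts a comment, so it would hide the closing bracket)
In the Main Procedures section, in the "to go" procedure, add "ask turtles [ set my-list-partners ([who] of link-neighbors) ]" just above the "tick" command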
Select the Interface tab (and pray you don't see an error message)
Click Setup
Click Go and let it run until the network is the size you want
File > Export > World... > Save it somewhere
(there may be a more efficient way to create an output with just the variables of interest, but the steps below will work as well)

(2) In Excel:
Open the network's csv file you just created
Delete columns B:M
Delete rows 1:13
Delete rows below your initial ego/alter list (everything including and below "PATCHES")
Edit > Replace... > type in "[" into the "Find what:" space, leave "Replace with:" blank > Replace All > OK
type in "]" into the "Find what:" space, leave "Replace with:" blank > Replace All > OK > Close
Select column B (the commands below might depend on your version of Excel)
Data > Text to Columns... > Delimited > Next > select "Space" as the delimiter and deselect "Tab" > Next > Finish
Select the block of cells containing the data by hand
Copy
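
The Excel steps above could probably be scripted. Here is a minimal Python sketch, not a tested tool: it assumes the exported world file has a row containing only "TURTLES", followed by a header row that includes "who" and "my-list-partners", and that the turtle table ends at the first blank row. The function name and the sample text are made up for illustration, so check them against your own export.

```python
import csv
import io


def turtles_to_nodelist(world_csv_text, partners_column="my-list-partners"):
    """Return one 'ego alter alter ...' line per turtle in an exported world."""
    rows = list(csv.reader(io.StringIO(world_csv_text)))
    # Assumption: the turtle table starts right after a row whose only cell
    # is "TURTLES", with the column names on the following row.
    start = next(i for i, row in enumerate(rows) if row == ["TURTLES"])
    header = rows[start + 1]
    who_idx = header.index("who")
    partners_idx = header.index(partners_column)
    lines = []
    for row in rows[start + 2:]:
        if len(row) <= partners_idx:  # blank row or next section: table is over
            break
        # "[1 2 3]" -> ["1", "2", "3"], same as removing brackets and splitting on spaces
        alters = row[partners_idx].strip("[]").split()
        lines.append(" ".join([row[who_idx]] + alters))
    return lines


# Tiny made-up sample in the assumed layout (real exports have many more columns).
sample = '''"TURTLES"
"who","color","my-list-partners"
"0","15","[1 2]"
"1","15","[0]"
"2","15","[0]"

"PATCHES"
"pxcor","pycor"
'''
print("\n".join(turtles_to_nodelist(sample)))
```

Each printed line is an ego followed by its alters, which is the layout the pasted Excel block has, so the output could be pasted into the DL Editor in the next step instead.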

(3) In UCINET 6
Data > Data editors > DL Editor
In the DL Editor, select the data tab
Select the first cell
Paste
Under "Data format:", select "Nodelist (1-mode)"
File > Save UCINET dataset > name it and save it somewhere
Close the DL editor
Visualize > NetDraw

(4) In NetDraw (which is automatically downloaded as part of UCINET)
File > Open > Ucinet dataset > Network
Select your file, which will end with ##h
Click OK

You can play with it from there. There's also an ability to upload attributes into NetDraw (like turtle color or breed), or resize nodes according to their degree, or any of a number of other options. If your network in NetLogo has several components, UCINET can handle that too. I imagine there's some way to do a lot of the data reformatting/exporting/importing automatically. You could also probably apply a similar alteration in code to other network models within NetLogo and use the same procedures to get it into UCINET.

Best of luck! Let me know if anything interesting comes of this or if you find a more efficient way to accomplish the same thing.

Source: http://sexandstats.blogspot.ie/2011/09/network-analysis-transforming-netlogo.html

Thursday, 12 June 2014

Common Words in Spoken Dialogues

There are a number of words that are common in spoken dialogues that do not occur in written forms. This section discusses how such words should be transcribed. Where possible, we use the spelling from Quirk et al. (1985).

Filled Pauses

Filled pauses are very common in natural dialogue. There seem to be two types: ones that sound like "uh" and ones that sound like "um". The endings of these words are often prolonged, tempting transcribers to write them as "ummm". Instead, these words should be classified as either "um" or "uh" and transcribed as such. We also include "er", which is more common in British accents. Note that filled pauses should never be transcribed as partial words.

um	Filled pause.
uh	Filled pause.
er	Filled pause. More common in British English.

Acknowledgments

The following is a list of commonly occurring acknowledgments, and how they should be spelt.

okay	Agreement. Speakers will often produce variants of this, such as "kay", "mkay", "umkay". All of these variants should be spelt as "okay".
uh-huh	Agreement.
uh-hm	Agreement.
mm-hm	Agreement.
uh-uh	Disagreement.
mm	Agreement, stalling for time
huh	Request for clarification. Puzzlement.
hm	Stalling for time.
nah	Informal version of "no".
nope	Informal version of "no".
a-ha	Interjection denoting surprise, as in "aha, I found it", rather than "uh-huh" as an acknowledgment.
ha	Interjection, similar to "aha".
oh	Surprise.
ooh	As in "ooh, that's gross."
yeah	Informal version of "yes"
yep	Informal version of "yes"
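
The spelling rules above for filled pauses and "okay" variants can be sketched as a small normalizer. This is only an illustration: the function name is hypothetical, and treating any run of "u"s followed by "m"s (or "h"s) as a prolonged filled pause is an assumption, not part of the guideline.

```python
import re

# Variants of "okay" named in the guideline.
OKAY_VARIANTS = {"kay", "mkay", "umkay"}


def normalize_token(token):
    """Map a transcribed token to the spelling the guideline asks for."""
    word = token.lower()
    if word in OKAY_VARIANTS:
        return "okay"
    if re.fullmatch(r"u+m+", word):  # prolonged "ummm" becomes "um"
        return "um"
    if re.fullmatch(r"u+h+", word):  # prolonged "uhhh" becomes "uh"
        return "uh"
    return token  # everything else, including "uh-huh", is left alone


print([normalize_token(t) for t in ["ummm", "mkay", "uh-huh", "uhh"]])
```

Anything the rules do not cover is returned unchanged, so acknowledgments such as "uh-huh" and "mm-hm" keep their listed spellings.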

Contractions

Common contractions should be written as one word. The following is a list of common contraction endings. Note that it is often ambiguous whether the speaker said one word or two separate words, especially since words are often blurred together. If in doubt, annotate the word pair as two separate words, spelling out the second in full.

'll	for "will"
've	for "have"
n't	for "not"
're	for "are"
's	for "is"

All other contractions are left to the transcriber's discretion as to whether they should be transcribed as one word or two.

Word Pairs

There are some word pairs that are so altered in pronunciation that they seem to be one lexical item. Such pairs can be transcribed as single words. Below, we give some common word pairs.

lemme	for "let me"
wanna	for "want to"
gonna	for "going to"
gotta	for "got to"

Extracted from PostScript by R. Paul McCarty 2001/07/30

Friday, 6 June 2014

On Paper Reading

Papers are not something you should read in large numbers, but something you should read a number of times.

Rereading papers can be more useful than starting new ones, provided the papers are carefully selected.

A paper is good when it clearly tells you where it comes from and where it is going, and can inspire readers with various interests.

Monday, 19 May 2014

Start to Change

I feel the situation will start to change from this week on. I will gradually become more efficient at work. Here I list the tasks for this week (20140519-20140525):

Submit the abstract by Monday;
Interview Li and Dad (via QQ messages) to set up the ABM;
Finalize the ABM of peer effects among smallholders;
Complete the statistical analysis on the simulation data;
Complete a first draft paper for peer effects (including all sections);
Start doing the daily recaps (new words, paper(s) read, being honest, etc.);
Figure out how to import the map to NetLogo;
Email a prof.

Saturday, 17 May 2014

Living beside the Sea

My current home is a two-minute walk from the sea. Every morning I cycle along the seaside to work, and every evening I cycle back along it. Every weekend morning I practise Tai Chi facing the sea.

I once had a strong longing for the sea. I grew up far inland, and seeing the sea was one of my childhood dreams. The dream was not realised until I was 25 years old. Now I am so close to the sea. I can see it when I lie in bed. I can hear and smell it on my way to work and on my way home.

The sea is often compared to a broad mind or deep thoughts, both of which I am short of. A Chinese saying goes: "the wise delight in water, the benevolent delight in mountains". Water is flexible and profound.

The sea is also linked to boldness and tenacity. People acquire such qualities in their struggle with the sea. This seems true. People living in coastal places appear more adventurous than those dwelling inland. For example, southerners in China are more likely to go into business than northerners.

Thursday, 20 March 2014

Code for Replacing Special Values in R

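grep {base}: Pattern Matching and Replacement.

E.g., x <- sub("^$", "0", x)  # x is a character vector; "^$" matches only empty strings

dat[dat == ""] <- 0  # replace empty cells in dat with 0

replace(x, list, values)

E.g., dat <- replace(dat, is.na(dat), 0)  # replace NA values in dat with 0

na.strings = c("", "NA")  # argument to read.csv()/read.table(): read empty cells as NA when importing data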

Wednesday, 12 March 2014

Five Common Relationships among Three Variables in a Statistical Model

In a statistical model–any statistical model–there is generally one way that a predictor X and a response Y can relate:

This relationship can take on different forms, of course, like a line or a curve, but there’s really only one relationship here to measure.

Usually the point is to model the predictive ability, the effect, of X on Y.

In other words, there is a clear response variable, although not necessarily a causal relationship. We could have switched the direction of the arrow to indicate that Y predicts X or used a two-headed arrow to show a correlation, with no direction, but that’s a whole other story.

For our purposes, Y is the response variable and X the predictor.

But a third variable–another predictor–can relate to X and Y in a number of different ways. How this predictor relates to X and Y changes how we interpret the relationship between X and Y. These relationships have common names, but the names sometimes differ across fields, so you may be familiar with a different name than the ones I give below. In any case, the names are less important than:

(1) making sure the relationship you are testing is the one that answers your research question and

(2) making sure the relationship reflects the data.

So let’s look at some possible relationships, once we add a second predictor, Z.

1. Covariate correlated with X

In this model, the Covariate, Z, is correlated with X, and both predict Y.

Because X and Z are not independent, there will usually be some joint effect on Y–some part of the relationship between X and Y that can’t be distinguished from the relationship between Z and Y.

The relationship between X and Y is no longer the full effect of X on Y.

It’s the marginal, unique effect of X on Y, after controlling for the effect of Z.

While the model fit as a whole will include both the joint and the unique effects of both X and Z on Y, the regression coefficient for X will only include its unique effect.

When both X and Z are observed variables, this is nearly always the situation.

As long as the correlation is moderate, it’s still possible to measure the unique effect of X. If it gets too high, however, you will start to hit a point of multicollinearity in which the model has problems calculating estimates.

A good example of this kind of relationship would be in a study that measures the nutritional composition of soil cores at different altitudes and moisture levels.

Because water flows downhill, lower altitudes (X) tend to be more moist (Z). So while we can’t completely separate altitude from moisture levels, as long as they’re only moderately correlated, we should be able to find the unique effect of altitude on potassium levels.

Test this effect by including X and Z in a model together. To understand how much Z and X overlap in their explanation of Y, rerun the model without Z and look at how much X’s coefficient changes.
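
As a rough illustration of that comparison, here is a small Python sketch on made-up, noise-free numbers (not the soil data): z is moderately correlated with x, and y is built as 2*x + 3*z. The `ols` helper is a hypothetical stand-in for whatever regression routine you normally use.

```python
def ols(predictors, y):
    """Least-squares coefficients, intercept first, via the normal equations."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Made-up data: z is correlated with x, and y = 2*x + 3*z exactly.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
z = [0.0, -2.5, 2.0, -1.5, 2.0]
y = [2 * xi + 3 * zi for xi, zi in zip(x, z)]

with_z = ols([x, z], y)   # x's coefficient is its unique effect
without_z = ols([x], y)   # x's coefficient also absorbs the part shared with z
print("x coefficient with z in the model:", round(with_z[1], 3))
print("x coefficient without z:", round(without_z[1], 3))
```

The gap between the two x coefficients is the overlap: the part of the X-Y relationship that cannot be told apart from Z once Z is dropped.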

2. Covariate Independent of X

When a covariate Z is NOT related to X, it has a slightly different effect. It needs to be able to predict Y as well to be useful in the model, but the effects of X and Z don’t overlap.

Including Z in the model often leads to the relationship between X and Y becoming more significant because Z has explained some of the otherwise unexplained variance in Y.

An example of this kind of covariate is when an experimental manipulation (X) on response time (Y) only becomes significant when we control for finger dexterity levels (Z).

If finger dexterity has a large effect on response time and we don’t account for it, all of that variation due to dexterity will go into unexplained error–the denominator of our test statistic.

Controlling for that variance means it is no longer unexplained, and is removed from the denominator.

Test this type of effect by running hierarchical regression (add each predictor in on a separate step).
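
A minimal sketch of the two steps, on made-up numbers in which z is uncorrelated with x: step 1 fits Y on X alone, step 2 adds Z, and we compare the unexplained (residual) sum of squares. The helper names `ols` and `rss` are hypothetical, and no significance test is computed here.

```python
def ols(predictors, y):
    """Least-squares coefficients, intercept first, via the normal equations."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


def rss(predictors, y):
    """Residual sum of squares of the least-squares fit."""
    coef = ols(predictors, y)
    fitted = [coef[0] + sum(c * p[i] for c, p in zip(coef[1:], predictors))
              for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


# Made-up data: x is a two-level manipulation, z is unrelated to x,
# and e is leftover noise unrelated to both.
x = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
z = [-2.0, -1.0, 1.0, 2.0, -2.0, -1.0, 1.0, 2.0]
e = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]
y = [1.5 * xi + 2 * zi + ei for xi, zi, ei in zip(x, z, e)]

step1 = rss([x], y)      # Z's variation sits in the unexplained error
step2 = rss([x, z], y)   # Z's variation is now explained
print("x coefficient:", round(ols([x], y)[1], 3), "then", round(ols([x, z], y)[1], 3))
print("unexplained sum of squares:", round(step1, 3), "then", round(step2, 3))
```

Because z is unrelated to x here, x's coefficient stays put between steps while the unexplained error shrinks, which is what makes the test for x sharper.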

3. Spurious Relationship

A confounding variable Z creates a spurious relationship between X and Y because Z is related to both X and Y.

This is the relationship seen in most “correlation is not causation” examples: The amount of ice cream consumption (X) in a month predicts number of shark attacks (Y). Do sharks like eating ice-cream laden people? No.

This spurious relationship is created by a confounder that leads to increases in both ice-cream consumption and shark attacks: temperature (Z). People both eat more ice cream and swim in the ocean in hotter months.

It can be difficult to distinguish between this relationship and #1 above through testing alone. This is a situation where you have to entertain spuriousness as a possibility and question your results.

4. Mediation

Mediation indicates a specific causal pathway. It occurs when at least part of the reason X affects Y is through Z. X affects Z and Z affects Y.

There are then two effects of X on Y: the indirect effect of X on Y through Z and the direct effect of X on Y.

An example here would be if the relationship between stress (X) and depression (Y) was mediated through increased levels of stress hormones (Z).

There are a number of ways to test for mediation, including running a series of regression models and through path analysis. More modern approaches recommend testing the strength of the indirect effect.
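
One common way to put a number on the indirect effect is the product of two regression coefficients: the effect of X on Z, times the effect of Z on Y with X held fixed. The sketch below only computes that point estimate on made-up, noise-free data; a real test of its strength would add something like a bootstrap confidence interval, which is not shown. The `ols` helper is hypothetical.

```python
def ols(predictors, y):
    """Least-squares coefficients, intercept first, via the normal equations."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Made-up data: x drives z, and y depends on both x and z.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
z = [0.0, -2.5, 2.0, -1.5, 2.0]
y = [1 * xi + 3 * zi for xi, zi in zip(x, z)]

a_path = ols([x], z)[1]            # effect of X on Z
_, direct, b_path = ols([x, z], y) # direct effect of X, and effect of Z on Y
indirect = a_path * b_path         # effect of X on Y that runs through Z
total = ols([x], y)[1]             # effect of X on Y ignoring Z

print("direct:", round(direct, 3))
print("indirect:", round(indirect, 3))
print("total:", round(total, 3))
```

With ordinary least squares the total effect equals the direct effect plus the indirect effect, which is a handy sanity check on the three fits.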

5. Moderation

Despite the similarity in the names, moderation is an entirely different beast from mediation.

Moderation indicates that the effect of X on Y is different for different values of Z. In other words, Z moderates (affects) the effect of X on Y.

Maybe when Z is large, there is a strong positive relationship between X and Y, but when Z is small, there is no relationship between X and Y.

So essentially, we’re saying that X predicts Y in different ways, depending on the value of Z.

X and Z can be correlated or not.

One example of moderation can be seen in the relationship between depression (X), physical health scores (Y), and poverty status (Z). Poverty status (Yes/No) would be said to moderate the relationship between depression and physical health if there was a weak negative relationship among people not in poverty, but a strong negative relationship among people in poverty.

Test moderation by including an interaction term between X and Z.
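
A minimal sketch of what the interaction term does, on made-up, noise-free numbers with a yes/no moderator (the values are not from any real depression or poverty data, and the `ols` helper is hypothetical):

```python
def ols(predictors, y):
    """Least-squares coefficients, intercept first, via the normal equations."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Made-up data: z is a 0/1 moderator; the slope of x is -1 when z = 0
# and -4 when z = 1.
x = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
z = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
y = [10 - 1 * xi - 2 * zi - 3 * xi * zi for xi, zi in zip(x, z)]

xz = [xi * zi for xi, zi in zip(x, z)]  # the interaction term
intercept, b_x, b_z, b_xz = ols([x, z, xz], y)

print("slope of x when z = 0:", round(b_x, 3))
print("slope of x when z = 1:", round(b_x + b_xz, 3))
print("interaction coefficient:", round(b_xz, 3))
```

The interaction coefficient is the difference between the two slopes, so a coefficient near zero would mean Z does not moderate the effect of X.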

All of these relationships can occur with more than two predictors in the model, and a model can contain more than one of them. Figuring out the potential relationships among variables is often the fun part of data analysis.

Source: http://www.theanalysisfactor.com/five-common-relationships-among-three-variables-in-a-statistical-model/

Friday, 27 December 2013

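Several Pairs of Relationships in Our Current Studies

Just now I talked with Professor Wang, who is here on a visit, about some problems in the way we currently study. They show up in the following pairs of relationships.

The first is the relationship between learning and thinking. Our problem is that sometimes we emphasise learning without thinking enough, and sometimes we emphasise thinking without learning enough. The former shows in two ways. First, we fail to connect the concepts and theories we learn with real problems, especially with the topics we are researching ourselves. Effective learners, while learning something new, always keep asking how this piece of knowledge can help with their own research question and how they can apply it in their own research. They treat other people's research as a metaphor for their own and dig out the parts worth borrowing. On the one hand, this makes learning more purposeful: we can learn with a goal in mind, which strengthens memory and understanding and helps us learn. On the other hand, it promotes thinking about the research topic, which we can then view from different angles. One possible drawback is that our understanding of what we learn may become narrow and one-sided, and hard to generalise. The way to avoid this is, first, to grasp the knowledge as a whole and, second, to apply it in different fields and topics. Second, we do not take care to place what we learn within our own system of knowledge. This first requires building such a system, so that each piece of knowledge has its place and coordinates. "Learning without thought is labour lost; thought without learning is perilous."

The latter means that we do not place what we think and realise within a theoretical framework, and do not carefully survey what earlier researchers have done, so the academic value of what we arrive at by thinking is low. This first requires mastering some core theories and forming our own perspective for observing the world.

The second is the relationship between thinking and writing. We tend to think a lot and write down little. This neither lets us sort out and summarise our thoughts properly nor helps our understanding accumulate and improve. In addition, one important benefit of writing more is the practice it gives.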

Thursday, 19 December 2013

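Is Your Research Team Strong?

A research team, laboratory or research group is still relatively weak if it finds itself in the following situations:

Technical people (such as those doing statistics or econometrics, or programmers) are more in demand than people who do theory;
Funding proposals have to be written to suit the intentions of funders and policy makers, rather than the team's own interests;
The members' research topics are widely scattered (differences in methods and theories do not matter).

If it is in the following situations, it does not have much of a future:

Most graduating students (master's and PhD students) no longer do research, or do not even take research-related jobs (some did not come for the research in the first place);
Stronger people (including those with a specialty, such as technical people), whether current students or other researchers, leave the team easily;
Basic theoretical research is neglected, and the team only pursues research that brings in funding and quick publications;
The stronger members of the team (such as the PI) have little or no academic exchange with the weaker members (such as master's and PhD students).

If you ask how my own research team fares by these standards, the conclusion is that it is not strong, but it still has fairly good prospects.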

Sunday, 8 December 2013

Some Famous Metaphors

The Big Bang.
Fred Hoyle

All the world’s a stage, and all the men and women merely players. They have their exits and their entrances.
William Shakespeare

Art washes away from the soul the dust of everyday life.
Pablo Picasso

I am the good shepherd, … and I lay down my life for the sheep.
The Bible, John 10:14-15

All religions, arts and sciences are branches of the same tree.
Albert Einstein

Chaos is a friend of mine.
Bob Dylan

All our words are but crumbs that fall down from the feast of the mind.
Khalil Gibran

If you want a love message to be heard, it has got to be sent out. To keep a lamp burning, we have to keep putting oil in it.
Mother Teresa

America has tossed its cap over the wall of space.
John F. Kennedy

A hospital bed is a parked taxi with the meter running.
Groucho Marx

A good conscience is a continual Christmas.
Benjamin Franklin

Let us be grateful to people who make us happy, they are the charming gardeners who make our souls blossom.
Marcel Proust

And your very flesh shall be a great poem.
Walt Whitman

Advertising is the rattling of a stick inside a swill bucket.
George Orwell

Dying is a wild night and a new road.
Emily Dickinson

Fill your paper with the breathings of your heart.
William Wordsworth

Conscience is a man’s compass.
Vincent Van Gogh
