Your Friends Are More Interesting Than You On Average

The Friendship Paradox

Feld’s friendship paradox states that ‘your friends have more friends than you, on average’. This paradox arises because extremely popular people, despite being rare, are overrepresented when averaging over friends.
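
To see why rare but popular people skew the average, here is a minimal sketch on a made-up six-person network. The names, the edge list and the function names are all hypothetical and are not taken from the paper. It compares the average number of friends with the average number of friends that people's friends have.

```python
from statistics import mean

# Hypothetical toy network: one popular hub ("ann") and a few less connected people.
EDGES = [
    ("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("ann", "eve"),
    ("bob", "cat"), ("dan", "eve"), ("eve", "fay"),
]


def build_adjacency(edges):
    """Return a dict mapping each person to the set of their friends."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def friendship_paradox(edges):
    """Return (mean friend count, mean of each person's friends' average friend count)."""
    adjacency = build_adjacency(edges)
    degree = {person: len(friends) for person, friends in adjacency.items()}
    own = mean(degree.values())
    friends_avg = mean(
        mean(degree[f] for f in friends) for friends in adjacency.values()
    )
    return own, friends_avg


if __name__ == "__main__":
    own, friends_avg = friendship_paradox(EDGES)
    print(f"average friends per person: {own:.2f}")
    print(f"average friends of a person's friends: {friends_avg:.2f}")
```

The hub appears in four different people's friend lists, so it is counted four times when averaging over friends but only once when averaging over people. That is the overrepresentation described above.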

Using a sample of the Twitter firehose, we confirm that the friendship paradox holds for >98% of Twitter users. Because of the directed nature of the follower graph on Twitter, we are further able to confirm more detailed forms of the friendship paradox: everyone you follow or who follows you has more friends and followers than you. This is likely caused by a correlation we demonstrate between Twitter activity, number of friends, and number of followers.

But wait, there’s more..

In addition, we discover two new paradoxes: the virality paradox that states ‘your friends receive more viral content than you, on average’, and the activity paradox, which states ‘your friends are more active than you, on average’. The latter paradox is important in regulating online communication. It may result in users having difficulty maintaining optimal incoming information rates, because following additional users causes the volume of incoming tweets to increase super-linearly. (And this also may relate to why in large complex communities personalized moderation works better than community moderation, as explored in my last blog post).

While users may compensate for increased information flow by increasing their own activity, users become information overloaded when they receive more information than they are able or willing to process. We compare the average size of cascades that are sent and received by overloaded and underloaded users. And we show that overloaded users post and receive larger cascades and they are poor detectors of small cascades.

What are the dangers of overload?

Those users who become overloaded, measured by receiving far more incoming messages than they send out, are contending with more tweets than they can handle. Controlling for activity, they are more likely to participate in viral cascades, likely due to receiving the popular cascades multiple times. Any individual tweet’s visibility is greatly diluted for overloaded users, because overloaded users receive so many more tweets than they can handle. Because of the connection between cognitive load and managing information overload, the present results suggest that users will dynamically adjust their social network to maintain some optimal individual level of information flux. (What does this mean for Facebook’s growth?)

Friendship Paradox Redux: Your Friends Are More Interesting Than You – Nathan O. Hodas, Farshad Kooti, Kristina Lerman (PDF of the paper)
http://arxiv.org/abs/1304.3480

How to design large complex online communities using social science

Sorry if I jump around a bit in this blog post, but by reading these points and listening to the video you’ll have a better idea of how social science can help you design a successful community using a specific kind of moderation approach. Or at least how to use the difference between a theory-type and a design-type approach to community building to respond better to new customer needs.

OK, I am paraphrasing here so bear with me, as I am taking notes from Robert Kraut’s Stanford presentation above. My aim is to show how social science can inform good online community design. So the first point Kraut makes that I want to highlight is that real community design is “highly multidimensional”, and that this is at odds with the logic of social science, which seeks to understand the effect of one variable at a time while all other variables are held constant, in order to discover causality. OK, so that’s some of the fundamentals sorted. Skip to this section on the video to hear the explanation.

This social science approach is at odds with design (i.e. online community design), where you are trying to figure out the configuration of all possible variables that has the effect you want. Kraut says that basically with design you don’t want one variable at a time; you want ‘kitchen sink’ experiments, which are theory-based experiments that you can try out in a relatively cheap way.

But they use agent-based modelling, which allows theory to be tested as models in a community environment: member behaviour changes, which in turn changes the environment (see 1:12:56). In this model the ‘Identity Benefit’ is greater when an agent’s interests are similar to the group’s interests:

Here’s how to simply capture that ‘Identity Benefit’:
Identity Benefit = (# viewed messages that match) / (# viewed messages)
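
As a rough sketch of that ratio in code: I am assuming here that each viewed message carries a single topic label and that a message "matches" when its topic is one of the agent's interests. That representation, and the function name, are my own illustration and not Kraut's actual model.

```python
def identity_benefit(viewed_topics, agent_interests):
    """Share of viewed messages whose topic is among the agent's interests.

    viewed_topics: list of topic labels, one per viewed message (hypothetical format).
    agent_interests: set of topic labels the agent cares about (hypothetical format).
    Returns 0.0 when no messages were viewed, to avoid dividing by zero.
    """
    if not viewed_topics:
        return 0.0
    matching = sum(1 for topic in viewed_topics if topic in agent_interests)
    return matching / len(viewed_topics)


if __name__ == "__main__":
    viewed = ["hockey", "politics", "hockey", "cooking"]
    print(identity_benefit(viewed, {"hockey"}))
```

An agent who only cares about one topic in a group that talks about many things scores low, which is the sense in which the benefit grows as agent and group interests line up.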

In comparison, the other principal type of community benefit to members that Kraut identifies, the ‘Bond-based benefit’, is greater when there is repeated interaction. Kind of obvious I guess, but this is social science, so still worth stating!

Agent-based modelling and simulated communities results

And from simulated communities what Kraut found is that the simulated agent models (taking the place of community members) produced results very similar to those observed in real Usenet groups.

So the next step is that if we have a working agent model that shows how a community works, we can test out different types of moderation techniques in this simulated community.

From this Kraut found that ‘Personalised moderation’ outperforms ‘Community level moderation’, though the difference only really matters when dealing with a large volume of content, or diverse content. In other words ‘Personalised moderation’ works well with large complex communities.


And as an example, I see this personalised moderation functionality appears to be available in the latest version of community platform Telligent’s analytics, which sounds useful. It would be good to know which other major community platforms like Lithium offer such beneficial functionality, and how well it really works day-to-day:

Your community can now offer its participants dynamic and personalized recommendations of both people and content. Telligent Analytics looks at your community’s data, compares it with each member’s unique interests, and then delivers personalized recommendations to that member. Telligent Analytics doesn’t just tell you how your community’s doing; it applies the analytics to improve your community members’ experience.

So if you want to go into this study applied in more practical detail here’s Robert Kraut’s paper (pdf) with the graphs and stats:

A Simulation for Designing Online Community: Member Motivation, Contribution, and Discussion Moderation – (pdf: 10.1.1.141.6657)

Or maybe you’d like to read the chapters of Kraut’s 2012 book Building successful online communities: Evidence-based social design:

  • Resnick, P. & Kraut, R. Introduction [PDF]
  • Kraut, R. E. & Resnick, P. Encouraging contributions to online communities [PDF]
  • Ren, Y, Kraut, R. E. & Kiesler, S. Encouraging commitment in online communities [PDF]
  • Kraut, R. E., Burke, M. & Riedl, J. Dealing with newcomers [PDF]
  • Kiesler, S, Kittur, A., Kraut, R., & Resnick, P. Regulating behavior in online communities [PDF]
  • Resnick, P, Konstan, J & Chen, Y. Starting a community. [PDF]

Influential people + influential friends = spread products

Identifying social influence in networks is critical to understanding how behaviors spread. We present a method for identifying influence and susceptibility in networks that avoids biases in traditional estimates of social contagion by leveraging in vivo randomized experimentation. Estimation in a representative sample of 1.3 million Facebook users showed that younger users are more susceptible than older users, men are more influential than women, women influence men more than they influence other women, and married individuals are the least susceptible to influence in the decision to adopt the product we studied. Analysis of influence and susceptibility together with network structure reveals that influential individuals are less susceptible to influence than non-influential individuals and that they cluster in the network, which suggests that influential people with influential friends help spread this product [red text highlighting added].

Identifying Influential and Susceptible Members of Social Networks
Sinan Aral, Dylan Walker

Science http://dx.doi.org/10.1126/science.1215842

Social media have provided plentiful evidence of their capacity for information diffusion. Fads and rumors but also social unrest and riots travel fast and affect large fractions of the population participating in online social networks (OSNs). This has spurred much research regarding the mechanisms that underlie social contagion, and also who (if any) can unleash system-wide information dissemination. Access to real data, both regarding topology (the network of friendships) and dynamics (the actual way in which OSN users interact), is crucial to decipher how the former facilitates the latter’s success, understood as efficiency in information spreading. With the quantitative analysis that stems from complex network theory, we discuss who (and why) has privileged spreading capabilities when it comes to information diffusion. This is done considering the evolution of an episode of political protest which took place in Spain, spanning one month in 2011.

Locating privileged spreaders on an online social network

Javier Borge-Holthoefer, Alejandro Rivero, and Yamir Moreno

Phys. Rev. E 85, 066123 (2012)

http://link.aps.org/doi/10.1103/PhysRevE.85.066123

New research challenges assumptions about Twitter news sharing communities

A new study of tweets spreading news from The New York Times finds that the Internet, while creating an open line of communication across continents, may at the same time be strengthening walls that separate users into ideological camps, and more.

Researchers for the study, “An Exploration of Social Identity: The Geography and Politics of News-Sharing Communities in Twitter,” collected 521,733 tweets posted by 223,950 unique users — all of them posting or retweeting at least three links referring to NYT articles over a fifteen day period, September 14 – 29, 2011. The tweeters were clustered by who communicates with whom, and groups were characterized by the topics they posted most, tweeters’ location, and their biography key words.

What the research team found were obvious and not so obvious connection points along with revelations that challenge easy assumptions about Twitter communities.

While liberal and conservative national political subgroups were identified, other dynamics were teased out in the mathematical modeling performed by the research team.

“A person who is cosmopolitan associates with others who are cosmopolitan, and a US liberal or conservative associates with others who are US liberal or conservative, creating separated social groups with those identities,” said Yaneer Bar-Yam, president of New England Complex Systems Institute (NECSI), where the research was done.

The clusters revealed not only local and national but also global (cosmopolitan) associations. The national group has subgroups specifically political (liberal and conservative) and one that is broadly interested in business, arts and sports. Contrary to frequent media portrayals, said Bar-Yam, the findings in turn suggest that online readers of The New York Times can have competing priorities and are not uniformly liberal.

“A significant fraction of the population has become so strongly identified with ideological camps that those identities drive their social associations,” said Bar-Yam. “For those who are concerned about the polarization of society into liberal and conservative camps, the results have both positive and negative connotations. There are specific subgroups that are polarized into opposing camps, but often associations are local, national and cosmopolitan.”

The study found these dominant clusters in this sample:

  • The cosmopolitan Global Political Group – those interested in international topics, who live in various cities around the world, including New York and Washington DC, are focused on human rights and politics, and may themselves be journalists.
  • The New York Scene – A New York City-oriented group interested in a diverse set of topics including world news, US news, business, arts, fashion and sports.
  • National Business – a group with the strongest focus on business, but also interest in world news, sports, fashion and the arts. It is geographically spread across the US.
  • Two clusters that are also US-based but are specifically liberal and conservative in their political orientation.

The study is available free at www.necsi.edu/research/social/nyttwitter/.

The authors note that more than 100 million tweets are posted each day, and that a significant portion includes links to online information.

Bar-Yam, in assessing the study, noted that “Twitter cannot be ignored in how peer-to-peer and mass media are connecting people separated in space and time—and what that means in the behavior of social systems.”

In a scientific context, each user, he said, “can be thought of as a node in a network, and the relationships as links between them.”

The study authors are Amaç Herdağdelen, Wenyun Zuo, Alexander Gard-Murray and Yaneer Bar-Yam. The work was supported in part by the Office of Naval Research.

Disclaimer: This post is a press release from NECSI, with which I have no paid connection. While I have used tools borrowed from complexity science in the health sector, my primary interest lies in adapting such insights for everyday use.

Explaining the power of the Facebook social graph using containers and social networks

I had a great time at Lean Startup Machine London this weekend, learning about using lean startup ideas and practice from a social networking perspective to build a business. It helped that I’d already been to hear Eric Ries talk, thanks to a tip-off from Andy at Crocodile Clips (currently looking for investment himself I believe, and I picked up a good contact for him at the event). And also because I’ve been helping Barnaby with his Name That Place concept, thinking about how to get proof of concept and wondering about the best way to take that forward (btw he’s not in the office today at Regus, but moving lodgings to a house boat near Vauxhall:-)

So while I promised myself a lazy day today I wanted to quickly note down two things. I still have to prepare for a talk at Cass next week on using MVP to help corporates build successful online communities, and I still have to find a job/drive revenue before my severance from eBay runs out in X number of weeks. So time is short and comes with a cost attached, and before I pop into town to watch Mr Spacey in ‘Margin Call’ here’s a couple of quick creative thoughts.

Containers – in a container (paper page) – in a container (photo) – in a container (blog post) – etc

Mapping containers to networks. Photo by Stuart Glendinning Hall

I like to try and simplify things where possible, as that way you can get difficult things done more easily, right? So in thinking about what works as a social business I came up with the idea of matching up ‘containers’, which are simply a tool for mapping how a social concept might work. The example above is an attempt to show across 3 degrees of separation how, in rough and ready terms, a business like Airbnb works best.

In trying to find somewhere to stay you are first going to see if any of your ‘friends’ live in the city you are visiting (the idea behind Airbnb is providing cheap places for people to stay in other people’s homes). But it is ‘unlikely’ that they have a room in that city, as your friendship network is relatively small. So you turn to ‘friends of friends’, who are ‘likely’ to have a possible place to stay by virtue of their wider geo-distribution. But maybe the night you want to stay they are busy? So the next container along, which for the sake of 3rd degree of separation symmetry I’ve called ‘friends of friends of friends’, is ‘very likely’ to provide the room you want, and for the time/date you want. (It’s a nice fact that the average user on Facebook is connected to everyone else by 3.74 degrees of separation, so you can see why Facebook-based commerce using the social graph is so potentially powerful.)
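
Here is a rough sketch of how each ‘container’ grows as you step outwards, using a breadth-first search over a made-up network. The network, the names in it and the function name are hypothetical and only there to illustrate the idea; real friendship networks overlap far more than this tidy tree does.

```python
from collections import deque


def people_within(adjacency, start, max_hops):
    """Count people reachable from `start` in at most `max_hops` hops (excluding `start`)."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand beyond the requested container
        for friend in adjacency.get(person, ()):
            if friend not in distance:
                distance[friend] = distance[person] + 1
                queue.append(friend)
    return len(distance) - 1


# Hypothetical toy network: "you" have 2 friends, each of whom knows 2 more people, and so on.
TOY = {
    "you": ["a", "b"],
    "a": ["you", "a1", "a2"], "b": ["you", "b1", "b2"],
    "a1": ["a", "a1x"], "a2": ["a", "a2x"],
    "b1": ["b", "b1x"], "b2": ["b", "b2x"],
    "a1x": ["a1"], "a2x": ["a2"], "b1x": ["b1"], "b2x": ["b2"],
}

if __name__ == "__main__":
    for hops, label in [(1, "friends"), (2, "+ friends of friends"), (3, "+ friends of friends of friends")]:
        print(label, people_within(TOY, "you", hops))
```

Each extra hop widens the pool of possible hosts, which is the intuition behind going from ‘unlikely’ to ‘very likely’.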

As a side note I really liked the pivot by lean startup participants ‘You never know’, led by ‘Easy Ed’ (alliteration really helps remember ppl’s first names:-), who started with the idea of an app where you could get matched up with single people in your immediate social network, but found that people didn’t want to do that for themselves. But then on pivoting they realised that ‘smug married’ people would happily introduce single people to other single people. Neat change of the social networking dynamic, from ‘doing it for yourself’ as a single person (which doesn’t work, due to fear of rejection for example) to someone with a network ‘doing it for you’. So maybe that’s why blind dates work, so long as someone you know sets it up for you!

Superbowl Sunday: data crunchers vs grandmothers

While I was talking to Javi he happened to mention that one of the LSM London teams, ‘hstream’, had a real-time Twitter analytics idea. I got excited at the idea of tracking sentiment around Patriots vs Giants and even had a look at the odds at Betfair. I also tried Twitter manually, so to speak, and found and favorited one tweet which from a gambler’s perspective seemed to stand out. It turned out to be right: the 94-year-old grandmother backed the Giants, the winners of Super Bowl XLVI. Wonder what the results of hstream’s real-time data analytics were?

PS: Post-Sony I now know this Giants case to be an example of #thinslicing on yes:-)

94-year-old grandmother predicts Giants to win. Photo by Stuart Glendinning Hall

Viral Loop notes

I just stumbled across a great site, Books Noted, which in its words provides “quotations, notes and takeaways on interesting books. These books further our knowledge in a variety of topics from psychology, entrepreneurship, philosophy, business and more. Since they tend to be hundreds of pages in length, these short notes and takeaways will get to the essence of the book for time strapped readers.” Below are the notes from Viral Loop, with my particular interest being the growth curve behind Hotmail and the story of how one mother got out of Chernobyl just in time. Intuition is powerful:

* Products that require a customer education are best suited to direct selling… if you are unloading blue jeans, direct selling probably isn’t for you, since everyone knows what jeans are and what they are used for [what about cosmetics? Avon?]
* Jim Clark contacted Marc Andreessen after using the Mosaic browser for the first time
* Warren Buffett says “Get greedy when others are fearful and fearful when others are greedy.”
* Network effect: the more connections you have, the more nodes, the more people, the more valuable it will be
* “double viral loop” – every network creator is a user and every user is a potential network creator
* Successful viral expansion loop companies share these characteristics:
o Web-based: suited to the frictionless world of the Internet
o Free: users consume product at no charge
o Organizational technology: no content is created, users create it. Companies just organize the content
o Simple concept: easy and intuitive to use
o Built-in virality: users spread the product out of their own self-interest
o Extremely fast adoption, exponential growth, virality index, predictable growth rates, network effects, stackability, point of nondisplacement and ultimate saturation
* Fanout: Jurvetson noted a “mathematical elegance” to Hotmail’s “smooth exponential growth curves” in the company’s early days: Cumulative users = (1 + fanout)^cycles, where “cumulative users” related to the number of Hotmail registered subscribers, “fanout” was the rate by which the product spread and “cycles” was the number of times the product was used in the time period since launch (or frequency multiplied by time). At the beginning, each Hotmail user, on average, brought in two new users each month. (In other words, the fanout equaled 2.)
* Controlled growth at Gmail: Paul Buchheit, the brains behind Gmail, purposely controlled the rate of adoption by instituting an invitation-only sign-up procedure. Because Gmail offered 1,000 megabytes of storage while others gave users only 4 megabytes, Buchheit chose to drag down Gmail’s growth rate so Google could keep the application operational without risking sluggish download times, outages, data loss, or any other performance problem that often emerges with rapid scaling.
* How Max Levchin survived Chernobyl: Fleeing the crumbling Soviet empire most likely saved their lives. When Levchin was eleven, his mother, a physicist who worked as a government research assistant, overheard news of a leak at the Chernobyl nuclear reactor, which was on the verge of a meltdown. Acid rain misted down as the family quietly vacated their home and rushed to the train station in Kiev. After they were onboard, news of the disaster became public, and hours later, as they chugged into Crimea, Soviet guards ordered them to turn back, fearful of contamination. Following an animated discussion, Levchin’s mother convinced them to check for radiation. All were clean, except Max, whose right foot sent the Geiger counter into spasms. The guard said the boy’s bone marrow was contaminated; his leg might have to be amputated. His mother told Max to take off his shoe and he was tested again. This time he passed and they were let through, sans shoe. The culprit was a radioactive rose thorn that had lodged in his sole as they escaped Kiev.
* What early PayPal was looking for: Because Thiel’s ultimate plan was to create a Web-based currency to undermine government tax structures, which would require taking on powerful interests like commercial banking, the cofounders sought people a lot like them: hypercompetitive, well-read, multilingual workaholics who had, above all, a proficiency in math and an aversion to authority.
o Thiel subscribed to a theory of human behavior known as “mimetic desire,” propounded by French historian and philosopher René Girard, who believed that people were essentially sheep who, without much reflection, borrowed their desires from others. This theory has been applied to describe financial bubbles and panics, when investors blindly act as a flock and follow what others are doing even if it flies in the face of economic logic, and to war and violence, which arise when two individuals vie for the same possession, leading to antagonism and strife. Pretty soon the object of desire is forgotten and all you’re left with is the antipathy.
o PayPal Mafia: The day that eBay took over, Thiel, Levchin, and Hoffman, who collectively took in more than $100 million, walked away from the viral company they had started just a few years earlier. But PayPal was simply the beginning for these former employees, and the lessons they learned at PayPal would spread virally to other viral concerns. Thiel founded his own hedge fund, Clarium Capital Management LLC, invested $750,000 in Facebook, and joined the board. Levchin created Slide, a widget maker that counts hundreds of millions of installs of its photo slideshow and other applications across social networks. Hoffman is the CEO of LinkedIn, a networking tool for business that counts almost 40 million members and, since it has multiple revenue streams, boasts that it is profitable. Roelof Botha, PayPal’s CFO, moved over to the venture capital side and became a partner at Sequoia Capital. One of his first investments was in YouTube, which was started by former PayPal alums Chad Hurley and Steve Chen.
* Daily life of Bebo founder Michael Birch: While Bebo grew at a fantastic rate overseas, Birch’s family life in San Francisco remained downright normal. He woke up at 6:30 a.m. to help his children get ready for school; then, after dropping them off, he and his wife would arrive at the office at 8:00 a.m., and Birch would spend half the day programming. The few meetings he held were with suppliers and prospective partners or advertisers. With Bebo adding ten thousand new members a week, either he or his head programmer was always on call, since they were the only ones who knew the site’s architecture. Since his co-coder got his kicks from skydiving, Birch would stress whenever he took to the sky. Then he would leave around 7:00 p.m. to see his kids for an hour before they went to bed. A few times a week he attended networking events, for instance, a website launch, and he traveled to England, where he and his wife would combine business with pleasure.
* Summary of online viral loop companies:
o Web-based: Better suited to the Internet
o Free: Users consume the product at no charge
o Organizational technology: They don’t create content, their users do
o Simple concept: Easy and intuitive to use
o Built-in virality: Users spread the product out of their own self-interest
o Exponential growth: That is, the virality index is above 1.0, which creates predictable growth rates
o Network effects: The more who join, the more who have an incentive to join
o Stackability: A viral network can be laid over the top of another, helping both grow
o Point of nondisplacement: Becomes virtually impregnable
o Ultimate saturation: A point of maturity when growth slows
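
To get a feel for the Hotmail fanout note above, here is a small worked sketch. It reads the formula as (1 + fanout) raised to the power of cycles, which is my interpretation of the flattened text, and it treats one cycle as one month, in line with the "two new users each month" example. The function name is made up for illustration.

```python
def cumulative_users(fanout, cycles):
    """Cumulative users after `cycles` cycles, starting from one user.

    Assumes the reading: cumulative users = (1 + fanout) ** cycles.
    """
    return (1 + fanout) ** cycles


if __name__ == "__main__":
    # With a fanout of 2, every existing user brings in two more each cycle,
    # so the user base triples every cycle.
    for month in range(0, 7):
        print(month, cumulative_users(2, month))
```

Starting from a single user, a fanout of 2 gives 3 users after one cycle, 9 after two and 27 after three, which is the smooth exponential curve the note describes.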

Social networking ability & field sense

Wayne Gretzky-Style ‘Field Sense’ May Be Teachable http://shar.es/m3E8W An application to social networking influence is my thought. Cheers!

My tweet today (above) follows my last blog post on the importance of location, rather than the number of connections, in determining an individual’s influence: “we may have got too focused on valuing networks in terms of who is the best connected. In fact the most influential person in the network comes down to location, rather than connections,” according to the research paper ‘Identifying Influential Spreaders in Complex Networks’.

In other words if location is important, and if networks are dynamic, then maybe you can get better at being in the right place at the right time to maximise your influence? Perhaps the sports science of ‘field sense’ has something to offer here to online social networking strategy? It’s just a hunch for now.

Wanted, a young genius to be the next Charles Darwin?

Stimulating article in the Daily Telegraph yesterday by Leicester’s very own David Attenborough on who and how the next Darwin might ‘emerge’ (to use an evolutionary term).

My comment didn’t pass the moderator’s test, but I’ll cheat and publish it here:

“Sorry David but the next scientific revolution will not be televised. By the nature of where contemporary science is at, and where the internet is at, means the next Darwin is most likely to be found right now on Facebook or Twitter experimenting with their science and sharing their failures & breakthroughs.”

Actually I’ve taken the opportunity to add the last bit about experimenting & sharing, and this coincidental pic from 2007…

Emerge emerge emerge. Photo by Stuart Glendinning Hall

Darwin dinner at Christ’s College

News just in for Darwin lovers..

The Development Office is delighted to announce that on 12th February 2009 Christ’s will host a unique fundraising gala dinner to mark not only Charles Darwin’s birthday, but also the birth of College’s exciting collaboration with the Galapagos Conservation Trust. This event will undoubtedly be the highlight of the Bicentenary celebrations for both organisations.

There will be a full programme of activities throughout the afternoon and evening, including a discussion led by Sir David Attenborough and Andrew Marr, tours of the refurbishment of Darwin’s student set in College and a lavish black tie dinner in Hall.

Places are strictly limited. We are hoping to raise around £500,000 to establish a Charles Darwin and Galapagos Islands Trust to promote scholarship linked to present-day aspects of the work originally undertaken by Charles Darwin (Christ’s 1827-31, 1836-37).
 
Further information is available from Alex Cullen:
ac597@cam.ac.uk