Your Friends Are More Interesting Than You On Average

The Friendship Paradox

Feld’s friendship paradox states that ‘your friends have more friends than you, on average’. This paradox arises because extremely popular people, despite being rare, are overrepresented when averaging over friends.
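To see why rare but popular people dominate the average, here is a minimal sketch on a made-up five-person network. The network, the names and the helper functions are all hypothetical and are not taken from the paper. It simply compares each person's own friend count with the average friend count of their friends.

```python
from statistics import mean

# Hypothetical toy network: one popular "hub" plus four ordinary users.
# Friendships are undirected, so every link appears in both lists.
friends = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub", "b"],
    "b": ["hub", "a"],
    "c": ["hub"],
    "d": ["hub"],
}


def friend_count(user):
    """Number of friends a user has."""
    return len(friends[user])


def mean_friend_count_of_friends(user):
    """Average number of friends held by this user's friends."""
    return mean(friend_count(f) for f in friends[user])


# Average number of friends per person.
avg_own = mean(friend_count(u) for u in friends)

# Average, over all people, of how many friends their friends have.
avg_of_friends = mean(mean_friend_count_of_friends(u) for u in friends)

# Fraction of people whose friends have more friends than they do.
share_in_paradox = mean(
    1 if mean_friend_count_of_friends(u) > friend_count(u) else 0
    for u in friends
)

print(avg_own, avg_of_friends, share_in_paradox)
```

In this toy case the hub appears in everybody else's friend list, so it is counted four times when averaging over friends but only once when averaging over people. Only the hub itself escapes the paradox.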

Using a sample of the Twitter firehose, we confirm that the friendship paradox holds for >98% of Twitter users. Because of the directed nature of the follower graph on Twitter, we are further able to confirm more detailed forms of the friendship paradox: on average, the people you follow and the people who follow you have more friends and followers than you do. This is likely caused by a correlation we demonstrate between Twitter activity, number of friends, and number of followers.

But wait, there’s more..

In addition, we discover two new paradoxes: the virality paradox that states ‘your friends receive more viral content than you, on average’, and the activity paradox, which states ‘your friends are more active than you, on average’. The latter paradox is important in regulating online communication. It may result in users having difficulty maintaining optimal incoming information rates, because following additional users causes the volume of incoming tweets to increase super-linearly. (And this also may relate to why in large complex communities personalized moderation works better than community moderation, as explored in my last blog post).

While users may compensate for increased information flow by increasing their own activity, users become information overloaded when they receive more information than they are able or willing to process. We compare the average size of cascades that are sent and received by overloaded and underloaded users. We also show that overloaded users post and receive larger cascades and are poor detectors of small cascades.

What are the dangers of overload?

Those users who become overloaded, measured by receiving far more incoming messages than they send out, are contending with more tweets than they can handle. Controlling for activity, they are more likely to participate in viral cascades, likely due to receiving the popular cascades multiple times. Any individual tweet’s visibility is greatly diluted for overloaded users, because overloaded users receive so many more tweets than they can handle. Because of the connection between cognitive load and managing information overload, the present results suggest that users will dynamically adjust their social network to maintain some optimal individual level of information flux. (What does this mean for Facebook’s growth?)

Friendship Paradox Redux: Your Friends Are More Interesting Than You – Nathan O. Hodas, Farshad Kooti, Kristina Lerman (PDF of the paper)
http://arxiv.org/abs/1304.3480

Three books on thinslicing worth reading

Reading this piece from Peter Adam this morning on the use of thinslicing in in-game decision making, it was useful to note that thinslicing works best when the information being analysed is bounded, that is, when there is a ‘yes/no’ choice for the subconscious brain:

“The human brain is fantastic at providing answers to complex yes/no questions quickly, but it starts to break down when the questions being presented are unbounded.* Gladwell provides many examples in Blink of complex snap decisions being made correctly when phrased as yes or no questions.”

In addition, in the comments there appear to be three useful books worth following up on in this field:

1. Art of Learning by Josh Waitzkin (a seminal work in the search for competence and mastery for me – I lend both my copies out frequently)

2. Gut Feelings: The Intelligence of the Unconscious by Gerd Gigerenzer (influenced Gladwell’s Blink heavily)

3. Thinking, Fast and Slow by Daniel Kahneman (these last two books are almost a point-counterpoint view of decision heuristics, and without either being right or wrong you get a more holistic view of the decision-making process)

‘Thinslicing’ connects the data, to the behaviour that creates it

If you’re here for the two examples of companies that improved customer service by allowing people (customers) to talk to people (employees), they are highlighted in red (the 2nd example is in the 3rd comment), and you can ignore the stuff about thinslicing:-)

To explain why I like the term thinslicing first take a look at the cool piece about data interpretation written today by Lithium’s Dr Michael Wu, including this neat illustration:

Then consider this: my response to reading this blog post clarified a key thing I have been trying to say. Firstly, I’ve come to use the term thinslicing for the business objective of finding the “interpretable, relevant and novel” in data, as Michael terms it, through a combination of art and science.


But now I’ve made the next step. The value of thinslicing lies in the elegant and powerful way the term connects the approach to data analytics to the behaviour that creates that data, namely the thinslicing of online consumers who “tend to ignore most information available and instead ‘slice off’ a few relevant information or behavioral cues that are often social to make intuitive decisions,” as Brian Solis puts it.

But perhaps it would help if I made clear what I don’t mean by thinslicing as a strategic tool, which is summed up nicely in these two paragraphs written by Bob Thompson on the CustomerThink community:

“Despite our best efforts to collect and analyze data, good business decisions will always include elements of judgement, intuition or just plain luck. Many day-to-day decisions are made with little or no thought, because the option selected just seems “right.” Gut-feel decisions might be examples of what Malcolm Gladwell called “thin-slicing” in his provocative 2005 bestseller Blink.

“However, the best decision can sometimes be counter-intuitive. For example, the financial services firm Assurant Solutions wanted to improve its “save” rate on customers calling in to cancel their protection insurance. The industry’s conventional wisdom, which resulted in 15-16% retention rates, was to focus on reducing wait time to boost customer satisfaction. But data analysis found a solution that tripled the retention rate: matching customer service reps with customers based on rapport and affinity.”

What I mean is the approach to data as you outline above which I categorize as thinslicing, coupled with the way consumers make purchasing decisions – which like good business “will always include elements of judgment, intuition or just plain luck”.

In other words, by thinslicing I don’t mean using intuition to make decisions. I mean adopting a strategy based on connecting the means of analyzing the data with the way the data is created by customers.

The question then is why? While it may be clever to see a way which logically connects the way to analyse data with the way it’s created, why is that potentially so useful to a business? Now there’s a good question. The obvious answer is that by aligning the analytic method used by your business with the way the data is created by your customers, you are going to produce better results: better-quality, actionable recommendations that also increase ROI. How does that sound?

Update: so there’s a nice response from Dr Michael Wu on that question of linking the two together, the way you approach the data with the way it’s created, that connects the two ends of the spectrum:

Good data scientists must know everything that happen to the data, from its creation, all the way to the point where they get their hands on the data. It is actually a pretty standard practice for hardcore financial/business analysts. Not only you need to “connecting the means of analyzing the data with the way the data is created,” you must know everything that happen to the data along the way, until the data reaches you (or the analyst). Only then can you be certain that your analysis is not biased or confounded by something before you get your hands on it. In statistics term, only then can you know the confidence interval of your result.

Influential people + influential friends = spread products

Identifying social influence in networks is critical to understanding how behaviors spread. We present a method for identifying influence and susceptibility in networks that avoids biases in traditional estimates of social contagion by leveraging in vivo randomized experimentation. Estimation in a representative sample of 1.3 million Facebook users showed that younger users are more susceptible than older users, men are more influential than women, women influence men more than they influence other women, and married individuals are the least susceptible to influence in the decision to adopt the product we studied. Analysis of influence and susceptibility together with network structure reveals that influential individuals are less susceptible to influence than non-influential individuals and that they cluster in the network, which suggests that influential people with influential friends help spread this product [red text highlighting added].

Identifying Influential and Susceptible Members of Social Networks
Sinan Aral, Dylan Walker

Science http://dx.doi.org/10.1126/science.1215842

Social media have provided plentiful evidence of their capacity for information diffusion. Fads and rumors but also social unrest and riots travel fast and affect large fractions of the population participating in online social networks (OSNs). This has spurred much research regarding the mechanisms that underlie social contagion, and also who (if any) can unleash system-wide information dissemination. Access to real data, both regarding topology—the network of friendships—and dynamics—the actual way in which OSNs users interact, is crucial to decipher how the former facilitates the latter’s success, understood as efficiency in information spreading. With the quantitative analysis that stems from complex network theory, we discuss who (and why) has privileged spreading capabilities when it comes to information diffusion. This is done considering the evolution of an episode of political protest which took place in Spain, spanning one month in 2011

Locating privileged spreaders on an online social network

Javier Borge-Holthoefer, Alejandro Rivero, and Yamir Moreno

Phys. Rev. E 85, 066123 (2012)

http://link.aps.org/doi/10.1103/PhysRevE.85.066123

New research challenges assumptions about Twitter news sharing communities

A new study of tweets spreading news from The New York Times finds that the Internet, while creating an open line of communication across continents, may at the same time be strengthening walls that separate users into ideological camps, and more.

Researchers for the study, “An Exploration of Social Identity: The Geography and Politics of News-Sharing Communities in Twitter,” collected 521,733 tweets posted by 223,950 unique users — all of them posting or retweeting at least three links referring to NYT articles over a fifteen day period, September 14 – 29, 2011. The tweeters were clustered by who communicates with whom, and groups were characterized by the topics they posted most, tweeters’ location, and their biography key words.

What the research team found were obvious and not so obvious connection points along with revelations that challenge easy assumptions about Twitter communities.

While liberal and conservative national political subgroups were identified, other dynamics were teased out in the mathematical modeling performed by the research team.

“A person who is cosmopolitan associates with others who are cosmopolitan, and a US liberal or conservative associates with others who are US liberal or conservative, creating separated social groups with those identities,” said Yaneer Bar-Yam, president of New England Complex Systems Institute (NECSI), where the research was done.

The clusters revealed not only local and national but also global (cosmopolitan) associations. The national group has subgroups specifically political (liberal and conservative) and one that is broadly interested in business, arts and sports. Contrary to frequent media portrayals, said Bar-Yam, the findings in turn suggest that online readers of The New York Times can have competing priorities and are not uniformly liberal.

“A significant fraction of the population has become so strongly identified with ideological camps that those identities drive their social associations,” said Bar-Yam. “For those who are concerned about the polarization of society into liberal and conservative camps, the results have both positive and negative connotations. There are specific subgroups that are polarized into opposing camps, but often associations are local, national and cosmopolitan.”

The study found these dominant clusters in this sample:

  • The cosmopolitan Global Political Group – those interested in international topics, who live in various cities around the world, including New York and Washington DC, are focused on human rights and politics, and may themselves be journalists.
  • The New York Scene – A New York City-oriented group interested in a diverse set of topics including world news, US news, business, arts, fashion and sports.
  • National Business – a group with the strongest focus on business, but also interest in world news, sports, fashion and the arts. It is geographically spread across the US.
  • Two clusters that are also US-based but are specifically liberal and conservative in their political orientation.

The study is available free at www.necsi.edu/research/social/nyttwitter/.

The authors note that more than 100 million tweets are posted each day, and that a significant portion includes links to online information.

Bar-Yam, in assessing the study, noted that “Twitter cannot be ignored in how peer-to-peer and mass media are connecting people separated in space and time—and what that means in the behavior of social systems.”

In a scientific context, each user, he said, “can be thought of as a node in a network, and the relationships as links between them.”

The study authors are Amaç Herdağdelen, Wenyun Zuo, Alexander Gard-Murray and Yaneer Bar-Yam. The work was supported in part by the Office of Naval Research.

Disclaimer: This post is a press release from NECSI, with which I have no paid connection. While I have used tools borrowed from complexity science in the health sector, my primary interest lies in adapting such insights for everyday use.

Explaining the power of the Facebook social graph using containers and social networks

I had a great time at Lean Startup Machine London this weekend, learning about using lean startup ideas and practice from a social networking perspective to build a business. It helped that I’d already been to hear Eric Ries talk, thanks to a tip off from Andy at Crocodile Clips (currently looking for investment himself I believe, and I picked up a good contact for him at the event). And also because I’ve been helping Barnaby with his Name That Place concept, thinking about how to get proof of concept and wondering about the best way to take that forward (btw he’s not in the office today at Regus, but moving lodgings to a house boat near Vauxhall:-)

So while I promised myself a lazy day today I wanted to quickly note down two things. I still have to prepare for a talk at Cass next week on using MVP to help corporates build successful online communities, and I still have to find a job/drive revenue before my severance from eBay runs out in X number of weeks. So time is short and comes with a cost attached, and before I pop into town to watch Mr Spacey in ‘Margin Call’ here’s a couple of quick creative thoughts.

Containers – in a container (paper page) – in a container (photo) – in a container (blog post) – etc

Mapping containers to networks. Photo by Stuart Glendinning Hall

I like to try and simplify things where possible, as that way you can get difficult things done more easily, right? So in thinking about what works as a social business I came up with the idea of matching up ‘containers’ – that is simply a tool for mapping how a social concept might work. The example above is an attempt to show across 3 degrees of separation how in rough and ready terms a business like Airbnb works best.

In trying to find somewhere to stay you are first going to see if any of your ‘friends’ live in the city you are visiting (the idea behind Airbnb is providing cheap places for people to stay in other people’s homes). But the chances they have a room in that city are ‘unlikely’ as your friendship network is relatively small. So you turn to ‘friends of friends,’ and they are ‘likely’ as they are, by virtue of wider geo-distribution, going to have a possible place to stay. But maybe the night you want to stay they are busy? So the next container along, which for the sake of 3rd degree of separation symmetry I’ve called ‘friends of friends of friends’ is very likely to provide the room you want, and for the time/date you want. (It’s a nice fact that the average user on Facebook is connected to everyone else by 3.74 degrees of separation, so you can see why Facebook based commerce using the social graph is so potentially powerful).
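The three containers can be sketched as a breadth-first search that groups people by how many hops they are from you. The graph and names below are placeholders invented for illustration, not real Airbnb or Facebook data. The point is only that each container tends to hold at least as many people as the one before it.

```python
from collections import deque

# Hypothetical friendship graph (undirected); all names are placeholders.
graph = {
    "you": ["ann", "bob"],
    "ann": ["you", "cat", "dan"],
    "bob": ["you", "dan", "eve"],
    "cat": ["ann", "fay"],
    "dan": ["ann", "bob", "gus"],
    "eve": ["bob", "hal"],
    "fay": ["cat"],
    "gus": ["dan"],
    "hal": ["eve"],
}


def people_by_degree(graph, start, max_degree=3):
    """Breadth-first search: group people by their hop distance from start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_degree:
            continue  # do not look beyond the outermost container
        for friend in graph[person]:
            if friend not in distance:
                distance[friend] = distance[person] + 1
                queue.append(friend)
    containers = {d: [] for d in range(1, max_degree + 1)}
    for person, d in distance.items():
        if d > 0:
            containers[d].append(person)
    return containers


containers = people_by_degree(graph, "you")
for degree, people in containers.items():
    print(degree, len(people), sorted(people))
```

Container 1 is ‘friends’, container 2 is ‘friends of friends’, and container 3 is ‘friends of friends of friends’. On a real social graph the outer containers grow far faster than in this tiny example.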

As a side note I really liked the pivot by lean startup participants ‘You never know’ led by ‘Easy Ed’ (alliteration really helps remember ppl’s first names:-) who started with the idea of an app where you could get matched up with single people in your immediate social network, but found that people didn’t want to do that for themselves. But then on pivoting they realised that ‘smug married’ people would happily introduce single people to other single people. Neat change of the social networking dynamic, from ‘doing it for yourself’ as a single person, which doesn’t work due to fear of rejection for example, to someone with a network ‘doing it for you’. So maybe that’s why blind dates work, so long as someone you know sets it up for you!

Superbowl Sunday: data crunchers vs grandmothers

While I was talking to Javi he happened to mention that one of the LSM London teams, ‘hstream’, had a real time Twitter analytics idea. I got excited at the idea of tracking sentiment around Patriots vs Giants and even had a look at the odds at Betfair. I also tried Twitter manually, so to speak, and found and favorited one tweet which from a gambler’s perspective seemed to stand out. It turned out to be right: the 94-year-old grandmother backed the Giants, the winners of Superbowl XLVI. Wonder what the results of hstream’s real time data analytics were?

PS: Post-Sony I now know this Giants case to be an example of #thinslicing on yes:-)

94-year-old grandmother predicts Giants to win. Photo by Stuart Glendinning Hall

Body Wisdom – Interplay of Body and Ego

‘Body Wisdom – Interplay of Body and Ego’ – new book by Ken Bausch. A few details below..

Body is wiser than Ego.
Ego is cleverer than Body.
When Ego catches Body’s tune, a song happens.
When Ego catches Body’s intuition, magic happens.
An idea is born.

When we focus with our hearts on troubling questions, our unconscious comes through for us. Open questions posed to the unconscious act as the strange attractors of chaos theory. They enable the creative speech of discovery.

In this book, you will explore how your ego rises from your body through language. You will appreciate how the creative thinking enabled by body–ego interplay builds your personality over time. The personal and social realms you create have remarkable properties. When you understand those properties, you open new vistas for viewing empathy, visions, hallucinations, dreams, and the reality of language. You open new ways to understand objective reality, the reality of religious myths, and even the reality of death.

The motif of most Western thought since the time of Zoroaster and Plato is that we are minds (and souls) trapped within physical bodies. St Augustine reinforced this tradition and Descartes formalized it. The upright man, as symbolized by the stick figure (all head and almost no body) became a standard Western conception.

Nietzsche saw the evil of this conception and protested it loudly. Merleau-Ponty demonstrated that our bodies both know and are known. Freud and Lacan showed how the ego rises from the body through the magic of language. Our bodies are microcosms of the universe and bearers of its unspoken secrets. Holograms, chaos theory, and fractal geometry bear witness.

For more about this book and its blog, go to
www.bodywisdombook.com

Notes on social media feedback loops

A few slides to lay out the principle of different feedback loops between your online community, your site, contributors, readers and other blogs and communities. Any feedback?

…And thanks to tweet-feedback from Jenny Ambrozek (@sagenet) for the wider context around the power of feedback loops – see the Fast Company article on how Ning is using this concept (what they term a ‘viral expansion loop’) to great effect. [I'm currently at the British Computer Society at Covent Garden, so looks like I'll be reading the print-out over lunch].

PS: It’s also a key way in which the world’s biggest social network site Facebook, by implementing the ‘status update’ feature, managed to rapidly grow its membership, as I outlined in a recent post. In other words this is a very powerful tool if done well, and with something people want. Anyone who wants my viral loop consultancy had better get in touch quick as I’m off to see a London-based social media agency about this on Thursday!

In the meantime I’ve ordered Adam Penenberg’s book ‘Viral Loop’ (see the Amazon widget on my homepage to order a copy) after a ‘winning streak’ of blog posts on the power of networks & feedback loops led me to his virtual door. If you fancy creating some feedback loops, or plain user flows for that matter, I’ve tracked down what appears to be a useful site: Product Planner. It allows you to create your own viral loops and check out some that have already been created.

And of course I did a very quick search today on Twitter on the key phrase ‘viral loops’ which unearthed this gem of a slideshow, from Josh Jeffreys (Interactive Creative Director at BusyEvent) which provides (in his words) an overview of how to build applications that have built-in mechanisms for driving users to recruit additional users through normal use of the application. Look out for the new acronym ‘UDU’ (users drive users): Viral Loops: Making Self-Marketing Apps

Social networking ability & field sense

Wayne Gretzky-Style ‘Field Sense’ May Be Teachable http://shar.es/m3E8W An application to social networking influence is my thought. Cheers!

My tweet today (above) follows my last blog post on the importance of location, rather than the number of connections, in determining an individual’s influence: “we may have got too focused on valuing networks in terms of who is the best connected. In fact the most influential person in the network comes down to location, rather than connections,” according to the research paper ‘Identifying Influential Spreaders in Complex Networks’.

In other words if location is important, and if networks are dynamic, then maybe you can get better at being in the right place at the right time to maximise your influence? Perhaps the sports science of ‘field sense’ has something to offer here to online social networking strategy? It’s just a hunch for now.
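To make the difference between ‘location’ and ‘connections’ concrete, here is a rough sketch. The post does not say how location is measured, so as an assumption this uses k-shell (k-core) decomposition, which peels a network like an onion and scores each node by how deep in the core it sits. The toy graph and node names are hypothetical.

```python
def k_shell_index(graph):
    """Give each node its k-shell index by repeatedly pruning low-degree nodes."""
    # Work on a copy so the caller's graph is left untouched.
    remaining = {node: set(neighbours) for node, neighbours in graph.items()}
    shell = {}
    k = 0
    while remaining:
        # Nodes with at most k remaining links are peeled off into shell k.
        to_remove = [n for n, neighbours in remaining.items() if len(neighbours) <= k]
        if not to_remove:
            k += 1  # nothing left to peel at this level, go one layer deeper
            continue
        for n in to_remove:
            shell[n] = k
            for m in remaining[n]:
                remaining[m].discard(n)
            del remaining[n]
    return shell


# Hypothetical network: "h" is a hub with five one-link followers plus one
# tie to "p"; "p", "q", "r" and "s" form a tightly knit group of four.
graph = {
    "h": ["l1", "l2", "l3", "l4", "l5", "p"],
    "l1": ["h"], "l2": ["h"], "l3": ["h"], "l4": ["h"], "l5": ["h"],
    "p": ["h", "q", "r", "s"],
    "q": ["p", "r", "s"],
    "r": ["p", "q", "s"],
    "s": ["p", "q", "r"],
}

shell = k_shell_index(graph)
for node in ("h", "p", "q"):
    print(node, "connections:", len(graph[node]), "shell:", shell[node])
```

Here the hub has the most connections but sits on the periphery once its one-link followers are peeled away, while the members of the tightly knit group score higher on this particular measure of location. Whether that translates into real influence is a separate question this sketch does not answer.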