The Power of Social Media

By Lesley Kriewald

Finding, identifying and evaluating online crowds for change

"The social Web and social media have essentially become weapons of mass persuasion. You have large numbers of people interacting with each other, so you see not only spam but political campaigns involved in this. You see evidence of governments and hate groups engaging in this."

Social media is, at its core, user-generated content. Texas A&M computer scientist James Caverlee is studying how social media can harness collective intelligence to perform tasks, to persuade and change minds, and maybe even to change the world.

As an expert in large-scale networked information systems—such as the World Wide Web, social media and mobile information systems—Caverlee looks at tremendous numbers of users and tremendous amounts of information, the places people go, the people they talk to and interact with, the content they are interested in, and the connections between them. 

These large-scale networked information systems are typically open and intentionally designed to encourage participation, Caverlee says. 

"This has lots of good benefits," he says. "You see self-organized systems. You see serendipitous discovery of new information and new uses of these systems beyond what the systems designers had ever imagined. You see communities form. There are all these exciting possibilities that happen when you let people collaborate."

Now he wants to know whether these systems can be mined to find interesting or useful information to empower decision makers.

Caverlee asks: Can collective intelligence be harnessed?

Online gathering places

Crowds are naturally forming, Caverlee says, and the goal of his current research is to find these online crowds and engage them to accomplish tasks.

In 2010, Caverlee received a grant from the Defense Advanced Research Projects Agency (through the agency's Information Innovation Office) to identify online "hotspots" in real-time social systems, mainly Twitter, where hundreds of millions of messages are posted per day. 

These online gatherings can be driven by natural disasters or by sporting events such as the Super Bowl. How do you find these hotspots in real time when huge numbers of people are posting hundreds of millions of messages? The task required developing new algorithms and methods for spotting such bursts of activity as they happen.
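The core idea of burst detection can be sketched in a few lines of Python. This is an illustrative toy, with made-up data and thresholds rather than the actual algorithms developed in the project: a keyword is flagged as a hotspot when its message rate in the current window far exceeds its historical average.

```python
from collections import Counter

def detect_hotspots(window_counts, history_avg, ratio=5.0, min_count=10):
    """Flag keywords whose count in the current time window is at least
    `ratio` times their historical average (and above an absolute floor)."""
    hotspots = []
    for word, count in window_counts.items():
        baseline = history_avg.get(word, 1.0)
        if count >= min_count and count / baseline >= ratio:
            hotspots.append(word)
    return hotspots

# Toy stream: "riot" suddenly spikes relative to its usual baseline.
window = Counter({"riot": 50, "coffee": 12, "game": 8})
baseline = {"riot": 2.0, "coffee": 10.0, "game": 6.0}
print(detect_hotspots(window, baseline))  # ['riot']
```

A production system would face the much harder problems the article alludes to: doing this over hundreds of millions of messages, in real time, and across locations rather than single keywords.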

"After the Vancouver riots, people started posting photos and video and messages of the riots," Caverlee says. "So you had this online reflection of current events. Assuming you can mine these large-scale systems and figure out where people are talking about this particular event, can we close the loop and engage with these crowds that are forming in these systems?

"It's a form of crowdsourcing," Caverlee says. "Can crowds of people online work together intelligently, and if so, how?"

Such crowdsourcing would be useful in a disaster, natural or otherwise. When disaster strikes, the natural impulse of people on-site is to use Facebook and Twitter to post photos and video of the damage, status updates, and their locations. Once these online crowds, formed in response to a particular emergency, are detected amid hundreds of millions of other online activities, the challenge becomes connecting them to emergency responders.

"If I am an emergency responder with limited resources, I need to know where to send trucks and rescue teams," he says. "Emergency responders have to make these decisions with limited information in the immediate moments after this disaster has struck. So let's engage with these detected crowds and know where they are, but also issue them jobs to complete."

The National Science Foundation has sponsored some of Caverlee's work in this area. He says that one job, for instance, would be for people to step outside their homes and take photos of the ground or buildings so that structural engineers can assess the safety of structures in the area. Computers can't do that job, Caverlee says. Real people need to be directed to take better photos, to give better information that can help decision makers know where to send resources.

"In the moment, as this is happening, the big challenge is to detect this crowd, connect them back to the stakeholder—in this case, the emergency responder—and then give them tasks to do to accelerate decision making and reduce response time."

Crowd quality—in the moment

Alongside this detection and task assignment comes another challenge: assessing the relative quality of a newly formed crowd. A second project, this one funded by the National Science Foundation, aims to assess and monitor the quality of these crowds.

Typical measures of Web quality include Google's PageRank, along with the content of a page or site, the number of clicks a page receives, the links surrounding the page, and how people engage with the site. CNN.com, for example, may be judged more reputable than a blog because CNN has many in-links, a long life on the Web, more site visitors and higher click rates.
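PageRank itself is a link-based score: a page is important if important pages link to it. A minimal power-iteration sketch over a toy link graph shows the idea (page names and link structure here are invented for illustration):

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            if not outs:                      # dangling page: spread rank evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
            else:                             # split rank among outgoing links
                for q in outs:
                    new[q] += damping * rank[p] / len(outs)
        rank = new
    return rank

# Toy Web: "fan" and "blog" both link to "cnn"; "cnn" links back to "blog".
ranks = pagerank({"cnn": ["blog"], "blog": ["cnn"], "fan": ["cnn"]})
print(max(ranks, key=ranks.get))  # 'cnn'
```

The heavily linked-to page ends up with the highest score, which is exactly the intuition behind treating in-links as a proxy for reputation.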

"It's a form of crowdsourcing. Can crowds of people online work together intelligently, and if so, how?"

But crowds form in real-time systems in response to events, such as a presidential debate, a natural disaster or even just friends talking about a game. The crowd itself may not have a long life span, so a long history to assess the crowd is not available. Instead, researchers have to develop other methods to evaluate relative quality. 

"We don't have history for these people, or accounts," Caverlee says, "so we have to make assessments in near real time and build models quickly so that we can speed and improve decision making."

Spam: Not just potted meat

Caverlee says the flip side of these open systems is getting people to participate in, but not abuse, the system. 

"Spam has moved out of our e-mail inboxes and into these social systems," he says, citing the example of Astroturf, a campaign that looks like a grassroots effort that instead uses fake accounts and bots to promote particular candidates or ideas.

Caverlee started his research with Web spammers who tried to manipulate search engine rankings. But spam has evolved, and with the rise of social media such as Facebook, Twitter and Flickr, spammers are now infiltrating and attacking those systems.

In work funded by Google and in collaboration with Steve Webb (then at Georgia Tech and now an independent consultant), Caverlee studied social spammers by deploying social honeypots, a new twist on a classic security reconnaissance trick: attract spammers, study their tactics and behaviors, and reverse engineer a way to stop them.

The research group ran the social honeypots in MySpace (when MySpace was still popular and active) and in Twitter, engaging in behaviors designed to attract spammers so the accounts could serve as an early warning system for new and emerging behaviors in these social systems.

In Twitter, the group set up accounts that didn't tweet or tweeted only randomly, and that did not connect with other Twitter users.

"These are passive accounts with no listed interests, that no active users would want to friend—these honeypots are irresistible to spammers," Caverlee says. "So if one of these accounts tried to be friended, we would turn down the request but then crawl back to mine their profiles and connections. We could build machine-learned models of them based on artifacts of what they've left in the system."

Typical spammer behavior is to follow as many other users as possible. Caverlee says an obvious sign of a bot-controlled account is a fixed ratio of accounts followed to followers. User histories showed that as soon as an upper threshold was reached, the bot-controlled account would immediately unfollow hundreds of accounts to maintain the set ratio. Many links per tweet are another red flag that an account is bot-controlled.
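Red flags like these lend themselves to simple heuristics. The sketch below uses invented thresholds, not the values from the published models, but it shows how a fixed follow ratio and link-heavy tweets could be scored:

```python
def bot_red_flags(following, followers, tweets):
    """Return a list of heuristic red flags for an account.
    Thresholds are illustrative, not taken from the actual study."""
    flags = []
    ratio = following / max(followers, 1)
    if following > 1000 and 0.9 <= ratio <= 1.1:
        flags.append("pinned follow ratio")   # follows/unfollows to hold ~1:1
    links = sum(t.count("http") for t in tweets)
    if tweets and links / len(tweets) > 0.8:
        flags.append("link-heavy tweets")     # nearly every tweet carries a URL
    return flags

spammy = bot_red_flags(2000, 1990, ["buy now http://x", "deal! http://y"])
print(spammy)  # ['pinned follow ratio', 'link-heavy tweets']
```

In practice such hand-written rules would only be features feeding the machine-learned models the article describes, which can weigh many weak signals together.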

With 60 Twitter honeypots, the team recorded 36,000 spammers in seven months. By modeling their behavior, the team could detect the spam accounts eight to ten days before Twitter's own spam detectors did.

Weapons of mass persuasion

The group noticed not only automated spam but also a lot of coordinated spam, more like campaigns: multiple accounts pushing similar talking points in subtle, sophisticated spam campaigns rather than the brute-force spam that's easier to identify.

So the team's current work, funded through the U.S. Air Force Office of Scientific Research in spring 2012, is looking at propaganda and strategic manipulation in these large networked social systems.

"The social Web and social media have essentially become weapons of mass persuasion," he says. "You have large numbers of people interacting with each other, so you see not only spam but political campaigns involved in this. You see evidence of governments and hate groups engaging in this."

In this war of persuasion, then, researchers want to know how to change people's minds and views of the world. 

"We know that this is happening already, but we as researchers don't have a very good handle on what's going on, only anecdotal evidence," he says. 

So Caverlee is now trying to detect persuasion at the scale of these large systems, finding these talking points and tracking their evolution. Coupled with that is building mathematical models that can explain this behavior. 

"Anecdotally, there's this idea of tipping points, where things go viral," he says. "Once we detect these campaigns, can we rewind the tape and build models? Can we find out if it's the content of the message, or the topology of the social network, or is it key influencers or is it that the message reached a certain kind of inertia, a certain mass of people?"

This all goes back to the idea of these open systems, Caverlee says.

"Anyone can engage, anyone can promote their messages, so it all becomes a war of ideas. Because I'm a computer scientist, I'm trying to build computational methods for detecting and mining these campaigns, for modeling their evolution, for determining how to stop a campaign or determine the factors that make a campaign go global, and understanding the factors that influence all of this." 

Dr. James Caverlee
Assistant Professor
Computer Science & Engineering
979.845.0537