Dr. Xia “Ben” Hu, assistant professor in the Department of Computer Science and Engineering at Texas A&M University, received a National Science Foundation (NSF) grant for his research on developing network embedding algorithms to analyze large-scale and complex attributed networks.
Attributed networks are common in networked information systems, such as social networks, academic networks and health care systems. A traditional, or plain, network records only the links between its nodes: user-to-user relationships in social networks, paper-to-paper citations in academic networks and doctor-to-doctor relationships in health care networks. In an attributed network, each node also carries a set of attributes, such as user demographics, paper contents and doctor expertise.
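The difference between a plain network and an attributed one can be illustrated with a small data structure. The following is only a minimal sketch: the class name `AttributedNetwork`, its methods and the sample users are hypothetical and are not taken from Hu's project.

```python
from dataclasses import dataclass, field


@dataclass
class AttributedNetwork:
    """Toy attributed network: links between nodes plus attributes per node."""
    edges: set = field(default_factory=set)         # undirected links, stored as frozenset pairs
    attributes: dict = field(default_factory=dict)  # node name -> dict of attributes

    def add_node(self, node, **attrs):
        # A plain network would store only the node; here we also keep its attributes.
        self.attributes[node] = attrs

    def add_edge(self, u, v):
        if u == v:
            raise ValueError("self-loops are not supported in this sketch")
        if u not in self.attributes or v not in self.attributes:
            raise KeyError("add both nodes before linking them")
        self.edges.add(frozenset((u, v)))

    def neighbors(self, node):
        # For every edge touching `node`, return the other endpoint, sorted by name.
        return sorted(next(iter(e - {node})) for e in self.edges if node in e)


# Hypothetical social network: edges are user-to-user relationships,
# attributes are user demographics.
net = AttributedNetwork()
net.add_node("alice", age=34, city="Austin")
net.add_node("bob", age=29, city="Houston")
net.add_node("carol", age=41, city="Austin")
net.add_edge("alice", "bob")
net.add_edge("alice", "carol")

print(net.neighbors("alice"))     # structure: who alice is linked to
print(net.attributes["alice"])    # attributes: what is known about alice
```

A plain network would keep only the `edges` set. The extra `attributes` mapping is what makes the network attributed, and it is this additional information that attributed network embedding sets out to use.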
“While most existing studies focus on plain network embedding, in this project we propose to investigate a novel problem of attributed network embedding by tackling challenges brought by large-scale and complex attributed network data,” Hu said.
Large-scale, complex attributed networks pose many challenges. First, many real-world attributed networks are very large, with thousands of features and millions of nodes and edges. Edges are links between the vertices, or nodes, of a network. For example, Facebook users update over 600,000 pieces of information each minute; Twitter has 319 million active users and, as of 2012, had 20 billion edges.
Second, instances in attributed networks are connected with each other, for example through common research interests or shared geographical locations. Existing network embedding methods are built upon the assumption that instances are independently distributed and equally weighted, which is not the case with attributed networks. Because of these challenges, traditional network embedding algorithms cannot be directly applied to large-scale and complex attributed networks.
Hu’s project will complement the White House Big Data Research and Development Initiative to accelerate the emerging field of data science by generating a new class of theoretical and practical network embedding methods to analyze large and complex network data.
The research aims to develop new algorithmic formulations that will transform existing network embedding algorithms. The resulting algorithms could be used in industrial applications in social computing, health informatics and enterprise systems.