Disentangled Representation Learning in Heterogeneous Information Network for Large-scale Android Malware Detection in the COVID-19 Era and Beyond

Abstract: In the fight against the COVID-19 pandemic, many social activities have moved online, and society's overwhelming reliance on cyberspace makes its security more important than ever. In this paper, we propose and develop an intelligent system named Dr.HIN to protect users against evolving Android malware attacks in the COVID-19 era and beyond. In Dr.HIN, besides app content, we consider higher-level semantics and social relations among apps, developers, and mobile devices to depict Android apps comprehensively. We then introduce a structured heterogeneous information network (HIN) to model these complex relations and exploit a meta-path guided strategy to learn node (i.e., app) representations from the HIN. Because the representations of malware can be highly entangled with those of benign apps in the complex development ecosystem, detecting evolving malware poses a new challenge: learning the latent explanatory factors hidden in the HIN embeddings. To address this challenge, we integrate domain priors generated from different views (i.e., app content, app authorship, and app installation) to devise an adversarial disentangler that separates the distinct, informative factors of variation hidden in the HIN embeddings for large-scale Android malware detection. This is the first attempt at disentangled representation learning on HIN data. Promising experimental results on large-scale, real sample collections from the security industry demonstrate the performance of Dr.HIN in evolving Android malware detection, in comparison with baselines and popular mobile security products.

12/08/2020

Emily Tseng, Rosanna Bellini, Nora McDonald, Matan Danos, Rachel Greenstadt, Damon McCoy, Nicola Dell, and Thomas Ristenpart

Multidisciplinary Topics and Applications, Security and Privacy, Classification, Mining Graphs, Semi Structured Data, Complex Data

13:28

25/04/2020

The DELAY Framework: Designing for Extended LAtencY

Derek Hansen, Amanda Hughes, Sophie Cram, Austin Harker, Brinnley Ashton, Karli Hirschi, Ben Dorton, Nate Bothwell, and Ashley Stevens


interpersonal communication, delayed communication, high-latency, delay framework, interplanetary communication, social presence, social connectedness

14:48

12/08/2020

behaviors, cases, classification, classifiers, communities, detection, factors, large_scale, learning, linguistic, linguistic aspects, networks, performance, representations

9:53

25/06/2020

Junqi Zhang, Bing Bai, Ye Lin, Jian Liang, Kun Bai, and Fei Wang

Austine Zong Han Yapp, Hong Soo Nicholas Koh, Yan Ting Lai, Jiawen Kang, Xuandi Li, Jer Shyuan Ng, Hongchao Jiang, Wei Yang Bryan Lim, Zehui Xiong, and Dusit Niyato

model personalization, user adaptation, continual learning, domain adaptation, privacy, scalability, unsupervised learning

1:00

06/12/2020

CWE, Python, CVE, vulnerability, JavaScript, software security

5:24

02/02/2021

Qualitative and quantitative studies of social media, Human computer interaction, social media tools, navigation and visualization, Subjectivity in textual data, sentiment analysis, polarity/opinion identification and extraction, linguistic analyses of soc

7:05

19/08/2021