Heavy hitter estimation over set-valued data with local differential privacy

Published: 2017-07-10 15:01





Lecture title: Heavy hitter estimation over set-valued data with local differential privacy

Speaker: Professor Ting Yu (于挺)





In local differential privacy (LDP), each user perturbs her data locally before sending it to an untrusted data collector, who analyzes the data to obtain useful statistics. Unlike in traditional differential privacy, data collectors in LDP never gain access to the exact values of sensitive data, which protects not only the privacy of data contributors but also the collectors themselves against liability if a data leak occurs. This different setting requires rethinking the techniques used to perform various data analysis tasks.
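As an illustration of local perturbation, the sketch below uses binary randomized response, a classic LDP mechanism. It is not taken from the talk: the function names, the population of 20,000 users, and the choice of epsilon = 1.0 are all assumptions made for the example. Each user reports her true bit with probability e^eps / (1 + e^eps) and the flipped bit otherwise, and the collector only sees the noisy reports.

```python
import math
import random


def perturb_bit(true_bit, epsilon, rng):
    """Run on the user's side: keep the bit with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return true_bit if rng.random() < p_truth else 1 - true_bit


def estimate_fraction(reports, epsilon):
    """Run on the collector's side: unbiased estimate of the fraction of users whose true bit is 1."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # E[observed] = (1 - p_truth) + f * (2 * p_truth - 1), solved for the true fraction f.
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


rng = random.Random(0)
# Hypothetical population: 30% of users hold the sensitive value.
true_bits = [1] * 6000 + [0] * 14000
reports = [perturb_bit(bit, 1.0, rng) for bit in true_bits]
estimate = estimate_fraction(reports, 1.0)
print(f"estimated fraction: {estimate:.3f}")
```

The collector never handles `true_bits`, only `reports`, yet the aggregate estimate should land close to the true 30% because the noise averages out over many users.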

In this talk, we present a systematic study of heavy hitter mining over set-valued data under LDP. We first review existing solutions, extend them to heavy hitter estimation, and explain why their effectiveness is limited. We then propose LDPMiner, a two-phase mechanism for obtaining accurate heavy hitters with LDP. The main idea is to first gather a candidate set of heavy hitters using a portion of the privacy budget, and then focus the remaining budget on refining the candidate set in a second phase, which is much more efficient budget-wise than obtaining the heavy hitters directly from the whole dataset. We provide both in-depth theoretical analysis and extensive experiments to compare LDPMiner against adaptations of previous solutions. The results show that LDPMiner significantly improves over existing methods. More importantly, LDPMiner successfully identifies the majority of true heavy hitters in practical settings.
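The two-phase idea can be illustrated with a toy sketch. This is not LDPMiner's actual protocol, whose details are not given here. It assumes generalized randomized response as the reporting mechanism, assumes each user samples one item from her non-empty set per phase, and uses hypothetical names (`grr_perturb`, `grr_estimate`, `two_phase_top_k`, `DUMMY`) and a hypothetical dataset. The estimates are of sampled-item counts, which is enough to rank items in this example.

```python
import math
import random
from collections import Counter

DUMMY = "<none>"  # hypothetical placeholder reported when a user holds no candidate item


def grr_perturb(value, domain, epsilon, rng):
    """Keep the value with probability e^eps / (e^eps + d - 1), else report another value uniformly."""
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + len(domain) - 1)
    if rng.random() < p_keep:
        return value
    return rng.choice([v for v in domain if v != value])


def grr_estimate(reports, domain, epsilon):
    """Debias the observed report counts into an estimated count for each domain value."""
    n, d = len(reports), len(domain)
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + d - 1)
    q_other = 1.0 / (math.exp(epsilon) + d - 1)
    counts = Counter(reports)
    return {v: (counts[v] - n * q_other) / (p_keep - q_other) for v in domain}


def two_phase_top_k(user_sets, domain, k, eps1, eps2, rng):
    """Toy two-phase heavy hitter search; each user spends eps1 + eps2 in total."""
    # Phase 1: report one sampled item over the full domain, keep 2k candidates.
    reports1 = [grr_perturb(rng.choice(sorted(s)), domain, eps1, rng) for s in user_sets]
    est1 = grr_estimate(reports1, domain, eps1)
    candidates = sorted(domain, key=est1.get, reverse=True)[:2 * k]

    # Phase 2: report again over the much smaller candidate domain (plus a dummy value).
    small_domain = candidates + [DUMMY]
    reports2 = []
    for s in user_sets:
        held = sorted(s.intersection(candidates))
        value = rng.choice(held) if held else DUMMY
        reports2.append(grr_perturb(value, small_domain, eps2, rng))
    est2 = grr_estimate(reports2, small_domain, eps2)
    return sorted(candidates, key=est2.get, reverse=True)[:k]


rng = random.Random(7)
domain = list(range(20))
# Hypothetical data: every user holds items 0 and 1 plus one random rare item.
user_sets = [{0, 1, rng.randrange(2, 20)} for _ in range(5000)]
top_items = two_phase_top_k(user_sets, domain, k=2, eps1=2.0, eps2=2.0, rng=rng)
print(sorted(top_items))
```

The second phase benefits from the smaller domain: with fewer possible values, a user keeps her true value with higher probability for the same epsilon, so the remaining budget yields less noisy estimates for the candidates.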


Ting Yu is a principal scientist in the cyber security group of Qatar Computer Research Institute (QCRI), and a joint professor in the College of Science and Engineering, Hamad Bin Khalifa University. Before joining QCRI in 2013, he was an associate professor in the Department of Computer Science at North Carolina State University. He obtained his BS from Peking University in 1997, his MS from the University of Minnesota in 1998, and his PhD from the University of Illinois at Urbana-Champaign in 2003, all in computer science. He received the NSF CAREER Award in 2007. His research focuses on privacy-preserving data analysis, data anonymization, and security analytics.