Algorithms for Collaborative Machine Learning under Statistical Heterogeneity

Hahn, Seok-Ju

Scholarworks@UNIST

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Kim, Gi-Soo	-
dc.contributor.author	Hahn, Seok-Ju	-
dc.date.accessioned	2024-10-14T13:50:13Z	-
dc.date.available	2024-10-14T13:50:13Z	-
dc.date.issued	2024-08	-
dc.description.abstract	Learning from distributed data without directly accessing it is a challenging, non-trivial task. Nevertheless, the need for distributed training of statistical models has been increasing, owing to the privacy concerns of local data owners and the cost of centralizing massively distributed data. Federated learning (FL) is currently the de facto standard for training a machine learning model across heterogeneous data owners without raw data leaving local silos. However, several challenges must be addressed for FL to become more practical. Among these challenges, the statistical heterogeneity problem is the most significant and requires immediate attention. From the main objective of FL, three major factors can be considered as starting points: the parameters, the mixing coefficients, and the local data distributions. In alignment with these components, this dissertation is organized into three parts. In Chapter II, a novel personalization method, SuPerFed, inspired by mode connectivity, is introduced. This method aims to find parameters that achieve better generalization across all local data distributions. In Chapter III, an adaptive decision-making algorithm, AAggFF, is introduced to induce a uniform performance distribution across participating clients; it is realized through an online convex optimization framework. This method explicitly learns fairness-inducing mixing coefficients sequentially and is also specialized for two practical FL settings. Finally, in Chapter IV, a collaborative synthetic data generation method, FedEvg, is introduced, leveraging the flexibility and compositionality of an energy-based modeling approach. The objective of this method is to emulate the joint density of disparate local data distributions without accessing them, which makes it possible to emulate centralized training of a model using the proxy dataset.
Taken together, these approaches provide practical solutions for mitigating the statistical heterogeneity problem in data-decentralized settings, paving the way for distributed systems and applications that use collaborative machine learning methods.	-
dc.description.degree	Doctor	-
dc.description	Department of Industrial Engineering	-
dc.identifier.uri	https://scholarworks.unist.ac.kr/handle/201301/84116	-
dc.identifier.uri	http://unist.dcollection.net/common/orgView/200000813156	-
dc.language	ENG	-
dc.publisher	Ulsan National Institute of Science and Technology	-
dc.subject	Federated Learning|Collaborative Machine Learning|Statistical Heterogeneity|Deep Learning|Distributed Optimization	-
dc.title	Algorithms for Collaborative Machine Learning under Statistical Heterogeneity	-
dc.type	Thesis	-
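
The abstract names three components of the FL objective: the parameters, the mixing coefficients, and the local data distributions. The sketch below illustrates how they fit together in a generic weighted objective of the form sum_k w_k * F_k(theta). It is not SuPerFed, AAggFF, or FedEvg; the scalar model, the squared-error loss, the size-proportional weights, and the client datasets are all assumptions made up for illustration.

```python
# Illustrative sketch only: a generic weighted federated objective,
# NOT the SuPerFed, AAggFF, or FedEvg algorithms from the thesis.

def local_loss(theta, data):
    """Mean squared error of a scalar parameter `theta` on one client's data."""
    return sum((theta - x) ** 2 for x in data) / len(data)

def global_objective(theta, client_data, weights):
    """Weighted sum of local losses: sum_k w_k * F_k(theta)."""
    return sum(w * local_loss(theta, d) for w, d in zip(weights, client_data))

def aggregate(local_params, weights):
    """Server step: mix locally fitted parameters with the mixing coefficients."""
    return sum(w * p for w, p in zip(weights, local_params))

# Hypothetical heterogeneous client datasets (made-up numbers).
client_data = [[1.0, 2.0, 3.0], [10.0], [4.0, 6.0]]

# Assumed mixing coefficients: proportional to each client's sample size.
n_total = sum(len(d) for d in client_data)
weights = [len(d) / n_total for d in client_data]

# For squared error, each client's local minimizer is its sample mean.
local_params = [sum(d) / len(d) for d in client_data]

# Only parameters and weights reach the server; raw data stays local.
theta_global = aggregate(local_params, weights)
```

With size-proportional weights and this squared-error loss, the aggregated parameter coincides with the mean of the pooled data, even though the server never sees the raw values. Changing the weights shifts which clients the global parameter favors, which is the lever the abstract refers to as the mixing coefficient.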

Tel : 052-217-1403 / Email : scholarworks@unist.ac.kr

ScholarWorks@UNIST was established as an OAK Project for the National Library of Korea.