Clustering and Similarity Learning in Financial Markets: A Tutorial for the Practitioners

Mehta, Dhagash; Thompson, John R. J.; Lee, Hoyoung; Lee, Yongjae

doi:10.2139/ssrn.5587353

Related researcher: Lee, Yongjae (Financial Engineering Lab.)

Author(s): Mehta, Dhagash; Thompson, John R. J.; Lee, Hoyoung; Lee, Yongjae

Issued Date: 2025-11

DOI: 10.2139/ssrn.5587353

URI: https://scholarworks.unist.ac.kr/handle/201301/89386

Citation: The Journal of Portfolio Management, v. 52, no. 2, pp. 150-183

Abstract: Clustering and similarity learning are increasingly indispensable for structuring heterogeneous financial data and supporting real-world decision-making. Traditional heuristics such as industry codes, static style boxes, or return correlations offer only coarse and rigid notions of peer groups. Recent advances in metric learning, graph methods, and large language models now make it possible to build adaptive neighborhoods of securities, funds, companies, and investors that align more closely with actual risk, liquidity, and thematic exposures. This tutorial synthesizes these methodological developments and demonstrates their use across major asset classes. Case studies show how supervised proximities improve bond substitution, how fund similarity systems reconcile category reproducibility with outlier detection, how multimodal pipelines refine company comparables for valuation and strategy, and how investor clustering enhances personalization and “know your client” (KYC) analytics. We emphasize modeling choices that make clustering and similarity auditable and robust under regime shifts. We also outline evaluation protocols, such as neighborhood stability, substitution fidelity, and segment utility, that align with investment, compliance, and fiduciary objectives. Overall, the central message for practitioners is pragmatic: Similarity systems have moved beyond experimental prototypes and now stand as deployable techniques within real investment workflows.

Publisher: PAGEANT MEDIA LTD

ISSN: 0095-4918
