Clustering Based Peer Selection with Financial Ratios Kexing Ding Lucas Hoogduin Xuan Peng Miklos A. Vasarhelyi Yunsen Wang

Post on 06-Oct-2020


Page 1: Clustering Based Peer Selection with Financial Ratiosraw.rutgers.edu/docs/wcars/40wcars/Presentations/... · 2017. 12. 19. · •Clustering analysis is one of the data mining methodologies

Clustering Based Peer Selection with Financial Ratios

Kexing Ding

Lucas Hoogduin

Xuan Peng

Miklos A. Vasarhelyi

Yunsen Wang

Introduction

• Academic research and practical models use anomaly detection strategies.

• Indirect way to mimic the normal pattern: use a group of benchmark peer firms

[Figure: firms A, B, C, and D; benchmark group: A, B, C]

Introduction

• Benchmark firms: A B C

• Firm of interest: D

[Figure, 2011: firms A, B, C, and D]

Introduction

[Figure, 2012: firms A, B, C, and D]

• Benchmark firms: A B C

• Firm of interest: D

Introduction

[Figure, 2013: firms A, B, C, and D]

• Benchmark firms: A B C

• Firm of interest: D

Introduction

[Figure, 2014: firms A, B, C, and D; D is marked abnormal]

• Benchmark firms: A B C

• Firm of interest: D

Introduction

• Capital market research often calls for firms to be divided into more homogeneous groups

• Industry classifications used to select homogeneous groups:
  • SIC
  • NAICS
  • GICS
  • Fama-French industry classification

• This study provides a data mining-based classification scheme and compares it with traditional classification schemes

Clustering method

• Clustering analysis is a data mining methodology that groups a set of objects so that objects in the same cluster are more similar to each other than to objects in other clusters

• K-means clustering: Given a set of observations (x1, x2, …, xn), where each observation is a d-dimensional real vector, k-means clustering aims to partition the n observations into k (≤ n) sets S = {S1, S2, …, Sk} so as to minimize the within-cluster sum of squares.

Formally, the objective is to find:

$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$

where μi is the mean of points in Si.

k-means algorithm example (K=2)
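The figure for this slide did not survive extraction. As a stand-in, here is a minimal sketch of the standard k-means iteration (Lloyd's algorithm) with K=2. The six toy points and the choice to seed the centroids with the first k points are assumptions made for illustration; they are not taken from the slides.

```python
from math import dist


def kmeans(points, k, iters=100):
    """Lloyd's algorithm: alternate an assignment step and a centroid update."""
    # Assumption: seed centroids with the first k points (deterministic toy choice).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(centroids[c])  # an empty cluster keeps its centroid
        if updated == centroids:  # no centroid moved: converged
            break
        centroids = updated
    return labels, centroids


def wcss(points, labels, centroids):
    """Within-cluster sum of squares, the quantity k-means minimizes."""
    return sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical 2-D observations forming two visible groups.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (0.8, 1.2), (5.2, 4.8), (4.8, 5.2)]
labels, centroids = kmeans(points, k=2)
total = wcss(points, labels, centroids)
```

In practice the result depends on the starting centroids, so implementations usually restart from several random seeds and keep the run with the lowest within-cluster sum of squares.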

Grouping homogeneous companies

• We apply the K-means clustering algorithm to financial ratios to identify companies with similar operating characteristics.

• Prior literature shows that financial ratios are powerful tools for representing firm characteristics:
  • how firms operate their business
  • how firms apply accounting methods in reporting

• We posit that clustering on financial ratios is appropriate for identifying similar firms and separating them from other firms.
  • return on assets
  • current ratio
  • asset turnover
  • long-term debt over assets

(Krishnan and Press (2003))
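The four clustering ratios can be assembled into one feature vector per firm before k-means is run. The sketch below uses the usual textbook definitions of the ratios. The field names and figures are hypothetical (they are not COMPUSTAT mnemonics), and the z-score scaling is an assumption: the slides do not say whether the ratios are standardized, but distance-based clustering is commonly scaled so that no single ratio dominates.

```python
from statistics import mean, pstdev

# Hypothetical firm records; field names and values are illustrative only.
firms = {
    "A": {"net_income": 10.0, "total_assets": 100.0, "current_assets": 40.0,
          "current_liabilities": 20.0, "sales": 80.0, "long_term_debt": 30.0},
    "B": {"net_income": 5.0, "total_assets": 200.0, "current_assets": 30.0,
          "current_liabilities": 30.0, "sales": 100.0, "long_term_debt": 100.0},
    "C": {"net_income": 12.0, "total_assets": 150.0, "current_assets": 60.0,
          "current_liabilities": 20.0, "sales": 150.0, "long_term_debt": 15.0},
}


def ratio_vector(f):
    """Return the four clustering ratios for one firm record."""
    return [
        f["net_income"] / f["total_assets"],             # return on assets
        f["current_assets"] / f["current_liabilities"],  # current ratio
        f["sales"] / f["total_assets"],                  # asset turnover
        f["long_term_debt"] / f["total_assets"],         # long-term debt over assets
    ]


def standardize(rows):
    """Z-score each column (assumed preprocessing, not stated in the slides)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


names = sorted(firms)
raw = [ratio_vector(firms[n]) for n in names]
scaled = standardize(raw)  # one standardized 4-vector per firm, ready for clustering
```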

Research design

Comparing the clustering method to SIC, NAICS, GICS, and FF 49:

• Step 1. We start by randomly selecting a target firm in each industry category for each year from 1999 to 2014. This step creates four sets of firm-year targets.

• Step 2. For every year from 1994 to 2009, we use the selected financial ratios to conduct clustering and partition all firms into groups. The number of groups depends on the number of industry categories in the respective classification system.

Research design

• Step 3. We then identify the target firm’s peers as the firms that have been clustered in the same group as the target firm in each of the previous five years. These steps result in a dataset that consists of target firms and their group peers from 1995 to 2014.

• Step 4. Firms that the k-means algorithm has assigned to the same cluster as target firm i for the previous five consecutive years are regarded as firm i’s peers in the current year t.

[Timeline: clustering is run in each of years t-5, t-4, t-3, t-2, and t-1. In target year t, target firm i is selected; its peers are the firms in the same cluster as i in all five of those years.]
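The peer rule in Steps 3 and 4 amounts to intersecting, over the five prior years, the set of firms that share the target's cluster. A minimal sketch follows, assuming the yearly clustering output is stored as a mapping from year to {firm: cluster label}. The function name `find_peers`, the data layout, and the sample assignments are all hypothetical.

```python
def find_peers(assignments, target, year, window=5):
    """Firms sharing the target's cluster in each of the `window` years before `year`.

    `assignments` maps year -> {firm: cluster label}. Labels only need to be
    consistent within a year, because k-means labels are arbitrary across years.
    """
    peers = None
    for y in range(year - window, year):
        clusters = assignments.get(y, {})
        if target not in clusters:
            return set()  # target was not clustered in year y, so no peers qualify
        same = {f for f, c in clusters.items()
                if c == clusters[target] and f != target}
        peers = same if peers is None else peers & same
    return peers or set()


# Hypothetical cluster labels for four firms over 2009-2013.
assignments = {
    2009: {"A": 0, "B": 0, "C": 0, "D": 0},
    2010: {"A": 1, "B": 1, "C": 1, "D": 1},
    2011: {"A": 0, "B": 2, "C": 2, "D": 2},  # A leaves D's cluster this year
    2012: {"A": 1, "B": 1, "C": 1, "D": 1},
    2013: {"A": 0, "B": 0, "C": 0, "D": 0},
}
peers_of_d = find_peers(assignments, "D", 2014)
```

Because membership is compared within each year separately, the sketch never needs to match cluster labels from one year to the next.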

Evaluation

• An ideal method should group homogeneous firms with similar operating characteristics together (Amit and Livnat 1990 JBFA).

• Evaluation 1: Within-group dispersion (Krishnan and Press, 2003; Guenther and Rosman, 1994)

$$D = \frac{\sum_{i=1}^{N} (n_i - 1) V_i}{\sum_{i=1}^{N} (n_i - 1)}$$

where N is the number of groups, ni is the number of companies within industry group i, and Vi is group i's variance.
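As a rough illustration of the composite variance D, the sketch below pools group variances with weights (n_i − 1). It assumes Vi is the sample variance of one ratio within group i, and it skips single-firm groups, which have no sample variance. Both choices are assumptions, and the group names and values are made up.

```python
from statistics import variance


def composite_variance(groups):
    """D = sum((n_i - 1) * V_i) / sum(n_i - 1), with V_i the sample variance."""
    num = 0.0
    den = 0
    for values in groups.values():
        n = len(values)
        if n < 2:
            continue  # assumption: singleton groups are skipped
        num += (n - 1) * variance(values)
        den += n - 1
    if den == 0:
        raise ValueError("need at least one group with two or more firms")
    return num / den


# Hypothetical values of one ratio for the firms in two groups.
groups = {"g1": [1.0, 2.0, 3.0], "g2": [10.0, 14.0]}
d = composite_variance(groups)  # (2 * 1.0 + 1 * 8.0) / (2 + 1)
```

A lower D means tighter groups, so dividing the D of an industry classification by the D of the clustering scheme gives a ratio above 1 when clustering produces the more cohesive groups.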

Evaluation

• Evaluation 2: Correlation with peers

$$\mathit{Ratio}_{i,t} = \alpha_t + \beta_t \, \mathit{AveRatio}_{j,t} + \epsilon_{i,t}$$

where Ratioi,t is one of the selected ratios (other than the ratios used in clustering) for firm i and AveRatioj,t is the yearly average of the variable for all firms identified as peers
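The regression above has a single regressor, so its fit can be sketched with closed-form least squares. The example treats one year's cross-section: each observation pairs a target firm's ratio with the average of that ratio across its peers. The numbers are invented, and the function name `ols_adjusted_r2` is hypothetical.

```python
from statistics import mean


def ols_adjusted_r2(x, y):
    """Fit y = alpha + beta * x by least squares; return (alpha, beta, adjusted R^2)."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    ss_res = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # Adjust for one regressor plus an intercept (n - 2 residual degrees of freedom).
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return alpha, beta, adj_r2


# Hypothetical single-year data: peer-average ratio (x) and target-firm ratio (y).
ave_ratio = [0.01, 0.02, 0.03, 0.04]
ratio = [0.02, 0.04, 0.05, 0.08]
alpha, beta, adj_r2 = ols_adjusted_r2(ave_ratio, ratio)
```

A higher adjusted R-squared indicates that the peer average explains more of the variation in the target firms' ratio, which is how the competing grouping schemes can be ranked.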

Data and sample

• Collect data for U.S. firms from COMPUSTAT 1994 to 2014 annual files.

• Delete firm-year observations that are labeled fraud in the Audit Analytics restatement file.

• 207,999 observations.

• Conduct clustering analysis on firms, and group firms using K-means.

• Final sample includes four sets.

Selected ratios in evaluation

Category | Ratio
Profitability | Net profit margin (NPM)
Profitability | Gross profit / Total assets (GPROF)
Capitalization | Capitalization ratio (CAPITAL_RATIO)
Solvency | Total debt / Capital (DEBT_CAPITAL)
Efficiency | Sales / Equity (SALE_EQUITY)
Efficiency | Sales / Invested capital (SALE_INVCAP)
Financial Soundness | Cash flow margin (CFM)
Financial Soundness | Long-term debt / Book equity (DLTT_BE)
Other | R&D expense / Sales (RD_SALE)

Compare within-group dispersion: ratio of composite variances, SIC vs. clustering

Ratio of composite variances, FF49 vs. clustering

Ratio of composite variances, NAICS vs. clustering

Ratio of composite variances, GICS vs. clustering

Adjusted R-squared

Adjusted R-squared

Conclusion

• The clustering method leads to more cohesive groups than traditional classification schemes.

• Implications for academic research as well as practical applications: a more comparable benchmark in
  • detection of misstatements in audits
  • assessment of firms’ ability to pay debt in credit decisions
  • evaluation of firms’ operating performance
  • predictive analysis of financial distress, takeovers, and firm risk

• Further studies: apply the proposed clustering method to fraud detection

Thank you!