
Research Article

Using an Optimized Learning Vector Quantization- (LVQ-) Based Neural Network in Accounting Fraud Recognition

Yuan Zheng,1 Xiaolan Ye,2 and Ting Wu3

1School of Finance, Anhui University of Finance and Economics, Bengbu, Anhui 233030, China
2School of Insurance, Southwestern University of Finance and Economics, Chengdu, Sichuan 611130, China
3School of Computer Science, ZhongYuan University of Technology, Zhengzhou, Henan 450007, China

Correspondence should be addressed to Yuan Zheng; 120081779@aufe.edu.cn

Received 2 June 2021; Accepted 21 June 2021; Published 29 June 2021

Academic Editor: Syed Hassan Ahmed

Copyright © 2021 Yuan Zheng et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

With the continuous development and wide application of artificial intelligence technology, artificial neural networks have begun to be used in the field of fraud identification. Among them, the learning vector quantization (LVQ) neural network is the most widely used in this field, and its fraud identification rate is relatively high. In this context, this paper explores this neural network technology in depth, uses the same fraud sample to test the fraud recognition rates of the models compared, and on this basis proposes an optimized LVQ-based combined neural network model for fraud risk recognition. This paper selects 550 listed companies that committed fraud from 2015 to 2019 as the fraud samples, determines 550 nonfraud matching sample companies one-to-one in accordance with the Beasley principle, and uses these as the research sample. The fraud risk identification indicators with the better identification effects reported in the literature were used as the initial indicator system. After the collinearity problem was eliminated through the paired-sample t test and principal component analysis, the five indicators with the best identification effects were finally selected. Finally, based on the above theoretical analysis and empirical research, the paper summarizes its findings, analyzes the shortcomings of this research, and puts forward prospects for the future development of fraud risk identification models.

1. Introduction

Fraud has severely affected the public's confidence in the accounting community and the capital market, and how to effectively identify corporate fraud has become the top priority of accounting theory, practice, and regulatory agencies. Empirical research shows that model-based fraud identification performs better than fraud case analysis, and the construction of an effective fraud risk identification model is inseparable from sound fraud identification indicators and appropriate identification methods. At present, research on fraud identification indicators is relatively complete, but there is less research on fraud identification models. There is no unified definition of fraud. Governments and regulatory agencies at home and abroad have defined it. For example, the National Antifraud Accounting Reporting Committee (Treadway Committee) defines accounting fraud as "a deliberate or reckless act. Whether it is false reporting or omissions, the result is a major misleading accounting report." In addition, scholars have also defined accounting fraud [1, 2]. Hormozzadefighalati defined accounting fraud as the intentional use of fraudulent accounting and other irregular or illegal means to seek self-interest, thereby harming the interests of others [3]. Wang et al. held that accounting fraud occurs when company executives deceptively manipulate accounting reports [4]. Siamidoudaran et al., on the basis of summarizing and analyzing the existing literature, proposed that accounting fraud is an act in which the perpetrator, intending to obtain illegitimate benefits, deliberately violates the principle of authenticity in a planned, purposeful, and targeted manner and violates state laws, regulations, policies, and rules, ultimately leading to the distortion of accounting information [5]. Based on the above definitions, accounting fraud is defined in this article as follows: in order to seek illegitimate benefits, the management of a company deliberately uses forgery, tampering, cover-up, and other deceptive methods to cause the distortion of accounting information. The types of accounting fraud mainly include illegal purchase of stocks, false statements, violations of capital contributions, hype in violation of regulations, and so on.

Hindawi
Computational Intelligence and Neuroscience
Volume 2021, Article ID 4113237, 10 pages
https://doi.org/10.1155/2021/4113237

Environment refers to all external things that surround a certain thing and have some influence on it; that is, the environment is the set of surrounding things that are related to a certain central thing. Any enterprise or social organization exists in an environment. The corporate environment refers to a collection of interrelated, mutually restrictive, and constantly changing factors that affect corporate management decisions and production and operation activities. The internal environment of an enterprise refers to the sum of the material and cultural environment within the enterprise. The external environment of an enterprise refers to the sum of the political, economic, social, and technological environments outside the enterprise. Following these definitions, this article takes the environment of accounting fraud to be a collection of interrelated, mutually restrictive, and constantly changing factors that affect the existence of accounting fraud, including internal and external environmental characteristics. Because accounting fraud is carried out with the company as its carrier, the environmental characteristics of accounting fraud refer to the environment inside and outside the company that affects accounting fraud. The following paragraphs introduce the internal and external environmental characteristics of accounting fraud.

The internal environmental characteristics of accounting fraud refer to the collection of various factors that exist within the enterprise and affect accounting fraud [6]. Specifically, they include the corporate governance structure, institutional settings, the implementation of the system, and so on. The internal environment of accounting fraud affects the size of the opportunity for company managers to commit fraud, the probability of fraud being discovered, and the nature and degree of punishment of fraudsters after fraud is discovered. A sound internal mechanism will greatly reduce the probability of accounting fraud, whereas a defective internal mechanism with loopholes will increase it. The research of Brinkrolf et al. [7] covers two internal environmental characteristics of accounting fraud: the company's internal governance structure and the company's operating performance. The internal environmental characteristics of accounting fraud studied in this article include board structure, director incentives, leadership structure, ownership structure, and executive incentives.

The external environmental characteristics of accounting fraud refer to the collection of various factors that exist outside the enterprise and affect accounting fraud. Specifically, they include external corporate governance, the market environment, the political environment, and the economic environment. The external environment of accounting fraud affects the opportunity for company managers to commit fraud, the probability of fraud being discovered, and the nature and degree of punishment of fraudsters after fraud is discovered. The supervision and governance of fraudsters by the external environment is a beneficial supplement to the internal environment. A sound market system, a favorable political and economic environment, and an effective supervision system can reduce the possibility of accounting fraud. Conversely, if the market, political, and economic environments cannot effectively supervise accounting fraud, the possibility of accounting fraud will increase. The external environment of accounting fraud studied in this article includes the quality of external auditing, the degree of media development, the degree of institutional and legal constraints, and the degree of product competition.

With the continuous development and wide application of artificial intelligence technology, artificial neural networks have begun to be used in the field of fraud identification. Among them, the LVQ neural network is the most widely used in this field, and its fraud identification rate is relatively high. In this context, this paper explores this neural network technology in depth, uses the same fraud sample to test the fraud recognition rates of the models compared, and on this basis proposes an optimized LVQ-based combined neural network model for fraud risk recognition. This paper selects listed companies that committed fraud from 2015 to 2019 as the fraud samples and uses the fraud risk identification indicators with the better identification results reported in the literature as the initial indicator system. After the collinearity problem was eliminated through the paired-sample t test and principal component analysis, the five indicators with the best recognition effect were finally selected. Finally, based on the above theoretical analysis and empirical research, the paper summarizes its findings, analyzes the deficiencies of this research, and puts forward prospects for the future development of fraud risk identification models.

2. LVQ Neural Network

2.1. Concept and Structure of LVQ Neural Network Model. The LVQ neural network is built on the competitive network structure and is a feedforward neural network. It combines the idea of competitive learning with supervised learning algorithms [8]. During network learning, the category assigned to each input sample is specified through the teacher signal, which overcomes the lack of classification information caused by the use of unsupervised learning algorithms in self-organizing networks. The biggest advantage of the LVQ neural network is that it can not only classify linear input data but also process multidimensional data containing noise interference. The LVQ neural network has a wide range of applications in optimization and pattern recognition, and it is also one of the typical classification models. Similar to the BP neural network, the LVQ neural network has three network layers: the input layer, the competition layer, and the linear layer. The input layer and the competition layer are fully connected [9].

The competition layer learns to classify the input vectors, and the classes it forms are called subcategories. The competition layer and the linear layer are partially connected. The linear layer maps the classification results of the competition layer to the target classification results according to the needs of users, and the classes of the linear layer are called target categories. The LVQ neural network model structure and data processing process are shown in Figures 1 and 2.

The output of each neuron in the competition layer corresponds to a subcategory, and the output of each neuron in the linear layer corresponds to a target category. The competition layer therefore obtains the subcategory result through learning, and the linear layer maps that subcategory result to the target classification result [10]. The learning rules of the LVQ neural network combine supervised learning rules with competitive learning rules. First, the LVQ neural network is trained on a set of training samples with teacher signals [11]. Usually, each neuron in the competition layer is assigned to one output neuron, and the corresponding weight is generally set to 1, which yields the weight matrix of the output layer. The rows of this weight matrix represent categories, and the columns represent subcategories. Generally, this weight matrix is set in advance to specify the category of each output neuron, and it does not change during the training process. Network learning is carried out by changing the weights of the competition layer. From the category of the input sample and the category of the winning neuron, it can be judged whether the current classification is correct. If the classification is correct, the weight vector of the winning neuron is adjusted toward the input vector; if the classification is wrong, it is adjusted in the opposite direction [12, 13].

Although competitive networks can perform adaptive classification, there are still several problems to be solved. First, when the learning rate is too high, training is fast, but after the weights have been updated and the correct classification has been achieved, the weights are prone to oscillate and become unstable. When the learning rate is too low, training is slow, but once the correct classification is achieved, the weights do not easily oscillate. A compromise must therefore be made between learning rate and network stability. Second, when the vectors belonging to different categories are very close to each other, the weight vectors of the prototypes interfere with each other, which destroys the classification. Third, when the input vectors of the neural network are too far away from a neuron, that neuron may never win the competition and therefore never learn [14].

2.2. LVQ Neural Network Algorithm Process. The LVQ neural network is a two-layer network; the number of neurons in the first layer is S1 and the number of neurons in the second layer is S2. In this neural network, each neuron in the competition layer is assigned to one neuron in the output layer [15, 16]. The neurons in the competition layer represent subcategories, while those in the output layer represent categories. Generally, each category can include several subcategories, so S1 is usually greater than S2.

The difference between the LVQ network and other networks is that the net input of the network is not the inner product of the input vector and the weight vector but the distance between the input vector and the weight vector [17, 18]. The competition layer of the LVQ neural network determines which neuron the input vector is closest to, sets the output of this neuron to 1, and sets the outputs of the remaining neurons to 0, which gives the subcategory of the input vector. The second layer then determines which class this subclass belongs to, which finally determines the class of the input vector. The advantage of this calculation method is that the input vector does not need to be normalized, which simplifies the calculation process.

The net input of the LVQ network is given by the following formula:

$$X_i^1 = -\left\| x_i^1 - p \right\|. \tag{1}$$

In vector form, it is expressed as

$$X_i^1 = -\begin{pmatrix} \left\| x_1^1 - p \right\| \\ \left\| x_2^1 - p \right\| \\ \vdots \\ \left\| x_i^1 - p \right\| \end{pmatrix}. \tag{2}$$
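The net input and the winner-take-all output of the competition layer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `net_input` and `compete` and the toy prototype vectors are hypothetical, `prototypes` plays the role of the vectors $x_i^1$, and `p` is the input vector.

```python
import math

def net_input(prototypes, p):
    """Negative Euclidean distance between each prototype and the input p,
    following formulas (1) and (2)."""
    return [-math.sqrt(sum((w - v) ** 2 for w, v in zip(proto, p)))
            for proto in prototypes]

def compete(n1):
    """Winner-take-all output: 1 for the neuron whose net input is largest
    (that is, whose prototype is closest to the input), 0 for the others."""
    winner = n1.index(max(n1))
    return [1 if i == winner else 0 for i in range(len(n1))]

# Hypothetical toy values, chosen only to make the arithmetic easy to follow.
prototypes = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
p = (3.0, 0.0)

n1 = net_input(prototypes, p)   # distances 3, 4, and sqrt(5), negated
a1 = compete(n1)                # the third prototype is the closest
print(n1, a1)
```

Because the largest net input corresponds to the smallest distance, taking the maximum of the negated distances selects the nearest prototype without normalizing the input.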

The weight $W^1$ of the first (competition) layer of the LVQ network is initialized with a set of small random values before sample training starts. After the samples have been trained one by one, the weight vectors of the competition layer represent the standard pattern vectors of the different categories, so the network can recognize and classify input vectors. When a new vector is input, the network can adjust the weight vector closest to it in time so that the vector is still classified correctly. The category of the input vector is determined by the weight $W^2$ of the second (linear) layer of the LVQ network. The rows of $W^2$ represent classes, and the columns of $W^2$ represent subclasses. Usually, several subclasses can be combined into one class [19]. Each column of the weight matrix $W^2$ contains one and only one 1, which means that the subclass of that column belongs to the class of the row in which the 1 appears. The meaning of the entries of $W^2$ is shown in the following formula:

$$\left( W_{ij}^2 = 1 \right) \longrightarrow \text{subclass } j \text{ is a part of class } i. \tag{3}$$
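As a small worked example of formula (3), the sketch below builds a second-layer weight matrix and maps a winning subclass to its class. The matrix values (four subclasses grouped into two classes) and the function name `linear_layer` are assumptions chosen for illustration, not values from the paper.

```python
def linear_layer(w2, a1):
    """Output of the linear layer: the matrix-vector product of W2 and the
    one-hot competition-layer output a1."""
    return [sum(wij * aj for wij, aj in zip(row, a1)) for row in w2]

# Hypothetical W2: rows are classes, columns are subclasses.
# Subclasses 0 and 1 belong to class 0; subclasses 2 and 3 belong to class 1.
W2 = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

# Each column must contain exactly one 1, as the text requires.
column_sums = [sum(row[j] for row in W2) for j in range(len(W2[0]))]

a1 = [0, 0, 1, 0]            # subclass 2 won the competition
a2 = linear_layer(W2, a1)    # one-hot class vector
print(column_sums, a2)
```

Since `a1` is one-hot, the product simply selects the column of the winning subclass, so the linear layer acts as a fixed lookup from subclass to class.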

The basic idea of the LVQ neural network algorithm is, first, to find the competition-layer neuron closest to the input vector and the linear output-layer neuron connected to it and, second, to check whether the class of the input vector is consistent with the class corresponding to that output-layer neuron. If they are consistent, the weights of the corresponding competition-layer neuron are moved toward the input vector; if they are inconsistent, the weights are moved in the opposite direction [20]. The specific steps are as follows:

Step 1. Initializing the network: initialize the weights $W_{ij}$ between the neurons in the input layer and the competition layer and the learning rate $\delta$.

Step 2. Importing the input layer vector: after the input vector is presented, calculate the distance $d_i$ between each neuron in the competition layer and the input vector according to the following formula:

$$d_i = \left\| X - W_i \right\| = \sqrt{\sum_{j=1}^{n} \left[ X_j - W_{ij} \right]^2}, \tag{4}$$

where $X$ is the input vector and $W_{ij}$ is the weight between input-layer neuron $j$ and competition-layer neuron $i$.

Step 3. Determining the winning neuron: find the competition-layer neuron with the smallest distance from the input vector, take it as the winning neuron $k$, set the $k$-th element of the competition-layer output vector $a^1$ to 1, and set the rest to 0.

Step 4. Calculating the output vector of the linear layer: calculate the value of the output vector of the linear layer and obtain the classification.

Step 5. Adjusting the weight: compare the target output with the actual output of the network. If the classification is correct, adjust the weight vector of the winning neuron toward the input vector; if not, adjust it in the opposite direction.

Step 6. Updating the learning rate: update the learning rate according to the following formula:

$$\delta(x) = \delta(0) \left( \frac{x_j - x}{x_j} \right). \tag{5}$$

Step 7. Judging the result: when $x < x_j$, set $k = k + 1$, return to Step 2, input the next sample, and repeat the above process, adjusting the weights until $x = x_j$.
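Steps 1 to 7 can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation (the experiments in this paper were run with the MATLAB neural network toolbox). The names `train_lvq` and `predict_lvq`, the toy data, the fixed initial prototypes, and the reading of $x$ in formula (5) as the current iteration and $x_j$ as the maximum number of iterations are all assumptions made here.

```python
import math
import random

def distance(x, w):
    """Euclidean distance between input x and weight vector w, formula (4)."""
    return math.sqrt(sum((xj - wj) ** 2 for xj, wj in zip(x, w)))

def train_lvq(samples, labels, prototypes, proto_labels,
              delta0=0.1, max_iter=200, seed=0):
    """Basic LVQ training. prototypes[i] is the weight vector of competition
    neuron i; proto_labels[i] is the class that neuron is connected to."""
    rng = random.Random(seed)
    weights = [list(w) for w in prototypes]              # Step 1
    for it in range(max_iter):                           # Step 7: stop at max_iter
        delta = delta0 * (max_iter - it) / max_iter      # Step 6, formula (5)
        idx = rng.randrange(len(samples))                # Step 2: present a sample
        x, target = samples[idx], labels[idx]
        dists = [distance(x, w) for w in weights]
        k = dists.index(min(dists))                      # Step 3: winning neuron
        # Steps 4-5: move toward x if the winner's class is right, away if not.
        sign = 1.0 if proto_labels[k] == target else -1.0
        weights[k] = [wj + sign * delta * (xj - wj)
                      for xj, wj in zip(x, weights[k])]
    return weights

def predict_lvq(x, weights, proto_labels):
    """Class of the prototype nearest to x."""
    dists = [distance(x, w) for w in weights]
    return proto_labels[dists.index(min(dists))]

# Hypothetical toy data: class 0 near (0, 0), class 1 near (1, 1).
samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (1.0, 0.9), (0.9, 1.1), (1.1, 1.0)]
labels = [0, 0, 0, 1, 1, 1]
trained = train_lvq(samples, labels,
                    prototypes=[(0.5, 0.4), (0.6, 0.7)], proto_labels=[0, 1])
print(predict_lvq((0.1, 0.1), trained, [0, 1]),
      predict_lvq((1.0, 1.0), trained, [0, 1]))
```

The linearly decaying learning rate reflects the compromise discussed in Section 2.1: large early steps speed up training, while small late steps limit oscillation of the weights.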

During network training, an input feature vector may be too far away from a neuron, so that the neuron never wins the competition [21–23]. The fraud identification process in accounting based on the LVQ neural network is shown in Figure 3.

3. Sample Selection and Fraud Risk Identification Index Screening

3.1. Sample Selection. Because the detection of corporate fraud lags behind its occurrence, fraud is often discovered only after the fraudulent annual accounting statements are released. This article determines the fraud samples from the company violation announcements publicly disclosed on the websites of the China Securities Regulatory Commission, the Shanghai Stock Exchange, and the Shenzhen Stock Exchange, combined with the listed company violation processing database. On this basis, this article selects 500 companies that committed fraudulent activities during the five years from 2015 to 2019 as the fraud samples.

Figure 1: LVQ neural network structure diagram (input layer x1, x2, ..., xn; competitive layer; output layer).

Figure 2: Data processing process diagram of LVQ neural network model.

The empirical research sample in this paper includes fraudulent companies and matching companies that are paired with the fraudulent companies one-to-one. Therefore, the sample data include 550 companies. The matching sample companies are determined according to the Beasley principle. There are four specific criteria: (1) the matching company has no fraud in the fraud year of its corresponding fraud company; (2) the asset scale of the matching company in the previous year is very close to that of the corresponding fraud company, with a difference of less than 30%; (3) the matching company and its corresponding fraud company are in the same industry and have the same or similar main business; and (4) the matching company and its corresponding fraud company are listed on the same stock exchange. When the four criteria cannot all be met at the same time, priority is given to (2) and (3), provided that (1) is met.
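The matching rule can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration: the `Company` fields, the toy companies, measuring the asset difference relative to the fraud company's prior-year assets, and breaking ties by preferring the same exchange and then the closest asset size.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Company:
    name: str
    industry: str
    exchange: str
    prior_year_assets: float
    fraud_years: frozenset

def is_candidate(fraud, other, fraud_year, max_gap=0.30):
    """Criteria (1) to (3): no fraud in the fraud year, prior-year asset gap
    below 30% (relative to the fraud company, an assumption), same industry."""
    if fraud_year in other.fraud_years:               # criterion (1), mandatory
        return False
    gap = abs(other.prior_year_assets - fraud.prior_year_assets) / fraud.prior_year_assets
    return gap < max_gap and other.industry == fraud.industry

def pick_match(fraud, pool, fraud_year):
    """Among the candidates, prefer the same exchange (criterion (4)),
    then the closest asset size."""
    candidates = [c for c in pool if is_candidate(fraud, c, fraud_year)]
    if not candidates:
        return None
    return min(candidates,
               key=lambda c: (c.exchange != fraud.exchange,
                              abs(c.prior_year_assets - fraud.prior_year_assets)))

# Hypothetical companies, not taken from the paper's sample.
fraud = Company("F", "steel", "SSE", 100.0, frozenset({2017}))
pool = [
    Company("A", "steel", "SZSE", 110.0, frozenset()),
    Company("B", "steel", "SSE", 125.0, frozenset()),
    Company("C", "steel", "SSE", 105.0, frozenset({2017})),   # fails (1)
    Company("D", "retail", "SSE", 100.0, frozenset()),        # fails (3)
]
match = pick_match(fraud, pool, 2017)
print(match.name)
```

When no same-exchange candidate exists, the sort key falls back to the closest company that still satisfies criteria (1) to (3), which mirrors the stated priority of (2) and (3) over (4).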

3.2. Screening of Fraud Risk Identification Index. The most critical link in constructing a fraud risk identification model is the selection of fraud risk identification indicators. Indicators with a good identification effect make it possible to accurately predict and control corporate fraud in advance. However, if the indicators are not selected properly, the identification effect will not be ideal even if the fraud risk identification model itself is constructed very well. This article is based on the fraud triangle theory and selects indicators according to the following three criteria: (1) indicators that the reviewed literature found to be the most effective in judging fraud; (2) in view of the difficulty of collecting and processing index data, indicators that are too complicated to process or whose data are unavailable are eliminated; and (3) important fraud risk identification indicators that have appeared in classic fraud cases. In the literature review, this article sorted out the indicators used by a large number of domestic and foreign scholars in the identification of management fraud and screened them in accordance with the above three criteria. A total of 48 indicators in 11 subcategories with good discriminating effects were selected and divided into two groups: accounting and nonaccounting indicators. Accounting indicators include six subcategories: profitability, solvency, operating capability, development capability, per share indicators, and asset quality. Nonaccounting indicators include five subcategories: equity structure, corporate governance, special transactions and events, audit relationships, and operating pressures. This classification basically covers the high-frequency indicators for fraud identification.

Figure 3: Fraud identification process in accounting based on LVQ neural network.

4. Experimental Results of Combined Neural Network Model

4.1. Descriptive Statistics of LVQ Neural Network and Determination of Final Indicators. In order to verify the comprehensiveness and significance of the initially selected index system and to improve the recognition accuracy and efficiency of the fraud risk identification model, this paper conducts a paired-sample t test and a nonparametric Mann–Whitney test on all the initially determined indicators. The tests are carried out in SPSS 17.0. The qualitative indicators are coded as 1 and 0 and mainly include the following: X9 chairman change, where 1 means a change and 0 means no change; X10 two-time part-time, where 1 means the positions are held concurrently and 0 otherwise; X11 audit opinion type, where 1 means a standard audit opinion was issued and 0 means a nonstandard audit opinion was issued; X12 change of the accounting firm, where 1 means a change of the accounting firm and 0 means no change; and X12 avoid ST, that is, whether there were consecutive losses in the two years before the fraud, where 1 means consecutive losses and 0 means no consecutive losses.

In order to facilitate data processing and improve the fraud identification effect of the neural network model, this paper sets the type of the fraud companies to 1 and the type of the matching sample companies to 0. Based on the paired sample data, the 30 accounting indicators and 18 nonaccounting indicators in the initially constructed indicator system are tested for significance, and the variables that pass the significance test are selected as the final fraud risk identification indicators. The final fraud risk identification indicators and descriptive statistical results are shown in Table 1 and Figures 4 and 5.
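To make the two test statistics concrete, the sketch below computes a paired-sample t statistic and a Mann–Whitney U statistic with the Python standard library. It is not the SPSS procedure used in this paper and it does not compute P values; the function names and the toy indicator values for five fraud companies and their matches are hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t(fraud, matched):
    """Paired-sample t statistic: mean of the pairwise differences divided
    by its standard error."""
    d = [f - m for f, m in zip(fraud, matched)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

def mann_whitney_u(a, b):
    """U statistic of sample a: the number of pairs (a_i, b_j) with
    a_i > b_j, counting ties as one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in a for y in b)

# Hypothetical values of one indicator (for example, a cash flow ratio).
fraud = [0.10, 0.05, 0.08, 0.02, 0.07]
matched = [0.20, 0.15, 0.12, 0.11, 0.09]

t_stat = paired_t(fraud, matched)
u_fraud = mann_whitney_u(fraud, matched)
u_matched = mann_whitney_u(matched, fraud)
print(t_stat, u_fraud, u_matched)
```

A strongly negative t statistic and a small U for the fraud group both indicate that the fraud companies tend to have lower values of the indicator than their matches; whether the difference is significant still requires the corresponding reference distributions.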

The original data underwent the Mann–Whitney rank test and the t test. The results show that the X1 indicator, X5 earnings per share before interest and tax, X7 two-time concurrent job, X6 management shareholding ratio, X10 audit opinion type, and avoid ST are significant at the 1% level. X3 cash flow ratio and X4 total profit growth rate are significant at the 5% level. X5 inventory turnover rate, X6 board of supervisors' shareholding ratio, X7 state-owned shares ratio, and X8 other receivables/total assets are significant at the 10% level. Table 2 and Figure 6 show the descriptive statistics.

4.2. Fraud Recognition Effect Test of Combined Neural Network Model. The fraud recognition effect test of the combined neural network model is again carried out with the neural network toolbox that comes with MATLAB. The 506 pairs of research samples collected from 2015 to 2019 are divided into two parts: 326 pairs of training samples and 180 pairs of test samples [18]. Because the recognition rate on the training samples reflects only the learning effect of the neural network model and cannot explain the fraud recognition effect of the model, the recognition accuracy on the test samples is used for comparative analysis. The specific training and test results are shown in Table 3 and Figure 7.

Among the 150 fraud samples, the combined model identified 136 fraudulent companies and misjudged 14 as nonfraud companies, and the identification accuracy for fraudulent companies was 88.54%. Among the 150 matching companies, the LVQ model identified 132 nonfraud companies and misjudged 18 as fraudulent companies, and the recognition rate for matching companies was 91.54%. In terms of overall recognition results, the overall fraud recognition rate of the combined neural network model based on LVQ is 90.48%, and the recognition effect is significantly better than that of any single neural network model (the overall recognition rate of the neural network model based on LVQ is 86.52%, and the overall recognition rate of the neural network model is 93%). Because the training samples and test samples used for the three neural network models are the same, the fraud recognition rates of the three models are comparable.
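The three rates discussed above are defined from the counts of correctly and incorrectly classified companies. The sketch below shows the arithmetic with the counts printed in the text; the function name `recognition_rates` is hypothetical, and percentages recomputed this way from the printed counts need not coincide with the rates reported in the text and in Table 3.

```python
def recognition_rates(fraud_hit, fraud_miss, match_hit, match_miss):
    """Return (fraud rate, matching rate, overall rate) from confusion counts."""
    fraud_rate = fraud_hit / (fraud_hit + fraud_miss)
    match_rate = match_hit / (match_hit + match_miss)
    total = fraud_hit + fraud_miss + match_hit + match_miss
    overall = (fraud_hit + match_hit) / total
    return fraud_rate, match_rate, overall

# Counts as printed for the test samples: 136 of 150 fraud companies and
# 132 of 150 matching companies classified correctly.
fraud_rate, match_rate, overall = recognition_rates(136, 14, 132, 18)
print(f"{fraud_rate:.2%} {match_rate:.2%} {overall:.2%}")
```

With equally sized groups, the overall rate is the simple average of the two group rates.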

4.3. Robustness Test of Combined Neural Network Model. In order to test whether the recognition effect of the LVQ-based combined neural network fraud risk identification model is stable, this paper selects 100 companies that committed fraud in 2019 and 100 matching companies paired with them one-to-one as research samples to test the stability of the combined model's fraud identification. The specific robustness test results are shown in Figure 8 and Table 4.

Among the 100 fraudulent sample companies, the combined model identified 85 fraudulent companies and misjudged 15 companies, and the fraudulent company identification rate was 85.17%. Among the 100 matching sample companies, the combined model identified 89 matching companies and misjudged 11 companies, and the matching company recognition rate was 90.74%. The overall fraud discrimination rate of the combined model is 84.59%, which is slightly lower than the previous overall fraud recognition rate of 90.54%, but the fluctuation is not large, and it is still higher than the fraud recognition rate of a single neural network model. This indicates that the fraud recognition effect of the combined neural network model is indeed better than that of a single model and that its fraud identification is stable, so it can be used as a discriminant model for corporate fraud.

Table 1: Final fraud risk identification index.

| Index selection significance | Index name | Mean value (fraud) | Mean value (pairing) | Mann–Whitney rank test Z value | Mann–Whitney rank test P value | T test t value | T test P value |
|---|---|---|---|---|---|---|---|
| Profitability | X1 indicator | 355 | 186 | −5.121 | ≤0.001 | −3.931 | 0.0001 |
| Operating capacity | X2 inventory turnover rate | 2973 | 96523 | −1.181 | 0.249 | −1.731 | 0.0837 |
| Solvency | X3 cash flow ratio | 0.098 | 0.201 | 4.531 | ≤0.001 | −2.109 | 0.0357 |
| Development ability | X4 total profit | 0.013 | 0.114 | −4.274 | ≤0.001 | −2.798 | 0.0063 |
| Per share index | X5 earnings per share before interest and tax | 0.340 | 0.442 | −3.134 | 0.002 | −2.599 | 0.0096 |
| Corporate governance | X6 shareholding ratio of the board of supervisors | 2121 | 0.011 | −0.242 | 0.002 | −2.574 | 0.056 |
| | X7 two-time part-time | 0.645 | 0.125 | −2.542 | 0.068 | −1.541 | 0.005 |
| | X8 management shareholding ratio | 0.137 | 0.091 | 2.846 | 0.004 | 3.026 | 0.002 |
| | X9 proportion of state-owned shares | 0.051 | 0.037 | −2.249 | 0.248 | 1.679 | 0.099 |
| Ownership structure | X10 audit opinion type | 0.054 | 0.056 | −5.121 | 0.457 | 1.548 | 0.005 |
| | X11 change of accounting firm | 0.125 | 0.005 | −2.125 | 0.054 | 1.513 | 0.008 |
| Auditor relations | X12 other receivables/total assets | 0.514 | 0.012 | −2.542 | 0.045 | 1.158 | 0.002 |
| Behavioral characteristics | X13 avoid ST | 0.154 | 0.005 | −1.242 | 0.002 | 1.187 | 0.015 |

Figure 4: Final fraud risk identification Mann–Whitney rank test result (Z value for indicators X1 to X13).

Figure 5: Final fraud risk identification T test result (T value for indicators X1 to X13).


Table 2: Descriptive statistics.

| Index name | Mean value | Standard deviation | Analysis N |
|---|---|---|---|
| X1 index | 0.005 | 0.055 | 550 |
| X2 inventory turnover rate | 0.293 | 0.142 | 550 |
| X3 cash flow ratio | 0.455 | 0.006 | 550 |
| X4 total cash debt ratio | 0.365 | 0.275 | 550 |
| X5 total profit growth rate | 0.155 | 0.133 | 550 |
| X6 earnings per share before interest and tax | 0.545 | 0.124 | 550 |
| X7 shareholding ratio of the board of supervisors | 0.159 | 0.152 | 550 |
| X8 management shareholding ratio | 0.086 | 0.055 | 550 |
| X9 proportion of state-owned shares | 0.061 | 0.175 | 550 |
| X10 other receivables/total assets | 0.078 | 0.165 | 550 |

Figure 6: Descriptive statistics results (mean value and standard deviation for fraud risk indices X1 to X10).

Table 3: Discrimination results of neural network model training and test samples based on LVQ. Training samples: 320 pairs in total; test samples: 150 pairs in total.

| Company type | Training: fraud company | Training: matching company | Training: classification accuracy | Test: fraud company | Test: matching company | Test: classification accuracy |
|---|---|---|---|---|---|---|
| Fraud company | 305 | 15 | 95.45% | 136 | 14 | 88.54% |
| Matching company | 18 | 312 | 35.98% | 18 | 132 | 91.54% |
| Overall correct rate | | | 93.54% | | | 90.48% |

Figure 7: Discrimination results of neural network model training and test samples based on LVQ (fraud and matching companies for the training samples, 320 pairs in total, and the test samples, 150 pairs in total).


5. Conclusion

The LVQ neural network combines the idea of competitive learning with supervised learning algorithms. During network learning, the category assigned to each input sample is specified through the teacher signal, which overcomes the lack of classification information caused by the use of unsupervised learning algorithms in self-organizing networks. The biggest advantage of the LVQ neural network is that it can not only classify linear input data but also process multidimensional data containing noise interference. Based on the analysis and comparison of the structure, advantages, and disadvantages of the LVQ neural network model, this paper further proposes a fraud risk identification model based on an LVQ combined neural network. The neural network model with the better recognition effect is used as the main preclassification model, and the LVQ neural network is used as the postclassification model. This combination not only effectively processes data containing noise but also makes up for the inability of traditional neural network technology to subdivide classes, which improves the fraud identification effect of the combined model. The same fraud sample was used to test the fraud recognition effect of the combined neural network model, and the overall fraud recognition rate was 90.51%. The research results show that the combined neural network model, whose components have complementary advantages and disadvantages, is better than a single neural network model in fraud identification. When fraud sample data from 2019 were selected to test the robustness of the combined neural network model, the overall fraud recognition rate was 90.54%, which is not much different from the previous overall fraud recognition rate of 90.74%. The recognition effect of the combined model is relatively stable, and it can be used as one of the optional models for company fraud risk identification in the future. The research results of this article broaden the thinking for constructing fraud risk identification models: construction is no longer limited to a single fraud identification model, and models with good identification effects or complementary advantages and disadvantages can be combined to create a new fraud risk identification model. With the rapid development and continuous progress of artificial intelligence technology, it is expected that an intelligent fraud risk identification model can be constructed in the future. According to the different characteristics of each company, such a model would automatically select the appropriate fraud index system and construct the optimal neural network model for fraud identification, without being limited to specific types of neural network technology.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Table 4: Robustness test.

Test samples: 150 pairs in total.

| Company type | Fraud company | Matching company | Classification accuracy (%) |
|---|---|---|---|
| Fraud company | 85 | 15 | 85.17 |
| Matching company | 11 | 89 | 90.74 |
| Overall correct rate | | | 84.59 |

Figure 8: Robustness test results. (Chart of the robustness test on the test samples (150 pairs in total); series: fraud company and matching company, each shown with its classification accuracy.)


Conflicts of Interest

The authors declare that they have no known conflicts of interest or personal relationships that could have appeared to influence the work reported in this study.

Acknowledgments

This work was supported by Anhui University of Finance and Economics.

References

[1] I. G. A. Widagda, N. P. Sastra, D. M. Wiharta, and R. S. Hartati, "Invariant moment and learning vector quantization (LVQ NN) for images classification," Journal of Physics: Conference Series, vol. 1722, no. 1, Article ID 012003, 2021.

[2] U. Orhan and E. Arslan, "Learning word-vector quantization," ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 19, no. 5, pp. 1–18, 2020.

[3] H. Hormozzadefighalati, A. Abbasi, and A. Sadeghi-Niaraki, "Optimal multi-product supplier selection under stochastic demand with service level and budget constraints using learning vector quantization neural network," RAIRO Operation Research, vol. 53, no. 5, pp. 1709–1720, 2019.

[4] Y. Wang, J. K. Ashton, and A. Jaafar, "Does mutual fund investment influence accounting fraud?" Emerging Markets Review, vol. 38, pp. 142–158, 2019.

[5] M. Siamidoudaran, E. Iscioglu, and M. Siamidodaran, "Traffic injury severity prediction along with identification of contributory factors using learning vector quantization: a case study of the city of London," SN Applied Sciences, vol. 1, no. 10, p. 1268, 2019.

[6] T. Mahrina, S. M. Hardi, J. T. Tarigan, I. Jaya, M. Ramli, and fnm Tulus, "Comparative analysis of backpropagation with learning vector quantization (LVQ) to predict rainfall in Medan city," Journal of Physics: Conference Series, vol. 1235, Article ID 012083, 2019.

[7] J. Brinkrolf, C. Gopfert, and B. Hammer, "Differential privacy for learning vector quantization," Neurocomputing, vol. 342, pp. 125–136, 2019.

[8] Y. Bao, B. Ke, B. Li, et al., "Detecting accounting fraud in publicly traded US firms using a machine learning approach," Journal of Accounting Research, vol. 58, 2020.

[9] P. Arnita, M. S. Sinaga, and S. Elmanani, "Classification and diagnosis of diabetic with neural network algorithm learning vector quantizatin (LVQ)," Journal of Physics: Conference Series, vol. 1188, Article ID 012091, 2019.

[10] K. C. Nguyen, C. T. Nguyen, and M. Nakagawa, "Nom document digitalization by deep convolution neural networks," Pattern Recognition Letters, vol. 133, pp. 8–16, 2020.

[11] P. Karmani, A. A. Chandio, V. Karmani, J. A. Soomro, I. A. Korejo, and M. S. Chandio, "Taxonomy on healthcare system based on machine learning approaches: tuberculosis disease diagnosis," International Journal of Computing and Digital Systems, vol. 09, no. 6, pp. 1199–1212, 2020.

[12] D. F. Nino and J. N. Milstein, "Three-dimensional fast optimized clustering algorithm (FOCAL3D) for single-molecule localization microscopy," Biophysical Journal, vol. 118, no. 3, p. 458a, 2020.

[13] Y. Jing, "Research on fuzzy English automatic recognition and human-computer interaction based on machine learning," Journal of Intelligent & Fuzzy Systems, vol. 39, no. 4, pp. 5809–5819, 2020.

[14] O. I. Al-Sanjary, M. Roslan, R. Helmi, et al., "Comparison and detection analysis of network traffic datasets using K-means clustering algorithm," Journal of Information & Knowledge Management, vol. 19, no. 3, Article ID 2050026, 2020.

[15] C.-H. Yang and K.-C. Lee, "Developing a strategy map for forensic accounting with fraud risk management: an integrated balanced scorecard-based decision model," Evaluation and Program Planning, vol. 80, Article ID 101780, 2020.

[16] J. Zhang and J. Chen, "An adaptive clustering algorithm for dynamic heterogeneous wireless sensor networks," Wireless Networks, vol. 25, no. 1, pp. 455–470, 2019.

[17] R. Agarwal, "A multiple perspective view to rampant fraudulent culture in the Indian insurance industry," International Journal of Indian Culture and Business Management, vol. 16, no. 4, pp. 416–437, 2018.

[18] N. Nidheesh, K. Nazeer, and P. M. Ameer, "A hierarchical clustering algorithm based on silhouette index for cancer subtype discovery from genomic data," Neural Computing and Applications, vol. 32, no. 11, pp. 1–18, 2020.

[19] I. Garnefeld, A. Eggert, M. Husemann-Kopetzky, and E. Bohm, "Exploring the link between payment schemes and customer fraud: a mental accounting perspective," Journal of the Academy of Marketing Science, vol. 47, no. 4, pp. 595–616, 2019.

[20] M. Najafzadeh and G. Oliveto, "Exploring 3D wave-induced scouring patterns around subsea pipelines with artificial intelligence techniques," Applied Sciences, vol. 11, no. 9, p. 3792, 2021.

[21] T. Som, M. Dwivedi, C. Dubey, and A. Sharma, "Parametric studies on artificial intelligence techniques for battery SOC management and optimization of renewable power," Procedia Computer Science, vol. 167, pp. 353–362, 2020.

[22] R. Flasher and M. A. Lamboy-Ruiz, "Impact of enforcement on healthcare billing fraud: evidence from the USA," Journal of Business Ethics, vol. 157, no. 1, pp. 217–229, 2019.

[23] M. Albashrawi and M. Lowell, "Detecting financial fraud using data mining techniques: a decade review from 2004 to 2015," Journal of Data Science, vol. 14, no. 3, pp. 553–569, 2016.


of accounting fraud, the definition of accounting fraud in this article is as follows: in order to seek illegitimate benefits, the management of a company deliberately uses forgery, tampering, cover-up, and other deceptive methods to distort accounting information. The types of accounting fraud mainly include illegal purchase of stocks, false statements, violations of capital contributions, violations of regulations, hype, and so on.

Environment refers to all the external things that surround a certain central thing and have some influence on it. Any enterprise or social organization exists in an environment. The corporate environment is a collection of interrelated, mutually restrictive, and constantly changing factors that affect corporate management decisions and production and operation activities. The internal environment of an enterprise is the sum of the material and cultural environment within the enterprise. The external environment of an enterprise is the sum of the political, economic, social, and technological environments outside the enterprise. Following these definitions, this article takes the environment of accounting fraud to be a collection of interrelated, mutually restrictive, and constantly changing factors that affect the existence of accounting fraud, including internal and external environmental characteristics. Because accounting fraud is carried out with the company as its carrier, the environmental characteristics of accounting fraud refer to the environment inside and outside the company that affects accounting fraud. The following paragraphs introduce the internal and external environmental characteristics of accounting fraud.

The internal environmental characteristics of accounting fraud are the collection of factors that exist within the enterprise and affect accounting fraud [6]. Specifically, they include the corporate governance structure, institutional settings, the implementation of the system, and so on. The internal environment of accounting fraud affects the size of the opportunity for company managers to commit fraud, the probability of fraud being discovered, and the nature and degree of punishment of fraudsters after fraud is discovered. A sound internal mechanism greatly reduces the probability of accounting fraud, whereas a defective internal mechanism with loopholes increases the possibility of accounting fraud. The research of Brinkrolf et al. [7] covers two internal environmental characteristics of accounting fraud: the company's internal governance structure and the company's operating performance. The internal environmental characteristics of accounting fraud studied in this article include board structure, director encouragement, leadership structure, ownership structure, and executive incentives.

The external environmental characteristics of accounting fraud are the collection of factors that exist outside the enterprise and affect accounting fraud. Specifically, they include external corporate governance, the market environment, the political environment, and the economic environment. The external environment of accounting fraud affects the chances of fraud by company managers, the probability of fraud being discovered, and the nature and degree of punishment of fraudsters after fraud is discovered. The supervision and governance of fraudsters by the external environment is a beneficial supplement to the internal environment. The market system, a favorable political and economic environment, and an effective supervision system can reduce the possibility of accounting fraud. Conversely, if the market, political, and economic environments cannot effectively supervise accounting fraud, the possibility of accounting fraud will increase. The external environment of accounting fraud studied in this article includes the quality of external auditing, the degree of media development, the degree of institutional and legal constraints, and the degree of product competition.

With the continuous development and wide application of artificial intelligence technology, artificial neural network technology has begun to be used in the field of fraud identification. Among these techniques, the LVQ neural network is the most widely used in fraud identification, and its fraud identification rate is relatively high. In this context, this paper explores this neural network technology in depth, uses the same fraud sample to test the fraud recognition rate of these two models, and proposes an optimized LVQ-based combined neural network fraud risk recognition model on this basis. This paper selects listed companies that committed fraud from 2015 to 2019 as the fraud samples and uses the fraud risk identification indicators with better identification results, as gathered from the literature, as the initial indicator system. After the collinearity problem was eliminated through the paired-sample T test and principal component analysis, the five indicators with the best recognition effect were finally selected. Finally, based on the above theoretical analysis and empirical research, the paper summarizes the full text, analyzes the deficiencies of this research, and puts forward a prospect for the future development of fraud risk identification models.
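The paired-sample T test used for indicator screening compares each indicator between a fraud company and its matched non-fraud company. The following is a minimal sketch of that statistic in Python, not the authors' implementation: the function name `paired_t_statistic` and the indicator values are hypothetical, and the sketch assumes the standard formula t = mean(d) / (s_d / sqrt(n)) over the pairwise differences d.

```python
import math
import statistics


def paired_t_statistic(fraud_values, matched_values):
    """Return (t, degrees_of_freedom) for one indicator over matched pairs."""
    if len(fraud_values) != len(matched_values):
        raise ValueError("each fraud company needs exactly one matched company")
    # Difference of the indicator within each fraud/matched pair.
    diffs = [f - m for f, m in zip(fraud_values, matched_values)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("the differences have zero variance")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical values of one indicator for five matched pairs (illustration only).
fraud = [0.62, 0.71, 0.55, 0.80, 0.67]
matched = [0.48, 0.60, 0.50, 0.66, 0.59]
t_value, dof = paired_t_statistic(fraud, matched)
print(f"t = {t_value:.3f}, degrees of freedom = {dof}")
```

An indicator whose absolute t value exceeds the critical value for the chosen significance level differs significantly between the two groups and is a candidate for the indicator system; the threshold itself is a choice the sketch leaves open.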

2. LVQ Neural Network

2.1. Concept and Structure of LVQ Neural Network Model. The LVQ neural network is proposed on the basis of the competitive network structure and belongs to the class of feedforward neural networks. The LVQ neural network combines the idea of competitive learning with supervised learning algorithms [8]. During network learning, the category assigned to each input sample is specified through a teacher signal, which overcomes a weakness of self-organizing networks: the lack of classification information caused by unsupervised learning algorithms. The biggest advantage of the LVQ neural network is that it can not only classify linear input data but also process multidimensional data containing noise interference. The LVQ neural network has a wide range of applications in optimization and pattern recognition, and it is also one of the typical classification models. Similar to the BP neural network, the LVQ neural network has three network layers, which are divided into the input layer, the competitive layer, and the output layer.










$$X_i^1 = -\lVert x_i^1 - p \rVert \tag{1}$$

$$X_i^1 = -\begin{pmatrix} \lVert x_1^1 - p \rVert \\ \lVert x_2^1 - p \rVert \\ \vdots \\ \lVert x_i^1 - p \rVert \end{pmatrix} \tag{2}$$

$$d_i = -\lVert X - W_i \rVert, \qquad \lVert X - W_i \rVert = \lim_{n \to \infty} \sqrt{\sum_{j=1}^{n} \left[ X_j - W_{ij} \right]^2} \tag{4}$$

$$\delta(x) = \delta(0)\,\frac{x_j - x}{x_j} \tag{5}$$
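The distance computations in Equations (1), (2), and (4) can be illustrated with a small LVQ sketch in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses the standard LVQ1 update rule (move the winning prototype toward a sample of the same class and away from a sample of a different class), the function names and the toy two-dimensional data are hypothetical, and the linear learning-rate decay matches the form of Equation (5) only if x is read as the current iteration and x_j as the total number of iterations, which is an assumption.

```python
import math
import random


def net_input(prototypes, p):
    """Competitive-layer net input: negative distance from p to each prototype."""
    return [-math.dist(w, p) for w in prototypes]


def winner(prototypes, p):
    """Index of the closest prototype (largest negative distance)."""
    scores = net_input(prototypes, p)
    return max(range(len(scores)), key=scores.__getitem__)


def train_lvq1(samples, labels, prototypes, proto_labels,
               rate0=0.1, epochs=50, seed=0):
    """LVQ1: attract the winner to same-class samples, repel it otherwise."""
    rng = random.Random(seed)
    protos = [list(w) for w in prototypes]
    order = list(range(len(samples)))
    for epoch in range(epochs):
        # Linear decay of the learning rate (assumed schedule).
        rate = rate0 * (epochs - epoch) / epochs
        rng.shuffle(order)
        for k in order:
            x, y = samples[k], labels[k]
            i = winner(protos, x)
            sign = 1.0 if proto_labels[i] == y else -1.0
            protos[i] = [w + sign * rate * (xj - w)
                         for w, xj in zip(protos[i], x)]
    return protos


def predict(protos, proto_labels, x):
    """Assign x the class of its winning prototype."""
    return proto_labels[winner(protos, x)]


# Hypothetical two-indicator toy data (illustration only).
samples = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.0),
           (0.9, 1.0), (1.0, 0.8), (1.1, 1.0)]
labels = ["matching"] * 3 + ["fraud"] * 3
proto_labels = ["matching", "fraud"]
trained = train_lvq1(samples, labels, [(0.4, 0.4), (0.6, 0.6)], proto_labels)
print(predict(trained, proto_labels, (1.0, 0.9)))
```

Here each prototype plays the role of a competitive-layer neuron, and `proto_labels` stands in for the fixed mapping from competitive neurons to output classes.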





[Figure: LVQ network structure, showing the input layer (x1, x2, ..., xn), the competitive layer with X_i^1 = -||x_i^1 - p||, and the output layer.]

[Figure: training diagram with the elements training samples, shape-based features, perceptual linearity, training, LVQ model, LVQ classifier, SVM, SVM classifier, all the samples, the same class, and fuzzy system.]

[Figure: flowchart with the elements Start, Initialization, sample mode input, neural network, test mode input, and End, with Yes/No decision branches.]

[Figure: Mann-Whitney rank test results (Z value, axis from -8 to 6) and T test results (T value, axis from 0 to 5) for indicators X1 to X13.]

[Figure: descriptive statistics results (axis from 0 to 0.8).]





