k-line patterns’ predictive power analysis using the methods of...

12
Research Article K-Line Patterns’ Predictive Power Analysis Using the Methods of Similarity Match and Clustering Lv Tao, 1 Yongtao Hao, 1 Hao Yijie, 2 and Shen Chunfeng 3 1 College of Electronics and Information Engineering, Tongji University, Shanghai 200092, China 2 Rabun Gap-Nacoochee School, Rabun Gap, GA 30568, USA 3 Shanghai Baosight Soſtware Co., Ltd., Shanghai 200092, China Correspondence should be addressed to Lv Tao; [email protected] Received 23 December 2016; Revised 31 March 2017; Accepted 5 April 2017; Published 22 May 2017 Academic Editor: Anna M. Gil-Lafuente Copyright © 2017 Lv Tao et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Stock price prediction based on K-line patterns is the essence of candlestick technical analysis. However, there are some disputes on whether the K-line patterns have predictive power in academia. To help resolve the debate, this paper uses the data mining methods of pattern recognition, pattern clustering, and pattern knowledge mining to research the predictive power of K-line patterns. e similarity match model and nearest neighbor-clustering algorithm are proposed for solving the problem of similarity match and clustering of K-line series, respectively. e experiment includes testing the predictive power of the ree Inside Up pattern and ree Inside Down pattern with the testing dataset of the K-line series data of Shanghai 180 index component stocks over the latest 10 years. Experimental results show that (1) the predictive power of a pattern varies a great deal for different shapes and (2) each of the existing K-line patterns requires further classification based on the shape feature for improving the prediction performance. 1. Introduction A time series is a series of observations listed in time order. It is the most commonly encountered data type, touching almost every aspect of human life [1], for example, the meteorological time series, the time series of stock prices (stock time series for short) which are composed of stock price observations, and the time series of personal health that are consisted of the observation of blood pressure, temperature, white corpuscle, and so forth. Researches show that the time series have two import features. (a) e historical information will affect the future trend [2]. at is, the historical values of observations will exert an influence on the future values in the time series. e influence can be described by time series’ period, nonstation- arity, varying volatility, and so on. (b) History repeats itself [3]. at is to say, some special time subseries will repeat in the entire time series. Because of the two features, all kinds of time series forecasting have become a present hot research, one of which is the prediction of stock time series, stock prediction for short. As a typical time series, not only have stock time series the features of time series, but also the trend of stock prices is directly related to the people’s vital interests. erefore, stock prediction has aroused the interest of a wide variety of researchers. ere are many technical analysis methods about stock prediction, the best known of which is candlestick technical analysis that is also called K-line technology analysis in Asia. In the stock market, in order to learn and study the fluctuation of stock prices in a more intuitive way, people invent a candlestick chart (also called K-line) to represent stock time series graphically. Taking a daily K-line, for example, a K- line represents the fluctuation of stock prices in one day, it not only shows the close price, open price, high price, and low price for the day but also reflects the difference and size between any two prices (all K-lines given in the paper refer to daily K-line, unless otherwise indicated). If the K-line of a stock lists in time order, then a series used to reflect the fluctuation of the stock price for some time can be formed, which can be called K-line series. As each K-line consists of four prices, the essence of K-line series is stock series with four observations. Hindawi Mathematical Problems in Engineering Volume 2017, Article ID 3096917, 11 pages https://doi.org/10.1155/2017/3096917

Upload: others

Post on 10-Oct-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: K-Line Patterns’ Predictive Power Analysis Using the Methods of …downloads.hindawi.com/journals/mpe/2017/3096917.pdf · 2019. 7. 30. · K-Line Patterns’ Predictive Power Analysis

Research ArticleK-Line Patternsrsquo Predictive Power Analysis Usingthe Methods of Similarity Match and Clustering

Lv Tao1 Yongtao Hao1 Hao Yijie2 and Shen Chunfeng3

1College of Electronics and Information Engineering Tongji University Shanghai 200092 China2Rabun Gap-Nacoochee School Rabun Gap GA 30568 USA3Shanghai Baosight Software Co Ltd Shanghai 200092 China

Correspondence should be addressed to Lv Tao superlvtao163com

Received 23 December 2016 Revised 31 March 2017 Accepted 5 April 2017 Published 22 May 2017

Academic Editor Anna M Gil-Lafuente

Copyright copy 2017 Lv Tao et al This is an open access article distributed under the Creative Commons Attribution License whichpermits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Stock price prediction based on K-line patterns is the essence of candlestick technical analysis However there are some disputes onwhether the K-line patterns have predictive power in academia To help resolve the debate this paper uses the dataminingmethodsof pattern recognition pattern clustering and pattern knowledge mining to research the predictive power of K-line patterns Thesimilarity match model and nearest neighbor-clustering algorithm are proposed for solving the problem of similarity match andclustering of K-line series respectively The experiment includes testing the predictive power of the Three Inside Up pattern andThree Inside Down pattern with the testing dataset of the K-line series data of Shanghai 180 index component stocks over the latest10 years Experimental results show that (1) the predictive power of a pattern varies a great deal for different shapes and (2) each ofthe existing K-line patterns requires further classification based on the shape feature for improving the prediction performance

1 Introduction

A time series is a series of observations listed in time orderIt is the most commonly encountered data type touchingalmost every aspect of human life [1] for example themeteorological time series the time series of stock prices(stock time series for short) which are composed of stockprice observations and the time series of personal healththat are consisted of the observation of blood pressuretemperature white corpuscle and so forth

Researches show that the time series have two importfeatures (a) The historical information will affect the futuretrend [2] That is the historical values of observations willexert an influence on the future values in the time series Theinfluence can be described by time seriesrsquo period nonstation-arity varying volatility and so on (b) History repeats itself[3] That is to say some special time subseries will repeat inthe entire time series Because of the two features all kinds oftime series forecasting have become a present hot researchone of which is the prediction of stock time series stockprediction for short As a typical time series not only have

stock time series the features of time series but also the trendof stock prices is directly related to the peoplersquos vital interestsTherefore stock prediction has aroused the interest of a widevariety of researchers

There are many technical analysis methods about stockprediction the best known of which is candlestick technicalanalysis that is also called K-line technology analysis in AsiaIn the stockmarket in order to learn and study the fluctuationof stock prices in a more intuitive way people invent acandlestick chart (also called K-line) to represent stock timeseries graphically Taking a daily K-line for example a K-line represents the fluctuation of stock prices in one day itnot only shows the close price open price high price andlow price for the day but also reflects the difference and sizebetween any two prices (all K-lines given in the paper referto daily K-line unless otherwise indicated) If the K-line ofa stock lists in time order then a series used to reflect thefluctuation of the stock price for some time can be formedwhich can be called K-line series As each K-line consists offour prices the essence of K-line series is stock series withfour observations

HindawiMathematical Problems in EngineeringVolume 2017 Article ID 3096917 11 pageshttpsdoiorg10115520173096917

2 Mathematical Problems in Engineering

(a) Shape A (b) Shape B (c) Shape C

Figure 1 Three kinds of shapes of TIU pattern

In K-line series if a K-line subseries contains someknowledge used to predict stock then this subseries is called aK-line pattern series a K-line pattern for short For instancewhen a subseries appears the stock price will often rise ordescend Then this subseries is a typical pattern series Stockprediction based on K-line patterns is the essence of K-linetechnology analysis How to mine the K-line patterns andhow to make use of these patterns for predicting are mainresearch contents of K-line technology analysis

By the artificial methods of observing the K-line series ofstock market (or Japanese rice market) people (the leadingcharacter is the founder of K-line Munehisa Honma whowas a Japanese rice trader in the 18th century) have foundmany K-line patterns The literatures [4 5] introduce theexisting patterns and their features in detail such as ThreeInside Up (TIU) Three Inside Down (TID) and DojiSome papers [6ndash10] conclude from the experiment that theexisting K-line patterns have a good forecasting capabilityfor forecasting stock trends Some other papers [11ndash15] havestudied the stock prediction based on these patterns andhave achieved some research results However there are alsoa number of papers [5 16ndash18] challenging these patternsrsquopredictive power They argue that K-line technology analysisviolates the efficient market hypothesis so it is not feasiblefor stock investment based on K-line patterns They alsodid some experiments which show that the existing K-linepatterns have no predictive power

Based on the above analysis it is obvious that there aresome disputes on whether the K-line patterns have predictivepower in academia However there are few papers analyzingthe reason why there are two different positions regardingthe patternsrsquo predictive power Paper [19] also pays attentionto the debate while it does not analyze the K-line patternsthemselves but attempts to obtain an answer to the followingquestion are the trend reversals accompanied more often bysome types of candlesticks than by others Finally paper [19]

has found that there exist types of candlesticks that frequentlytend to appear close to the trend-reversal regions and othersthat cannot be found in such regions Although the paperrsquosresearch shows that the K-line patterns exist it does not givethe answer that why there is a debate on the K-line patternsrsquopredictive power

Through reviewing the relevant literatures this paperconsiders that the main reason is that the existing K-linepatterns are lack of rigorous mathematical definition Forexample the shadow length and body size are not definedclearly in the definition of K-line patterns which meansthat a K-line pattern has many different shapes Because thepredictive power of a pattern may vary a lot for differentshapes If we ignore the shape difference and research thepredictive power of a pattern by taking all patterns withvarious shapes as a whole instead of classifying the patternfurther based on its shape feature then the study result ofK-line patternsrsquo predictive power may produce deviationsFor instance a TIU pattern has three shapes shape Ashape B and shape C as shown in Figure 1 where shapeA is the generic form of TIU pattern and shape B and Care infrequent form of which Suppose that shape A haspredictive power and shape B and C do not have predictivepower When studying the predictive power of TIU patternif we ignore the shape difference between the three patternsand research them as a whole then we will come to thewrong conclusion that TIU pattern has no predictive powerHowever if the three patterns are classified further based onshape features and researched separately then we can get thecorrect conclusion that TIUpattern has predictive power onlyat shape A

In addition another reason is that as the existing K-linepatterns are mined by artificial means there may be somespurious pattern in them

In order to resolve the debate and verify the two infer-ences this paper presents the research of K-line patternsrsquo

Mathematical Problems in Engineering 3

High

Close

Body

Open

Low

Upper shadow

Lower shadow

(a) Yang line

High

Open

Body

Close

Low

Upper shadow

Lower shadow

(b) Yin line

High

Low

Upper shadow

Lower shadow

Open = close

(c) Doji line

Figure 2 K-line chart

predictive power using the data mining related method suchas pattern recognition pattern clustering pattern knowledgemining and statistical analysis The rest of this paper isorganized as follows In Section 2 we firstly shortly introduceK-line K-line technology analysis and K-line patterns Thenwe define the similarity match model and nearest neighbor-clustering algorithm of K-line series In Section 3 we definethe mining method of patternsrsquo predictive power Section 4presents the experimental result and discussion Section 5concludes the paper

2 K-Line and K-Line Series Clustering

Firstly we give the mathematic definition of K-line series LetKS119894 represent the 119894-th K-line series of any stock and let 119863119894119905represent t-th K-line in KS119894 then

KS119894 =[[[[[[[

1198621198941 1198621198942 sdot sdot sdot 119862|KS119894|1198741198941 1198741198942 sdot sdot sdot 119874119894|KS119894|1198671198941 1198671198942 sdot sdot sdot 119867119894|KS119894|1198711198941 1198711198942 sdot sdot sdot 119871119894|KS119894|

]]]]]]] (10038161003816100381610038161003816KS11989410038161003816100381610038161003816 isin 119873+)

119863119894119905 = [119862119894119905 119874119894119905 119867119894119905 119871119894119905]119879 (1 le 119905 le 10038161003816100381610038161003816KS11989410038161003816100381610038161003816)

(1)

where |KS119894| is the number of elements in KS119894 which is alsocalled the length of KS119894 119862119894119905 119874119894119905 119867119894119905 and 119871119894119905 are the t-th dayrsquosclose price open price high price and low price in KS119894respectively In this paper ldquo| |rdquo symbol indicates the numberof elements in the set or series

21 K-Line

211 K-Line Introduction As defined in literature [4ndash6] theK-line is drawnby four basic elements close price open pricehigh price and low price where the part between the closeprice and open price is drawn into a rectangle called body ofK-line and the part between the high price and body is drawn

into a line called upper shadow of K-line Moreover the partbetween the lower price and body is drawn into a line calledlower shadow of K-line This kind of very personalized linesconsisting of upper shadow lower shadow and body is calledK-line

In the K-line if open price is lower than close price K-line also called Yang line the body is usually filled with whiteor green color as shown in Figure 2(a) And if open priceis higher than close price K-line also called Yin line thebody is usually filled with black or red color as shown inFigure 2(b) Moreover if open price is equal to close price K-line also called Doji line the body then collapses into a singlehorizontal line as shown in Figure 2(c) It is important tonote that the body color of Yin line and Yang line is differentin Chinese stock market and stock markets of European andAmerican In Chinese stock market the body color of Yangline and Yin line is red and green respectively Howeverthe body color of Yang line and Yin line is green and redrespectively in the stockmarkets of European and American

212 K-Line Technology Analysis Firstly we introduce anddefine some key concepts of K-line technology analysis Let119863119905 represent the t-th dayrsquos K-line of any stock

(1) Moving Average It is the average of stock price for sometime The three-day moving average at time119863119905 is defined by

119872avg (119863119905) = 13 119862119905minus2 + 119862119905minus1 + 119862119905 (2)

where 119862119905 denotes the close price of119863119905(2) K-Line Trend It is used to describe the K-linersquos trendincluding uptrend and downtrend 119863119905 is said to be a down-trend if

119872avg (119863119905minus6) gt 119872avg (119863119905minus5) gt sdot sdot sdot gt 119872avg (119863119905) (3)

with at most one violation of the inequalities Uptrend isdefined analogously

4 Mathematical Problems in Engineering

(a) TIU (b) TID

Figure 3 THE standard K-line series chart of TIU and TID

(3) Stock Price Trend It is used to describe the generaltrend of stock prices for some time including uptrend anddowntrend If the future trend of stock price is rising it iscalled bullish market In contrast if the future trend of stockprice is descending it is called bearish market Moreovera more intense rising or descending trend indicates a moretypical bullish or bearish market The capability of a K-linepatter for predicting the bullish market and bearish market isdefined in formulas (17) and (18) respectively

It is noted that the concepts of ldquomoving averagerdquo and ldquoK-line trendrdquo are defined by the paper [6] while the concept ofstock price trend is firstly defined by the paper

213 K-Line Patterns ManyK-line patterns have beenminedup to now as shown in literatures [4ndash6] Limited by spaceonly the patterns of TIU and TID will be introduced in thenext content Let KS = [119863119905 119863119905+1 119863119905+2] represent a three-dayK-line series

The conditions of KS becoming the TIU pattern are asfollows (1) 119863119905 is a downtrend and 119862119905 lt 119874119905 (2) 119862119905+1 gt 119874119905+1119874119905 ge 119862119905+1 gt 119862119905 and 119874119905 ge 119874119905+1 gt 119862119905 where at most oneof the two equalities holds That is the second day 119905 + 1 isYang line and must be contained with the body of the firstday (3) 119862119905+2 gt 119874119905+2 119862119905+2 gt 119874119905 That is the third day 119905 + 2 isYang line and closes above the open of the first dayA standardTIU pattern is shown in Figure 3(a)

The predictive power of TIU pattern from the existingliterature is that TIU is a trend-reversal pattern which givesthe bullish market signal This means when the TIU patternappears the stock prices will be likely to be transferredfrom downtrend into uptrend or the stock market would bechanged from bearishmarket to bullishmarket and the stockprices would rise gradually

The conditions of KS becoming the TID pattern are asfollows (1) 119863119905 is an uptrend and 119862119905 gt 119874119905 (2) 119862119905+1 lt 119874119905+1119862119905 ge 119862119905+1 gt 119874119905 and 119862119905 ge 119874119905+1 gt 119874119905 where at most one ofthe two equalities holds That is the second day 119905 + 1 is Yin

line and must be contained within the body of the first day(3) 119862119905+2 lt 119874119905+2 119862119905+2 lt 119874119905 That is the third day 119905 + 2 is Yinline and its close is lower than the first dayrsquos open A standardTID pattern is shown in Figure 3(b)

The predictive power of TID pattern from the existingliterature is that TID is a trend-reversal pattern which givesthe bearish market signal That means after the TID patternappears the stock prices will be likely to be transferredfrom uptrend into downtrend or the stock market would bechanged from bullishmarket to bearishmarket and the stockprices would fall gradually

22 Similarity Match of K-Line Series The similarity matchof K-line series is an essential and basis task for K-lineseries clustering In the literature however there are fewpapers focusing on the similarity match of K-line series Onlypaper [20] studies the similarity match method and searchalgorithmofK-line series using image retrieval technology Inaddition paper [19 21] proposes the similarity match modelof K-line series based on the traditional Euclidean distance

From the view of stock prediction the K-line seriesrsquosimilarity refers to the trend similarity of K-line in the K-line series However the K-line trend is determined by theclose price change open price change high price changelow price change and the size relationship between closeprice and open price Therefore if we want to match thesimilarity between two K-line series we should calculate thesimilarity of K-line price changes instead of the similarity ofprice values As the changes of K-line price are not shownin the K-line chart K-line prices distance rather than K-line price changes distance is used in the similarity matchmodel of literature [19ndash21] This means that these matchmodels belong to similarity match methods based on K-lineprice values rather than K-line price changesTherefore theycannot accuratelymeasure the similarity of stock prices trendin the K-line series

Mathematical Problems in Engineering 5

For example assuming that there are twoK-line series KS119894

and KS119895 needed to match their similarity where KS119894 = KS119895

and Sim119894119895 indicate their similarity Let 119862119894119905 = 10 119862119894119905+1 = 105119862119895119905 = 20 119862119895119905+1 = 21 and RC119905+1119905119894 indicate the close pricechange rate of KS119894 at day 119905 + 1 which is calculated by (119862119894119905+1 minus119862119894119905)119862119894119905 SRC119905+1119905119894119895 denotes the similarity between RC119905+1119905119894 andRC119905+1119905119894 then RC119905+1119905119894 = RC119905+1119905119895 = 5 and SRC119905+1119905119894119895 = 1 Wecannot calculate the correct result of SRC119905+1119905119894119895 = 1 by thesimilarity match model in literature [19ndash21] Similarly thesame problems would occur for calculating the similarity ofopen price high price or low price

Therefore this paper proposes a new similarity matchmodel based on K-line price changes to measure the trendsimilarity between two K-line series In this model thesimilarity of K-line series is composed of two parts one isthe shape similarity of K-line which is the similarity of thecorrespondingK-linersquos shape features in the twoK-line seriesthe other is the position similarity of K-line which is thesimilarity of the corresponding K-linersquos position features inthe two K-line series Therefore this paper will define K-lineseriesrsquo shape similarity model and position similarity modelrespectively Then based on these two kinds of similaritymodels the similarity model of the entire K-line series couldbe built

221 The Shape Similarity of K-Line Series According to theshape feature of K-line this paper proposes using the shape

distance to measure the shape similarity between two K-lines Firstly based on the shape structure of K-line threecomponents of K-line shape are extracted the upper shadowshape the lower shadow shape and the body shape Secondlythe similarity match methods of three shapes are definedrespectively Finally the shape similarity of K-line can becalculated by summing the three shapesrsquo similarity Assumingthat 119863119894119905 and 119863119895119905 denote the t-th dayrsquos K-line of KS119894 and KS119895respectively the shape similarity model of K-line series isdefined as follows(1) Let US119894[119905] denote the upper shadow length of 119863119894119905 asdefined in the following formula

US119894 [119905] =

119867119894119905 minus 119874119894119905119862119894119905minus1 lowast 01 119874119894119905 ge 119862119894119905

119867119894119905 minus 119862119894119905119862119894119905minus1 lowast 01 119874119894119905 lt 119862119894119905

(4)

where 119862119894119905minus1 lowast 01 is used to normalize the upper shadowlength According to the related regulation of Chinese A-share market the range of daily fluctuations of stock pricescannot exceed 10 of the previous dayrsquos close price So119862119894119905minus1 lowast01 can be used to normalize the length of the K-linersquos uppershadow lower shadow and body

Let Sim119894119895US(119905) denote the upper shadow similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895US (119905) =

min (US119894 [119905] US119895 [119905])max (US119894 [119905] US119895 [119905]) US119894 [119905] lowast US119895 [119905] gt 00 US119894 [119905] lowast US119895 [119905] = 0 US119894 [119905] = US119895 [119905] 1 US119894 [119905] = US119895 [119905] = 0

(5)

(2) Let LS119894[119905] denote the lower shadow length of 119863119894119905 asdefined in the following formula

LS119894 [119905] =

119862119894119905 minus 119871119894119905119862119894119905minus1 lowast 01 119874119894119905 ge 119862119894119905

119874119894119905 minus 119871119894119905119862119894119905minus1 lowast 01 119874119894119905 lt 119862119894119905

(6)

Let Sim119894119895LS(119905) denote the lower shadow similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895LS (119905) =

min (LS119894 [119905] LS119895 [119905])max (LS119894 [119905] LS119895 [119905]) LS119894 [119905] lowast LS119895 [119905] gt 00 LS119894 [119905] lowast LS119895 [119905] = 0 LS119894 [119905] = LS119895 [119905] 1 LS119894 [119905] = LS119895 [119905] = 0

(7)

6 Mathematical Problems in Engineering

(3) Let 119861119894[119905] denote the body length of 119863119894119905 as defined inthe following formula

119861119894 [119905] = 119862119894119905 minus 119874119894119905119862119894119905minus1 lowast 01 (8)

Let Sim119894119895Body(119905) denote the body similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895Body (119905) =

min (10038161003816100381610038161003816119861119894 [119905]10038161003816100381610038161003816 10038161003816100381610038161003816119861119895 [119905]10038161003816100381610038161003816)max (1003816100381610038161003816119861119894 [119905]1003816100381610038161003816 1003816100381610038161003816119861119895 [119905]1003816100381610038161003816) 119861

119894 [119905] lowast 119861119895 [119905] gt 00 119861119894 [119905] lowast 119861119895 [119905] lt 00 119861119894 [119905] lowast 119861119895 [119905] = 0 10038161003816100381610038161003816119861119894 [119905]10038161003816100381610038161003816 = 10038161003816100381610038161003816119861119895 [119905]10038161003816100381610038161003816 1 119861119894 [119905] = 119861119895 [119905] = 0

(9)

(4) Let Sim119894119895119878 (119905) denote the shape similarity between 119863119894119905and119863119895119905 as defined by

119908Body + 119908LS + 119908US = 1119908Body ge 0119908US ge 0119908LS ge 0

Sim119894119895119878 (119905) = 119908Body lowast Sim119894119895Body (119905) + 119908US

lowast Sim119894119895US (119905) + 119908LS

lowast Sim119894119895LS (119905)

(10)

where119908Body119908US and119908LS represent the weight of Sim119894119895

Body(119905)Sim119894119895US(119905) and Sim119894119895LS(119905) respectively(5) Let SSim119894119895 denote the shape similarity between KS119894

and KS119895 as defined by

SSim119894119895 = 119899sum119905=1

Sim119894119895119878 (119905) lowast 119908119905119878

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119878 = 1) (11)

where119908119905119878 represent the weight of Sim119894119895119878 (119905) Thanks to the ideathat each K-line can be given different weight the K-lineseries having special shape features could be identified well

222 The Position Similarity of K-Line Series For computingthe similarity between twoK-line series we not only considerthe shape similarity of K-line series but also the positionsimilarity If we only consider the shape similarity then itwill cause the problem that two K-line series having sameshape features but different position features will have thesame similarity

For example supposing that the K-line series chart of KS119894

and KS119895 is shown in Figure 2 we can see that according tothe shape feature definition of K-line all of the corresponding

K-lines of KS119894 and KS119895 have the same shape features Thesemean that KS119894 and KS119895 have identical shape features thatis SSim119894119895 = 1 However as is vividly shown in Figure 4the relative positions of 1198631198942 and 1198631198952 are different though 1198631198941and 1198631198951 have the same relative position in the K-line seriesTherefor the stock pricersquos overall trend of KS119894 and KS119895 are notidentical that is Sim119894119895 lt 1 If we only consider the shapesimilarity we will draw the wrong conclusion that SSim119894119895 =Sim119894119895 = 1

To solve this problem the concept of K-line coordinateis introduced hoping to implement the position match of K-line by definingK-linersquos coordinate in the K-line series In thispaper the sequence of K-line in the K-line series is called119909 coordinate of K-line the increase range of close price iscalled 119910 coordinate of K-line in addition the first K-linersquos119910 coordinate is set to 1 in the K-line series Therefore theposition similarity model of K-line series based on K-linecoordinate is defined as follows(1) Let (119909119894119905 119910119894119905) denote the coordinate of 119863119894119905 which aredefined in the following formula

119909119894119905 = 119905

119910119894119905 =

1 119905 = 1(119862119894119905 minus 119862119894119905minus1)119862119894119905minus1 lowast 01 119905 gt 1

(12)

Let Sim119894119895119875 (119905) denote the position similarity between 119863119894119905and119863119895119905 as defined by

Sim119894119895119875 (119905) =

min (119910119894119905 119910119895119905 )max (119910119894119905 119910119895119905 ) 119910

119894119905 lowast 119910119895119905 gt 0

0 119910119894119905 lowast 119910119895119905 = 0 119910119894119905 = 119910119895119905 1 119910119894119905 = 119910119895119905 = 0

(13)

Mathematical Problems in Engineering 7

110

100

95

90

102

105

110

100

(a) KS119894

110

100

95

9092

95

100

90(b) KS119895

Figure 4 K-line series chart

(2) Let PSim119894119895 denote the position similarity between KS119894

and KS119895 as defined by

PSim119894119895 = 119899sum119905=1

Sim119894119895119875 (119905) lowast 119908119905119875

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119875 = 1) (14)

where 119908119905119875 represents the weight of Sim119894119895119875 (119905) Thanks to theidea that each K-line can be given different weight the K-lineseries having special coordinates could be identified well

223 The Similarity of K-Line Series Finally based on theshape similarity and position similarity the similarity of K-line series could be obtained Therefore the similarity matchmodel between KS119894 and KS119895 is defined by

Sim119894119895 = SSim119894119895 lowast 119908119878 + PSim119894119895 lowast 119908119875 (15)

where 119908119878 and 119908119875 represent the shape similarity weight andposition similarity weight of K-line series respectively

23 Cluster of K-Line Series Themore accurate classificationresult of K-line patterns can be gotten by clustering themusing the nearest neighbor-clustering algorithm based onthe similarity match model of K-line series The K-lineseriesrsquo nearest neighbor-clustering algorithm (KNNCA) isdescribed as shown in Algorithm 1

In addition |119876119898| represents the number of elements in119876119898 As each K-line series will be matched once with all of theK-line series stored in the cluster the time complexity andspace complexity of KNNCA are both 119874(1198992)3 Mining of Patternsrsquo Predictive Power

We can mine and analyze the patternsrsquo predictive poweraccording to the following steps

(1) Pattern Recognition Based on the definition of K-linepatterns we identify all the K-line series belonging to

a pattern (such as TIU or TID) and then they form aset (KSet)(2) Pattern Clustering We use the KNSSC algorithm tocluster KSet then the set of clusters (CSet) can be gotten inwhich different clusters represent the same patternrsquos differentshapes

(3) Knowledge Mining We define some statistical indicatorsabout stock prices which we use to mine stock predictionknowledge from each cluster

The patternrsquos predictive power is gotten primarily byanalyzing the trend of the patternrsquos consequent K-line seriesPaper [22] found that K-line technology is suited for short-term investment prediction and that the most efficient timeperiod for prediction is 10 daysTherefore we mainly analyzethe close price trend of the patternrsquos consequent K-line seriesin 10 days Let KS = [119863119905 119863119905+1 119863119905+2] denote a three-day K-line pattern its consequent K-line series is denoted by CKSThe statistical indicators of CKS are defined as follows

(a) Let 119862119896 denote the k-th close price of CKS let 119875119880(119862119896)denote that the probability of the trend of 119862119896 is uptrend andlet 119875119863(119862119896) denote that the probability of the trend of 119862119896 isdowntrend 119875119880(119862119896) and 119875119863(119862119896) are calculated by

119875119880 (119862119896) =10038161003816100381610038161003816119876119898119896119880 10038161003816100381610038161003816|119876119898|

119875119863 (119862119896) =10038161003816100381610038161003816119876119898119896119863 10038161003816100381610038161003816|119876119898|

(16)

where |119876119898119896119880 | represents the number of patterns meeting thecondition of119862119896 ge 119862119905+2 in119876119898 |119876119898119896119863 | represents the number ofpatterns meeting the condition of 119862119896 le 119862119905+2 in 119876119898 and 119862119905+2represents the close price of119863119905+2 119875119880(119862119896) gt 05 indicates thatthe future trend of 119862119905 is rising 119875119863(119862119896) gt 05 indicates thatthe future trend of 119862119896 is descending

(b) Let 119875119899119880 isin [0 1] denote the probability that the closeprice will rise in the next 119899 days if the pattern appears119875119899119863 isin [0 1] denotes the probability that the close price will

8 Mathematical Problems in Engineering

InputKSet = KS1KS2 KS119899 the data set of K-line series120579 Similarity threshold

OutputCSet the set of clusters

KNNCA AlgorithmAssign initial value for parameters 119908119878 119908119875 119908Body 119908US 119908LS 119908119905119878 119908119905119875119898 = 1119876119898 = KS1 119876119898 represents them-th clusterCSet = 119876119898FOR 119894 = 2 TO 119899 DO

SimMax = 0FOR EACH 119876item IN CSet

FOR EACH KS119895 IN 119876item

Get Sim119894119895 based on formula (15)If (Sim119894119895 gt SimMax)SimMax = Sim119894119895119891 = item 119891 represents the ID of a cluster whose element is most similar to KS119894

EndEnd

IF (SimMax gt 120579) THEN119876119891 = 119876119891 cup KS119894ELSE 119898 = 119898 + 1119876119898 = KS119894

CSet = 1198761 1198762 119876119898Algorithm 1

fall in the next 119899 days if the pattern appears 119875119899119880 and 119875119899119863 arecalculated by

119875119899119880 =119899sum119896=1

Round [119875119880 (119862119896) 05]119899

Round [119875119880 (119862119896) 05] = 1 119875119880 (119862119896) gt 050 119875119880 (119862119896) le 05

(17)

119875119899119863 =119899sum119896=1

Round [119875119863 (119862119896) 05]119899

Round [119875119863 (119862119896) 05] = 1 119875119863 (119862119896) gt 050 119875119863 (119862119896) le 05

(18)

where a higher value of 119875119899119880 or 119875119899119863 indicates a strongercapability for predicting bullish or bearish market

(4) Analysis Based on the statistical result we analysis thepatternrsquos predictive power

4 Experiment and Result Analysis

41 Experiment Data and Method As Yahoo provides thefinance stock API used to download the transaction data ofChinese stock market the stock transaction data of ChineseA-share market in any time can be acquired based on theAPI To get a representative testing data we select the K-line series data of Shanghai 180 index component stocks overthe latest 10 years (from 2006-01-04 to 2016-08-24) as thetest data Limited by space only the TIU and TID patternrsquospredictive power will be analyzed in the experiment And theparameters of KNSSC algorithm are set as follows 120579 = 075119908119878 = 02 119908119879 = 08 119908Body = 06 119908US = 02 119908LS = 02 and119908119905119878 = 119908119905119875 = 1|KS119894| (119905 = 1 2 |KS119894|)42 Experiment One The aim of the first experiment is toanalyze the TIU patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TIU 1516 TIU patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 554 clusters are obtained We choose the top 20clusters with themost elements to conduct statistical analysisas shown in Table 1

Mathematical Problems in Engineering 9

Table 1 The experiment result of TIU

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198801198760 1516 0493 0486 0518 0495 0475 0488 0481 0479 0484 0479 011198761 79 0405 0532 0557 0532 0544 0506 0468 0506 043 0443 061198762 42 0405 0429 05 0452 0405 0452 0429 0476 0524 0476 011198763 32 05 0594 0594 0625 0594 0531 0625 0594 0563 0531 091198764 25 06 06 052 048 048 04 048 048 06 064 051198765 23 0435 0478 0609 0565 0522 0391 0478 0348 0391 0391 031198766 22 0455 0636 0545 05 0545 0545 0591 0591 0636 0636 081198767 21 0333 0571 0524 0476 0619 0857 0762 0714 0429 0048 061198768 21 0524 0476 0524 0429 0476 0476 0429 0476 0429 0429 021198769 21 0524 0476 0524 0429 0381 0286 0381 0429 0381 0429 0211987610 19 0105 0368 0421 0421 0368 0421 0474 0474 0579 0421 0111987611 19 0579 0474 0421 0368 0263 0368 0316 0263 0211 0211 0111987612 17 0588 0294 0471 0353 0412 0471 0471 0412 0471 0412 0111987613 15 06 0533 0467 0467 0467 0467 0533 0533 06 06 0611987614 15 0533 06 0667 06 0533 06 06 0667 06 06 111987615 14 05 0071 05 0429 0429 0357 0286 0286 0286 0429 011987616 14 0643 0571 0714 0714 0714 05 0643 0571 0571 0714 0911987617 14 0714 0857 0857 0857 0786 0786 0571 0643 0643 0643 111987618 14 0357 0571 0429 05 0357 0571 0429 0429 05 05 0211987619 13 0385 0462 0692 0615 0615 0692 0692 0769 0769 0769 0811987620 12 05 0667 0667 0667 075 0583 0583 0667 0583 0583 09

In Table 1 1198760 represents the cluster composed of 1516TIU Patterns Its 11987510119880 is only 05 which means that it may bea spurious pattern to predict bullish market However afterfurther classifying the TIU patterns we can see that (1) 11987631198766 11987616 and so forth have a strong capability for predictingbullish market because their 11987510119880 both are above 08 (2) 119876111987641198767 and so forth have amoderate capability for predictingbullishmarket as their11987510119880 are only in 05sim07 and (3)119876211987651198768 and so forth have a weak capability for predicting bullishmarket as their entire11987510119880 are below 05 Particularly for 1198762its 11987510119880 is only 01

By comparing the predictive power of 1198763 and 1198762 asshown in Figure 5 we can see that the predictive result of 1198763is bullish market while that of 1198762 is bearish market whichmeans that their predictive power is opposite The result ofexperiment one shows that (1) the predictive power of TIUvaries a great deal for different shapes and (2) to be a betterpattern for predicting bullish market the TIU pattern badlyneeds to be further classified which are consistent with theexpected analysis

43 Experiment Two The aim of the second experiment isto analyze the TID patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TID 1498 TID patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 572 clusters are obtained We choose the top 20

2 3 4 5 6 7 8 9 101

Q3

Q2

04

045

05

055

06

065

07

Figure 5 The comparison of predictive power between 1198763 and 1198762

clusters with themost elements to conduct statistical analysisas shown in Table 2

Similarly the TID pattern may be also a spurious patternto predict bearish market because 11987510119863 of 1198760 is 0 where1198760 represents the cluster composed of 1498 TID patternsMoreover after further classifying the TID patterns we cansee that (1) except for 11987611 and 11987616 almost all of the clustershave a weak capability for predicting bearish market as theirentire11987510119863 are below 05 and (2) even11987611 and11987616 still not havea higher value of 11987510119863 which are only 05 Therefore we can

10 Mathematical Problems in Engineering

Table 2 The experiment result of TID

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198631198760 1498 0421 0451 0461 0449 0437 0449 0459 0444 0427 0435 01198761 78 0449 05 0436 0436 0372 0359 0449 0397 0359 0346 011198762 72 0431 0486 0542 0458 0431 0431 0417 0417 0361 0306 01198763 44 0364 0364 0364 0386 0273 0386 0364 0386 0386 0341 011198764 39 0513 0359 0359 0333 0231 0282 0308 0282 0282 0282 01198765 23 0174 0261 0304 0391 0348 0391 0391 0435 0435 0391 011198766 23 0348 0478 0522 0478 0435 0435 0391 0478 0478 0435 041198767 21 0429 0429 0476 0429 0619 0524 0524 0476 0476 0571 01198768 18 0444 0333 0444 05 05 0278 0222 0333 0333 0333 021198769 18 0333 0667 0611 0389 0444 0444 0389 0389 0389 0389 0111987610 18 0333 0222 0278 0167 0278 0389 0444 0389 05 0556 011987611 18 0444 0444 0278 0333 0389 0222 0278 0333 0444 0389 0511987612 16 0438 0375 05 0563 0688 0688 0625 05 0438 0563 0211987613 15 0267 0267 0467 04 0267 06 0533 0467 04 04 011987614 13 0385 0308 0308 0385 0308 0462 0231 0231 0308 0308 0411987615 12 025 0333 0583 0667 0667 0583 05 05 0417 05 0111987616 12 05 0333 025 025 05 0667 0417 05 05 05 0511987617 12 025 0833 075 0667 0583 05 05 0417 0417 0583 0411987618 11 0364 0273 0273 0364 0545 0455 0545 0545 0455 0545 011987619 11 0364 0455 0273 0364 0364 0364 0455 0455 0455 0364 011987620 11 0273 0364 0273 0364 0364 0364 0364 0364 0182 0273 0

consider that the TID pattern is definitely a spurious patternwhich is also consistent with the expected analysis

44 Experiment Conclusion Through the above experimentwe can draw the following conclusion (1) The predictivepower of a pattern varies a great deal for different shapes TakeTIU for example some shapesrsquo TIU patterns have a strongcapability for predicting bullish market while some othershave the opposite predictive power Therefore to analyze thepredictive power of a pattern we should make a concreteanalysis of concrete shapes (2) There are definitely somespurious patterns in the existing K-line patterns Thereforein order to improve the stock prediction performance basedon K-line patterns we need to further classify the existingpatterns based on the shape feature identify all the spuriouspatterns and choose the patterns having stronger predictivepower to predict the stock price

5 Conclusion

Stock prediction is a popular research field in the timeseries prediction As a primary technology analysis methodof stock prediction there is different option on the stockprice prediction based on K-line patterns in the academicworld though it is widely used in reality To help resolve thedebate this paper uses the data mining method like patternrecognition similarity match cluster and statistical analysisand so forth to study the predictive power of K-line patternsExperimental results show that one reason for the debateis that the definition of K-line patterns is more open and

lack of mathematical rigor The other is that there are somespurious patterns in the existing K-line patterns In additionthe method presented in the paper can be used not onlyto test the predictive power of patterns but also for K-linepatterns mining and stock prediction Therefore the futureworks as follows (1) It will be a necessary and significanttask to identify the entire spurious pattern using the proposedmethod (2) On the basis of the proposed method we canresearch an automatic pattern mining method to discovermore useful patterns for stock prediction

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

The Key Basic Research Foundation of Shanghai Scienceand Technology Committee China (Grant no 14JC1402203)and the Science and Technology Support Program of China(Grant no 2015BAF10B01) financially supported this work

References

[1] J Lin S Williamson K D Borne et al ldquoAdvances in machinelearning amp data mining for astronomyrdquo Pattern Recognition inTime Series pp 617ndash645 2010

[2] L Lihui T Xiang Y Haidong et al ldquoFinancial time series fore-casting based on SVRrdquo Computer Engineering and Applicationsvol 41 no 30 pp 221ndash224 2005

Mathematical Problems in Engineering 11

[3] R D Edwards J Magee and W H C Bassetti TechnicalAnalysis of Stock Trends CRC Press Boca Raton Fla USA 10thedition 2012

[4] S Nison Japanese Candlestick Charting Techniques A Contem-porary Guide to the Ancient Investment Technique of the FarEast Institute of Finance New York NY USA 1991

[5] B R Marshall M R Young and L C Rose ldquoCandlesticktechnical trading strategies can they create value for investorsrdquoJournal of Banking and Finance vol 30 no 8 pp 2303ndash23232006

[6] G Caginalp and H Laurent ldquoThe predictive power of pricepatternsrdquoApplied Mathematical Finance vol 5 no 3-4 pp 181ndash205 1998

[7] K H Lee and G S Jo ldquoExpert system for predicting stockmarket timing using a candlestick chartrdquo Expert Systems withApplications vol 16 no 4 pp 357ndash364 1999

[8] M M Goswami C K Bhensdadia and A P Ganatra ldquoCan-dlestick analysis based short term prediction of stock pricefluctuation usingSOM-CBRrdquo in Proceedings of the 2009 IEEEInternational Advance Computing Conference (IACC rsquo09) pp1448ndash1452 March 2009

[9] T H Lu and J Chen Candlestick charting in European stockmarkets no 2 pp 20ndash25 2013

[10] T H Lu and Y M Shiu ldquoTests for two day candlestick patternsin the emerging equity market of Taiwanrdquo Emerging MarketsFinance amp Trade vol 48 no 1 pp 41ndash57 2014

[11] H LiWW Y Ng JW T Lee B Sun andD S Yeung ldquoQuanti-tative study on candlestick pattern for shenzhen stock marketrdquoin Proceedings of the 2008 IEEE International Conference onSystems Man and Cybernetics (SMC rsquo08) pp 54ndash59 October2008

[12] W Xiao WW Y Ng M Firth et al ldquoL-GEM basedMCS aidedcandlestick pattern investment strategy in the shenzhen stockmarketrdquo in Proceedings of the 2009 International Conference onMachine Learning and Cybernetics pp 243ndash248 July 2009

[13] T Kamo and C Dagli ldquoHybrid approach to the Japanesecandlestick method for financial forecastingrdquo Expert Systemswith Applications vol 36 no 3 pp 5023ndash5030 2009

[14] M Jasemi A M Kimiagari and A Memariani ldquoA modernneural network model to do stock market timing on the basisof the ancient investment technique of Japanese CandlestickrdquoExpert Systems with Applications vol 38 no 4 pp 3884ndash38902011

[15] S Barak J H Dahooie and T Tichy ldquoWrapper ANFIS-ICAmethod to do stock market timing and feature selection on thebasis of JapaneseCandlestickrdquoExpert SystemswithApplicationsvol 42 no 23 pp 9221ndash9235 2015

[16] JH Fock C Klein andB Zwergel ldquoPerformance of candlestickanalysis on intraday futures datardquoThe Journal ofDerivatives vol13 no 1 pp 28ndash40 2005

[17] B R Marshall M R Young and R Cahan ldquoAre candlesticktechnical trading strategies profitable in the Japanese equitymarketrdquo Review of Quantitative Finance and Accounting vol31 no 2 pp 191ndash207 2008

[18] M J Horton ldquoStars crows and doji the use of candlesticksin stock selectionrdquo Quarterly Review of Economics and Financevol 49 no 2 pp 283ndash294 2009

[19] L J Chmielewski M Janowicz and A Orłowski ldquoPredictionof trend reversals in stock market by classification of Japanesecandlesticksrdquo in Proceedings of the 9th International ConferenceonComputer Recognition Systems (CORES rsquo15) vol 403 pp 641ndash647

[20] C-F Tsai and Z-Y Quan ldquoStock prediction by searching forsimilarities in candlestick chartsrdquo ACM Transactions on Man-agement Information Systems vol 5 no 2 Article ID 25916722014

[21] L Chmielewski M Janowicz J Kaleta and A Orłowski ldquoPat-tern recognition in the Japanese candlesticksrdquoAdvances in Intel-ligent Systems and Computing vol 342 pp 227ndash234 2015

[22] G L Morris and R Litchfield Candlestick Charting ExplainedTimeless Techniques for Trading Stocks and Futures McGraw-Hill New Yourk NY USA 2nd edition 2006

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 2: K-Line Patterns’ Predictive Power Analysis Using the Methods of …downloads.hindawi.com/journals/mpe/2017/3096917.pdf · 2019. 7. 30. · K-Line Patterns’ Predictive Power Analysis

2 Mathematical Problems in Engineering

(a) Shape A (b) Shape B (c) Shape C

Figure 1 Three kinds of shapes of TIU pattern

In K-line series if a K-line subseries contains someknowledge used to predict stock then this subseries is called aK-line pattern series a K-line pattern for short For instancewhen a subseries appears the stock price will often rise ordescend Then this subseries is a typical pattern series Stockprediction based on K-line patterns is the essence of K-linetechnology analysis How to mine the K-line patterns andhow to make use of these patterns for predicting are mainresearch contents of K-line technology analysis

By the artificial methods of observing the K-line series ofstock market (or Japanese rice market) people (the leadingcharacter is the founder of K-line Munehisa Honma whowas a Japanese rice trader in the 18th century) have foundmany K-line patterns The literatures [4 5] introduce theexisting patterns and their features in detail such as ThreeInside Up (TIU) Three Inside Down (TID) and DojiSome papers [6ndash10] conclude from the experiment that theexisting K-line patterns have a good forecasting capabilityfor forecasting stock trends Some other papers [11ndash15] havestudied the stock prediction based on these patterns andhave achieved some research results However there are alsoa number of papers [5 16ndash18] challenging these patternsrsquopredictive power They argue that K-line technology analysisviolates the efficient market hypothesis so it is not feasiblefor stock investment based on K-line patterns They alsodid some experiments which show that the existing K-linepatterns have no predictive power

Based on the above analysis it is obvious that there aresome disputes on whether the K-line patterns have predictivepower in academia However there are few papers analyzingthe reason why there are two different positions regardingthe patternsrsquo predictive power Paper [19] also pays attentionto the debate while it does not analyze the K-line patternsthemselves but attempts to obtain an answer to the followingquestion are the trend reversals accompanied more often bysome types of candlesticks than by others Finally paper [19]

has found that there exist types of candlesticks that frequentlytend to appear close to the trend-reversal regions and othersthat cannot be found in such regions Although the paperrsquosresearch shows that the K-line patterns exist it does not givethe answer that why there is a debate on the K-line patternsrsquopredictive power

Through reviewing the relevant literatures this paperconsiders that the main reason is that the existing K-linepatterns are lack of rigorous mathematical definition Forexample the shadow length and body size are not definedclearly in the definition of K-line patterns which meansthat a K-line pattern has many different shapes Because thepredictive power of a pattern may vary a lot for differentshapes If we ignore the shape difference and research thepredictive power of a pattern by taking all patterns withvarious shapes as a whole instead of classifying the patternfurther based on its shape feature then the study result ofK-line patternsrsquo predictive power may produce deviationsFor instance a TIU pattern has three shapes shape Ashape B and shape C as shown in Figure 1 where shapeA is the generic form of TIU pattern and shape B and Care infrequent form of which Suppose that shape A haspredictive power and shape B and C do not have predictivepower When studying the predictive power of TIU patternif we ignore the shape difference between the three patternsand research them as a whole then we will come to thewrong conclusion that TIU pattern has no predictive powerHowever if the three patterns are classified further based onshape features and researched separately then we can get thecorrect conclusion that TIUpattern has predictive power onlyat shape A

In addition another reason is that as the existing K-linepatterns are mined by artificial means there may be somespurious pattern in them

In order to resolve the debate and verify the two infer-ences this paper presents the research of K-line patternsrsquo

Mathematical Problems in Engineering 3

High

Close

Body

Open

Low

Upper shadow

Lower shadow

(a) Yang line

High

Open

Body

Close

Low

Upper shadow

Lower shadow

(b) Yin line

High

Low

Upper shadow

Lower shadow

Open = close

(c) Doji line

Figure 2 K-line chart

predictive power using the data mining related method suchas pattern recognition pattern clustering pattern knowledgemining and statistical analysis The rest of this paper isorganized as follows In Section 2 we firstly shortly introduceK-line K-line technology analysis and K-line patterns Thenwe define the similarity match model and nearest neighbor-clustering algorithm of K-line series In Section 3 we definethe mining method of patternsrsquo predictive power Section 4presents the experimental result and discussion Section 5concludes the paper

2 K-Line and K-Line Series Clustering

Firstly we give the mathematic definition of K-line series LetKS119894 represent the 119894-th K-line series of any stock and let 119863119894119905represent t-th K-line in KS119894 then

KS119894 =[[[[[[[

1198621198941 1198621198942 sdot sdot sdot 119862|KS119894|1198741198941 1198741198942 sdot sdot sdot 119874119894|KS119894|1198671198941 1198671198942 sdot sdot sdot 119867119894|KS119894|1198711198941 1198711198942 sdot sdot sdot 119871119894|KS119894|

]]]]]]] (10038161003816100381610038161003816KS11989410038161003816100381610038161003816 isin 119873+)

119863119894119905 = [119862119894119905 119874119894119905 119867119894119905 119871119894119905]119879 (1 le 119905 le 10038161003816100381610038161003816KS11989410038161003816100381610038161003816)

(1)

where |KS119894| is the number of elements in KS119894 which is alsocalled the length of KS119894 119862119894119905 119874119894119905 119867119894119905 and 119871119894119905 are the t-th dayrsquosclose price open price high price and low price in KS119894respectively In this paper ldquo| |rdquo symbol indicates the numberof elements in the set or series

21 K-Line

211 K-Line Introduction As defined in literature [4ndash6] theK-line is drawnby four basic elements close price open pricehigh price and low price where the part between the closeprice and open price is drawn into a rectangle called body ofK-line and the part between the high price and body is drawn

into a line called upper shadow of K-line Moreover the partbetween the lower price and body is drawn into a line calledlower shadow of K-line This kind of very personalized linesconsisting of upper shadow lower shadow and body is calledK-line

In the K-line if open price is lower than close price K-line also called Yang line the body is usually filled with whiteor green color as shown in Figure 2(a) And if open priceis higher than close price K-line also called Yin line thebody is usually filled with black or red color as shown inFigure 2(b) Moreover if open price is equal to close price K-line also called Doji line the body then collapses into a singlehorizontal line as shown in Figure 2(c) It is important tonote that the body color of Yin line and Yang line is differentin Chinese stock market and stock markets of European andAmerican In Chinese stock market the body color of Yangline and Yin line is red and green respectively Howeverthe body color of Yang line and Yin line is green and redrespectively in the stockmarkets of European and American

212 K-Line Technology Analysis Firstly we introduce anddefine some key concepts of K-line technology analysis Let119863119905 represent the t-th dayrsquos K-line of any stock

(1) Moving Average It is the average of stock price for sometime The three-day moving average at time119863119905 is defined by

119872avg (119863119905) = 13 119862119905minus2 + 119862119905minus1 + 119862119905 (2)

where 119862119905 denotes the close price of119863119905(2) K-Line Trend It is used to describe the K-linersquos trendincluding uptrend and downtrend 119863119905 is said to be a down-trend if

119872avg (119863119905minus6) gt 119872avg (119863119905minus5) gt sdot sdot sdot gt 119872avg (119863119905) (3)

with at most one violation of the inequalities Uptrend isdefined analogously

4 Mathematical Problems in Engineering

(a) TIU (b) TID

Figure 3 THE standard K-line series chart of TIU and TID

(3) Stock Price Trend It is used to describe the generaltrend of stock prices for some time including uptrend anddowntrend If the future trend of stock price is rising it iscalled bullish market In contrast if the future trend of stockprice is descending it is called bearish market Moreovera more intense rising or descending trend indicates a moretypical bullish or bearish market The capability of a K-linepatter for predicting the bullish market and bearish market isdefined in formulas (17) and (18) respectively

It is noted that the concepts of ldquomoving averagerdquo and ldquoK-line trendrdquo are defined by the paper [6] while the concept ofstock price trend is firstly defined by the paper

213 K-Line Patterns ManyK-line patterns have beenminedup to now as shown in literatures [4ndash6] Limited by spaceonly the patterns of TIU and TID will be introduced in thenext content Let KS = [119863119905 119863119905+1 119863119905+2] represent a three-dayK-line series

The conditions of KS becoming the TIU pattern are asfollows (1) 119863119905 is a downtrend and 119862119905 lt 119874119905 (2) 119862119905+1 gt 119874119905+1119874119905 ge 119862119905+1 gt 119862119905 and 119874119905 ge 119874119905+1 gt 119862119905 where at most oneof the two equalities holds That is the second day 119905 + 1 isYang line and must be contained with the body of the firstday (3) 119862119905+2 gt 119874119905+2 119862119905+2 gt 119874119905 That is the third day 119905 + 2 isYang line and closes above the open of the first dayA standardTIU pattern is shown in Figure 3(a)

The predictive power of TIU pattern from the existingliterature is that TIU is a trend-reversal pattern which givesthe bullish market signal This means when the TIU patternappears the stock prices will be likely to be transferredfrom downtrend into uptrend or the stock market would bechanged from bearishmarket to bullishmarket and the stockprices would rise gradually

The conditions of KS becoming the TID pattern are asfollows (1) 119863119905 is an uptrend and 119862119905 gt 119874119905 (2) 119862119905+1 lt 119874119905+1119862119905 ge 119862119905+1 gt 119874119905 and 119862119905 ge 119874119905+1 gt 119874119905 where at most one ofthe two equalities holds That is the second day 119905 + 1 is Yin

line and must be contained within the body of the first day(3) 119862119905+2 lt 119874119905+2 119862119905+2 lt 119874119905 That is the third day 119905 + 2 is Yinline and its close is lower than the first dayrsquos open A standardTID pattern is shown in Figure 3(b)

The predictive power of TID pattern from the existingliterature is that TID is a trend-reversal pattern which givesthe bearish market signal That means after the TID patternappears the stock prices will be likely to be transferredfrom uptrend into downtrend or the stock market would bechanged from bullishmarket to bearishmarket and the stockprices would fall gradually

22 Similarity Match of K-Line Series The similarity matchof K-line series is an essential and basis task for K-lineseries clustering In the literature however there are fewpapers focusing on the similarity match of K-line series Onlypaper [20] studies the similarity match method and searchalgorithmofK-line series using image retrieval technology Inaddition paper [19 21] proposes the similarity match modelof K-line series based on the traditional Euclidean distance

From the view of stock prediction the K-line seriesrsquosimilarity refers to the trend similarity of K-line in the K-line series However the K-line trend is determined by theclose price change open price change high price changelow price change and the size relationship between closeprice and open price Therefore if we want to match thesimilarity between two K-line series we should calculate thesimilarity of K-line price changes instead of the similarity ofprice values As the changes of K-line price are not shownin the K-line chart K-line prices distance rather than K-line price changes distance is used in the similarity matchmodel of literature [19ndash21] This means that these matchmodels belong to similarity match methods based on K-lineprice values rather than K-line price changesTherefore theycannot accuratelymeasure the similarity of stock prices trendin the K-line series

Mathematical Problems in Engineering 5

For example assuming that there are twoK-line series KS119894

and KS119895 needed to match their similarity where KS119894 = KS119895

and Sim119894119895 indicate their similarity Let 119862119894119905 = 10 119862119894119905+1 = 105119862119895119905 = 20 119862119895119905+1 = 21 and RC119905+1119905119894 indicate the close pricechange rate of KS119894 at day 119905 + 1 which is calculated by (119862119894119905+1 minus119862119894119905)119862119894119905 SRC119905+1119905119894119895 denotes the similarity between RC119905+1119905119894 andRC119905+1119905119894 then RC119905+1119905119894 = RC119905+1119905119895 = 5 and SRC119905+1119905119894119895 = 1 Wecannot calculate the correct result of SRC119905+1119905119894119895 = 1 by thesimilarity match model in literature [19ndash21] Similarly thesame problems would occur for calculating the similarity ofopen price high price or low price

Therefore this paper proposes a new similarity matchmodel based on K-line price changes to measure the trendsimilarity between two K-line series In this model thesimilarity of K-line series is composed of two parts one isthe shape similarity of K-line which is the similarity of thecorrespondingK-linersquos shape features in the twoK-line seriesthe other is the position similarity of K-line which is thesimilarity of the corresponding K-linersquos position features inthe two K-line series Therefore this paper will define K-lineseriesrsquo shape similarity model and position similarity modelrespectively Then based on these two kinds of similaritymodels the similarity model of the entire K-line series couldbe built

221 The Shape Similarity of K-Line Series According to theshape feature of K-line this paper proposes using the shape

distance to measure the shape similarity between two K-lines Firstly based on the shape structure of K-line threecomponents of K-line shape are extracted the upper shadowshape the lower shadow shape and the body shape Secondlythe similarity match methods of three shapes are definedrespectively Finally the shape similarity of K-line can becalculated by summing the three shapesrsquo similarity Assumingthat 119863119894119905 and 119863119895119905 denote the t-th dayrsquos K-line of KS119894 and KS119895respectively the shape similarity model of K-line series isdefined as follows(1) Let US119894[119905] denote the upper shadow length of 119863119894119905 asdefined in the following formula

US119894 [119905] =

119867119894119905 minus 119874119894119905119862119894119905minus1 lowast 01 119874119894119905 ge 119862119894119905

119867119894119905 minus 119862119894119905119862119894119905minus1 lowast 01 119874119894119905 lt 119862119894119905

(4)

where 119862119894119905minus1 lowast 01 is used to normalize the upper shadowlength According to the related regulation of Chinese A-share market the range of daily fluctuations of stock pricescannot exceed 10 of the previous dayrsquos close price So119862119894119905minus1 lowast01 can be used to normalize the length of the K-linersquos uppershadow lower shadow and body

Let Sim119894119895US(119905) denote the upper shadow similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895US (119905) =

min (US119894 [119905] US119895 [119905])max (US119894 [119905] US119895 [119905]) US119894 [119905] lowast US119895 [119905] gt 00 US119894 [119905] lowast US119895 [119905] = 0 US119894 [119905] = US119895 [119905] 1 US119894 [119905] = US119895 [119905] = 0

(5)

(2) Let LS119894[119905] denote the lower shadow length of 119863119894119905 asdefined in the following formula

LS119894 [119905] =

119862119894119905 minus 119871119894119905119862119894119905minus1 lowast 01 119874119894119905 ge 119862119894119905

119874119894119905 minus 119871119894119905119862119894119905minus1 lowast 01 119874119894119905 lt 119862119894119905

(6)

Let Sim119894119895LS(119905) denote the lower shadow similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895LS (119905) =

min (LS119894 [119905] LS119895 [119905])max (LS119894 [119905] LS119895 [119905]) LS119894 [119905] lowast LS119895 [119905] gt 00 LS119894 [119905] lowast LS119895 [119905] = 0 LS119894 [119905] = LS119895 [119905] 1 LS119894 [119905] = LS119895 [119905] = 0

(7)

6 Mathematical Problems in Engineering

(3) Let 119861119894[119905] denote the body length of 119863119894119905 as defined inthe following formula

119861119894 [119905] = 119862119894119905 minus 119874119894119905119862119894119905minus1 lowast 01 (8)

Let Sim119894119895Body(119905) denote the body similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895Body (119905) =

min (10038161003816100381610038161003816119861119894 [119905]10038161003816100381610038161003816 10038161003816100381610038161003816119861119895 [119905]10038161003816100381610038161003816)max (1003816100381610038161003816119861119894 [119905]1003816100381610038161003816 1003816100381610038161003816119861119895 [119905]1003816100381610038161003816) 119861

119894 [119905] lowast 119861119895 [119905] gt 00 119861119894 [119905] lowast 119861119895 [119905] lt 00 119861119894 [119905] lowast 119861119895 [119905] = 0 10038161003816100381610038161003816119861119894 [119905]10038161003816100381610038161003816 = 10038161003816100381610038161003816119861119895 [119905]10038161003816100381610038161003816 1 119861119894 [119905] = 119861119895 [119905] = 0

(9)

(4) Let Sim119894119895119878 (119905) denote the shape similarity between 119863119894119905and119863119895119905 as defined by

119908Body + 119908LS + 119908US = 1119908Body ge 0119908US ge 0119908LS ge 0

Sim119894119895119878 (119905) = 119908Body lowast Sim119894119895Body (119905) + 119908US

lowast Sim119894119895US (119905) + 119908LS

lowast Sim119894119895LS (119905)

(10)

where119908Body119908US and119908LS represent the weight of Sim119894119895

Body(119905)Sim119894119895US(119905) and Sim119894119895LS(119905) respectively(5) Let SSim119894119895 denote the shape similarity between KS119894

and KS119895 as defined by

SSim119894119895 = 119899sum119905=1

Sim119894119895119878 (119905) lowast 119908119905119878

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119878 = 1) (11)

where119908119905119878 represent the weight of Sim119894119895119878 (119905) Thanks to the ideathat each K-line can be given different weight the K-lineseries having special shape features could be identified well

222 The Position Similarity of K-Line Series For computingthe similarity between twoK-line series we not only considerthe shape similarity of K-line series but also the positionsimilarity If we only consider the shape similarity then itwill cause the problem that two K-line series having sameshape features but different position features will have thesame similarity

For example supposing that the K-line series chart of KS119894

and KS119895 is shown in Figure 2 we can see that according tothe shape feature definition of K-line all of the corresponding

K-lines of KS119894 and KS119895 have the same shape features Thesemean that KS119894 and KS119895 have identical shape features thatis SSim119894119895 = 1 However as is vividly shown in Figure 4the relative positions of 1198631198942 and 1198631198952 are different though 1198631198941and 1198631198951 have the same relative position in the K-line seriesTherefor the stock pricersquos overall trend of KS119894 and KS119895 are notidentical that is Sim119894119895 lt 1 If we only consider the shapesimilarity we will draw the wrong conclusion that SSim119894119895 =Sim119894119895 = 1

To solve this problem the concept of K-line coordinateis introduced hoping to implement the position match of K-line by definingK-linersquos coordinate in the K-line series In thispaper the sequence of K-line in the K-line series is called119909 coordinate of K-line the increase range of close price iscalled 119910 coordinate of K-line in addition the first K-linersquos119910 coordinate is set to 1 in the K-line series Therefore theposition similarity model of K-line series based on K-linecoordinate is defined as follows(1) Let (119909119894119905 119910119894119905) denote the coordinate of 119863119894119905 which aredefined in the following formula

119909119894119905 = 119905

119910119894119905 =

1 119905 = 1(119862119894119905 minus 119862119894119905minus1)119862119894119905minus1 lowast 01 119905 gt 1

(12)

Let Sim119894119895119875 (119905) denote the position similarity between 119863119894119905and119863119895119905 as defined by

Sim119894119895119875 (119905) =

min (119910119894119905 119910119895119905 )max (119910119894119905 119910119895119905 ) 119910

119894119905 lowast 119910119895119905 gt 0

0 119910119894119905 lowast 119910119895119905 = 0 119910119894119905 = 119910119895119905 1 119910119894119905 = 119910119895119905 = 0

(13)

Mathematical Problems in Engineering 7

110

100

95

90

102

105

110

100

(a) KS119894

110

100

95

9092

95

100

90(b) KS119895

Figure 4 K-line series chart

(2) Let PSim119894119895 denote the position similarity between KS119894

and KS119895 as defined by

PSim119894119895 = 119899sum119905=1

Sim119894119895119875 (119905) lowast 119908119905119875

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119875 = 1) (14)

where 119908119905119875 represents the weight of Sim119894119895119875 (119905) Thanks to theidea that each K-line can be given different weight the K-lineseries having special coordinates could be identified well

223 The Similarity of K-Line Series Finally based on theshape similarity and position similarity the similarity of K-line series could be obtained Therefore the similarity matchmodel between KS119894 and KS119895 is defined by

Sim119894119895 = SSim119894119895 lowast 119908119878 + PSim119894119895 lowast 119908119875 (15)

where 119908119878 and 119908119875 represent the shape similarity weight andposition similarity weight of K-line series respectively

23 Cluster of K-Line Series Themore accurate classificationresult of K-line patterns can be gotten by clustering themusing the nearest neighbor-clustering algorithm based onthe similarity match model of K-line series The K-lineseriesrsquo nearest neighbor-clustering algorithm (KNNCA) isdescribed as shown in Algorithm 1

In addition |119876119898| represents the number of elements in119876119898 As each K-line series will be matched once with all of theK-line series stored in the cluster the time complexity andspace complexity of KNNCA are both 119874(1198992)3 Mining of Patternsrsquo Predictive Power

We can mine and analyze the patternsrsquo predictive poweraccording to the following steps

(1) Pattern Recognition Based on the definition of K-linepatterns we identify all the K-line series belonging to

a pattern (such as TIU or TID) and then they form aset (KSet)(2) Pattern Clustering We use the KNSSC algorithm tocluster KSet then the set of clusters (CSet) can be gotten inwhich different clusters represent the same patternrsquos differentshapes

(3) Knowledge Mining We define some statistical indicatorsabout stock prices which we use to mine stock predictionknowledge from each cluster

The patternrsquos predictive power is gotten primarily byanalyzing the trend of the patternrsquos consequent K-line seriesPaper [22] found that K-line technology is suited for short-term investment prediction and that the most efficient timeperiod for prediction is 10 daysTherefore we mainly analyzethe close price trend of the patternrsquos consequent K-line seriesin 10 days Let KS = [119863119905 119863119905+1 119863119905+2] denote a three-day K-line pattern its consequent K-line series is denoted by CKSThe statistical indicators of CKS are defined as follows

(a) Let 119862119896 denote the k-th close price of CKS let 119875119880(119862119896)denote that the probability of the trend of 119862119896 is uptrend andlet 119875119863(119862119896) denote that the probability of the trend of 119862119896 isdowntrend 119875119880(119862119896) and 119875119863(119862119896) are calculated by

119875119880 (119862119896) =10038161003816100381610038161003816119876119898119896119880 10038161003816100381610038161003816|119876119898|

119875119863 (119862119896) =10038161003816100381610038161003816119876119898119896119863 10038161003816100381610038161003816|119876119898|

(16)

where |119876119898119896119880 | represents the number of patterns meeting thecondition of119862119896 ge 119862119905+2 in119876119898 |119876119898119896119863 | represents the number ofpatterns meeting the condition of 119862119896 le 119862119905+2 in 119876119898 and 119862119905+2represents the close price of119863119905+2 119875119880(119862119896) gt 05 indicates thatthe future trend of 119862119905 is rising 119875119863(119862119896) gt 05 indicates thatthe future trend of 119862119896 is descending

(b) Let 119875119899119880 isin [0 1] denote the probability that the closeprice will rise in the next 119899 days if the pattern appears119875119899119863 isin [0 1] denotes the probability that the close price will

8 Mathematical Problems in Engineering

InputKSet = KS1KS2 KS119899 the data set of K-line series120579 Similarity threshold

OutputCSet the set of clusters

KNNCA AlgorithmAssign initial value for parameters 119908119878 119908119875 119908Body 119908US 119908LS 119908119905119878 119908119905119875119898 = 1119876119898 = KS1 119876119898 represents them-th clusterCSet = 119876119898FOR 119894 = 2 TO 119899 DO

SimMax = 0FOR EACH 119876item IN CSet

FOR EACH KS119895 IN 119876item

Get Sim119894119895 based on formula (15)If (Sim119894119895 gt SimMax)SimMax = Sim119894119895119891 = item 119891 represents the ID of a cluster whose element is most similar to KS119894

EndEnd

IF (SimMax gt 120579) THEN119876119891 = 119876119891 cup KS119894ELSE 119898 = 119898 + 1119876119898 = KS119894

CSet = 1198761 1198762 119876119898Algorithm 1

fall in the next 119899 days if the pattern appears 119875119899119880 and 119875119899119863 arecalculated by

119875119899119880 =119899sum119896=1

Round [119875119880 (119862119896) 05]119899

Round [119875119880 (119862119896) 05] = 1 119875119880 (119862119896) gt 050 119875119880 (119862119896) le 05

(17)

119875119899119863 =119899sum119896=1

Round [119875119863 (119862119896) 05]119899

Round [119875119863 (119862119896) 05] = 1 119875119863 (119862119896) gt 050 119875119863 (119862119896) le 05

(18)

where a higher value of 119875119899119880 or 119875119899119863 indicates a strongercapability for predicting bullish or bearish market

(4) Analysis Based on the statistical result we analysis thepatternrsquos predictive power

4 Experiment and Result Analysis

41 Experiment Data and Method As Yahoo provides thefinance stock API used to download the transaction data ofChinese stock market the stock transaction data of ChineseA-share market in any time can be acquired based on theAPI To get a representative testing data we select the K-line series data of Shanghai 180 index component stocks overthe latest 10 years (from 2006-01-04 to 2016-08-24) as thetest data Limited by space only the TIU and TID patternrsquospredictive power will be analyzed in the experiment And theparameters of KNSSC algorithm are set as follows 120579 = 075119908119878 = 02 119908119879 = 08 119908Body = 06 119908US = 02 119908LS = 02 and119908119905119878 = 119908119905119875 = 1|KS119894| (119905 = 1 2 |KS119894|)42 Experiment One The aim of the first experiment is toanalyze the TIU patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TIU 1516 TIU patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 554 clusters are obtained We choose the top 20clusters with themost elements to conduct statistical analysisas shown in Table 1

Mathematical Problems in Engineering 9

Table 1 The experiment result of TIU

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198801198760 1516 0493 0486 0518 0495 0475 0488 0481 0479 0484 0479 011198761 79 0405 0532 0557 0532 0544 0506 0468 0506 043 0443 061198762 42 0405 0429 05 0452 0405 0452 0429 0476 0524 0476 011198763 32 05 0594 0594 0625 0594 0531 0625 0594 0563 0531 091198764 25 06 06 052 048 048 04 048 048 06 064 051198765 23 0435 0478 0609 0565 0522 0391 0478 0348 0391 0391 031198766 22 0455 0636 0545 05 0545 0545 0591 0591 0636 0636 081198767 21 0333 0571 0524 0476 0619 0857 0762 0714 0429 0048 061198768 21 0524 0476 0524 0429 0476 0476 0429 0476 0429 0429 021198769 21 0524 0476 0524 0429 0381 0286 0381 0429 0381 0429 0211987610 19 0105 0368 0421 0421 0368 0421 0474 0474 0579 0421 0111987611 19 0579 0474 0421 0368 0263 0368 0316 0263 0211 0211 0111987612 17 0588 0294 0471 0353 0412 0471 0471 0412 0471 0412 0111987613 15 06 0533 0467 0467 0467 0467 0533 0533 06 06 0611987614 15 0533 06 0667 06 0533 06 06 0667 06 06 111987615 14 05 0071 05 0429 0429 0357 0286 0286 0286 0429 011987616 14 0643 0571 0714 0714 0714 05 0643 0571 0571 0714 0911987617 14 0714 0857 0857 0857 0786 0786 0571 0643 0643 0643 111987618 14 0357 0571 0429 05 0357 0571 0429 0429 05 05 0211987619 13 0385 0462 0692 0615 0615 0692 0692 0769 0769 0769 0811987620 12 05 0667 0667 0667 075 0583 0583 0667 0583 0583 09

In Table 1 1198760 represents the cluster composed of 1516TIU Patterns Its 11987510119880 is only 05 which means that it may bea spurious pattern to predict bullish market However afterfurther classifying the TIU patterns we can see that (1) 11987631198766 11987616 and so forth have a strong capability for predictingbullish market because their 11987510119880 both are above 08 (2) 119876111987641198767 and so forth have amoderate capability for predictingbullishmarket as their11987510119880 are only in 05sim07 and (3)119876211987651198768 and so forth have a weak capability for predicting bullishmarket as their entire11987510119880 are below 05 Particularly for 1198762its 11987510119880 is only 01

By comparing the predictive power of 1198763 and 1198762 asshown in Figure 5 we can see that the predictive result of 1198763is bullish market while that of 1198762 is bearish market whichmeans that their predictive power is opposite The result ofexperiment one shows that (1) the predictive power of TIUvaries a great deal for different shapes and (2) to be a betterpattern for predicting bullish market the TIU pattern badlyneeds to be further classified which are consistent with theexpected analysis

43 Experiment Two The aim of the second experiment isto analyze the TID patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TID 1498 TID patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 572 clusters are obtained We choose the top 20

2 3 4 5 6 7 8 9 101

Q3

Q2

04

045

05

055

06

065

07

Figure 5 The comparison of predictive power between 1198763 and 1198762

clusters with themost elements to conduct statistical analysisas shown in Table 2

Similarly the TID pattern may be also a spurious patternto predict bearish market because 11987510119863 of 1198760 is 0 where1198760 represents the cluster composed of 1498 TID patternsMoreover after further classifying the TID patterns we cansee that (1) except for 11987611 and 11987616 almost all of the clustershave a weak capability for predicting bearish market as theirentire11987510119863 are below 05 and (2) even11987611 and11987616 still not havea higher value of 11987510119863 which are only 05 Therefore we can

10 Mathematical Problems in Engineering

Table 2 The experiment result of TID

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198631198760 1498 0421 0451 0461 0449 0437 0449 0459 0444 0427 0435 01198761 78 0449 05 0436 0436 0372 0359 0449 0397 0359 0346 011198762 72 0431 0486 0542 0458 0431 0431 0417 0417 0361 0306 01198763 44 0364 0364 0364 0386 0273 0386 0364 0386 0386 0341 011198764 39 0513 0359 0359 0333 0231 0282 0308 0282 0282 0282 01198765 23 0174 0261 0304 0391 0348 0391 0391 0435 0435 0391 011198766 23 0348 0478 0522 0478 0435 0435 0391 0478 0478 0435 041198767 21 0429 0429 0476 0429 0619 0524 0524 0476 0476 0571 01198768 18 0444 0333 0444 05 05 0278 0222 0333 0333 0333 021198769 18 0333 0667 0611 0389 0444 0444 0389 0389 0389 0389 0111987610 18 0333 0222 0278 0167 0278 0389 0444 0389 05 0556 011987611 18 0444 0444 0278 0333 0389 0222 0278 0333 0444 0389 0511987612 16 0438 0375 05 0563 0688 0688 0625 05 0438 0563 0211987613 15 0267 0267 0467 04 0267 06 0533 0467 04 04 011987614 13 0385 0308 0308 0385 0308 0462 0231 0231 0308 0308 0411987615 12 025 0333 0583 0667 0667 0583 05 05 0417 05 0111987616 12 05 0333 025 025 05 0667 0417 05 05 05 0511987617 12 025 0833 075 0667 0583 05 05 0417 0417 0583 0411987618 11 0364 0273 0273 0364 0545 0455 0545 0545 0455 0545 011987619 11 0364 0455 0273 0364 0364 0364 0455 0455 0455 0364 011987620 11 0273 0364 0273 0364 0364 0364 0364 0364 0182 0273 0

consider that the TID pattern is definitely a spurious patternwhich is also consistent with the expected analysis

44 Experiment Conclusion Through the above experimentwe can draw the following conclusion (1) The predictivepower of a pattern varies a great deal for different shapes TakeTIU for example some shapesrsquo TIU patterns have a strongcapability for predicting bullish market while some othershave the opposite predictive power Therefore to analyze thepredictive power of a pattern we should make a concreteanalysis of concrete shapes (2) There are definitely somespurious patterns in the existing K-line patterns Thereforein order to improve the stock prediction performance basedon K-line patterns we need to further classify the existingpatterns based on the shape feature identify all the spuriouspatterns and choose the patterns having stronger predictivepower to predict the stock price

5 Conclusion

Stock prediction is a popular research field in the timeseries prediction As a primary technology analysis methodof stock prediction there is different option on the stockprice prediction based on K-line patterns in the academicworld though it is widely used in reality To help resolve thedebate this paper uses the data mining method like patternrecognition similarity match cluster and statistical analysisand so forth to study the predictive power of K-line patternsExperimental results show that one reason for the debateis that the definition of K-line patterns is more open and

lack of mathematical rigor The other is that there are somespurious patterns in the existing K-line patterns In additionthe method presented in the paper can be used not onlyto test the predictive power of patterns but also for K-linepatterns mining and stock prediction Therefore the futureworks as follows (1) It will be a necessary and significanttask to identify the entire spurious pattern using the proposedmethod (2) On the basis of the proposed method we canresearch an automatic pattern mining method to discovermore useful patterns for stock prediction

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

The Key Basic Research Foundation of Shanghai Scienceand Technology Committee China (Grant no 14JC1402203)and the Science and Technology Support Program of China(Grant no 2015BAF10B01) financially supported this work

References

[1] J Lin S Williamson K D Borne et al ldquoAdvances in machinelearning amp data mining for astronomyrdquo Pattern Recognition inTime Series pp 617ndash645 2010

[2] L Lihui T Xiang Y Haidong et al ldquoFinancial time series fore-casting based on SVRrdquo Computer Engineering and Applicationsvol 41 no 30 pp 221ndash224 2005

Mathematical Problems in Engineering 11

[3] R D Edwards J Magee and W H C Bassetti TechnicalAnalysis of Stock Trends CRC Press Boca Raton Fla USA 10thedition 2012

[4] S Nison Japanese Candlestick Charting Techniques A Contem-porary Guide to the Ancient Investment Technique of the FarEast Institute of Finance New York NY USA 1991

[5] B R Marshall M R Young and L C Rose ldquoCandlesticktechnical trading strategies can they create value for investorsrdquoJournal of Banking and Finance vol 30 no 8 pp 2303ndash23232006

[6] G Caginalp and H Laurent ldquoThe predictive power of pricepatternsrdquoApplied Mathematical Finance vol 5 no 3-4 pp 181ndash205 1998

[7] K H Lee and G S Jo ldquoExpert system for predicting stockmarket timing using a candlestick chartrdquo Expert Systems withApplications vol 16 no 4 pp 357ndash364 1999

[8] M M Goswami C K Bhensdadia and A P Ganatra ldquoCan-dlestick analysis based short term prediction of stock pricefluctuation usingSOM-CBRrdquo in Proceedings of the 2009 IEEEInternational Advance Computing Conference (IACC rsquo09) pp1448ndash1452 March 2009

[9] T H Lu and J Chen Candlestick charting in European stockmarkets no 2 pp 20ndash25 2013

[10] T H Lu and Y M Shiu ldquoTests for two day candlestick patternsin the emerging equity market of Taiwanrdquo Emerging MarketsFinance amp Trade vol 48 no 1 pp 41ndash57 2014

[11] H LiWW Y Ng JW T Lee B Sun andD S Yeung ldquoQuanti-tative study on candlestick pattern for shenzhen stock marketrdquoin Proceedings of the 2008 IEEE International Conference onSystems Man and Cybernetics (SMC rsquo08) pp 54ndash59 October2008

[12] W Xiao WW Y Ng M Firth et al ldquoL-GEM basedMCS aidedcandlestick pattern investment strategy in the shenzhen stockmarketrdquo in Proceedings of the 2009 International Conference onMachine Learning and Cybernetics pp 243ndash248 July 2009

[13] T Kamo and C Dagli ldquoHybrid approach to the Japanesecandlestick method for financial forecastingrdquo Expert Systemswith Applications vol 36 no 3 pp 5023ndash5030 2009

[14] M Jasemi A M Kimiagari and A Memariani ldquoA modernneural network model to do stock market timing on the basisof the ancient investment technique of Japanese CandlestickrdquoExpert Systems with Applications vol 38 no 4 pp 3884ndash38902011

[15] S Barak J H Dahooie and T Tichy ldquoWrapper ANFIS-ICAmethod to do stock market timing and feature selection on thebasis of JapaneseCandlestickrdquoExpert SystemswithApplicationsvol 42 no 23 pp 9221ndash9235 2015

[16] JH Fock C Klein andB Zwergel ldquoPerformance of candlestickanalysis on intraday futures datardquoThe Journal ofDerivatives vol13 no 1 pp 28ndash40 2005

[17] B R Marshall M R Young and R Cahan ldquoAre candlesticktechnical trading strategies profitable in the Japanese equitymarketrdquo Review of Quantitative Finance and Accounting vol31 no 2 pp 191ndash207 2008

[18] M J Horton ldquoStars crows and doji the use of candlesticksin stock selectionrdquo Quarterly Review of Economics and Financevol 49 no 2 pp 283ndash294 2009

[19] L J Chmielewski M Janowicz and A Orłowski ldquoPredictionof trend reversals in stock market by classification of Japanesecandlesticksrdquo in Proceedings of the 9th International ConferenceonComputer Recognition Systems (CORES rsquo15) vol 403 pp 641ndash647

[20] C-F Tsai and Z-Y Quan ldquoStock prediction by searching forsimilarities in candlestick chartsrdquo ACM Transactions on Man-agement Information Systems vol 5 no 2 Article ID 25916722014

[21] L Chmielewski M Janowicz J Kaleta and A Orłowski ldquoPat-tern recognition in the Japanese candlesticksrdquoAdvances in Intel-ligent Systems and Computing vol 342 pp 227ndash234 2015

[22] G L Morris and R Litchfield Candlestick Charting ExplainedTimeless Techniques for Trading Stocks and Futures McGraw-Hill New Yourk NY USA 2nd edition 2006

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 3: K-Line Patterns’ Predictive Power Analysis Using the Methods of …downloads.hindawi.com/journals/mpe/2017/3096917.pdf · 2019. 7. 30. · K-Line Patterns’ Predictive Power Analysis

Mathematical Problems in Engineering 3

High

Close

Body

Open

Low

Upper shadow

Lower shadow

(a) Yang line

High

Open

Body

Close

Low

Upper shadow

Lower shadow

(b) Yin line

High

Low

Upper shadow

Lower shadow

Open = close

(c) Doji line

Figure 2 K-line chart

predictive power using the data mining related method suchas pattern recognition pattern clustering pattern knowledgemining and statistical analysis The rest of this paper isorganized as follows In Section 2 we firstly shortly introduceK-line K-line technology analysis and K-line patterns Thenwe define the similarity match model and nearest neighbor-clustering algorithm of K-line series In Section 3 we definethe mining method of patternsrsquo predictive power Section 4presents the experimental result and discussion Section 5concludes the paper

2 K-Line and K-Line Series Clustering

Firstly we give the mathematic definition of K-line series LetKS119894 represent the 119894-th K-line series of any stock and let 119863119894119905represent t-th K-line in KS119894 then

KS119894 =[[[[[[[

1198621198941 1198621198942 sdot sdot sdot 119862|KS119894|1198741198941 1198741198942 sdot sdot sdot 119874119894|KS119894|1198671198941 1198671198942 sdot sdot sdot 119867119894|KS119894|1198711198941 1198711198942 sdot sdot sdot 119871119894|KS119894|

]]]]]]] (10038161003816100381610038161003816KS11989410038161003816100381610038161003816 isin 119873+)

119863119894119905 = [119862119894119905 119874119894119905 119867119894119905 119871119894119905]119879 (1 le 119905 le 10038161003816100381610038161003816KS11989410038161003816100381610038161003816)

(1)

where |KS119894| is the number of elements in KS119894 which is alsocalled the length of KS119894 119862119894119905 119874119894119905 119867119894119905 and 119871119894119905 are the t-th dayrsquosclose price open price high price and low price in KS119894respectively In this paper ldquo| |rdquo symbol indicates the numberof elements in the set or series

21 K-Line

211 K-Line Introduction As defined in literature [4ndash6] theK-line is drawnby four basic elements close price open pricehigh price and low price where the part between the closeprice and open price is drawn into a rectangle called body ofK-line and the part between the high price and body is drawn

into a line called upper shadow of K-line Moreover the partbetween the lower price and body is drawn into a line calledlower shadow of K-line This kind of very personalized linesconsisting of upper shadow lower shadow and body is calledK-line

In the K-line if open price is lower than close price K-line also called Yang line the body is usually filled with whiteor green color as shown in Figure 2(a) And if open priceis higher than close price K-line also called Yin line thebody is usually filled with black or red color as shown inFigure 2(b) Moreover if open price is equal to close price K-line also called Doji line the body then collapses into a singlehorizontal line as shown in Figure 2(c) It is important tonote that the body color of Yin line and Yang line is differentin Chinese stock market and stock markets of European andAmerican In Chinese stock market the body color of Yangline and Yin line is red and green respectively Howeverthe body color of Yang line and Yin line is green and redrespectively in the stockmarkets of European and American

212 K-Line Technology Analysis Firstly we introduce anddefine some key concepts of K-line technology analysis Let119863119905 represent the t-th dayrsquos K-line of any stock

(1) Moving Average It is the average of stock price for sometime The three-day moving average at time119863119905 is defined by

119872avg (119863119905) = 13 119862119905minus2 + 119862119905minus1 + 119862119905 (2)

where 119862119905 denotes the close price of119863119905(2) K-Line Trend It is used to describe the K-linersquos trendincluding uptrend and downtrend 119863119905 is said to be a down-trend if

119872avg (119863119905minus6) gt 119872avg (119863119905minus5) gt sdot sdot sdot gt 119872avg (119863119905) (3)

with at most one violation of the inequalities Uptrend isdefined analogously

4 Mathematical Problems in Engineering

(a) TIU (b) TID

Figure 3 THE standard K-line series chart of TIU and TID

(3) Stock Price Trend It is used to describe the generaltrend of stock prices for some time including uptrend anddowntrend If the future trend of stock price is rising it iscalled bullish market In contrast if the future trend of stockprice is descending it is called bearish market Moreovera more intense rising or descending trend indicates a moretypical bullish or bearish market The capability of a K-linepatter for predicting the bullish market and bearish market isdefined in formulas (17) and (18) respectively

It is noted that the concepts of ldquomoving averagerdquo and ldquoK-line trendrdquo are defined by the paper [6] while the concept ofstock price trend is firstly defined by the paper

213 K-Line Patterns ManyK-line patterns have beenminedup to now as shown in literatures [4ndash6] Limited by spaceonly the patterns of TIU and TID will be introduced in thenext content Let KS = [119863119905 119863119905+1 119863119905+2] represent a three-dayK-line series

The conditions of KS becoming the TIU pattern are asfollows (1) 119863119905 is a downtrend and 119862119905 lt 119874119905 (2) 119862119905+1 gt 119874119905+1119874119905 ge 119862119905+1 gt 119862119905 and 119874119905 ge 119874119905+1 gt 119862119905 where at most oneof the two equalities holds That is the second day 119905 + 1 isYang line and must be contained with the body of the firstday (3) 119862119905+2 gt 119874119905+2 119862119905+2 gt 119874119905 That is the third day 119905 + 2 isYang line and closes above the open of the first dayA standardTIU pattern is shown in Figure 3(a)

The predictive power of TIU pattern from the existingliterature is that TIU is a trend-reversal pattern which givesthe bullish market signal This means when the TIU patternappears the stock prices will be likely to be transferredfrom downtrend into uptrend or the stock market would bechanged from bearishmarket to bullishmarket and the stockprices would rise gradually

The conditions of KS becoming the TID pattern are asfollows (1) 119863119905 is an uptrend and 119862119905 gt 119874119905 (2) 119862119905+1 lt 119874119905+1119862119905 ge 119862119905+1 gt 119874119905 and 119862119905 ge 119874119905+1 gt 119874119905 where at most one ofthe two equalities holds That is the second day 119905 + 1 is Yin

line and must be contained within the body of the first day(3) 119862119905+2 lt 119874119905+2 119862119905+2 lt 119874119905 That is the third day 119905 + 2 is Yinline and its close is lower than the first dayrsquos open A standardTID pattern is shown in Figure 3(b)

The predictive power of TID pattern from the existingliterature is that TID is a trend-reversal pattern which givesthe bearish market signal That means after the TID patternappears the stock prices will be likely to be transferredfrom uptrend into downtrend or the stock market would bechanged from bullishmarket to bearishmarket and the stockprices would fall gradually

22 Similarity Match of K-Line Series The similarity matchof K-line series is an essential and basis task for K-lineseries clustering In the literature however there are fewpapers focusing on the similarity match of K-line series Onlypaper [20] studies the similarity match method and searchalgorithmofK-line series using image retrieval technology Inaddition paper [19 21] proposes the similarity match modelof K-line series based on the traditional Euclidean distance

From the view of stock prediction the K-line seriesrsquosimilarity refers to the trend similarity of K-line in the K-line series However the K-line trend is determined by theclose price change open price change high price changelow price change and the size relationship between closeprice and open price Therefore if we want to match thesimilarity between two K-line series we should calculate thesimilarity of K-line price changes instead of the similarity ofprice values As the changes of K-line price are not shownin the K-line chart K-line prices distance rather than K-line price changes distance is used in the similarity matchmodel of literature [19ndash21] This means that these matchmodels belong to similarity match methods based on K-lineprice values rather than K-line price changesTherefore theycannot accuratelymeasure the similarity of stock prices trendin the K-line series

Mathematical Problems in Engineering 5

For example assuming that there are twoK-line series KS119894

and KS119895 needed to match their similarity where KS119894 = KS119895

and Sim119894119895 indicate their similarity Let 119862119894119905 = 10 119862119894119905+1 = 105119862119895119905 = 20 119862119895119905+1 = 21 and RC119905+1119905119894 indicate the close pricechange rate of KS119894 at day 119905 + 1 which is calculated by (119862119894119905+1 minus119862119894119905)119862119894119905 SRC119905+1119905119894119895 denotes the similarity between RC119905+1119905119894 andRC119905+1119905119894 then RC119905+1119905119894 = RC119905+1119905119895 = 5 and SRC119905+1119905119894119895 = 1 Wecannot calculate the correct result of SRC119905+1119905119894119895 = 1 by thesimilarity match model in literature [19ndash21] Similarly thesame problems would occur for calculating the similarity ofopen price high price or low price

Therefore this paper proposes a new similarity matchmodel based on K-line price changes to measure the trendsimilarity between two K-line series In this model thesimilarity of K-line series is composed of two parts one isthe shape similarity of K-line which is the similarity of thecorrespondingK-linersquos shape features in the twoK-line seriesthe other is the position similarity of K-line which is thesimilarity of the corresponding K-linersquos position features inthe two K-line series Therefore this paper will define K-lineseriesrsquo shape similarity model and position similarity modelrespectively Then based on these two kinds of similaritymodels the similarity model of the entire K-line series couldbe built

221 The Shape Similarity of K-Line Series According to theshape feature of K-line this paper proposes using the shape

distance to measure the shape similarity between two K-lines Firstly based on the shape structure of K-line threecomponents of K-line shape are extracted the upper shadowshape the lower shadow shape and the body shape Secondlythe similarity match methods of three shapes are definedrespectively Finally the shape similarity of K-line can becalculated by summing the three shapesrsquo similarity Assumingthat 119863119894119905 and 119863119895119905 denote the t-th dayrsquos K-line of KS119894 and KS119895respectively the shape similarity model of K-line series isdefined as follows(1) Let US119894[119905] denote the upper shadow length of 119863119894119905 asdefined in the following formula

US119894 [119905] =

119867119894119905 minus 119874119894119905119862119894119905minus1 lowast 01 119874119894119905 ge 119862119894119905

119867119894119905 minus 119862119894119905119862119894119905minus1 lowast 01 119874119894119905 lt 119862119894119905

(4)

where 119862119894119905minus1 lowast 01 is used to normalize the upper shadowlength According to the related regulation of Chinese A-share market the range of daily fluctuations of stock pricescannot exceed 10 of the previous dayrsquos close price So119862119894119905minus1 lowast01 can be used to normalize the length of the K-linersquos uppershadow lower shadow and body

Let Sim119894119895US(119905) denote the upper shadow similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895US (119905) =

min (US119894 [119905] US119895 [119905])max (US119894 [119905] US119895 [119905]) US119894 [119905] lowast US119895 [119905] gt 00 US119894 [119905] lowast US119895 [119905] = 0 US119894 [119905] = US119895 [119905] 1 US119894 [119905] = US119895 [119905] = 0

(5)

(2) Let LS119894[119905] denote the lower shadow length of 119863119894119905 asdefined in the following formula

LS119894 [119905] =

119862119894119905 minus 119871119894119905119862119894119905minus1 lowast 01 119874119894119905 ge 119862119894119905

119874119894119905 minus 119871119894119905119862119894119905minus1 lowast 01 119874119894119905 lt 119862119894119905

(6)

Let Sim119894119895LS(119905) denote the lower shadow similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895LS (119905) =

min (LS119894 [119905] LS119895 [119905])max (LS119894 [119905] LS119895 [119905]) LS119894 [119905] lowast LS119895 [119905] gt 00 LS119894 [119905] lowast LS119895 [119905] = 0 LS119894 [119905] = LS119895 [119905] 1 LS119894 [119905] = LS119895 [119905] = 0

(7)

6 Mathematical Problems in Engineering

(3) Let 119861119894[119905] denote the body length of 119863119894119905 as defined inthe following formula

119861119894 [119905] = 119862119894119905 minus 119874119894119905119862119894119905minus1 lowast 01 (8)

Let Sim119894119895Body(119905) denote the body similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895Body (119905) =

min (10038161003816100381610038161003816119861119894 [119905]10038161003816100381610038161003816 10038161003816100381610038161003816119861119895 [119905]10038161003816100381610038161003816)max (1003816100381610038161003816119861119894 [119905]1003816100381610038161003816 1003816100381610038161003816119861119895 [119905]1003816100381610038161003816) 119861

119894 [119905] lowast 119861119895 [119905] gt 00 119861119894 [119905] lowast 119861119895 [119905] lt 00 119861119894 [119905] lowast 119861119895 [119905] = 0 10038161003816100381610038161003816119861119894 [119905]10038161003816100381610038161003816 = 10038161003816100381610038161003816119861119895 [119905]10038161003816100381610038161003816 1 119861119894 [119905] = 119861119895 [119905] = 0

(9)

(4) Let Sim119894119895119878 (119905) denote the shape similarity between 119863119894119905and119863119895119905 as defined by

119908Body + 119908LS + 119908US = 1119908Body ge 0119908US ge 0119908LS ge 0

Sim119894119895119878 (119905) = 119908Body lowast Sim119894119895Body (119905) + 119908US

lowast Sim119894119895US (119905) + 119908LS

lowast Sim119894119895LS (119905)

(10)

where119908Body119908US and119908LS represent the weight of Sim119894119895

Body(119905)Sim119894119895US(119905) and Sim119894119895LS(119905) respectively(5) Let SSim119894119895 denote the shape similarity between KS119894

and KS119895 as defined by

SSim119894119895 = 119899sum119905=1

Sim119894119895119878 (119905) lowast 119908119905119878

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119878 = 1) (11)

where119908119905119878 represent the weight of Sim119894119895119878 (119905) Thanks to the ideathat each K-line can be given different weight the K-lineseries having special shape features could be identified well

222 The Position Similarity of K-Line Series For computingthe similarity between twoK-line series we not only considerthe shape similarity of K-line series but also the positionsimilarity If we only consider the shape similarity then itwill cause the problem that two K-line series having sameshape features but different position features will have thesame similarity

For example supposing that the K-line series chart of KS119894

and KS119895 is shown in Figure 2 we can see that according tothe shape feature definition of K-line all of the corresponding

K-lines of KS119894 and KS119895 have the same shape features Thesemean that KS119894 and KS119895 have identical shape features thatis SSim119894119895 = 1 However as is vividly shown in Figure 4the relative positions of 1198631198942 and 1198631198952 are different though 1198631198941and 1198631198951 have the same relative position in the K-line seriesTherefor the stock pricersquos overall trend of KS119894 and KS119895 are notidentical that is Sim119894119895 lt 1 If we only consider the shapesimilarity we will draw the wrong conclusion that SSim119894119895 =Sim119894119895 = 1

To solve this problem the concept of K-line coordinateis introduced hoping to implement the position match of K-line by definingK-linersquos coordinate in the K-line series In thispaper the sequence of K-line in the K-line series is called119909 coordinate of K-line the increase range of close price iscalled 119910 coordinate of K-line in addition the first K-linersquos119910 coordinate is set to 1 in the K-line series Therefore theposition similarity model of K-line series based on K-linecoordinate is defined as follows(1) Let (119909119894119905 119910119894119905) denote the coordinate of 119863119894119905 which aredefined in the following formula

119909119894119905 = 119905

119910119894119905 =

1 119905 = 1(119862119894119905 minus 119862119894119905minus1)119862119894119905minus1 lowast 01 119905 gt 1

(12)

Let Sim119894119895119875 (119905) denote the position similarity between 119863119894119905and119863119895119905 as defined by

Sim119894119895119875 (119905) =

min (119910119894119905 119910119895119905 )max (119910119894119905 119910119895119905 ) 119910

119894119905 lowast 119910119895119905 gt 0

0 119910119894119905 lowast 119910119895119905 = 0 119910119894119905 = 119910119895119905 1 119910119894119905 = 119910119895119905 = 0

(13)

Mathematical Problems in Engineering 7

110

100

95

90

102

105

110

100

(a) KS119894

110

100

95

9092

95

100

90(b) KS119895

Figure 4 K-line series chart

(2) Let PSim119894119895 denote the position similarity between KS119894

and KS119895 as defined by

PSim119894119895 = 119899sum119905=1

Sim119894119895119875 (119905) lowast 119908119905119875

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119875 = 1) (14)

where 119908119905119875 represents the weight of Sim119894119895119875 (119905) Thanks to theidea that each K-line can be given different weight the K-lineseries having special coordinates could be identified well

223 The Similarity of K-Line Series Finally based on theshape similarity and position similarity the similarity of K-line series could be obtained Therefore the similarity matchmodel between KS119894 and KS119895 is defined by

Sim119894119895 = SSim119894119895 lowast 119908119878 + PSim119894119895 lowast 119908119875 (15)

where 119908119878 and 119908119875 represent the shape similarity weight andposition similarity weight of K-line series respectively

23 Cluster of K-Line Series Themore accurate classificationresult of K-line patterns can be gotten by clustering themusing the nearest neighbor-clustering algorithm based onthe similarity match model of K-line series The K-lineseriesrsquo nearest neighbor-clustering algorithm (KNNCA) isdescribed as shown in Algorithm 1

In addition |119876119898| represents the number of elements in119876119898 As each K-line series will be matched once with all of theK-line series stored in the cluster the time complexity andspace complexity of KNNCA are both 119874(1198992)3 Mining of Patternsrsquo Predictive Power

We can mine and analyze the patternsrsquo predictive poweraccording to the following steps

(1) Pattern Recognition Based on the definition of K-linepatterns we identify all the K-line series belonging to

a pattern (such as TIU or TID) and then they form aset (KSet)(2) Pattern Clustering We use the KNSSC algorithm tocluster KSet then the set of clusters (CSet) can be gotten inwhich different clusters represent the same patternrsquos differentshapes

(3) Knowledge Mining We define some statistical indicatorsabout stock prices which we use to mine stock predictionknowledge from each cluster

The patternrsquos predictive power is gotten primarily byanalyzing the trend of the patternrsquos consequent K-line seriesPaper [22] found that K-line technology is suited for short-term investment prediction and that the most efficient timeperiod for prediction is 10 daysTherefore we mainly analyzethe close price trend of the patternrsquos consequent K-line seriesin 10 days Let KS = [119863119905 119863119905+1 119863119905+2] denote a three-day K-line pattern its consequent K-line series is denoted by CKSThe statistical indicators of CKS are defined as follows

(a) Let 119862119896 denote the k-th close price of CKS let 119875119880(119862119896)denote that the probability of the trend of 119862119896 is uptrend andlet 119875119863(119862119896) denote that the probability of the trend of 119862119896 isdowntrend 119875119880(119862119896) and 119875119863(119862119896) are calculated by

119875119880 (119862119896) =10038161003816100381610038161003816119876119898119896119880 10038161003816100381610038161003816|119876119898|

119875119863 (119862119896) =10038161003816100381610038161003816119876119898119896119863 10038161003816100381610038161003816|119876119898|

(16)

where |119876119898119896119880 | represents the number of patterns meeting thecondition of119862119896 ge 119862119905+2 in119876119898 |119876119898119896119863 | represents the number ofpatterns meeting the condition of 119862119896 le 119862119905+2 in 119876119898 and 119862119905+2represents the close price of119863119905+2 119875119880(119862119896) gt 05 indicates thatthe future trend of 119862119905 is rising 119875119863(119862119896) gt 05 indicates thatthe future trend of 119862119896 is descending

(b) Let 119875119899119880 isin [0 1] denote the probability that the closeprice will rise in the next 119899 days if the pattern appears119875119899119863 isin [0 1] denotes the probability that the close price will

8 Mathematical Problems in Engineering

InputKSet = KS1KS2 KS119899 the data set of K-line series120579 Similarity threshold

OutputCSet the set of clusters

KNNCA AlgorithmAssign initial value for parameters 119908119878 119908119875 119908Body 119908US 119908LS 119908119905119878 119908119905119875119898 = 1119876119898 = KS1 119876119898 represents them-th clusterCSet = 119876119898FOR 119894 = 2 TO 119899 DO

SimMax = 0FOR EACH 119876item IN CSet

FOR EACH KS119895 IN 119876item

Get Sim119894119895 based on formula (15)If (Sim119894119895 gt SimMax)SimMax = Sim119894119895119891 = item 119891 represents the ID of a cluster whose element is most similar to KS119894

EndEnd

IF (SimMax gt 120579) THEN119876119891 = 119876119891 cup KS119894ELSE 119898 = 119898 + 1119876119898 = KS119894

CSet = 1198761 1198762 119876119898Algorithm 1

fall in the next 119899 days if the pattern appears 119875119899119880 and 119875119899119863 arecalculated by

119875119899119880 =119899sum119896=1

Round [119875119880 (119862119896) 05]119899

Round [119875119880 (119862119896) 05] = 1 119875119880 (119862119896) gt 050 119875119880 (119862119896) le 05

(17)

119875119899119863 =119899sum119896=1

Round [119875119863 (119862119896) 05]119899

Round [119875119863 (119862119896) 05] = 1 119875119863 (119862119896) gt 050 119875119863 (119862119896) le 05

(18)

where a higher value of 119875119899119880 or 119875119899119863 indicates a strongercapability for predicting bullish or bearish market

(4) Analysis Based on the statistical result we analysis thepatternrsquos predictive power

4 Experiment and Result Analysis

41 Experiment Data and Method As Yahoo provides thefinance stock API used to download the transaction data ofChinese stock market the stock transaction data of ChineseA-share market in any time can be acquired based on theAPI To get a representative testing data we select the K-line series data of Shanghai 180 index component stocks overthe latest 10 years (from 2006-01-04 to 2016-08-24) as thetest data Limited by space only the TIU and TID patternrsquospredictive power will be analyzed in the experiment And theparameters of KNSSC algorithm are set as follows 120579 = 075119908119878 = 02 119908119879 = 08 119908Body = 06 119908US = 02 119908LS = 02 and119908119905119878 = 119908119905119875 = 1|KS119894| (119905 = 1 2 |KS119894|)42 Experiment One The aim of the first experiment is toanalyze the TIU patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TIU 1516 TIU patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 554 clusters are obtained We choose the top 20clusters with themost elements to conduct statistical analysisas shown in Table 1

Mathematical Problems in Engineering 9

Table 1 The experiment result of TIU

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198801198760 1516 0493 0486 0518 0495 0475 0488 0481 0479 0484 0479 011198761 79 0405 0532 0557 0532 0544 0506 0468 0506 043 0443 061198762 42 0405 0429 05 0452 0405 0452 0429 0476 0524 0476 011198763 32 05 0594 0594 0625 0594 0531 0625 0594 0563 0531 091198764 25 06 06 052 048 048 04 048 048 06 064 051198765 23 0435 0478 0609 0565 0522 0391 0478 0348 0391 0391 031198766 22 0455 0636 0545 05 0545 0545 0591 0591 0636 0636 081198767 21 0333 0571 0524 0476 0619 0857 0762 0714 0429 0048 061198768 21 0524 0476 0524 0429 0476 0476 0429 0476 0429 0429 021198769 21 0524 0476 0524 0429 0381 0286 0381 0429 0381 0429 0211987610 19 0105 0368 0421 0421 0368 0421 0474 0474 0579 0421 0111987611 19 0579 0474 0421 0368 0263 0368 0316 0263 0211 0211 0111987612 17 0588 0294 0471 0353 0412 0471 0471 0412 0471 0412 0111987613 15 06 0533 0467 0467 0467 0467 0533 0533 06 06 0611987614 15 0533 06 0667 06 0533 06 06 0667 06 06 111987615 14 05 0071 05 0429 0429 0357 0286 0286 0286 0429 011987616 14 0643 0571 0714 0714 0714 05 0643 0571 0571 0714 0911987617 14 0714 0857 0857 0857 0786 0786 0571 0643 0643 0643 111987618 14 0357 0571 0429 05 0357 0571 0429 0429 05 05 0211987619 13 0385 0462 0692 0615 0615 0692 0692 0769 0769 0769 0811987620 12 05 0667 0667 0667 075 0583 0583 0667 0583 0583 09

In Table 1 1198760 represents the cluster composed of 1516TIU Patterns Its 11987510119880 is only 05 which means that it may bea spurious pattern to predict bullish market However afterfurther classifying the TIU patterns we can see that (1) 11987631198766 11987616 and so forth have a strong capability for predictingbullish market because their 11987510119880 both are above 08 (2) 119876111987641198767 and so forth have amoderate capability for predictingbullishmarket as their11987510119880 are only in 05sim07 and (3)119876211987651198768 and so forth have a weak capability for predicting bullishmarket as their entire11987510119880 are below 05 Particularly for 1198762its 11987510119880 is only 01

By comparing the predictive power of 1198763 and 1198762 asshown in Figure 5 we can see that the predictive result of 1198763is bullish market while that of 1198762 is bearish market whichmeans that their predictive power is opposite The result ofexperiment one shows that (1) the predictive power of TIUvaries a great deal for different shapes and (2) to be a betterpattern for predicting bullish market the TIU pattern badlyneeds to be further classified which are consistent with theexpected analysis

43 Experiment Two The aim of the second experiment isto analyze the TID patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TID 1498 TID patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 572 clusters are obtained We choose the top 20

2 3 4 5 6 7 8 9 101

Q3

Q2

04

045

05

055

06

065

07

Figure 5 The comparison of predictive power between 1198763 and 1198762

clusters with themost elements to conduct statistical analysisas shown in Table 2

Similarly the TID pattern may be also a spurious patternto predict bearish market because 11987510119863 of 1198760 is 0 where1198760 represents the cluster composed of 1498 TID patternsMoreover after further classifying the TID patterns we cansee that (1) except for 11987611 and 11987616 almost all of the clustershave a weak capability for predicting bearish market as theirentire11987510119863 are below 05 and (2) even11987611 and11987616 still not havea higher value of 11987510119863 which are only 05 Therefore we can

10 Mathematical Problems in Engineering

Table 2 The experiment result of TID

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198631198760 1498 0421 0451 0461 0449 0437 0449 0459 0444 0427 0435 01198761 78 0449 05 0436 0436 0372 0359 0449 0397 0359 0346 011198762 72 0431 0486 0542 0458 0431 0431 0417 0417 0361 0306 01198763 44 0364 0364 0364 0386 0273 0386 0364 0386 0386 0341 011198764 39 0513 0359 0359 0333 0231 0282 0308 0282 0282 0282 01198765 23 0174 0261 0304 0391 0348 0391 0391 0435 0435 0391 011198766 23 0348 0478 0522 0478 0435 0435 0391 0478 0478 0435 041198767 21 0429 0429 0476 0429 0619 0524 0524 0476 0476 0571 01198768 18 0444 0333 0444 05 05 0278 0222 0333 0333 0333 021198769 18 0333 0667 0611 0389 0444 0444 0389 0389 0389 0389 0111987610 18 0333 0222 0278 0167 0278 0389 0444 0389 05 0556 011987611 18 0444 0444 0278 0333 0389 0222 0278 0333 0444 0389 0511987612 16 0438 0375 05 0563 0688 0688 0625 05 0438 0563 0211987613 15 0267 0267 0467 04 0267 06 0533 0467 04 04 011987614 13 0385 0308 0308 0385 0308 0462 0231 0231 0308 0308 0411987615 12 025 0333 0583 0667 0667 0583 05 05 0417 05 0111987616 12 05 0333 025 025 05 0667 0417 05 05 05 0511987617 12 025 0833 075 0667 0583 05 05 0417 0417 0583 0411987618 11 0364 0273 0273 0364 0545 0455 0545 0545 0455 0545 011987619 11 0364 0455 0273 0364 0364 0364 0455 0455 0455 0364 011987620 11 0273 0364 0273 0364 0364 0364 0364 0364 0182 0273 0

consider that the TID pattern is definitely a spurious patternwhich is also consistent with the expected analysis

44 Experiment Conclusion Through the above experimentwe can draw the following conclusion (1) The predictivepower of a pattern varies a great deal for different shapes TakeTIU for example some shapesrsquo TIU patterns have a strongcapability for predicting bullish market while some othershave the opposite predictive power Therefore to analyze thepredictive power of a pattern we should make a concreteanalysis of concrete shapes (2) There are definitely somespurious patterns in the existing K-line patterns Thereforein order to improve the stock prediction performance basedon K-line patterns we need to further classify the existingpatterns based on the shape feature identify all the spuriouspatterns and choose the patterns having stronger predictivepower to predict the stock price

5 Conclusion

Stock prediction is a popular research field in the timeseries prediction As a primary technology analysis methodof stock prediction there is different option on the stockprice prediction based on K-line patterns in the academicworld though it is widely used in reality To help resolve thedebate this paper uses the data mining method like patternrecognition similarity match cluster and statistical analysisand so forth to study the predictive power of K-line patternsExperimental results show that one reason for the debateis that the definition of K-line patterns is more open and

lack of mathematical rigor The other is that there are somespurious patterns in the existing K-line patterns In additionthe method presented in the paper can be used not onlyto test the predictive power of patterns but also for K-linepatterns mining and stock prediction Therefore the futureworks as follows (1) It will be a necessary and significanttask to identify the entire spurious pattern using the proposedmethod (2) On the basis of the proposed method we canresearch an automatic pattern mining method to discovermore useful patterns for stock prediction

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

The Key Basic Research Foundation of Shanghai Scienceand Technology Committee China (Grant no 14JC1402203)and the Science and Technology Support Program of China(Grant no 2015BAF10B01) financially supported this work

References

[1] J Lin S Williamson K D Borne et al ldquoAdvances in machinelearning amp data mining for astronomyrdquo Pattern Recognition inTime Series pp 617ndash645 2010

[2] L Lihui T Xiang Y Haidong et al ldquoFinancial time series fore-casting based on SVRrdquo Computer Engineering and Applicationsvol 41 no 30 pp 221ndash224 2005

Mathematical Problems in Engineering 11

[3] R D Edwards J Magee and W H C Bassetti TechnicalAnalysis of Stock Trends CRC Press Boca Raton Fla USA 10thedition 2012

[4] S Nison Japanese Candlestick Charting Techniques A Contem-porary Guide to the Ancient Investment Technique of the FarEast Institute of Finance New York NY USA 1991

[5] B R Marshall M R Young and L C Rose ldquoCandlesticktechnical trading strategies can they create value for investorsrdquoJournal of Banking and Finance vol 30 no 8 pp 2303ndash23232006

[6] G Caginalp and H Laurent ldquoThe predictive power of pricepatternsrdquoApplied Mathematical Finance vol 5 no 3-4 pp 181ndash205 1998

[7] K H Lee and G S Jo ldquoExpert system for predicting stockmarket timing using a candlestick chartrdquo Expert Systems withApplications vol 16 no 4 pp 357ndash364 1999

[8] M M Goswami C K Bhensdadia and A P Ganatra ldquoCan-dlestick analysis based short term prediction of stock pricefluctuation usingSOM-CBRrdquo in Proceedings of the 2009 IEEEInternational Advance Computing Conference (IACC rsquo09) pp1448ndash1452 March 2009

[9] T H Lu and J Chen Candlestick charting in European stockmarkets no 2 pp 20ndash25 2013

[10] T H Lu and Y M Shiu ldquoTests for two day candlestick patternsin the emerging equity market of Taiwanrdquo Emerging MarketsFinance amp Trade vol 48 no 1 pp 41ndash57 2014

[11] H LiWW Y Ng JW T Lee B Sun andD S Yeung ldquoQuanti-tative study on candlestick pattern for shenzhen stock marketrdquoin Proceedings of the 2008 IEEE International Conference onSystems Man and Cybernetics (SMC rsquo08) pp 54ndash59 October2008

[12] W Xiao WW Y Ng M Firth et al ldquoL-GEM basedMCS aidedcandlestick pattern investment strategy in the shenzhen stockmarketrdquo in Proceedings of the 2009 International Conference onMachine Learning and Cybernetics pp 243ndash248 July 2009

[13] T Kamo and C Dagli ldquoHybrid approach to the Japanesecandlestick method for financial forecastingrdquo Expert Systemswith Applications vol 36 no 3 pp 5023ndash5030 2009

[14] M Jasemi A M Kimiagari and A Memariani ldquoA modernneural network model to do stock market timing on the basisof the ancient investment technique of Japanese CandlestickrdquoExpert Systems with Applications vol 38 no 4 pp 3884ndash38902011

[15] S Barak J H Dahooie and T Tichy ldquoWrapper ANFIS-ICAmethod to do stock market timing and feature selection on thebasis of JapaneseCandlestickrdquoExpert SystemswithApplicationsvol 42 no 23 pp 9221ndash9235 2015

[16] JH Fock C Klein andB Zwergel ldquoPerformance of candlestickanalysis on intraday futures datardquoThe Journal ofDerivatives vol13 no 1 pp 28ndash40 2005

[17] B R Marshall M R Young and R Cahan ldquoAre candlesticktechnical trading strategies profitable in the Japanese equitymarketrdquo Review of Quantitative Finance and Accounting vol31 no 2 pp 191ndash207 2008

[18] M J Horton ldquoStars crows and doji the use of candlesticksin stock selectionrdquo Quarterly Review of Economics and Financevol 49 no 2 pp 283ndash294 2009

[19] L J Chmielewski M Janowicz and A Orłowski ldquoPredictionof trend reversals in stock market by classification of Japanesecandlesticksrdquo in Proceedings of the 9th International ConferenceonComputer Recognition Systems (CORES rsquo15) vol 403 pp 641ndash647

[20] C-F Tsai and Z-Y Quan ldquoStock prediction by searching forsimilarities in candlestick chartsrdquo ACM Transactions on Man-agement Information Systems vol 5 no 2 Article ID 25916722014

[21] L Chmielewski M Janowicz J Kaleta and A Orłowski ldquoPat-tern recognition in the Japanese candlesticksrdquoAdvances in Intel-ligent Systems and Computing vol 342 pp 227ndash234 2015

[22] G L Morris and R Litchfield Candlestick Charting ExplainedTimeless Techniques for Trading Stocks and Futures McGraw-Hill New Yourk NY USA 2nd edition 2006

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 4: K-Line Patterns’ Predictive Power Analysis Using the Methods of …downloads.hindawi.com/journals/mpe/2017/3096917.pdf · 2019. 7. 30. · K-Line Patterns’ Predictive Power Analysis

4 Mathematical Problems in Engineering

(a) TIU (b) TID

Figure 3 THE standard K-line series chart of TIU and TID

(3) Stock Price Trend It is used to describe the generaltrend of stock prices for some time including uptrend anddowntrend If the future trend of stock price is rising it iscalled bullish market In contrast if the future trend of stockprice is descending it is called bearish market Moreovera more intense rising or descending trend indicates a moretypical bullish or bearish market The capability of a K-linepatter for predicting the bullish market and bearish market isdefined in formulas (17) and (18) respectively

It is noted that the concepts of ldquomoving averagerdquo and ldquoK-line trendrdquo are defined by the paper [6] while the concept ofstock price trend is firstly defined by the paper

213 K-Line Patterns ManyK-line patterns have beenminedup to now as shown in literatures [4ndash6] Limited by spaceonly the patterns of TIU and TID will be introduced in thenext content Let KS = [119863119905 119863119905+1 119863119905+2] represent a three-dayK-line series

The conditions of KS becoming the TIU pattern are asfollows (1) 119863119905 is a downtrend and 119862119905 lt 119874119905 (2) 119862119905+1 gt 119874119905+1119874119905 ge 119862119905+1 gt 119862119905 and 119874119905 ge 119874119905+1 gt 119862119905 where at most oneof the two equalities holds That is the second day 119905 + 1 isYang line and must be contained with the body of the firstday (3) 119862119905+2 gt 119874119905+2 119862119905+2 gt 119874119905 That is the third day 119905 + 2 isYang line and closes above the open of the first dayA standardTIU pattern is shown in Figure 3(a)

The predictive power of TIU pattern from the existingliterature is that TIU is a trend-reversal pattern which givesthe bullish market signal This means when the TIU patternappears the stock prices will be likely to be transferredfrom downtrend into uptrend or the stock market would bechanged from bearishmarket to bullishmarket and the stockprices would rise gradually

The conditions of KS becoming the TID pattern are asfollows (1) 119863119905 is an uptrend and 119862119905 gt 119874119905 (2) 119862119905+1 lt 119874119905+1119862119905 ge 119862119905+1 gt 119874119905 and 119862119905 ge 119874119905+1 gt 119874119905 where at most one ofthe two equalities holds That is the second day 119905 + 1 is Yin

line and must be contained within the body of the first day(3) 119862119905+2 lt 119874119905+2 119862119905+2 lt 119874119905 That is the third day 119905 + 2 is Yinline and its close is lower than the first dayrsquos open A standardTID pattern is shown in Figure 3(b)

The predictive power of TID pattern from the existingliterature is that TID is a trend-reversal pattern which givesthe bearish market signal That means after the TID patternappears the stock prices will be likely to be transferredfrom uptrend into downtrend or the stock market would bechanged from bullishmarket to bearishmarket and the stockprices would fall gradually

22 Similarity Match of K-Line Series The similarity matchof K-line series is an essential and basis task for K-lineseries clustering In the literature however there are fewpapers focusing on the similarity match of K-line series Onlypaper [20] studies the similarity match method and searchalgorithmofK-line series using image retrieval technology Inaddition paper [19 21] proposes the similarity match modelof K-line series based on the traditional Euclidean distance

From the view of stock prediction the K-line seriesrsquosimilarity refers to the trend similarity of K-line in the K-line series However the K-line trend is determined by theclose price change open price change high price changelow price change and the size relationship between closeprice and open price Therefore if we want to match thesimilarity between two K-line series we should calculate thesimilarity of K-line price changes instead of the similarity ofprice values As the changes of K-line price are not shownin the K-line chart K-line prices distance rather than K-line price changes distance is used in the similarity matchmodel of literature [19ndash21] This means that these matchmodels belong to similarity match methods based on K-lineprice values rather than K-line price changesTherefore theycannot accuratelymeasure the similarity of stock prices trendin the K-line series

Mathematical Problems in Engineering 5

For example assuming that there are twoK-line series KS119894

and KS119895 needed to match their similarity where KS119894 = KS119895

and Sim119894119895 indicate their similarity Let 119862119894119905 = 10 119862119894119905+1 = 105119862119895119905 = 20 119862119895119905+1 = 21 and RC119905+1119905119894 indicate the close pricechange rate of KS119894 at day 119905 + 1 which is calculated by (119862119894119905+1 minus119862119894119905)119862119894119905 SRC119905+1119905119894119895 denotes the similarity between RC119905+1119905119894 andRC119905+1119905119894 then RC119905+1119905119894 = RC119905+1119905119895 = 5 and SRC119905+1119905119894119895 = 1 Wecannot calculate the correct result of SRC119905+1119905119894119895 = 1 by thesimilarity match model in literature [19ndash21] Similarly thesame problems would occur for calculating the similarity ofopen price high price or low price

Therefore this paper proposes a new similarity matchmodel based on K-line price changes to measure the trendsimilarity between two K-line series In this model thesimilarity of K-line series is composed of two parts one isthe shape similarity of K-line which is the similarity of thecorrespondingK-linersquos shape features in the twoK-line seriesthe other is the position similarity of K-line which is thesimilarity of the corresponding K-linersquos position features inthe two K-line series Therefore this paper will define K-lineseriesrsquo shape similarity model and position similarity modelrespectively Then based on these two kinds of similaritymodels the similarity model of the entire K-line series couldbe built

221 The Shape Similarity of K-Line Series According to theshape feature of K-line this paper proposes using the shape

distance to measure the shape similarity between two K-lines Firstly based on the shape structure of K-line threecomponents of K-line shape are extracted the upper shadowshape the lower shadow shape and the body shape Secondlythe similarity match methods of three shapes are definedrespectively Finally the shape similarity of K-line can becalculated by summing the three shapesrsquo similarity Assumingthat 119863119894119905 and 119863119895119905 denote the t-th dayrsquos K-line of KS119894 and KS119895respectively the shape similarity model of K-line series isdefined as follows(1) Let US119894[119905] denote the upper shadow length of 119863119894119905 asdefined in the following formula

US119894 [119905] =

119867119894119905 minus 119874119894119905119862119894119905minus1 lowast 01 119874119894119905 ge 119862119894119905

119867119894119905 minus 119862119894119905119862119894119905minus1 lowast 01 119874119894119905 lt 119862119894119905

(4)

where 119862119894119905minus1 lowast 01 is used to normalize the upper shadowlength According to the related regulation of Chinese A-share market the range of daily fluctuations of stock pricescannot exceed 10 of the previous dayrsquos close price So119862119894119905minus1 lowast01 can be used to normalize the length of the K-linersquos uppershadow lower shadow and body

Let Sim119894119895US(119905) denote the upper shadow similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895US (119905) =

min (US119894 [119905] US119895 [119905])max (US119894 [119905] US119895 [119905]) US119894 [119905] lowast US119895 [119905] gt 00 US119894 [119905] lowast US119895 [119905] = 0 US119894 [119905] = US119895 [119905] 1 US119894 [119905] = US119895 [119905] = 0

(5)

(2) Let LS119894[119905] denote the lower shadow length of 119863119894119905 asdefined in the following formula

LS119894 [119905] =

119862119894119905 minus 119871119894119905119862119894119905minus1 lowast 01 119874119894119905 ge 119862119894119905

119874119894119905 minus 119871119894119905119862119894119905minus1 lowast 01 119874119894119905 lt 119862119894119905

(6)

Let Sim119894119895LS(119905) denote the lower shadow similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895LS (119905) =

min (LS119894 [119905] LS119895 [119905])max (LS119894 [119905] LS119895 [119905]) LS119894 [119905] lowast LS119895 [119905] gt 00 LS119894 [119905] lowast LS119895 [119905] = 0 LS119894 [119905] = LS119895 [119905] 1 LS119894 [119905] = LS119895 [119905] = 0

(7)

6 Mathematical Problems in Engineering

(3) Let 119861119894[119905] denote the body length of 119863119894119905 as defined inthe following formula

119861119894 [119905] = 119862119894119905 minus 119874119894119905119862119894119905minus1 lowast 01 (8)

Let Sim119894119895Body(119905) denote the body similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895Body (119905) =

min (10038161003816100381610038161003816119861119894 [119905]10038161003816100381610038161003816 10038161003816100381610038161003816119861119895 [119905]10038161003816100381610038161003816)max (1003816100381610038161003816119861119894 [119905]1003816100381610038161003816 1003816100381610038161003816119861119895 [119905]1003816100381610038161003816) 119861

119894 [119905] lowast 119861119895 [119905] gt 00 119861119894 [119905] lowast 119861119895 [119905] lt 00 119861119894 [119905] lowast 119861119895 [119905] = 0 10038161003816100381610038161003816119861119894 [119905]10038161003816100381610038161003816 = 10038161003816100381610038161003816119861119895 [119905]10038161003816100381610038161003816 1 119861119894 [119905] = 119861119895 [119905] = 0

(9)

(4) Let Sim119894119895119878 (119905) denote the shape similarity between 119863119894119905and119863119895119905 as defined by

119908Body + 119908LS + 119908US = 1119908Body ge 0119908US ge 0119908LS ge 0

Sim119894119895119878 (119905) = 119908Body lowast Sim119894119895Body (119905) + 119908US

lowast Sim119894119895US (119905) + 119908LS

lowast Sim119894119895LS (119905)

(10)

where119908Body119908US and119908LS represent the weight of Sim119894119895

Body(119905)Sim119894119895US(119905) and Sim119894119895LS(119905) respectively(5) Let SSim119894119895 denote the shape similarity between KS119894

and KS119895 as defined by

SSim119894119895 = 119899sum119905=1

Sim119894119895119878 (119905) lowast 119908119905119878

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119878 = 1) (11)

where119908119905119878 represent the weight of Sim119894119895119878 (119905) Thanks to the ideathat each K-line can be given different weight the K-lineseries having special shape features could be identified well

222 The Position Similarity of K-Line Series For computingthe similarity between twoK-line series we not only considerthe shape similarity of K-line series but also the positionsimilarity If we only consider the shape similarity then itwill cause the problem that two K-line series having sameshape features but different position features will have thesame similarity

For example supposing that the K-line series chart of KS119894

and KS119895 is shown in Figure 2 we can see that according tothe shape feature definition of K-line all of the corresponding

K-lines of KS119894 and KS119895 have the same shape features Thesemean that KS119894 and KS119895 have identical shape features thatis SSim119894119895 = 1 However as is vividly shown in Figure 4the relative positions of 1198631198942 and 1198631198952 are different though 1198631198941and 1198631198951 have the same relative position in the K-line seriesTherefor the stock pricersquos overall trend of KS119894 and KS119895 are notidentical that is Sim119894119895 lt 1 If we only consider the shapesimilarity we will draw the wrong conclusion that SSim119894119895 =Sim119894119895 = 1

To solve this problem the concept of K-line coordinateis introduced hoping to implement the position match of K-line by definingK-linersquos coordinate in the K-line series In thispaper the sequence of K-line in the K-line series is called119909 coordinate of K-line the increase range of close price iscalled 119910 coordinate of K-line in addition the first K-linersquos119910 coordinate is set to 1 in the K-line series Therefore theposition similarity model of K-line series based on K-linecoordinate is defined as follows(1) Let (119909119894119905 119910119894119905) denote the coordinate of 119863119894119905 which aredefined in the following formula

119909119894119905 = 119905

119910119894119905 =

1 119905 = 1(119862119894119905 minus 119862119894119905minus1)119862119894119905minus1 lowast 01 119905 gt 1

(12)

Let Sim119894119895119875 (119905) denote the position similarity between 119863119894119905and119863119895119905 as defined by

Sim119894119895119875 (119905) =

min (119910119894119905 119910119895119905 )max (119910119894119905 119910119895119905 ) 119910

119894119905 lowast 119910119895119905 gt 0

0 119910119894119905 lowast 119910119895119905 = 0 119910119894119905 = 119910119895119905 1 119910119894119905 = 119910119895119905 = 0

(13)

Mathematical Problems in Engineering 7

110

100

95

90

102

105

110

100

(a) KS119894

110

100

95

9092

95

100

90(b) KS119895

Figure 4 K-line series chart

(2) Let PSim119894119895 denote the position similarity between KS119894

and KS119895 as defined by

PSim119894119895 = 119899sum119905=1

Sim119894119895119875 (119905) lowast 119908119905119875

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119875 = 1) (14)

where 119908119905119875 represents the weight of Sim119894119895119875 (119905) Thanks to theidea that each K-line can be given different weight the K-lineseries having special coordinates could be identified well

223 The Similarity of K-Line Series Finally based on theshape similarity and position similarity the similarity of K-line series could be obtained Therefore the similarity matchmodel between KS119894 and KS119895 is defined by

Sim119894119895 = SSim119894119895 lowast 119908119878 + PSim119894119895 lowast 119908119875 (15)

where 119908119878 and 119908119875 represent the shape similarity weight andposition similarity weight of K-line series respectively

23 Cluster of K-Line Series Themore accurate classificationresult of K-line patterns can be gotten by clustering themusing the nearest neighbor-clustering algorithm based onthe similarity match model of K-line series The K-lineseriesrsquo nearest neighbor-clustering algorithm (KNNCA) isdescribed as shown in Algorithm 1

In addition |119876119898| represents the number of elements in119876119898 As each K-line series will be matched once with all of theK-line series stored in the cluster the time complexity andspace complexity of KNNCA are both 119874(1198992)3 Mining of Patternsrsquo Predictive Power

We can mine and analyze the patternsrsquo predictive poweraccording to the following steps

(1) Pattern Recognition Based on the definition of K-linepatterns we identify all the K-line series belonging to

a pattern (such as TIU or TID) and then they form aset (KSet)(2) Pattern Clustering We use the KNSSC algorithm tocluster KSet then the set of clusters (CSet) can be gotten inwhich different clusters represent the same patternrsquos differentshapes

(3) Knowledge Mining We define some statistical indicatorsabout stock prices which we use to mine stock predictionknowledge from each cluster

The patternrsquos predictive power is gotten primarily byanalyzing the trend of the patternrsquos consequent K-line seriesPaper [22] found that K-line technology is suited for short-term investment prediction and that the most efficient timeperiod for prediction is 10 daysTherefore we mainly analyzethe close price trend of the patternrsquos consequent K-line seriesin 10 days Let KS = [119863119905 119863119905+1 119863119905+2] denote a three-day K-line pattern its consequent K-line series is denoted by CKSThe statistical indicators of CKS are defined as follows

(a) Let 119862119896 denote the k-th close price of CKS let 119875119880(119862119896)denote that the probability of the trend of 119862119896 is uptrend andlet 119875119863(119862119896) denote that the probability of the trend of 119862119896 isdowntrend 119875119880(119862119896) and 119875119863(119862119896) are calculated by

119875119880 (119862119896) =10038161003816100381610038161003816119876119898119896119880 10038161003816100381610038161003816|119876119898|

119875119863 (119862119896) =10038161003816100381610038161003816119876119898119896119863 10038161003816100381610038161003816|119876119898|

(16)

where |119876119898119896119880 | represents the number of patterns meeting thecondition of119862119896 ge 119862119905+2 in119876119898 |119876119898119896119863 | represents the number ofpatterns meeting the condition of 119862119896 le 119862119905+2 in 119876119898 and 119862119905+2represents the close price of119863119905+2 119875119880(119862119896) gt 05 indicates thatthe future trend of 119862119905 is rising 119875119863(119862119896) gt 05 indicates thatthe future trend of 119862119896 is descending

(b) Let 119875119899119880 isin [0 1] denote the probability that the closeprice will rise in the next 119899 days if the pattern appears119875119899119863 isin [0 1] denotes the probability that the close price will

8 Mathematical Problems in Engineering

InputKSet = KS1KS2 KS119899 the data set of K-line series120579 Similarity threshold

OutputCSet the set of clusters

KNNCA AlgorithmAssign initial value for parameters 119908119878 119908119875 119908Body 119908US 119908LS 119908119905119878 119908119905119875119898 = 1119876119898 = KS1 119876119898 represents them-th clusterCSet = 119876119898FOR 119894 = 2 TO 119899 DO

SimMax = 0FOR EACH 119876item IN CSet

FOR EACH KS119895 IN 119876item

Get Sim119894119895 based on formula (15)If (Sim119894119895 gt SimMax)SimMax = Sim119894119895119891 = item 119891 represents the ID of a cluster whose element is most similar to KS119894

EndEnd

IF (SimMax gt 120579) THEN119876119891 = 119876119891 cup KS119894ELSE 119898 = 119898 + 1119876119898 = KS119894

CSet = 1198761 1198762 119876119898Algorithm 1

fall in the next 119899 days if the pattern appears 119875119899119880 and 119875119899119863 arecalculated by

119875119899119880 =119899sum119896=1

Round [119875119880 (119862119896) 05]119899

Round [119875119880 (119862119896) 05] = 1 119875119880 (119862119896) gt 050 119875119880 (119862119896) le 05

(17)

119875119899119863 =119899sum119896=1

Round [119875119863 (119862119896) 05]119899

Round [119875119863 (119862119896) 05] = 1 119875119863 (119862119896) gt 050 119875119863 (119862119896) le 05

(18)

where a higher value of 119875119899119880 or 119875119899119863 indicates a strongercapability for predicting bullish or bearish market

(4) Analysis Based on the statistical result we analysis thepatternrsquos predictive power

4 Experiment and Result Analysis

41 Experiment Data and Method As Yahoo provides thefinance stock API used to download the transaction data ofChinese stock market the stock transaction data of ChineseA-share market in any time can be acquired based on theAPI To get a representative testing data we select the K-line series data of Shanghai 180 index component stocks overthe latest 10 years (from 2006-01-04 to 2016-08-24) as thetest data Limited by space only the TIU and TID patternrsquospredictive power will be analyzed in the experiment And theparameters of KNSSC algorithm are set as follows 120579 = 075119908119878 = 02 119908119879 = 08 119908Body = 06 119908US = 02 119908LS = 02 and119908119905119878 = 119908119905119875 = 1|KS119894| (119905 = 1 2 |KS119894|)42 Experiment One The aim of the first experiment is toanalyze the TIU patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TIU 1516 TIU patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 554 clusters are obtained We choose the top 20clusters with themost elements to conduct statistical analysisas shown in Table 1

Mathematical Problems in Engineering 9

Table 1 The experiment result of TIU

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198801198760 1516 0493 0486 0518 0495 0475 0488 0481 0479 0484 0479 011198761 79 0405 0532 0557 0532 0544 0506 0468 0506 043 0443 061198762 42 0405 0429 05 0452 0405 0452 0429 0476 0524 0476 011198763 32 05 0594 0594 0625 0594 0531 0625 0594 0563 0531 091198764 25 06 06 052 048 048 04 048 048 06 064 051198765 23 0435 0478 0609 0565 0522 0391 0478 0348 0391 0391 031198766 22 0455 0636 0545 05 0545 0545 0591 0591 0636 0636 081198767 21 0333 0571 0524 0476 0619 0857 0762 0714 0429 0048 061198768 21 0524 0476 0524 0429 0476 0476 0429 0476 0429 0429 021198769 21 0524 0476 0524 0429 0381 0286 0381 0429 0381 0429 0211987610 19 0105 0368 0421 0421 0368 0421 0474 0474 0579 0421 0111987611 19 0579 0474 0421 0368 0263 0368 0316 0263 0211 0211 0111987612 17 0588 0294 0471 0353 0412 0471 0471 0412 0471 0412 0111987613 15 06 0533 0467 0467 0467 0467 0533 0533 06 06 0611987614 15 0533 06 0667 06 0533 06 06 0667 06 06 111987615 14 05 0071 05 0429 0429 0357 0286 0286 0286 0429 011987616 14 0643 0571 0714 0714 0714 05 0643 0571 0571 0714 0911987617 14 0714 0857 0857 0857 0786 0786 0571 0643 0643 0643 111987618 14 0357 0571 0429 05 0357 0571 0429 0429 05 05 0211987619 13 0385 0462 0692 0615 0615 0692 0692 0769 0769 0769 0811987620 12 05 0667 0667 0667 075 0583 0583 0667 0583 0583 09

In Table 1 1198760 represents the cluster composed of 1516TIU Patterns Its 11987510119880 is only 05 which means that it may bea spurious pattern to predict bullish market However afterfurther classifying the TIU patterns we can see that (1) 11987631198766 11987616 and so forth have a strong capability for predictingbullish market because their 11987510119880 both are above 08 (2) 119876111987641198767 and so forth have amoderate capability for predictingbullishmarket as their11987510119880 are only in 05sim07 and (3)119876211987651198768 and so forth have a weak capability for predicting bullishmarket as their entire11987510119880 are below 05 Particularly for 1198762its 11987510119880 is only 01

By comparing the predictive power of 1198763 and 1198762 asshown in Figure 5 we can see that the predictive result of 1198763is bullish market while that of 1198762 is bearish market whichmeans that their predictive power is opposite The result ofexperiment one shows that (1) the predictive power of TIUvaries a great deal for different shapes and (2) to be a betterpattern for predicting bullish market the TIU pattern badlyneeds to be further classified which are consistent with theexpected analysis

43 Experiment Two The aim of the second experiment isto analyze the TID patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TID 1498 TID patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 572 clusters are obtained We choose the top 20

2 3 4 5 6 7 8 9 101

Q3

Q2

04

045

05

055

06

065

07

Figure 5 The comparison of predictive power between 1198763 and 1198762

clusters with themost elements to conduct statistical analysisas shown in Table 2

Similarly the TID pattern may be also a spurious patternto predict bearish market because 11987510119863 of 1198760 is 0 where1198760 represents the cluster composed of 1498 TID patternsMoreover after further classifying the TID patterns we cansee that (1) except for 11987611 and 11987616 almost all of the clustershave a weak capability for predicting bearish market as theirentire11987510119863 are below 05 and (2) even11987611 and11987616 still not havea higher value of 11987510119863 which are only 05 Therefore we can

10 Mathematical Problems in Engineering

Table 2 The experiment result of TID

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198631198760 1498 0421 0451 0461 0449 0437 0449 0459 0444 0427 0435 01198761 78 0449 05 0436 0436 0372 0359 0449 0397 0359 0346 011198762 72 0431 0486 0542 0458 0431 0431 0417 0417 0361 0306 01198763 44 0364 0364 0364 0386 0273 0386 0364 0386 0386 0341 011198764 39 0513 0359 0359 0333 0231 0282 0308 0282 0282 0282 01198765 23 0174 0261 0304 0391 0348 0391 0391 0435 0435 0391 011198766 23 0348 0478 0522 0478 0435 0435 0391 0478 0478 0435 041198767 21 0429 0429 0476 0429 0619 0524 0524 0476 0476 0571 01198768 18 0444 0333 0444 05 05 0278 0222 0333 0333 0333 021198769 18 0333 0667 0611 0389 0444 0444 0389 0389 0389 0389 0111987610 18 0333 0222 0278 0167 0278 0389 0444 0389 05 0556 011987611 18 0444 0444 0278 0333 0389 0222 0278 0333 0444 0389 0511987612 16 0438 0375 05 0563 0688 0688 0625 05 0438 0563 0211987613 15 0267 0267 0467 04 0267 06 0533 0467 04 04 011987614 13 0385 0308 0308 0385 0308 0462 0231 0231 0308 0308 0411987615 12 025 0333 0583 0667 0667 0583 05 05 0417 05 0111987616 12 05 0333 025 025 05 0667 0417 05 05 05 0511987617 12 025 0833 075 0667 0583 05 05 0417 0417 0583 0411987618 11 0364 0273 0273 0364 0545 0455 0545 0545 0455 0545 011987619 11 0364 0455 0273 0364 0364 0364 0455 0455 0455 0364 011987620 11 0273 0364 0273 0364 0364 0364 0364 0364 0182 0273 0

consider that the TID pattern is definitely a spurious patternwhich is also consistent with the expected analysis

44 Experiment Conclusion Through the above experimentwe can draw the following conclusion (1) The predictivepower of a pattern varies a great deal for different shapes TakeTIU for example some shapesrsquo TIU patterns have a strongcapability for predicting bullish market while some othershave the opposite predictive power Therefore to analyze thepredictive power of a pattern we should make a concreteanalysis of concrete shapes (2) There are definitely somespurious patterns in the existing K-line patterns Thereforein order to improve the stock prediction performance basedon K-line patterns we need to further classify the existingpatterns based on the shape feature identify all the spuriouspatterns and choose the patterns having stronger predictivepower to predict the stock price

5 Conclusion

Stock prediction is a popular research field in the timeseries prediction As a primary technology analysis methodof stock prediction there is different option on the stockprice prediction based on K-line patterns in the academicworld though it is widely used in reality To help resolve thedebate this paper uses the data mining method like patternrecognition similarity match cluster and statistical analysisand so forth to study the predictive power of K-line patternsExperimental results show that one reason for the debateis that the definition of K-line patterns is more open and

lack of mathematical rigor The other is that there are somespurious patterns in the existing K-line patterns In additionthe method presented in the paper can be used not onlyto test the predictive power of patterns but also for K-linepatterns mining and stock prediction Therefore the futureworks as follows (1) It will be a necessary and significanttask to identify the entire spurious pattern using the proposedmethod (2) On the basis of the proposed method we canresearch an automatic pattern mining method to discovermore useful patterns for stock prediction

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

The Key Basic Research Foundation of Shanghai Scienceand Technology Committee China (Grant no 14JC1402203)and the Science and Technology Support Program of China(Grant no 2015BAF10B01) financially supported this work

References

[1] J Lin S Williamson K D Borne et al ldquoAdvances in machinelearning amp data mining for astronomyrdquo Pattern Recognition inTime Series pp 617ndash645 2010

[2] L Lihui T Xiang Y Haidong et al ldquoFinancial time series fore-casting based on SVRrdquo Computer Engineering and Applicationsvol 41 no 30 pp 221ndash224 2005

Mathematical Problems in Engineering 11

[3] R D Edwards J Magee and W H C Bassetti TechnicalAnalysis of Stock Trends CRC Press Boca Raton Fla USA 10thedition 2012

[4] S Nison Japanese Candlestick Charting Techniques A Contem-porary Guide to the Ancient Investment Technique of the FarEast Institute of Finance New York NY USA 1991

[5] B R Marshall M R Young and L C Rose ldquoCandlesticktechnical trading strategies can they create value for investorsrdquoJournal of Banking and Finance vol 30 no 8 pp 2303ndash23232006

[6] G Caginalp and H Laurent ldquoThe predictive power of pricepatternsrdquoApplied Mathematical Finance vol 5 no 3-4 pp 181ndash205 1998

[7] K H Lee and G S Jo ldquoExpert system for predicting stockmarket timing using a candlestick chartrdquo Expert Systems withApplications vol 16 no 4 pp 357ndash364 1999

[8] M M Goswami C K Bhensdadia and A P Ganatra ldquoCan-dlestick analysis based short term prediction of stock pricefluctuation usingSOM-CBRrdquo in Proceedings of the 2009 IEEEInternational Advance Computing Conference (IACC rsquo09) pp1448ndash1452 March 2009

[9] T H Lu and J Chen Candlestick charting in European stockmarkets no 2 pp 20ndash25 2013

[10] T H Lu and Y M Shiu ldquoTests for two day candlestick patternsin the emerging equity market of Taiwanrdquo Emerging MarketsFinance amp Trade vol 48 no 1 pp 41ndash57 2014

[11] H LiWW Y Ng JW T Lee B Sun andD S Yeung ldquoQuanti-tative study on candlestick pattern for shenzhen stock marketrdquoin Proceedings of the 2008 IEEE International Conference onSystems Man and Cybernetics (SMC rsquo08) pp 54ndash59 October2008

[12] W Xiao WW Y Ng M Firth et al ldquoL-GEM basedMCS aidedcandlestick pattern investment strategy in the shenzhen stockmarketrdquo in Proceedings of the 2009 International Conference onMachine Learning and Cybernetics pp 243ndash248 July 2009

[13] T Kamo and C Dagli ldquoHybrid approach to the Japanesecandlestick method for financial forecastingrdquo Expert Systemswith Applications vol 36 no 3 pp 5023ndash5030 2009

[14] M Jasemi A M Kimiagari and A Memariani ldquoA modernneural network model to do stock market timing on the basisof the ancient investment technique of Japanese CandlestickrdquoExpert Systems with Applications vol 38 no 4 pp 3884ndash38902011

[15] S Barak J H Dahooie and T Tichy ldquoWrapper ANFIS-ICAmethod to do stock market timing and feature selection on thebasis of JapaneseCandlestickrdquoExpert SystemswithApplicationsvol 42 no 23 pp 9221ndash9235 2015

[16] JH Fock C Klein andB Zwergel ldquoPerformance of candlestickanalysis on intraday futures datardquoThe Journal ofDerivatives vol13 no 1 pp 28ndash40 2005

[17] B R Marshall M R Young and R Cahan ldquoAre candlesticktechnical trading strategies profitable in the Japanese equitymarketrdquo Review of Quantitative Finance and Accounting vol31 no 2 pp 191ndash207 2008

[18] M J Horton ldquoStars crows and doji the use of candlesticksin stock selectionrdquo Quarterly Review of Economics and Financevol 49 no 2 pp 283ndash294 2009

[19] L J Chmielewski M Janowicz and A Orłowski ldquoPredictionof trend reversals in stock market by classification of Japanesecandlesticksrdquo in Proceedings of the 9th International ConferenceonComputer Recognition Systems (CORES rsquo15) vol 403 pp 641ndash647

[20] C-F Tsai and Z-Y Quan ldquoStock prediction by searching forsimilarities in candlestick chartsrdquo ACM Transactions on Man-agement Information Systems vol 5 no 2 Article ID 25916722014

[21] L Chmielewski M Janowicz J Kaleta and A Orłowski ldquoPat-tern recognition in the Japanese candlesticksrdquoAdvances in Intel-ligent Systems and Computing vol 342 pp 227ndash234 2015

[22] G L Morris and R Litchfield Candlestick Charting ExplainedTimeless Techniques for Trading Stocks and Futures McGraw-Hill New Yourk NY USA 2nd edition 2006

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 5: K-Line Patterns’ Predictive Power Analysis Using the Methods of …downloads.hindawi.com/journals/mpe/2017/3096917.pdf · 2019. 7. 30. · K-Line Patterns’ Predictive Power Analysis

Mathematical Problems in Engineering 5

For example assuming that there are twoK-line series KS119894

and KS119895 needed to match their similarity where KS119894 = KS119895

and Sim119894119895 indicate their similarity Let 119862119894119905 = 10 119862119894119905+1 = 105119862119895119905 = 20 119862119895119905+1 = 21 and RC119905+1119905119894 indicate the close pricechange rate of KS119894 at day 119905 + 1 which is calculated by (119862119894119905+1 minus119862119894119905)119862119894119905 SRC119905+1119905119894119895 denotes the similarity between RC119905+1119905119894 andRC119905+1119905119894 then RC119905+1119905119894 = RC119905+1119905119895 = 5 and SRC119905+1119905119894119895 = 1 Wecannot calculate the correct result of SRC119905+1119905119894119895 = 1 by thesimilarity match model in literature [19ndash21] Similarly thesame problems would occur for calculating the similarity ofopen price high price or low price

Therefore this paper proposes a new similarity matchmodel based on K-line price changes to measure the trendsimilarity between two K-line series In this model thesimilarity of K-line series is composed of two parts one isthe shape similarity of K-line which is the similarity of thecorrespondingK-linersquos shape features in the twoK-line seriesthe other is the position similarity of K-line which is thesimilarity of the corresponding K-linersquos position features inthe two K-line series Therefore this paper will define K-lineseriesrsquo shape similarity model and position similarity modelrespectively Then based on these two kinds of similaritymodels the similarity model of the entire K-line series couldbe built

221 The Shape Similarity of K-Line Series According to theshape feature of K-line this paper proposes using the shape

distance to measure the shape similarity between two K-lines Firstly based on the shape structure of K-line threecomponents of K-line shape are extracted the upper shadowshape the lower shadow shape and the body shape Secondlythe similarity match methods of three shapes are definedrespectively Finally the shape similarity of K-line can becalculated by summing the three shapesrsquo similarity Assumingthat 119863119894119905 and 119863119895119905 denote the t-th dayrsquos K-line of KS119894 and KS119895respectively the shape similarity model of K-line series isdefined as follows(1) Let US119894[119905] denote the upper shadow length of 119863119894119905 asdefined in the following formula

US119894 [119905] =

119867119894119905 minus 119874119894119905119862119894119905minus1 lowast 01 119874119894119905 ge 119862119894119905

119867119894119905 minus 119862119894119905119862119894119905minus1 lowast 01 119874119894119905 lt 119862119894119905

(4)

where 119862119894119905minus1 lowast 01 is used to normalize the upper shadowlength According to the related regulation of Chinese A-share market the range of daily fluctuations of stock pricescannot exceed 10 of the previous dayrsquos close price So119862119894119905minus1 lowast01 can be used to normalize the length of the K-linersquos uppershadow lower shadow and body

Let Sim119894119895US(119905) denote the upper shadow similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895US (119905) =

min (US119894 [119905] US119895 [119905])max (US119894 [119905] US119895 [119905]) US119894 [119905] lowast US119895 [119905] gt 00 US119894 [119905] lowast US119895 [119905] = 0 US119894 [119905] = US119895 [119905] 1 US119894 [119905] = US119895 [119905] = 0

(5)

(2) Let LS119894[119905] denote the lower shadow length of 119863119894119905 asdefined in the following formula

LS119894 [119905] =

119862119894119905 minus 119871119894119905119862119894119905minus1 lowast 01 119874119894119905 ge 119862119894119905

119874119894119905 minus 119871119894119905119862119894119905minus1 lowast 01 119874119894119905 lt 119862119894119905

(6)

Let Sim119894119895LS(119905) denote the lower shadow similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895LS (119905) =

min (LS119894 [119905] LS119895 [119905])max (LS119894 [119905] LS119895 [119905]) LS119894 [119905] lowast LS119895 [119905] gt 00 LS119894 [119905] lowast LS119895 [119905] = 0 LS119894 [119905] = LS119895 [119905] 1 LS119894 [119905] = LS119895 [119905] = 0

(7)

6 Mathematical Problems in Engineering

(3) Let 119861119894[119905] denote the body length of 119863119894119905 as defined inthe following formula

119861119894 [119905] = 119862119894119905 minus 119874119894119905119862119894119905minus1 lowast 01 (8)

Let Sim119894119895Body(119905) denote the body similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895Body (119905) =

min (10038161003816100381610038161003816119861119894 [119905]10038161003816100381610038161003816 10038161003816100381610038161003816119861119895 [119905]10038161003816100381610038161003816)max (1003816100381610038161003816119861119894 [119905]1003816100381610038161003816 1003816100381610038161003816119861119895 [119905]1003816100381610038161003816) 119861

119894 [119905] lowast 119861119895 [119905] gt 00 119861119894 [119905] lowast 119861119895 [119905] lt 00 119861119894 [119905] lowast 119861119895 [119905] = 0 10038161003816100381610038161003816119861119894 [119905]10038161003816100381610038161003816 = 10038161003816100381610038161003816119861119895 [119905]10038161003816100381610038161003816 1 119861119894 [119905] = 119861119895 [119905] = 0

(9)

(4) Let Sim119894119895119878 (119905) denote the shape similarity between 119863119894119905and119863119895119905 as defined by

119908Body + 119908LS + 119908US = 1119908Body ge 0119908US ge 0119908LS ge 0

Sim119894119895119878 (119905) = 119908Body lowast Sim119894119895Body (119905) + 119908US

lowast Sim119894119895US (119905) + 119908LS

lowast Sim119894119895LS (119905)

(10)

where119908Body119908US and119908LS represent the weight of Sim119894119895

Body(119905)Sim119894119895US(119905) and Sim119894119895LS(119905) respectively(5) Let SSim119894119895 denote the shape similarity between KS119894

and KS119895 as defined by

SSim119894119895 = 119899sum119905=1

Sim119894119895119878 (119905) lowast 119908119905119878

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119878 = 1) (11)

where119908119905119878 represent the weight of Sim119894119895119878 (119905) Thanks to the ideathat each K-line can be given different weight the K-lineseries having special shape features could be identified well

222 The Position Similarity of K-Line Series For computingthe similarity between twoK-line series we not only considerthe shape similarity of K-line series but also the positionsimilarity If we only consider the shape similarity then itwill cause the problem that two K-line series having sameshape features but different position features will have thesame similarity

For example supposing that the K-line series chart of KS119894

and KS119895 is shown in Figure 2 we can see that according tothe shape feature definition of K-line all of the corresponding

K-lines of KS119894 and KS119895 have the same shape features Thesemean that KS119894 and KS119895 have identical shape features thatis SSim119894119895 = 1 However as is vividly shown in Figure 4the relative positions of 1198631198942 and 1198631198952 are different though 1198631198941and 1198631198951 have the same relative position in the K-line seriesTherefor the stock pricersquos overall trend of KS119894 and KS119895 are notidentical that is Sim119894119895 lt 1 If we only consider the shapesimilarity we will draw the wrong conclusion that SSim119894119895 =Sim119894119895 = 1

To solve this problem the concept of K-line coordinateis introduced hoping to implement the position match of K-line by definingK-linersquos coordinate in the K-line series In thispaper the sequence of K-line in the K-line series is called119909 coordinate of K-line the increase range of close price iscalled 119910 coordinate of K-line in addition the first K-linersquos119910 coordinate is set to 1 in the K-line series Therefore theposition similarity model of K-line series based on K-linecoordinate is defined as follows(1) Let (119909119894119905 119910119894119905) denote the coordinate of 119863119894119905 which aredefined in the following formula

119909119894119905 = 119905

119910119894119905 =

1 119905 = 1(119862119894119905 minus 119862119894119905minus1)119862119894119905minus1 lowast 01 119905 gt 1

(12)

Let Sim119894119895119875 (119905) denote the position similarity between 119863119894119905and119863119895119905 as defined by

Sim119894119895119875 (119905) =

min (119910119894119905 119910119895119905 )max (119910119894119905 119910119895119905 ) 119910

119894119905 lowast 119910119895119905 gt 0

0 119910119894119905 lowast 119910119895119905 = 0 119910119894119905 = 119910119895119905 1 119910119894119905 = 119910119895119905 = 0

(13)

Mathematical Problems in Engineering 7

110

100

95

90

102

105

110

100

(a) KS119894

110

100

95

9092

95

100

90(b) KS119895

Figure 4 K-line series chart

(2) Let PSim119894119895 denote the position similarity between KS119894

and KS119895 as defined by

PSim119894119895 = 119899sum119905=1

Sim119894119895119875 (119905) lowast 119908119905119875

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119875 = 1) (14)

where 119908119905119875 represents the weight of Sim119894119895119875 (119905) Thanks to theidea that each K-line can be given different weight the K-lineseries having special coordinates could be identified well

223 The Similarity of K-Line Series Finally based on theshape similarity and position similarity the similarity of K-line series could be obtained Therefore the similarity matchmodel between KS119894 and KS119895 is defined by

Sim119894119895 = SSim119894119895 lowast 119908119878 + PSim119894119895 lowast 119908119875 (15)

where 119908119878 and 119908119875 represent the shape similarity weight andposition similarity weight of K-line series respectively

23 Cluster of K-Line Series Themore accurate classificationresult of K-line patterns can be gotten by clustering themusing the nearest neighbor-clustering algorithm based onthe similarity match model of K-line series The K-lineseriesrsquo nearest neighbor-clustering algorithm (KNNCA) isdescribed as shown in Algorithm 1

In addition |119876119898| represents the number of elements in119876119898 As each K-line series will be matched once with all of theK-line series stored in the cluster the time complexity andspace complexity of KNNCA are both 119874(1198992)3 Mining of Patternsrsquo Predictive Power

We can mine and analyze the patternsrsquo predictive poweraccording to the following steps

(1) Pattern Recognition Based on the definition of K-linepatterns we identify all the K-line series belonging to

a pattern (such as TIU or TID) and then they form aset (KSet)(2) Pattern Clustering We use the KNSSC algorithm tocluster KSet then the set of clusters (CSet) can be gotten inwhich different clusters represent the same patternrsquos differentshapes

(3) Knowledge Mining We define some statistical indicatorsabout stock prices which we use to mine stock predictionknowledge from each cluster

The patternrsquos predictive power is gotten primarily byanalyzing the trend of the patternrsquos consequent K-line seriesPaper [22] found that K-line technology is suited for short-term investment prediction and that the most efficient timeperiod for prediction is 10 daysTherefore we mainly analyzethe close price trend of the patternrsquos consequent K-line seriesin 10 days Let KS = [119863119905 119863119905+1 119863119905+2] denote a three-day K-line pattern its consequent K-line series is denoted by CKSThe statistical indicators of CKS are defined as follows

(a) Let 119862119896 denote the k-th close price of CKS let 119875119880(119862119896)denote that the probability of the trend of 119862119896 is uptrend andlet 119875119863(119862119896) denote that the probability of the trend of 119862119896 isdowntrend 119875119880(119862119896) and 119875119863(119862119896) are calculated by

119875119880 (119862119896) =10038161003816100381610038161003816119876119898119896119880 10038161003816100381610038161003816|119876119898|

119875119863 (119862119896) =10038161003816100381610038161003816119876119898119896119863 10038161003816100381610038161003816|119876119898|

(16)

where |119876119898119896119880 | represents the number of patterns meeting thecondition of119862119896 ge 119862119905+2 in119876119898 |119876119898119896119863 | represents the number ofpatterns meeting the condition of 119862119896 le 119862119905+2 in 119876119898 and 119862119905+2represents the close price of119863119905+2 119875119880(119862119896) gt 05 indicates thatthe future trend of 119862119905 is rising 119875119863(119862119896) gt 05 indicates thatthe future trend of 119862119896 is descending

(b) Let 119875119899119880 isin [0 1] denote the probability that the closeprice will rise in the next 119899 days if the pattern appears119875119899119863 isin [0 1] denotes the probability that the close price will

8 Mathematical Problems in Engineering

InputKSet = KS1KS2 KS119899 the data set of K-line series120579 Similarity threshold

OutputCSet the set of clusters

KNNCA AlgorithmAssign initial value for parameters 119908119878 119908119875 119908Body 119908US 119908LS 119908119905119878 119908119905119875119898 = 1119876119898 = KS1 119876119898 represents them-th clusterCSet = 119876119898FOR 119894 = 2 TO 119899 DO

SimMax = 0FOR EACH 119876item IN CSet

FOR EACH KS119895 IN 119876item

Get Sim119894119895 based on formula (15)If (Sim119894119895 gt SimMax)SimMax = Sim119894119895119891 = item 119891 represents the ID of a cluster whose element is most similar to KS119894

EndEnd

IF (SimMax gt 120579) THEN119876119891 = 119876119891 cup KS119894ELSE 119898 = 119898 + 1119876119898 = KS119894

CSet = 1198761 1198762 119876119898Algorithm 1

fall in the next 119899 days if the pattern appears 119875119899119880 and 119875119899119863 arecalculated by

119875119899119880 =119899sum119896=1

Round [119875119880 (119862119896) 05]119899

Round [119875119880 (119862119896) 05] = 1 119875119880 (119862119896) gt 050 119875119880 (119862119896) le 05

(17)

119875119899119863 =119899sum119896=1

Round [119875119863 (119862119896) 05]119899

Round [119875119863 (119862119896) 05] = 1 119875119863 (119862119896) gt 050 119875119863 (119862119896) le 05

(18)

where a higher value of 119875119899119880 or 119875119899119863 indicates a strongercapability for predicting bullish or bearish market

(4) Analysis Based on the statistical result we analysis thepatternrsquos predictive power

4 Experiment and Result Analysis

41 Experiment Data and Method As Yahoo provides thefinance stock API used to download the transaction data ofChinese stock market the stock transaction data of ChineseA-share market in any time can be acquired based on theAPI To get a representative testing data we select the K-line series data of Shanghai 180 index component stocks overthe latest 10 years (from 2006-01-04 to 2016-08-24) as thetest data Limited by space only the TIU and TID patternrsquospredictive power will be analyzed in the experiment And theparameters of KNSSC algorithm are set as follows 120579 = 075119908119878 = 02 119908119879 = 08 119908Body = 06 119908US = 02 119908LS = 02 and119908119905119878 = 119908119905119875 = 1|KS119894| (119905 = 1 2 |KS119894|)42 Experiment One The aim of the first experiment is toanalyze the TIU patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TIU 1516 TIU patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 554 clusters are obtained We choose the top 20clusters with themost elements to conduct statistical analysisas shown in Table 1

Mathematical Problems in Engineering 9

Table 1 The experiment result of TIU

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198801198760 1516 0493 0486 0518 0495 0475 0488 0481 0479 0484 0479 011198761 79 0405 0532 0557 0532 0544 0506 0468 0506 043 0443 061198762 42 0405 0429 05 0452 0405 0452 0429 0476 0524 0476 011198763 32 05 0594 0594 0625 0594 0531 0625 0594 0563 0531 091198764 25 06 06 052 048 048 04 048 048 06 064 051198765 23 0435 0478 0609 0565 0522 0391 0478 0348 0391 0391 031198766 22 0455 0636 0545 05 0545 0545 0591 0591 0636 0636 081198767 21 0333 0571 0524 0476 0619 0857 0762 0714 0429 0048 061198768 21 0524 0476 0524 0429 0476 0476 0429 0476 0429 0429 021198769 21 0524 0476 0524 0429 0381 0286 0381 0429 0381 0429 0211987610 19 0105 0368 0421 0421 0368 0421 0474 0474 0579 0421 0111987611 19 0579 0474 0421 0368 0263 0368 0316 0263 0211 0211 0111987612 17 0588 0294 0471 0353 0412 0471 0471 0412 0471 0412 0111987613 15 06 0533 0467 0467 0467 0467 0533 0533 06 06 0611987614 15 0533 06 0667 06 0533 06 06 0667 06 06 111987615 14 05 0071 05 0429 0429 0357 0286 0286 0286 0429 011987616 14 0643 0571 0714 0714 0714 05 0643 0571 0571 0714 0911987617 14 0714 0857 0857 0857 0786 0786 0571 0643 0643 0643 111987618 14 0357 0571 0429 05 0357 0571 0429 0429 05 05 0211987619 13 0385 0462 0692 0615 0615 0692 0692 0769 0769 0769 0811987620 12 05 0667 0667 0667 075 0583 0583 0667 0583 0583 09

In Table 1 1198760 represents the cluster composed of 1516TIU Patterns Its 11987510119880 is only 05 which means that it may bea spurious pattern to predict bullish market However afterfurther classifying the TIU patterns we can see that (1) 11987631198766 11987616 and so forth have a strong capability for predictingbullish market because their 11987510119880 both are above 08 (2) 119876111987641198767 and so forth have amoderate capability for predictingbullishmarket as their11987510119880 are only in 05sim07 and (3)119876211987651198768 and so forth have a weak capability for predicting bullishmarket as their entire11987510119880 are below 05 Particularly for 1198762its 11987510119880 is only 01

By comparing the predictive power of 1198763 and 1198762 asshown in Figure 5 we can see that the predictive result of 1198763is bullish market while that of 1198762 is bearish market whichmeans that their predictive power is opposite The result ofexperiment one shows that (1) the predictive power of TIUvaries a great deal for different shapes and (2) to be a betterpattern for predicting bullish market the TIU pattern badlyneeds to be further classified which are consistent with theexpected analysis

43 Experiment Two The aim of the second experiment isto analyze the TID patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TID 1498 TID patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 572 clusters are obtained We choose the top 20

2 3 4 5 6 7 8 9 101

Q3

Q2

04

045

05

055

06

065

07

Figure 5 The comparison of predictive power between 1198763 and 1198762

clusters with themost elements to conduct statistical analysisas shown in Table 2

Similarly the TID pattern may be also a spurious patternto predict bearish market because 11987510119863 of 1198760 is 0 where1198760 represents the cluster composed of 1498 TID patternsMoreover after further classifying the TID patterns we cansee that (1) except for 11987611 and 11987616 almost all of the clustershave a weak capability for predicting bearish market as theirentire11987510119863 are below 05 and (2) even11987611 and11987616 still not havea higher value of 11987510119863 which are only 05 Therefore we can

10 Mathematical Problems in Engineering

Table 2 The experiment result of TID

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198631198760 1498 0421 0451 0461 0449 0437 0449 0459 0444 0427 0435 01198761 78 0449 05 0436 0436 0372 0359 0449 0397 0359 0346 011198762 72 0431 0486 0542 0458 0431 0431 0417 0417 0361 0306 01198763 44 0364 0364 0364 0386 0273 0386 0364 0386 0386 0341 011198764 39 0513 0359 0359 0333 0231 0282 0308 0282 0282 0282 01198765 23 0174 0261 0304 0391 0348 0391 0391 0435 0435 0391 011198766 23 0348 0478 0522 0478 0435 0435 0391 0478 0478 0435 041198767 21 0429 0429 0476 0429 0619 0524 0524 0476 0476 0571 01198768 18 0444 0333 0444 05 05 0278 0222 0333 0333 0333 021198769 18 0333 0667 0611 0389 0444 0444 0389 0389 0389 0389 0111987610 18 0333 0222 0278 0167 0278 0389 0444 0389 05 0556 011987611 18 0444 0444 0278 0333 0389 0222 0278 0333 0444 0389 0511987612 16 0438 0375 05 0563 0688 0688 0625 05 0438 0563 0211987613 15 0267 0267 0467 04 0267 06 0533 0467 04 04 011987614 13 0385 0308 0308 0385 0308 0462 0231 0231 0308 0308 0411987615 12 025 0333 0583 0667 0667 0583 05 05 0417 05 0111987616 12 05 0333 025 025 05 0667 0417 05 05 05 0511987617 12 025 0833 075 0667 0583 05 05 0417 0417 0583 0411987618 11 0364 0273 0273 0364 0545 0455 0545 0545 0455 0545 011987619 11 0364 0455 0273 0364 0364 0364 0455 0455 0455 0364 011987620 11 0273 0364 0273 0364 0364 0364 0364 0364 0182 0273 0

consider that the TID pattern is definitely a spurious patternwhich is also consistent with the expected analysis

44 Experiment Conclusion Through the above experimentwe can draw the following conclusion (1) The predictivepower of a pattern varies a great deal for different shapes TakeTIU for example some shapesrsquo TIU patterns have a strongcapability for predicting bullish market while some othershave the opposite predictive power Therefore to analyze thepredictive power of a pattern we should make a concreteanalysis of concrete shapes (2) There are definitely somespurious patterns in the existing K-line patterns Thereforein order to improve the stock prediction performance basedon K-line patterns we need to further classify the existingpatterns based on the shape feature identify all the spuriouspatterns and choose the patterns having stronger predictivepower to predict the stock price

5 Conclusion

Stock prediction is a popular research field in the timeseries prediction As a primary technology analysis methodof stock prediction there is different option on the stockprice prediction based on K-line patterns in the academicworld though it is widely used in reality To help resolve thedebate this paper uses the data mining method like patternrecognition similarity match cluster and statistical analysisand so forth to study the predictive power of K-line patternsExperimental results show that one reason for the debateis that the definition of K-line patterns is more open and

lack of mathematical rigor The other is that there are somespurious patterns in the existing K-line patterns In additionthe method presented in the paper can be used not onlyto test the predictive power of patterns but also for K-linepatterns mining and stock prediction Therefore the futureworks as follows (1) It will be a necessary and significanttask to identify the entire spurious pattern using the proposedmethod (2) On the basis of the proposed method we canresearch an automatic pattern mining method to discovermore useful patterns for stock prediction

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

The Key Basic Research Foundation of Shanghai Scienceand Technology Committee China (Grant no 14JC1402203)and the Science and Technology Support Program of China(Grant no 2015BAF10B01) financially supported this work

References

[1] J Lin S Williamson K D Borne et al ldquoAdvances in machinelearning amp data mining for astronomyrdquo Pattern Recognition inTime Series pp 617ndash645 2010

[2] L Lihui T Xiang Y Haidong et al ldquoFinancial time series fore-casting based on SVRrdquo Computer Engineering and Applicationsvol 41 no 30 pp 221ndash224 2005

Mathematical Problems in Engineering 11

[3] R D Edwards J Magee and W H C Bassetti TechnicalAnalysis of Stock Trends CRC Press Boca Raton Fla USA 10thedition 2012

[4] S Nison Japanese Candlestick Charting Techniques A Contem-porary Guide to the Ancient Investment Technique of the FarEast Institute of Finance New York NY USA 1991

[5] B R Marshall M R Young and L C Rose ldquoCandlesticktechnical trading strategies can they create value for investorsrdquoJournal of Banking and Finance vol 30 no 8 pp 2303ndash23232006

[6] G Caginalp and H Laurent ldquoThe predictive power of pricepatternsrdquoApplied Mathematical Finance vol 5 no 3-4 pp 181ndash205 1998

[7] K H Lee and G S Jo ldquoExpert system for predicting stockmarket timing using a candlestick chartrdquo Expert Systems withApplications vol 16 no 4 pp 357ndash364 1999

[8] M M Goswami C K Bhensdadia and A P Ganatra ldquoCan-dlestick analysis based short term prediction of stock pricefluctuation usingSOM-CBRrdquo in Proceedings of the 2009 IEEEInternational Advance Computing Conference (IACC rsquo09) pp1448ndash1452 March 2009

[9] T H Lu and J Chen Candlestick charting in European stockmarkets no 2 pp 20ndash25 2013

[10] T H Lu and Y M Shiu ldquoTests for two day candlestick patternsin the emerging equity market of Taiwanrdquo Emerging MarketsFinance amp Trade vol 48 no 1 pp 41ndash57 2014

[11] H LiWW Y Ng JW T Lee B Sun andD S Yeung ldquoQuanti-tative study on candlestick pattern for shenzhen stock marketrdquoin Proceedings of the 2008 IEEE International Conference onSystems Man and Cybernetics (SMC rsquo08) pp 54ndash59 October2008

[12] W Xiao WW Y Ng M Firth et al ldquoL-GEM basedMCS aidedcandlestick pattern investment strategy in the shenzhen stockmarketrdquo in Proceedings of the 2009 International Conference onMachine Learning and Cybernetics pp 243ndash248 July 2009

[13] T Kamo and C Dagli ldquoHybrid approach to the Japanesecandlestick method for financial forecastingrdquo Expert Systemswith Applications vol 36 no 3 pp 5023ndash5030 2009

[14] M Jasemi A M Kimiagari and A Memariani ldquoA modernneural network model to do stock market timing on the basisof the ancient investment technique of Japanese CandlestickrdquoExpert Systems with Applications vol 38 no 4 pp 3884ndash38902011

[15] S Barak J H Dahooie and T Tichy ldquoWrapper ANFIS-ICAmethod to do stock market timing and feature selection on thebasis of JapaneseCandlestickrdquoExpert SystemswithApplicationsvol 42 no 23 pp 9221ndash9235 2015

[16] JH Fock C Klein andB Zwergel ldquoPerformance of candlestickanalysis on intraday futures datardquoThe Journal ofDerivatives vol13 no 1 pp 28ndash40 2005

[17] B R Marshall M R Young and R Cahan ldquoAre candlesticktechnical trading strategies profitable in the Japanese equitymarketrdquo Review of Quantitative Finance and Accounting vol31 no 2 pp 191ndash207 2008

[18] M J Horton ldquoStars crows and doji the use of candlesticksin stock selectionrdquo Quarterly Review of Economics and Financevol 49 no 2 pp 283ndash294 2009

[19] L J Chmielewski M Janowicz and A Orłowski ldquoPredictionof trend reversals in stock market by classification of Japanesecandlesticksrdquo in Proceedings of the 9th International ConferenceonComputer Recognition Systems (CORES rsquo15) vol 403 pp 641ndash647

[20] C-F Tsai and Z-Y Quan ldquoStock prediction by searching forsimilarities in candlestick chartsrdquo ACM Transactions on Man-agement Information Systems vol 5 no 2 Article ID 25916722014

[21] L Chmielewski M Janowicz J Kaleta and A Orłowski ldquoPat-tern recognition in the Japanese candlesticksrdquoAdvances in Intel-ligent Systems and Computing vol 342 pp 227ndash234 2015

[22] G L Morris and R Litchfield Candlestick Charting ExplainedTimeless Techniques for Trading Stocks and Futures McGraw-Hill New Yourk NY USA 2nd edition 2006

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 6: K-Line Patterns’ Predictive Power Analysis Using the Methods of …downloads.hindawi.com/journals/mpe/2017/3096917.pdf · 2019. 7. 30. · K-Line Patterns’ Predictive Power Analysis

6 Mathematical Problems in Engineering

(3) Let 119861119894[119905] denote the body length of 119863119894119905 as defined inthe following formula

119861119894 [119905] = 119862119894119905 minus 119874119894119905119862119894119905minus1 lowast 01 (8)

Let Sim119894119895Body(119905) denote the body similarity between119863119894119905 and119863119895119905 as defined by

Sim119894119895Body (119905) =

min (10038161003816100381610038161003816119861119894 [119905]10038161003816100381610038161003816 10038161003816100381610038161003816119861119895 [119905]10038161003816100381610038161003816)max (1003816100381610038161003816119861119894 [119905]1003816100381610038161003816 1003816100381610038161003816119861119895 [119905]1003816100381610038161003816) 119861

119894 [119905] lowast 119861119895 [119905] gt 00 119861119894 [119905] lowast 119861119895 [119905] lt 00 119861119894 [119905] lowast 119861119895 [119905] = 0 10038161003816100381610038161003816119861119894 [119905]10038161003816100381610038161003816 = 10038161003816100381610038161003816119861119895 [119905]10038161003816100381610038161003816 1 119861119894 [119905] = 119861119895 [119905] = 0

(9)

(4) Let Sim119894119895119878 (119905) denote the shape similarity between 119863119894119905and119863119895119905 as defined by

119908Body + 119908LS + 119908US = 1119908Body ge 0119908US ge 0119908LS ge 0

Sim119894119895119878 (119905) = 119908Body lowast Sim119894119895Body (119905) + 119908US

lowast Sim119894119895US (119905) + 119908LS

lowast Sim119894119895LS (119905)

(10)

where119908Body119908US and119908LS represent the weight of Sim119894119895

Body(119905)Sim119894119895US(119905) and Sim119894119895LS(119905) respectively(5) Let SSim119894119895 denote the shape similarity between KS119894

and KS119895 as defined by

SSim119894119895 = 119899sum119905=1

Sim119894119895119878 (119905) lowast 119908119905119878

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119878 = 1) (11)

where119908119905119878 represent the weight of Sim119894119895119878 (119905) Thanks to the ideathat each K-line can be given different weight the K-lineseries having special shape features could be identified well

222 The Position Similarity of K-Line Series For computingthe similarity between twoK-line series we not only considerthe shape similarity of K-line series but also the positionsimilarity If we only consider the shape similarity then itwill cause the problem that two K-line series having sameshape features but different position features will have thesame similarity

For example supposing that the K-line series chart of KS119894

and KS119895 is shown in Figure 2 we can see that according tothe shape feature definition of K-line all of the corresponding

K-lines of KS119894 and KS119895 have the same shape features Thesemean that KS119894 and KS119895 have identical shape features thatis SSim119894119895 = 1 However as is vividly shown in Figure 4the relative positions of 1198631198942 and 1198631198952 are different though 1198631198941and 1198631198951 have the same relative position in the K-line seriesTherefor the stock pricersquos overall trend of KS119894 and KS119895 are notidentical that is Sim119894119895 lt 1 If we only consider the shapesimilarity we will draw the wrong conclusion that SSim119894119895 =Sim119894119895 = 1

To solve this problem the concept of K-line coordinateis introduced hoping to implement the position match of K-line by definingK-linersquos coordinate in the K-line series In thispaper the sequence of K-line in the K-line series is called119909 coordinate of K-line the increase range of close price iscalled 119910 coordinate of K-line in addition the first K-linersquos119910 coordinate is set to 1 in the K-line series Therefore theposition similarity model of K-line series based on K-linecoordinate is defined as follows(1) Let (119909119894119905 119910119894119905) denote the coordinate of 119863119894119905 which aredefined in the following formula

119909119894119905 = 119905

119910119894119905 =

1 119905 = 1(119862119894119905 minus 119862119894119905minus1)119862119894119905minus1 lowast 01 119905 gt 1

(12)

Let Sim119894119895119875 (119905) denote the position similarity between 119863119894119905and119863119895119905 as defined by

Sim119894119895119875 (119905) =

min (119910119894119905 119910119895119905 )max (119910119894119905 119910119895119905 ) 119910

119894119905 lowast 119910119895119905 gt 0

0 119910119894119905 lowast 119910119895119905 = 0 119910119894119905 = 119910119895119905 1 119910119894119905 = 119910119895119905 = 0

(13)

Mathematical Problems in Engineering 7

110

100

95

90

102

105

110

100

(a) KS119894

110

100

95

9092

95

100

90(b) KS119895

Figure 4 K-line series chart

(2) Let PSim119894119895 denote the position similarity between KS119894

and KS119895 as defined by

PSim119894119895 = 119899sum119905=1

Sim119894119895119875 (119905) lowast 119908119905119875

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119875 = 1) (14)

where 119908119905119875 represents the weight of Sim119894119895119875 (119905) Thanks to theidea that each K-line can be given different weight the K-lineseries having special coordinates could be identified well

223 The Similarity of K-Line Series Finally based on theshape similarity and position similarity the similarity of K-line series could be obtained Therefore the similarity matchmodel between KS119894 and KS119895 is defined by

Sim119894119895 = SSim119894119895 lowast 119908119878 + PSim119894119895 lowast 119908119875 (15)

where 119908119878 and 119908119875 represent the shape similarity weight andposition similarity weight of K-line series respectively

23 Cluster of K-Line Series Themore accurate classificationresult of K-line patterns can be gotten by clustering themusing the nearest neighbor-clustering algorithm based onthe similarity match model of K-line series The K-lineseriesrsquo nearest neighbor-clustering algorithm (KNNCA) isdescribed as shown in Algorithm 1

In addition |119876119898| represents the number of elements in119876119898 As each K-line series will be matched once with all of theK-line series stored in the cluster the time complexity andspace complexity of KNNCA are both 119874(1198992)3 Mining of Patternsrsquo Predictive Power

We can mine and analyze the patternsrsquo predictive poweraccording to the following steps

(1) Pattern Recognition Based on the definition of K-linepatterns we identify all the K-line series belonging to

a pattern (such as TIU or TID) and then they form aset (KSet)(2) Pattern Clustering We use the KNSSC algorithm tocluster KSet then the set of clusters (CSet) can be gotten inwhich different clusters represent the same patternrsquos differentshapes

(3) Knowledge Mining We define some statistical indicatorsabout stock prices which we use to mine stock predictionknowledge from each cluster

The patternrsquos predictive power is gotten primarily byanalyzing the trend of the patternrsquos consequent K-line seriesPaper [22] found that K-line technology is suited for short-term investment prediction and that the most efficient timeperiod for prediction is 10 daysTherefore we mainly analyzethe close price trend of the patternrsquos consequent K-line seriesin 10 days Let KS = [119863119905 119863119905+1 119863119905+2] denote a three-day K-line pattern its consequent K-line series is denoted by CKSThe statistical indicators of CKS are defined as follows

(a) Let 119862119896 denote the k-th close price of CKS let 119875119880(119862119896)denote that the probability of the trend of 119862119896 is uptrend andlet 119875119863(119862119896) denote that the probability of the trend of 119862119896 isdowntrend 119875119880(119862119896) and 119875119863(119862119896) are calculated by

119875119880 (119862119896) =10038161003816100381610038161003816119876119898119896119880 10038161003816100381610038161003816|119876119898|

119875119863 (119862119896) =10038161003816100381610038161003816119876119898119896119863 10038161003816100381610038161003816|119876119898|

(16)

where |119876119898119896119880 | represents the number of patterns meeting thecondition of119862119896 ge 119862119905+2 in119876119898 |119876119898119896119863 | represents the number ofpatterns meeting the condition of 119862119896 le 119862119905+2 in 119876119898 and 119862119905+2represents the close price of119863119905+2 119875119880(119862119896) gt 05 indicates thatthe future trend of 119862119905 is rising 119875119863(119862119896) gt 05 indicates thatthe future trend of 119862119896 is descending

(b) Let 119875119899119880 isin [0 1] denote the probability that the closeprice will rise in the next 119899 days if the pattern appears119875119899119863 isin [0 1] denotes the probability that the close price will

8 Mathematical Problems in Engineering

InputKSet = KS1KS2 KS119899 the data set of K-line series120579 Similarity threshold

OutputCSet the set of clusters

KNNCA AlgorithmAssign initial value for parameters 119908119878 119908119875 119908Body 119908US 119908LS 119908119905119878 119908119905119875119898 = 1119876119898 = KS1 119876119898 represents them-th clusterCSet = 119876119898FOR 119894 = 2 TO 119899 DO

SimMax = 0FOR EACH 119876item IN CSet

FOR EACH KS119895 IN 119876item

Get Sim119894119895 based on formula (15)If (Sim119894119895 gt SimMax)SimMax = Sim119894119895119891 = item 119891 represents the ID of a cluster whose element is most similar to KS119894

EndEnd

IF (SimMax gt 120579) THEN119876119891 = 119876119891 cup KS119894ELSE 119898 = 119898 + 1119876119898 = KS119894

CSet = 1198761 1198762 119876119898Algorithm 1

fall in the next 119899 days if the pattern appears 119875119899119880 and 119875119899119863 arecalculated by

119875119899119880 =119899sum119896=1

Round [119875119880 (119862119896) 05]119899

Round [119875119880 (119862119896) 05] = 1 119875119880 (119862119896) gt 050 119875119880 (119862119896) le 05

(17)

119875119899119863 =119899sum119896=1

Round [119875119863 (119862119896) 05]119899

Round [119875119863 (119862119896) 05] = 1 119875119863 (119862119896) gt 050 119875119863 (119862119896) le 05

(18)

where a higher value of 119875119899119880 or 119875119899119863 indicates a strongercapability for predicting bullish or bearish market

(4) Analysis Based on the statistical result we analysis thepatternrsquos predictive power

4 Experiment and Result Analysis

41 Experiment Data and Method As Yahoo provides thefinance stock API used to download the transaction data ofChinese stock market the stock transaction data of ChineseA-share market in any time can be acquired based on theAPI To get a representative testing data we select the K-line series data of Shanghai 180 index component stocks overthe latest 10 years (from 2006-01-04 to 2016-08-24) as thetest data Limited by space only the TIU and TID patternrsquospredictive power will be analyzed in the experiment And theparameters of KNSSC algorithm are set as follows 120579 = 075119908119878 = 02 119908119879 = 08 119908Body = 06 119908US = 02 119908LS = 02 and119908119905119878 = 119908119905119875 = 1|KS119894| (119905 = 1 2 |KS119894|)42 Experiment One The aim of the first experiment is toanalyze the TIU patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TIU 1516 TIU patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 554 clusters are obtained We choose the top 20clusters with themost elements to conduct statistical analysisas shown in Table 1

Mathematical Problems in Engineering 9

Table 1 The experiment result of TIU

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198801198760 1516 0493 0486 0518 0495 0475 0488 0481 0479 0484 0479 011198761 79 0405 0532 0557 0532 0544 0506 0468 0506 043 0443 061198762 42 0405 0429 05 0452 0405 0452 0429 0476 0524 0476 011198763 32 05 0594 0594 0625 0594 0531 0625 0594 0563 0531 091198764 25 06 06 052 048 048 04 048 048 06 064 051198765 23 0435 0478 0609 0565 0522 0391 0478 0348 0391 0391 031198766 22 0455 0636 0545 05 0545 0545 0591 0591 0636 0636 081198767 21 0333 0571 0524 0476 0619 0857 0762 0714 0429 0048 061198768 21 0524 0476 0524 0429 0476 0476 0429 0476 0429 0429 021198769 21 0524 0476 0524 0429 0381 0286 0381 0429 0381 0429 0211987610 19 0105 0368 0421 0421 0368 0421 0474 0474 0579 0421 0111987611 19 0579 0474 0421 0368 0263 0368 0316 0263 0211 0211 0111987612 17 0588 0294 0471 0353 0412 0471 0471 0412 0471 0412 0111987613 15 06 0533 0467 0467 0467 0467 0533 0533 06 06 0611987614 15 0533 06 0667 06 0533 06 06 0667 06 06 111987615 14 05 0071 05 0429 0429 0357 0286 0286 0286 0429 011987616 14 0643 0571 0714 0714 0714 05 0643 0571 0571 0714 0911987617 14 0714 0857 0857 0857 0786 0786 0571 0643 0643 0643 111987618 14 0357 0571 0429 05 0357 0571 0429 0429 05 05 0211987619 13 0385 0462 0692 0615 0615 0692 0692 0769 0769 0769 0811987620 12 05 0667 0667 0667 075 0583 0583 0667 0583 0583 09

In Table 1 1198760 represents the cluster composed of 1516TIU Patterns Its 11987510119880 is only 05 which means that it may bea spurious pattern to predict bullish market However afterfurther classifying the TIU patterns we can see that (1) 11987631198766 11987616 and so forth have a strong capability for predictingbullish market because their 11987510119880 both are above 08 (2) 119876111987641198767 and so forth have amoderate capability for predictingbullishmarket as their11987510119880 are only in 05sim07 and (3)119876211987651198768 and so forth have a weak capability for predicting bullishmarket as their entire11987510119880 are below 05 Particularly for 1198762its 11987510119880 is only 01

By comparing the predictive power of 1198763 and 1198762 asshown in Figure 5 we can see that the predictive result of 1198763is bullish market while that of 1198762 is bearish market whichmeans that their predictive power is opposite The result ofexperiment one shows that (1) the predictive power of TIUvaries a great deal for different shapes and (2) to be a betterpattern for predicting bullish market the TIU pattern badlyneeds to be further classified which are consistent with theexpected analysis

43 Experiment Two The aim of the second experiment isto analyze the TID patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TID 1498 TID patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 572 clusters are obtained We choose the top 20

2 3 4 5 6 7 8 9 101

Q3

Q2

04

045

05

055

06

065

07

Figure 5 The comparison of predictive power between 1198763 and 1198762

clusters with themost elements to conduct statistical analysisas shown in Table 2

Similarly the TID pattern may be also a spurious patternto predict bearish market because 11987510119863 of 1198760 is 0 where1198760 represents the cluster composed of 1498 TID patternsMoreover after further classifying the TID patterns we cansee that (1) except for 11987611 and 11987616 almost all of the clustershave a weak capability for predicting bearish market as theirentire11987510119863 are below 05 and (2) even11987611 and11987616 still not havea higher value of 11987510119863 which are only 05 Therefore we can

10 Mathematical Problems in Engineering

Table 2 The experiment result of TID

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198631198760 1498 0421 0451 0461 0449 0437 0449 0459 0444 0427 0435 01198761 78 0449 05 0436 0436 0372 0359 0449 0397 0359 0346 011198762 72 0431 0486 0542 0458 0431 0431 0417 0417 0361 0306 01198763 44 0364 0364 0364 0386 0273 0386 0364 0386 0386 0341 011198764 39 0513 0359 0359 0333 0231 0282 0308 0282 0282 0282 01198765 23 0174 0261 0304 0391 0348 0391 0391 0435 0435 0391 011198766 23 0348 0478 0522 0478 0435 0435 0391 0478 0478 0435 041198767 21 0429 0429 0476 0429 0619 0524 0524 0476 0476 0571 01198768 18 0444 0333 0444 05 05 0278 0222 0333 0333 0333 021198769 18 0333 0667 0611 0389 0444 0444 0389 0389 0389 0389 0111987610 18 0333 0222 0278 0167 0278 0389 0444 0389 05 0556 011987611 18 0444 0444 0278 0333 0389 0222 0278 0333 0444 0389 0511987612 16 0438 0375 05 0563 0688 0688 0625 05 0438 0563 0211987613 15 0267 0267 0467 04 0267 06 0533 0467 04 04 011987614 13 0385 0308 0308 0385 0308 0462 0231 0231 0308 0308 0411987615 12 025 0333 0583 0667 0667 0583 05 05 0417 05 0111987616 12 05 0333 025 025 05 0667 0417 05 05 05 0511987617 12 025 0833 075 0667 0583 05 05 0417 0417 0583 0411987618 11 0364 0273 0273 0364 0545 0455 0545 0545 0455 0545 011987619 11 0364 0455 0273 0364 0364 0364 0455 0455 0455 0364 011987620 11 0273 0364 0273 0364 0364 0364 0364 0364 0182 0273 0

consider that the TID pattern is definitely a spurious patternwhich is also consistent with the expected analysis

44 Experiment Conclusion Through the above experimentwe can draw the following conclusion (1) The predictivepower of a pattern varies a great deal for different shapes TakeTIU for example some shapesrsquo TIU patterns have a strongcapability for predicting bullish market while some othershave the opposite predictive power Therefore to analyze thepredictive power of a pattern we should make a concreteanalysis of concrete shapes (2) There are definitely somespurious patterns in the existing K-line patterns Thereforein order to improve the stock prediction performance basedon K-line patterns we need to further classify the existingpatterns based on the shape feature identify all the spuriouspatterns and choose the patterns having stronger predictivepower to predict the stock price

5 Conclusion

Stock prediction is a popular research field in the timeseries prediction As a primary technology analysis methodof stock prediction there is different option on the stockprice prediction based on K-line patterns in the academicworld though it is widely used in reality To help resolve thedebate this paper uses the data mining method like patternrecognition similarity match cluster and statistical analysisand so forth to study the predictive power of K-line patternsExperimental results show that one reason for the debateis that the definition of K-line patterns is more open and

lack of mathematical rigor The other is that there are somespurious patterns in the existing K-line patterns In additionthe method presented in the paper can be used not onlyto test the predictive power of patterns but also for K-linepatterns mining and stock prediction Therefore the futureworks as follows (1) It will be a necessary and significanttask to identify the entire spurious pattern using the proposedmethod (2) On the basis of the proposed method we canresearch an automatic pattern mining method to discovermore useful patterns for stock prediction

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

The Key Basic Research Foundation of Shanghai Scienceand Technology Committee China (Grant no 14JC1402203)and the Science and Technology Support Program of China(Grant no 2015BAF10B01) financially supported this work

References

[1] J Lin S Williamson K D Borne et al ldquoAdvances in machinelearning amp data mining for astronomyrdquo Pattern Recognition inTime Series pp 617ndash645 2010

[2] L Lihui T Xiang Y Haidong et al ldquoFinancial time series fore-casting based on SVRrdquo Computer Engineering and Applicationsvol 41 no 30 pp 221ndash224 2005

Mathematical Problems in Engineering 11

[3] R D Edwards J Magee and W H C Bassetti TechnicalAnalysis of Stock Trends CRC Press Boca Raton Fla USA 10thedition 2012

[4] S Nison Japanese Candlestick Charting Techniques A Contem-porary Guide to the Ancient Investment Technique of the FarEast Institute of Finance New York NY USA 1991

[5] B R Marshall M R Young and L C Rose ldquoCandlesticktechnical trading strategies can they create value for investorsrdquoJournal of Banking and Finance vol 30 no 8 pp 2303ndash23232006

[6] G Caginalp and H Laurent ldquoThe predictive power of pricepatternsrdquoApplied Mathematical Finance vol 5 no 3-4 pp 181ndash205 1998

[7] K H Lee and G S Jo ldquoExpert system for predicting stockmarket timing using a candlestick chartrdquo Expert Systems withApplications vol 16 no 4 pp 357ndash364 1999

[8] M M Goswami C K Bhensdadia and A P Ganatra ldquoCan-dlestick analysis based short term prediction of stock pricefluctuation usingSOM-CBRrdquo in Proceedings of the 2009 IEEEInternational Advance Computing Conference (IACC rsquo09) pp1448ndash1452 March 2009

[9] T H Lu and J Chen Candlestick charting in European stockmarkets no 2 pp 20ndash25 2013

[10] T H Lu and Y M Shiu ldquoTests for two day candlestick patternsin the emerging equity market of Taiwanrdquo Emerging MarketsFinance amp Trade vol 48 no 1 pp 41ndash57 2014

[11] H LiWW Y Ng JW T Lee B Sun andD S Yeung ldquoQuanti-tative study on candlestick pattern for shenzhen stock marketrdquoin Proceedings of the 2008 IEEE International Conference onSystems Man and Cybernetics (SMC rsquo08) pp 54ndash59 October2008

[12] W Xiao WW Y Ng M Firth et al ldquoL-GEM basedMCS aidedcandlestick pattern investment strategy in the shenzhen stockmarketrdquo in Proceedings of the 2009 International Conference onMachine Learning and Cybernetics pp 243ndash248 July 2009

[13] T Kamo and C Dagli ldquoHybrid approach to the Japanesecandlestick method for financial forecastingrdquo Expert Systemswith Applications vol 36 no 3 pp 5023ndash5030 2009

[14] M Jasemi A M Kimiagari and A Memariani ldquoA modernneural network model to do stock market timing on the basisof the ancient investment technique of Japanese CandlestickrdquoExpert Systems with Applications vol 38 no 4 pp 3884ndash38902011

[15] S Barak J H Dahooie and T Tichy ldquoWrapper ANFIS-ICAmethod to do stock market timing and feature selection on thebasis of JapaneseCandlestickrdquoExpert SystemswithApplicationsvol 42 no 23 pp 9221ndash9235 2015

[16] JH Fock C Klein andB Zwergel ldquoPerformance of candlestickanalysis on intraday futures datardquoThe Journal ofDerivatives vol13 no 1 pp 28ndash40 2005

[17] B R Marshall M R Young and R Cahan ldquoAre candlesticktechnical trading strategies profitable in the Japanese equitymarketrdquo Review of Quantitative Finance and Accounting vol31 no 2 pp 191ndash207 2008

[18] M J Horton ldquoStars crows and doji the use of candlesticksin stock selectionrdquo Quarterly Review of Economics and Financevol 49 no 2 pp 283ndash294 2009

[19] L J Chmielewski M Janowicz and A Orłowski ldquoPredictionof trend reversals in stock market by classification of Japanesecandlesticksrdquo in Proceedings of the 9th International ConferenceonComputer Recognition Systems (CORES rsquo15) vol 403 pp 641ndash647

[20] C-F Tsai and Z-Y Quan ldquoStock prediction by searching forsimilarities in candlestick chartsrdquo ACM Transactions on Man-agement Information Systems vol 5 no 2 Article ID 25916722014

[21] L Chmielewski M Janowicz J Kaleta and A Orłowski ldquoPat-tern recognition in the Japanese candlesticksrdquoAdvances in Intel-ligent Systems and Computing vol 342 pp 227ndash234 2015

[22] G L Morris and R Litchfield Candlestick Charting ExplainedTimeless Techniques for Trading Stocks and Futures McGraw-Hill New Yourk NY USA 2nd edition 2006

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 7: K-Line Patterns’ Predictive Power Analysis Using the Methods of …downloads.hindawi.com/journals/mpe/2017/3096917.pdf · 2019. 7. 30. · K-Line Patterns’ Predictive Power Analysis

Mathematical Problems in Engineering 7

110

100

95

90

102

105

110

100

(a) KS119894

110

100

95

9092

95

100

90(b) KS119895

Figure 4 K-line series chart

(2) Let PSim119894119895 denote the position similarity between KS119894

and KS119895 as defined by

PSim119894119895 = 119899sum119905=1

Sim119894119895119875 (119905) lowast 119908119905119875

(119899 = 10038161003816100381610038161003816KS11989410038161003816100381610038161003816 119899sum119905=1

119908119905119875 = 1) (14)

where 119908119905119875 represents the weight of Sim119894119895119875 (119905) Thanks to theidea that each K-line can be given different weight the K-lineseries having special coordinates could be identified well

223 The Similarity of K-Line Series Finally based on theshape similarity and position similarity the similarity of K-line series could be obtained Therefore the similarity matchmodel between KS119894 and KS119895 is defined by

Sim119894119895 = SSim119894119895 lowast 119908119878 + PSim119894119895 lowast 119908119875 (15)

where 119908119878 and 119908119875 represent the shape similarity weight andposition similarity weight of K-line series respectively

23 Cluster of K-Line Series Themore accurate classificationresult of K-line patterns can be gotten by clustering themusing the nearest neighbor-clustering algorithm based onthe similarity match model of K-line series The K-lineseriesrsquo nearest neighbor-clustering algorithm (KNNCA) isdescribed as shown in Algorithm 1

In addition |119876119898| represents the number of elements in119876119898 As each K-line series will be matched once with all of theK-line series stored in the cluster the time complexity andspace complexity of KNNCA are both 119874(1198992)3 Mining of Patternsrsquo Predictive Power

We can mine and analyze the patternsrsquo predictive poweraccording to the following steps

(1) Pattern Recognition Based on the definition of K-linepatterns we identify all the K-line series belonging to

a pattern (such as TIU or TID) and then they form aset (KSet)(2) Pattern Clustering We use the KNSSC algorithm tocluster KSet then the set of clusters (CSet) can be gotten inwhich different clusters represent the same patternrsquos differentshapes

(3) Knowledge Mining We define some statistical indicatorsabout stock prices which we use to mine stock predictionknowledge from each cluster

The patternrsquos predictive power is gotten primarily byanalyzing the trend of the patternrsquos consequent K-line seriesPaper [22] found that K-line technology is suited for short-term investment prediction and that the most efficient timeperiod for prediction is 10 daysTherefore we mainly analyzethe close price trend of the patternrsquos consequent K-line seriesin 10 days Let KS = [119863119905 119863119905+1 119863119905+2] denote a three-day K-line pattern its consequent K-line series is denoted by CKSThe statistical indicators of CKS are defined as follows

(a) Let 119862119896 denote the k-th close price of CKS let 119875119880(119862119896)denote that the probability of the trend of 119862119896 is uptrend andlet 119875119863(119862119896) denote that the probability of the trend of 119862119896 isdowntrend 119875119880(119862119896) and 119875119863(119862119896) are calculated by

119875119880 (119862119896) =10038161003816100381610038161003816119876119898119896119880 10038161003816100381610038161003816|119876119898|

119875119863 (119862119896) =10038161003816100381610038161003816119876119898119896119863 10038161003816100381610038161003816|119876119898|

(16)

where |119876119898119896119880 | represents the number of patterns meeting thecondition of119862119896 ge 119862119905+2 in119876119898 |119876119898119896119863 | represents the number ofpatterns meeting the condition of 119862119896 le 119862119905+2 in 119876119898 and 119862119905+2represents the close price of119863119905+2 119875119880(119862119896) gt 05 indicates thatthe future trend of 119862119905 is rising 119875119863(119862119896) gt 05 indicates thatthe future trend of 119862119896 is descending

(b) Let 119875119899119880 isin [0 1] denote the probability that the closeprice will rise in the next 119899 days if the pattern appears119875119899119863 isin [0 1] denotes the probability that the close price will

8 Mathematical Problems in Engineering

InputKSet = KS1KS2 KS119899 the data set of K-line series120579 Similarity threshold

OutputCSet the set of clusters

KNNCA AlgorithmAssign initial value for parameters 119908119878 119908119875 119908Body 119908US 119908LS 119908119905119878 119908119905119875119898 = 1119876119898 = KS1 119876119898 represents them-th clusterCSet = 119876119898FOR 119894 = 2 TO 119899 DO

SimMax = 0FOR EACH 119876item IN CSet

FOR EACH KS119895 IN 119876item

Get Sim119894119895 based on formula (15)If (Sim119894119895 gt SimMax)SimMax = Sim119894119895119891 = item 119891 represents the ID of a cluster whose element is most similar to KS119894

EndEnd

IF (SimMax gt 120579) THEN119876119891 = 119876119891 cup KS119894ELSE 119898 = 119898 + 1119876119898 = KS119894

CSet = 1198761 1198762 119876119898Algorithm 1

fall in the next 119899 days if the pattern appears 119875119899119880 and 119875119899119863 arecalculated by

119875119899119880 =119899sum119896=1

Round [119875119880 (119862119896) 05]119899

Round [119875119880 (119862119896) 05] = 1 119875119880 (119862119896) gt 050 119875119880 (119862119896) le 05

(17)

119875119899119863 =119899sum119896=1

Round [119875119863 (119862119896) 05]119899

Round [119875119863 (119862119896) 05] = 1 119875119863 (119862119896) gt 050 119875119863 (119862119896) le 05

(18)

where a higher value of 119875119899119880 or 119875119899119863 indicates a strongercapability for predicting bullish or bearish market

(4) Analysis Based on the statistical result we analysis thepatternrsquos predictive power

4 Experiment and Result Analysis

41 Experiment Data and Method As Yahoo provides thefinance stock API used to download the transaction data ofChinese stock market the stock transaction data of ChineseA-share market in any time can be acquired based on theAPI To get a representative testing data we select the K-line series data of Shanghai 180 index component stocks overthe latest 10 years (from 2006-01-04 to 2016-08-24) as thetest data Limited by space only the TIU and TID patternrsquospredictive power will be analyzed in the experiment And theparameters of KNSSC algorithm are set as follows 120579 = 075119908119878 = 02 119908119879 = 08 119908Body = 06 119908US = 02 119908LS = 02 and119908119905119878 = 119908119905119875 = 1|KS119894| (119905 = 1 2 |KS119894|)42 Experiment One The aim of the first experiment is toanalyze the TIU patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TIU 1516 TIU patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 554 clusters are obtained We choose the top 20clusters with themost elements to conduct statistical analysisas shown in Table 1

Mathematical Problems in Engineering 9

Table 1 The experiment result of TIU

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198801198760 1516 0493 0486 0518 0495 0475 0488 0481 0479 0484 0479 011198761 79 0405 0532 0557 0532 0544 0506 0468 0506 043 0443 061198762 42 0405 0429 05 0452 0405 0452 0429 0476 0524 0476 011198763 32 05 0594 0594 0625 0594 0531 0625 0594 0563 0531 091198764 25 06 06 052 048 048 04 048 048 06 064 051198765 23 0435 0478 0609 0565 0522 0391 0478 0348 0391 0391 031198766 22 0455 0636 0545 05 0545 0545 0591 0591 0636 0636 081198767 21 0333 0571 0524 0476 0619 0857 0762 0714 0429 0048 061198768 21 0524 0476 0524 0429 0476 0476 0429 0476 0429 0429 021198769 21 0524 0476 0524 0429 0381 0286 0381 0429 0381 0429 0211987610 19 0105 0368 0421 0421 0368 0421 0474 0474 0579 0421 0111987611 19 0579 0474 0421 0368 0263 0368 0316 0263 0211 0211 0111987612 17 0588 0294 0471 0353 0412 0471 0471 0412 0471 0412 0111987613 15 06 0533 0467 0467 0467 0467 0533 0533 06 06 0611987614 15 0533 06 0667 06 0533 06 06 0667 06 06 111987615 14 05 0071 05 0429 0429 0357 0286 0286 0286 0429 011987616 14 0643 0571 0714 0714 0714 05 0643 0571 0571 0714 0911987617 14 0714 0857 0857 0857 0786 0786 0571 0643 0643 0643 111987618 14 0357 0571 0429 05 0357 0571 0429 0429 05 05 0211987619 13 0385 0462 0692 0615 0615 0692 0692 0769 0769 0769 0811987620 12 05 0667 0667 0667 075 0583 0583 0667 0583 0583 09

In Table 1 1198760 represents the cluster composed of 1516TIU Patterns Its 11987510119880 is only 05 which means that it may bea spurious pattern to predict bullish market However afterfurther classifying the TIU patterns we can see that (1) 11987631198766 11987616 and so forth have a strong capability for predictingbullish market because their 11987510119880 both are above 08 (2) 119876111987641198767 and so forth have amoderate capability for predictingbullishmarket as their11987510119880 are only in 05sim07 and (3)119876211987651198768 and so forth have a weak capability for predicting bullishmarket as their entire11987510119880 are below 05 Particularly for 1198762its 11987510119880 is only 01

By comparing the predictive power of 1198763 and 1198762 asshown in Figure 5 we can see that the predictive result of 1198763is bullish market while that of 1198762 is bearish market whichmeans that their predictive power is opposite The result ofexperiment one shows that (1) the predictive power of TIUvaries a great deal for different shapes and (2) to be a betterpattern for predicting bullish market the TIU pattern badlyneeds to be further classified which are consistent with theexpected analysis

43 Experiment Two The aim of the second experiment isto analyze the TID patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TID 1498 TID patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 572 clusters are obtained We choose the top 20

2 3 4 5 6 7 8 9 101

Q3

Q2

04

045

05

055

06

065

07

Figure 5 The comparison of predictive power between 1198763 and 1198762

clusters with themost elements to conduct statistical analysisas shown in Table 2

Similarly the TID pattern may be also a spurious patternto predict bearish market because 11987510119863 of 1198760 is 0 where1198760 represents the cluster composed of 1498 TID patternsMoreover after further classifying the TID patterns we cansee that (1) except for 11987611 and 11987616 almost all of the clustershave a weak capability for predicting bearish market as theirentire11987510119863 are below 05 and (2) even11987611 and11987616 still not havea higher value of 11987510119863 which are only 05 Therefore we can

10 Mathematical Problems in Engineering

Table 2 The experiment result of TID

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198631198760 1498 0421 0451 0461 0449 0437 0449 0459 0444 0427 0435 01198761 78 0449 05 0436 0436 0372 0359 0449 0397 0359 0346 011198762 72 0431 0486 0542 0458 0431 0431 0417 0417 0361 0306 01198763 44 0364 0364 0364 0386 0273 0386 0364 0386 0386 0341 011198764 39 0513 0359 0359 0333 0231 0282 0308 0282 0282 0282 01198765 23 0174 0261 0304 0391 0348 0391 0391 0435 0435 0391 011198766 23 0348 0478 0522 0478 0435 0435 0391 0478 0478 0435 041198767 21 0429 0429 0476 0429 0619 0524 0524 0476 0476 0571 01198768 18 0444 0333 0444 05 05 0278 0222 0333 0333 0333 021198769 18 0333 0667 0611 0389 0444 0444 0389 0389 0389 0389 0111987610 18 0333 0222 0278 0167 0278 0389 0444 0389 05 0556 011987611 18 0444 0444 0278 0333 0389 0222 0278 0333 0444 0389 0511987612 16 0438 0375 05 0563 0688 0688 0625 05 0438 0563 0211987613 15 0267 0267 0467 04 0267 06 0533 0467 04 04 011987614 13 0385 0308 0308 0385 0308 0462 0231 0231 0308 0308 0411987615 12 025 0333 0583 0667 0667 0583 05 05 0417 05 0111987616 12 05 0333 025 025 05 0667 0417 05 05 05 0511987617 12 025 0833 075 0667 0583 05 05 0417 0417 0583 0411987618 11 0364 0273 0273 0364 0545 0455 0545 0545 0455 0545 011987619 11 0364 0455 0273 0364 0364 0364 0455 0455 0455 0364 011987620 11 0273 0364 0273 0364 0364 0364 0364 0364 0182 0273 0

consider that the TID pattern is definitely a spurious patternwhich is also consistent with the expected analysis

44 Experiment Conclusion Through the above experimentwe can draw the following conclusion (1) The predictivepower of a pattern varies a great deal for different shapes TakeTIU for example some shapesrsquo TIU patterns have a strongcapability for predicting bullish market while some othershave the opposite predictive power Therefore to analyze thepredictive power of a pattern we should make a concreteanalysis of concrete shapes (2) There are definitely somespurious patterns in the existing K-line patterns Thereforein order to improve the stock prediction performance basedon K-line patterns we need to further classify the existingpatterns based on the shape feature identify all the spuriouspatterns and choose the patterns having stronger predictivepower to predict the stock price

5 Conclusion

Stock prediction is a popular research field in the timeseries prediction As a primary technology analysis methodof stock prediction there is different option on the stockprice prediction based on K-line patterns in the academicworld though it is widely used in reality To help resolve thedebate this paper uses the data mining method like patternrecognition similarity match cluster and statistical analysisand so forth to study the predictive power of K-line patternsExperimental results show that one reason for the debateis that the definition of K-line patterns is more open and

lack of mathematical rigor The other is that there are somespurious patterns in the existing K-line patterns In additionthe method presented in the paper can be used not onlyto test the predictive power of patterns but also for K-linepatterns mining and stock prediction Therefore the futureworks as follows (1) It will be a necessary and significanttask to identify the entire spurious pattern using the proposedmethod (2) On the basis of the proposed method we canresearch an automatic pattern mining method to discovermore useful patterns for stock prediction

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

The Key Basic Research Foundation of Shanghai Scienceand Technology Committee China (Grant no 14JC1402203)and the Science and Technology Support Program of China(Grant no 2015BAF10B01) financially supported this work

References

[1] J Lin S Williamson K D Borne et al ldquoAdvances in machinelearning amp data mining for astronomyrdquo Pattern Recognition inTime Series pp 617ndash645 2010

[2] L Lihui T Xiang Y Haidong et al ldquoFinancial time series fore-casting based on SVRrdquo Computer Engineering and Applicationsvol 41 no 30 pp 221ndash224 2005

Mathematical Problems in Engineering 11

[3] R D Edwards J Magee and W H C Bassetti TechnicalAnalysis of Stock Trends CRC Press Boca Raton Fla USA 10thedition 2012

[4] S Nison Japanese Candlestick Charting Techniques A Contem-porary Guide to the Ancient Investment Technique of the FarEast Institute of Finance New York NY USA 1991

[5] B R Marshall M R Young and L C Rose ldquoCandlesticktechnical trading strategies can they create value for investorsrdquoJournal of Banking and Finance vol 30 no 8 pp 2303ndash23232006

[6] G Caginalp and H Laurent ldquoThe predictive power of pricepatternsrdquoApplied Mathematical Finance vol 5 no 3-4 pp 181ndash205 1998

[7] K H Lee and G S Jo ldquoExpert system for predicting stockmarket timing using a candlestick chartrdquo Expert Systems withApplications vol 16 no 4 pp 357ndash364 1999

[8] M M Goswami C K Bhensdadia and A P Ganatra ldquoCan-dlestick analysis based short term prediction of stock pricefluctuation usingSOM-CBRrdquo in Proceedings of the 2009 IEEEInternational Advance Computing Conference (IACC rsquo09) pp1448ndash1452 March 2009

[9] T H Lu and J Chen Candlestick charting in European stockmarkets no 2 pp 20ndash25 2013

[10] T H Lu and Y M Shiu ldquoTests for two day candlestick patternsin the emerging equity market of Taiwanrdquo Emerging MarketsFinance amp Trade vol 48 no 1 pp 41ndash57 2014

[11] H LiWW Y Ng JW T Lee B Sun andD S Yeung ldquoQuanti-tative study on candlestick pattern for shenzhen stock marketrdquoin Proceedings of the 2008 IEEE International Conference onSystems Man and Cybernetics (SMC rsquo08) pp 54ndash59 October2008

[12] W Xiao WW Y Ng M Firth et al ldquoL-GEM basedMCS aidedcandlestick pattern investment strategy in the shenzhen stockmarketrdquo in Proceedings of the 2009 International Conference onMachine Learning and Cybernetics pp 243ndash248 July 2009

[13] T Kamo and C Dagli ldquoHybrid approach to the Japanesecandlestick method for financial forecastingrdquo Expert Systemswith Applications vol 36 no 3 pp 5023ndash5030 2009

[14] M Jasemi A M Kimiagari and A Memariani ldquoA modernneural network model to do stock market timing on the basisof the ancient investment technique of Japanese CandlestickrdquoExpert Systems with Applications vol 38 no 4 pp 3884ndash38902011

[15] S Barak J H Dahooie and T Tichy ldquoWrapper ANFIS-ICAmethod to do stock market timing and feature selection on thebasis of JapaneseCandlestickrdquoExpert SystemswithApplicationsvol 42 no 23 pp 9221ndash9235 2015

[16] JH Fock C Klein andB Zwergel ldquoPerformance of candlestickanalysis on intraday futures datardquoThe Journal ofDerivatives vol13 no 1 pp 28ndash40 2005

[17] B R Marshall M R Young and R Cahan ldquoAre candlesticktechnical trading strategies profitable in the Japanese equitymarketrdquo Review of Quantitative Finance and Accounting vol31 no 2 pp 191ndash207 2008

[18] M J Horton ldquoStars crows and doji the use of candlesticksin stock selectionrdquo Quarterly Review of Economics and Financevol 49 no 2 pp 283ndash294 2009

[19] L J Chmielewski M Janowicz and A Orłowski ldquoPredictionof trend reversals in stock market by classification of Japanesecandlesticksrdquo in Proceedings of the 9th International ConferenceonComputer Recognition Systems (CORES rsquo15) vol 403 pp 641ndash647

[20] C-F Tsai and Z-Y Quan ldquoStock prediction by searching forsimilarities in candlestick chartsrdquo ACM Transactions on Man-agement Information Systems vol 5 no 2 Article ID 25916722014

[21] L Chmielewski M Janowicz J Kaleta and A Orłowski ldquoPat-tern recognition in the Japanese candlesticksrdquoAdvances in Intel-ligent Systems and Computing vol 342 pp 227ndash234 2015

[22] G L Morris and R Litchfield Candlestick Charting ExplainedTimeless Techniques for Trading Stocks and Futures McGraw-Hill New Yourk NY USA 2nd edition 2006

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: K-Line Patterns’ Predictive Power Analysis Using the Methods of …downloads.hindawi.com/journals/mpe/2017/3096917.pdf · 2019. 7. 30. · K-Line Patterns’ Predictive Power Analysis

8 Mathematical Problems in Engineering

InputKSet = KS1KS2 KS119899 the data set of K-line series120579 Similarity threshold

OutputCSet the set of clusters

KNNCA AlgorithmAssign initial value for parameters 119908119878 119908119875 119908Body 119908US 119908LS 119908119905119878 119908119905119875119898 = 1119876119898 = KS1 119876119898 represents them-th clusterCSet = 119876119898FOR 119894 = 2 TO 119899 DO

SimMax = 0FOR EACH 119876item IN CSet

FOR EACH KS119895 IN 119876item

Get Sim119894119895 based on formula (15)If (Sim119894119895 gt SimMax)SimMax = Sim119894119895119891 = item 119891 represents the ID of a cluster whose element is most similar to KS119894

EndEnd

IF (SimMax gt 120579) THEN119876119891 = 119876119891 cup KS119894ELSE 119898 = 119898 + 1119876119898 = KS119894

CSet = 1198761 1198762 119876119898Algorithm 1

fall in the next 119899 days if the pattern appears 119875119899119880 and 119875119899119863 arecalculated by

119875119899119880 =119899sum119896=1

Round [119875119880 (119862119896) 05]119899

Round [119875119880 (119862119896) 05] = 1 119875119880 (119862119896) gt 050 119875119880 (119862119896) le 05

(17)

119875119899119863 =119899sum119896=1

Round [119875119863 (119862119896) 05]119899

Round [119875119863 (119862119896) 05] = 1 119875119863 (119862119896) gt 050 119875119863 (119862119896) le 05

(18)

where a higher value of 119875119899119880 or 119875119899119863 indicates a strongercapability for predicting bullish or bearish market

(4) Analysis Based on the statistical result we analysis thepatternrsquos predictive power

4 Experiment and Result Analysis

41 Experiment Data and Method As Yahoo provides thefinance stock API used to download the transaction data ofChinese stock market the stock transaction data of ChineseA-share market in any time can be acquired based on theAPI To get a representative testing data we select the K-line series data of Shanghai 180 index component stocks overthe latest 10 years (from 2006-01-04 to 2016-08-24) as thetest data Limited by space only the TIU and TID patternrsquospredictive power will be analyzed in the experiment And theparameters of KNSSC algorithm are set as follows 120579 = 075119908119878 = 02 119908119879 = 08 119908Body = 06 119908US = 02 119908LS = 02 and119908119905119878 = 119908119905119875 = 1|KS119894| (119905 = 1 2 |KS119894|)42 Experiment One The aim of the first experiment is toanalyze the TIU patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TIU 1516 TIU patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 554 clusters are obtained We choose the top 20clusters with themost elements to conduct statistical analysisas shown in Table 1

Mathematical Problems in Engineering 9

Table 1 The experiment result of TIU

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198801198760 1516 0493 0486 0518 0495 0475 0488 0481 0479 0484 0479 011198761 79 0405 0532 0557 0532 0544 0506 0468 0506 043 0443 061198762 42 0405 0429 05 0452 0405 0452 0429 0476 0524 0476 011198763 32 05 0594 0594 0625 0594 0531 0625 0594 0563 0531 091198764 25 06 06 052 048 048 04 048 048 06 064 051198765 23 0435 0478 0609 0565 0522 0391 0478 0348 0391 0391 031198766 22 0455 0636 0545 05 0545 0545 0591 0591 0636 0636 081198767 21 0333 0571 0524 0476 0619 0857 0762 0714 0429 0048 061198768 21 0524 0476 0524 0429 0476 0476 0429 0476 0429 0429 021198769 21 0524 0476 0524 0429 0381 0286 0381 0429 0381 0429 0211987610 19 0105 0368 0421 0421 0368 0421 0474 0474 0579 0421 0111987611 19 0579 0474 0421 0368 0263 0368 0316 0263 0211 0211 0111987612 17 0588 0294 0471 0353 0412 0471 0471 0412 0471 0412 0111987613 15 06 0533 0467 0467 0467 0467 0533 0533 06 06 0611987614 15 0533 06 0667 06 0533 06 06 0667 06 06 111987615 14 05 0071 05 0429 0429 0357 0286 0286 0286 0429 011987616 14 0643 0571 0714 0714 0714 05 0643 0571 0571 0714 0911987617 14 0714 0857 0857 0857 0786 0786 0571 0643 0643 0643 111987618 14 0357 0571 0429 05 0357 0571 0429 0429 05 05 0211987619 13 0385 0462 0692 0615 0615 0692 0692 0769 0769 0769 0811987620 12 05 0667 0667 0667 075 0583 0583 0667 0583 0583 09

In Table 1 1198760 represents the cluster composed of 1516TIU Patterns Its 11987510119880 is only 05 which means that it may bea spurious pattern to predict bullish market However afterfurther classifying the TIU patterns we can see that (1) 11987631198766 11987616 and so forth have a strong capability for predictingbullish market because their 11987510119880 both are above 08 (2) 119876111987641198767 and so forth have amoderate capability for predictingbullishmarket as their11987510119880 are only in 05sim07 and (3)119876211987651198768 and so forth have a weak capability for predicting bullishmarket as their entire11987510119880 are below 05 Particularly for 1198762its 11987510119880 is only 01

By comparing the predictive power of 1198763 and 1198762 asshown in Figure 5 we can see that the predictive result of 1198763is bullish market while that of 1198762 is bearish market whichmeans that their predictive power is opposite The result ofexperiment one shows that (1) the predictive power of TIUvaries a great deal for different shapes and (2) to be a betterpattern for predicting bullish market the TIU pattern badlyneeds to be further classified which are consistent with theexpected analysis

43 Experiment Two The aim of the second experiment isto analyze the TID patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TID 1498 TID patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 572 clusters are obtained We choose the top 20

2 3 4 5 6 7 8 9 101

Q3

Q2

04

045

05

055

06

065

07

Figure 5 The comparison of predictive power between 1198763 and 1198762

clusters with themost elements to conduct statistical analysisas shown in Table 2

Similarly the TID pattern may be also a spurious patternto predict bearish market because 11987510119863 of 1198760 is 0 where1198760 represents the cluster composed of 1498 TID patternsMoreover after further classifying the TID patterns we cansee that (1) except for 11987611 and 11987616 almost all of the clustershave a weak capability for predicting bearish market as theirentire11987510119863 are below 05 and (2) even11987611 and11987616 still not havea higher value of 11987510119863 which are only 05 Therefore we can

10 Mathematical Problems in Engineering

Table 2 The experiment result of TID

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198631198760 1498 0421 0451 0461 0449 0437 0449 0459 0444 0427 0435 01198761 78 0449 05 0436 0436 0372 0359 0449 0397 0359 0346 011198762 72 0431 0486 0542 0458 0431 0431 0417 0417 0361 0306 01198763 44 0364 0364 0364 0386 0273 0386 0364 0386 0386 0341 011198764 39 0513 0359 0359 0333 0231 0282 0308 0282 0282 0282 01198765 23 0174 0261 0304 0391 0348 0391 0391 0435 0435 0391 011198766 23 0348 0478 0522 0478 0435 0435 0391 0478 0478 0435 041198767 21 0429 0429 0476 0429 0619 0524 0524 0476 0476 0571 01198768 18 0444 0333 0444 05 05 0278 0222 0333 0333 0333 021198769 18 0333 0667 0611 0389 0444 0444 0389 0389 0389 0389 0111987610 18 0333 0222 0278 0167 0278 0389 0444 0389 05 0556 011987611 18 0444 0444 0278 0333 0389 0222 0278 0333 0444 0389 0511987612 16 0438 0375 05 0563 0688 0688 0625 05 0438 0563 0211987613 15 0267 0267 0467 04 0267 06 0533 0467 04 04 011987614 13 0385 0308 0308 0385 0308 0462 0231 0231 0308 0308 0411987615 12 025 0333 0583 0667 0667 0583 05 05 0417 05 0111987616 12 05 0333 025 025 05 0667 0417 05 05 05 0511987617 12 025 0833 075 0667 0583 05 05 0417 0417 0583 0411987618 11 0364 0273 0273 0364 0545 0455 0545 0545 0455 0545 011987619 11 0364 0455 0273 0364 0364 0364 0455 0455 0455 0364 011987620 11 0273 0364 0273 0364 0364 0364 0364 0364 0182 0273 0

consider that the TID pattern is definitely a spurious patternwhich is also consistent with the expected analysis

44 Experiment Conclusion Through the above experimentwe can draw the following conclusion (1) The predictivepower of a pattern varies a great deal for different shapes TakeTIU for example some shapesrsquo TIU patterns have a strongcapability for predicting bullish market while some othershave the opposite predictive power Therefore to analyze thepredictive power of a pattern we should make a concreteanalysis of concrete shapes (2) There are definitely somespurious patterns in the existing K-line patterns Thereforein order to improve the stock prediction performance basedon K-line patterns we need to further classify the existingpatterns based on the shape feature identify all the spuriouspatterns and choose the patterns having stronger predictivepower to predict the stock price

5 Conclusion

Stock prediction is a popular research field in the timeseries prediction As a primary technology analysis methodof stock prediction there is different option on the stockprice prediction based on K-line patterns in the academicworld though it is widely used in reality To help resolve thedebate this paper uses the data mining method like patternrecognition similarity match cluster and statistical analysisand so forth to study the predictive power of K-line patternsExperimental results show that one reason for the debateis that the definition of K-line patterns is more open and

lack of mathematical rigor The other is that there are somespurious patterns in the existing K-line patterns In additionthe method presented in the paper can be used not onlyto test the predictive power of patterns but also for K-linepatterns mining and stock prediction Therefore the futureworks as follows (1) It will be a necessary and significanttask to identify the entire spurious pattern using the proposedmethod (2) On the basis of the proposed method we canresearch an automatic pattern mining method to discovermore useful patterns for stock prediction

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

The Key Basic Research Foundation of Shanghai Scienceand Technology Committee China (Grant no 14JC1402203)and the Science and Technology Support Program of China(Grant no 2015BAF10B01) financially supported this work

References

[1] J Lin S Williamson K D Borne et al ldquoAdvances in machinelearning amp data mining for astronomyrdquo Pattern Recognition inTime Series pp 617ndash645 2010

[2] L Lihui T Xiang Y Haidong et al ldquoFinancial time series fore-casting based on SVRrdquo Computer Engineering and Applicationsvol 41 no 30 pp 221ndash224 2005

Mathematical Problems in Engineering 11

[3] R D Edwards J Magee and W H C Bassetti TechnicalAnalysis of Stock Trends CRC Press Boca Raton Fla USA 10thedition 2012

[4] S Nison Japanese Candlestick Charting Techniques A Contem-porary Guide to the Ancient Investment Technique of the FarEast Institute of Finance New York NY USA 1991

[5] B R Marshall M R Young and L C Rose ldquoCandlesticktechnical trading strategies can they create value for investorsrdquoJournal of Banking and Finance vol 30 no 8 pp 2303ndash23232006

[6] G Caginalp and H Laurent ldquoThe predictive power of pricepatternsrdquoApplied Mathematical Finance vol 5 no 3-4 pp 181ndash205 1998

[7] K H Lee and G S Jo ldquoExpert system for predicting stockmarket timing using a candlestick chartrdquo Expert Systems withApplications vol 16 no 4 pp 357ndash364 1999

[8] M M Goswami C K Bhensdadia and A P Ganatra ldquoCan-dlestick analysis based short term prediction of stock pricefluctuation usingSOM-CBRrdquo in Proceedings of the 2009 IEEEInternational Advance Computing Conference (IACC rsquo09) pp1448ndash1452 March 2009

[9] T H Lu and J Chen Candlestick charting in European stockmarkets no 2 pp 20ndash25 2013

[10] T H Lu and Y M Shiu ldquoTests for two day candlestick patternsin the emerging equity market of Taiwanrdquo Emerging MarketsFinance amp Trade vol 48 no 1 pp 41ndash57 2014

[11] H LiWW Y Ng JW T Lee B Sun andD S Yeung ldquoQuanti-tative study on candlestick pattern for shenzhen stock marketrdquoin Proceedings of the 2008 IEEE International Conference onSystems Man and Cybernetics (SMC rsquo08) pp 54ndash59 October2008

[12] W Xiao WW Y Ng M Firth et al ldquoL-GEM basedMCS aidedcandlestick pattern investment strategy in the shenzhen stockmarketrdquo in Proceedings of the 2009 International Conference onMachine Learning and Cybernetics pp 243ndash248 July 2009

[13] T Kamo and C Dagli ldquoHybrid approach to the Japanesecandlestick method for financial forecastingrdquo Expert Systemswith Applications vol 36 no 3 pp 5023ndash5030 2009

[14] M Jasemi A M Kimiagari and A Memariani ldquoA modernneural network model to do stock market timing on the basisof the ancient investment technique of Japanese CandlestickrdquoExpert Systems with Applications vol 38 no 4 pp 3884ndash38902011

[15] S Barak J H Dahooie and T Tichy ldquoWrapper ANFIS-ICAmethod to do stock market timing and feature selection on thebasis of JapaneseCandlestickrdquoExpert SystemswithApplicationsvol 42 no 23 pp 9221ndash9235 2015

[16] JH Fock C Klein andB Zwergel ldquoPerformance of candlestickanalysis on intraday futures datardquoThe Journal ofDerivatives vol13 no 1 pp 28ndash40 2005

[17] B R Marshall M R Young and R Cahan ldquoAre candlesticktechnical trading strategies profitable in the Japanese equitymarketrdquo Review of Quantitative Finance and Accounting vol31 no 2 pp 191ndash207 2008

[18] M J Horton ldquoStars crows and doji the use of candlesticksin stock selectionrdquo Quarterly Review of Economics and Financevol 49 no 2 pp 283ndash294 2009

[19] L J Chmielewski M Janowicz and A Orłowski ldquoPredictionof trend reversals in stock market by classification of Japanesecandlesticksrdquo in Proceedings of the 9th International ConferenceonComputer Recognition Systems (CORES rsquo15) vol 403 pp 641ndash647

[20] C-F Tsai and Z-Y Quan ldquoStock prediction by searching forsimilarities in candlestick chartsrdquo ACM Transactions on Man-agement Information Systems vol 5 no 2 Article ID 25916722014

[21] L Chmielewski M Janowicz J Kaleta and A Orłowski ldquoPat-tern recognition in the Japanese candlesticksrdquoAdvances in Intel-ligent Systems and Computing vol 342 pp 227ndash234 2015

[22] G L Morris and R Litchfield Candlestick Charting ExplainedTimeless Techniques for Trading Stocks and Futures McGraw-Hill New Yourk NY USA 2nd edition 2006

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 9: K-Line Patterns’ Predictive Power Analysis Using the Methods of …downloads.hindawi.com/journals/mpe/2017/3096917.pdf · 2019. 7. 30. · K-Line Patterns’ Predictive Power Analysis

Mathematical Problems in Engineering 9

Table 1 The experiment result of TIU

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198801198760 1516 0493 0486 0518 0495 0475 0488 0481 0479 0484 0479 011198761 79 0405 0532 0557 0532 0544 0506 0468 0506 043 0443 061198762 42 0405 0429 05 0452 0405 0452 0429 0476 0524 0476 011198763 32 05 0594 0594 0625 0594 0531 0625 0594 0563 0531 091198764 25 06 06 052 048 048 04 048 048 06 064 051198765 23 0435 0478 0609 0565 0522 0391 0478 0348 0391 0391 031198766 22 0455 0636 0545 05 0545 0545 0591 0591 0636 0636 081198767 21 0333 0571 0524 0476 0619 0857 0762 0714 0429 0048 061198768 21 0524 0476 0524 0429 0476 0476 0429 0476 0429 0429 021198769 21 0524 0476 0524 0429 0381 0286 0381 0429 0381 0429 0211987610 19 0105 0368 0421 0421 0368 0421 0474 0474 0579 0421 0111987611 19 0579 0474 0421 0368 0263 0368 0316 0263 0211 0211 0111987612 17 0588 0294 0471 0353 0412 0471 0471 0412 0471 0412 0111987613 15 06 0533 0467 0467 0467 0467 0533 0533 06 06 0611987614 15 0533 06 0667 06 0533 06 06 0667 06 06 111987615 14 05 0071 05 0429 0429 0357 0286 0286 0286 0429 011987616 14 0643 0571 0714 0714 0714 05 0643 0571 0571 0714 0911987617 14 0714 0857 0857 0857 0786 0786 0571 0643 0643 0643 111987618 14 0357 0571 0429 05 0357 0571 0429 0429 05 05 0211987619 13 0385 0462 0692 0615 0615 0692 0692 0769 0769 0769 0811987620 12 05 0667 0667 0667 075 0583 0583 0667 0583 0583 09

In Table 1 1198760 represents the cluster composed of 1516TIU Patterns Its 11987510119880 is only 05 which means that it may bea spurious pattern to predict bullish market However afterfurther classifying the TIU patterns we can see that (1) 11987631198766 11987616 and so forth have a strong capability for predictingbullish market because their 11987510119880 both are above 08 (2) 119876111987641198767 and so forth have amoderate capability for predictingbullishmarket as their11987510119880 are only in 05sim07 and (3)119876211987651198768 and so forth have a weak capability for predicting bullishmarket as their entire11987510119880 are below 05 Particularly for 1198762its 11987510119880 is only 01

By comparing the predictive power of 1198763 and 1198762 asshown in Figure 5 we can see that the predictive result of 1198763is bullish market while that of 1198762 is bearish market whichmeans that their predictive power is opposite The result ofexperiment one shows that (1) the predictive power of TIUvaries a great deal for different shapes and (2) to be a betterpattern for predicting bullish market the TIU pattern badlyneeds to be further classified which are consistent with theexpected analysis

43 Experiment Two The aim of the second experiment isto analyze the TID patternrsquos predictive power based on themethod defined in Section 3 Firstly based on the definitionof TID 1498 TID patterns are identified from the test dataThen we cluster these patterns using the KNSSC algorithmand finally 572 clusters are obtained We choose the top 20

2 3 4 5 6 7 8 9 101

Q3

Q2

04

045

05

055

06

065

07

Figure 5 The comparison of predictive power between 1198763 and 1198762

clusters with themost elements to conduct statistical analysisas shown in Table 2

Similarly the TID pattern may be also a spurious patternto predict bearish market because 11987510119863 of 1198760 is 0 where1198760 represents the cluster composed of 1498 TID patternsMoreover after further classifying the TID patterns we cansee that (1) except for 11987611 and 11987616 almost all of the clustershave a weak capability for predicting bearish market as theirentire11987510119863 are below 05 and (2) even11987611 and11987616 still not havea higher value of 11987510119863 which are only 05 Therefore we can

10 Mathematical Problems in Engineering

Table 2 The experiment result of TID

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198631198760 1498 0421 0451 0461 0449 0437 0449 0459 0444 0427 0435 01198761 78 0449 05 0436 0436 0372 0359 0449 0397 0359 0346 011198762 72 0431 0486 0542 0458 0431 0431 0417 0417 0361 0306 01198763 44 0364 0364 0364 0386 0273 0386 0364 0386 0386 0341 011198764 39 0513 0359 0359 0333 0231 0282 0308 0282 0282 0282 01198765 23 0174 0261 0304 0391 0348 0391 0391 0435 0435 0391 011198766 23 0348 0478 0522 0478 0435 0435 0391 0478 0478 0435 041198767 21 0429 0429 0476 0429 0619 0524 0524 0476 0476 0571 01198768 18 0444 0333 0444 05 05 0278 0222 0333 0333 0333 021198769 18 0333 0667 0611 0389 0444 0444 0389 0389 0389 0389 0111987610 18 0333 0222 0278 0167 0278 0389 0444 0389 05 0556 011987611 18 0444 0444 0278 0333 0389 0222 0278 0333 0444 0389 0511987612 16 0438 0375 05 0563 0688 0688 0625 05 0438 0563 0211987613 15 0267 0267 0467 04 0267 06 0533 0467 04 04 011987614 13 0385 0308 0308 0385 0308 0462 0231 0231 0308 0308 0411987615 12 025 0333 0583 0667 0667 0583 05 05 0417 05 0111987616 12 05 0333 025 025 05 0667 0417 05 05 05 0511987617 12 025 0833 075 0667 0583 05 05 0417 0417 0583 0411987618 11 0364 0273 0273 0364 0545 0455 0545 0545 0455 0545 011987619 11 0364 0455 0273 0364 0364 0364 0455 0455 0455 0364 011987620 11 0273 0364 0273 0364 0364 0364 0364 0364 0182 0273 0

consider that the TID pattern is definitely a spurious patternwhich is also consistent with the expected analysis

44 Experiment Conclusion Through the above experimentwe can draw the following conclusion (1) The predictivepower of a pattern varies a great deal for different shapes TakeTIU for example some shapesrsquo TIU patterns have a strongcapability for predicting bullish market while some othershave the opposite predictive power Therefore to analyze thepredictive power of a pattern we should make a concreteanalysis of concrete shapes (2) There are definitely somespurious patterns in the existing K-line patterns Thereforein order to improve the stock prediction performance basedon K-line patterns we need to further classify the existingpatterns based on the shape feature identify all the spuriouspatterns and choose the patterns having stronger predictivepower to predict the stock price

5 Conclusion

Stock prediction is a popular research field in the timeseries prediction As a primary technology analysis methodof stock prediction there is different option on the stockprice prediction based on K-line patterns in the academicworld though it is widely used in reality To help resolve thedebate this paper uses the data mining method like patternrecognition similarity match cluster and statistical analysisand so forth to study the predictive power of K-line patternsExperimental results show that one reason for the debateis that the definition of K-line patterns is more open and

lack of mathematical rigor The other is that there are somespurious patterns in the existing K-line patterns In additionthe method presented in the paper can be used not onlyto test the predictive power of patterns but also for K-linepatterns mining and stock prediction Therefore the futureworks as follows (1) It will be a necessary and significanttask to identify the entire spurious pattern using the proposedmethod (2) On the basis of the proposed method we canresearch an automatic pattern mining method to discovermore useful patterns for stock prediction

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

The Key Basic Research Foundation of Shanghai Scienceand Technology Committee China (Grant no 14JC1402203)and the Science and Technology Support Program of China(Grant no 2015BAF10B01) financially supported this work

References

[1] J Lin S Williamson K D Borne et al ldquoAdvances in machinelearning amp data mining for astronomyrdquo Pattern Recognition inTime Series pp 617ndash645 2010

[2] L Lihui T Xiang Y Haidong et al ldquoFinancial time series fore-casting based on SVRrdquo Computer Engineering and Applicationsvol 41 no 30 pp 221ndash224 2005

Mathematical Problems in Engineering 11

[3] R D Edwards J Magee and W H C Bassetti TechnicalAnalysis of Stock Trends CRC Press Boca Raton Fla USA 10thedition 2012

[4] S Nison Japanese Candlestick Charting Techniques A Contem-porary Guide to the Ancient Investment Technique of the FarEast Institute of Finance New York NY USA 1991

[5] B R Marshall M R Young and L C Rose ldquoCandlesticktechnical trading strategies can they create value for investorsrdquoJournal of Banking and Finance vol 30 no 8 pp 2303ndash23232006

[6] G Caginalp and H Laurent ldquoThe predictive power of pricepatternsrdquoApplied Mathematical Finance vol 5 no 3-4 pp 181ndash205 1998

[7] K H Lee and G S Jo ldquoExpert system for predicting stockmarket timing using a candlestick chartrdquo Expert Systems withApplications vol 16 no 4 pp 357ndash364 1999

[8] M M Goswami C K Bhensdadia and A P Ganatra ldquoCan-dlestick analysis based short term prediction of stock pricefluctuation usingSOM-CBRrdquo in Proceedings of the 2009 IEEEInternational Advance Computing Conference (IACC rsquo09) pp1448ndash1452 March 2009

[9] T H Lu and J Chen Candlestick charting in European stockmarkets no 2 pp 20ndash25 2013

[10] T H Lu and Y M Shiu ldquoTests for two day candlestick patternsin the emerging equity market of Taiwanrdquo Emerging MarketsFinance amp Trade vol 48 no 1 pp 41ndash57 2014

[11] H LiWW Y Ng JW T Lee B Sun andD S Yeung ldquoQuanti-tative study on candlestick pattern for shenzhen stock marketrdquoin Proceedings of the 2008 IEEE International Conference onSystems Man and Cybernetics (SMC rsquo08) pp 54ndash59 October2008

[12] W Xiao WW Y Ng M Firth et al ldquoL-GEM basedMCS aidedcandlestick pattern investment strategy in the shenzhen stockmarketrdquo in Proceedings of the 2009 International Conference onMachine Learning and Cybernetics pp 243ndash248 July 2009

[13] T Kamo and C Dagli ldquoHybrid approach to the Japanesecandlestick method for financial forecastingrdquo Expert Systemswith Applications vol 36 no 3 pp 5023ndash5030 2009

[14] M Jasemi A M Kimiagari and A Memariani ldquoA modernneural network model to do stock market timing on the basisof the ancient investment technique of Japanese CandlestickrdquoExpert Systems with Applications vol 38 no 4 pp 3884ndash38902011

[15] S Barak J H Dahooie and T Tichy ldquoWrapper ANFIS-ICAmethod to do stock market timing and feature selection on thebasis of JapaneseCandlestickrdquoExpert SystemswithApplicationsvol 42 no 23 pp 9221ndash9235 2015

[16] JH Fock C Klein andB Zwergel ldquoPerformance of candlestickanalysis on intraday futures datardquoThe Journal ofDerivatives vol13 no 1 pp 28ndash40 2005

[17] B R Marshall M R Young and R Cahan ldquoAre candlesticktechnical trading strategies profitable in the Japanese equitymarketrdquo Review of Quantitative Finance and Accounting vol31 no 2 pp 191ndash207 2008

[18] M J Horton ldquoStars crows and doji the use of candlesticksin stock selectionrdquo Quarterly Review of Economics and Financevol 49 no 2 pp 283ndash294 2009

[19] L J Chmielewski M Janowicz and A Orłowski ldquoPredictionof trend reversals in stock market by classification of Japanesecandlesticksrdquo in Proceedings of the 9th International ConferenceonComputer Recognition Systems (CORES rsquo15) vol 403 pp 641ndash647

[20] C-F Tsai and Z-Y Quan ldquoStock prediction by searching forsimilarities in candlestick chartsrdquo ACM Transactions on Man-agement Information Systems vol 5 no 2 Article ID 25916722014

[21] L Chmielewski M Janowicz J Kaleta and A Orłowski ldquoPat-tern recognition in the Japanese candlesticksrdquoAdvances in Intel-ligent Systems and Computing vol 342 pp 227ndash234 2015

[22] G L Morris and R Litchfield Candlestick Charting ExplainedTimeless Techniques for Trading Stocks and Futures McGraw-Hill New Yourk NY USA 2nd edition 2006

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 10: K-Line Patterns’ Predictive Power Analysis Using the Methods of …downloads.hindawi.com/journals/mpe/2017/3096917.pdf · 2019. 7. 30. · K-Line Patterns’ Predictive Power Analysis

10 Mathematical Problems in Engineering

Table 2 The experiment result of TID

119876119898 10038161003816100381610038161198761198981003816100381610038161003816 119875119880 (1198621) 119875119880 (1198622) 119875119880 (1198623) 119875119880 (1198624) 119875119880 (1198625) 119875119880 (1198626) 119875119880 (1198627) 119875119880 (1198628) 119875119880 (1198629) 119875119880 (11986210) 119875101198631198760 1498 0421 0451 0461 0449 0437 0449 0459 0444 0427 0435 01198761 78 0449 05 0436 0436 0372 0359 0449 0397 0359 0346 011198762 72 0431 0486 0542 0458 0431 0431 0417 0417 0361 0306 01198763 44 0364 0364 0364 0386 0273 0386 0364 0386 0386 0341 011198764 39 0513 0359 0359 0333 0231 0282 0308 0282 0282 0282 01198765 23 0174 0261 0304 0391 0348 0391 0391 0435 0435 0391 011198766 23 0348 0478 0522 0478 0435 0435 0391 0478 0478 0435 041198767 21 0429 0429 0476 0429 0619 0524 0524 0476 0476 0571 01198768 18 0444 0333 0444 05 05 0278 0222 0333 0333 0333 021198769 18 0333 0667 0611 0389 0444 0444 0389 0389 0389 0389 0111987610 18 0333 0222 0278 0167 0278 0389 0444 0389 05 0556 011987611 18 0444 0444 0278 0333 0389 0222 0278 0333 0444 0389 0511987612 16 0438 0375 05 0563 0688 0688 0625 05 0438 0563 0211987613 15 0267 0267 0467 04 0267 06 0533 0467 04 04 011987614 13 0385 0308 0308 0385 0308 0462 0231 0231 0308 0308 0411987615 12 025 0333 0583 0667 0667 0583 05 05 0417 05 0111987616 12 05 0333 025 025 05 0667 0417 05 05 05 0511987617 12 025 0833 075 0667 0583 05 05 0417 0417 0583 0411987618 11 0364 0273 0273 0364 0545 0455 0545 0545 0455 0545 011987619 11 0364 0455 0273 0364 0364 0364 0455 0455 0455 0364 011987620 11 0273 0364 0273 0364 0364 0364 0364 0364 0182 0273 0

consider that the TID pattern is definitely a spurious patternwhich is also consistent with the expected analysis

44 Experiment Conclusion Through the above experimentwe can draw the following conclusion (1) The predictivepower of a pattern varies a great deal for different shapes TakeTIU for example some shapesrsquo TIU patterns have a strongcapability for predicting bullish market while some othershave the opposite predictive power Therefore to analyze thepredictive power of a pattern we should make a concreteanalysis of concrete shapes (2) There are definitely somespurious patterns in the existing K-line patterns Thereforein order to improve the stock prediction performance basedon K-line patterns we need to further classify the existingpatterns based on the shape feature identify all the spuriouspatterns and choose the patterns having stronger predictivepower to predict the stock price

5 Conclusion

Stock prediction is a popular research field in the timeseries prediction As a primary technology analysis methodof stock prediction there is different option on the stockprice prediction based on K-line patterns in the academicworld though it is widely used in reality To help resolve thedebate this paper uses the data mining method like patternrecognition similarity match cluster and statistical analysisand so forth to study the predictive power of K-line patternsExperimental results show that one reason for the debateis that the definition of K-line patterns is more open and

lack of mathematical rigor The other is that there are somespurious patterns in the existing K-line patterns In additionthe method presented in the paper can be used not onlyto test the predictive power of patterns but also for K-linepatterns mining and stock prediction Therefore the futureworks as follows (1) It will be a necessary and significanttask to identify the entire spurious pattern using the proposedmethod (2) On the basis of the proposed method we canresearch an automatic pattern mining method to discovermore useful patterns for stock prediction

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper

Acknowledgments

The Key Basic Research Foundation of Shanghai Scienceand Technology Committee China (Grant no 14JC1402203)and the Science and Technology Support Program of China(Grant no 2015BAF10B01) financially supported this work

References

[1] J Lin S Williamson K D Borne et al ldquoAdvances in machinelearning amp data mining for astronomyrdquo Pattern Recognition inTime Series pp 617ndash645 2010

[2] L Lihui T Xiang Y Haidong et al ldquoFinancial time series fore-casting based on SVRrdquo Computer Engineering and Applicationsvol 41 no 30 pp 221ndash224 2005

Mathematical Problems in Engineering 11

[3] R D Edwards J Magee and W H C Bassetti TechnicalAnalysis of Stock Trends CRC Press Boca Raton Fla USA 10thedition 2012

[4] S Nison Japanese Candlestick Charting Techniques A Contem-porary Guide to the Ancient Investment Technique of the FarEast Institute of Finance New York NY USA 1991

[5] B R Marshall M R Young and L C Rose ldquoCandlesticktechnical trading strategies can they create value for investorsrdquoJournal of Banking and Finance vol 30 no 8 pp 2303ndash23232006

[6] G Caginalp and H Laurent ldquoThe predictive power of pricepatternsrdquoApplied Mathematical Finance vol 5 no 3-4 pp 181ndash205 1998

[7] K H Lee and G S Jo ldquoExpert system for predicting stockmarket timing using a candlestick chartrdquo Expert Systems withApplications vol 16 no 4 pp 357ndash364 1999

[8] M M Goswami C K Bhensdadia and A P Ganatra ldquoCan-dlestick analysis based short term prediction of stock pricefluctuation usingSOM-CBRrdquo in Proceedings of the 2009 IEEEInternational Advance Computing Conference (IACC rsquo09) pp1448ndash1452 March 2009

[9] T H Lu and J Chen Candlestick charting in European stockmarkets no 2 pp 20ndash25 2013

[10] T H Lu and Y M Shiu ldquoTests for two day candlestick patternsin the emerging equity market of Taiwanrdquo Emerging MarketsFinance amp Trade vol 48 no 1 pp 41ndash57 2014

[11] H LiWW Y Ng JW T Lee B Sun andD S Yeung ldquoQuanti-tative study on candlestick pattern for shenzhen stock marketrdquoin Proceedings of the 2008 IEEE International Conference onSystems Man and Cybernetics (SMC rsquo08) pp 54ndash59 October2008

[12] W Xiao WW Y Ng M Firth et al ldquoL-GEM basedMCS aidedcandlestick pattern investment strategy in the shenzhen stockmarketrdquo in Proceedings of the 2009 International Conference onMachine Learning and Cybernetics pp 243ndash248 July 2009

[13] T Kamo and C Dagli ldquoHybrid approach to the Japanesecandlestick method for financial forecastingrdquo Expert Systemswith Applications vol 36 no 3 pp 5023ndash5030 2009

[14] M Jasemi A M Kimiagari and A Memariani ldquoA modernneural network model to do stock market timing on the basisof the ancient investment technique of Japanese CandlestickrdquoExpert Systems with Applications vol 38 no 4 pp 3884ndash38902011

[15] S Barak J H Dahooie and T Tichy ldquoWrapper ANFIS-ICAmethod to do stock market timing and feature selection on thebasis of JapaneseCandlestickrdquoExpert SystemswithApplicationsvol 42 no 23 pp 9221ndash9235 2015

[16] JH Fock C Klein andB Zwergel ldquoPerformance of candlestickanalysis on intraday futures datardquoThe Journal ofDerivatives vol13 no 1 pp 28ndash40 2005

[17] B R Marshall M R Young and R Cahan ldquoAre candlesticktechnical trading strategies profitable in the Japanese equitymarketrdquo Review of Quantitative Finance and Accounting vol31 no 2 pp 191ndash207 2008

[18] M J Horton ldquoStars crows and doji the use of candlesticksin stock selectionrdquo Quarterly Review of Economics and Financevol 49 no 2 pp 283ndash294 2009

[19] L J Chmielewski M Janowicz and A Orłowski ldquoPredictionof trend reversals in stock market by classification of Japanesecandlesticksrdquo in Proceedings of the 9th International ConferenceonComputer Recognition Systems (CORES rsquo15) vol 403 pp 641ndash647

[20] C-F Tsai and Z-Y Quan ldquoStock prediction by searching forsimilarities in candlestick chartsrdquo ACM Transactions on Man-agement Information Systems vol 5 no 2 Article ID 25916722014

[21] L Chmielewski M Janowicz J Kaleta and A Orłowski ldquoPat-tern recognition in the Japanese candlesticksrdquoAdvances in Intel-ligent Systems and Computing vol 342 pp 227ndash234 2015

[22] G L Morris and R Litchfield Candlestick Charting ExplainedTimeless Techniques for Trading Stocks and Futures McGraw-Hill New Yourk NY USA 2nd edition 2006

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 11: K-Line Patterns’ Predictive Power Analysis Using the Methods of …downloads.hindawi.com/journals/mpe/2017/3096917.pdf · 2019. 7. 30. · K-Line Patterns’ Predictive Power Analysis

Mathematical Problems in Engineering 11

[3] R D Edwards J Magee and W H C Bassetti TechnicalAnalysis of Stock Trends CRC Press Boca Raton Fla USA 10thedition 2012

[4] S Nison Japanese Candlestick Charting Techniques A Contem-porary Guide to the Ancient Investment Technique of the FarEast Institute of Finance New York NY USA 1991

[5] B R Marshall M R Young and L C Rose ldquoCandlesticktechnical trading strategies can they create value for investorsrdquoJournal of Banking and Finance vol 30 no 8 pp 2303ndash23232006

[6] G Caginalp and H Laurent ldquoThe predictive power of pricepatternsrdquoApplied Mathematical Finance vol 5 no 3-4 pp 181ndash205 1998

[7] K H Lee and G S Jo ldquoExpert system for predicting stockmarket timing using a candlestick chartrdquo Expert Systems withApplications vol 16 no 4 pp 357ndash364 1999

[8] M M Goswami C K Bhensdadia and A P Ganatra ldquoCan-dlestick analysis based short term prediction of stock pricefluctuation usingSOM-CBRrdquo in Proceedings of the 2009 IEEEInternational Advance Computing Conference (IACC rsquo09) pp1448ndash1452 March 2009

[9] T H Lu and J Chen Candlestick charting in European stockmarkets no 2 pp 20ndash25 2013

[10] T H Lu and Y M Shiu ldquoTests for two day candlestick patternsin the emerging equity market of Taiwanrdquo Emerging MarketsFinance amp Trade vol 48 no 1 pp 41ndash57 2014

[11] H LiWW Y Ng JW T Lee B Sun andD S Yeung ldquoQuanti-tative study on candlestick pattern for shenzhen stock marketrdquoin Proceedings of the 2008 IEEE International Conference onSystems Man and Cybernetics (SMC rsquo08) pp 54ndash59 October2008

[12] W Xiao WW Y Ng M Firth et al ldquoL-GEM basedMCS aidedcandlestick pattern investment strategy in the shenzhen stockmarketrdquo in Proceedings of the 2009 International Conference onMachine Learning and Cybernetics pp 243ndash248 July 2009

[13] T Kamo and C Dagli ldquoHybrid approach to the Japanesecandlestick method for financial forecastingrdquo Expert Systemswith Applications vol 36 no 3 pp 5023ndash5030 2009

[14] M Jasemi A M Kimiagari and A Memariani ldquoA modernneural network model to do stock market timing on the basisof the ancient investment technique of Japanese CandlestickrdquoExpert Systems with Applications vol 38 no 4 pp 3884ndash38902011

[15] S Barak J H Dahooie and T Tichy ldquoWrapper ANFIS-ICAmethod to do stock market timing and feature selection on thebasis of JapaneseCandlestickrdquoExpert SystemswithApplicationsvol 42 no 23 pp 9221ndash9235 2015

[16] JH Fock C Klein andB Zwergel ldquoPerformance of candlestickanalysis on intraday futures datardquoThe Journal ofDerivatives vol13 no 1 pp 28ndash40 2005

[17] B R Marshall M R Young and R Cahan ldquoAre candlesticktechnical trading strategies profitable in the Japanese equitymarketrdquo Review of Quantitative Finance and Accounting vol31 no 2 pp 191ndash207 2008

[18] M J Horton ldquoStars crows and doji the use of candlesticksin stock selectionrdquo Quarterly Review of Economics and Financevol 49 no 2 pp 283ndash294 2009

[19] L J Chmielewski M Janowicz and A Orłowski ldquoPredictionof trend reversals in stock market by classification of Japanesecandlesticksrdquo in Proceedings of the 9th International ConferenceonComputer Recognition Systems (CORES rsquo15) vol 403 pp 641ndash647

[20] C-F Tsai and Z-Y Quan ldquoStock prediction by searching forsimilarities in candlestick chartsrdquo ACM Transactions on Man-agement Information Systems vol 5 no 2 Article ID 25916722014

[21] L Chmielewski M Janowicz J Kaleta and A Orłowski ldquoPat-tern recognition in the Japanese candlesticksrdquoAdvances in Intel-ligent Systems and Computing vol 342 pp 227ndash234 2015

[22] G L Morris and R Litchfield Candlestick Charting ExplainedTimeless Techniques for Trading Stocks and Futures McGraw-Hill New Yourk NY USA 2nd edition 2006

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 12: K-Line Patterns’ Predictive Power Analysis Using the Methods of …downloads.hindawi.com/journals/mpe/2017/3096917.pdf · 2019. 7. 30. · K-Line Patterns’ Predictive Power Analysis

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 201

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of