
Upload: lilian-henbest

Post on 30-Mar-2015


Page 1:

ANHUI USTC iFLYTEK CO., LTD

TEL: +86-551-5331800

FAX: +86-551-5331801

WEBSITE: http://www.iflytek.com

© 2002 iFLYTEK. All rights reserved.

This presentation is for informational purposes only. iFLYTEK MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.

Page 2:

Network Speech Technology Industry in China

--Qiang Bai

ANHUI USTC iFLYTEK Co., LTD. (安徽中科大讯飞信息科技有限公司)

Page 3:

General introduction of the speech industry

Page 4:

General introduction of Chinese speech industry

• Chinese speech technology has made great progress over the past four years, alongside the boom in the Chinese economy. It is gradually taking a key position in the Pan-Asian market by becoming part of everyone's everyday life.

• Given its weak foundation and brief history, the Chinese speech technology market is still at an early stage. The total market is limited, and much work is needed to cultivate it.

• Research on Chinese speech synthesis started in the mid-1980s with support from the national government and is now driven by iFLYTEK. It has advanced rapidly, and synthesis quality now satisfies most practical applications.

DZHUANG: The 46% figure is my own estimate; there is no authoritative data source for it.
Page 5:

General introduction of Chinese speech industry

• The speech technology market in China is unusual. iFLYTEK holds an 82% market share and the dominant market position. Other suppliers focus on the lower-end market and are far behind iFLYTEK in both technology and market presence.

• Speech synthesis is the mainstream commercial application. The major customers come from a few industries: telecommunication call centers (value-added services) and financial services make up almost 70% of the present total market.

• The potential market for speech technology is huge and profitable.

Page 6:

Present situation of the technology

Page 7:

Challenges

• Chinese is special and complex. Text analysis and rhythm (prosody) analysis of Chinese are hard to conduct.

• There are no spaces between Chinese characters, so word boundaries are hard to define.

• One Chinese character may have different pronunciations in different contexts, and some pronunciations exist only in names.

• Special symbols must be read out according to Chinese reading habits.

• As a language with four tones, Chinese has very complex rhythm features, such as tone sandhi and r-coloring.

• Mandarin is rhythm based, and its markup system is Pinyin rather than the International Phonetic Alphabet.

• Building the grammar rules and the dictionary requires a systematic transcription framework to mark exceptional phenomena. It also requires experienced staff with an announcer background.

• How do we find the best pair of units to concatenate in a huge corpus? Based on intelligent text analysis, how do we predict the speech parameters that give the most fluent speech?

These are the challenges in Chinese speech synthesis.
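To make the word-boundary challenge concrete, here is a minimal sketch of greedy forward maximum matching, one classic dictionary-based way to segment unspaced Chinese text. It is an illustration only: the slides do not say which segmentation method iFLYTEK uses, and the function name `segment`, the `max_len` parameter and the toy `LEXICON` are assumptions made for this example.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching.

    At each position, take the longest lexicon entry that matches;
    fall back to a single character when nothing matches.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicon, for illustration only.
LEXICON = {"语音", "合成", "语音合成", "技术", "中国"}

print(segment("中国语音合成技术", LEXICON))
```

A greedy matcher like this cannot resolve genuinely ambiguous boundaries or unknown names, which is part of why the slide lists segmentation as a challenge.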

Page 8:

Innovation

• iFLYTEK has intelligent text analysis technology that addresses word segmentation, polyphonic characters, special symbols and rhythm-level questions.

• Using a large speech database (3000 sentences) together with human annotation, fluent speech can be generated through data-driven prosodic prediction and unit selection algorithms.

• With the latest HMM-based, fully data-driven method, fluent speech can be generated from a small speech database (500-1000).

• The pronunciation of a target speaker can be simulated with the MLLR-based voice conversion method.
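Unit selection, mentioned above, is commonly framed as a shortest-path search: pick one recorded unit per position so that the sum of a target cost (mismatch with the predicted prosody) and a join cost (roughness of each concatenation) is minimal. The sketch below shows that general idea with dynamic programming. It is not iFLYTEK's algorithm; the cost functions, the pitch-only features and the toy units are all assumptions chosen for illustration.

```python
def select_units(candidates, target_cost, join_cost):
    """Pick one unit per position, minimizing total target + join cost.

    candidates[t] lists the database units available for position t.
    target_cost(t, unit) scores mismatch with the predicted prosody at t.
    join_cost(prev_unit, unit) scores how badly two units concatenate.
    """
    # best[t][j] = (cheapest cost of a path ending in candidates[t][j], back-pointer)
    best = [[(target_cost(0, u), None) for u in candidates[0]]]
    for t in range(1, len(candidates)):
        row = []
        for u in candidates[t]:
            cost, back = min(
                (best[t - 1][k][0] + join_cost(p, u), k)
                for k, p in enumerate(candidates[t - 1])
            )
            row.append((cost + target_cost(t, u), back))
        best.append(row)

    # Trace back from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for t in range(len(candidates) - 1, -1, -1):
        path.append(candidates[t][j])
        j = best[t][j][1]
    return path[::-1], total


# Hypothetical toy data: each unit is (label, pitch in Hz).
CANDIDATES = [
    [("ni_a", 200), ("ni_b", 220)],
    [("hao_a", 180), ("hao_b", 230)],
]
TARGET_PITCH = [210, 200]  # assumed output of a prosody predictor


def target_cost(t, unit):
    return abs(unit[1] - TARGET_PITCH[t])


def join_cost(prev, unit):
    return 0.5 * abs(prev[1] - unit[1])


path, total = select_units(CANDIDATES, target_cost, join_cost)
print([label for label, _ in path], total)
```

Real systems use much richer features (spectrum, duration, context) and prune the candidate lists, but the search structure is the same.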

Page 9:

Innovation

• iFLYTEK chairs the Chinese national government's speech technology standards working group.

• iFLYTEK has completed the general technical specification for Mandarin speech synthesis systems. This specification has been released as a national standard.

• This standard laid the groundwork for the rapid growth of Chinese speech technology.

• The specification includes the CSSML markup language, which meets the need to synthesize characters with exceptional pronunciations.

Page 10:

Innovation

• iFLYTEK aspires to take part in setting the international SSML standard, which is a great support for the Chinese language.

• Building on the results of CSSML, iFLYTEK has offered the international SSML standard suggestions on how to improve its support for Chinese.

• The following are the latest updates for Chinese in the new SSML 1.1 version:

– A new element is defined for specifying word boundaries.

– The meaning of the phoneme element is extended to further support Chinese Pinyin markup.

– A mechanism is offered for defining names, especially Chinese names in which some characters change their pronunciation.

– A way to describe dialects is offered.
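As a rough illustration of how these SSML 1.1 features look in markup, the sketch below builds a small document with Python's standard `xml.etree.ElementTree`. It uses the `w` element for word boundaries, a `phoneme` element carrying a Pinyin reading for a surname whose character is polyphonic, and `xml:lang` for the language variety. The alphabet name `x-pinyin` is a hypothetical vendor-defined value (SSML reserves the `x-` prefix for such names), and the helper `build_ssml` is an assumption for this example; check the exact element nesting against the SSML 1.1 Recommendation before relying on it.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(words, lang="zh-CN"):
    """Build an SSML 1.1 string from (text, pinyin_or_None) pairs.

    Each pair becomes one <w> element, which marks a word boundary.
    When a Pinyin reading is given, the text is wrapped in <phoneme>.
    """
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.1", f"{{{XML_NS}}}lang": lang},
    )
    for text, pinyin in words:
        w = ET.SubElement(speak, f"{{{SSML_NS}}}w")
        if pinyin is None:
            w.text = text
        else:
            # "x-pinyin" is a hypothetical vendor-defined alphabet name.
            ph = ET.SubElement(
                w,
                f"{{{SSML_NS}}}phoneme",
                {"alphabet": "x-pinyin", "ph": pinyin},
            )
            ph.text = text
    return ET.tostring(speak, encoding="unicode")


# The character 单 is usually read "dan1" but is read "shan4" as a surname.
print(build_ssml([("单", "shan4"), ("老师", None)]))
```

Keeping the reading in markup rather than in the text lets the synthesizer's default text analysis handle everything else unchanged.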

Page 11:

Keep step forward

With 20 years of persistent effort, speech technology has achieved great progress.

Year          1995   1998   2001   2004
Naturalness   <3.0   3.0    3.8    4.3

Naturalness is the key quality measure of synthesized speech. A subjective scoring method, MOS (Mean Opinion Score), is used to express how similar the synthesized voice is to a human voice: 5 is the best and 1 is the worst.
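For readers unfamiliar with MOS, the score is simply the arithmetic mean of listeners' ratings on the 1 to 5 scale. A minimal sketch follows; the function name and the ratings are hypothetical and are not data from the presentation.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average listener ratings on the 1 (worst) to 5 (best) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(ratings)


# Hypothetical ratings from eight listeners for one synthesized sentence.
ratings = [4, 5, 4, 4, 5, 4, 4, 4]
print(round(mean_opinion_score(ratings), 2))
```

In practice scores are averaged over many sentences and listeners, which is why the table reports values such as 3.8 and 4.3 rather than whole numbers.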

Page 12:

Obstructive factors in the application of speech technology

Page 13:

Obstructive factors

• Human-operated services are still the mainstream because of cost.

• China is rich in labor resources, and human-staffed call centers are simple and cheap, so a full-scale speech technology solution is still far off.

Average annual salary of experienced call center workers in Asia (2006, $)

Country       Salary
India         3,334
China         2,558
Malaysia      5,442
Philippines   3,348
Thailand      3,656
Singapore     13,677

Source: callcentres.net

Page 14:

Obstructive factors

• Too many dialects make it harder to popularize speech technology.

• China is a multi-ethnic country with 56 ethnic groups in all. The Chinese language is complex: statistics show there are more than 3000 dialects in China. That is a great challenge for the development of Chinese speech technology.

– Given what speech technology can achieve today, practical results can be improved through customization, but this requires a thorough understanding of the customer's needs and purpose. iFLYTEK leads the speech technology market because we are experienced and have professional solutions and an excellent team.

– In 2004 iFLYTEK gained national support for a project to set up a database for Chinese dialect evaluation and recognition.

Page 15:

Obstructive factors

• Customers' acceptance of IT solutions needs improvement.

• Compared with developed countries, informatization in China is weak and simple, mostly because Chinese customers rely more on hardware than on software. Statistics show that overseas the ratio of hardware to software to service is 1:2:4, while in China the ratio is less than 4:2:1.

– This difference in the acceptance of software and hardware affects the living conditions of the Chinese software and service industry. It will take a long time for those conditions to improve and for attitudes to change.

• Customers' habits need to be cultivated.

– Because of the state of Chinese informatization, telecommunication reached Chinese residents only a few years ago, and human-operated services are the mainstream now. Self-service supported by speech technology is not as popular as in developed countries.

Page 16:

Obstructive factors

• Implementation capacity needs to improve.

• System integration (SI) companies are the main developers of speech technology applications in China. There are more than 900 such companies in China, without a unified focus or direction, so much energy is spent on duplicated development and unhealthy competition. The industry as a whole is not yet standardized and orderly.

– Speech technology is gradually maturing, but SI companies remain at a preliminary stage. Work such as requirements gathering, product design, testing, customization and customer cultivation is missing from project planning.

– As the leader in Chinese speech technology, iFLYTEK focuses on basic research, practical application development, customer cultivation, partner training and so on. It has become an effective guarantee of the success of speech technology.

Page 17:

Current situation of commercial application

Page 18:

Driving force of speech technology's application

• The top priority of North American companies is to satisfy customers' needs, helping customers find the information and services they need as quickly as possible. At the same time, they try to improve efficiency and reduce labor costs.

[Bar chart, vertical axis from 0% to 45%. Bars: Empowers callers to find information; Decreases calls handled by live agents; Reduces operating costs; Increases customer retention. Data labels: 44%, 36%, 34%, 16%.]

Source: Benchmark Portal, May 2005

Callouts: Find the information; Reduce the pressure; Cut the cost; Improve loyalty

Page 19:

Driving force of speech technology's application

• The top priority of Chinese companies is profit. They want to use speech technology to offer more information and value-added services, so in China the big customers all belong to the telecommunication industry.

Major goals for which customers deploy speech

Increase the income

To offer the information and service

Increase competitive edge

Efficiency

Source: iFLYTEK Survey

Page 20:

Driving force of speech technology's application

• The development of speech technology is driven by practical reasons. At the same time, we have found that Chinese customers' views on speech technology differ from those in the mature North American market.

[Two bar charts, each on a scale from 4 to 8.

First chart: Improve Customer Service; End User Productivity; Differentiate Products and Services; Lower Overall Costs; Create New Products and Services.

Second chart: Improve Customer Service; Generate Revenue; Differentiate Products & Services; End User Productivity; Provide Personalized Services.]

Source: NUANCE V-World Survey

Page 21:

Present Chinese speech technology market

• In major application industries such as telecommunication, finance, energy and transportation, iFLYTEK holds an 82% market share, representing the trend of the market.

• iFLYTEK has more than 800 partners in China and more than 6000 application cases in total. Every second, millions of customers get their information and services through iFLYTEK speech technology.

• In China there are many application cases in value-added services. Speech recognition technology is now being applied in major industries such as telecommunication, finance, energy and transportation.

Page 22:

Applications in all major industries

Government, Banking, Telecommunication, Sports, Securities, Transportation

Page 23:

Chinese speech recognition applications

Page 24:

National project

• Key project: "Multi-language information service for 2008 Olympic Games"

• iFLYTEK has been nominated as the only speech technology provider and has been named "Best participant" by the national 863 committee.

The major function of the service is information search by telephone.

Page 25:

Thanks!