[ieee 2009 international conference on e-business and information system security (ebiss) - wuhan,...

E-Business Access for Blinds: A Semantic Approach

Muhammad Nabeel Talib1, Cai Shuqin*1, Muhammad Abrar1, Muhammad Sheraz Shafiq2

1School of Management, Huazhong University of Science and Technology, Wuhan, 430074, China. 2Bristol Institute of Technology, University of the West of England, UK.


Abstract—Efforts have been made on voice-enabled navigation, but to use voice-enabled systems more effectively we need to exploit web semantics properly, especially from the perspective of blind users. Several companies have already introduced voice-enabled applications, but “Voice Enabled Browsing” remains largely unexplored. In this study we focus on the use of voice browsing for disabled people, with special emphasis on the blind. We also consider the limitations of the existing web and examine ways to extract information from web pages that contain a large amount of visual content. Global expansion in e-business volume will open new opportunities for blind people to take part in e-business, either as consumers or as entrepreneurs. The main impetus behind this effort is to enhance access to the existing web by exploring ways to make it more easily accessible through voice, specifically for the blind.

Keywords: voice browser; e-business; web content; grammar; numeric voice commands; hotlist; dictation mode

I. INTRODUCTION

Most web applications are developed to target the majority of users, and people with disabilities consequently find it hard to use these applications and their contents. These applications fail to make the web easily accessible for disabled people mainly because (a) contents do not meet accessibility standards (the Web Content Accessibility Guidelines), and (b) where a speech interface is used in accessibility applications, the speech recognition engines are inefficient. Waibel and Lee were among the pioneers in predicting the practical uses of voice technology, which can help the disabled with simple inquiries such as bank balances, flight schedules and phone call transfers. At a more advanced level, voice-enabled health systems, for instance in operating theatres, patient diagnosis and prescription, can make a huge difference for the handicapped [1]. “Indexation” can be a real help for blind users, as they face problems browsing across web pages. This paper proposes the idea of using “Numeric Voice Commands” for navigating a web page through voice, which can help blind users interact with the system more naturally. The full scope of voice-enabled systems is difficult to imagine, but speech-enabled mobile phones, navigational systems and automobiles would certainly make a difference in the lives of disabled people. Although the future will bring many technological advancements in this area, the pace of development remains slow. Section II introduces the problem with an overview of voice browsing. Section III reviews related work. Section IV presents an analysis of current solutions to the problem and suggestions to enhance the existing models. Section V concludes with a future outlook for web semantics, with special emphasis on the blind.

II. PROBLEM BACKGROUND

A. Problem Definition

The term ‘Web Accessibility’ refers to the creation and development of the web in such a way that web contents are easily accessible to everyone. According to the W3C (World Wide Web Consortium), Web Accessibility means developing the web in a way that allows disabled and older people to access and contribute to the web as any other person would [2]. The Web Accessibility Initiative (WAI), a group managed by the W3C, has defined standards for Web Accessibility to keep the web evolving in a single direction. According to WAI, Web Accessibility concerns every type of disability that affects access to the web, whether visual, auditory, physical, cognitive, neurological or speech-related [3]. One area that remains unexplored is e-business for the blind. Largely ignored by companies, this segment could become part of mainstream entrepreneurship once it is addressed, and could expand e-business considerably [4, 5, 6]. How to make the web more accessible is explained by the Web Content Accessibility Guidelines (WCAG), which are considered the universal principles of accessibility design for the web [2, 3].

The key points of these guidelines can be summarized as:

- Use of proper alternative text for the visual objects, e.g. for images, animations, hotspots, hypertext links, graphs and charts
- Use of proper captioning and description of multimedia objects, e.g. audio and video contents
- Use of consistent structure by using headings and lists
- Use of Cascading Style Sheets for layout and style
- Use of alternative content with scripts, applets or plug-ins
- Use of meaningful titles for frames, making line by line reading sensible for tables or separate summary text

The last step is to validate that the work is according to WCAG by using tools, checklists and guidelines.

These guidelines are written specifically for developers, but unfortunately they are largely not followed [2, 3]. As a result, it is almost impossible for blind users to access the web for daily use or for business purposes.
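One of the guideline checks listed above, alternative text for images, lends itself to a simple automated test. The following is a minimal sketch using only the Python standard library; the names `MissingAltChecker` and `images_without_alt` are illustrative and do not come from WCAG or from any validation tool. It assumes that an empty `alt=""` is intentional (a decorative image) and reports only images with no `alt` attribute at all.

```python
from html.parser import HTMLParser


class MissingAltChecker(HTMLParser):
    """Collects <img> tags that carry no alt attribute (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.missing = []  # src values of images lacking an alt attribute

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # Assumption: alt="" marks a decorative image and is not reported.
        if "alt" not in attributes:
            self.missing.append(attributes.get("src", "(no src)"))


def images_without_alt(html):
    """Return the src of every image in `html` that has no alt attribute."""
    checker = MissingAltChecker()
    checker.feed(html)
    return checker.missing


if __name__ == "__main__":
    page = '<p><img src="logo.png" alt="Company logo"><img src="chart.png"></p>'
    for src in images_without_alt(page):
        print("Missing alternative text:", src)
```

A check of this kind covers only one guideline; captions, heading structure and frame titles would each need their own test.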

978-1-4244-4589-9/09/$25.00 ©2009 IEEE

B. Existing Web

Web Content Accessibility Guidelines (WCAG 1.0) provide a foundation for developing web applications with the necessary accessibility support, but unfortunately the practical use of these guidelines is inadequate. Most websites do not meet even the basic standards provided by WCAG 1.0, which makes them inaccessible for disabled people. Developers should understand the practical problems faced by disabled users; Asakawa observes that developers try to comply with the Web Accessibility guidelines without understanding the practical requirements of disabled users [7, 8]. The National Library for the Blind (NLB) supports visually impaired people in the UK by providing them with access to libraries and information services. NLB estimated that there are two million people with a visual impairment in the UK alone. NLB stated in 2006 that technology had advanced enough to cope with Web Accessibility, yet according to a survey conducted by the Disability Rights Commission, 81% of the websites sampled failed to meet the basic Web Accessibility Initiative level [9]. When websites are built, most developers do not take into account the needs of users who are disabled, elderly or otherwise unable to use the web in the typical way. Page structures are complex and simple accessibility rules are not followed, so web contents are inaccessible to many users. In our opinion, Web Accessibility is important not only for disabled or elderly people but also for people who are new to computers or to the web. On this issue, the BBC reported that almost 97% of websites did not provide basic accessibility standards; the figure came from tests by the accessibility agency “Nomensa” on leading websites in five sectors (travel, retail, banking, government and media) across 20 countries [10]. This finding is not only shocking but also shows a lack of responsibility among developers, who ignore the needs of a whole range of people and thereby contradict the idea of Web Accessibility.

C. Voice Browsing

Access to the web is also necessary, as the World Wide Web (WWW) is becoming the ultimate source of information. The most complex development scenario for accessibility applications is addressing visual disabilities. A browser is a software application typically used to display the contents of an (HTML) web page, giving quick access to its links, buttons, text and pictures. An ideal voice browser is capable of two-way communication with the user through speech, as shown in Figure 1. Christian et al. suggest that a voice browser is at least capable of rendering web pages in audio format or of interpreting speech input for navigation [11]. “My web my way” is a web service offered by the BBC, which outlines methods to increase accessibility depending on the disability.

Figure 1. Conceptual components and their interaction diagram

The BBC has also outlined some of the software used to make the computer talk, e.g. Read Please, Browse Aloud and Read Speaker [12]. A voice browser not only allows the user to listen to what is on the screen; it also allows the user to utter specific commands, e.g. next page, skip, links on page. The user can also fill out forms and submit them, all by voice.

III. LITERATURE REVIEW

Asakawa et al. emphasize that blind people can easily access the web by using non-visual browsers such as voice browsers [13]. However, the web is becoming more difficult for blind users as more visual content is used in websites. Integrated use of scripts such as JavaScript and visual interfaces such as Flash makes it even more difficult to represent a page audibly for blind users. Gupta and Kaiser observed that ads, non-descriptive link text (e.g. ‘click here’, ‘see more’) and inaccessible forms on web pages are a major problem, which prevents developers from building a general approach to extract the useful contents from a web page [14]. Theofanos and Redish blame the excessive use of graphics and image links for preventing a solid solution for blind users [15, 16]. Oviatt found that long spoken input (long sentences) is more complicated and more prone to errors [17]. The inconsistency of web structure remains an obstacle to a universal solution, as it is impossible to change the whole web.

A. Web Architecture

Heading tags such as <H1> and <H2> are often not used.

Screen readers and voice browsers usually read the document from the top left to the bottom right corner, which makes it difficult for blind users to access a specific point on the page, especially when they are at the bottom [18]. Because the guidelines are not followed, the structure of documents is inconsistent and varies from site to site.

Some of the browsing functionalities are as follows:

- Use the existing web contents and make them accessible to as great an extent as possible
- Read the data (programmatically) by accessing the DOM (Document Object Model), using a standard browser (existing or new)
- Analyze the elements and create more appropriate partitioning (separate and rename elements to make them more meaningful)
- Convert the (ambiguous) data to an understandable (or reasonable) format
- Give the user the option to reach the elements of the web page of his or her own choice
- Filter out unnecessary data (e.g. graphical/structure elements)
- Represent alerts and errors through short sound alerts instead of long voice prompts
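The steps of reading the page programmatically, partitioning its elements and letting the user reach an element of choice can be sketched as a numbered index of links, in the spirit of the numeric voice commands proposed in this paper. This is only an illustration under stated assumptions: it parses static HTML with the Python standard library instead of a live browser DOM, and the names `LinkIndexer` and `numbered_links` are hypothetical.

```python
from html.parser import HTMLParser


class LinkIndexer(HTMLParser):
    """Builds an ordered list of (spoken label, href) pairs from a page."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (label, href) pairs, in page order
        self._href = None  # href of the <a> element currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self._href = attributes["href"]
            self._text = []
        elif tag == "img" and self._href is not None:
            # Image links: fall back to the alternative text, if present.
            self._text.append(attributes.get("alt") or "")

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            label = " ".join(" ".join(self._text).split()) or "unlabeled link"
            self.links.append((label, self._href))
            self._href = None


def numbered_links(html):
    """Return {number: (label, href)}, numbering links from 1 in page order."""
    indexer = LinkIndexer()
    indexer.feed(html)
    return dict(enumerate(indexer.links, start=1))


if __name__ == "__main__":
    page = '<a href="/cart">Cart</a> <a href="/home"><img src="h.png" alt="Home"></a>'
    for number, (label, href) in numbered_links(page).items():
        print(number, label, href)
```

With such an index a voice browser could read out "1, Cart; 2, Home" and accept the spoken number as the command, which keeps the active vocabulary small and independent of the link text.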

The structure can be built with a semantic approach that interprets the visual content (by scanning) as appropriately sequenced audio. Optical Character Recognition (OCR) technology can be used to convert an image if the image represents text or, more precisely, printed text.

B. Related Work

Charles et al. offered the idea of a flexible-vocabulary, dynamic-grammar speech interface by designing a speech user agent [19]. The user interacts with the system by speaking control words (such as scroll down or back), a speakable hotlist and links. In a speakable hotlist a grammar is associated with a URL; for example, ‘Weather in Wuhan’ should open http://www.bbc.com/weather/wuhan. The idea of smart pages, in which a grammar is defined, was introduced to implement this, and many alternative grammars can be introduced for a given link. An idea that started out to make the existing web useful thus ended in a list-dependent approach. This approach is infeasible when millions of web pages exist and a single website may be both a shopping site and a source of news content. Ramakrishnan et al. identified the actual problem with voice browsers and screen readers: they are now fully capable of reading the text on the screen (or the alternative text for images), but they are unable to convey the logical structure and semantics of the content in a web document [20]. In our opinion, a dynamic conversion approach is required instead of attempting to change the whole existing web, which is not possible. A universal solution is therefore required that uses existing browsers (or a new standard browser) to convert the existing web into a usable format to the greatest extent possible.
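The speakable hotlist described above is essentially a lookup from spoken phrases to URLs, where several alternative phrases may point to the same page. A minimal sketch follows; the `Hotlist` class and `normalize` helper are hypothetical names, plain string matching stands in for a real speech grammar, and the alternative phrase "Wuhan weather" is an assumed example. Only the phrase ‘Weather in Wuhan’ and its URL come from the text.

```python
def normalize(phrase):
    """Lower-case a phrase and collapse whitespace so lookups are forgiving."""
    return " ".join(phrase.lower().split())


class Hotlist:
    """Maps spoken phrases to URLs; several phrases may share one URL."""

    def __init__(self):
        self._entries = {}

    def add(self, url, *phrases):
        """Register one or more alternative phrases for the same URL."""
        for phrase in phrases:
            self._entries[normalize(phrase)] = url

    def resolve(self, utterance):
        """Return the URL for a recognized utterance, or None if unknown."""
        return self._entries.get(normalize(utterance))


if __name__ == "__main__":
    hotlist = Hotlist()
    hotlist.add("http://www.bbc.com/weather/wuhan", "Weather in Wuhan", "Wuhan weather")
    print(hotlist.resolve("weather in wuhan"))
    print(hotlist.resolve("sports news"))
```

The sketch also makes the limitation visible: every reachable page needs its own entry, which is why the list-dependent approach does not scale to the whole web.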

IV. ANALYSIS AND FINDINGS

Practical development shows that a speech software development kit cannot accept a long list of voice commands at any one time; the limit is around 400-500 commands for a numeric list, and fewer than about 70 commands for “natural text commands”. The program can be set to Dictation Mode on user request, where the user can add and/or alter the entered commands word by word, as described in Table I. The input voice passes through the voice recognition process and invokes an action in the browser only if it matches one of the voice commands available in the voice recognition software. A reasonable speech input option should be offered, so that the user can comfortably input data through speech and navigate between contents. Voice output is also one of the biggest setbacks, as it is sometimes strange and difficult to understand, but voices are now available on the market that sound very similar to normal speech and can be used for output in a voice browser.

TABLE I. COMMAND LIST IN DICTATION MODE FOR VOICE BROWSER

Sr | Type | Command | Purpose
1 | Text Entry | Enable Text Entry | To start a text input
2 | Text Entry | Disable Text Entry | To finish a text input
3 | Custom: Link Entry | Enable Link Entry | To start a link or URL entry input where space is not inserted
4 | Custom: Link Entry | Disable Link Entry | To finish a link or URL entry input
5 | Advanced Custom: Command | Enable Command | To enable Command Processing Mode
6 | Advanced Custom: Command | Disable Command | To disable Command Processing Mode
7 | Print | Print Page | Automatically sends the print of the current webpage to the default printer
8 | Refresh | Refresh Page | Refresh the current webpage
9 | Home | Home Page | Takes to the Home webpage (preset in Internet Explorer)
10 | Forward | Forward Page | Takes to the next page in memory (if visited and available in memory)
11 | Backward | Backward Page | Takes to the previous page in memory (if visited and available in memory)
Windows Commands
12 | Log off | Log off | Will force Windows to log off
13 | Restart | Restart | Will force Windows to restart
14 | Shutdown | Shutdown | Will force Windows to shut down
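The entry commands in Table I behave like a small state machine: an enable command opens an input mode, subsequent utterances are collected, and the matching disable command closes the mode and yields the input. The sketch below illustrates this for the text and link entry rows only. The class name `DictationSession`, the method `hear` and the starting mode are assumptions made for illustration, and the recognizer is replaced by plain strings; the only behavior taken from the table is that link entry joins words without spaces while text entry keeps them.

```python
class DictationSession:
    """Tracks the current entry mode and the words dictated in it."""

    def __init__(self):
        self.mode = "command"  # assumed starting mode
        self.buffer = []       # utterances collected in the current entry mode

    def hear(self, utterance):
        """Process one recognized utterance; return finished input or None."""
        phrase = " ".join(utterance.lower().split())
        if self.mode == "command":
            if phrase == "enable text entry":
                self.mode = "text"
            elif phrase == "enable link entry":
                self.mode = "link"
            return None
        if phrase == f"disable {self.mode} entry":
            # Text input keeps spaces between words; link input does not.
            separator = " " if self.mode == "text" else ""
            result = separator.join(self.buffer)
            self.mode, self.buffer = "command", []
            return result
        self.buffer.append(utterance.strip())
        return None


if __name__ == "__main__":
    session = DictationSession()
    session.hear("Enable Link Entry")
    for word in ["www", ".", "example", ".", "com"]:
        session.hear(word)
    print(session.hear("Disable Link Entry"))
```

The Enable Command and Disable Command rows, the page commands and the Windows commands are left out of the sketch; they would be dispatched from the command mode in the same way.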

To enhance the capabilities of voice browsers, a lot of work needs to be done, which would help to increase accessibility features through speech.

The optimum focus of future work can be:

- Including dynamic voice commands by introducing numeric hierarchies
- Introducing the whole range of document object model elements to offer comprehensive voice accessible browsing solutions
- Reading data from pictures (and graphics) using Optical Character Recognition (OCR) techniques and offering it through speech
- Reading webpage data and associating related data with each other, by introducing semantic relations
- Reading data from objects and scripts, e.g. Flash, JavaScript, VB Script, and converting it to an understandable format of speech
- Extracting and converting pop-ups, alerts and banners to useful data
- Handling windows through commands
- Making ‘settings’ changeable through voice
- Searching specific text within elements

Navigation between web contents, with help commands available at any stage within browsing mode to get the most information, can greatly increase the ease of access for blind users. A lot of work is required to improve voice recognition capabilities, which would widen the scope of speech-enabled systems. The proper and efficient use of numeric commands (hotlist) in Dictation Mode is also a major task still to be achieved. Dynamic grammar and improved recognition in dictation mode are among the open options to be explored in this field. Web Accessibility through speech can be greatly improved if applications, algorithms and web structure are developed in compliance with the WCAG guidelines. There is a need for tools that help developers build web pages and generate contents according to web accessibility guidelines, making web access more convenient for blind users.

V. CONCLUSION

Although a voice browser would support only a limited set of commands, speech technology can greatly increase accessibility for specific users. By enhancing speech recognition capabilities, more generic and automatic solutions can be created in the future, consequently increasing formal e-business access. Speech technology may offer an alternative to conventional input and output methods; its recent use in computer games supports this trend. Microsoft’s Speech Software Development Kit 5.1 provides an excellent opportunity to develop high quality speech applications, offering a large set of functions that are one step ahead of other tools. Only a limited set of commands can be offered for recognition at any one time. For a command (or sentence) to be recognized, it should not be longer than three words: if each command consists of only one word, nearly 60-85 commands can be offered; otherwise only 30-50 commands can be offered (see Table I). High quality voice recognition also depends on the noise level, the microphone used, the operating system and the computer hardware. Web access for the blind through voice has great potential in e-business as a marketing tool, and would increase not only the financial worth of a company but also its reputation for social responsibility. It would also reveal niche markets, opening new windows of opportunity in a previously untargeted e-business area. In enhancing accessibility features for the web, a voice browser also offers prospects for increasing the accessibility of Windows, its applications and its commands. Opening new horizons for voice access to operating systems and web-based technologies is a huge task, which requires more research, corroboration and support from the whole scientific community.

ACKNOWLEDGMENT

The authors gratefully acknowledge the support and inspiration of Dr. Qazi Mudassar Ilyas, Assistant Professor, Department of Computer Science, COMSATS Institute of Information Technology, Abbottabad, Pakistan.

REFERENCES

[1] A. Waibel and K.F. Lee, “Readings in speech recognition,” San Mateo, California: Morgan Kaufmann, pp. 1-3, 1990.

[2] WAI Group, W3C, Introduction to web accessibility [online]. Available from: http://www.w3.org/WAI/intro/accessibility.php [Accessed 11th November 2008].

[3] WAI Group, W3C, Web Content Accessibility Guidelines (WCAG) Overview [online]. Available from: http://www.w3.org/WAI/intro/wcag.php [Accessed 11th November 2008].

[4] B.H. Rudall and C.J.H. Mann, “Advances in the development of semantic e-business,” Kybernetes, vol. 35, no. 5, pp. 613-615, 2006.

[5] H. Matlay, “E-entrepreneurship and small e-business development: towards a comparative research agenda,” Journal of Small Business and Enterprise Development, vol. 11, no. 3, pp. 408-414, 2004.

[6] N. Bajgoric, “Information systems for e-business continuance: a systems approach,” Kybernetes, vol. 35, no. 5, pp. 628-648, 2006.

[7] C. Asakawa, “What's the web like if you can't see it,” ACM International Conference Proceeding Series, vol. 88, pp. 1-8, 2005.

[8] K. Fukuda, S. Saito, H. Takagi and C. Asakawa, “Proposing new metrics to evaluate web usability for the blind,” Conference on Human Factors in Computing Systems, CHI, pp. 1387-1390, 2005.

[9] National Library for the Blind (NLB), Accessibility Advice [online]. Available from: http://www.nlb-online.org/mod.php?mod=userpage&menu=61&page_id=371#content [Accessed 11th September 2008].

[10] BBC, Most websites failing disabled [online]. Available from: http://news.bbc.co.uk/2/hi/technology/6210068.stm [Accessed 28th September 2008].

[11] K. Christian, B. Kules, B. Shneiderman and A. Youssef, “Comparison of voice controlled and mouse controlled web browsing,” ACM SIGACCESS Conference on Assistive Technologies, pp. 72-76, 2000.

[12] BBC, My web my way [online]. Available from: http://www.bbc.co.uk/accessibility/win/seeing/talk/access/sub_3.shtml [Accessed 28th September 2008].

[13] C. Asakawa and T. Itoh, “User interface of a home page reader,” Proceedings of ACM ASSETS, ACM SIGACCESS Conference on Assistive Technologies, pp. 149-156, 1998.

[14] S. Gupta and G. Kaiser, “Extracting content from accessible web pages,” ACM International Conference Proceeding Series, vol. 88, pp. 26-30, 2005.

[15] M.F. Theofanos and G.J. Redish, “Guidelines for accessible and usable web sites: Observing users who work with screen readers,” Interactions, vol. 10, no. 6, pp. 36-51, 2003.

[16] M.F. Theofanos and G.J. Redish, “Guidelines for accessible and usable web sites: Observing users who work with screen readers,” Interactions, self-published version, Redish & Associates. Available from: http://www.redish.net/content/papers/interactions.html [Accessed 10th November 2008].

[17] S. Oviatt, “Interface techniques for minimizing disfluent input to spoken language systems,” Proceedings of ACM CHI, pp. 205-210, 1994.

[18] C. Kouroupetroglou, M. Salampasis and A. Manitsaris, “A semantic-web based framework for developing applications to improve accessibility in the WWW,” ACM International Conference Proceeding Series, Proceedings of the 2006 international cross-disciplinary workshop on Web accessibility (W4A): Building the mobile web: rediscovering accessibility, vol. 134, pp. 98-108, 2006.

[19] T. Charles, C.T. Hemphill and P.R. Thrift, “Surfing the web by voice,” International Multimedia Conference, Proceedings of the third ACM International conference on Multimedia, pp. 215-222, 1995.

[20] I.V. Ramakrishnan, A. Stent and G. Yang, “Usability and accessibility- hearsay-enabling audio browsing on hypertext content,” International World Wide Web Conference, Proceedings of the 13th international conference on World Wide Web, pp. 80-90, 2004.
