1 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
JIRKM | Journal of Information Retrieval and Knowledge
Management
Volume 1, 2011
The Journal Secretariat would like to express their heartfelt appreciation to all the reviewers and
contributors who were involved in publishing this journal.
Editorial Board
Tengku Muhammad Tengku Sembok (Prof., PhD),
Zainab Abu Bakar (Prof, PhD),
Nursuriati Jamil (Associate Professor, Dr )
Advisory Editors
Peter Bruza (QUT, AUSTRALIA)
Shalini R. Urs (ISIM,INDIA)
Executive Staff
Khirulnizam Abd Rahman
Published by
Society of Information Retrieval and Knowledge Management
Copyright © 2011 Persatuan Capaian Maklumat dan Pengurusan Pengetahuan - PECAMP
All rights reserved. No part of this publication may be reproduced in any form or any means
without prior permission from the copyright holder.
2 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Content
1. A CONCEPTUAL CONTENT ANALYSIS ON IT CAREER
OPPORTUNITIES IN MALAYSIA
Dayana Azahar, Norehan Abdul Manaf, Nalini Dharmarajan and Wan Adilah
Wan Adnan
2. KMS COMPONENTS FOR COLLABORATIVE SOFTWARE
MAINTENANCE - VALIDATING THE QUESTIONNAIRE ITEMS’
CONSTRUCT
Mohd Zali Mohd Nor, Rusli Abdullah, Masrah Azrifah Azmi Murad and Mohd
Hasan Selamat
3. AN EXPLORATORY FACTOR ANALYSIS ON INFORMATION
SECURITY RISK IN ICT OUTSOURCING
Nik Zulkarnaen Khidzir, Noor Habibah Arshad and Azlinah Mohamed
4. AUTOMATIC LEXICON GENERATOR: AN ALGORITHM
Kasturi Dewi Varathan, Tengku Mohd. Tengku Sembok and Rabiah Abdul
Kadir
5. KNOWLEDGE TRANSFER APPLYING THE STRUCTURAL
COMPLEXITY MANAGEMENT APPROACH
Maik Maurer
6. TAXONOMY OF K-PORTAL FOR INSTITUTE OF HIGHER LEARNING
IN MALAYSIA: A CASE STUDY OF UNIVERSITI SELANGOR (UNISEL)
Haslinda Sutan Ahmad Nawi, Shireen Muhammad, Nur Syufiza Ahmad Shukor
and Mohd Lezam Lehat
3
15
28
46
56
70
3 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
A CONCEPTUAL CONTENT ANALYSIS ON IT CAREER
OPPORTUNITIES IN MALAYSIA
Dayana Azahar, Norehan Abdul Manaf, Nalini Dharmarajan, Wan Adilah Wan Adnan
Faculty of Computer and Mathematical Sciences
Universiti Teknologi MARA
Selangor, Malaysia
[email protected] , [email protected],
[email protected], [email protected]
Abstract— The knowledge and skills of IT workforce are important in the coordination,
communication and information processing of the organization so that the organization can be
run in a systematic and organized manner. As IT evolves and becomes cheaper and more
powerful, there are abundant opportunities for skilled and qualified IT workers in organizations.
This paper analyses the job opportunities, job skills, qualifications and working experiences that
are sought from IT applicants. Conceptual content analysis was applied in this survey whereby
newspaper advertisements that appeared in the New Straits Times between the years of 2000 and
2008 were analyzed. The results of the analysis for these nine years show several changes. The
findings show that job position for IT management category has the highest percentage of
advertisements while web programming has dominated in job skill requirement. Degree holders
are mainly sought after in the job market whereas their experience is not specified in most of the
advertisements.
Keywords- content analysis; IT workforce; IT career; IT skills
1. INTRODUCTION
Information Technology (IT) or Information Communication Technology (ICT) has proved its
importance in today’s world especially in organizations where the knowledge and skills of IT
professionals are considered necessary to run an organization in a systematic and organized
manner. IT has a great effect in the coordination, communication and information processing of
the organization in which the use of technology reduces the costs and time of the whole process.
As technology becomes cheaper and more powerful, organizations will be more willing to
implement IT in the organization. This implementation will further result in the increase of job
opportunities in IT areas.
However, IT has undergone lots of changes from the day when it was introduced till the
present where technology has become ubiquitous to our daily life. As technology changes, the
economy also changes. To adapt to the economic changes, organization also should redesign the
technical and the process sides in order to stay competitive. Labor demand is one of the aspects
that will respond to these changes. The skill, knowledge and ability of a worker that are needed
by an organization depend on the evolution of technology or in this context, specifically
information technology. The rapid evolvement of technology also promises that new skill and
4 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
knowledge will be demanded by the labor market. This analysis is about the current demand of
IT labor market including the job title, skill and other requirements needed.
2. THE CONCEPT OF IT WORKFORCE AND IT SECTOR
There are a few studies that have been done that tries to give meaning to IT workforce and IT
sector. One of the research is by Kuh (1999) in which it was stated that there are three
approaches that may be taken to define IT sector that are:
a) By Industry-based–What industries actually make out the IT industry as a whole? IT
industry could consists of the businesses that involve directly with the production of IT
products or services that is the core IT-based companies and also could consist of other
businesses that support the use of IT but not directly involved in creating, developing,
maintaining and managing information technology.
b) By Occupation–More workers claim that they are employed in IT occupation rather than in
IT industries. As other industries also need the use of information technology and have the
IT departments of their own, the number of IT workforce could not be looked on the
industry side only. Also according to Ellis and Lowell (1999), the automation of business
process has circulated throughout the economy and IT workers are found everywhere, not
just in the IT industry.
c) By Education–This final approach is by counting people who have been educated in the core
IT fields. This approach is not accurate because not all IT graduates are working in IT
industries and not all workers in IT industries had received degree in IT.
However, Kaarst-Brown and Guzman (2005) have found a few difficulties in defining the
workforce and finally come out with “IT workforce should be treated as a general occupational
work with several specialties that frequently require the acquisition of new IT skills or
knowledge from a variety of formal and less formal sources”.
Technologies have shown rapid changes. The roles of IT workforce should also changes
comparable with the new emerging technology and knowledge. Guzman and Kaarst-Brown
(2004) cited Niederman and Trower (1993) in their research on the role of IT workforce saying
that the role of IT professionals were shaped by different industries. There are three views
provided to define the roles of IT professionals:
a) Agency Theory View – This theory view IT workforce role as can reduce the cost of internal
management that will further affect the outcomes. IT professionals are responsible to
configure and setup technology for better control and monitor of organization. The IS
application should also be integrate to support supervision.
b) Transaction Cost Theory – In this theory, IT professionals are responsible to make sure the
information system provided can facilitate transactions effectively. The roles also include
selecting the right IT that can reduce the cost of transactions, making negotiations under
outsourcing conditions and maintaining and administer the outsourced IS application.
c) Institutional Theory View – From the view of Institutional Theory, IT professionals’ role is
to use IT to enable adaptation to structure and to maintain authenticity in the environment.
Other than that, IT professionals also accountable to implement IT that supports Institutional
authenticity and conformity maintain the standard and formats of the institution and also to
integrate IS applications.
5 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
d) The previous explanation shown that there is no universal definition for who are actually can
be called as IT workforce claimed Wright (2009).
2.1 IT Career in Malaysia
Careers in IT or ICT in Malaysia started as the first usage of computers in the 70’s. In the
beginning, it was known as the computer field, before it was renamed to information technology
field (IT). The job title of IT professionals also changed over time and as technology developed.
The first job title for IT professionals in the government sector was “Juruanalisa Rangka” or
Framework Analyst. Then the title of System Analyst was adopted and now, the job title for IT
professionals in the government sector is “Pegawai Teknologi Maklumat” (Information
Technology Officer) wrote Ali (2008).
Figures obtained from the Department of Statistics, Malaysia, shows the number of ICT
related workers in Malaysia during the years 2000, 2001 and 2002 cited by Kuppusamy, Raman,
& Lee (2009). The majority of workers are employed in software consultancy and supply as
shown in Table 1 below.
TABLE 1: THE NUMBER OF ICT-RELATED WORKERS IN MALAYSIA ACCORDING TO CATEGORIES
ICT Category 2000 2001 2002
Hardware
consultancy services 2,689 1,616 1,471
Software
consultancy and
supply
7,642 10,220 13,799
Data processing
services 809 831 2,645
Database activities 472 253 683
Maintenance and
repair of computers 628 621 1,430
2.2 The Demand for IT Workers
Most of the research that has been done previously show that the demands for IT workers
generally are increasing. Veneri (1998) states that the main contributing factor is as the
computing power increase, the price and cost of technology will fall, and further will result in
many firms and organizations seeking the use of information technology in their organizations.
As the number of organizations using IT increased, the numbers of IT workers needed in order to
design, develop, implement, support and manage the technology have also increased. Another
reason is the inventions of IT, after many years of playing out and involving many firms, has
constituted technical change that also affect the labor demand. Bresnahan, Brynjolfsson, and
Hitt (2000) pointed out that the continuation changes of IT, on the other hand, contribute to the
lasting shift in IT labor demand.
Computer-based job-market also said to be grown much faster than average for all
occupations. This is because organizations had continuously implemented and integrate latest
6 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
technology according to the U.S. Department of Labor: Bureau of Labor Statistics (2004).
Technology had changes in rapid pace and so do organizations that are using IT. Thus, the
demand for IT workforce had keep on increasing. As businesses integrate new technology, IT
and non-IT companies are now competing for the same workers wrote Veneri (1998). The
research done by Zwieg, Kaiser, Beath, Bullen, Gallagher, Goles et al. (2006) has shown that the
overall result of their research indicate that instead of downsizing the IT workforces, most IT
departments are growing up the quality and quantity of those workers.
The trend of employment in IT industry will continue to experience changing as new
technology will emerge which will bring together new skills and new job opportunities,
eliminating the old ones. However, because of the widespread use of IT across all industries, the
demand for IT workforce is growing in all industries, not just in those industries that are
considered high-tech according to Veneri (1998). Autor, Katz, and Krueger (1998) meanwhile
conclude that the spread of computer used not only affect the demand for computer users and
technicians, but also affect the demand for employee with various skills as technological change
altered the work organization as well.
3. METHODOLOGY
This study adopts conceptual content analysis based on proposed model by Busch, De Maret,
Flynn, Kellum, Meyers, et al. (2005) which consists of 8 category coding steps. Busch et al.
(2005) define the content analysis as a tool that is use to identify the presence of certain words
within texts or set of texts. Gallivan, Truex III, and Kvasny (2004) specify content analysis as “a
technique used for making replicable and valid inferences from data to their context”. The focus
of content analysis is to analyze the existence and frequency of concepts under study which is the
form of words. The model from Busch et al. (2005) has proposed eight category coding steps
according to Carley (1993). The eight steps are:
a) Decide the level of analysis
b) Decide how many concepts to code for.
c) Decide whether to code for existence or frequency of a concept.
d) Decide on how the concepts will be distinguished.
e) Develop rules for text coding.
f) Decide what to do with “irrelevant” information.
g) Code the texts.
h) Analyze results.
There are two categories of content analysis method which are conceptual analysis and
relational analysis. Conceptual analysis is the most regular type of content analysis used. It
demands the researcher to analyze the existence and frequency of certain concepts, usually in the
form of words. Relational analysis on the other hand, is an analysis to examine the relationships
among concepts in texts.
This project employs the use of conceptual analysis of job advertisements in newspapers.
According to Litecky, Prabhakar, and Arnett (2006), this method had been used since the early
1990’s when the researcher have sampled job advertisements systematically. Litecky et al.
(2006) state that at least 80% of job seekers use job advertisements to search for available
position. It is the simplest way of scanning for latest job available. On the employer side, job
7 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
advertisement in newspapers ensured that the advertised position can reach wide populations .
Gallivan et al. (2004) had also applied content analysis research method in IT area to find the
changing pattern in IT skill sets from year 1988 till 2003.
In this study, the concepts are IT job position, IT job skills, IT job qualifications and IT
working experience.
3.1 Data Source
After lots of time spent surveying for the suitable newspaper, it has been decided that The New
Straits Times is the appropriate source. The New Straits Times is the oldest English-language
newspaper in Malaysia. It was established on July 15, 1845. Because of the long history, The
New Straits Times have a huge archive that records every event and news in this part of the
world (New Straits Times 2008). The newspapers that are going to be examined are Saturday
publications of New Straits Times for every week of the months during the years of 2000-2008.
The newspapers are retrieved from Malaysia National Library. Some of it is in the form of
hardcopy that is in the form of full newspaper while some of it has been compressed to
microfilm form. The reasons for choosing the above criteria are:
The New Straits Times have a specific column for job advertisements by several
companies including multinational companies. The column is called Appointments.
The Appointments column for Saturday publications are a compiled form of job
advertised throughout the week. This means, all the jobs that have been advertised along
the week will be compiled and advertised once again on Saturday publications. Because
of that, the job advertised for the week can be examined just by checking the Saturday
issues.
The Appointments columns are compiled by NSTP Press alongside with CompuTimes.
CompuTimes is the leading publication of information technology (IT) in Malaysia. It
published IT-related news from local and foreign IT industry every week and was aimed
for groups that follow the IT industry progress especially IT professionals and IT
students.
3.2 Data Collection
The data collection process is done by examining the classified advertisements and code for any
IT job positions found. Other than the job position, the skills, qualifications and experience that
are also mentioned in those advertisements are coded as well. The targeted issues for the coding
are issues that were published every Saturday of the week from the first week of January 2000
until the last week of September 2008. The advertisements for the week after that are not coded
because the format of the advertisement has been changed. Even so, some of the issues were not
in the possession of Malaysia National Library during the process of data collection. Therefore,
the target of 456 data cannot be obtained. The amount of issues that could not be obtained is 4,
making the total of data obtained as 452.
The number of job position advertised per issue varies. Table 2 shows the number of issues
examined and the number of job positions advertised. The number of issues for each year is not
8 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
consistent as some of the issues could not be found during the data collection process and for the
year 2008, the data were collected until the month of September.
TABLE 2: THE NUMBER OF ISSUES AND THE NUMBER OF JOB POSITIONS ADVERTISED ACCORDING
TO YEAR
Year Number
of issues
Number of
job position
advertised
Average
number of
job per
issue
2000 53 248 4.68
2001 52 216 4.15
2002 52 178 3.42
2003 52 108 2.08
2004 52 276 5.31
2005 53 208 3.92
2006 52 166 3.19
2007 48 178 3.71
2008 38 89 2.34
Total 452 1667 3.69
Table 2 shows the total number of job positions that have been coded that are a total of 1667
job positions in 452 issues. The job positions advertised here only included job positions that are
considered in IT areas and not for the whole advertisements. The average number of jobs
advertised along 2000 until September 2008 is as much as 3.69. However, the average number
of job positions advertised per issue according to year varies. Year 2004 indicates the highest
average number of jobs advertise per issue that is 5.31. Meanwhile, 2003 have been the year
with lowest number of jobs position advertise per issue that is 2.08. The other years shows the
average ranging from 2.34 to 4.68. However, the average number of job advertised since 2004
has declined.
4. RESULT
4.1 Changes in Job Position
The job positions are coded according to category of IT job position established by Bureau Labor
of Statistics (BLS), United States. The job categories are; computer programmers, system
analysts, computer engineers, database administrators, computer support specialists, and other
computer scientists. However, due to many IT management jobs that were advertised for IT
graduates, the researcher feels that there is a need to also include IT management as one of the
category (see Table 3).
9 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
TABLE 3: THE NUMBER OF JOB POSITION ADVERTISED BY CATEGORY
Job Position Category 2000 2001 2002 2003 2004 2005 2006 2007 2008 Tota
l
Computer Programmers 34 24 34 12 31 20 12 32 10 209
System Analysts 52 32 26 22 39 26 19 19 5 240
Computer Engineers 22 12 18 4 33 22 31 20 17 179
Database Administrators 2 10 2 2 9 4 9 15 1 54
Computer Support
Specialists
24 18 22 12 17 20 19 26 15 173
Other Computer
Scientists
68 52 22 26 73 41 34 30 19 365
Management 46 68 54 30 74 75 42 36 22 447
Total 248 216 178 108 276 208 166 178 89 1667
FIGURE 1: THE PERCENTAGE OF JOB POSITION ADVERTISED BY CATEGORY
Figure 1 presents the percentage of job opportunity according to categories and years. There
seems to be no definite trends among those job categories. The reason would be because the
time interval of the classified advertisements that were examined is short. Between the year of
2000 and 2008, there are no radical impacts that can be seen in IT work sectors.
0%
5%
10%
15%
20%
25%
30%
35%
40%
Computer programmers System Analysts
Computer Engineers Database Administrators
Computer Support Specialists Other Computer Scientists
Management
10 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
The job that is continuously in demand between 2000 and 2008 is the management categories.
The percentages change radically between the year 2000 and 2001 in which the increase in
percentage is almost 13%. The stable percentages then were maintained until 2008 with the
percentages ranging from 20.22% in 2008 and 36.06% in 2005. As the organizations
implements more IT technology to the firm’s process, more IT works that are related to
management will be created. Unconsciously, the management works also consists of some
technical jobs although not directly.
However, database administrators’ category seems to be lack of demand. Starting with 0.81%
in 2000, the category is experiencing up and down demand throughout 2001 to 2008. The
highest percentage was 8.43% in 2007 while the year with the least percentage is during 2000
with 2 jobs positions. However, in 2008, this category has 1.12% with 1 job position only. The
difference is because the total number of all job categories in 2008 is lower than those in 2000.
If looked at the number of job positions advertised, the number only ranging from 1 job position
to 15 jobs positions for database administrators. Unlike management category that has wide
range of jobs positions included in it, the jobs positions for database administrators are narrow.
Database administrators consist of workforce that set up, test and coordinate changes to
databases. Besides that, databases administrators also responsible in determining ways to
organize and store data.
Another job category that has to be discussed here is system analysts’ category. Figure 1
shows that the job category has experiencing constant decrease in demand. The percentage of
system analysts’ job opportunities in the year of 2000 is quite high, that is 20.97% but during the
year of 2008, the job positions for system analysts have become 5.62%. The demand for system
analysts were first decline during 2000, 2001, and 2002. But the year 2003 bring back the
demand for system analysts. The increase between 2002 and 2003 is almost 6%. However, the
increase does not last long. The demand for system analysts were gradually decline until what
was left in 2008 is only as much as 5.62%.
4.2 Changes in IT Job Skills, Qualifications and Working Experiences
The IT job skills in this project are coded using the queries that have been done by Prabhakar,
Litecky, and Arnett (2005). The queries are done to search which skills or programming
languages belongs to which groups. The coding however were added with one more group that
is networking. The coding of job skill only focus on the IT technical side.
There are 15 groups of IT skills that were coded in Table 4. The total number of 2248 IT
skills were mentioned in 1667 jobs advertised during year 2000-2008. There are some job
advertisements mentioned IT skills that are too general such as programming skills and database
knowledge. Those general skills were not coded by the researcher.
11 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
TABLE 4: THE NUMBER OF IT SKILLS ADVERTISED BY CATEGORY
IT Skill
Category
2000 2001 2002 2003 2004 2005 2006 2007 2008 Total
Web
Programming
46 52 48 60 52 36 36 22 15 367
Java 9 26 16 8 20 18 18 26 6 147
C/C++ 18 36 24 14 28 19 12 15 11 177
Unix 20 22 26 14 30 19 10 21 10 172
Windows 36 2 36 4 26 19 16 22 23 184
Oracle 15 42 12 12 19 13 6 16 7 142
SQL
Programming
15 22 18 26 21 21 6 20 8 157
SQL Server 8 3 18 2 20 7 12 17 5 92
Visual Basic 20 18 24 26 12 21 12 22 6 161
.NET
Development
10 3 3 2 6 3 8 9 1 45
ERP 11 16 14 10 43 14 24 33 14 179
Linux 1 2 6 12 17 8 12 16 3 77
e-Commerce
Server
0 26 0 0 0 2 0 1 0 29
Certification 0 4 2 2 7 11 17 19 7 69
Networking 29 32 40 12 49 27 34 10 17 250
Total 238 306 287 204 350 238 223 269 133 2248
From Figure 2, a few analyses of six IT skills that were most advertised can be done. For seven
consecutive years, Web Programming has dominated the highest percentages of job skills with
the wide range of popular programming languages included in this group such as HTML, XML,
ASP, JavaScript etc. Web programming skills have been mentioned for 367 times.
Nevertheless, for 2007 and 2008, the top percentage is taken by ERP in 2007 and Windows for
2008. The percentage of demand for web programming also has gradually decline. The year
2003 noted the highest percentage of demand for web programming skills that are as much as
29.41%. Since that, web programming has not observed to have more than 20%. Instead, in
2007, the percentage demand for web programming skill only observed to be 8.18%.
12 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
FIGURE 2: SIX IT SKILLS THAT WERE MOST ADVERTISED ACCORDING TO
CATEGORY
Another noticeable change happens with ERP skills. ERP skills consist of skill such as SAP,
ERP, PeopleSoft, JD Edwards or Oracle Financials. The skills started off with 4.62% in 2000
and have been staying around this number for the next three years. However, in 2004, ERP skills
have increase significantly, resulting in 12.29% during that year. The skills then decline to
5.88% in 2005 but later on increase to the percentage number as such that had happened in 2004.
This ERP skill demand can be associated with the implementation of ERP system in lots of firm
and the growth number of ERP consulting firm that require workers that possess this skill.
Besides that, demand for Linux skill has also increased. The slow increase of Linux started
with 0.42% and 0.65% in 2000 and 2001 has become 2.09% in 2003. The demand for Linux
skill has experiencing stable growth since that. Lastly, it was found that certifications have
become increasingly demanded in applicants nowadays. Among the certifications that are most
demanded are such as MCSE, CCNA, CCNP and IBM. The highest demand was on 2006
(7.62%) and has been hovering around that percentages since that.
As expected, for qualifications, the number of bachelor degree holder is the highest
qualification needed followed by job that require diploma/degree holders. However, at the same
time, the number of advertisements that did not specify the qualifications of the applicants are
also in large proportion. The advertisements that did not specify applicants’ qualification can
represent any of the qualifications above. The applicants with higher qualifications than
bachelor degree seems not in large number.
Large proportions of advertisements did not specify the number of working experiences that
the firm needed. It is put under “did not mention” category because it is difficult to determine
whether the firm need applicants without working experience or the firm simply did not bother to
0%
5%
10%
15%
20%
25%
30%
35%
20
00
20
01
20
02
20
03
20
04
20
05
20
06
20
07
20
08
Web programming Windows
ERP Linux
Certification Networking
13 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
specify it. The advertisements for applicants with 1-2 years of experience and 3-5 years
experiences are most sought by employer. While the applicants with more than 8 years
experiences has less demand. Demand for fresh graduates however, is in the middle with 43
advertisements out of a total of 1667 job positions.
5. CONCLUSION
This study adopts conceptual content analysis using job advertisements in the New Straits Times
from year 2000 till 2008. The findings from this study provides an insight on the trends of IT job
markets in Malaysia regarding job positions, job skills, qualifications and experiences. The
findings show that job position for IT management category has the highest percentage of
advertisements while web programming has dominated in IT skill requirement. Degree holders
are mainly sought after in the job market whereas their experience is not specified in most of the
advertisements. The development and promotion of the Multimedia Super Corridor (MSC) may
contribute to the changing trends in the IT job market. However , further research need to be
conducted using data from other mainstream media as well as online job search portals to further
strengthen the analysis in order to give a comprehensive view.
REFERENCES
Ali, M. (2008). Kerjaya dalam Bidang ICT, Kuala Lumpur: PTS Islamika Sdn Bhd. Autor, D. H., Katz, L. F., & Krueger, A. B. (1998). Computing inequality: Have computers
changed the labor market? Quarterly Journal of Economics, p1169-1213. Bresnahan, T. F., Brynjolfsson, E., and Hitt, L. M. (2000). Information technology, workplace
organization, and the demand for skilled labor: Firm level evidence. p1-43.
Busch, C., De Maret, P. S., Flynn, T., Kellum, R., Le, S., Meyers, B., et al. (2005). Content
analysis. Retrieved April 29, 2009, from Writing@CSU:
http://writing.colostate.edu/guides/research/content/printformat.cfm?printformat=yes
Carley, K. (1993). Coding choices for textual analysis: A comparison of content analysis and map
analysis. Social Methodology, Vol.23, 75-126.
Ellis, R., and Lowell, B. L. (1999). Assessing the demand for information technology workers. IT
Workforce Data Project: Report IV , 1999, p1-4. Gallivan, M. J., Truex III, D. P., & Kvasny, L. (2004). Changing pattern in IT skill sets 1988-
2003: A content analysis of classified advertising. The DATA BASE for advances in information systems, p64-87.
Guzman, I. R., and Kaarst-Brown, M. L. (2004). Organizational survival and allignment: Insight
into conflicting perspectives on the role of the IT professional. SIGMIS '04.
Kaarst-Brown, M. L., and Guzman, I. R. (2005). Who is "the IT Workforce"? Challenges facing
policy makers, educators, management and research. SIGMIS-CPR '05, p1-8.
Kuh, C. V. (1999). InformationTechnology Workers in the knowledge-based economy, National
Research Council, Washington DC, p65-76.
Kuppusamy, M., Raman, M., & Lee, G. (2009). Whose ICT investment matters to economic
growth: Private or public? The Malaysian perspective. The Electronic Journal on Information
Systems in Developing Countries, 1-18. Litecky, C., Prabhakar, B., & Arnett, K. (2006). The IT/IS job market: A longitudinal perspective.
SIGMIS-CPR '06, p50-52.
14 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
New Straits Times (2008). About Us. Retrieved April 29, 2009, from The New Straits Times Online: http://www.nst.com.my/Current_News/NST/corpabout.htm
Niederman, F and Trower, J. (1993). Industry influence on IS personnel and roles. ACM Special
Interest Group on Computer Personnel Research. ACM Press, New York, pp. 226-233.
Prabhakar, B., Litecky, C. R., & Arnett, K. (2005). IT skills in a tough job market.
Communications of the ACM, p91-94. U.S. Department of Labor: Bureau of Labor Statistics. (2004, May 18). Occupational outlook
handbook: Computer system analysts, database administrators and computer scientists. Retrieved July 13, 2004, from U.S. Department of Labor: Bureau of Labor Statistics: http://stats.bls.gov/home.htm
Veneri, C. M. (1998). Here today, jobs of tomorrow: Opportunities in information technology.
Occupational Outlook Quarterly, p1-15.
Wright, B. (2009). Employment, trends, and training in information technology, Occupational
Outlook Quarterly , p34-41. Zwieg, P., Kaiser, K. M., Beath, C. M., Bullen, C., Gallagher, K. P., Goles, T., et al. (2006). The
information technology workforce: Trends and implications 2005-2008. MIS Quarterly Executive, p101-108.
15 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
KMS COMPONENTS FOR COLLABORATIVE SOFTWARE
MAINTENANCE - VALIDATING THE QUESTIONNAIRE ITEMS’
CONSTRUCT
Mohd Zali Mohd Nor, Rusli Abdullah, Masrah Azrifah Azmi Murad and
Mohd Hasan Selamat
Universiti Putra Malaysia, Serdang, Selangor, 40150
Abstract. Software maintenance (SM) environment is highly complex, knowledge-driven
and collaborative environment. Therefore, a Knowledge Management System (KMS) is
critical to ensure various parties have the necessary information to perform SM activities. In
an effort to model the requirements for sharing and sustaining knowledge in SM
environment, we review literatures on Knowledge Management (KM), KMS and SM
frameworks to identify the knowledge components, tools and technologies. An initial model
of KMS components in collaborative SM is proposed, to be verified vis-à-vis a questionnaire
survey. Before the questionnaires are sent out, the questionnaire items are validated to
ensure the item constructs are acceptable and could be used as a tool to determine the
important KMS components in collaborative SM. A pilot study was conducted to identify
misfit questionnaire items and respondents. Due to ordinal Likert scale data, small sample
size, and some missing data, Rasch Measurement method is used to analyze the pilot data.
Item reliability is found to be poor and a few respondents and items are identified as misfits
with distorted measurements. As a result, some problematic questions are revised and some
predictably easy questions are excluded from the questionnaire.
Keywords: Knowledge Management System, Software Maintenance, Collaborative
Environment.
1. Introduction
Knowledge Management System (KMS) is critical to software maintenance (SM) due to highly
complex, knowledge-driven and collaborative environment. Software maintenance (SM) has yet
to receive proper attention (Pigoski, 2004). It is a costly process, where previous works estimated
SM costs of between 60% to 90% of total software life cycle costs (Fjeldstad & Hamlen, 1998;
Lientz & Swanson, 1981; Pigoski, 2004; Schach et al., 2003. SM organizations often have
problems identifying resources and use of knowledge (Rodriguez et al., 2004). Some of these
skills and expertise are documented as explicit knowledge, but more are hidden as tacit
knowledge due scarcity of documentation (Santos et al., 2005). Therefore, maintainers have to
collaborate with colleagues and other parties to obtain various information to enable them to
carry out their software maintenance tasks.
In addition, domain knowledge are becoming more important to software maintainers but are
seldom stored in KMS or other electronic means. Often, maintainers have to rely on experts and
also codes to understand the details (Das et al., 2007; Wilson, 2002). While domain knowledge
are important to maintainers, they are lacking, not stored properly or not readily available
(Santos et al., 2005). As such, maintainers, especially newcomers, spend a lot of time searching,
collaborating and understanding these knowledge. Many SM tools are still not integrated to
allow seamless information combination, which hampers information acquisition and sharing.
16 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Managing knowledge in this area is therefore critical to ensure that maintainers can perform SM
activities properly and timely, by sharing and obtaining vital knowledge. To address the above
issues of knowledge within SM environment, a KMS framework shall be proposed. KMS
framework is required to ascertain that KM requirements are fulfilled, includes the necessary
conceptual levels and support integration of the individual, team and organizational perspectives
(Ulrich, 2002).
Software maintenance (SM) is defined as “The totality of activities required to provide cost-
effective support to software system. Activities are performed during the pre-delivery stage as
well as the post-delivery stage” (IEEE, 2006). The knowledge flow, like a river, starts from
several streams or sources (call tickets, bugs reports, testing bugs, request for enhancements,
etc.), the inputs merge into the maintenance request mainstreams which go through many plains
before ending at a delta of changed objects, released applications and changes to domain rules.
To support these flow of maintenance information, several SM governance activities and tools
are required. Tools such as Helpdesk, Software Configuration Management (SCM), Source
Control, Project Management and others allow the team and organization to monitor and control
the processes. In addition, collaborative tools and platform allows users and maintainers to
communicate, cooperate and coordinate the required information to ensure good maintenance
process.
Meanwhile, knowledge is defined as “a fluid mix of framed experience, values, contextual
information and expert insight that provides a framework for evaluating and incorporating new
experiences and information. It originates and is applied in the mind of knowers. In
organizations, it often becomes embedded not only in documents and repositories but also in
organizational routines, processes, practices and norms.” (Davenport & Prusak, 2000). In
technical perspective, KM is defined as the strategies and processes of identifying,
understanding, capturing, sharing, and leveraging knowledge (Abdullah et al., 2006; Alavi &
Leidner, 2000; Davenport & Prusak, 2000; Selamat et al., 2006). One of the major challenges is
to facilitate flow of knowledge not only within individual, but also within teams and
organizations (Alavi & Leidner, 2000). To transform KM framework into a system, KMS framework is conceptualized. KMS is
defined as “I.T-based system developed to support and augment the organizational process of knowledge creation, storage, retrieval, transfer and application” (Alavi & Leidner, 2000). In general, a KMS framework consists of influential factors of KMS initiatives and their interdependent relationships and a model of KMS implementation (Foo et al., 2006). However, systems and technology alone does not create knowledge (Davenport & Prusak, 2000), various other social “incentives” and organizational strategy and culture are often required to stimulate use of technology to share knowledge.
To date, there is no comprehensive KMS framework in the area of collaborative SM. As such,
this research shall focus on building the framework, based on the components from intersection of existing KM, KMS and SM areas. To determine which components are pertinent and important to SM environment, a questionnaire survey is planned, to be distributed to maintainers and users from several SM organizations. However, as a measuring instrument, the face validity and construct validity of questionnaire items have to be ascertained. The aim of this paper is to address issues of construct validity in the questionnaire items to determine the important
17 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
components of KMS in SM environment, from the perspective provided by Rasch measurement. A pilot study is used to analyze the reliability of the questionnaire items and to offer suggestions on how to improve them.
This paper is structured as follows: Section 2 reviews literatures related to KMS in SM and
other methods to formulate and validate the framework constructs. Section 3 discusses the methodology for the study. Then, section 4 discusses the KMS dimensions and components for collaborative SM, and Section 5 elaborates the analysis and findings.
2. Related Works
2.1. KMS Frameworks in SM
KMS for SM has been studied since late 1980s by Jarke and Rose (1988), who introduced a
prototype KMS to control database software development and maintenance, mainly to facilitate
program comprehension. Latter researches by Kitchenham et al. (1999), Dias et al.(2003),
Viscaino (2004), Anquetil & Oliviera (2006), Witte at al.(2007), Kneuper (2001) and Rodriguez
et al. (2004) only discussed issues related to the required knowledge, KMS techniques and KMS
process. Looking at the wider perspective of software engineering (SE), KMS in SE have been studied
by many, including Santos et al. (2005) and Rus & Lindval (2001). Rus and Lindval described the three main tasks of SE (individual, team and organization) and identified the three level of KM support for each task. The 1st level includes the core support for SE activities, document management and competence management. Meanwhile, the 2nd level incorporates methods to store organizational memory using method such as design rationale and tools such as source control and SCM. The 3rd KM support level includes packaged knowledge to support Knowledge definition, acquisition and organization. Although the illustrative model is not given, the above should describe the KMS framework for SE.
However, the above frameworks or models did not consider the social, psychological and
cultural aspects of KM, as identified by the previous other generic KMS frameworks. Therefore, the main motivation for this study is to formulate a more detailed KMS framework for Collaborative SM environment. The long-term goal of this study is to formulate a tool to support and automate KM tasks within collaborative SM environment. As such, the KMS framework shall place more emphasize on the technological perspective.
2.2. Validating Framework Construct There have been lingering questions on how do researchers establish the validity of their KMS
frameworks or models. Whilst some are using case studies and observation (Rodriguez et al., 2004; Vizcaino et al, 2004), a few are using survey questionnaire to gauge which components are important (Alavi & Leidner, 2000; Dingsoyr & Conradi, 2002). To ensure construct validity of the tools, most researchers depends on expert opinions to validate the questionnaire items. However, none of the above studies use quantitative statistics to determine the construct validity of the tools.
18 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
In other domain, however, validations of construct validity in psychological and educational assessment areas are plentiful, using the perspective provided by Rasch measurement model (Bond, 2003). Whilst the usage of Rasch often deals with competency evaluation on people or objects, the usage could also be extended to evaluate another critical element of research instrument - construct validity (Aziz et al., 2008). It is our intention to apply similar method to validate the construct validity of the questionnaire items used to determine the important KMS components in collaborative SM environment.
3. Methodology
To formulate the KMS framework for Collaborative SM, the steps used are depicted in
Fig. 1. Analyses are mainly conducted using Rasch measurement model (Rasch). Rasch was
also selected for its suitability to handle ordinal Likert Scale data, small sample size, and some
missing data. Rasch is a probabilistic model that uses ‘logit’ as the measurement units, by
transforming the ordinal data and raw scores into a linear scale (Bond & Fox, 2007). Being
linear, Rasch enables us to conduct more constructive analyses, including item and respondent
statistics, reliability of respondent and item data, respondent outliers and item outliers. In
addition, Rasch could also determine if the questions are ‘difficult’ (i.e. hard to answer
questions) and if the questions are mundane and trivial (i.e. too easy) that perhaps the questions
could just be left out.
Review the existingKM and KMSframeworks
List KMS aspectsand components
Develop initial setof questionnaires
Send questionnaireto pilot group
Refine questionnaireswith experts
Analyze pilot datausing Rasch
Revise Questionnairebased on Rasch output
Fig. 1 Methodology
4. Discussion
After reviewing the existing various KM and KMS frameworks in CSM area, the following
aspects and the respective components were identified. The aspects and components are
19 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
summarized in Table 1 below:
Table 1 Summary of KMS Aspects and Components for Collaborative SM
Aspects Components Sources
Required
knowledge
Organizational knowledge, managerial
knowledge, technical knowledge,
business domain knowledge and
knowledge on source of knowledge,
Ghali (1993), Rus and Lindval
(2001), Dias et al. (2003) and
Rodriguez et al. (2004)
KM Activities Acquiring knowledge, Selecting
knowledge, using knowledge, Providing/
Creating knowledge and Storing
knowledge.
Nonaka and Takeuchi (1995)
and Holsapple & Joshi (2002)
SM
governance
tool
Helpdesk, SCM, Source Control and
Project Management (PM)
Rus and Lindval (2001), IEEE
14764 and Mohd Nor et
al.(2008)
KMS
Components
and
Infrastructure
Computer-mediated collaboration,
Experience Mgmt System, Document
Management, KM portal, EDMS, OLAP,
and Middleware tools.
Abdullah et al. (2006), Meso
& Smith (2000), Dingsoyr &
Conradi (2002), and Rus and
Lindval (2001)
Automation
and
knowledge
discovery
tools
GDSS, Intelligent Agents, Data
mining/warehouse, Expert system and
Case-Based Reasoning (CBR). Active
tools such as RSS are also useful to get
the right knowledge to the right users at
the right time.
Meso and Smith (2000),
Abdullah et al. (2006),
Rodriguez et al. (2004)
KM
Influences
Managerial influences and strategy, and
psychological and cultural influences.
Holsapple and Joshi (2002)
and Abdullah (2000)
Among the above, which of these components affect the KMS and KM activities in
Collaborative SM? To answer the above questions, questionnaires are drafted to access importance of the following item areas:
General KM And SM knowledge
Required knowledge, such as organizational knowledge, managerial knowledge, technical knowledge, business domain knowledge and knowledge on source of knowledge are important in SM processes.
SM Governance tools, as enablers in knowledge activities
Based on IEEE 14764, SM processes include process implementation, problem and modification analysis, review and acceptance, development/modification, migration and retirement. Whilst KM components deals with converting tacit to explicit and then storing the knowledge, the SM governance tools are used to manage the flow of information and knowledge relating to the SM daily activities and processes. Tools such as Helpdesk application, SCM, source control and project management are used throughout the SM processes and hence provides a good deal of input to the knowledge contents. The questions are – how well and important are these tools in the KM activities (acquiring, selecting, using, providing/creating and storing knowledge).
KMS foundation components and infrastructure
20 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
KMS foundation components include, among a few, computer-mediated collaboration, Experience Mgmt System, Document Management, KM portal, EDMS, OLAP, and Middleware tools and Knowledge Map. While collaboration tools allow users to impart and share both tacit and explicit knowledge synchronously and asynchronously, the other tools are useful to search and extract available explicit information. In many aspects, it involves establishing knowledge ontology, mapping/linking the knowledge and validating the knowledge map.
Automations and automation tools
Automation speeds up and assists maintainers in their daily activities. Technologies such as knowledge-map, CBR, expert system, agent technology and RSS are useful to assist users and maintainers to get the right knowledge at the right time.
Managerial influences and strategies to the KMS activities and processes?
Managerial influences such as leadership, coordination, control and measurement may affect the general SM KM activities and processes. Strategy deals with how KMS is planned for use, whether through codification (storing the explicit information), or personalization (storing the knowledge map).
Psychological and cultural influences in the overall activities of KMS?
Psychological issues include motivation, reward and awareness. Meanwhile cultural influences include the trusts, beliefs, values, norms and unwritten rules.
Based on the above aspects and components, a proposed initial model for Knowledge-based
components for Collaborative SM framework is constructed (see Fig. 2).
21 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
5. Results
13 respondents from an SM organization in Klang Valley, Malaysia participated in this pilot
study. They consist of 2 users, a SM Manager and 10 programmers. The pilot data were
tabulated and analyzed using WinSteps, a Rasch analysis tool. The results of Person and Item
summary statistics and measures are tabulated in Table 2, Table 3, and Table 4. A Person-Item
Differential Map (PIDM) is used to reveal the “easiest” and “hardest” questions answered by
respondents, as depicted in Fig. 3.
Fig. 2 Initial Knowledge-Based Components for Collaborative SM Framework
22 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Table 2 Summary Of 13 Measured Persons
+-----------------------------------------------------------------------------+
| RAW MODEL INFIT OUTFIT |
| SCORE COUNT MEASURE ERROR MNSQ ZSTD MNSQ ZSTD |
|-----------------------------------------------------------------------------|
| MEAN 180.7 58.4 .15 .22 1.06 .2 1.00 -.1 |
| S.D. 17.6 4.7 .66 .02 .37 2.0 .32 1.6 |
| MAX. 209.0 62.0 1.60 .29 1.92 4.3 1.72 3.4 |
| MIN. 132.0 47.0 -.78 .20 .54 -3.2 .65 -2.1 |
|-----------------------------------------------------------------------------|
| REAL RMSE .24 ADJ.SD .62 SEPARATION 2.58 Person RELIABILITY .87 |
|MODEL RMSE .22 ADJ.SD .63 SEPARATION 2.85 Person RELIABILITY .89 |
| S.E. OF Person MEAN = .19 |
+-----------------------------------------------------------------------------+
VALID RESPONSES: 94.2%
Person RAW SCORE-TO-MEASURE CORRELATION = .63 (approximate due to missing data)
CRONBACH ALPHA (KR-20) Person RAW SCORE RELIABILITY = .93
(approximate due to missing data)
Table 3 Summary Of 62 Measured Items
+-----------------------------------------------------------------------------+
| RAW MODEL INFIT OUTFIT |
| SCORE COUNT MEASURE ERROR MNSQ ZSTD MNSQ ZSTD |
|-----------------------------------------------------------------------------|
| MEAN 37.9 12.2 .00 .53 1.00 .0 1.01 .0 |
| S.D. 7.0 1.3 .89 .18 .22 .7 .33 .8 |
| MAX. 49.0 13.0 2.80 1.07 1.87 2.2 2.61 3.1 |
| MIN. 19.0 8.0 -2.32 .29 .53 -1.5 .52 -1.5 |
|-----------------------------------------------------------------------------|
| REAL RMSE .57 ADJ.SD .68 SEPARATION 1.19 Item RELIABILITY .59 |
|MODEL RMSE .56 ADJ.SD .69 SEPARATION 1.24 Item RELIABILITY .61 |
| S.E. OF Item MEAN = .11 |
+-----------------------------------------------------------------------------+
UMEAN=.000 USCALE=1.000
Item RAW SCORE-TO-MEASURE CORRELATION = -.31 (approximate due to missing data)
759 DATA POINTS. APPROXIMATE LOG-LIKELIHOOD CHI-SQUARE: 1220.85
Table 4 Person Measure
Entry
Number
Total
Score
Count Measure Model
SE
Infit Outfit PT Mea
Corr
Person
MNSQ ZSTD MNSQ ZSTD
12 186 52 1.6 0.29 1.42 1.8 1.06 0.3 0.33 SU3
9 183 53 1.08 0.25 0.8 -1 0.78 -0.9 0.56 SU2
10 209 62 0.88 0.22 1.15 0.8 0.94 -0.2 0.51 PR4
6 195 60 0.5 0.21 1.92 4.3 1.72 3.4 0.29 SA1
…. … … … … … … … … … …
7 132 47 -0.78 0.24 0.76 -1.4 0.66 -1.9 0.45 SU1
MEAN 180.7 58.4 0.15 0.22 1.06 0.2 1 -0.1
S.D. 17.6 4.7 0.66 0.02 0.37 2 0.32 1.6
23 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Table 5 Item Measure
Entry
Number
Total
Score
Count Mea
sure
Model
SE
Infit Outfit PT Mea
Corr.
Item
MNSQ ZSTD MNSQ ZSTD
1 29 13 -0.41 0.43 1.16 0.6 1.12 0.5 0.23 A1 Team_K
2 40 13 -0.09 0.48 1.04 0.2 1.04 0.2 0.3 A2 Info
…. … … … … … … … … … …
35 29 12 0.3 0.29 1.87 2.2 2.61 3.1 -0.04 D4 E-Group
…. … … … … … … … … … …
62 42 13 1.46 0.69 1.06 0.3 1.2 0.6 0.15 G5 CoP
MEAN 37.9 12.2 0 0.53 1 0 1.01 0
S.D. 7 1.3 0.89 0.18 0.22 0.7 0.33 0.8
Table 0.6 Analysis
Person Statistics Item Statistics
Spread Spread of (1.60 - (-0.78) = 2.38logit is fair Spread of (2.80 - (-2.80) = 5.60logit is poor.
Item distribution has a much higher spread
compared to Person spread
Reliability Reliability = 0.87 and Cronbach
Alpha=0.93 is good
Reliability, indicated by Real RMSE = 0.59 is
poor
Distribution Person distribution is normal with slightly
positively skewed distribution. The
distribution is also platykurtic (flat).
Item distribution, however, is slightly
negatively skewed with leptokurtic distribution.
Comments Respondent SA1 fails all the 3 measures* Item D4 fails all the 3 measures* and need to
be checked.**
*(Acceptable Point Measure Correlation ( 0.4 < x < 0.8); Outfit Mean Square value (0.5 < y < 1.5) and Outfit
z-standardized value (-2 < z < 2))
24 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Based on the summaries and PIDM, a few observations could be concluded. Person SU1 is at
the leftmost of the person distribution. As the Customer Service Director, it is lonely up there and not many want to share with him information, hence the pattern of answers. However, respondents are not the main issue in our objective. PIDM is utilized to identify misfit respondents so that we could provide plausible causes of such answers to the questionnaire. Questionnaire items are our main agenda, since this will be our instrument to identify the important KMS components for collaborative SM.
Looking at PIDM, item F6 – Personalization, and F5 – Codification are on the rightmost and
leftmost of the Item distribution, respectively. The question for F5 is on Codification Strategy and F6 is on Personalization strategy. We believe that respondents might not understand the terms “codification” and “personalization” in KM context. Layman-terms should be used to better represent the questions. In this case, questions F5 and F6 were rephrased to F5 – “We should store all documented information (e.g. best practices, policies, procedures) in the Knowledge Management System” and F6 – “We should store the location of experts and what they know (like a yellow pages) in the Knowledge Management System”
Question for D4 has been identified as misfit. Looking at the data, the answer distribution is
quite similar in all categories, and hence may be attributed to problematic question. Question D4 is “importance of E-group (e.g. Yahoogroup, Facebook, etc) to Support KM activities”. We propose to change the question to “Importance of Intranet & collaborative sites (e.g. Wikis)”. In
Fig. 3 Person-Item Differential Map (PIDM)
25 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
another analysis, determining the “Easy” questions was not as easy as portrayed in the PIDM. It was envisaged that question F6 (Personalization), D1 (Email) and F1 (Leadership) could be removed from the questionnaire for their obvious responses. However, since F6 is a pair-question with F5, it was decided to keep F6. Subsequently, only D1 and F1 were removed.
6. Conclusion And Future Work
In SM environment, KMS are critical to ensure that KM activities such as knowledge
acquisition, storage and retrieval and processes includes not only the hard-components (tools and
infrastructure), but also the soft-components (managerial, psychological and cultural). To formulate the KMS framework for collaborative SM, the components on KMS, SM
governance, and automation and knowledge discovery are compiled from various literatures. An initial model of modified KMS components for collaborative SM is proposed. The relationships between these components are used to construct the questionnaire, which were tested in a pilot study. Rasch was used to analyze pilot questionnaire to determine the construct validity of the questionnaire items as the measurement tool. Item reliability is found to be poor and a few respondents and items were identified as misfits with distorted measurements. Some problematic questions are revised and some predictably easy questions are excluded from the questionnaire. With the revised questionnaire items, the reliability and construct validity of items are envisaged to be higher and better.
The next task is to send out the revised questionnaire to several SM organizations, to further
verify the components for KMS framework for collaborative SM. The KMS Framework shall be used in our ongoing study to develop a tool to assist SM CoP members to perform their activities better.
7. Acknowledgment
This research is funded by the Malaysia Ministry of Science and Technology (MOSTI) e-
ScienceFund Project No.01-01-04-SF0966.
8. References
Abd Aziz, A., Mohamed, A., Arshad, N., Zakaria, S., & Masodi, S. (2008). Evaluation of
Information Professionals Competency Face Validity Test using Rasch. Proceedings of the
5th WSEAS/IASME international conference on Engineering education.
Abdullah, R. (2008). Knowledge Management System in a Collaborative Environment.
University Putra Malaysia Press.
Abdullah, R., Sahibuddin, S., Alias, R., & Selamat, M. H. (2006). Knowledge Management
System Architecture For Organizational Learning With Collaborative Environment.
International Journal of Computer Science and Network Security, 6(3a).
Alavi, M., & Leidner, D. (2000). Knowledge Management Systems: Issues, Challenges, and
Benefits. Communication of AIS, 1.
Anquetil, N., & de Oliviera, K. (2006). Software Maintenance Ontology. Ontologies for
Software Engineering and Software Technology. Springer Berlin Heidelberg.
26 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Bond, T. (2003). Validity and Assessment: a Rasch Measurement Perspective. Metodología de
las Ciencias del Comportamiento, 5(2), 179-194.
Bond, T., & Fox, C. (2007). Applying the Rasch Model: Fundamental Measurement in Human
Sciences. (2nd ed.). New Jersey: Lawrence Album Associates.
Das, S., Lutters, W., & Seaman, C. (2007). Understanding Documentation Value in Software
Maintenance. Proceedings of the 2007 Symposium on Computer Symposium On Computer
Human Interaction For The Management Of Information Technology.
Davenport, T. H., & Prusak, L. (2000). Working Knowledge: How Organization Manage What
They Know. Harvard Business School Press.
Dias, M. G., Anquetil, N., & Oliveira, K. (2003). Organizing the Knowledge Used in Software
Maintenance. Journal of Universal Computer Science, 9(7).
Dingsoyr, T., & Conradi, R. (2002). A Survey Of Case Studies Of The Use Of Knowledge
Management In Software Engineering. International Journal of Software Engineering and
Knowledge Engineering, 12(4).
Fjeldstad, R., & Hamlen, W. (1978). Application Program Maintenance Study: Report to Our
Respondents. Software Engineering- Theory and Practices. Prentice Hall.
Foo, S., Chua, A., & Sharma, R. (2006). Knowledge management Tools and Techniques.
Singapore: Pearson Prentice Hall.
Ghali, N. (1993). Managing Software Development Knowledge: A Conceptually-Oriented
Software Engineering Environment, PhD. Thesis. University of Ottawa, Canada.
Holsapple, C. W., & Joshi, K. D. (2002). Understanding KM Solutions: The Evolution of
Frameworks in Theory and Practice. Knowledge Management Systems: Theory and Practice.
Thomson Learning.
IEEE. (2006). IEEE 14764-2006, Standard for Software Engineering - Software Life Cycle
Process – Maintenance. The Institute of Electrical and Electronics Engineers, Inc.
Jarke, M., & Rose, T. (1988). Managing Knowledge about Information System Evolution.
Presented at the Proceedings of the 1988 ACM SIGMOD International Conference.
Kitchenham, B., Travassos, G. H., Mayrhauser, A., Niessink, F., Schneidewind, N. F., Singer, J.,
Takada, S., et al. (1999). Towards an Ontology of Software Maintenance. Journal of
Software Maintenance: Research and Practice, 11(6).
Kneuper, R. (2001). Supporting Software Processes Using Knowledge Management. Handbook
of Software Engineering and Knowledge Engineering (Vol. 2). World Scientic Publishing.
Retrieved from ftp://cs.pitt.edu/chang/handbook/37b.pdf
Lientz, B. P., & Swanson, E. B. (1981). Problems in Application Software Maintenance.
Communications of the ACM, 24(11).
Meso, P., & Smith, R. (2000). A Resource-Based View Of Organizational Knowledge
Management Systems. Journal of Knowledge Management, 4(3).
Nonaka, I., & Takeuchi, H. (1995). The Knowledge-Creating Company. New York: Oxford
University Press, Inc.
Pigoski, T. (2004). Software Maintenance. Guide to the Software Engineering Body of
Knowledge (SWEBOK). The Institute of Electrical and Electronics Engineers, Inc.
Rodriquez, O. M., Martinez, A. I., Favela, J., Viscaino, A., & Piattini, M. (2004a).
Understanding and Supporting Knowledge Flows in a Community of Software Developers.
Lecture Notes in Computer Science, 2198.
27 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Rodriquez, O. M., Martinez, A. I., Favela, J., Viscaino, A., & Piattini, M. (2004b). How to
Manage Knowledge in the Software Maintenance Process. Lecture Notes in Computer
Science, 3096.
Rus, I., & Lindvall, M. (2001). Knowledge Management in Software Engineering. IEEE
Software, 19(3).
Santos, G., Vilela, K., Montoni, M., & Rocha, A. (2005). Knowledge Management in a Software
Development Environment to Suport Software Process Deployment. Lecture Notes in
Computer Science, 3782.
Schach, S. R., Jin, B. O., Yu, L., Heller, G. Z., & Offutt, J. (2003). Determining the Distribution
of Maintenance Categories: Survey versus Measurement. Journal of Empirical Software
Engineering, 8.
Selamat, M. H., Abdullah, R., & Paul, C. H. (2006). Knowledge Management System
Architecture For Organizational Learning With Collaborative Environment. International
Journal of Computer Science and Network Security, 6(8a).
Ulrich, F. (2002). A Multi-Layer Architecture for Knowledge Management Systems. Knowledge
Management Systems: Theory and Practice (pp. 97-111). Thomson Learning.
Vizcaíno, A., Soto, J. P., & Piattini, M. (2004). Supporting Knowledge Reuse During the
Software Maintenance Process through Agents. Proceedings of the 6th International
Conference on Enterprise Information Systems (ICEIS).
Wilson, T. D. (2002). The nonsense of knowledge management. Journal of Information
Research, 8(1).
Witte, R., Zhang, Y., & Rilling, J. (2007). Empowering Software Maintainers with Semantic
Web Technologies. Lecture Notes in Computer Science, 4519.
28 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
AN EXPLORATORY FACTOR ANALYSIS ON INFORMATION
SECURITY RISK IN ICT OUTSOURCING
Nik Zulkarnaen Khidzir, Noor Habibah Arshad and Azlinah Mohamed
Department of System Sciences, Faculty of Computer and Mathematical Sciences, Shah Alam,
Selangor, 40450, Universiti Teknologi MARA, Malaysia
Abstract. It has become the norm for organisations to be increasingly dependent on information
technology (IT) and its communication infrastructure to ensure the speedy and effective
execution of their business operations. In so doing, however, it is essential for organisations to
maintain the highest information security posture through integrating an information security risk
management approach. Under such circumstances, organisations have been increasingly
outsourcing functions since outsourcing strategies provide great benefits to organisations aimed
at reducing operational costs, launching new business ventures and improving the efficiency of
business operations whilst enabling them to concentrate on their core business activities.
Information Communication Technology (ICT) outsourcing is another variant of outsourcing that
also provides effective ways to solve shortages of ICT resources and expertise. However, it has
inherent risks and organisations need to be aware of potential threats so that they can be
identified and resolved accordingly. The objectives of the research, therefore, are to conduct an
Exploratory Factor Analysis (EFA) on threat and vulnerability items which would consequently
provide knowledge of the most relevant security risk factors for Malaysian ICT Outsourcing
ventures. 300 questionnaires were distributed to various private companies and government
agencies for the study and 110 were returned. The findings of the research show that threat and
vulnerability factors were the critical information security risk factors in ICT outsourcing.
Moreover, two additional critical risk factors were discovered, being, information security
management defects with a mean factor loadings of 0.674 and the challenges of managing
unexpected change with its mean factor loadings, 0.603. The findings of this study will enable
public and private organisations to identify the inherent security risk factors and to urgently
address them prior to the implementation of ICT outsourcing projects.
Keywords: Information Security Risk, Threats, Vulnerabilities, Information Security
Management Defects, Factor Analysis
1. Introduction
Besides the rapid growth of information and communication technology, outsourcing provides an
alternative channel for organisations to operate more effectively whilst saving money
(McDougal P., 2003). Recent research by Information Week Research (Engardio, P & Kripalani,
M., 2006) shows 65% of the respondents realise that outsourcing provides cost savings while
50% attest to its provision of high-skilled vendor operational expertise. Other benefits include
cost restructuring, quality improvement, knowledge (Engardio, P & Kripalani, M., 2006),
contract, and improved methods of capacity management of services and technology where the
risk of providing the access capacity is borne by supplier or vendors.
Despite providing huge advantages to businesses, ICT Outsourcing approaches entail
some risks (Benoit A. Aubert, Michel Patry & Suzanne Rivard, 2005). The top 10 ICT
29 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
outsourcing risks based on importance are Cost-Reduction Expectations, Data
Security/Protection; Process Discipline such as Capability Maturity Model (CMM), Lost of
Business Knowledge, Vendors’ Failure to deliver, Scope Creep, Culture, Turnover of Key
Personnel and Knowledge Transfer (Dean Davison, 2011). Information Risk Survey conducted
by Deloitte Touché Tohmatsu in 2000 identified information security risk as one of the major
key risk areas in ICT Outsourcing projects (Deloitte Touche Tohmatsu Consulting Firm, 2000).
These evidences show that information security is a growing challenge for every organization
(Mohammad H. Qayoumi & Woody, C., 2004). Therefore, securing information asset is vital for
the survivals of many organizations in their businesses (Harols F. Tipton & Micki Krause, 2004).
Hence, this paper attempts to provide empirical findings on information security risk critical
factors in ICT outsourcing projects. The risk factors are classified into two major categories,
threats and vulnerabilities.
2. Literature Review
2.1. ICT Outsourcing
ICT outsourcing is an act of delegating or transferring some or all of the ICT related decision
making rights, business processes, internal activities and services to external providers, who
develop, manage and administer these activities in accordance with agreed upon deliverables,
performance standard and outputs, as set forth in contractual agreements (S. Dhar & B.
Balakrishnan, 2006). Previous studies identify 11 ICT services most commonly outsourced
namely ISP services, web hosting, ICT application maintenance and support, ICT infrastructure,
programming, e-business solution, application analysis, application services provision, support
end user, staff/user training, ICT security audit and security policy or standards development
(NISER, 2003; Noor Habibah et. al., 2007). These can be further categorized into six major ICT
outsourcing natures based on their scopes of work and ICT services activities, namely, ICT
application system development, ICT infrastructure maintenance, IS/ IT strategic planning, ICT
application maintenance, ICT security maintenance and ICT knowledge transfer & training. The
varied ICT outsourced project categories bring forth different kinds of challenges during their
implementation period and hence, the information security risks that arise are different. A study
on these information security risks will thus provide powerful and accurate information to enable
swift and pre-emptive action.
2.2. Global Information Security Review
Organizations should manage and control the aggressive growth of these information security
incidents to minimize monetary losses. Therefore, clearly defining the information security risk
factors associated with ICT outsourcing is imperative in order to evaluate and manage them
effectively (Dean Davison, 2011). Around the globe, security threats such as phishing, spam,
intrusion, internet worms, sabotage of disgruntled employees and stealing data for monetary
gains are not uncommon. In the US alone in 2005, its Federal Bureau of Investigation (FBI) in a
survey of 2,066 organizations found that viruses, spyware, PC theft and other computer-related
crimes cost U.S. businesses a staggering US67.2 billion (RM255.36 billion) a year
(news.cnet.com, 2011). The trend is similar in other countries such as South Korea and China
where the number of cases reported to the respective Computer Emergency Response Teams
30 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
(CERT) is increasing. In Korea in 2007, the number of cases related to hacking, spam, phishing,
intrusion attempt, webpage defacement and others incidents were 11668 (53.7%), 1095 (5.0%),
4316 (19.9%), 2293 (10.6%) and 2360 (10.9%), respectively. Meanwhile, cyber security
incidents reported to the China CERT (CNCERT/CC) in 2007 were phishing; spam mail and
webpage embedded with malicious code presenting 1326, 1197 and 1151 cases respectively
(APPCERT, 2007). Increasing numbers of information security incidents including harassment,
fraud, hack threat, malicious code, denial-of-service, intrusion and spams are also pervasive in
Malaysia. A total of 21661 incidents, inclusive of spam incidents were reported to MyCERT,
representing a 26.73% increase of incidents compared to Q2 in 2008 (Cyber Security Malaysia,
2008).
2.3. Generic Information Security Risk Factors
Risk issues have been widely researched in various fields, such as insurance, economics,
management, medicine, operations research, and engineering. Each field addresses risk in a
fashion relevant to its object of analysis, and henceforth, adopts a particular perspective.
However in most fields, risks are viewed as undesirable events (Levin, M. & Schneider, M.,
1997). Risk factors refer to any factors that influence undesirable events that may occur in
organizations (Deloitte Touche Tohmatsu Consulting Firm, 2000; Levin, M. & Schneider, M.,
1997). From the business perspective, risk factors refer to measurable characteristics or elements,
a change which can affect the value of an asset, such as exchange rates, interest rates, and market
prices (BusinessDictionary.com, 2011). Meanwhile in the concept of health and disease, risk
factors are variables associated with increased risks of disease or infection (Wikipedia.org,
2011).
There are different contexts in which to explain and define the risk factors in ICT and
information security area of studies. The generic information security risk factors consist of
threats and vulnerabilities. Information security risks are chances of threat actions on
vulnerabilities causing impacts that contribute to information security incidents (Garry Hinson,
2008). Information security risks can also include theft of personal data, information leakage,
extraction or loss and unauthorized exploitation of intellectual properties. These risks are caused
by lack of control on threats and vulnerabilities. Thus, threats and vulnerabilities are two major
risk factors that need to be determined accurately. With this, planning for an effective mitigation
plan will decrease the negative impact to the organization’s business function as well as ICT
services.
According to MAMPU, threats refer to natural or man-made circumstances or events that
could have an adverse or undesirable impact, minor or major, on organizational assets (MAMPU,
2005). This view is also shared by other authors of various published articles (Garry Hinson,
2008; Ray Kaplan, 2005). As shown in Table 1, 18 items of threat factors were identified
through literature review and used for factor analysis. Common threat factors items have been
highlighted by several authors such as Fraudster/ Hackers, organized crime, unauthorized access
and malware authors that are actors or situations that might deliberately or accidently exploit
vulnerabilities causing information security risk (Garry Hinson, 2008; Jean Boltz et. al., 1999;
Vassiliadis et. al., 2005).
31 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Table 1: Information Security Threat Items
Factors
ID Threat Risk Factor Items References
T1 Imposition of legal and regulatory obligation.
Gary Hinson, 2008;
MAMPU, 2005;
Jean Boltz et. al., 1999
T2 Identity theft and personal data/ information (Data
Confidentiality).
Noor Habibah Arshad
et.al.,2007;
Gary Hinson, 2008;
Jean Boltz et. al., 1999
T3 Negligent Service Provider staff such as programmers,
technical architects, testers and project managers who cause
or fail to prevent vulnerabilities.
Gary Hinson, 2008;
MAMPU, 2005;
Tawei Wang et. al.,
2009
T4 Loss, damage or destruction of information assets and ICT
services.
Gary Hinson, 2008;
MAMPU, 2005;
Jean Boltz et. al., 1999
T5 Exploiting control weaknesses in the IT-enabled business
processes.
Gary Hinson, 2008;
MAMPU, 2005;
Jean Boltz et. al., 1999
T6 Directly exploit control weaknesses within the IT systems. Gary Hinson, 2008;
MAMPU, 2005
T7 Exploit other control weaknesses involving printed or
others information rather than computer data and systems.
MAMPU, 2005
T8 System Error and ICT Failures
Gary Hinson, 2008;
MAMPU, 2005;
Jean Boltz et. al., 1999
T9 Unethical competitors (trade secrets, customer lists, etc)
Gary Hinson, 2008;
Tawei Wang et. al.,
2009
T10
Disgruntled/untrained/ignorant employees who make
genuine if naïve human errors (impact to data
confidentiality)
Noor Habibah Arshad
et. al., 2007;
Garry Hinson, 2008;
Gary Hinson, 2003
T11 Persons who misuse/ misconfigure system security
function, or ignore security policies and good practices.
Gary Hinson, 2008;
Jean Boltz et. al., 1999
T12 Persons who destroy or threaten to destroy, information
assets.
Gary Hinson, 2008;
MAMPU, 2005
T13 Unauthorized access to, or modification or disclosure of
information assets.
Gary Hinson, 2008;
MAMPU, 2005;
Jean Boltz et. al., 1999
T14 Attacking critical information infrastructure to cause Gary Hinson, 2008;
32 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Factors
ID Threat Risk Factor Items References
disruption of ICT Services. Jean Boltz et. al., 1999
T15 Information leakage (extraction of loss of valuable and/or
private information).
Noor Habibah Arshad
et. al., 2007;
Gary Hinson, 2008;
MAMPU, 2005
T16
Widespread unauthorized and uncontrolled used of portable
devices and transportable computer media.
Gary Hinson, 2008;
Jean Boltz et. al., 1999;
Tawei Wang et. al.,
2009
T17 Severely affect the business survivability of organization.
Gary Hinson, 2008;
MAMPU, 2005;
Jean Boltz et. al., 1999
T18
Unauthorized exploitation of intellectual property (IP)
(example : Plagiarism, etc)
Gary Hinson, 2008;
Jean Boltz et. al., 1999;
Vassiliadis et. al., 2005;
Schirripa, F., 2000.
Vulnerabilities constitute another form of information security risk. Vulnerabilities refer
to the absence or weaknesses of a safeguard in an asset that makes a threat potentially more
harmful (can be exploited) or costly, more likely to occur, or likely to occur more frequently
(Garry Hinson, 2008; MAMPU, 2005; Ray Kaplan, 2005). As shown in Table 2, 30 vulnerability
factor items were identified through literature review and used for factor analysis. The most
frequent items classified under vulnerabilities include Software bugs, poor/ missing governance
of information assets are the flaws or weaknesses in system security procedure, design,
implementation or internal control that could be exploited and result in security breach, violation
of system security policy and other impacts.
Table 2: Information Security Vulnerability Items
Factors
ID Vulnerability Risk Factor Items References
V1 Lack of Security training & awareness Gary Hinson, 2008;
MAMPU, 2005
V2 Insufficient law enforcement Gary Hinson, 2008 ;
Jean Boltz et. al., 1999
V3 User system accounts not in use MAMPU, 2005
V4 Lack of suitable management & control over the use of
passwords
Gary Hinson, 2008;
MAMPU, 2005;
Jean Boltz et. al., 1999
33 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Factors
ID Vulnerability Risk Factor Items References
V5 Insufficient attention to human factors in systems design
and implementation
Gary Hinson, 2008;
MAMPU, 2005;
Gary Hinson, 2003
V6 Inappropriate, disorganized and/or stressful operating
conditions
Gary Hinson, 2008;
Jean Boltz et. al., 1999
V7 Unorganized access control & privilege on user application
account
Gary Hinson, 2008;
MAMPU, 2005 ;
Jean Boltz et. al., 1999;
Tawei Wang et. al.,
2009
V8 Insufficient backup MAMPU, 2005;
Jean Boltz et. al., 1999
V9 System failures
Gary Hinson, 2008;
MAMPU, 2005 ;
Jean Boltz et. al., 1999
V10 Disgruntled Organization's staff Gary Hinson, 2008;
MAMPU, 2005
V11 Disgruntled Service Provider staff Gary Hinson, 2008;
MAMPU, 2005
V12 Lack of Anti-Virus protection Gary Hinson, 2008;
Gary Hinson, 2003
V13 Complexity in Information Technology and System design
Gary Hinson, 2008;
MAMPU, 2005;
Gary Hinson 2003
V14 System Design Flaws and Weaknesses Gary Hinson, 2008;
MAMPU, 2005
V15 Lack of asset inventory management MAMPU, 2005;
Gary Hinson, 2003
V16 Unreliable level of information assets protection MAMPU, 2005;
Jean Boltz et. al., 1999
V17 Lack of information assets owners responsibility
Gary Hinson, 2008;
MAMPU, 2005;
Tawei Wang et. al.,
2009
V18 Lack of operation maintenance (software updates/
hardware replacement)
Gary Hinson, 2008;
MAMPU, 2005;
Jean Boltz et. al., 1999;
Gary Hinson, 2003
V19 Inadequate investment in appropriate information security
controls
MAMPU, 2005;
Jean Boltz et. al., 1999
V20 Disclosure of information to institution competitors Gary Hinson, 2008;
34 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Factors
ID Vulnerability Risk Factor Items References
MAMPU, 2005;
Jean Boltz et. al., 1999;
Gary Hinson, 2003
V21 Ignorance, carelessness, negligence or idle curiosity by
user
Gary Hinson, 2008;
MAMPU, 2005;
Tawei Wang et. al.,
2009; Gary Hinson,
2003
V22
Unaddressed Service Provider's responsibility for
information security and confidentiality in the contract
Gary Hinson, 2008;
Tawei Wang et. al.,
2009
V23 Loss of confidentiality of classified information
Noor Habibah Arshad
et. al., 2007;
Gary Hinson, 2008;
MAMPU, 2005
V24 Lack of suitable control of information asset accessibility
Gary Hinson, 2008;
MAMPU, 2005;
Tawei Wang et. al.,
2009
V25 Lack of physical access control and protection MAMPU, 2005;
Jean Boltz et. al., 1999
V26 Poor information security studies Gary Hinson, 2008;
MAMPU, 2005
V27
Inadequate controls and practices selection,
implementation, performance measurement, monitoring
and/or auditing
Gary Hinson, 2008;
MAMPU, 2005;
Jean Boltz et. al., 1999
V28 Lack of Business Continuity Management MAMPU, 2005;
Jean Boltz et. al., 1999
V29 Lack of Disaster Recovery Planning MAMPU, 2005;
Jean Boltz et. al., 1999
V30 Frequent change in the business, IT and security arenas Gary Hinson, 2008;
MAMPU, 2005
3. Methodology
3.1. Methodology Overview
The questionnaire survey was executed for this study. The Five-Point-Likert-Scale was used to
measure the criticality of threat and vulnerability factors. Both primary and secondary data were
used in order to accomplish this research objective. The threat and vulnerability items as
35 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
described in the previous section were then used for the Exploratory Factor Analysis (EFA). A
principal component analysis (PCA) was used on 47 items of threats and vulnerabilities with
orthogonal rotation (varimax). For the purpose of this research, > 0.4 factor loadings were
considered as acceptable items relevant to the identified factors. Finally, Cronbach’s Alpha
Coefficient was used to test the item’s reliability for each identified factor.
3.2. Research Model
Fig. 1: Research model on threats and vulnerabilities Exploratory Factor Analysis
The research model as shown in Fig. 1 was built based on a combination of several past
literatures (Nik Zulkarnaen Khidzir et. al., 2010a; 2010b) instead of a single research model.
(Refer to Table 1 for a list of threat risk factors and Table 2 for a list of vulnerability risk
factors). The research model discusses the information security risk factors in ICT outsourcing.
Eighteen (18) threat and thirty (30) vulnerability risk factors were used in the research to
determine the most relevant factors influencing Malaysian ICT outsourcing projects using an
Exploratory Factor Analysis (EFA) technique. Reliability tests were also conducted for items
belonging to the factors.
4. Findings and Results
4.1. Analysis Overview
The survey questionnaire captured the respondents’ profiles as well as the project profiles of 110
respondents from various ICT organizations in Malaysia. Besides, this section also analyses and
discusses the information security risk threat and vulnerability factors as well as the results of
Exploratory Factor Analysis on information security risk in ICT outsourcing.
4.2. Respondent’s Demographic Profiles
36 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
The demographic profiling of respondents included their personal working experience, ICT
outsourcing involvement, organization sectors, nature of business and industries. Analysis was
based on respondents from various industries and businesses that were directly involved in ICT
outsourcing projects. They largely comprised of project managers and senior information system
officers. 18.2% of the respondents had working experience of less than 5 years, 40% had been
working between 6 to 10 years, 17.3% between 11 and 15 years, 15.5% between 16 and 20 years
and only 9.1% possessed more than 20 years working experience. The majority of respondents
(79.9%) were from government agencies, 11.8% from private companies and only 9.1% from
Government-Link-Company (GLC). The response was received from industries and businesses
in Malaysia. Fig. 2 illustrates the range of respondents from various industrial types involved in
the study. The majority of respondents were from government services, the banking &
investment, education and information communication technology (ICT) industries.
Fig. 2: Respondents from various industrial types
4.3. ICT Outsourcing Project Demographic Results
The ICT Outsourcing demographics observed were project type, outsourcing strategy/ approach,
outsourced percentage of ICT function, reason for outsourcing, number of service providers
involved, project/ contract duration and cost, total number of team members involved and
outsourcing project successful criteria.
Upon analysis, it was found that 57.3% of the ICT outsourcing projects were Application
System Development, ICT Infrastructure Maintenance (26.4%), IS/IT Strategic Planning (6.4%),
ICT Security Maintenance (5.5%), ICT Training (2.7%) and ICT Application Maintenance
(1.8%). Most of the projects were joint-venture outsourcing strategies (43.6%). The majority of
the respondents agreed that the main reason for outsourcing ICT projects was to improve quality
of their services (56.4%) and the lack of in-house resources/expertise (71.8%). The highest
response was received from organizations that involved between 1 to 3 Service Providers in one
ICT Outsourcing project (78.2%). Most of the project/ contract duration was Medium-term (1 to
3 years) at 61.8%, while Short-term (less than 1 year) was 18.2% and Long-term (more than 3
years) was 19.1%. The highest response was received from organizations involved with ICT
37 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
outsourcing projects costing between RM 1M to RM 5M (23.6%). The majority of the ICT
outsourcing project team members involved between 10 - 20 people (27.2%). However, this was
also influenced by the type, duration and cost of the projects.
4.4. Exploratory Factor Analysis (EFA)
Exploratory Factor Analysis (EFA), a statistical technique that reduces data to a smaller set of
summary variables and explores the underlining theoretical structure of the phenomena or
construct, is used to identify the structure of the relationship between the variable and the
respondent. For the purpose of this research, the factor analysis produces result of KMO and
Bartlett’s Test, rotation of sums squared loading, and rotated component matrix. Table 3
summarizes the results of the Exploratory Factor Analysis (EFA) and Table 4 describes the
results for KMO and Bartlett’s Test.
For the purpose of this research, a principal component analysis (PCA) was conducted on
48 items (threats and vulnerabilities) with orthogonal rotation (varimax). As shown in Table 4,
the Kaiser-Mayer-Olkin measure verified the sampling adequacy for the factor analysis, KMO =
0.911, which is well above the acceptable limit of 0.5 (Andy Field, 2009). Bartlett’s Test of
Sphericity x² (1128) = 6465.01, p < 0.001, indicated that correlation between items were
sufficiently large for PCA. An initial analysis was run to obtain eigenvalues for each component
in the data. Five components had eigenvalues over Kaiser’s criterion of 1 and combination
explained 75.62% of the variance. As shown in Table 3, Factor loadings after rotation describe
the component represents new factor discovered for the information security risk factor for ICT
Outsourcing. The items that cluster on the same components suggest that risk factor 1 represents
a threat, risk factor 2 vulnerability; risk factor 3 is an information security management defect,
and last but not least, risk factor 4 represents challenges of managing unexpected change.
However, risk factor 5 was not considered as new factor because the factor loadings was quite
low and below the acceptable limit of 0.5.
38 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Table 3: Exploratory Factor Analysis (EFA) Results Summary
Analyzed
Items
(Variables)
Risk Factor
1
Rotated
Risk Factor
2
Factors
Risk Factor
3
Loadings
Risk Factor 4
Risk Factor 5
T1 0.846
T2 0.839
T3 0.805
T4 0.773
T5 0.748
T6 0.734
T7 0.733
T8 0.731 0.421
T9 0.725
T10 0.714 0.413
T11 0.704
T12 0.695
T13 0.688 0.455
T14 0.682 0.404
T15 0.672
T16 0.668 0.410
T17 0.654 0.409
T18 0.645 0.480
V1 0.636 0.406
V2 0.624
V3 0.624
V4 0.565 0.557
V5 0.559 0.464
V6 0.558 0.499
V7 0.853
V8 0.810
V9 0.808
V10 0.804
V11 0.796
V12 0.425 0.789
V13 0.789
V14 0.789
V15 0.787
V16 0.784
V17 0.780
V18 0.772
V19 0.762
V20 0.759
39 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Table 4: KMO and Bartlett’s Test
Kaiser-Meyer-Olkin (KMO) 0.911
Approx. Chi-Square 6465.01
Bartlett’s Test of Sphericity Df 1128
Significance 0.000
Results derived from factor analysis show that 24 items clustered on risk factor 1 which
is identified as threats. The factor loadings values describe the degree of the items suitable to the
factor. These factor loading values were classified into three categories which are strong for
value > 0.7; moderate for value > 0.4 and < 0.7; and weak for value < 0.4. Imposition of legal
and regulatory obligation, (T1 = 0.846); identity theft and personal data (T2 = 0.839); negligent
Service Provider staff that cause or fail to prevent vulnerabilities (T3 = 0.805) was among the
highest factor load for the threats factor. Lack of suitable management & control over the use of
passwords (V4 = 0.565); insufficient attention to human factors in systems design and
implementation (V5 = 0.559); and inappropriate, disorganized and/or stressful operating
condition (V6 = 0.558) were among the lowest factor load for factor 1. However these items are
still relevant to the threats factor because of the moderate values of factor loadings. Therefore,
these items describe almost similar items for threat factors involving ICT outsourcing projects.
Table 5 illustrates the selected items according to degree of relevancy.
V21 0.756
V22 0.737
V23 0.714
V24 0.680
V25 0.412 0.774
V26 0.729
V27 0.476 0.603
V28 0.592
V29 0.544 0.607
V30 0.532 0.599
Eigenvalues 14.366 13.932 4.268 2.555 1.177
% of
Variance
29.928 29.026 8.891 5.323 2.452
40 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Table 5: Relevant items for Threat Factors
Items Threat Items Rotated Factor Loadings Degree of Relevance
Items
T1 0.846 Strong +ve
T2 0.839 Strong +ve
T3 0.805 Strong +ve
T4 0.773 Moderate +ve
T5 0.748 Moderate +ve
T6 0.734 Moderate +ve
T7 0.733 Moderate +ve
T8 0.731 Moderate +ve
T9 0.725 Moderate +ve
T10 0.714 Moderate +ve
T11 0.704 Moderate +ve
T12 0.695 Moderate +ve
T13 0.688 Moderate +ve
T14 0.682 Moderate +ve
T15 0.672 Moderate +ve
T16 0.668 Moderate +ve
T17 0.654 Moderate +ve
T18 0.645 Moderate +ve
V1 0.636 Moderate +ve
V2 0.624 Moderate +ve
V3 0.624 Moderate +ve
V4 0.565 Moderate +ve
V5 0.559 Moderate +ve
V6 0.558 Moderate +ve
Based on the factor analysis results, vulnerabilities remain one of the factors that
contribute to information security risk in ICT outsourcing. About 18 items cluster on the risk
factor 2 representing the vulnerabilities factor. Unorganized access control & privilege on user
application account (0.853); insufficient backup (0.810); System failures (0.808); disgruntled
Organization's staff (0.804) and disgruntled Service Provider staff (0.796) were considered
among the top loading factors for the analysis. This shows a strong degree of relevance for
vulnerability factors. Conversely, factor loadings for the item Lack of suitable control of
information asset accessibility (0.680) was moderate. However, the item is considered as an item
in vulnerability factors since the loadings factor is more than 0.4. Table 6 illustrates the relevant
items for vulnerability factors.
41 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Table 6: Relevant items for Vulnerability Factors
Items Vulnerability Items Rotated Factor Loadings Degree of Relevance
Items
V7 0.853 Strong +ve
V8 0.810 Strong +ve
V9 0.808 Strong +ve
V10 0.804 Strong +ve
V11 0.796 Strong +ve
V12 0.789 Strong +ve
V13 0.789 Strong +ve
V14 0.789 Strong +ve
V15 0.787 Strong +ve
V16 0.784 Strong +ve
V17 0.780 Strong +ve
V18 0.772 Strong +ve
V19 0.762 Strong +ve
V20 0.759 Strong +ve
V21 0.756 Strong +ve
V22 0.737 Strong +ve
V23 0.714 Strong +ve
V24 0.680 Moderate +ve
The analysis brought about the discovery of two new risk factors; information security
management defects and challenges of managing unexpected change in business, ICT and
security arenas. Results derived from factor analysis show that 4 items cluster on information
security risk defects factor. As shown in Table 6, lack of physical access control and protection
(V25 = 0.774); and poor information security studies (V26 = 0.729) were strongly relevant items
for information security management defects. Meanwhile, inadequate controls and practices
selection, implementation, performance measurement, monitoring and/or auditing (V27 = 0.603);
and lack of Business Continuity Management (V28 = 0.592) were moderately relevant items for
information security management defects.
42 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Table 6: Relevant items for information security management defects Factor
Items Information Security Management defects
Items Rotated Factor Loadings
Degree of Relevance
Items
V25 0.774 Strong +ve
V26 0.729 Strong +ve
V27 0.603 Moderate +ve
V28 0.592 Moderate +ve
Another new risk factor for information security risk when outsourcing ICT projects was
challenges of managing unexpected change. Even though only two items cluster on this factor,
the rotated factor loadings values were above acceptable value which is > 0.4. Lack of Disaster
Recovery Planning (V29 = 0.607); and Frequent change in the business, IT and security arenas
(V30 = 0.599). Table 7 illustrates the relevant items for the challenges of Managing Unexpected
Change factors. Furthermore, the reliability test for these items shows that Cronbach’s Alpha
value for this factor was high (α = 0.840) which is more than common acceptable value for
reliability, (α > 0.7), as shown in Table 7.
Table 7: Relevant items for the challenges of managing unexpected change
Items Challenges of Managing Unexpected Change
items Rotated Factor Loadings
Degree of Relevance
Items
V29 0.607 Strong +ve
V30 0.599 Moderate +ve
4.5. Reliability Test
Cronbach’s Alpha Coefficient was used to test the survey item’s reliability for each factor
extracted from the factor analysis results. A coefficient value which is closer to “1” is required.
Cronbach Alpha value for threats (0.974); vulnerabilities (0.974); information security
management defects (0.898); and challenges of managing unexpected change (0.840) was high.
Since all items in the Table 8 below had a reliability of more than 0.7, the scale for these
construct was considered to exhibit an acceptable reliability.
43 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Table 8: Information Security Risk Factors in ICT Outsourcing
ICT Outsourcing Information Security Risk Factors
No. of
Items
extracted
Cronbach’s
Alpha Values N
Threats 24 0.974 110
Vulnerabilities 18 0.974 110
Information Security Management Defects 4 0.898 110
Challenges of Managing Unexpected Change 2 0.840 110
Note: Items - Number of items extracted from factor analysis, N – Total Number of
Respondents
5. Conclusion
ICT outsourcing approaches provide improvements on processes and services in an organization.
However, they involve potential risks that need to be considered and managed effectively.
Serious consideration of the information security risk factors is thus required in order to ensure
the success of ICT outsourcing projects. To this end, the findings of this study provide the much
needed empirical evidence of information security risk factors arising from such ventures.
An Exploratory Factor Analysis (EFA) was used to identify information security risk
factors involved in ICT outsourcing. Results of the analysis show that almost similar factors for
information security risks were extracted (threats and vulnerabilities) in ICT outsourcing
environments. Surprisingly, two new factors were discovered through the analyses which are
information security management defects and challenges of managing unexpected change
factors.
Drawing upon this study, ICT and Information Security professionals as well as
outsourcing practitioners should give urgent attention to these emerging new factors to minimize
information security risks in ICT outsourcing. Managing such impacts is the key to successful
outsourcing ventures.
6. Acknowledgements
The authors wish to thank the committee of CISSP Forum, ISO27000 Forum and Perpustakaan
Negara Malaysia for the articles. Heartfelt thanks are due to UiTM Research Management
Institute (RMI) and Faculty of Computer and Mathematical Sciences, UiTM for their financial
support. Last but not least, to all respondents who participated in the study.
44 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
References
Andy Field. (2009). Discovering Statistics Using SPSS, SAGE Publications Ltd.
APPCERT. (2007). Asia Pacific Computer Emergency Response Team Annual Report.
Benoit A. Aubert, Michel Patry & Suzanne Rivard. (2005). A Framework for Information
Technology Outsourcing Risk Management, The Database for Advances in Information
Systems, Vol. 36, No. 4.
BusinessDictionary.com. (2011). Risk Factors Definition. Retrieved 27 January 2011, from
http://www.businessdictionary.com/search.php?search= Risk+Factors+
Cyber Security Malaysia. (2008). MyCERT 3rd
Quarter 2008 Summary Report, e-Security,
Volume 16 – (Q3/2008). Cyber Security Malaysia.
Dean Davison (2011). Top 10 Risk of Offshore Outsourcing. Retrieved 27 January 2011, from
http://www.zdnet.com/news/top-10-risks-of-offshore-outsourcing/299274
Deloitte Touché Tohmatsu Consulting Firm. (2000). Information Risk Survey 2000 Report.
Engardio, P & Kripalani, M. (2006), Information Week Research Report 2006. Information
Week Research Firm.
Gary Hinson. (2003). Human Factor in Information Security. Whitepaper, IsecT Ltd, England.
Gary Hinson. (2008). Top Information Security Risk for 2008: Information Security Risk., CISSP
and ISO27001 Implementer’s Forum.
Harold F. Tipton & Micki Krause. (2004). Information Security Management Handbook. 5th
edition, Auerbach Publications, CRC Press.
Jean Boltz, Ernest Döring, & Michael Gilmore. (1999). Information Security Risks Assessment:
Practices of Leading Organization. United State General Accounting Office.
Levin, M. and Schneider, M. (1997). Making the Distinction: Risk Management, Risk Exposure.
Risk Management Journal, Vol.44, No.8, pp. 36-42.
MAMPU. (2005). The Malaysian Public Sector Information Security Risk Assessment
Methodology (MyRAM). Perpustakaan Negara Malaysia.
McDougal P. (2003). Outsourcing’s on in a Big Way. Communication Convergence.
Mohammad H. Qayoumi & Woody, C. (2004). Addressing Information Security Risk: A
Journey, not a destination, security work is never done – the challenges just keep coming.
Information Security Risk Management Handbook, 5th
Edition, Auerbach Publications, CRC
Press.
News.cnet.com. (2006). Computer Crime Cost $67 Billion, Says FBI, 2006. Retrieved 27
January 2011, from http://news.cnet.com/Computer-crime-costs-67-billion,-FBI-says/2100-
7349_3-6028946.html
Nik Zulkarnaen Khidzir, Noor Habibah Arshad & Azlinah Mohamed. (2010a). Information
Security Risk Factors: Critical Threats and Vulnerabilities in ICT Outsourcing. Proceedings
of the International Conference on Information Retrieval and Knowledge Management held
on 16 – 18 March 2010 at Shah Alam Convention Centre Malaysia, (pp. 194-199).
Nik Zulkarnaen Khidzir, Azlinah Mohamed & Noor Habibah Arshad. (2010b). Critical
Information Asset Security Requirements in ICT Outsourcing. Proceedings of International
IT and Society Conference 2010 held on 8 – 10 June 2010 at Universiti Malaysia Sabah,
Malaysia. (Vol. 1 No.1, pp 88-95)
45 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
NISER. (2003). Information Security Management System (ISMS) Survey., National ICT
Security and Emergency Response Centre.
Noor Habibah Arshad, Yap May-Lin, Azlinah Mohamed, Sallehuddin Affandi. (2007). Inherent
risks in ICT outsourcing project. Proceeding of the 8th WSEAS Conference on Mathematics
and Computers in Business and Economics, Vol. 8, pp.141 – 146.
Ray Kaplan. (2005). A Matter of Trust - Information Security Risk Management Handbook, 5th
edition, Auerbach Publications, CRC Press.
Schirripa, F. and Tecotzky, N. (2000). An Optimal Frontier, The Journal of Portfolio
Management, Vol.26, No.4, pp. 29-40.
S. Dhar, & B.Balakrishnan. (2006). Risk, Benefits and Challenges in Global IT Outsourcing
Perspective and Practices, Journal of Global Information Management, vol. 14, no. 3, pp. 39-
69.
Tawei Wang, Jakie Rees & Karthik Kannan. (2009). The Association between the Disclosure
and the Realization of Information Security Risk Factors. Centre for Education and Research
Information Assurance and Security.
Vassiliadis B., FotopoulosV., Ilias A. & A.N. Skedras. (2005). Protecting Intellectual Property
Right and the JPEG2000 Coding Standards, Advanced in Informatics, vol. 3746, pp. 705-715.
Wikipedia.org. (2011). Risk Factor Concept in Health and Disease. The free encyclopedia.
Retrieved 27 January 2011, from http://en.wikipedia.org/wiki/Risk_factor
About the Authors
Nik Zulkarnaen Hj Khidzir, currently a PhD candidate at Universiti Teknologi MARA and has
been involve in ICT industry for the past 10 years. He graduated his diploma and bachelor
degree in Computer Science from Universiti Putra Malaysia in 2001. He obtained his master’s
degree (MSc. IT) from Universiti Teknologi MARA in 2006. His areas of interest are Software
Engineering, Information Security and IT Business. He has written several academic articles and
has been regular presenter at national and international conferences.
Datin Dr. Noor Habibah Hj. Arshad, an Associate Professor at Faculty of Computer and
Mathematical Sciences, Universiti Teknologi MARA has been teaching for the past 25 years.
Her areas of interest are IT project management, risk management and IT Governance. She has
co-authored several books, written numerous academic articles, and is a regular presenter at
national and international conferences. She also active in consultation work with the Ministry of
National Resources and Environment and Hicom Communication Sdn Bhd.
Dr. Azlinah Hj. Mohamed, a Professor at Faculty of Computer and Mathematical Sciences,
Universiti Teknologi MARA has been teaching for past 22 years. Currently she is the Dean of
Faculty of Computer and Mathematical Sciences. Her areas of interest are Artificial Intelligence
(AI) and Knowledge Management (KM) and System Engineering. She has also co-authored
several books, written numerous academic articles, and has been regular presenter at national and
international conferences.
46 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
AUTOMATIC LEXICON GENERATOR: AN ALGORITHM
Kasturi Dewi Varathan
Faculty of Information Science & Technology, National University of Malaysia, Bangi,
Selangor, 43600, Malaysia.
Tengku Mohd. Tengku Sembok
Kuliyyah ICT, International Islamic University of Malaysia
Kuala Lumpur, Malaysia
Rabiah Abdul Kadir
Faculty of Computer Science & IT, University Putra Malaysia, Serdang, Selangor, 43400,
Malaysia.
Abstract. Over the past decades, computer revolution has opened up many possibilities for new
field of investigation. With greater accessibility to information and lowering cost of powerful
computers, this has spawned new efforts towards understanding complex tasks. Lexicon in
particular has long been recognized as interesting and challenging because of its complexness. It
is the knowledge of individual words in the language that has been perceived as central
component for all types of natural language processing system. In this paper we present an
algorithm to create an automatic lexicon generator in order to generate lexicon from an input
document by making use of Apple Pie Parser. The lexicon generated managed to reduce
significant amount of time and manpower drastically. Psycholinguists as well as computational
linguists can benefit from this automatic lexicon construction.
Keywords: lexicon, apple pie parser, syntactic tree, natural language processing.
1. Introduction
A lexicon is the backbone of any NLP (natural language processing) system. It is the knowledge
about individual words in the language. Accurate words with grammatical attributes are very
important for many applications such as information extraction, tagging, text mining, ontology,
etc. Perhaps, it is pushing the researchers and developers to accommodate all these applications
with larger and richer lexicons. Thus, the task of designing and developing it has become a
crucial aspect that needs to be taken into serious consideration. It is an undeniable fact that good
quality lexicon construction requires significant amount of time and human labor especially with
the vast amount of growing text from day to day basis. An automatic lexicon generation engine
has become an application which is a necessity than before. Gone are the days when we depend
on the manual construction of lexicon.
Thus, the aim of this paper is to show how lexicon will be generated automatically by
integrating Apple Pie Parser. The remaining of the paper will contain sections on literature
review, automatic lexicon creation, conclusion and limitation of the research.
47 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
2. Literature Review
A lexicon is a collection of representation of words used by a natural language processor as a
source of words specific information (Shalabi & Kanaan 2004). It is being referred as the
component of a NLP system that contains information (semantic, grammartical) about individual
words or words usage (Guthrie et al. 1996). Sagot (2005) has mentioned that lexicon plays a
central role in NLP tasks. The existence of lexicon in NLP has become very critical nowadays
(Ralph & Nicoletta 1996). We can now safely say that there are almost no NLP projects be it in
the form of information extraction/ retrieval, machine translation, dialog systems or even text-
speech recognition that can function without the existence of it. Since the existence of it is so
important, many research efforts toward creating NLP lexicon has emerged since the late 1980s
(Shalabi & Kanaan 2004).
Many lexicons generated in the past have been done manually(. Generally these lexicons
may have been large (Sagot 2005) and expensive to build and would have been customized for
the need of a particular application only (Ralph & Nicoletta 1996). For example, the COMLEX
Syntax (Grishman & Mcyers 1994) and the ANLT dictionaries (Bouguraev & Briscoe 1987) are
manually subcategorized lexicons available for English. Besides that, there are also lexicon that
has been crafted manually to represent an experiment corpus such as Rabiah(2008) who
represented Remedia corpus for question answering system for reading comprehension using
logical inference model. Since it has been catered for a specific application, it will not be
suitable to be shared among other applications which are more general. Besides that, manual
construction is of course slow and very cumbersome. Perhaps, we will not be able to
accommodate the rapid growth and changes of text that occurs every day.
Because of these drawbacks, research efforts in the development of lexicon generation
have emerged. WordNet (Miller 1990) is a very famous lexical database for English. Rich
semantic relations of words including synonym, antonym, and so on in which words are linked
together to form a network has been provided. Unfortunately it does not provide a
comprehensive account of possible syntactic frames and predicate argument structures associated
with individual verb senses. Besides that, Corelex(Buitelaar 1998) which is based on the
generative lexicon has concentrate on nouns rather than verbs.
3. Automatic Lexicon Creation
Verma & Bhattacharya (2004) have presented a method for automatically generating the
dictionary from an input document by making use of the WordNet. Meanwhile, we will present
a method for automatic lexicon creation by using Apple Pie Parser. This parser was developed
by Sekine & Grishman (1995) at New York University in 1995. It is a bottom up probabilistic
English parser which uses best first search algorithm to find the parse tree. The grammar used in
apple pie parser is a semi context grammar which contains 2 non-terminals. These grammars
were automatically extracted from Penn Tree Bank. The main goal of this parser is to generate a
syntactic tree from reasonable sentences that can be obtained from news papers, text book or
websites which does not contain ill-formed sentences.
This automatic lexicon creation will be able to support large scale resources with broad
coverage of information which is not meant for any specific domain. There are lexicons catered
48 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
for a particular domain such as MedPost (Smith et al. 2005) which is specific to biomedical
domain, lexicon for cooking actions (Khirai & Ookawa 2006), etc.
The automated procedures that we have developed in this automatic lexicon creation have
helped in extracting and standardizing the representation of lexical data. It managed to extract
data from various textual based resources. These automated lexicons can be capitalized by many
applications such as NLP systems, question answering systems (Kasturi Dewi et al. 2010) as well
as information retrieval systems in order to be more effective, efficient and robust.
This section will elaborate on how the automatic lexicon has been created. The conceptual
framework for automatic lexicon generation is shown in Fig. 3.1.
Fig 3.1: Conceptual Framework of Automatic Lexicon Generation
For this research, we have used Remedia Corpus as the input text. It is the first corpus
developed to evaluate automated Reading Comprehension systems (Hirschman et al. 1999). This
corpus consists of remedial reading materials for grades 3 to 6 and was annotated by the MITRE
Corporation (Hirschman et al. 1999). Many NLP applications researchers such as Wellner et al.
(2005) and Frakes (1992) have also used this corpus as their experiment corpus. Each document
story is approximately 150-200 words in length. The stories cover a wide time period (19th
and
20th
centuries) and a wide range of topics including science, natural disasters, economy, sports
and the environment.
The document story will then be fed into the Apple Pie Parser. The following shows an
example of input text sentence which has been extracted from one of the document story.
Input Text: Dr Mark brushed Sam's teeth.
The input text will then be processed by Apple Pie Parser. A syntactic tree will be generated as
an output from the parser. The output is shown as below.
Input text
Apple Pie
Parser
Automatic
Lexicon
Generator
Lexical
Database
49 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Apple Pie Parser Output:
(S (NPL (NNPX (NNPX Dr) (NNP Mark))) (VP (VBD brushed) (NP (NPL (NNP Sam) (POS
's)) (NNS teeth))))
The output generated is being represented as a syntactic tree which contains part of speech
(POS). The diagram is shown in Fig. 3.2. Thus, we have made use of this POS tagging results
to dynamically generate lexicon from Apple Pie Parser. This syntactic tree will then be used as
an input to the automatic lexicon generator.
Fig. 3.2: Syntactic Tree Diagram
S
NPL VP
NNPX NNP
NNPX
Dr
VBD NP
brushed
NPL NNS
NNP POS
Sam teeth Mark ‘s
50 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Fig. 3.3: Algorithm for the Automatic Lexicon Generator
Each stages of the algorithm have been explained in detail as follows:
3.1 Tokenizing Text Process
The output obtained from apple pie parser will be used as an input for this process. The
tokenizing process will break the body of the input text into a stream of individual word together
with its POS. The input of tokenizing process and the output is shown as below.
Input from Apple Pie Parser:
(S (NPL (NNPX (NNPX Dr) (NNP Mark))) (VP (VBD brushed) (NP (NPL (NNP Sam) (POS
's)) (NNS teeth))))
Output of tokenizing process:
NNPX Dr
NNP Mark
VBD brushed
NNP Sam
NNS teeth
The syntactic tree from apple pie parser will be tokenized and stemming process will be done to
the tokenized text. Each token will be parsed to stemming process together with its part of the
speech tags.
No
Yes
Yes
No
Automatic
Lexicon
Generator
Tokenizing Text
Stemming
Verb
Verb Categorization
Irregular
Verbs
Lexical
Database
Irregular Verbs
Irregular Verb Argument
Extraction
51 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
3.2 Stemming Process
Stemming is very useful for improving retrieval performance because they reduce variants of the
root word to a common concept. We use affix removal stemming strategy which has been
proven as efficient, simple and intuitive (Frakes 1992). Our focus is on suffix removal because
most variants of words are generated by use of suffixes instead of prefixes. In order to remove
the suffixes, Porter’s algorithm(Porter 1997) has been used due to its reliability, simplicity and
elegance.
3.3 Verb Categorization Process
The next step is to identify whether the token is a verb. Apple pie parser managed to categorize
verb as auxiliary verb, modal verb and active verb through its POS tagger.
Examples of auxiliary verb are: have, be, do.
Examples of modal verb: can, should.
All the other remaining verbs belong to active verb. Transitive and intransitive verbs falls
under the category of active verb but apple pie parser did not state this subcategory explicitly.
If the token has been identified as non-verb, then all the tokens together with its part of
speech tags will be saved in the lexical database. As for verb tokens, we proceed with verb sub
categorization process for active verb. This process will then further subcategorized a verb to
transitive and intransitive verb.
Example of a sentence which contains transitive verb:
- Mary dropped the plate.
Example of sentence which contains intransitive verb:
- At last the wind dropped.
The sub categorization of verb to transitive or intransitive cannot be done by just using the
word and its POS alone. Furthermore, POS tagging results fail to inform whether the verb falls
under transitive or intransitive category (Das et al. 2009; Balducinni et al. 2007; Cowie et al.
2000). It has to be done by using the structure of the sentence and the placement of the verb in
that structure. For example, the verb “dropped” as shown in the example above can be
represented as transitive or intransitive.
The automatic lexicon generator will use some predefined rules which has been catered
for this categorization process (Noor Azlina 1986). Once the verb has been successfully
subcategorized, we proceed to irregular verb checking. If the verb does not belong to irregular
verbs list, then this verb is a regular verb whereby majority of English verbs are. An example of
the regular verb form is as follows:
52 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Table 1: Regular Verb Arguments
Base Form Third Person
Singular Past Tense
Past
ParticipleTense
Present
Participle
Tense
brush brushes brushed brushed brushing
These 5 forms of arguments will be generated for each of the verb before it could be saved
into the lexical database. The generation of arguments will be based on the general rules given in
(Noor Azlina 1986).
A. 3.4 Irregular Verb Argument Extraction
In this process, each of the verbs will be checked to ensure whether it belongs to common
irregular verbs (UsingEnglish.com). If the verb belongs to irregular verb, then all the verb
argument that belongs to that particular verb will be extracted from the irregular verb database.
An example of an irregular verb is “take”. The arguments extracted for the verb “take” is shown
as in table 2. All the extracted verb arguments with its part of speech, in this case either transitive
verb (tv) or intransitive verb (iv) will be saved into the lexical database.
Table 2: Irregular Verb Arguments
Base Form Third Person
Singular Past Tense
Past
ParticipleTense
Present
Participle
Tense
take takes took taken taking
A special user interface has been created in order to help users to view the lexicons
generated. Examples of transitive lexicons that have been saved are shown as in Fig. 3.4.
We know that English is a rich and complex language and it is hard to accommodate an
application for each rule and exemption that the language contains. However, we have taken
measures to fulfill general rules which will help us in automating the creation of lexicon.
Nevertheless, there are times these rules will not be able to accommodate some of the complex
or irregular lexicon. To accommodate to this situation, the user will be given the opportunity to
make any changes to the lexicon that has been generated. Through this interface, user may edit
and save the changes that they have made to the lexicon.
53 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Fig. 3.4: Screenshot of generated transitive verb
4. Conclusion
In conclusion, this paper has described how an automatic lexicon can be generated by making
use of Apple Pie Parser. Furthermore, the generation of automatic lexicon managed to reduce
significant amount of time and manpower compared to manual approach which is proven to be
tedious, slow and expensive. Besides, the generated lexicon managed to cater for wide range of
topics and not restricted to specific domain. For future research, we would like to extent this
research to be incorporated with question answering system using logical linguistic approach that
we are working on currently.
5. Limitations
Automatic Lexicon generator managed to generate general lexicon for general domain. We did
not cater for any domain restricted corpus such as technical corpus, science corpus or medical
corpus which has not been supported by Apple Pie Parser. Besides that, the lexicon generator
will accept grammatically correct sentences only.
54 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
6. Acknowledgment
The authors wish to thank the Ministry of Higher Education Malaysia and University of Malaya
for giving the opportunity for the authors to conduct this research.
References
Balducinni, M, Baral, C. & Lierler, Y. (2007). Handbook of Knowledge Regresentation.
Bouguraev, B & Briscoe, T. (1987). Large lexicons for natural language processing utilizing the
grammar coding system of the Longman Dictionary of Contemporary English,
Computational Linguistics,vol. 13, pp. 219–240.
Buitelaar, P. (1998). CoreLex: Systematic Polysemy and Underspecification. Ph.D: Brandeis
University.
Cowie, J, Ludovik, E., Salgado,H.M., Nirenburg, S. & Scheremetyeva, S. (2000). “Automatic
Question Answering,” RIAO Conference, Paris.
Das,D., Ekbal, A. & Bandyopadhyay, S. (2009). Acquiring Verb Subcategorization Frames in
Bengali From Corpora, Computer Processing of Oriental Languages. Language Technology
for the Knowledge-Based Economy. Vol.5459/2009, pp.386-393.
Frakes, W. (1992). Stemming algorithms. In W. Frakes ann R.Baeza-Yates, editors, Information
Retrieval: Data Structures & Algorithms, pp. 131-160. Prentice Hall, Englewood Cliffs, NJ,
USA.
Grishman,R. & Mcyers, A. (1994). Comlex syntax: building a computational lexicon.
Proceedings of the International Conference on Computational Linguistics, Kyoto, Japan,
1994, pp.268–272.
Guthrie, L., Pustejovsky, J., Wilks, Y. & Slator, B.M. (1996). The Role of Lexicons in Natural
Language Processing. Communication of the ACM, vol. 39, no. 1, pp. 63–72.
Hirschman, L., Light, M., Breck, E. & Burger, J. (1999). Deep Read: A Reading Comprehension
System. Proceedings of the 37th Annual Meeting of the Association for Computational
Linguistics, pp. 325–332.
Kasturi Dewi, V., Tengku Mohd. T.S & Rabiah, A.K. (2010). Automatic Lexicon Generator for
Logic Based Question Answering System. Proceedings of The 2nd
International Conference
on Computer Engineering and Applications (ICCEA 2010), pp.349-353.
Miller, G. (1990). WordNet: An on-line lexical database. International Jounal of Lexicography,
vol. 3, pp. 235-312.
Noor Azlina, Y. (1986). A Quick English Reference, Penerbit Fajar Bakti Sdn. Bhd,
Porter, M.F. (1997). An algorithm for suffix stripping. In K.Sparck Jones and P.Willet, editors,
Reading in Information Retrieval, pages 313-316. Morgan Kaufmann Publishers, Inc.
Rabiah A.K. (2008). Question Answering for Reading Comprehension Using Logical Inference
Model. PhD Thesis. National University of Malaysia.
Ralph G., Nicoletta C. (1996). “Survey of the State of the Art in Human Language Technology”,
New University, Instituto di Linguistica Computazionale del CNR.
Sagot, B. (2005). Automatic Acquisition of a Slovak Lexicon from a Raw Corpus. Springer-
Verlag Berlin Heidelberg, pp.156-163.
Sekine, S & Grishman, R. (1995). A corpus- based probabilistic grammar with only two non-
terminals. Proceedings of the International Workshop on Parsing Technologies, pages 216–
223.
55 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Shalabi, E.A & Kanaan, G. (2004). Constructing an Automatic Lexicon for Arabic Language,
International Journal of Computing & Information Sciences, vol. 2, no. 2, pp. 114-128.
Shirai, K & Ookawa, H. (2006). Compiling of Lexicon of Cooking Actions for Animation
Generation. Proceeding of the COLING/ACL, Sydney, pp.771-778
Smith L, Rindflesch T & Wilbur WJ. (2005). The Importance of the Lexicon in Tagging
Biological Text. Natural Language Engineering, 12(2), pp1-pp17.
Tiedemann, J. (1997). “Automatical Lexicon Extraction from Aligned Bilingual Corpora,”
Diploma Thesis, Otton-von-Guericke-Universitat Magdeburg.
UsingEnglish.com Ltd. http://www.usingenglish.com/reference/irregular-verbs [30 OCTOBER
2009].
Verma, N & Pushpak Bhattacharyya. (2004). Automatic Lexicon Generation through WordNet,”
Proceedings of the Global WordNet Conference, pp.226-233.
Wellner, B., Ferro, L, Greiff, W & Hirschman, L. (2005). Reading comprehension tests for
computer-based understanding evaluation. Natural Language Engineering.
56 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
KNOWLEDGE TRANSFER APPLYING THE STRUCTURAL COMPLEXITY
MANAGEMENT APPROACH
Maik MAURER
Technische Universität München, Product Development, Garching, 85748, Germany
Abstract. Enterprises in all industries are facing the challenge of transferring knowledge in order
to keep up their competitiveness. This challenge is intensified by the fact that the frequency of
transfers increases while the time slots for transfers decrease. The application of methods and
tools to support knowledge transfers must not be resource-consuming. Numerous existing
approaches turn out to be unsuitable in practice, as they do not fulfill this fundamental
requirement. We developed an approach for efficient knowledge transfer in cooperation with
industrial partners using the method of Structural Complexity Management. The approach stands
out for its uncomplicated application and limited resource consumption. The intuitive
visualization of analysis results enables employees to focus on relevant knowledge components.
The systematic prioritization of these relevant components facilitates a well-directed knowledge
transfer between employees.
Keywords: knowledge management, knowledge transfer, Structural Complexity Management,
MDM, dependency modeling.
1. Initial Situation
Enterprises operating in today’s international and demand-oriented markets are facing the
challenge of satisfying an increasing amount of individual customer demands. At the same time,
development, production and distribution processes have to maintain higher output and quality at
less available time and resources. As a result, adding value is becoming increasingly knowledge-
intensive and knowledge management is attracting notice (Nonaka and Takeuchi 1995). In other
words, companies cannot afford to make identical mistakes twice – and have to transfer valuable
knowledge between employees (Argote and Ingram 2000). Enterprises in all industries have
started to introduce knowledge management systems within the last decade (Alavi and Leidner
2001).
Knowledge to be managed divides up into knowledge on objects and knowledge on the
correlations between objects. While knowledge on objects is documented in most cases,
knowledge on the correlations between objects is not (Bergeron 2003). E.g. detailed information
on products, customers, processes and delivery dates is available in most enterprises. But it is not
documented that adapting a specific product to customer demands at a specific process stage
excludes meeting the requested delivery date. This knowledge does not refer to the objects
themselves, but to the correlations between them. Experienced employees, however, know that
kind of correlations, i.e. experience can be regarded as knowing correlations.
The documentation and visualization of correlations is common practice in many technical
fields. Knowledge correlations, however, are rarely documented – and can hardly be transferred
actively.
57 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
2. Problem
Today, the need for knowledge transfer occurs more often than in prior times. On the one hand,
the flexibility on the labor market is increasing. People change their jobs more often than ever
before. Different jobs involving different demands result in the need for life-long learning
(Machlup 1962). On the other hand, the demands on employees have become highly dynamic.
The steadily changing situation of enterprises operating in globalized markets results in steadily
changing demands on their employees. As a consequence, employees have to keep on learning to
stay up to these demands (Botha 2007).
Knowledge management systems do often not meet the needs of modern enterprises. These
systems require continuous information input and maintenance, which is time-consuming and
thus turns out in a poor cost-benefit ratio. In practice, knowledge management often is not
performed during employment, because the expenses would be too high. In case of an
employee’s resignation the time for detailed knowledge transfer (by use of a knowledge
management system) lacks. The frequency of resignations is permanently increasing
(fluctuation) – and intensifies the problem.
Important knowledge transfers are commonly realized by intense cooperation of employees over
a period of time. Although this procedure is quite effective, it holds a major disadvantage: The
expenditure of time is high, and the efficiency is correspondingly low. As the difference in the
knowledge of the mentor (knowledge owner) and the mentee (knowledge receiver) is not
assessed, the knowledge to be actually transferred remains unclear and the transfer process
cannot be well-directed.
3. Objective
The presented approach is not intended to support classical knowledge management, i.e.
knowledge acquisition during employment. It is aimed at providing pragmatic solutions to the
efficient transfer of knowledge, in case an employment is ending for any reason. Therefore,
knowledge objects and correlations, which actually need to be transferred, are identified.
Proceeding this way enables a well-directed transfer process that holds a major advantage: The
expenditure of time is low, and the efficiency is correspondingly high.
To maintain a reliable and repeatable knowledge transfer process, methods for performing the
following four tasks need to be developed: Firstly, the required knowledge needs to be acquired
with the mentor. This must not be a time-consuming task. Relevant knowledge objects and their
dependencies have to be collected in a systematic process. Due to hard time constraints, only the
existence and not the qualitative specification of dependencies should be captured. Secondly, the
existing knowledge needs to be acquired with the mentee. This represents the part of the required
knowledge the mentee already possesses. Thirdly, the knowledge gaps, i.e. the parts of the
required knowledge the mentee is missing so far, need to be identified. These are the knowledge
objects and dependencies the mentor has to impart to the mentee. Fourthly, the knowledge to be
transferred needs to be prioritized. The required knowledge objects correlate to each other, i.e.
some objects are required as preconditions for other objects. A suitable technique to prioritize
knowledge parts allows determining the best sequence to transfer them from the mentor to the
mentee and increases the efficiency of the transfer process.
58 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
The knowledge transfer process should be kept as easy as possible for the mentor and the
mentee. Time constraints do not allow extensive familiarization with the method. Therefore the
procedure has to be supported by a moderator. Furthermore, intuitive visualization of knowledge
objects, dependencies (networks) and prioritization is important. This allows mentor and mentee
to better interact with the knowledge network – and results in an improved transfer quality.
4. State of the Art
Mind maps are commonly used to capture, (re)structure and visualize information in industrial
practice (Buzan and Buzan 2006). Mind maps are particularly suitable for acquiring knowledge
on objects (Eppler 2003), but they are unsuitable for acquiring knowledge on the correlations
between objects. Considering the objective of a pragmatic solution, mind maps can only provide
a solution to the acquisition of knowledge on objects.
A pragmatic solution to the acquisition of correlations has not been introduced into knowledge
management so far. Structural Complexity Management (StCM) represents a promising
approach, as it is based on collecting, analyzing and optimizing correlations (Lindemann et al.
2009). However, this method has almost only been applied to product design problems so far. At
least, first adaptations to the requirements of knowledge management and application in
industrial practice have been realized already (Maurer et al. 2009). Several software tools allow
the modeling of user-defined (not only hierarchical) dependencies; LOOMEO (Teseon 2009)
supports especially the StCM approach. A comprehensive method and software support, which
meets the requirements of a pragmatic knowledge transfer process, has not been realized so far.
5. Approach
The approach to a pragmatic knowledge transfer is subdivided into four sequential steps. These
steps are based on the framework of the generic StCM approach (Lindemann et al. 2009). They
have been adapted to the specific needs of a knowledge transfer, however, taking into account
practical experiences (Maurer et al. 2009).
During the initial system definition, the mentor specifies the knowledge objects relevant to the
transfer process (supported by the moderator). In the subsequent step of information acquisition,
the mentor determines the dependencies between these knowledge objects. This task represents
the most time-consuming and quality-relevant part of the entire approach. Therefore, a
systematic procedure and adequate software support are mandatory. Once the knowledge objects
and dependencies are on hand, they can be visualized. An intuitive representation allows the
mentor and the mentee to scrutinize specific objects and dependencies or to interact with the
acquired knowledge. That means that mentor and mentee can e.g. focus the visual representation
on single knowledge objects (and their dependencies) with major importance. The consideration
of and interaction with specific system views provides better understanding of the complex
knowledge system. The system analysis represents the final step of the approach and provides a
prioritization of knowledge objects. The highest prioritized objects should be taught first by the
mentor, as minimum efforts (for teaching) meet maximum benefit (of capability for executing
tasks).
59 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
5.1. System definition Within the StCM approach, the step of system definition contains the identification of relevant
system domains and their general dependencies (Lindemann et al. 2009). A domain represents a
class of elements, e.g. “Components” or “Functions”. The general dependency between
“Components” and “Functions” means “realize”. This abstract formulation expresses that every
component that is considered to possess a dependency on a function contributes to the realization
of that function. Most complex problems require an intense system definition, as the
specification of domains and their general dependencies is fundamental to all further activities.
Domains and dependencies that are relevant within the context of knowledge transfers could be
determined in several projects. Figure 1 shows the four domains “Tasks”, “Competences”,
“Networks” and “Methods”. The dependencies all possess the meaning “required for”.
“Competences”, “Networks” and “Methods” only possess outgoing dependencies, i.e. these
domains represent enablers required for achieving “Tasks”. The “Task” domain possesses three
incoming and one self-reflexive dependency. That means that “Tasks” represent objectives
within the system. This can be approved in practice, as knowledge transfer intents to enable the
mentee to fulfill tasks.
Figure 1 System definition with four domains
The basic model of domains and dependencies has been successfully applied in knowledge
transfer processes (Maurer et al. 2009). This meta-model must be adapted to specific needs. For
example, the “Network” domain can be disregarded sometimes, whereas “Competences” could
be split up into separate domains (e.g. management and technical competences). The following
explanations are based on the meta-model in Figure 1, as it is appropriate to many transfer
processes. 5.2. Information Acquisition Based on the system definition, the domains have to be detailed, i.e. relevant knowledge objects
have to be determined. Therefore, the mind mapping method and associated software support
(Mindjet 2009) can be applied. Guided by the moderator, the mentor shall name relevant
knowledge objects within each domain. The moderator has to assure that these objects possess
the same level of detail, as this is important to the subsequent acquisition of dependencies. For
practical reasons, domains should not exceed the quantity of 40 included objects. If, however,
domains tend to exceed that quantity, they should be subdivided.
If knowledge objects have been captured for all domains, next the dependencies between
knowledge objects have to be acquired. This represents the most time-consuming part of the
approach and therefore requires methodical support. According to the StCM approach,
knowledge objects are transferred to a Multiple-Domain Matrix (MDM), as depicted in Figure 2.
The abstract MDM depicts exactly the same information as the graph in Figure 1. Moreover,
Competences
Networks
Methods
Tasks
60 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
every matrix cell in Figure 2 contains a more detailed matrix, which allows the determination of
dependencies between knowledge objects.
Figure 2 Representation of the knowledge system in a Multiple-Domain Matrix
Detailed matrices containing the dependencies between concrete knowledge objects only have to
be filled out, if the corresponding domains are linked (Figure 2). Proceeding this way minimizes
the efforts for information acquisition. Figure 2 shows exemplarily the detailed matrix for the
acquisition of dependencies between knowledge objects from the network and the task domain.
As it can be seen, dependencies are only acquired as binary, i.e. true or false, information.
Whereas the mentor has to specify the knowledge objects and their dependencies, information
acquisition for the mentee is less demanding. Mentees only have to specify their familiarity or
unfamiliarity with the knowledge objects. This task can be executed best by using the mind map
of knowledge objects already created with the mentor.
5.3. System Visualization and Interaction: System Views With the information from the mentor and the mentee on hand, specific system views can be
deduced. The relevant views are shown in the following figures (Figure 3 and Figure 4) in
general. Practical examples can be seen in Chapter 6.
The view described by Figure 3 depicts all knowledge objects that are unknown (or not
sufficiently known) to the mentee (labeled as “Needed”). Furthermore, all dependencies
(specified by the mentor) are indicated, which reach to unknown tasks. In other words, the
system view in Figure 3 indicates the tasks and their enablers, which are unknown to the mentee.
This view allows identifying which knowledge objects need to be transferred before unknown
tasks can be executed by the mentee. In addition, tasks can be identified, which are unknown to
the mentee, but which do not require further enablers. These tasks can directly be transferred,
e.g. by “training by doing”.
Tasks Competences Networks Methods
Tasks X
Competences X
Networks X
Methods XT1 T2 T3 …
N1 X
N2 X X
… X X
61 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Figure 3 Unknown tasks and required enablers
Whereas the first view considers unknown knowledge objects and associated unknown enablers,
the second view focuses on the different handling of tasks and competences between mentor and
mentee. Besides the knowledge objects unknown to the mentee, this represents the second
important block of knowledge transfer: Views based on the logics of Figure 4 show that the
mentee is familiar with a task, but that he does not know one or more associated enablers (based
on the knowledge of the mentor). Obviously, mentor and mentee handle specific tasks
differently, as the mentee does not make use of some of the mentor’s enablers.
Figure 4 Different handling of tasks due to unknown enablers
5.4. System Analysis: Prioritization of Object Transfer The system views on the network of knowledge objects already indicate the objects that have to
be closer considered by mentor and mentee. This provides a first step towards an efficient
transfer process. System analysis now enables users to prioritize knowledge objects to be
transferred by the following criteria: The benefit achievable from the transfer of one knowledge
object can be assessed by the number of tasks that become executable by this enabler. The input
associated with the transfer of one knowledge object can be assessed by the number of additional
TasksRequired for Needed
Competences
Needed
Networks
Needed
Methods
Needed
Tasks
Needed
TasksRequired for
Familiar
Competences
Needed
Networks
Needed
Methods
Needed
Tasks
Needed
62 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
knowledge objects that need to be transferred to enable the execution of the connected tasks.
Both criteria are clarified by the examples A to E in Figure 5.
Elements indicated as T (targets) represent knowledge objects that are unknown to the mentee
and shall be transferred to him, e.g. unknown tasks. Elements indicated as R (requirements)
represent knowledge objects that are also unknown to the mentee and are required for
transferring the targets. In case A, one requirement is linked with one target. That means that the
transfer of one knowledge object enables e.g. performing one task. In case B, the transfer of one
knowledge object would enable performing even two tasks, i.e. the same input (transfer of one
knowledge object) is associated with a higher benefit (the performance of two tasks). The
different cases can be depicted in a portfolio with the input and the benefit on the two axes.
Figure 5 Rating requirements (knowledge objects) by benefit and input
Figure 6 contains the matrix model for computing the achievable benefit, required input and
resulting efficiency of mediating an unfamiliar knowledge object to the mentee. Requirements
(R) correspond to the rows of the matrix, whereas targets (T) correspond to the columns.
Existing correlations are indicated by the cell entries of the matrix. If a requirement is a
precondition of a target, the cell that combines the corresponding row and column contains an
entry (1). The total number of cell entries in a row indicates the number of targets a specific
R1 T1 R1
T1
T2
R1
T1R2
R1 T1
R2 T2
R1 T1
R2 T2
A: B:
C: D:
E:
A
B
C
D,E
Ben
efit
Input1 2
1
2
63 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
requirement contributes to. As explained above, this number determines the (potential) benefit of
transferring a requirement. The formula for computing the benefit of a requirement is denoted
below Figure 6.
The second index for assessing requirements is the so-called input. It indicates the number of
additional requirements that need to be transferred in order to enable (completely) the targets,
which depend on a requirement in focus. Every target corresponds to a specific column and
shares an entry with the row that corresponds to the requirement in focus. If the column includes
additional entries, it depends on additional requirements. The total number of additional entries
in all columns that share an entry with the row in focus is the additional input. The formulas for
computing the (total) input of a specific requirement are also depicted below Figure 6. The
efficiency of mediating an enabling knowledge object to the mentee is determined by the ratio of
benefit to input (formula below Figure 6).
Figure 6 Computation model for benefit, input and efficiency
Benefit:
∑ (1)
Input:
∑ (2)
{ ∑
(3)
Efficiency:
(4)
The computations allow to position knowledge objects in the Benefit-Input Portfolio (Figure 7).
The portfolio divides up into four sectors that categorize knowledge objects: Objects in the lower
left sector represent “quick wins”, as only moderate input is required for deriving benefit (e.g.
only moderate input is required for empowering the mentee to perform certain tasks). The upper
left sector contains the knowledge objects that are the first ones to teach, as moderate input
results in high benefit (e.g. one single enabler represents the basis for executing several
unfamiliar tasks). The lower right sector contains the objects that are the last ones to teach, as a
considerable input can only produce a low benefit. Objects in the upper right sector, finally, are
the so-called “tough wins”, as considerable input is required for gaining significant benefit. The
delimitations of the sectors cannot be fixed to a certain quantification of input and benefit, but
are case-specific.
xij 1
1 1Ri..m
Tj..n
rk
R1
T3Rk
T5
64 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Figure 7 Classification of elements in the Benefit-Input Portfolio
All objects in the portfolio possess a minimum benefit value of one (each object enables at least
one other object) as well as a minimum input value of one (at least the enabling object itself is
required). Knowledge objects that are located on the same diagonal possess an identical
efficiency (although they might possess different input and benefit). The Benefit-Input Portfolio
provides an intuitive visualization of the rating of enabling objects and therefore allows mentor
and mentee to focus their discussion on the most beneficial ones. As the Benefit-Input Portfolio
represents a snapshot of a knowledge system and a transfer process respectively, it can only
serve for assessing single objects (which are of interest for any reason). It is, however, unsuitable
for determining the optimum transfer sequence of several or even all knowledge objects, because
the transfer of each object might change the potential benefit and required input of all other
objects.
Assuming that all knowledge objects have to be transferred in the end, the iterative assessment of
their benefit and input (by applying the Benefit-Input Portfolio) can be used to prioritize them
and plan the transfer process. As the term “first to teach” implies, the transfer process should be
started with an object, which is located in the corresponding sector (the one that possesses the
highest potential benefit and the lowest additional input). If the sector does not include any
(more) objects, the “quick wins”, “tough wins” and finally “last to teach” sector become the
focal point of interest. The selected object and all other objects that are required to actually
derive the related benefit need to be transferred together. In order to proceed with the
prioritization, this group of knowledge objects then has to be removed from the computation
model (see Figure 6) and the Benefit-Input Portfolio has to be recomputed. From the recomputed
portfolio, the next object can be selected. This iterative proceeding allows deriving the transfer
sequence that possesses the optimum benefit progression.
The Benefit-Progression Diagram depicted in Figure 8 includes the derived transfer sequence
(sequence of bars), the groups of knowledge objects that need to be transferred together (groups
of bars with identical height) and clarifies the benefit progression.
Quick wins
Last to teach
First to teach
Tough wins
Efficiency
Dir
ect b
enef
its
Minimal benefitsB
enef
it
Input1
1
65 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Figure 8 Optimum transfer sequence in the Benefit-Progression Diagram
6. Results
Figure 9 to Figure 14 represent (parts of) the outcome of a mentor-mentee interview generated in
a project with an engineering company. For reasons of non-disclosure the system elements have
been made anonymous; the structural dependencies have been retained unchanged. The graph
representations were used for discussions between mentor and mentee. The Benefit-Input
Portfolios were applied for preselecting knowledge objects to facilitate transfer processes. The
time constraints required a highly efficient execution. All in all, the expenditure of time for all
involved people (mentor, mentee and moderator) added up to 8 man days. While the moderator
had to invest 4.5 days of work, mentor (2.5 days) and mentee (1 day) were burdened with less
effort. These expenditures do not include the subsequently required discussions between mentor
and mentee. However, these could be executed very efficiently and the additional input for the
preparative tasks could be overcompensated in the end.
Figure 9 shows the tasks the mentee has not been familiar with. The linked and also unfamiliar
competences and methods have been named by the mentor as being necessary enablers for the
unfamiliar tasks.
Figure 9 Unfamiliar enablers required for needed tasks
Ben
efit
Transfer sequence
1
2
3
4
5
66 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Figure 9 allows determining some important enablers, but the Benefit-Input Portfolio is more
suitable for a detailed and precise rating. Figure 10 shows that Method-1 (M-1) should be taught
first, as it enables the highest amount of tasks and only one additional enabler is required.
Figure 10 Rating unfamiliar enablers
Re-computing the portfolio and selecting objects successively allows deriving the optimum
transfer sequence, which is depicted in the Benefit-Progression Diagram of Figure 11.
Figure 11 Optimum transfer sequences of unfamiliar enablers
Figure 12 depicts familiar tasks of the mentee and competences, methods and other tasks the
mentor specified as required enablers for these tasks. These enablers are unknown to the mentee,
which means that mentor and mentee handle the (known) tasks differently.
M5
Ben
efit
Input1
1
2 3 4 5
2
3
4
5
M4
M1
6
C4
7 8
M2,C3
M3,C2,C1
Ben
efit
Transfer sequence
2
4
6
8
10
M1 C4 M4 M2 M5 C3 M3 C1 C2
67 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Figure 12 Different tasks handling due to unfamiliar enablers
The consideration of the associated Benefit-Input Portfolio (Figure 13) allows the
characterization of knowledge objects for the discussion between mentor and mentee. Method-A
(M-A) should be taught first, owing to the high benefit at low input.
Figure 13 Rating unfamiliar enablers of tasks handled differently
The optimum sequence for discussing and transferring enabling knowledge objects is clarified by
the Benefit-Progression Diagram of Figure 14.
Input1 2 3 4 5 6 7 8 9B
enef
it
1
2
3
4
5
6
7
M-A
C-A
M-D,T-n-2
M-B,M-C
C-E,C-F,C-G
T-n-3
T-n-1
10 11 12 13
C-D
C-B,C-C
68 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Figure 14 Optimum discussion and transfer sequence of unfamiliar enablers
7. Conclusion
Transferring knowledge represents a major challenge for enterprises, as adding value is
becoming increasingly knowledge-intensive. Knowledge management systems require
continuous and time-consuming maintenance. At the same time, the frequency of knowledge
transfers is increasing and time slots for carrying them out are decreasing.
The mind mapping method and the Structural Complexity Management approach have been
successfully applied to knowledge management in the recent past. The presented approach
combines these methods into a pragmatic solution for executing efficient knowledge transfers.
We proofed the usability and efficiency of our approach by carrying out several projects in
industrial practice. The expenditure of time for training involved employees and acquiring
information with them was adequate to all intents and purposes. Suitable software support and
the intuitive visualization of results enabled the intended well-directed transfer of knowledge.
References
M. Alavi and D.E. Leidner, “Review: Knowledge Management and Knowledge Management
Systems: Conceptual Foundations and Research Issues,” MIS Quarterly, vol. 25, no. 1, Mar.
2001, pp. 107-136.
L. Argote and P. Ingram, “Knowledge Transfer: A Basis for Competitive Advantage in Firms,”
Organizational Behavior and Human Decision Process, vol. 82, no. 1, May 2000, pp. 150-
169, doi:10.1006/obhd.2000.2893.
B.P. Bergeron, Essentials of Knowledge Management. Hoboken, NJ: John Wiley & Sons, 2003.
A.P. Botha, Knowledge – Living and Working with It. Cape Town: Juta & Company, 2007.
T. Buzan and B. Buzan, The Mind Map Book. Harlow: Pearson Education, 2006.
M.J. Eppler, “Making knowledge visible through knowledge maps: concepts, elements, cases,”
in Handbook on Knowledge Management 1: Knowledge Matters, C.W. Holsapple Ed.
Berlin: Springer, 2003, pp. 189–205.
U. Lindemann, M. Maurer and T. Braun, Structural Complexity Management – An Approach for
the Field of Product Design. Berlin: Springer, 2009.
Ben
efit
Transfer sequence
2
4
6
8
10
M-A C-B C-C C-A C-D C-E C-F C-G M-B M-C M-D T-n-1T-n-2T-n-3
12
69 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
F. Machlup, The Production and Distribution of Knowledge in the United States. Princeton, NJ:
Princeton University Press, 1962.
M. Maurer, H. Klinger and A. Benz, “Applying the Structural Complexity Management to
Knowledge Transfer in Small and Medium-Sized Companies,” International conference on
innovation through knowledge transfer 2009 (InnovationKT’09), in press.
Mindjet LLC, MindManager, http://www.mindjet.com, accessed: 28 September 2009.
I. Nonaka and H. Takeuchi, The Knowledge-Creating Company: How Japanese Companies
Create the Dynamics of Innovation. Oxford, NY: Oxford University Press, 1995.
Teseon GmbH, LOOMEO – Software for the Structural Complexity Management,
http://www.teseon.com, accessed: 28 September 2009.
70 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
TAXONOMY OF K-PORTAL FOR INSTITUTE OF HIGHER LEARNING
IN MALAYSIA: A CASE STUDY OF UNIVERSITI SELANGOR (UNISEL)
Haslinda Sutan Ahmad Nawi, Shireen Muhammad and Nur Syufiza Ahmad Shukor
Universiti Selangor, Kuala Selangor, Selangor, 45600
and
Mohd Lezam Lehat
Universiti Teknologi MARA, Segamat, Johor, 85000
Abstract. This paper demonstrates results of a common taxonomy for knowledge portal (K-
Portal) adopted by the institutes of higher learning (IHL). Preliminary data was gathered from
the literature review that later produced a draft taxonomy. Based on the review, a set of
questionnaire was developed and, a random survey on a set of academicians was conducted in
order to acquire ingenuous feedback on respondents’ level of awareness and willingness in
contributing to their institutional knowledge sharing endeavors. The data was quantitatively
analyzed and mapped against the results from the observation done at the ranges of active IHL
portals. In the end a common taxonomy was identified, and later adopted in Universiti Selangor
(UNISEL) knowledge portal.
Keywords: institutes of higher learning, K-Portal taxonomy.
1.0 Introduction
Knowledge is critical to the long term sustainability and success of any organization (Nonaka &
Takeuchi, 1995). The importance of knowledge is crucial to both public and private
organizations, particularly educational institutions such as universities (John, 2001). As a
container of knowledge, institutes of higher learning (IHL) are no longer just providing
knowledge to students, but also managing and blending together as well as sharing such
knowledge among the students. Thus, knowledge sharing is certainly challenging and vital
concept in IHL.
Over the last decade, the education era has evolved from a traditional teaching
environment to a highly open and dynamic knowledge-based (K-Based) environment (Arntzen et
al., 2009). This is due to the great adoption of ICT tools, particularly the knowledge portal (K-
Portal) in facilitating communication, coordination and knowledge sharing among academic
community of institutes of IHL (Muhammad et al., 2010). However for most of the
organizations, the tacit knowledge is the main knowledge type as the activity of transforming the
knowledge into documented and digitized form are not easily done (Ahmad Shukor et al., 2009).
According to Arntzen et al., (2009), access to the knowledge is only the beginning.
Generally, most of the academicians are already IT literate, they digitize and deposit monographs
of course note, collections of examination / assignment questions and samples of report in their
private repositories. Consequently, these knowledge objects are not sharable and furthermore this
is harmful for the current education landscape that demands for each knowledge objects to be
constantly accessible via K-Portal (Muhammad et al., 2010). This paper recognizes the
71 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
importance of knowledge taxonomy as a tool for architecting K-Portal with the intentions of
streamlining navigation, improving knowledge sharing and developing a better user experience.
The term knowledge management was first used by management guru Peter Drucker in the
mid-1980s. This concept focused on measuring an organization’s intellectual assets and adding
value and meaning to its information by asking questions such as: What do the organization need
to know? Who in the organization knows it? Who in the organization needs to know it? How
can the people in the organization, who need this information, access it? Subsequently, in the
mid-1990s, the term knowledge management took on a new meaning through the development of
computer applications that provide a means of organizing and accessing the information within
an organization.
A definition of knowledge management by Davenport et al. (1998), support the above
statement:
Knowledge management is concerned with the exploitation and development of the
knowledge assets of an organisation with a view to furthering the organisation's
objectives. The knowledge to be managed includes both explicit, documented
knowledge, and tacit, subjective knowledge. Management entails all of those
processes associated with the identification, sharing and creation of knowledge. This
requires systems for the creation and maintenance of knowledge repositories, and to
cultivate and facilitate the sharing of knowledge and organisational learning.
Organisations that succeed in knowledge management are likely to view knowledge
as an asset and to develop organisational norms and values which support the
creation and sharing of knowledge.
Access to quality education and capacity to acquire, assimilate and productively utilize
knowledge will therefore essential to affirmative action in a K-Based economy. Knowledge
needs to be kept updated and alive in a well structured knowledge repository. Therefore IHLs
have to modernize the quality of education by nurturing their existing culture and intellectual
infrastructure to support life-long learning. IHLs should consider of designing their own
taxonomy to best fit their K-Portal content, context and users. For the purpose of this study,
knowledge sharing is defined as the process of exchanging knowledge (skills, experience, and
understanding) among academician and university stakeholders such as student, head of
department, dean and others. In Malaysian environment, ICT’s capacity that contributes to
greater efficiency of work and resource management has not been fully exploited (Economic
Planning Unit, 2009). Downloading lecture notes from the web is still uncommon and the
Internet is not being adequately accessed for lecture contents. Therefore the development of on-
line system for knowledge sharing will be important to nurture the distribution and management
of knowledge in the economy, accordingly K-Portal is an appropriate bridge between knowledge
and digital divides (Muhammad et al., 2010).
Suitable taxonomies play an important role for this K-Portal in managing teaching
materials because the classification of objects helps academician understand and analyze their
domains. The reduction of complexity and the identification of similarities and differences
among objects are major advantages provided by taxonomies. Debowski, 2006 said that
taxonomies are structured as multilevel hierarchies from general to specific content. Mack et al.,
72 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
2001 defined taxonomy as an adequately labeled set of hierarchically organized clusters for a
document collection. While Chaudhry & Khoo, 2008 define taxonomy as term used in a broad
context referring to a categorization scheme comprising terms arranged hierarchically to
facilitate tagging and browsing of different types of knowledge resources. Taxonomy is an
initiative when building knowledge repositories in realizing knowledge capitalization Arntzen et
al.,(2009). The best part is that this taxanomy will benefit the IHL in the area of search, support,
navigation, data control/mining, schema management, and personalization/information delivery
(Bennya et al., 2004).
This paper discovers common taxonomy for organizing teaching materials and learning
objects in IHL K-Portal by evaluating current K-Portal of nine Accelerated Programme for
Excellence (APEX) candidate universities in Malaysia; and conducting a random survey on a set
of academicians to acquire feedback on respondents’ level of awareness; and willingness in
contributing to their institutional knowledge sharing endeavors. Data from the survey was
analyzed and mapped against the results from the observation of nine active IHL portals of
APEX candidate universities. Ultimately a common taxonomy was identified, and been adopted
in Universiti Selangor (UNISEL) knowledge portal.
2.0 Methods
This research was conducted with two primary objectives; to investigate the common academic
knowledge sharing practice on K-Portal and to examine types of knowledge shared by
academicians. In doing so, a research method that started with a literature review on studies done
by Arntzen et al., 2009 and Chaudhry & Khoo, 2008 to produce draft taxonomy, was employed.
A set of questionnaire was then developed based on this draft to confirm the common knowledge
taxonomy practiced by academician at UNISEL. Simultaneously, a content evaluation on K-
Portal of nine shortlisted APEX candidate universities was examined based on the same draft
taxonomy. Both of the findings were then mapped and common knowledge taxonomy was then
formulated. Fig. 1 illustrated the research method used in the project.
73 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Fig. 1: Methods of the Research
The first step in conducting the research was to produce draft taxonomy from the literature
review. The draft taxonomy that was constructed was based on Arntzen et al., 2009 and
Chaudhry & Khoo, 2008 as depicted in Table 1. Arntzen et al., 2009 identified six types of
knowledge, while reference Chaudhry & Khoo, 2008 listed eleven types of knowledge. Among
the knowledge identified, three of them are mutually identified by both Arntzen et al., 2009 and
Chaudhry & Khoo, 2008, namely discussion database / forum, exercise (e.g. quizzes / lab sheets)
and syllabus / course information. In the end, fourteen types of knowledge that are commonly
shared by the academicians were identified that produced the draft taxonomy of knowledge
shared among the academicians. The draft taxonomy produced needed to be verified and the
verification was done by using the draft taxonomy as parameters in both survey and K-Portal
content evaluation of selected universities which later conducted in this research.
Table 1: Draft Taxonomy
Types of knowledge Sources
Announcement / news Arntzen et al., 2009
Assignment questions
Chaudhry & Khoo,
2008
Books
Chaudhry & Khoo,
2008
Case study questions
Chaudhry & Khoo,
2008
Discussion database / forum Arntzen et al., 2009
Chaudhry & Khoo,
2008
Examination questions
Chaudhry & Khoo,
2008
Exercise (e.g. quizzes / lab sheets) Arntzen et al., 2009
74 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Chaudhry & Khoo,
2008
Extra reading materials
Chaudhry & Khoo,
2008
Lecture artefacts (e.g. slides /
notes)
Chaudhry & Khoo,
2008
Projects questions and samples of
report
Chaudhry & Khoo,
2008
Related references Arntzen et al., 2009
Student lists Arntzen et al., 2009
Syllabus / course information Arntzen et al., 2009
Chaudhry & Khoo,
2008
Tools (software downloads)
Chaudhry & Khoo,
2008
Consequently, questionnaires that utilized the draft taxonomy as one of the testing
parameters were distributed to sixty (60) academicians (lecturers and assistant lecturers) for a
period of 2 weeks (Jul 7 - 21, 2009). Table 2 shows six faculties in Universiti Selangor
(UNISEL) Bestari Jaya were involved in the survey with a total number of forty (40) sets of the
questionnaires were completed, returned and analyzed.
Table 2: Survey Distribution
Faculty No. of
lecturers No. of questionnaires
Distributed Analyzed
Applied Science & Mathematics 36 6 6
Business 60 6 6
Education & Language Studies 44 6 6
Engineering 66 6 2
Computer Science & Information
Technology 60 30 18
Social Science & Industrial
Management 25 6 2
291 60 (21%) 40 (14%)
Source: Muhammad et al, 2010.
With the aim to analyze the respondents’ willingness in sharing their academic knowledge
via K-Portal, the followings were identified through the survey: the source of knowledge; types
of knowledge acquired; types of knowledge repositories; frequency of updating the knowledge
repositories; types of knowledge contained in the repositories; means of sharing; types of
knowledge an academician willing to share; and format of sharing. Based on the elements
identified a more valid taxonomy can be devised.
75 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Series1, Personal
knowledge portal, 3
Series1, Free e-learning
tools, 3
Series1, Personal
email / blog , 8
Series1, Personal
website, 9
Series1, Free e-groups, 25
Series1, Manually, 52
Percentage (%)
Kn
ow
led
ge s
har
ing
me
diu
m
Concurrently, K-Portal content evaluation based on the same parameters analyzed in the
questionnaires was performed. This data collection activity helped to identify the K-Portals
taxonomy for nine shortlisted APEX candidate universities. APEX candidate universities were
chosen for this research as APEX is the rating and ranking exercises practiced in Malaysia to
evaluate each university’s state of readiness, transformation plan and preparedness for change
based on quantitative and qualitative criteria. Therefore it is substantial for this research to refer
to these universities’ K-Portal as the justifiable references.
Finally the findings from the survey conducted at the faculty and results formulated from the
content evaluation of nine APEX candidate university K-Portals were mapped to produce IHL
common knowledge taxonomy which later was utilized by the K- FCSIT Portal.
3.0 Analysis and Discussion
3.1 Analysis
Data for this research was gathered from questionnaires disseminated to 60 academicians of six faculties in UNISEL on July 2009. From 60 sets of questionnaire collected, 40 sets were completed and thus valid for the analysis purpose. Based on the feedback, it is learnt that 60% of the respondents are female; 70% of the respondents are qualified to master’s degree level; and 52% of the respondents are experienced in education profession for over six years. Regrettably as shown in Fig. 2, despite of the high education level and extensive years of teaching experience factors, many of the respondents continue to employ the traditional approach of knowledge sharing by 52% while only a few of them adopt K-Portal in their knowledge sharing endeavour by 3%. According to the analysis on the latter event, it is the found that several factors may be the contributors namely the non-existence of corporate K-Portal as the unified sharing platform and the disorganized information architecture that is ineffective to embody the organizational content. Consequently, manual means are chosen – photocopies of everything are dumped into pigeon holes or simply distributed by hand.
Source: Muhammad et al, 2010.
Fig. 2: Knowledge Sharing Medium
76 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
The questionnaires further discover the sources of knowledge acquired by the academician, as
depicted in Table 3, in which knowledge sources are ranked into three groups based on their
popularities among academicians. The first group (1-3 rank) represents the 53% of major source
of knowledge comprises of physical books, formal learning, journals and materials from the
Web. While the subsequent group (rank 4-6) stands for the 27% of moderately referred sources
namely, hands-on trainings, peers coaching and magazines. And finally, the third group (rank 7-
9) signifies the 20% least referred sources chosen by the academicians for knowledge acquisition
purpose that are seminars and workshop.
Table 3: Sources of Knowledge Acquisition
Types of knowledge f Rank % Knowledge
source rank
Books 20 1 35
53% major
source (1-3
rank)
Formal learning 9 1
Journals 7 2 8
Materials from the
web 8 3 10
Hands-on trainings 8 3.5 10 27% moderate
source (4-6
rank)
Peers coaching 7 4 8
Magazines 7 6 8
Seminars 8 8 10 20% minor
source (7-9
rank) Workshop 8 8 10
Source: Muhammad et al, 2010.
The surveys further realize the types of knowledge captured by the academicians, as shown
in Table 4, wherein types of knowledge are ranked into three groups based on the academicians’
priority set-up. From the results, it is ascertained that the most acquired types of knowledge are
syllabus, lecture artifacts, text books, assignment questions and case study questions. This is due
to the fact that these types of knowledge are the most useful and importance materials in
conducting a class. If is also learnt from the surveys that the academicians moderately seek for
examination question, exercises and project materials such as project question and sample of
report as their knowledge source. And finally, the least or minor types of knowledge acquired by
the academicians are research report, published document and tools such as programming and
multimedia editing software.
77 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Table 4: Types of Knowledge Acquired
Types of
knowledge f Rank %
Knowledge
source rank
Syllabus 8 1
51
51% major types
of knowledge
acquired (1-3
rank)
Lecture artifacts 10 1
Text books 8 1.5
Assignment Q 6 2.5
Case study Q 6 3
Examination Q 8 4
28
28% moderate
types of
knowledge
acquired (4-6
rank)
Project material 5 4.5
Exercises 8 5
Research report 5 7
21
21% minor types
of knowledge
acquired (7-9
rank)
Published
document 4 8
Tools 7 9
Source: Muhammad et al, 2010.
For the benefit of this research, additional data were also collected from assessments done to
K-Portals of APEX candidate universities in Malaysia, based on the same parameters analyzed in
the questionnaires. The evaluation resulted that every K-Portal enables four clusters of
knowledge processes namely, knowledge capture and store; search and retrieve; structure and
navigate; and integration with existing business application.
While Table 4 demonstrates the types of knowledge required by a population of respondents
from UNISEL, in contrast, Fig. 3 represents frequent types of knowledge provided by the K-
Portals of APEX candidate universities. This register of knowledge types is further exploited in
constructing a common taxonomy of K-Portal as depicted in Table 5.
Source: Muhammad et al, 2010.
78 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Fig. 3: Generic Types of Knowledge Deposited in APEX Candidates Universities’ K-Portal
Table 5 represents common taxonomy, which classifies types of knowledge into three main
categories, according to the three criteria of UNISEL’s academician assessment scheme. The
reason behind the decision of to exercise the assessment-based categorisation is first, to simplify
the assessment process in such a way that the performance of each academician can be easily
tracked by their academic writing contribution as one of the factor; and second, to assist the
academician in better organizing their knowledge objects in line with the university
requirements. Hence, knowledge should be classified accordingly into academic administrative;
teaching and learning material; and research material categories.
As the common taxonomy is established, it is subsequently “revisited” and “re-evaluated” by
the same group of respondents for their sharing willingness consideration. As shown in Table 5,
there are seventeen types of knowledge, wiling to be shared by the respondents whereby based
on the analysis of 40 questionnaires, over 80% of academicians are willing to share these types
of knowledge through the university K-Portal.
In conclusion, this common taxonomy of knowledge for K-Portal has been successfully
established by aligning it to the APEX universities with the conformation and approval from its
future users – the academicians.
79 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
Table 5: Common Taxonomy Knowledge
Knowledge sharing willingness
based on common knowledge
taxonomy
f %
Academic Administrative
Announcement / news 39 98
Discussion / forum 36 90
Students list 37 93
Syllabus 38 95
SOPs 34 85
Teaching & Learning Material
Books 39 98
Assignment question 38 95
Case study question 36 90
Examination question 37 93
Exercises 37 93
Lecture artifacts 38 95
Project material 34 85
Tools 38 95
Research Material
Extra reading material 38 95
Published document 35 88
Related references 36 90
Research report 33 83
Source: Muhammad et al, 2010.
3.2 Taxonomy Application: A Case Study of UNISEL
An effort should be made to put into practice on the agreed common taxonomy depicted in Table 5. Realizing this fact, the Faculty of Computer Science & Information Technology (FCSIT), UNISEL have initiated a project identified as the K- FCSIT Portal, which is to develop a knowledge management portal for FCSIT. This project is to leverage and enhance collaboration and sharing of teaching materials between academicians – the so-called “Community of Practice” of this project.
As current market is now heavily being “pushed” by the open source venture and considering
the financial, technical and operational feasibilities of the technology, FCSIT decided to exploit
Remository 3.52 as the extension tool for establishing and maintaining the teaching material
repository of this portal.
In general, K-FCSIT Portal project took more than a year to complete that is from January
2009 to June 2010, and it is managed through three phases of implementation. In Phase I,
80 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
teaching materials namely books; assignment questions; case study sets; samples of examination
question; exercises; lecture artifacts such as lecture notes and lecture slides; project instructions,
guidelines and samples; and finally tools such as software installer links that are needed in the
lab sessions are uploaded by the course owner, i.e.: the lecturer for the respective course. Apart
from teaching materials, academic administrative materials are as well being uploaded, which are
announcement or news; discussion and forum; student lists; syllabus and first day lecture slides;
and SOPs. While in phase II additional tools for user is integrated to enrich their usage
experience, namely search engine, email integration and FAQ. Finally in Phase III, that is to be
implemented, research materials category namely extra reading materials; published document;
related references; and research report.
As the course owner, a lecturer is allowed to upload academic administrative materials, as
shown in Fig. 4.
Fig. 4: Academic Administrative Material
81 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
A lecturer is also allowed to upload teaching and learning material, as depicted in Fig. 5.
Fig. 5: Teaching and Learning Material
Having materializing the knowledge capture endeavor of teaching materials into the portal, it
is ultimately demonstrated that the taxonomy created has been successfully mapped and adopted
by the UNISEL, specifically by the academicians of FCSIT. While the significance of this
project on the other hand, other than as the central teaching material hub, it is greatly used by the
Malaysian Quality Accreditation (MQA) panels during their visit to FCSIT since third quarter of
2010. They are now referring to the K- FCSIT Portal, as their main reference, rather than the old-
fashion bulky hard-files to refer and evaluate every teaching material.
4.0 Discussion and Conclusion
In conclusion, K-Portal has a big responsibility as an enabler to knowledge management
activities in Malaysia. Therefore before any portal technology initiative is taken into
consideration; the development of the information context as well as user context should be
thoroughly studied (Sulaiman & Selamat, 2007). Based on result discussed, the study can
conclude that there is a need to increase the K-Portal usage among academicians as the enabler
of knowledge management activities and there is a requirement of generic knowledge taxonomy
to fit the K-Portal.
The current study is focusing on the investigation and reporting of the existence knowledge
taxonomy of IHL in Malaysia and knowledge sharing willingness among the academicians. The
report is based on evaluating of nine APEX candidate universities in Malaysia and 40 valid
sampling from one IHL-UNISEL. As a conclusion, the authors revealed that, the knowledge
taxonomy proposed in this paper are suitable for an adoption in the UNISEL K-Portal. It is
hoped that we could further the study by widening our sample. By doing so, we could propose
taxonomy that broadly accepted in order to cultivate the knowledge sharing society among
academician through the usage of K-Portal. Furthermore the functions and features embedded in
the K-Portal also could be an area of the next study.
82 JIRKM – Journal of Information Retrieval and Knowledge Management (Vol. 1, 2011)
4.1 Acknowledgement
The research team would like to express their appreciation to the K-FCSIT Portal team for their
support in developing the K-Portal.
References
[1] Nonaka, I.O. & Takeuchi, H. (1995). The Knowledge-creating Company: How Japanese
Companies Create the Dynamics of Innovation. New York: Oxford University Press.
[2] John, S. (2001). Research and the Knowledge Age. Tertiary Education and Management,
7(3), 227-230.
[3] Arntzen, A.A.B., Worasinchai, L. & Ribie`re, V.M. (2009). An Insight Into Knowledge
Management Practices at Bangkok University. Emerald, 13(2), 127-144.
[4] Muhammad, S., Sutan Ahmad Nawi, H., & Lehat, M.L. (2010). Taxonomy of K-Portal for
Institute of Higher Learning in Malaysia: A Discovery. International Conference of
Retrieval & Knowledge Management (CAMP), (pp.358 – 361).
[5] Ahmad Shukor, N.S., Sutan Ahmad Nawi, H., Basaruddin, S. & Md Rahim, N. (2009).
Investigation of Knowledge Management Processes Among Academicians at Faculty of
Industrial Information Technology, UNISEL: A Case Study. International Conference of
Information Management & Engineering, (pp.732 – 735).
[6] Davenport, T.H., DeLong, D.W. & Beers, M.C. (1998). Successful Knowledge Management
Projects. Sloan Management Review, 39(2), 43-57.
[7] Economic Planning Unit. (2009). Knowledge-Based Economy Master Plan. Retrieved 13
August 2009, from http://www.epu.gov.my/knowledgebased
[8] Debowski, S. (2006). Knowledge Management. Australia: John Wiley & Sons Ltd.
[9] Mack, R., Ravin, Y. & Byrd, R.J. (2001). Knowledge Portals and the Emerging Digital
Knowledge Workplace,. IBM Systems Journal, 40(1), 925 – 955.
[10] Chaudhry, S. & Khoo, C.S.G. (2008). Enhancing the Quality of LIS Education in Asia:
Organizing Teaching Materials for Sharing and Reuse. New Library World, 109(7/8), 354 –
365.
[11] Benbya, H., Passiante, G. & Belbaly, N. A. (2004). Corporate portal: A Tool for Knowledge
Management Synchronization. International Journal of Information Management, 24, 201-
220.
[12] Sulaiman, M. S. & Selamat, M.H. (2007). Portal Project Implementation in Nuclear
Malaysia to Support Knowledge Management Activities. Proceedings of International
Atomic Energy Agency General Conference, IAEA-CN-153/2/P/11.