Page 1: September 2016 - University of Sheffielddagda.shef.ac.uk/dispub/dissertations/2015-16/External/Sen_Wang.pdf · information are content, authorship, layout and structure. The result

A comparison of Google and Baidu: How do Chinese users assess the

quality of the information retrieved for them by the two search engines?

A study submitted in partial fulfilment

of the requirements for the degree of

Postgraduate

at

THE UNIVERSITY OF SHEFFIELD

by

SEN WANG

September 2016

ABSTRACT

Purpose – The purpose of this paper is to determine how Chinese users assess the quality of the information retrieved from Baidu and Google, and to compare the performance of Baidu and Google during users’ searches.

Design/methodology/approach – The researcher interviewed Chinese users in order to observe, record, and analyse the criteria they use when seeking and evaluating information on a topic, and compared the performance of Baidu and Google.

Findings – Chinese users adopt three main steps when assessing the quality of information: pre-judgement, looking at the display of the website, and looking at the content of the website. Each step involves many detailed criteria. In the study, Google performed better than Baidu.

Keywords – Search engine, criteria of evaluating information, Chinese users

1. INTRODUCTION
2. LITERATURE REVIEW
2.1 PERSPECTIVES OF INFORMATION CREDIBILITY
2.2 PERCEIVED WEBSITE QUALITY
2.3 PERCEIVED CONTENT OR MESSAGE CREDIBILITY
2.4 PERCEIVED CREDIBILITY OF THE AUTHOR OR CONTENT CREATOR
2.5 BAIDU VERSUS GOOGLE AND HOW USERS ASSESS THE TWO SEARCH ENGINES
3. METHODOLOGY
4. RESULT
4.1 EVALUATION CRITERIA
4.1.1 Pre-judgment
4.1.2 Display of websites
4.1.3 Content
4.2 PROCESS OF EVALUATING ONLINE INFORMATION SOURCES
4.3 COMPARISON OF BAIDU AND GOOGLE
5. DISCUSSION
5.1 SUITABILITY OF THE ADOPTED RESEARCH APPROACH
5.2 EVALUATION CRITERIA
5.2.1 Pre-judgment of hyperlinks
5.2.2 Website assessment
5.2.3 Content evaluation
5.3 BAIDU VERSUS GOOGLE
6. CONCLUSION
7. APPENDICES
7.1 ETHICAL FORM
7.2 PARTICIPANT CONSENT FORM
7.3 APPROVAL LETTER
8. REFERENCES

1. Introduction

This paper explores two questions about information-searching behaviors:

1. How do Chinese users assess the quality of information retrieved from Baidu and Google?

2. Which of the two search engines, Baidu or Google, performs better at retrieving high-quality information?

As online information has proliferated during the past decade, the Web has become the primary source of information for many people (Jansen & Spink, 2006). Many scholars have begun to explore the criteria for credible information, and many papers concentrate on a specific field, for instance health information (Rains & Karmikel, 2009), information on Twitter (Castillo, Mendoza, & Poblete, 2011), or advice sites (McKnight & Kacmar, 2006).

However, in the early years only a few scholars studied how to evaluate general website information (Fritch & Cromwell, 2001). Some studies focused more on the function of search engines, for example search engine result lists (Davis et al., 2001) and general knowledge of search engines (Ryan, Ryan, Ryan, Munro, & Robinson, 2002). Metzger (2007) suggested that scholars could do more research on how users evaluate non-specific information.

In recent years, many scholars have focused on exploring the criteria for assessing the credibility of information, for example Madden et al. (2012), Pickard et al. (2010), and Walraven et al. (2009). Their studies are valuable and provided guidance for this paper.

The main difference between previous studies and this paper is that this research selected specific subjects: Chinese users and two search engines (Baidu and Google). CNNIC (2015) reported that there were about 536 million search engine users in China by June 2015. That is a very large user group, so Chinese users can be considered highly representative. Search engines, as necessary tools for information searching, are very important to users. Baidu is the most widely used search engine in China and Google is the most popular search engine in Europe. The two search engines are therefore typical subjects for research.

2. Literature review

2.1 Perspectives of Information Credibility

Credibility is complex and multifaceted. Source, receiver, message, medium, and context are all important factors (Wathen & Burkell, 2002). The authors gave more details about these factors (as shown in Table 1).

Factor / Issues

Source
• Expertise/Knowledge
• Likeability/Goodwill/Dynamism
• Similarity to receiver beliefs
• Attractiveness
• Trustworthiness
• Credentials

Receiver
• Issue relevance
• Motivation (i.e., need for the information)
• “Social location”
• Prior knowledge of the issue
• Stereotypes about source or topic
• Issue involvement
• Values/beliefs/situation

Message or Content
• Topic/content
• Internal validity/consistency
• Plausibility of arguments
• Supported by data
• Framing (loss or gain)

Medium
• Organization
• Usability
• Presentation
• Vividness

Context
• Distraction/“noise”
• Time since message encountered

Table 1: Examples of factors influencing credibility (taken from Wathen & Burkell, 2002, p. 136).

Wathen and Burkell (2002) also proposed a model (Figure 1) for how users evaluate the reliability of internet information.

Figure 1: Proposed model for how users judge the credibility of online information (taken from Wathen & Burkell, 2002).

In his 2004 study, Liu designed a questionnaire of 20 simple questions about how users assess the credibility of scholarly information on the web. Two of the questions are closely relevant to the topic of this dissertation: ‘What are the three most important criteria you use in evaluating the credibility of scholarly information on the web’ and ‘When you assess the credibility of scholarly information on the web, what features make the information less credible?’. The researcher sent questionnaires to undergraduate and graduate students from diverse disciplines. In total, 135 questionnaires were collected, of which 128 were complete and seven incomplete. The results indicated that the features used to evaluate the quality of information are content, authorship, layout, and structure. Users are more likely to judge the content of information to be good when the information and the site in general are well organized and show sound logic, spelling, and grammar. In addition, not trying to sell something is very important for good content.

Drawing insights from Liu (2004) and Wathen and Burkell (2002), this literature review focuses on three major credibility criteria: medium or site quality, content credibility, and source credibility. A medium is a platform where Web-based information or content is created and/or shared, for example blogs, corporate websites, portals, personal sites, brand-building sites, click-to-donate sites, community sites, e-commerce sites, and wikis. People judge the credibility of different types of media differently, for example on the basis of overall structure and visual design. The content perspective considers information credibility based on the actual message communicated via a medium, for example how focused, authentic, current, relevant, or insightful the content or message presented on a corporate website is. From the source dimension, information credibility judgments are based on the users’ perceptions of the content creator, who can be an individual, a community, or an organization. For example, users may consider the reputation or background experience of the content creator when making credibility judgments.

2.2 Perceived website quality

According to Wathen and Burkell (2002), the key website credibility measures are appearance or presentation (colours, graphics, font size, no obvious errors, and attention to detail); usability or interface design (navigability and menus, interactivity, and download speed); and how information is organized. Liu (2004) discovered that students judge layout and structure to be good when a website has a clear layout, fewer advertisements and pictures, good documentation, and good usability and visual design. In addition, some other features play an important role in evaluating the credibility of websites and information, for example the URL domain (such as .edu and .gov), the price of information, the publisher of information, and the verification of information.

Burton and Chadwick (2000) surveyed 543 students from Western University to investigate how they assess resources and websites of interest from the Internet. The results showed that students regarded resources and websites that are easy to find, access, and understand as trustworthy. While Burton and Chadwick (2000) found the ease of finding, accessing, and understanding websites to be critical to perceived trustworthiness, Castillo, Mendoza, and Poblete (2011) indicated that users tend to treat resources they find applicable to their immediate needs as credible because they seem to be published out of goodwill or care. Similar findings were documented by Rubin and Liddy (2006), who indicated that the relevance of blogs promotes engagement, confidence, positive beliefs and attitudes, and, as a consequence, credibility. However, they add competence as another evaluation criterion of medium credibility.

McKnight and Kacmar (2006) studied the criteria of information quality for an internet advice site. They recruited 571 students from a U.S. university as participants, producing 504 usable responses. They then used a quantitative method to test a model of their own design (as shown in Figure 2). The results showed that perceived information credibility includes general dispositions (faith in humanity, suspicion of humanity, and risk propensity), technology dispositions (internet anxiety and trust in technology), and initial impressions. Among initial impressions, they found the following factors to be critical to site quality – ‘trusting beliefs, perceived reputation, and willingness to explore the site – were important to build website credibility’.

Figure 2 (Taken from McKnight & Kacmar, 2006)

McKnight and Kacmar (2007) conducted another study related to their 2006 research. They used a quantitative method to test a model (Figure 3) designed for evaluating medium credibility. The results showed that this model was credible. In addition, McKnight and Kacmar (2007, p. 430) found support for the claim that three first impressions of the website – ‘trusting, perceived reputation, and willingness to explore the website – establish initial information credibility’. This conclusion is similar to that of their 2006 research. The subsequent study also found that belief was constructed by three general dispositions: suspicion of humanity, trust in general technology, and risk propensity.

Figure 3 (Taken from McKnight & Kacmar, 2007)

Metzger (2007) analyzed many existing criteria and methods for assessing the credibility of information and websites and, based on these previous studies, proposed a model (Figure 4) for evaluating information.

Figure 4 (Taken from Metzger, 2007)

Madden, Ford, Gorrell, Eaglestone, and Holdridge (2012) explored how people assess the credibility of a site, using purely qualitative methods. They discovered that first impressions are a major assessment criterion. Interviewees described four factors related to first impressions: name (users judge whether a website has authority by looking at its name); user prejudice (a positive prejudice towards a website influences users’ assessment of it); and website description and URL (users regarded URLs ending in .edu and .gov as more trustworthy). Website appearance and accountability, including elements such as advertisement placement, visual design, layout, and the presence of references, constitute a critical credibility evaluation factor. Finally, they drew a basic model indicating the stages of a search at which evaluative factors begin to come into effect (Figure 5).

Figure 5 (Taken from Madden et al., 2012)

Abdulla et al. (2002) analyzed the criteria for evaluating the credibility of online newspapers. They identified three main criteria: trustworthiness, currency, and bias.

2.3 Perceived content or message credibility

Head and Eisenberg (2009, p. 3) explored how students assess the quality and reliability of resources retrieved from search engines through ‘everyday life research’. They asked students to report on their search activities, including the difficulties they met when seeking useful information or resources. Students reported that searching for academic resources is more difficult and complex than searching for general information, because assessing the credibility and authority of academic resources requires more knowledge. The reports also covered how students evaluated the resources and information they obtained from search engines or by other means. The core criteria for students were the authority and content of resources. Currie, Devlin, Emde, and Graves (2010) interviewed 10 undergraduate students at the University of Kansas and observed them to find out more about how users search for credible sources. Users regarded the publisher of information and sources as a key factor in assessing the credibility of information.

Hung (2004, pp. 5–11) researched how students assess web resources for class papers. This study found five major constructs: coverage, accuracy, authority, objectivity, and currency. Metzger, Flanagin, and Zwarun (2003) studied the credibility and verification of academic and non-academic Web-based information. Their study involved 436 student volunteers from universities and 307 non-student respondents recruited using a ‘snowball sampling’ technique. They posed a series of research questions, guided by the TAM (Technology Acceptance Model) and diffusion theory, to determine what criteria the volunteers used to assess the credibility of information. The research questions most relevant to this project are: ‘Do students and non-students vary in their perceptions of the credibility of various types of information (i.e., news, reference, entertainment, and commercial) across different media?’; and ‘To what extent do student and non-student users verify online information and what specific verification strategies do students, in particular, use?’ (Metzger et al., 2003, p. 281). The findings indicated that the Web is a major source of informational (academic and general) resources for college students. However, college students were found to be less concerned about the credibility of Web-based information, and they rarely verify the accuracy and timeliness of online resources. Nevertheless, the study indicated that students assess credibility based on the type and source of the information of interest, either academic or general. Like previous studies (Abdulla et al., 2002; Burton & Chadwick, 2000; Fritch & Cromwell, 2001; Hung, 2004), Metzger et al. (2003) found currency, completeness, authority, and competence of content to be critical credibility assessment factors.

Content such as customer relations, brand reputation, and legal protection represents key pillars of accountability and, eventually, credibility. Moreover, users assess whether content is genuine or phony on the basis of authorship, quality of writing, quality of references, corroboration, bias, evidence of maintenance, and evidence of research (Madden et al., 2012).

In a study involving 21 participants, Eysenbach and Kohler (2002) explored how users evaluate the credibility of health information from the Internet. The study had three stages. The first used focus groups to determine the principles, with users describing how they decide whether health information retrieved from the Internet is credible. In the second, the researchers observed participants seeking information and made a record whenever participants appeared confident about the information or websites. The third consisted of interviews with participants about how they evaluate online health information and what criteria they use to decide whether information is trustworthy. Participants said that the criteria for credible information are whether the source was published by a reputable individual or organization, whether the website cited scientific references, whether the page has technical design and layout, whether the website is easy to access, and whether the language of the content is easy to understand.

Pickard, Gannon-Leary, and Coventry (2010) carried out a project named ‘Users’ trust in information resources in the Web’. They divided the project into two phases. In the first phase they analysed existing studies. In the second phase they designed a questionnaire based on the factors identified in the first phase. They then sent e-mails about the project to users (students, academic tutors, and researchers) and providers (commercial service providers and HE information service providers) in the North East of England, and sent a second e-mail with the questionnaire attached to those who responded positively to the first. Once they had received responses to the questionnaire, they analysed them and drew a Model of User trust in information resources in the Web environment (Figure 6).

Figure 6: Model of User trust in information resources in the web environment (taken from Pickard et al., 2010)

They divided the factors into external factors and internal factors, as shown in Table 2.

External factors
1. Whether users have to pay for the information (e.g. students are not willing to pay for internet information)
2. Seals of approval
3. Credibility rating systems and recommendations by others
4. PIC labels
5. Rankings

Internal factors
1. Need for closure
2. Need for cognition
3. Purpose
4. Prior knowledge
5. Time available
6. Ability
7. Cognitive limits
8. Propensity to trust
9. Risk propensity
10. Internet anxiety
11. Information searching vs information assessment and some caveats
12. How students use the web for research

Table 2: External factors and internal factors (taken from Pickard et al., 2010)

Tombros et al. (2005) asked 24 participants to describe what constituted useful information on the Web pages they chose to view. Participants rated features of each website (such as text and structure) by answering a questionnaire, and were also asked to state the features they used to evaluate whether the websites were useful. Their descriptions help the researcher to understand the criteria by which volunteers evaluate content. In the study by Metzger, Flanagin, and Zwarun (2003) described above, which involved 436 student and 307 nonstudent volunteers, the results indicated that the verification factors included: whether the information is current, whether the information is complete, whether the views represented are facts or opinions, seeking out other sources to validate the information online, and considering the author’s goals/objectives for posting information. In this case, about forty-four participants’ final comments were related to content. All students mentioned quality of content as a factor at least once during the interviews. Familiarity was the most frequently mentioned criterion. Fifteen percent of participants’ comments were related to their prior use of or exposure to a source.

Walraven, Brand-Gruwel, and Boshuizen (2009) studied how students evaluate information and sources retrieved from the internet. Twenty-three students participated and had to complete a series of search tasks set by the researchers. The results showed that usability, verifiability, and reliability are the key assessment criteria. The core factors of usability are connection to the task and language. The core factors of verifiability are the author, whether the information agrees with other sites, whether the information agrees with prior knowledge, and organization. The main factor of reliability is the kind of information.

2.4 Perceived credibility of the author or content creator

Generally, websites and information resources published by authoritative persons and/or organizations are considered more credible (Burton & Chadwick, 2000). Fritch and Cromwell (2001) adopted a different model and indicated that the key factors in how users evaluate websites are the author and institutional affiliation. The authoritativeness of content creators (either individuals or organizations) has therefore emerged as a key measure of credibility (Burton & Chadwick, 2000; Fritch & Cromwell, 2001).

Twait (2005) used a form of non-probability sampling called ‘purposeful sampling’ to explore undergraduate students’ source selection criteria. The researchers sent e-mails about the study to students of Gustavus Adolphus College and then interviewed those who responded. A total of thirteen students were involved. During the interviews, participants were asked to describe their criteria for evaluating sources. The results showed that the key criteria include content, familiarity, and reputation. Other factors that have been studied but were not key include expertise or competence, trustworthiness, credentials, external influence, and the background/experience of the author or content creator (Wathen & Burkell, 2002).

2.5 Baidu versus Google and how users assess the two search engines

While Baidu is a China-based search engine, Google Search (or simply Google) is based in the U.S. Both tools nevertheless serve a global user community. Both support searches for public online resources such as websites, video and audio files, and images held on web servers, as opposed to privately controlled databases (Quelch & Jocz, 2010). Both also offer a wide array of functionalities beyond mere searching, for example maps, stock trends, news, flight schedules, movie show times, and weather forecasts. Each dominates its domestic market: Baidu with more than 80% in China and Google with approximately 65% in the U.S. However, Google remains the most popular search engine on the Web in terms of usage. Its global dominance can be attributed to two major factors: support for approximately 120 languages, which explains its penetration across nations other than China, where Baidu is the Web search leader owing to government censorship; and support for natural-language search as opposed to mere text-based search, which reflects a forward-looking approach to the search experience. Many users therefore tend to find Google irresistible, and it has managed to control close to 90% of global search activity compared with a mere 1% for Baidu (Jiang, 2014; O'Rourke IV, Harris, & Ogilvy, 2007).

A number of studies have investigated how Baidu compares to Google from the point of view of Chinese users (Jiang, 2014; Long, Lv, Zhao, & Liu, 2007; Liu, Zhang, & Chen, 2010). What factors do Chinese users consider when assessing the quality of information resources derived from Baidu and Google? A study by Jiang (2014) indicated that ‘search’ performance is the major factor by which Chinese users assess Baidu and Google. The study showed that Baidu returns more search results than Google when the keyword is a hot issue, term, or concept related to China. On the other hand, users indicated that Google is better suited to concepts and terms that are foreign to China. In an experimental study using Chinese keywords, Long et al. (2007) discovered that Google delivered significantly more search results than Baidu, but that their accuracy could not match that of Baidu’s results. This became evident after a detailed analysis of the first 5 pages. Chinese users find Baidu more powerful than Google because it is easy to use and meets the relevance constraint associated with the domestic (Chinese) context and content. The perceived ease of use of Baidu among Chinese users can be attributed to the fact that it returns a precise list of results, owing to its relatively better mastery of the complex Chinese dialect and culture compared with the more complicated and relatively vague Google when it comes to such matters (Liu et al., 2010; Long et al., 2007). Google’s vagueness can be attributed to overdependence on the traditional and simplified Chinese language, which does not accommodate different contexts of usage. Baidu’s dominance in China can therefore be attributed to the following factors, which have also been indicated by other studies (Head & Eisenberg, 2009; Hung, 2004; Metzger et al., 2004; Pickard et al., 2010; Walraven et al., 2009): reliability or perceived usefulness, accuracy, completeness, bias against the more foreign-oriented Google, whether the language of the content is easy to understand, and relevance.


3. Methodology

This paper concentrates on identifying and describing the behaviors of Chinese students using two different search engines. It focuses in particular on the criteria they use for evaluating the websites they find in the course of their searches. A qualitative approach was adopted. Interviews were the main method of data collection, with observation as a secondary method. During the interviews, the researcher observed the participants' searching behavior and asked further questions to identify the motivation behind it.

Participants:

To recruit participants, the researcher sent e-mails about the study to Chinese students at The University of Sheffield whom he knew. The researcher also posted a message about the study in Chinese chat groups in Sheffield via social media (such as WeChat and Weibo). Students who were willing to take part contacted the researcher. When the researcher received a response from a volunteer, the exact place, date and time of the interview were scheduled. In total, ten volunteers participated in this study.

Interview:

Interviews were used in this study because they are a basic way to collect data and sources in qualitative research. An interview involves some elementary steps (Gubrium & Holstein, 2002):

1. The interviewer contacts the interviewees to talk with them (face-to-face, by phone, or by e-mail).

2. The interviewer asks the interviewees questions and records the answers.

3. The interviewer analyses the records systematically to draw conclusions from the interview.

These three steps form the core pillars of an interview.

This study used face-to-face interviews as the main approach, with a video or audio recording. There are many types of interview. This study used the general interview guide approach as the core type. In the general interview guide approach, an outline is prepared in advance, but the interviewer does not need to follow it strictly (Gubrium & Holstein, 2002). The researcher chose interviews as the main method because they are flexible, in-depth, direct, and effective (Gubrium & Holstein, 2002).

The interviews for this study involved the following steps (adapted from Madden et al., 2012):

The first step was to ask volunteers to select a topic about which they felt knowledgeable or in which they were interested, and to search for resources that would help to educate a non-expert in the subject. If the researcher allows volunteers to look for anything, they may choose a subject that interests them but about which they know nothing. In that case, their evaluation is likely to be based on learned heuristics and bias. If, by contrast, they know something about the subject, they can use their own knowledge to help them evaluate. The volunteers were then asked to do the following:

a) Search for three useful resources using Baidu, which is the most widely used search engine in China.

b) Search for three useful resources using Google, which is the most widely used search engine in Europe.

c) Rank the resources from best to worst.

d) Describe the criteria they used in their ranking.

The researcher needed to limit volunteers so that they only looked at the first page of results. The researcher also offered prompts where needed to help people think of a search topic (for example, your home city, the football team you support, your favorite singer, and so on). In addition, the researcher asked each volunteer to use the same keyword throughout the interview. The researcher adapted the search protocol after the initial sessions: because volunteers focused on familiar sites, the researcher asked them to select sites that were new to them.

After the volunteers had finished searching, they were asked the following questions:

1. Why do you consider these websites credible?

2. What are the key criteria for assessing information?

3. Why do you not believe these websites?

4. Could you describe the basic order or model you use to evaluate the information retrieved from a search engine?

The observation method was used during the interviews: the interviewer observed the interviewee in order to ask further questions. Observation involves some basic steps (Taylor, Bogdan, & DeVault, 2015):

1. Determine the observation plan and targets.

2. Enter the place of observation.

3. Start observing and recording.

4. Finish observing.

During the interview, while the interviewee chose three useful resources, the researcher observed how they chose them, asked why they chose them, and recorded their answers, behaviors and thoughts. Some questions were asked on the basis of the observation:

1. Why did you close this site?

2. Why did you close the site directly?

3. Why did you not open this link?

After collecting the data, the researcher analysed the data and records systematically to identify and describe how Chinese users assess the quality of information retrieved from the two search engines.

Risk statement:

This dissertation is classified as low risk.

The main method of collecting data is the interview, which raises two issues:

1. Personal privacy

In the interview, the researcher asked questions about the interviewee's interests or fields of familiarity, and the interview was recorded.

Solution: Before the interview, participants were given an ethical information sheet and a participant consent form. The researcher asked for each participant's agreement and signature before starting the formal interview.

All interview data were stored on a password-protected drive and uploaded to the iSchool's data drive.

2. Safety issues

Solution: All interviews were carried out in the libraries of TUoS or in public cafés (Starbucks or Costa).

4. Result

4.1 Evaluation criteria

4.1.1 Pre-judgment

Users tend to pre-judge the hyperlinks of websites. When users receive the result list from the search engine, the following factors play an important role:

• Headline and description

• Word of mouth/popularity

• Familiarity

• Placement of sites

• User preconception

4.1.1.1 Headline and description

The result list consists of hyperlinks, each with a headline and a description. Users tend to look for keywords in the headline and description to judge whether the site would provide the information they want. This influences whether the site is opened or not. Many interviewees described this factor as follows:

“I will not open it because the headline and description of this link is not match to

what I am looking for” 1(REC_LWR)

“First of all, I read the headline and description of the site to filter links, I do not want

to open every one of them”2 (REC_TRQ)

“When I got the result list from a search engine, I can see the headline and description

of these sites, I just open the sites which I think they could provide useful

information.”3(REC_FL)

1 “我不会打开这个网站因为它的链接标题和描述看起来不是我要找的。” 2 “首先,我会看链接的名字和描述,因为我不想打开所有链接。” 3 “当我得到搜索结果列表的时候,我会看它的标题和描述,我只打开我认为能够提供有用信息的链

接。”


“I will open the Baidu Encyclopedia and zhihu because both of them are very popular

and famous in China”4 (REC_LWR)

In the study, six of the 10 interviewees indicated that the headline and description are very important when they select links to open.

4.1.1.2 Word of mouth/popularity

Participants stated that when they chose a hyperlink, they preferred sites that are very popular or that have been recommended by friends. This is a key factor in selecting links.

“I chose it because it is douban, it is very popular, when I decided to open a link or not

based on whether it is famous”5 (REC_ZYY)

“I never will open some sites which have a horrible word of mouth, for example, the

tencent.com”6 (REC_TRU)

4 “我会打开百度百科和知乎因为他们非常流行。” 5 “我选择打开这个链接是因为它的名字是豆瓣,一个非常流行的网站,当我决定要不要打开一个链接的

时候我会看它是不是非常有名。” 6 “我永远不会打开那些口碑很差的网站,比如说腾讯网。”


“When I searched a topic which I am strange, there is no doubt that I will choose some

famous sites to read”7 (REC_XY)

Besides, some websites were selected even though they are not very famous, because friends had recommended them:

“I selected this site because my friends told me this site is good”8 (REC_PHA)

“I am going to open it because lots of my friends use it” 9(REC_CKK)

4.1.1.3 Familiarity

Familiarity is one of the criteria. The results showed that volunteers were willing to open sites which they had used before. This applies especially to comprehensive websites, which cover many fields and therefore have many users, making them more likely to be selected. When the researcher asked interviewees why they opened these sites and not others, they answered as follows:

“I know this sites before, and I am used it before”10 (REC_ZYW)

7 “当我搜索一个很陌生的话题时,我会选择一些名声比较大的网站。” 8 “我选择打开这个网站因为我朋友告诉我这个网站很好。” 9 “我打算打开这个网站是因为我朋友们都用它。” 10 “我之前就知道这个网站,我也常常用它。”


“I selected Baidu Encyclopedia because I always used it to collect information that I

want to know”11 (REC_ZKK)

“The reason that I chose to open the zhihu is I am the long-time user of zhihu.”12 (REC_XY)

“I will open the douban to have a look because I knew and used it before”13 (REC_LLJH)

4.1.1.4 Placement of sites

In this study, volunteers were given 10 minutes per search engine to search for high-quality sources, but most of them took less than 5 minutes. Volunteers indicated that they would only consider the websites at the top of the result list or on its first pages.

“I did not open these sites because I am used to choosing links from the sites located

top of the result list, I do not want to browse others”14 (REC_AHD)

“First of all, when I chose links, I will prefer the top 3 of result list.”15(REC-LLJH)

11 “我选择百度百科因为我总是用它去获得我想要的信息。”

12 “我选择知乎的原因是我本阿里就是知乎的长期用户。”

13 “我打开豆瓣去看的原因是我我之前用过它。”

14 “我不打开这些网站是因为我习惯于只打开排在列表比较靠上的网站。”

15 “首先,当我选择链接的时候,我会选择排在列表前三的链接。”


“The placement of links in the result list would influence whether I would open it,

because in general, I only want to open the links located in the first two pages.”16

(REC-FL)

4.1.1.5 User preconception

Sometimes, users filter the links according to their personal preference:

“It does not have any reasons; I just do not like Wikipedia.”17 (REC_CKK)

“I close this sites because I do not like it, without any reasons…” 18(REC_JG)

“I closed this site just because, when I attempted to open it, the picture on the screen

scared me…”19(REC_LLJH)

4.1.2 Display of websites

16 “链接在列表的位置会影响我是否打开它,因为通常通常情况下我只看排在前两页的链接。”

17 “没有什么原因,只是我不喜欢维基百科。”

18 “我关闭这个网站是因为我不喜欢,没什么理由。”

19 “我刚打开网站就关了是因为页面里一张图片吓到了我。”


When users open a link, they first concentrate on the appearance of the site. The criteria proposed by participants included:

• Layout

• Advertisement

• Loading speed

4.1.2.1 Layout

Some volunteers said that when they open a website, if its layout does not look good, they close the site immediately.

“This site looked very messy, so I close it immediately”20 (REC_ZYY)

“I closed some sites because their layouts are not neat”21 (REC_JG)

Besides, volunteers indicated that the layout is a key factor in evaluating a website:

“I think this site is good, because it looks very neat and comfortable.” 22(REC_LWR)

20 “这个网站看起来太乱了,所以我直接关闭了它。” 21 “我关闭这个网站因为他的布局太不整洁了了。” 22 “我喜欢这个网站因为它很整洁而且令人舒服。”


The same volunteer then added that:

“For me personally, I think the layout of site is very important, I am just willing to read this

site when the layout is not bad.” 23(REC_LWR)

Other volunteers held similar views:

“When I open a site, first of all I will focus on the layout, whether the layout is clean. I

will doubt it is not a normal site if the layout is in a mess.”24 (REC_FL)

“The first step to assess a site is to look at its layout and UI.”25 (REC_XY)

“Look at the layout of this site, it is so good, so I like it.”26 (REC_AHD)

“I think this site is good because its layout is very clean.”27 (REC_LJHL)

Almost all volunteers involved in this study mentioned that the layout is an essential

criterion to evaluate the quality of a site.

23 “对我个人而言,我觉得布局非常重要,我只愿意看那些布局不算太糟糕的网站。” 24 “当我打开一个网页,首先我会关注它的布局,看它的布局是否整洁,如果布局很乱的话我会怀疑它是否是一个正规网站。” 25 “评测一个网页第一步就是去看它的布局和 UI。” 26 “看这个网站的布局,非常好,所以我喜欢它。” 27 “我觉得这个网站布局很好是因为它看起来很整洁。”


4.1.2.2 Advertisement

Advertisements were brought up by some volunteers as an aspect of quality, as the following comments show:

“I will close the website which is full of advertisements.”28(REC_ZYW)

The same interviewee also mentioned:

“I like this one because it looks very clean…. without advertisements. I feel

comfortable29.” (REC_ZYW)

“I closed this site directly as it has too many advertisements.”30 (REC_JG)

“when I evaluate a site, I will observe the number of advertisements. I will doubt the

safety of site in case that it has lots of advertisements.”31 (REC_TRQ)

“Some sites are full of advertisements… I will consider if they are normal and

professional sites, why they will have so many advertisements…then I think I could not

believe them.”32 (REC_FL)

28 “我会直接关闭那些充满广告的网页。” 29 “我喜欢这个是因为它非常整洁,没广告,我觉得觉得舒服。”

30 “我直接关闭了这个网站是因为它有太多的广告。”

31 “当我评测一个网站的时候,我会先看它的广告数量,如果广告很多的话我会怀疑它是否安全。”

32 “一些网站充满了广告。我会想如果是比较专业的网站怎么会有这么多广告呢,所以我不能相信它

们。”


“Douban read makes me comfortable as it does not have so many advertisements. I like

it.”33 (REC_XY)

“I like wangyi.com because it does not like xinlang.com which have many

advertisements.”34 (REC_AHD)

“I closed this site as it has too many advertisements.”35 (REC_LLJH)

“I like this site considering it does not have advertisements.”36(REC_PHA)

Advertisements made participants unwilling to read the content of sites, and they were the main reason that people closed a site immediately. The absence of excessive advertising is thus a precondition for continuing to read a site.

4.1.2.3 Loading speed

Loading speed is an important factor in evaluating a site. Participants said a lot about this factor:

33 “豆瓣读书基本上没有什么广告,让我觉得很舒服,我喜欢。”

34 “我更喜欢网易是因为它不像想一样有那么多的广告。”

35 “我关闭这个网站是因为它有太多广告了。”

36 “我喜欢这个网站是因为它广告不多。”


“When I open a site, I will close it if the speed is quite slow.” 37(REC_ZYW)

“Some sites have a good loading speed. I think that is quite good…I like them. But the

loading speed of some sites is quite horrible. I do not believe they are high quality

websites.”38(REC_AHD)

“It cost too much time to open this website, so I closed it directly.”39 (REC_LLJH)

“I do not believe this site as its loading speed is much slower than others. A good site should have a good server.”40 (REC_PHA)

4.1.3 Content

The following list sets out the key evaluation criteria applied by volunteers in this study:

• Quality of content

    ◦ Richness of information

    ◦ Professionalism

    ◦ Update frequency of information

37 “如果一个网站打开的很慢,我会直接关了它。”

38 “一些网站的加载速度很快,我喜欢它们。但是一些网站的加载速度很糟糕,我不能相信它们是很高质

量的网站。” 39 “打开这个网站耗费我太多时间了,所以我直接关闭了它们。”

40 “我不想相信网站是它的加速速度比其他网站慢了太多了。一个好的网站应该会有一个好的服务器

的。”


    ◦ Authorship

• Credibility of information

    ◦ Agreement to other sites

    ◦ Agreement to personal knowledge/experience

    ◦ Authority

• Degree of relevance with target

• Difficulty level to get information

4.1.3.1 Quality of content

When users assess the content, they pay attention to the quality of content.

“I like zhihu because the information of zhihu is high quality.”41(REC_XY)

The volunteers indicated that many factors influence the quality of content:

4.1.3.1.1 Richness of information

Some volunteers proposed that richness of content is a major factor in the quality of content:

41 “我喜欢知乎是因为知乎上的信息质量很高。”


“I think content of this site is very rich, detailed wide coverage, so I like it.” 42

(REC_ZYW)

“I like Baidu Encyclopedia because it gave so detailed and systemic information about

EXO. So, I think it is a good site.”43 (REC_CKK)

“I think this site is best because it has most various information of masks.” 44

(REC_LWR)

“I like zhihu because the information of this site is from different points of view, so I

gave me lots of different information. So, it is a good site to collect

information.”45(REC_FL)

“It is a key factor that whether the information is various and rich.”46 (REC_AHD)

“this site has a very strong logical structure with rich information, I love it.” 47

(REC_LLJH)

42 “我认为这个网站的内容很丰富,覆盖面广,所以我喜欢它。”

43 “我喜欢百度百科因为它很系统的详细的描述了关于 EXO 的信息,所以我认为它很好。”

44 “我认为这个网站是最好的因为它提供了各式各样的面膜信息。”

45 “我喜欢知乎是因为它的信息是来自不同的观点,我能得到各种不同的信息,所以知乎是一个很好的获

取信息的网站。“ 46 “内容的丰富性和多样性是一个关键因素。”

47 “这个网站的逻辑性很强,所以我喜欢它。”


4.1.3.1.2 Professionalism of information

Some volunteers pointed out that the professionalism of content is an element in evaluating content:

“I like douban read because its content looks professional. I believe it.”48 (REC_LLJH)

“For my personality, I prefer the sites which looked pretty professional…”49 (REC_FL)

“The professionalism of information is a key criterion to evaluate the information and

website.”50 (REC_AHD)

4.1.3.1.3 Update frequency of information

Interviewees brought up update frequency as a criterion for the quality of information, especially for time-sensitive information. When an interviewee was searching for something about the EPL (English Premier League), he said:

“The speed of updating information is very important, for example, this site gave me

the information about last year, but I am looking for recent information. So I am not

48 “我喜欢豆瓣读书是因为它上面的内容看着很专业。”

49 “就我个人而言,我更喜欢那些看起来专业性很强的网站。”

50 “信息的专业性是评价网站和信息的关键条件。”


going to use it again.”51 (REC_AHD)

Besides, even when users searched for information that is not strongly time-sensitive, they also mentioned this factor:

“Another criterion is… I will judge the update time of information, if the time is long

ago, or the information is not updating for long time, I will give it up.”52 (REC_FL)

4.1.3.1.4 Authorship

Some participants indicated they pay attention to who is the provider of the information.

For example, some participants did not believe information provided by internet users:

“I will assess the information when I got it. I will look at who provided it. If it is

provided by internet users, not some famous writer. So…it maybe not trustworthy.”53

(REC_TRQ)

In addition, participants also consider the motivation of the provider of the

51 “信息的更新速度非常是非常重要的。比如说,这个网站给了我去年的信息,但是我要找的是最近的信

息,所以我不会再使用它了。” 52 “另一个就是,我会看这个信息的时间,如果是很久以前的信息,或是很久没有更新的信息,我会放弃

它。” 53 “当我看到信息的时候我会先评判它,看它是由谁提供的。如果是一些网友提供的而不是一些著名的写

手,这个信息就不一定是真实的。


information:

“The information is not really reliable. It is provided by the editor of this site. You know,

in general, they will have the business relation with some hotels. So I could not believe the hotel comments one hundred percent.” 54(REC_PHA)

4.1.3.2 Credibility of information

Users reported that the credibility of information affects their assessment of its quality.

They indicated that they judge the credibility of information according to three main points:

4.1.3.2.1 Agreement to personal knowledge/experience

An interviewee said

“I prefer the sites which provide the trustworthy information, for example, this

website…”55 (REC_LWR)

The researcher then asked her why she thought the information was trustworthy, and she added:

54 “这上面的信息不一定是完全可靠的,因为这些信息是由这个网站的编辑提供的。你知道的,这个网站

和很多酒店是有利益关系的。所以我不能百分百的相信这个网站上对酒店的评价。” 55 “我更喜欢那些提供了真实可靠信息的网站。比如说这个。“


“Because I knew something about mask before, when judge the reliability of

information, I will refer to my personal knowledge or experience...” 56(REC_LWR)

Other interviewees hold the same views:

“I did not believe this site because the information (provided by this site) is totally different from what I knew before…”57(REC_ZYW)

“I think the content is reliable because it matches something I have known before”58

(REC_JG)

“I am a senior frequent flyer. I know much about hotels. So, I am sure the content (of

this site) is not true...”59 (REC_PHA)

4.1.3.2.2 Agreement to other sites

“I will concern the credibility of information… by referring to what other sites said, to

find whether the information was agreed by other sites.”60(REC_AHD)

56 “因为我知道一些关于面膜的只是,当我看到这些信息的时候,我会根据我知道的去判断它。”

57 “我不相信这个网站的信息因为它和我知道的完全不一样。”

58 “我认为这个网站上的东西是可靠的,因为它符合那些我已经知道的。”

59 “我是一个资深的常旅客,我知道很多关于酒店的东西。所以我知道这个网站上的内容是不真实的。”

60 “我会侧重在内容的真实性上,通过看这些内容是不是和其他网站所说的都一致。”


“I think the information provided by this site is trustworthy, because it is confirmed by

other sites.”61 (REC_JG)

4.1.3.2.3 Authority

Authority is also a factor in evaluating the credibility of information:

“I think the information of site is trustworthy because the site was built up by Chinese

government.”62 (REC_JG)

“I prefer the itunes’s site… because…you know, it is an official site of Apple. Everything

in this site is legitimate edition.”63 (REC_TRQ)

“I think the information on this site is most reliable. Because it is from the official site

(of this book).”64 (REC_XY)

4.1.3.3 Degree of relevance with target

Although users are used to doing a pre-judgment for links, it is inevitable that they will

61 “我认为这些信息是真实的,因为它和其他网站上的都一致。”

62 “我认为这个网站上的信息肯定真实是因为这个网站是由政府机构提供的。”

63 “我更喜欢 iTunes。你知道的,它是苹果的官方网站。所有东西都是正版的。”

64 “我觉得这个网站的的信息最真实可高,因为这个网站是官方网站。”


open sites which could not provide the content that they were looking for. So, whether a site can meet users’ requirements is a key factor.

“The content of this site is detailed…but it is not what I am looking for.”65 (REC_LWR)

“When I evaluate the content, I will consider whether the content is what I am looking for.”66

(REC_TRQ)

“Content is a very important factor. I will consider whether the site can provide the

information that I want.”67 (REC_PHA)

In this study, almost all participants reported that whether a site could provide the target information is a key factor in evaluating content.

4.1.3.4 Difficulty level to get information

Sometimes, websites set up barriers for users. For example, users have to register as members of a site, or have to pay for the information. This is a factor

65 “这个网站的内容和详细,但是不是我想要的。”

66 “当我评测一个网站内容的时候,我会先看看的内容是不是我想要的。”

67 “内容是一个非常关键的因素,我会看这个内容能不能提供我想要的的东西。”


when users evaluate the sites.

“Whether I need to pay for the information is a very important factor.”68(REC_XY)

“The difficulty level to get information is an important factor which influenced me, for

example, it is really trouble me if I have to do lots of steps (to get the

information).”69(REC_AND)

“Whether I can get the information easily is a key factor, I do not want to do lots of

steps to register.”70 (REC_LLJH)

The same volunteer also added during the interview:

“I will close the site directly if I have to pay for that.”71 (REC_LLJH)

4.2 Process of evaluating online information sources

Participants described three processes involved in evaluating information sources: pre-judging the links, looking at the appearance of sites, and evaluating

68 “是否需要为这个信息付费是一个很关键的因素。”

69 “获取信息的难易程度是影响我的一个重要因素,如果需要做很多步骤才能看到这个信息会让我觉得很

麻烦。” 70 “我是否能够很容易的获取信息也很重要,我不想做很多步骤去注册。”

71 “我直接关闭那些我需要付费的网站。”


the content of sites. Then, they would make a decision whether they will believe this

site.

“When I looked for useful information or site from Internet, first of all, I will do a filter

for the links in the result list. Because I do not want to spend much time on opening

each link one by one. Then, I will look at the appearance of sites, for example, the layout

and other thing. After that, I will focus on the quality of content. Finally, I will do the

decision.”72 (REC_LLJH)

“First of all, I will choose the links. Then look at the design of the site and content of

sites.”73 (REC_FL)

The researcher proposed a model of how users evaluate information retrieved from search engines, based on the interviewees' descriptions:

72 “当我从互联网上搜索信息的时候,首先我会对搜索引擎提供的列表做一个简单的筛选,因为我不想耗费很多的时间去一个一个的打开链接。然后,我会看网站给我的感觉,比如说布局之类的。在此之后,我会看网站内容的质量。最后做出决定。” 73 “首先,我会从搜索引擎列表中有选择性的打开一些链接,然后看它的页面和内容。


4.3 Comparison of Baidu and Google

In this study, the researcher asked volunteers to rank the sources from Baidu and Google. Table 3 shows the result. First is the best and sixth is the worst. ‘G’ means the site was retrieved from Google and ‘B’ means that it was retrieved from Baidu. “*” means the site was retrieved from both Baidu and Google. The researcher then gave a score to each of the two search engines: a site ranked in first position was given 6 points, a site in second position 5, a site in third position 4, and so on down to 1 point for sixth position.
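The scoring rule can be illustrated with a minimal sketch. It assumes that a site retrieved by both search engines earns its points for both; the function name and the labels "G", "B" and "GB" are hypothetical encodings chosen for this example, and the ranking shown is illustrative input rather than the study's full data.

```python
# Sketch of the rank-to-score conversion described in the text.
# Labels "G", "B" and "GB" (a site retrieved by both engines) are
# illustrative encodings, not notation taken from the study itself.

MAX_POINTS = 6  # first position earns 6 points, sixth earns 1


def score_ranking(ranking):
    """Return total points per search engine for one participant.

    `ranking` lists the source of each ranked site from best to worst,
    e.g. ["G", "B", "GB"]. A site found by both engines ("GB") is
    assumed to earn its points for both.
    """
    totals = {"Google": 0, "Baidu": 0}
    for position, source in enumerate(ranking, start=1):
        points = MAX_POINTS - position + 1
        if "G" in source:
            totals["Google"] += points
        if "B" in source:
            totals["Baidu"] += points
    return totals


# Illustrative ranking: Google first, Baidu second, a shared site third,
# then Google fourth and Baidu fifth.
example = ["G", "B", "GB", "G", "B"]
print(score_ranking(example))  # {'Google': 13, 'Baidu': 11}
```

Summing these per-participant totals over all participants would give one overall score per search engine.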


Order/Name  ZYW   CKK   JG    FL    LWR   TRQ   XY    AHD   LLJH  PHA
First       G     *G/B  B     B     G     G     *G/B  *G/B  *G/B  G
Second      B     *G/B  B     G     G     G     *G/B  *G/B  *G/B  G
Third       *G/B  *G/B  G     B/G   B     B     G     *G/B  *G/B  G
Fourth      G           G     G     B     G     B                 B
Fifth       B           B     B     G     B                       B
Sixth                   G           B     B                       B

Table 3: The result of rank

Table 3 shows that, at each of the first and second ranks, 8 of the 10 sites were retrieved from Google while 6 were retrieved from Baidu (sites marked “*” count for both). By contrast, almost all the sites ranked fifth and sixth were from Baidu. As Table 4 shows, Google has a higher total score than Baidu (136 against 116), which suggests that Google worked better than Baidu in retrieving high-quality information.

         ZYW  CKK  JG   FL   LWR  TRQ  XY   AHD  LLJH  PHA  TOTAL
Google   13   15   8    12   13   15   15   15   15    15   136
Baidu    11   15   13   12   8    7    14   15   15    6    116

Table 4: result of score

Despite the fact that the ranking exercise rates Google more highly than Baidu, only

two volunteers expressed a preference for one of the search engines.

“I prefer Google because its result has a better order.” (REC_ZYW)74

“I prefer Baidu because I am used to using it.” (REC_CKK)75

5. Discussion

5.1 Suitability of the adopted research approach

Silverman (2016) argues that qualitative research is best suited for research projects

that seek to explore a specific topic of interest or a research problem from a certain

local population perspective. Therefore, qualitative research methods (interviews and

participant observation) were appropriate for this work because it focused on

understanding the factors that guide users (Chinese users) towards determining the

credibility of information retrieved from Baidu and Google.

The proposed model (Figure 1) formed the basis for evaluating how users perceive the

74 “我更喜欢 Google 因为它的搜索引擎排序更好一些。”

75 “我更喜欢百度因为习惯于用它。”

credibility of Web-based information. The model that was used in this study

encompasses the following three major sequential credibility evaluation factors: pre-

judgment of links, site design quality, and content quality. This forms the structure for

presenting the discussion about this research project’s credibility assessment.

5.2 Evaluation Criteria

5.2.1 Pre-judgment of hyperlinks

There is no doubt that participants tended to apply learned heuristics and bias when

faced with a totally new medium, context, content, or source. The first impression is

closely related to pre-judgment and prejudice. When users retrieve information from

either Baidu or Google, they are likely to focus on the following factors in relation to

the search results: headline and description, popularity, placement of sites, prejudice, and

familiarity.

Basically, a search results list comprises a set of headlines and descriptions as well as associated hyperlinks. Headlines, descriptions, and the placement of sites can be used to

assess credibility. Users compare their keywords with headlines and descriptions to

determine the relevance of the underlying information. Users are unwilling to pursue

sites that would not provide relevant information, so they first consider the following

question prior to opening or ignoring a link: can this link provide the information I am

looking for? A ‘yes’ or ‘no’ answer to this question would definitely influence users to

open or ignore a link respectively. This study has indicated that 60% (6 out of 10) of respondents rely heavily on the content of headlines and descriptions to make relevance and credibility decisions. Some users, however, ignore a site for no reason other than preconception or bias. Pre-judgment may be guided by a positive

or negative impression of a link. Which entity (government agency, company, institution, or individual) does a site or link represent? For example, descriptions and URLs of governmental agencies and reputable organizations appear more trustworthy to users searching for official information. A good example is students looking for the prerequisites of specific degree programs, who would perceive links or sites on .edu domains as the most credible. The reputation of the entity owning a site, based on real user experiences, also carries over to the site itself.

Users tend to prefer recommended pages and links as they appear more popular, useful,

and trustworthy. Users also tend to trust the sites used by most of their friends. This is

especially true for searches involving topics that are unfamiliar to users. Therefore,

recommendations play an important role in influencing credibility, and thus selection

of hyperlinks to open in the hope of landing on a web page that fits the purpose. Pre-judgment based on popularity may lead to a subjective assessment of an online

resource with regard to its credibility (Lewandowski, 2012; Mohammadi, Abrizah, &

Nazari, 2015). For example, this study has shown that Baidu is very popular in China,

so Chinese users trust it as a source of useful information. The popularity of Baidu in

China is echoed by Jiang (2014) and O'Rourke IV et al. (2007): Baidu enjoys a popularity of more than 80% as a search engine, compared to less than 2% for Google. Nevertheless, Lazar, Meiselwitz, and Feng (2007) note that some users do not blindly follow recommendations; instead, they adopt a skeptical approach to assessing the credibility of claims, to avoid being misled into believing something that is not necessarily factual.

Familiarity constitutes another trustworthiness assessment criterion in that users tend

to visit sites they have accessed before. This is particularly true for sites that provide

field-specific information, translating to long-time previous usage and high chances of

being opened in the future (Pattanaphanchai, O'Hara, & Hall, 2013). Users with a

previous experience (familiarity) with a specific site or search engine have high chances

of trusting it. On the same issue, Lucassen, Muilwijk, Noordzij, and Schraagen (2013)

claim that familiarity tends to increase the perceived ease-of-use, confidence, and the

degree of comfort across users with respect to a site or search engine. Unexpected

content or source may turn away users, which builds up the influence of familiarity.

Metzger et al. (2003) argue that familiarity is one of the most widely adopted criteria for assessing credibility. Familiarity facilitates ease of use and verifiability: a clear link between language and purpose, and agreement of information across diverse sources and with prior knowledge (Walraven et al., 2009).

The placement of sites on the search results list also influences their perceived

credibility. Users tend to focus on the sites or links presented at the top of the first page

of search results because they appear to be more relevant to their topic or subject.

Lewandowski (2012) stated that users are interested in finding the answer rather than in millions of web pages; thus most search engine algorithms return query-specific results ordered from the most relevant to the least relevant, based on factors such as the currency of information and user context (for example, the region where the search originates). Therefore, users expect the most helpful webpages to be placed at the top of the search engine’s first results page. Nevertheless, personal preference may lead users to peruse results or links regardless of where they are placed. For example, users searching for academic resources may ignore a Wikipedia page or link because its use is discouraged in the educational arena.

5.2.2 Website assessment

Content and source quality elements become clearer after one visits the actual site or

webpage. The basic appearance of a site influences the perception of users. According

to this study, the following are the proposed criteria for the display of a site: layout,

advertisement, and loading speed. However, there are other quality evaluation criteria,

including site structure, visual design, page rank, traffic rank, implemented

technologies, and branding (Levene, 2011; Westerwick, 2013).

Users consider layout to be a measure of competence: a badly structured site is perceived poorly by users. Competence or professionalism remains a key evaluation criterion of medium credibility because it is a characteristic that signals quality (Rubin & Liddy, 2006; Westerwick, 2013). If a site looks messy, there is a high chance that it will be closed without its content being read. Lex et al. (2012) assert that the user perception of a site is influenced by its consistency and neatness across links and visual design. A neat layout, as opposed to a cluttered one, also makes it easier for users to locate what they are looking for. Users also trust sites that have a clear and logical structure and well-designed user interfaces. Liu (2004) claims that end users expect sites to uphold high levels of agreement between their constituent web pages. This is a key enabler of consistency, ease of learning, user-friendliness, and usability.

The advertisement strategy adopted by a site is also critical to credibility: users like and trust sites with fewer ads. Sites with large numbers of advertisements are likely to lose potential users because finding information on them is relatively more difficult. Too many ads on a site tend to make users ignore the content presented on it. In fact, some users immediately close a webpage when they discover that there are too many advertisements for it to be truly helpful. In addition, too many ads make users concerned about security and privacy, issues that are gaining increasing attention globally because of growing attack threats. Therefore, minimal advertising may make users more comfortable when browsing a site. Users tend to

trust Web-based resources that are ‘easy to find, access, and understand’ (Burton &

Chadwick, 2000).

The page load of a site is a measure of its loading speed, and it constitutes a critical

quality evaluation factor. Users focus on page load performance because it directly

affects their everyday experience of a site. No matter how well a brand is presented in terms of visual components, content, white space, creativity, legibility of

fonts, and layout design, users want fast page loads. Sites that take too long to load are

often closed by users before even accessing the information resources that come with

them. Therefore, page load has a direct impact on the quality perception of a site.

There are other dimensions that determine the perception of users with respect to the

credibility of a site. Based on the model (see Figure 2) adopted by McKnight and

Kacmar (2006) to assess information quality for an internet advice site, the following

dispositions were documented: general (‘Faith in Humanity’, ‘Suspicion of Humanity’,

and ‘Risk Propensity’), technology (‘Internet Anxiety’, and ‘Trust in Technology’), and

initial impressions (‘Trusting Beliefs’, ‘Perceived Reputation’, and ‘Willingness to

Visit the Site’). In a different study, McKnight and Kacmar (2007, p. 430) found the following factors to be critical to establishing site and information credibility: ‘trusting,

perceived reputation, and willingness to explore a site and its content’.

5.2.3 Content evaluation

The following are the major elements of content: quality of information, richness of

information, professionalism, whether information is current or up-to-date, authorship,

and credibility of information (perceived ease-of-accessing information, agreement to

personal experience or knowledge, and relevance) (Twait, 2005). Liu (2004) suggested that the credibility of Web-based scholarly resources may be assessed on how well they perform on the dimensions of organization, quality and logic, spelling and grammar, and impartiality.

If a site presents a comprehensive and systematic coverage of information specific to a

subject or topic, then it is perceived to be of high quality. Site content that presents different points of view is perceived to be credible: users trust such

information. Users trust information that shows high levels of professionalism. For

example, the credibility of healthcare tips is influenced by the accreditation and

expertise of the individual or organization associated with such content as well as the

professionalism regarding its presentation (Mun, Yoon, Davis, & Lee, 2013). Therefore,

the level of professionalism is a key quality assessment criterion with respect to a site

and its content. What experts, competitors, and real users of a site think about its

professionalism influences its overall reputation. References, reviews, and recommendations posted by experts and real users of a website guide users in making credibility decisions. Positive reviews indicate a positive reputation. Apart from reviews, some sites win awards from time to time, which demonstrates a positive reputation in areas such as perceived added value (Grove, Burns, & Gray, 2014). The Pulitzer Prize is a good example of such an award.

When users are evaluating the credibility of a site, they tend to investigate whether its

content is up-to-date or not. How frequently is information updated? Frequently

updated information appears to be more reliable to users. Timeliness is especially important for dynamically changing information such as news, weather forecasts,

flight schedules, and geo-location tracking. Currency also translates to credibility for

information such as high-quality financial, professional, and medical advice. Therefore,

users perceive professionally presented information to be of high quality if there is

evidence of regular reviews and updates. Simply put, users want to be assured that the

information they retrieve from search engines and web pages is up-to-date.

Users consider authorship to be of crucial importance to information quality; therefore, content providers (individuals and organizations) ought to be sufficiently reputable. Understanding the entity responsible for a specific site forms the basis for assessing the credibility of its content. Highly credible web pages provide clear and sufficient information about the overall site, and users tend to trust the content. For example, information originating from ordinary internet users may be deemed untrustworthy, while information of corporate origin would be perceived to be highly credible. Users also assess content based on the perceived motivation of the creator or provider. For example, information created by a site editor runs the risk of bias or subjectivity in the course of attempts to create a compelling brand, and the credibility of such information is negatively affected. In addition, financial advice is deemed credible if it is posted by appropriately accredited and experienced persons and/or organizations. On the other hand, failure to include clear and adequate information about authorship would obviously damage the credibility of a site and its content (Mai, 2013). Authority entails the influence or the power (‘the last word’) of the person or organization involved in the creation of content (Cabrerizo, Martinez, Lopez-Gijon, Chiclana, & Herrera-Viedma, 2015). Official government, corporate, and individuals’ professional websites are deemed to have more authority, and thus greater reliability, relevance, and trustworthiness. Liu (2004) claimed that the following features play an important role in evaluating the credibility of information: the URL or domain (such as .edu and .gov), the publisher of the information, and the verification of the information. The authoritativeness of content is also regarded as a major credibility measure when users are searching for academic materials. In fact, it is more difficult to search for relevant academic resources than for general information because the former demands high levels of authority: type and source of information, coverage, accuracy, quality of references, objectivity, and currency (Head & Eisenberg, 2009; Madden et al., 2012).

Users assess the credibility of information based on how well it agrees with their

experience and/or knowledge. Indeed, users prefer trustworthy content; therefore,

they treat information that is in line with their expectations as reliable. In other words,

users trust information that is similar to what they knew before, otherwise they do not

believe in its credibility. While comprehensiveness of information is a notable

credibility evaluation factor, users would still consider relevance – is this the piece of

information I was looking for? Does this information meet my specific needs and

expectations? Castillo et al. (2011) claim that users value resources they find

applicable to their immediate needs and expectations because they seem to be

published out of goodwill and/or care. Consequently, the perceived credibility of such

information resources is high. The relevance of information promotes user

engagement, confidence, satisfaction, and positive beliefs and attitudes, and thus

credibility (Rubin & Liddy, 2006).

Users come across sites with varying ease-of-accessing information based on factors

such as the price of information and the need for a subscription as a member. Whether

one is supposed to pay for a certain piece of information or to register as a member

prior to accessing specific online resources determines the eventual response of users.

The study found that some users would immediately close a site that demands payment.

Similar indications were found for sites characterized by time-intensive registration

processes. More often than not, user subscription and registration processes request personally identifiable information (PII) such as a credit card number, which may put the security and/or privacy of confidential information at risk (Quelch & Jocz, 2010). Therefore, while the information security and privacy dimension was not explored in this study, it significantly influences how users perceive the ease of accessing Web-based information.

5.3 Baidu versus Google

In order to compare Baidu with Google, the researcher carried out a set of searches across the two search engines and then scored the two search engines based on the results (as shown in Tables 3 and 4). Results indicated that Google had the best quality search

results, with 40% (4 out of 10) of the retrieved sites being ranked in the ‘First’ category.

This was in comparison to Baidu, which had 20% (2 out of 10) of the retrieved sites

being ranked in the ‘First’ category. The other 40% of the ‘First’ ranking went to *G/B,

implying that respondents found Google and Baidu to have an equal performance in

terms of the quality of retrieved results. Cumulatively, Google had a higher score

compared to Baidu – 136 against 116 (as shown in Table 4). Therefore, the study found

that Google performed better in terms of retrieving high-quality information.
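The percentages and totals quoted above can be checked with a short calculation. The sketch below is illustrative only: it re-enters the ‘First’ row of Table 3 and the per-participant scores of Table 4 by hand, and the variable names are assumptions made for this example.

```python
from collections import Counter

# 'First' row of Table 3, one entry per participant
# ("*" = site retrieved from both Google and Baidu).
first_place = ["G", "*", "B", "B", "G", "G", "*", "*", "*", "G"]

# Share of first-place rankings per source, as whole percentages.
counts = Counter(first_place)
shares = {source: 100 * n // len(first_place) for source, n in counts.items()}
print(shares)  # {'G': 40, '*': 40, 'B': 20}

# Per-participant scores as listed in Table 4.
google_scores = [13, 15, 8, 12, 13, 15, 15, 15, 15, 15]
baidu_scores = [11, 15, 13, 12, 8, 7, 14, 15, 15, 6]
print(sum(google_scores), sum(baidu_scores))  # 136 116
```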

While this study reveals that Google is better than Baidu in relation to retrieval of

credible or high-quality information, there are mixed findings regarding the strengths

and weaknesses of the two search engines. This is especially true when the performance

of Baidu and Google is compared and contrasted from the perspective of Chinese

users. Basically, the credibility of query results from a search may be used to determine

the performance of a search engine. Baidu was found to perform relatively better than

Google in terms of the number and relevance of retrieved results if the keyword

constitutes a hot issue, terminology, or concept related to China or written in the Chinese language. However, Google appears to return better (more numerous and more relevant) results than Baidu provided the subject is not related to China (Jiang, 2014). When Chinese keywords are used across the Baidu and Google search engines, the latter returns a larger number of results than the former. Nevertheless, the large number of

results retrieved from Google cannot be considered as the absolute measure of

credibility because Baidu delivered better accuracy and relevance (Long et al., 2007).

Therefore, Baidu is better placed to meet the credibility needs and expectations of

Chinese users compared to Google. This can be attributed to the fact that the former

meets the source, content, and context constraints related to the Chinese user

community. Therefore, Google may be the global search engine leader, but Baidu is

deemed to be the most credible source of online information for the Chinese user

community. In addition, the prevalence of Baidu in China can be attributed to the

following general credibility/quality determinants: reliability and/or perceived

usefulness, accuracy, completeness, bias against the more foreign-oriented Google,

familiarity, and relevance.

6. Conclusion

This research has focused on identification and description of the behaviors and user

experiences when using Baidu and Google search engine tools to assess the credibility

or quality of websites and other online information resources. The study employed a

qualitative research approach to investigate how Chinese users assess the quality of the

information retrieved from Baidu and Google. More precisely, the methods helped

obtain the values, emotions, beliefs, behaviors, social contexts, motivations, and

perceptions or opinions of Chinese users in relation to Baidu and Google – how this

group of users experiences the two search engines. This study was based on the

following major credibility dimensions: source, receiver, message, medium, and

context. However, the dimensions were condensed into three (medium, source, and

content or message) for easier identification, analysis, and description of the behaviors and

user experiences related to the credibility of information retrieved from Baidu and

Google.

There are disparate dispositions associated with the credibility or quality of a website,

content, and creator or author of information. Results from this study and past scholarly

works have indicated that users have diverse feelings about the determinants of credibility

of Web-based information. Evidently, users trust online resources that are relatively

easy to find, access, and understand. Relevance is also a key credibility determinant

because users tend to treat resources they find applicable to their immediate needs and

expectations to be of high quality. This can be attributed to the fact that such

information appears to be published out of goodwill, care, or concern. Relevance

appears to be a great enabler of engagement, confidence, and positive beliefs and

attitudes. Consequently, credibility is enhanced. Competence or professionalism

dictates the degree of perceived credibility. Web-based information is also presented on

a wide range of platforms, ranging from blogs to corporate websites and news sites.

Nevertheless, the fundamental components of these platforms and their content include

the visual design, layout or structure, familiarity, advertisement strategy, reputation,

owner (as specified by the domain name), consistency, ease of learning, page load or

responsiveness, user-friendliness and usability, agreement to personal experience or

knowledge, willingness to explore, quality of information, richness of information,

currency of information, authorship, and the overall degree of professionalism shown.

For example, a messy layout or structure would damage the credibility of a site.

Nevertheless, Web-based resources can be assessed for credibility based on mere

prejudice, for example, popularity, familiarity, pre-conception, recommendations, or

bias.

Results from this study have indicated that respondents ranked Google as the search engine retrieving higher-quality information compared to Baidu. However, insights from past studies and the researcher's experience in this work have shown that

elements of pre-judgment such as bias, recommendations, and familiarity play a role in

influencing user behaviors. A perfect example is the huge popularity of Baidu in China

and Google in the U.S., which contributes to their outright prevalence in their respective

countries. In addition, familiarity of Baidu among Chinese users makes it the most

trusted search engine in China. Nevertheless, from the Chinese users’ perspective, Baidu

delivered better accuracy, ease-of-understanding, perceived usefulness, context

matching, and relevance performance compared to Google. Basically, Baidu has an

edge over Google from the Chinese users’ perspective because of better mastery of the

complex Chinese dialect and culture, which is replicated in the preciseness of retrieved

results.

As part of future research, it is important to investigate how Google can improve its

search algorithms to meet the credibility requirements of the Chinese language. This is because the language is highly context-specific; thus a small grammar mistake is likely to cause unwanted confusion and, ultimately, a loss of credibility.

Word count: 13028

7. Appendices

7.1 ethical form

The University of Sheffield. Information School

A comparison of Google and Baidu: How do Chinese users assess the quality of the information retrieved for them by the two search engines?

Researchers

Name: Sen Wang Tel: 07469 011417 E-mail: [email protected]

Purpose of the research

The purpose of this research is to learn more about the views of Chinese users of two major search engines (Baidu and Google) and of the quality of information the search engines retrieve.

Who will be participating?

Chinese students at the University of Sheffield

What will you be asked to do?

Firstly, you will be asked to select a topic in which you are knowledgeable and to search for resources that you feel would help a non-expert in the subject. Then you will be asked to explain your choices and to rank them.

What are the potential risks of participating?

The risks of participating are the same as those experienced in everyday life.

What data will we collect?

Interviews will be video recorded for further analysis.

What will we do with the data?

Interview data will be stored on the Information School's research data drive which can be accessed only by me, my supervisor, the School's Examinations Officer and ICT staff operating the facility. The data will be deleted 6 months after my dissertation has been completed. I will also store a password protected back up copy on my personal laptop, and will delete this data once my dissertation has been completed and marked.

Will my participation be confidential?

All data will be anonymized. Recordings will be filed using coded identifiers.

What will happen to the results of the research project?

The results of this study will be included in my master’s dissertation which will be publicly available. Some findings may also be reported in academic publications. Please contact the School in six months.

Note: If you have any difficulties with, or wish to voice concern about, any aspect of your participation in this study, please contact Dr. Jo Bates, Ethics Coordinator, Information School, The University of Sheffield ([email protected]), or to the University Registrar and Secretary.

7.2 Participant Consent form

University of Sheffield

Participant Consent Form

A comparison of Google and Baidu: How do Chinese users assess the quality of the information retrieved for them by the two search engines?

Name of Researcher: Sen Wang

Contact details: Tel: 07469 011417; E-mail: [email protected]

Participant Identification Number for this project:

Please initial box

1. I confirm that I have read and understand the information sheet dated 18 March 2016 that explains the above research project, and that I have had the opportunity to ask questions about the project.

2. I understand that my participation is voluntary and that I am free to withdraw at any time without giving any reason and without there being any negative consequences. In addition, should I choose not to answer any particular question or questions, I am free to decline.

3. I understand that my responses will be kept strictly confidential. I give permission for members of the research team to have access to my anonymised responses. I understand that my name will not be linked with the research materials, and I will not be identified or identifiable in the report or reports that result from the research.

4. I agree for the data collected from me to be used in future research

5. I agree to take part in the above research project.

________________________ ________________ ____________________
Name of Participant Date Signature
(or legal representative)

_Sen Wang_______________ ________________ ____________________
Lead Researcher Date Signature
To be signed and dated in presence of the participant

Copies: Once this has been signed by all parties the participant should receive a copy of the signed and dated participant consent form, the letter/pre-written script/information sheet and any other written information provided to the participants. A copy of the signed and dated consent form should be placed in the project’s main record (e.g. a site file), which must be kept in a secure location.

7.3 APPROVAL LETTER

Downloaded: 17/08/2016

Approved: 25/05/2016

Sen Wang

Registration number: 150227885

Information School

Programme: dissertation

Dear Sen

PROJECT TITLE: A comparison of Google and Baidu: How do Chinese users assess the quality of the

information retrieved for them by the two search engines

APPLICATION: Reference Number 008833

On behalf of the University ethics reviewers who reviewed your project, I am pleased to inform you that on

25/05/2016 the above-named project was approved on ethics grounds, on the basis that you will adhere to

the following documentation that you submitted for ethics review:

University research ethics application form 008833 (dated 17/05/2016).

Participant information sheet 1018425 version 1 (17/05/2016).

Participant consent form 1018041 version 2 (17/05/2016).

If during the course of the project you need to deviate significantly from the above-approved documentation

please inform me since written approval will be required.

Yours sincerely

Matt Jones

Ethics Administrator

Information School

8. References

Abdulla, R. A., Garrison, B., Salwen, M., Driscoll, P., & Casey, D. (2002). The

credibility of newspapers, television news, and online news. In Education in

Journalism Annual Convention, Florida USA.

Burton, V. T., & Chadwick, S. A. (2000). Investigating the practices of student

researchers: Patterns of use and criteria for use of Internet and library sources.

Computers and Composition, 17(3), 309–328.

Cabrerizo, F. J., Martinez, M. A., Lopez-Gijon, J., Chiclana, F., & Herrera-Viedma, E.

(2015). A Web Information System to Improve the Digital Library Service

Quality. New Trends on System Science and Engineering: Proceedings of ICSSE

2015, 276, 3.

Castillo, C., Mendoza, M., & Poblete, B. (2011, March). Information credibility on

twitter. In Proceedings of the 20th international conference on World wide web

(pp. 675-684). ACM.

Currie, L., Devlin, F., Emde, J., & Graves, K. (2010). Undergraduate search strategies

and evaluation criteria: Searching for credible sources. New Library World,

111(3/4), 113–124.

Eysenbach, G., Powell, J., Kuss, O., & Sa, E.-R. (2002). Empirical studies assessing

the quality of health information for consumers on the world wide web: a

systematic review. JAMA, 287(20), 2691–2700.

Fritch, J. W., & Cromwell, R. L. (2001). Evaluating Internet resources: Identity,

affiliation, and cognitive authority in a networked world. Journal of the

American Society for Information Science and Technology, 52(6), 499–507.

Grove, S. K., Burns, N., & Gray, J. R. (2014). Understanding nursing research:

Building an evidence-based practice. Elsevier Health Sciences.

Gubrium, J. F., & Holstein, J. A. (2002). Handbook of interview research: Context

and method. Sage.

Head, A. J., & Eisenberg, M. B. (2009). Finding Context: What Today’s College

Students Say about Conducting Research in the Digital Age. Project Information

Literacy Progress Report. Project Information Literacy.

Jansen, B. J., & Spink, A. (2006). How are we searching the World Wide Web? A

comparison of nine search engine transaction logs. Information Processing &

Management, 42(1), 248–263. http://doi.org/10.1016/j.ipm.2004.10.007

Jiang, M. (2014). The business and politics of search engines: A comparative study of

Baidu and Google’s search results of Internet events in China. New Media &

Society, 16(2), 212-233.

Lazar, J., Meiselwitz, G., & Feng, J. (2007). Understanding Web Credibility: A

Synthesis of the Research Literature. Now Publishers.

Levene, M. (2011). An Introduction to Search Engines and Web Navigation. John Wiley

& Sons.

Lewandowski, D. (2012). Web Search Engine Research. Emerald Group.

Lex, E., Voelske, M., Errecalde, M., Ferretti, E., Cagnina, L., Horn, C., ... & Granitzer,

M. (2012, April). Measuring the quality of web content using factual

information. In Proceedings of the 2nd joint WICOW/AIRWeb workshop on web

quality (pp. 7-10). ACM.

Liu, Z. H., Zhang, F., & Chen, S. (2010). Comparative study on search effectiveness of

Google and Baidu based on user experience. Journal of Zhejiang University

(Science Edition), 37(5), 605-610.

Long, H., Lv, B., Zhao, T., & Liu, Y. (2007, September). Evaluate and compare Chinese

internet search engines based on users' experience. In 2007 International

Conference on Wireless Communications, Networking and Mobile

Computing (pp. 6134-6137). IEEE.

Lucassen, T., Muilwijk, R., Noordzij, M. L., & Schraagen, J. M. (2013). Topic

familiarity and information skills in online credibility evaluation. Journal of the

American Society for Information Science and Technology, 64(2), 254-264.

Madden, A. D., Ford, N., Gorrell, G., Eaglestone, B., & Holdridge, P. (2012).

Metacognition and web credibility. The Electronic Library, 30(5), 671–689.

Mai, J. E. (2013). The quality and qualities of information. Journal of the American

society for information science and technology, 64(4), 675-688.

McKnight, D. H., & Kacmar, C. J. (2007). Factors and Effects of Information

Credibility, 423–432.

McKnight, H., & Kacmar, C. (2006). Factors of information credibility for an internet

advice site. In System Sciences, 2006. HICSS’06. Proceedings of the 39th Annual

Hawaii International Conference on (Vol. 6, p. 113b–113b). IEEE.

Metzger, M. J. (2007). Making sense of credibility on the Web: Models for evaluating

online information and recommendations for future research. Journal of the

American Society for Information Science and Technology, 58(13), 2078–2091.

Metzger, M. J., Flanagin, A. J., & Zwarun, L. (2003). College student Web use,

perceptions of information credibility, and verification behavior. Computers &

Education, 41(3), 271–290. http://doi.org/10.1016/S0360-1315(03)00049-6

Mohammadi, F., Abrizah, A., & Nazari, M. (2015). Is the information fit for use?

Exploring teachers perceived information quality indicators for Farsi web-based

learning resources. Malaysian Journal of Library & Information Science, 20(1),

99-122.

Mun, Y. Y., Yoon, J. J., Davis, J. M., & Lee, T. (2013). Untangling the antecedents of

initial trust in Web-based health information: The roles of argument quality,

source expertise, and user perceptions of information quality and risk. Decision

Support Systems, 55(1), 284-295.

O'Rourke IV, J. S., Harris, B., & Ogilvy, A. (2007). Google in China: government

censorship and corporate reputation. Journal of Business Strategy, 28(3), 12-22.

Pattanaphanchai, J., O'Hara, K., & Hall, W. (2013, May). Trustworthiness criteria for

supporting users to assess the credibility of web information. In Proceedings of

the 22nd International Conference on World Wide Web (pp. 1123-1130). ACM.

Pickard, A. J., Gannon-Leary, P., & Coventry, L. (2010). Users’ trust in information

resources in the Web environment: a status report. Retrieved from

http://ie-repository.jisc.ac.uk/470/2/JISC_User_Trust_final_report.pdf

Quelch, J. A., & Jocz, K. E. (2010). Google in China. Harvard Business School.

Rains, S. A., & Karmikel, C. D. (2009). Health information-seeking and perceptions

of website credibility: Examining Web-use orientation, message characteristics,

and structural features of websites. Computers in Human Behavior, 25(2), 544–

553.

Rubin, V. L., & Liddy, E. D. (2006, March). Assessing Credibility of Weblogs. In AAAI

Spring Symposium: Computational Approaches to Analyzing Weblogs (pp. 187-

190).

Ryan, G. J., Ryan, S. W., Ryan, C. M., Munro, W. A., & Robinson, D. (2002, July

16). Search engine. Google Patents.

Silverman, D. (2016). Qualitative research. Sage.

Taylor, S. J., Bogdan, R., & DeVault, M. (2015). Introduction to qualitative research

methods: A guidebook and resource. John Wiley & Sons.

Tombros, A., Ruthven, I., & Jose, J. M. (2005). How users assess Web pages

for information seeking. Journal of the American Society for Information Science

and Technology, 56(4), 327–344. http://doi.org/10.1002/asi.20106

Tsai-Youn, H. (2004). Undergraduate students’ evaluation criteria when using web

resources for class papers. Journal of Educational Media and Library Sciences,

42(1), 1–12.

Twait, M. (2005). Undergraduate Students’ Source Selection Criteria: A Qualitative

Study. The Journal of Academic Librarianship, 31(6), 567–573.

Walraven, A., Brand-Gruwel, S., & Boshuizen, H. P. A. (2009). How students

evaluate information and sources when searching the World Wide Web for

information. Computers & Education, 52(1), 234–246.

http://doi.org/10.1016/j.compedu.2008.08.003

Wathen, C. N., & Burkell, J. (2002). Believe It or Not: Factors Influencing

Credibility on the Web. Journal of the American Society for Information Science

and Technology, 53(2), 134–144. http://doi.org/10.1002/asi.10016

Westerwick, A. (2013). Effects of sponsorship, web site design, and Google ranking

on the credibility of online information. Journal of Computer-Mediated

Communication, 18(2), 80-97.

Information School.

Access to Dissertation

A Dissertation submitted to the University may be held by the Department (or School) within which the Dissertation was

undertaken and made available for borrowing or consultation in accordance with University Regulations.

Requests for the loan of dissertations may be received from libraries in the UK and overseas. The Department may also

receive requests from other organisations, as well as individuals. The conservation of the original dissertation is better

assured if the Department and/or Library can fulfill such requests by sending a copy. The Department may also make your

dissertation available via its web pages.

In certain cases where confidentiality of information is concerned, if either the author or the supervisor so requests, the

Department will withhold the dissertation from loan or consultation for the period specified below. Where no such

restriction is in force, the Department may also deposit the Dissertation in the University of Sheffield Library.

To be completed by the Author – Select (a) or (b) by placing a tick in the appropriate box

If you are willing to give permission for the Information School to make your dissertation available in these ways, please

complete the following:

✓ (a) Subject to the General Regulation on Intellectual Property, I, the author, agree to this dissertation being made

immediately available through the Department and/or University Library for consultation, and for the Department

and/or Library to reproduce this dissertation in whole or part in order to supply single copies for the purpose of

research or private study

(b) Subject to the General Regulation on Intellectual Property, I, the author, request that this dissertation be withheld

from loan, consultation or reproduction for a period of [ ] years from the date of its submission. Subsequent to

this period, I agree to this dissertation being made available through the Department and/or University Library for

consultation, and for the Department and/or Library to reproduce this dissertation in whole or part in order to

supply single copies for the purpose of research or private study

Name Sen Wang

Department Information School

Signed Sen Wang Date 28.08.2016

To be completed by the Supervisor – Select (a) or (b) by placing a tick in the appropriate box

(a) I, the supervisor, agree to this dissertation being made immediately available through the Department and/or

University Library for loan or consultation, subject to any special restrictions (*) agreed with external organisations

as part of a collaborative project.

* Special restrictions

(b) I, the supervisor, request that this dissertation be withheld from loan, consultation or reproduction for a period of

[ ] years from the date of its submission. Subsequent to this period, I agree to this dissertation being made

available through the Department and/or University Library for loan or consultation, subject to any special

restrictions (*) agreed with external organisations as part of a collaborative project

Name

Department

Signed Date

THIS SHEET MUST BE SUBMITTED WITH DISSERTATIONS BY DEPARTMENTAL REQUIREMENTS.