The Foundations of Scientific Thinking

Notes

Contents

• The Development of Modern Science

➢ Epistemology

➢ Influence of Empiricism on Scientific Inquiry

➢ Induction vs Deduction

➢ Parsimony/Occam’s Razor

➢ Falsifiability

➢ Significance of Confirmation Bias

➢ Cultural Contribution Knowledge

➢ Paradigm Shift

• Influences on Current Scientific Thinking

➢ Ethics, Morality, & the Law

➢ Current Influences on Scientific Thinking

➢ Influence of Ethical Frameworks on Scientific Research

➢ Use of Research Data

• The Development of Modern Science

➢ Epistemology

o Epistemology is defined as ‘a branch of philosophy that investigates the origin, nature, methods, and limits of human knowledge’.

o Scientific epistemology explores the nature of scientific knowledge. It consists of three aspects:

1. The Qualities of Scientific Knowledge

▪ Science attempts to explain natural phenomena.

▪ Scientific knowledge is represented as laws and theories.

▪ Laws: describe patterns and relationships in scientific information.

▪ Theories: provide explanations of natural phenomena.

▪ Scientific knowledge is tentative, as it remains open to revision.

▪ Science is part of the social and cultural traditions of many human societies.

▪ Scientific ideas are affected by their social and historical setting.

2. The Limitations of Scientific Knowledge

▪ Science does not make moral judgements (e.g. should euthanasia be permitted?).

▪ Science does not make aesthetic judgments (e.g. is Mozart’s music more beautiful than Bach’s?).

▪ Science does not prescribe how to use scientific knowledge (e.g. should genetic engineering be used to develop disease-resistant crops?).

▪ Science does not explore supernatural or paranormal phenomena (e.g. religious ideas and ghosts).

3. How Scientific Knowledge is Generated

▪ The development of scientific knowledge relies on observations, experimental evidence, rational arguments and scepticism.

▪ Scientific knowledge advances through slow and incremental steps (evolutionary progression), as well as giant leaps of understanding (revolutionary progression).

▪ Observations are theory dependent, which influences how scientists obtain and interpret evidence.

▪ There is no universal step-by-step scientific method. Scientific knowledge is acquired through a variety of different methods. The two main lines of reasoning that influence modern science are inductive (generalising) and deductive (deriving) processes.

o Science distinguishes itself from other ways of knowing and from other bodies of knowledge through the use of empirical standards, logical arguments, and scepticism, as scientists strive for certainty in their proposed explanations.

o Alternative Ways of Knowing:

Alternative Way | Explanation | Examples

Emotion | feeling, as opposed to reasoning | Can/should we control our emotions? Are emotions the enemy of, or necessary for, good reasoning?

Faith/Belief | trust or confidence | Can theistic beliefs be considered knowledge because they are produced by a special cognitive faculty or “divine sense”? Does faith meet a psychological need?

Imagination | forming new ideas, or images or concepts of external objects not present to the senses | What is the role of imagination in producing knowledge about a real world? Can imagination reveal truths that reality hides?

Intuition | a form of knowledge that appears in consciousness without obvious deliberation | Are there certain things that you have to know before being able to learn anything at all? Should you trust your intuition?

Language | a system of communication used by a particular country or community | How does language shape knowledge? Is the importance of language cultural?

Memory | the faculty by which the brain encodes, stores, and retrieves information | Can we know things which are beyond our personal present experience? Can our beliefs contaminate our memory?

Reason | a basis or cause, as for some belief, action, fact or event | What is the difference between reason and logic? How reliable is inductive reasoning?

Sense Perception | understanding gained through the use of one of the senses, such as sight, taste, touch or hearing | How can we know if our senses are reliable? What is the role of expectation or theory in sense perception?

o Navigation

❖ Early travellers relied on their senses (sense perception) to observe

landforms, wind speed and direction, tides and measures of

distance to navigate (observational knowledge). Celestial navigation

using the positions of stars, constellations and the sun also served as

navigational aids. In those times, travel was restricted to short

distances, or to coastal areas.

❖ With advances in measuring techniques (and geometry), accurate

maps were created. Such calculations indicated that the Earth was a

sphere. The altitude of the North Star provided latitudinal

information. These are examples of knowledge constructed through

memory, language (communication through oral stories, written

accounts and maps) and reasoning.

❖ Later, navigational instruments extended the powers of sense perception. The compass was an important tool for orienting travellers to magnetic north (it works at night as well). Other instruments, such as the astrolabe, sextant, chronometer and chip log, were designed to identify locations in 3-dimensional space. The information from these instruments was used to produce highly refined maps (ways of knowing: reasoning, imagination, intuition, language).

❖ Modern navigation uses radar, gyroscopic compasses and the GPS to

provide positional and kinematic (e.g. speed and acceleration)

information.

❖ Polynesians used natural navigation aids such as the stars, ocean

currents, and wind patterns. They used non-physical devices such as

songs and stories for memorizing the properties of stars, islands,

and navigational routes.

➢ Influence of Empiricism on Scientific Inquiry

o Science is derived from philosophy. The term ‘philosophy’ means the love of wisdom. One branch of philosophy focuses on developing explanations of the natural world. This branch was called ‘natural philosophy’.

o Around the 15th century, natural philosophers began to redefine how knowledge of the natural world should be constructed. Natural philosophy was the beginning of science. In the 19th century, the British philosopher William Whewell coined the term ‘scientist’ to describe those who undertook the type of inquiries pursued by the natural philosophers. Eventually, science became distinct from other branches of inquiry (such as philosophy and religion).

o As an example of the common roots of science and philosophy, the highest research degree awarded by universities around the world is the Doctor of Philosophy (PhD), even in science. After a person receives a PhD, they are allowed to use the title “doctor” (medical science is the exception to this rule).

o Empiricism is a branch of philosophy that emphasises ‘prior experience’. Empiricists say that we can only construct knowledge after collecting information through our senses. Sensory information extends to information collected using instruments.

o Therefore, observations are important for knowledge construction. The

information collected through observation eventually becomes evidence

and explanations for natural phenomena. Over time, the evidence and

explanations become knowledge.

o Empiricism was crucial for the separation of natural philosophy from the other branches of philosophy. It came to define modern science. Most scientific knowledge is empirical. Empiricism demands that all scientific information be based on evidence and tested through observations or experimentation.

➢ Induction vs Deduction

o Induction is the process of generalisation. After collecting information about specific events, generalisations are drawn. They describe the broad applications of the conclusions. In science, inductive reasoning allows explanations of related phenomena to be constructed.

o Deduction is the process of deriving specific knowledge from broad ideas.

Therefore, deductive reasoning is often used to make predictions.

▪ The top panel illustrates inductive thinking. When a leaf is examined under a microscope, it is seen to be composed of cells. Examining the leaves of many plants leads to the same conclusion. Therefore, through inductive reasoning, we may conclude that all plants are composed of cells. In doing so, the definite conclusion of each observation is used to synthesise a generalisation: the Cell Theory. Theories are the big ideas in science, that is, broad explanations of natural phenomena.

▪ The lower panel illustrates deductive thinking. Here, we

start with a big idea – that of the Cell Theory, which

states that all plants are composed of cells. Suppose

you have come across a new and unknown type of

plant. Based on the Cell Theory, you predict that the

unknown plant is composed of cells. This prediction is

called a hypothesis. You then conduct an experiment,

where you observe that the plant is indeed composed

of cells. In this case, we have moved from a general idea (the Cell Theory) to a specific conclusion (that the new, unknown plant is composed of cells).

▪ Many of the big ideas or theories in science are the

products of inductive thinking. This example shows

Charles Darwin’s inductive thinking on populations of

organisms.

1. Darwin makes several discrete observations about

how individuals in a population are adapted to

their environments (for example the beaks of

different populations of finches show different

shapes).

2. Based on the five observations shown on the slide,

Darwin makes two inferences.

3. After many such inferences, Darwin develops a

new big idea, which generalises how populations

change over time. This is known as the Theory of

Evolution by Natural Selection.

o Scientific Laws describe the relationships between the variables of a system.

They are usually expressed in the form of mathematical equations. This slide

shows equations in the Chemistry and Physics datasheets.

o Scientific laws are products of inductive reasoning: they generalise from many specific observations.

➢ Parsimony/Occam’s Razor

o William of Occam was an English friar who lived in the 13th and 14th centuries. Although he did not invent the phrase, he frequently used the phrase “Plurality must never be posited without necessity” in his writings. Many thinkers before Occam, including the Greek philosophers Aristotle and Ptolemy, made similar statements. A modern-day statement of Occam’s razor is “Other things being equal, simpler explanations are generally better than more complex ones”.

o Science works with competing ideas. That means that when scientists are trying to develop explanations of some phenomenon, they devise alternative hypotheses. Sometimes, after testing those hypotheses, there may be more than one plausible hypothesis for a phenomenon. In those situations, using Occam’s razor may be useful. Occam’s razor says that if the competing hypotheses are equivalent, then the simpler hypothesis is the best explanation for the phenomenon.

▪ The discovery of the electron by J.J. Thomson is an

example of deductive thinking. While studying the

nature of cathode rays, Thomson was exploring the

basis of the Atomic Theory.

▪ In the 19th and early 20th centuries, it was thought that

atoms were electrically-neutral and indivisible

constituents of matter. However, through careful

experimentation and data analyses, Thomson

discovered that atoms were composed of subatomic

particles.

▪ One type of subatomic particle was negatively charged and was called the electron. As a result of his discoveries, the Atomic Theory was modified.

o There are many historical examples of the use of Occam’s razor. Before the 16th century, the geocentric model of the solar system (Earth at the centre of the solar system) was dominant. Then, this was replaced with the heliocentric model (Sun in the centre). The geocentric model required a number of complicated features (e.g. epicycles) to explain some unusual phenomena (such as the retrograde motion of Venus). The heliocentric model did not require such features and is thus a simpler model.

o Scientists do not use Occam’s razor exclusively when accepting ideas in

science.

o The most important factor is evidence.

o Other considerations:

❖ Are some ideas more testable than others?

❖ Are some ideas better at producing broader explanations?

❖ Are some ideas a better fit with existing ideas?

❖ Are some ideas better at generating new areas for investigation?

➢ Falsifiability

o Falsifiability is a method of developing scientific knowledge. It is a type of deductive reasoning. It claims that all scientific ideas should be falsifiable through testing (for example, through experimentation). If an idea cannot be falsified, then it cannot be scientific. For example, creation science and intelligent design are not considered to be scientific because their ideas cannot be tested.

o While not everyone agrees with the principle of falsification, it has influenced two aspects of science:

❖ Differentiating scientific ideas from non-scientific ideas

❖ Providing a method to test and verify scientific ideas.

o Falsification has given rise to one method of testing and verifying scientific ideas. This is known as hypothesis testing.

o Hypotheses are tentative explanations of a narrow set of related phenomena. For example, consider the hypothesis “particulate pollution in the atmosphere increases the incidence of asthma”. This hypothesis proposes an explanation for the increased incidence of asthma. It is a tentative explanation that is based on observations, but needs to be verified. In other words, the hypothesis needs to be tested.

o To test the hypothesis, a controlled experiment must be conducted and the

data generated in that experiment analysed.

o Two important features of hypotheses are that:

❖ Hypotheses cannot be proven to be true – they can only be falsified

(this is because of the falsification principle).

❖ Hypotheses can only be rejected (if they are NOT supported by evidence) or not rejected (if the evidence supports the hypothesis).

o These are important features of hypotheses to bear in mind. The goal of hypothesis testing is to reject what is false (not supported by evidence).

o Often, hypothesis testing also involves the statistical analysis of experimental data. So, it is not simply the data collected in an investigation that is used to verify hypotheses, but also the quality of that data.
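The reject / do-not-reject logic described above can be sketched with a simple permutation test. This is only an illustration under stated assumptions: the asthma figures are invented, the function name `permutation_p_value` and the 0.05 threshold are choices made for this sketch, and a permutation test is just one of many statistical analyses that could be used. Note that in statistical practice it is usually the "no effect" (null) hypothesis that is rejected or not rejected.

```python
import random
from statistics import mean

# Hypothetical asthma cases per 1,000 residents (invented for illustration only).
high_pollution = [62, 58, 71, 66, 69, 60, 74, 65]
low_pollution = [51, 55, 49, 60, 47, 53, 58, 50]


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate how often shuffling the group labels gives a difference in
    means (group_a minus group_b) at least as large as the observed one."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        shuffled_a = pooled[:len(group_a)]
        shuffled_b = pooled[len(group_a):]
        if mean(shuffled_a) - mean(shuffled_b) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_shuffles


observed_diff, p_value = permutation_p_value(high_pollution, low_pollution)
alpha = 0.05  # conventional significance threshold, assumed here

# The null hypothesis ("pollution makes no difference") is either rejected or
# not rejected; neither outcome proves the original hypothesis to be true.
decision = "reject null" if p_value < alpha else "do not reject null"
print(f"difference in means: {observed_diff:.2f}, p = {p_value:.4f}, {decision}")
```

Even when the null hypothesis is rejected, the result only supports the pollution hypothesis; it does not prove it, which matches the falsification principle in these notes.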

➢ Significance of Confirmation Bias

o Observations are an important element of scientific inquiry. The inferences drawn from them can be influenced by:

❖ Confirmation Bias

❖ Theory-Laden Observation

o Confirmation Bias: the tendency to search for or interpret information in a

way that confirms one’s preconceptions.

o Theory-Dependent Observations: how previous experiences, beliefs and

assumptions affect the inferences drawn from observations.

o No matter whether we use inductive or deductive reasoning, observations are important. The quality of observations is crucial for initiating inquiries and investigations. Consider Marshall and Warren’s study on the microbial cause of gastric ulcers. The observation that Helicobacter pylori is frequently associated with gastric ulcers was crucial, and led to the discovery that the bacterium causes the disease.

o Another important aspect of observations is the analysis of data. Identifying

patterns and trends in experimental data is critical for finding evidence that

may support the hypotheses.

o Theory-dependent observation refers to observations that are dependent on theories. It means that prior knowledge of scientific theories may influence the inferences that we draw from observations. This extends to the ways that we analyse and interpret observations. Optical illusions are often used to illustrate theory-dependent observations. In the picture shown in the slide, the image on the left shows two lines, one vertical and the other horizontal.

o On initial inspection, the vertical line appears to be longer than the

horizontal line. Yet, when measured, both lines are of the same length.

Although this example is simplistic, it illustrates how the initial

interpretations we make of our observations may be misleading and require

further inquiry. Therefore, how we interpret observations is dependent on

prior experience.

o Theory-dependent observations play an important role in the way ‘experts’

interpret information. For example, an X-ray image may not be informative

to the untrained person, but to an experienced radiologist, the same image

may be very informative. The radiologist may be able to pick up certain

conditions or pathologies, because of his/her past learning and experiences.

o Theory-laden observations may also be responsible for professional intuition, where the expert practitioner may be able to arrive at certain conclusions without conducting much analysis.

o Theory-laden observations can lead people to derive different conclusions from the same set of observations. As shown on this table, in many scientific fields, scientists have come to different conclusions while studying the same phenomena. In some instances, those conclusions have been wrong (e.g. Aristotle, Ptolemy). In other instances, the different conclusions describe different aspects of the same phenomena (Newton and Einstein).

o Confirmation bias undermines scientific inquiry. As the name suggests, it is a form of bias: it improperly confirms a researcher’s belief about the outcome of an inquiry. There are many reasons why confirmation bias may occur in a scientific investigation. Some of the reasons are listed on this slide.

o Poor experimental design or data is a major cause of confirmation bias. Sometimes, preliminary studies are interpreted as confirmatory studies. For example, in a recent article, it was claimed that bald men are more likely to be afflicted with Covid-19 disease. This was only an observation in a couple of hospitals, and not the result of a well-designed investigation. Confirmation bias may also occur when correlation is confused with cause-and-effect.

o Here is an example of confirmation bias in the scientific literature. It is generally assumed by biologists that ants are more aggressive to ants from neighbouring nests than to those from their own nest. A research team in Melbourne decided to examine the scientific papers on the nesting behaviour of ants. They looked at 79 publications, and noticed that only 29% of those were designed as blinded studies. Blinded studies are controlled experiments. In other words, 71% of the published studies did not use a proper experimental design. The researchers also noted that in the studies that were not controlled experiments, the assumed ant behaviour was reported. However, in the controlled-experiment studies, the reverse was observed. This occurred because in the uncontrolled studies, the researchers did not check for aggressive behaviour within each nest. They simply assumed that ants were less aggressive towards nest mates, compared to ants from other nests. Thus, the poor experimental design of those studies resulted in a confirmation bias.
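As a quick check on the figures in the ant example: if 29% of the 79 publications were blinded, the share that was not blinded is 100% - 29% = 71%. The short sketch below works this through. The variable names are chosen for illustration, and the study counts are rounded estimates, since the notes give only percentages.

```python
# Figures quoted in the notes: 79 publications, 29% designed as blinded studies.
total_publications = 79
blinded_share = 0.29

# Share of publications that were not blinded.
not_blinded_share = 1 - blinded_share

# Approximate counts only: the notes give percentages, not exact numbers of papers.
approx_blinded = round(total_publications * blinded_share)
approx_not_blinded = total_publications - approx_blinded

print(f"not blinded: {not_blinded_share:.0%}")
print(f"approx. blinded: {approx_blinded}, approx. not blinded: {approx_not_blinded}")
```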

➢ Cultural Contribution Knowledge

o Many cultural knowledge systems have influenced the development of scientific knowledge.

o Knowledge construction is closely linked with cultural constructs. This means

that knowledge construction depends on the languages used in a society,

the cultural practices and other factors. In the preceding slides, we looked at

how scientific knowledge is constructed. We examined the central role of

empiricism, reasoning tools (such as induction and deduction), Occam’s

razor, falsification, confirmation bias and paradigm shifts in developing

scientific knowledge. Most of the scientific research and knowledge

construction that happens around the world is largely the product of

European schools of thought.

o However, all cultures in the world have systems for constructing knowledge.

In every culture, knowledge is constructed and communicated in ways that

are specific to those cultures. For example, the knowledge systems in

indigenous societies are called Traditional Knowledge. Many governments

are now tapping into traditional knowledge systems, as those systems have

developed different, but relevant, explanations of natural phenomena.

o As with science, cultural observational knowledge is based on developing

inferences from observations. However, there is little or no experimentation

such as that seen in science. Over the years, cultural observational

knowledge has made significant contributions to scientific advancement.

o One example of cultural observational knowledge that is common to many

societies is astronomy. People around the world realised that many natural

phenomena can be attributed to astronomical events. For example, changes

in the seasons, weather and tides were associated with the positions of the

sun and the moon in the sky. Agriculture was dependent on seasonal

information. The patterns of stars (constellations) in the sky could provide

positional and directional information for travel. Therefore, many cultures

developed systems for measuring and analysing astronomical data. As

shown in this slide, observatories have been identified in ancient Mexican

(Mayan), Egyptian, Indian and Chinese societies. Much of this information

has been used to construct knowledge of the Earth (for travel and trade) and

astronomical phenomena.

o The indigenous cultures in Australia are ancient and have existed on this continent for more than 60,000 years. There were more than 400 Aboriginal

nations in Australia. There were different languages and cultural practices in

those societies. They constructed knowledge of natural phenomena and

transmitted that knowledge mainly in the oral tradition. For example,

Aboriginal societies studied the night sky and developed mythical tales of

constellations and other astronomical phenomena.

o The emu in the sky describes the region of the Milky Way that is adjacent to the Southern Cross, and forms part of the Dreaming narrative about creation. Other stories were built around the Pleiades cluster and the constellation Orion. In addition to mythologies, the night sky also provided information about seasonal changes and served as a guidepost for celestial navigation. Time, calendars and information about seasons were developed using astronomical knowledge. Some other uses of astronomical knowledge in Aboriginal societies are indicated in this slide.

o Aboriginal societies also developed extensive knowledge about local

Australian ecosystems. This knowledge is referred to as Traditional

Ecological Knowledge. That knowledge is currently used in Australian states

and territories for managing ecosystems and landcare. Their understanding

of the role of bushfires in the functioning of local ecosystems is proving to

be critical for modern fire management systems. Another area of traditional

knowledge that has received scrutiny is bush medicine. Traditional

knowledge is being used to identify new substances from native plants that

have medicinal and therapeutic value, including antibiotics, antimicrobials

and antiviral products. Thus, contemporary society benefits from traditional knowledge as it becomes integrated with scientific knowledge.

o The use of traditional knowledge for the development of medicinal, therapeutic or health products has implications for commercialisation practices and intellectual property.

o All civilisations developed knowledge of natural phenomena. As shown in

this slide, the cultural observational knowledge of many civilisations

influenced the development of modern science. The Islamic cultures of the

Middle Ages amalgamated and advanced the knowledge systems of those

civilisations and formed the basis of scientific development in Renaissance

Europe.

o Greece: parallax measurements and geometry; geocentric and heliocentric

models of the solar system

o Egypt: curvature of the Earth (Aristarchus), calendar, brewing, agriculture

o India: metallurgy, surgery, medicine, mathematics, astronomy

o China: metallurgy, printing, explosives, paper, irrigation, acupuncture

o Islamic: medicine, physics, chemistry, biology, astronomy

o Here is an example of cultural observations that enhanced scientific understanding of natural phenomena. During the Middle Ages, the Islamic world was a centre of learning. As a result of military conquests and trading relations, Islamic cultures in the Middle East and North Africa had access to knowledge and data from many parts of the world. Islamic scholars translated the works of the ancient Greeks, Romans and Egyptians. They assembled information about scientific discoveries from far-off places, such as India and China. Universities in the Middle East were highly-regarded centres of learning.

o Another example of the contribution of cultural observational knowledge to science is the discovery of the anti-malarial compound, Artemisinin. A Chinese literary work, dating back to ~300 A.D., suggested that preparations of the herb Artemisia annua (sweet wormwood) may provide protection against malaria. Youyou Tu decided to extract the active ingredient from this plant so as to develop the substance as a therapeutic. After many years of sustained effort, Tu isolated the extract and showed that it was effective against both types of malarial parasites. Tu then proceeded to determine the chemical structure of the active ingredient, which was called Artemisinin (after the scientific name of the herb). Artemisinin is now part of the established anti-malarial therapy used around the world to treat the condition. Research is also underway to develop new therapeutics, based on the molecular structure of Artemisinin. For her efforts, Tu received the Nobel Prize in Physiology or Medicine in 2015.

➢ Paradigm Shift

o Thomas Kuhn was a philosopher of science who explored scientific epistemology. For any scientific discipline, the set of concepts, theories, research methods and postulates used by scientists makes up the paradigm of that discipline.

o Kuhn said that the scientific paradigm consists of normal science and puzzle-

solving science. Normal science is everyday science, where scientists

conduct inquiries into the paradigm, for example, by verifying hypotheses.

The ‘discoveries’ of normal science are expected findings, based on the

prevailing paradigm. Over time, anomalies will appear in the prevailing

paradigms.

o Work on those anomalies is referred to as puzzle-solving science. Scientists conduct investigations to understand the anomalies. The discoveries in puzzle-solving science are unexpected and lead to paradigm shifts (also called scientific revolutions). We will explore these ideas further in the subsequent slides.

o According to Kuhn, a paradigm shift occurs in 3 stages:

❖ In Stage 1, normal science dominates. Scientists go about verifying

the prevailing concepts through observation and experimentation.

During this stage, hypotheses that are supported by evidence will be

retained, while those that are not, are rejected. This builds into a

body of scientific knowledge that forms the paradigms for the

various scientific disciplines.

❖ In Stage 2, scientists note that there are anomalies to the prevailing

paradigms. Anomalies are experimental data or observations that

cannot be explained by contemporary concepts.

❖ In Stage 3, the anomalies force scientists to search for new explanations. When identified, the new concepts, ideas and explanations result in a paradigm shift.

1. The prevailing view of how populations change over time was the model proposed by Lamarck. According to him, the characteristics that an individual acquires during its lifetime are transmitted to its offspring. This model is called the Inheritance of Acquired Characters. For example, the children of a person engaged in physical labour would be expected to develop a stronger physique.

2. There were many anomalies that could not be explained by this paradigm. One such anomaly came from the experiment conducted by the German biologist, August Weismann. He took a population of mice and amputated their tails. Those mice were allowed to breed. All of the offspring had normal tails. Once again, he amputated their tails and bred them. He did this for 19 generations, but all of the offspring in every generation had normal tails. Therefore, this observation was an anomaly to the theory of the Inheritance of Acquired Characters.

3. Charles Darwin proposed the Theory of Evolution by Natural Selection, which was a different model from the Inheritance of Acquired Characters. Eventually the paradigms of evolutionary biology shifted, and Lamarck’s theory was discarded. Darwin’s theory of evolution, together with advances in genetics and cell biology, changed our understanding of the inheritance of biological traits in populations.

o When anomalies appear in a scientific discipline, there are 2 ways by which paradigm shifts may occur. The first is theory replacement. This describes the situation when a new theory replaces an old theory. In modelling the arrangements and movements of planets, the geocentric model was replaced with the heliocentric model.

o The second method by which paradigm shifts occur is called theory modification. Here, the old theory is not replaced, but modified. In the example shown on this slide, the paradigm of Newtonian mechanics was modified to include Einstein’s theories of relativity. Both theories are relevant and valid, but are used to describe different systems.

o Both Galileo and Newton laid the foundations of classical mechanics.

Classical mechanics describes the motion of macroscopic (large) objects,

including those that are at rest. This movement is described in terms of the

masses of objects, and the forces acting on it. Parameters such as distance,

speed, time, space characterise the properties of a moving body. They were

very successful because if we know some initial conditions of a moving body,

then we can predict certain future outcomes, based on the principles of

classical mechanics. For example, if the speed of a moving object is known,

we can calculate the time taken for the object to travel a certain distance.

Classical mechanics applies equally to everything, everyone and

everywhere: for example, Newton’s 2nd Law of motion (F = ma) is as

applicable on the Moon as it is on the Earth. Many aspects of modern

society are based on the principles of classical mechanics, for example,

calculating travel times.

o Despite the success of classical mechanics, scientists noted that it could not

explain some specific types of phenomena, for example, very small objects

(e.g. atoms), very large objects (e.g. stars), or objects moving very fast (near

the speed of light). Indeed, Newtonian mechanics was incompatible with

Maxwell’s description of electromagnetism.

o Einstein explored some of these anomalies using a combination of ideas in

physics and mathematics. Special relativity describes the physics of particles

and waves moving at, or close to, the speed of light. On the other hand,

general relativity deals with the physics of large objects (gravitation).

Relativity did not displace the principles of Newtonian physics, but modified

them. For example, special relativity describes an equation for calculating

the momentum of a particle that is moving close to the speed of light.

However, for a particle that is moving at slower speeds, Einstein’s equation

for special relativity approximates to the equation for momentum in

classical mechanics. Special relativity also introduced new concepts in

physics, such as length contraction, time dilation, relativistic mass, a

universal speed limit, and mass–energy equivalence (E = mc²). General

relativity extends Newton’s Theory of Universal Gravitation. According to

Newton’s model, gravity is an attractive force between two objects.

However, General Relativity extends that idea to describe gravitation as the

warping of space-time around massive objects. Indeed, some aspects of

classical mechanics are seen to be an approximation of special relativity at

low velocities, and special relativity is an approximation of general relativity

in low gravitational fields.
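
The low-speed limit described above can be checked numerically. The sketch below is a minimal illustration, assuming the standard textbook expressions p = mv for classical momentum and p = γmv, with γ = 1/√(1 − v²/c²), for relativistic momentum. The function names are hypothetical and chosen only for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact by definition)


def classical_momentum(mass_kg, speed_m_s):
    """Newtonian momentum p = m * v."""
    return mass_kg * speed_m_s


def relativistic_momentum(mass_kg, speed_m_s):
    """Special-relativistic momentum p = gamma * m * v."""
    if abs(speed_m_s) >= C:
        raise ValueError("speed must be below the speed of light")
    gamma = 1.0 / math.sqrt(1.0 - (speed_m_s / C) ** 2)
    return gamma * mass_kg * speed_m_s


def momentum_ratio(speed_m_s):
    """Relativistic momentum divided by classical momentum (equals gamma)."""
    return relativistic_momentum(1.0, speed_m_s) / classical_momentum(1.0, speed_m_s)


if __name__ == "__main__":
    # The ratio stays close to 1 at everyday speeds and grows near c.
    for fraction in (1e-6, 0.1, 0.5, 0.9, 0.99):
        print(f"v = {fraction:g} c  ->  ratio = {momentum_ratio(fraction * C):.6f}")
```

At everyday speeds the ratio is indistinguishable from 1, which is why classical mechanics remains adequate for tasks such as calculating travel times.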

o Modern scientific practice is based on philosophical thinking that developed

over many centuries in Europe, as well as the cultural observational

knowledge of many societies around the world. Modern science is a

powerful method of inquiry. The findings of scientific inquiry eventually

develop into a dynamic body of knowledge, known as science. Empiricism is

a strong force that shapes scientific inquiry – the emphasis on evidence and

experience is central to the development of scientific understanding.

• Influences on Current Scientific Thinking

➢ Ethics, Morality, & the Law

o Consider the following scenario: imagine you are driving a car and are

approaching a traffic intersection. The light has just turned red. What do you

do? You will, I hope, come to a stop at the intersection. Now, think about

your reasons for stopping at the intersection – did you do it because it is

illegal to cross the intersection when the light is red? Or did you think about

safety issues of not following the traffic signals? What if, after having

stopped at the intersection, you notice that no other vehicles are

approaching the intersection – will you be

tempted to cross the intersection even

though the lights are still red?

o Since we live in societies among many

other people, our behaviours will affect

the people around us. The decisions we

make may also affect the other living

things with whom we share the planet, as

well as affect its environments. Therefore,

in all societies, rules of behaviour and

norms are developed in the interest of the

greater good. Essentially, there are three

influences on human behaviour: ethics, morality and the law.

o Scientific ethics is concerned with the truth and integrity of scientific

practice. The processes used to conduct investigations, analyse and

communicate the findings should all be based on honesty. That honesty is

one of the most powerful features of science.

o Ethically Questionable Practices

❖ A team of scientists claimed to have achieved cold fusion

(generating energy from nuclear fusion reactions at ambient

temperatures).

❖ The case of cold fusion shows another aspect of questionable

scientific ethics. Stanley Pons and Martin Fleischmann at the

University of Utah concluded that they had found evidence of

deuterium fusion occurring at room temperature (this was a ‘holy

grail’ of energy research). Rather than publishing their findings in a

peer-reviewed journal, they announced their findings at a press

conference. However, other scientists could not replicate Pons and

Fleischmann’s experiments. Later that year, a U.S. Department

of Energy panel concluded that the evidence for cold fusion

was not convincing. Although their work was not considered to be scientific

fraud, it was unethical as they did not follow the scientific process.

o Ethical Frameworks

❖ The principle of autonomy: making voluntary and informed

decisions (i.e. capacity to act intentionally, with understanding, and

without controlling influences)

❖ The principle of non-maleficence: no subject in a study is

intentionally harmed or injured, either through acts of commission

or omission

❖ The principle of beneficence: produce beneficial outcomes and take

positive steps to prevent and to remove harm from the

patient

❖ The principle of justice: Equal access to care, benefits,

compensation

❖ The principle of confidentiality: maintaining anonymity and privacy.

❖ The principle of non-deception: maintaining open and truthful

communications

➢ Current Influences on Scientific Thinking

o Economic

o Political

o Global

➢ Influence of Ethical Frameworks on Scientific Research

o Human Research

❖ Human experimentation refers to scientific investigations of humans

(it excludes studies in other areas, such as social science, education,

etc.). Human experimentation may involve manipulation (e.g.

clinical trials), or be purely observational. The history of human

experimentation is a mixed one. Although our knowledge of human

biology advanced in leaps and bounds through experimentation,

many of those studies would not be conducted in the original manner

today.

❖ For example, vaccination is a powerful medical therapy that has

improved the human condition worldwide. The English doctor,

Edward Jenner, is credited with the first scientific demonstration of

vaccination (strictly speaking, his method used cowpox material and so differs from variolation, the older practice of inoculating with smallpox material).

However, by today’s standards, Jenner’s studies on vaccination

would be considered to be unethical.

❖ When the details of the Tuskegee Study of Untreated Syphilis

were revealed to the public, they caused an uproar

internationally. In this study, 399 syphilitic and 201 non-syphilitic

African-American men were part of a study to determine the

physiological effects of syphilis infections on humans. A synopsis of

the research reveals the following:

▪ The participants were not told that they were involved in a

human experiment.

▪ The participants were given inducements to take part in the

experiment.

▪ The participants were denied treatment for syphilis

infections, even though the treatment became available

during the study.

❖ All human experimentation in Australia is governed by the ethical

frameworks developed by the National Health and Medical Research

Council, also referred to as the NHMRC. These ethical frameworks

are based on the same principles of universal ethics. The four key

frameworks are:

❖ In Australia, at institutions that undertake human research, all

research proposals must be approved by a Human Research Ethics Committee (HREC). Human research

cannot be conducted unless an ethics permit is obtained.

❖ HRECs consist of researchers, non-researchers and community

members.

❖ The HRECs evaluate applications based on:

1. How is the research question/theme identified or developed?

2. How do the research methods align with the research aims?

3. How will the researchers and the participants engage with

one another?

4. How will the research data or information be collected,

stored, and used?

5. How will the results or outcomes be communicated?

6. What will happen to the data and information upon

completion of the project?

o Experimentation on Animals

❖ Several ethical frameworks also govern the use of animals in

research. Animals are used in many areas of study. The reasons for

using animals in experimentation are two-fold:

1. To learn about the biology and behaviour of the animals

themselves – this is important for veterinary science,

agriculture, management of wildlife (e.g. zoos, aquaria, parks)

and conservation.

2. To use animals as models of human biology – since there are

many biochemical, physiological and genetic similarities

between many animals and humans, animal models can

provide a wealth of information about how human biology

works.

❖ There are several reasons why animals are used for biomedical

research:

1. Animals are biologically very similar to humans (mice and

humans share more than 98% genetic similarity)

2. Animals are susceptible to many of the same health problems

as humans – cancer, diabetes, heart disease, etc.

3. With a shorter life cycle than humans, animal models can be

studied throughout their whole life span and across several

generations.

❖ There are both scientific and ethical imperatives for looking after

animals in research. Animals that are not cared for usually

experience stress, which tends to affect other physiological systems.

This may lead to anomalous results, as well as results that are not

reproducible.

❖ The ethical frameworks for animal experimentation apply to

vertebrate animals. These are animals with backbones (fish,

amphibians, reptiles, birds and mammals). This is because

vertebrates can experience pain, while current evidence suggests

that invertebrates do not experience pain.

❖ In Australia, individual states and territories are responsible for

overseeing animal ethics. As for human experimentation,

institutions must have Animal Ethics Committees (AECs) to review all

research involving vertebrate animals. Researchers must obtain

ethics permits to conduct their research.

❖ AECs focus on the care of animals in experiments, as well as the

disposal of animals after the investigations are completed. There are

strict guidelines on how animals should be housed, fed, cleaned and

maintained during the investigations. If animals are to be

euthanised, then the researchers must use approved methods to kill

the animals. These methods have been approved by expert

committees so as to reduce the stress and pain burden on animals.

❖ The 3R rule was devised to reduce reliance on the use of animals in

experiments:

▪ Replacement of animals with other methods – where

possible, viable alternatives to the use of animals should be

explored.

▪ Reduction in the number of animals used – researchers

should use the minimum number of animals in their

experiments. There are statistical models that they can use

to determine the minimum number of animals that can be

used in an experiment without affecting the reliability and

validity of their findings.

▪ Refinement of techniques used to minimise the adverse

impact on animals – researchers should always use the

latest findings regarding the manipulation of animals so as

to minimise pain and stress on them.
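
One commonly taught statistical model of the kind mentioned under Reduction above is a power calculation for comparing the means of two groups. The sketch below assumes a two-sided test, equal group sizes and a normal approximation. The function name, the default significance level (0.05) and the default power (0.8) are illustrative assumptions, not values taken from any ethics guideline.

```python
import math
from statistics import NormalDist


def animals_per_group(effect_size, std_dev, alpha=0.05, power=0.8):
    """Approximate minimum group size for a two-group comparison of means.

    effect_size: smallest difference in means worth detecting
    std_dev:     expected standard deviation within each group
    alpha:       two-sided significance level (assumed default)
    power:       desired probability of detecting the effect (assumed default)
    """
    if effect_size <= 0 or std_dev <= 0:
        raise ValueError("effect_size and std_dev must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * std_dev / effect_size) ** 2
    return math.ceil(n)  # round up: a fraction of an animal is not possible


if __name__ == "__main__":
    # Smaller effects need more animals to detect reliably.
    for difference in (2.0, 1.0, 0.5):
        print(difference, animals_per_group(difference, std_dev=1.0))
```

A calculation like this gives only a starting estimate; an actual study design would normally be reviewed by a statistician and by the relevant ethics committee.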

❖ New medical treatments are discovered through a staged pathway. In

vitro and in silico studies are those conducted with cell or

tissue cultures, molecular arrays and

computer simulations. Small animal research is usually the starting

point for the development of a new therapeutic. Once it shows promise in small animals,

the therapeutic is tested on large animals. Large animal studies

provide important information about physiological responses to the

therapeutic. Only when its potential is realised in large animals will

the product be used in human experiments (clinical trials). Both

animal and human ethics apply at multiple stages in this discovery

pipeline.

o Biobanks

❖ Biomedical researchers often need access to biological samples for

their experiments. In some types of research, they may need access

to samples from specific sectors of the population (e.g. for disease

tissues). Biobanks are repositories of biological samples that

researchers can use. To ensure that researchers have access to all

relevant information about the samples in the biobanks, there is a

detailed cataloguing process for all samples.

❖ Information about the type, origin, date of collection and other

details is kept in a database. The tissue samples are stored in

various ways, but usually in cold storage (−70 °C or liquid nitrogen).

One example of the use of biobank materials is the research

conducted by the Kathleen Cuningham Foundation National

Consortium for Research on Familial Breast Cancer (KConFaB).

❖ There, the researchers are looking to determine the:

▪ population rates of mutations in breast cancer genes;

▪ kinds of mutations that predispose to breast and ovarian

cancer;

▪ risk of breast and other types of cancer;

▪ age at which cancers occur; and

▪ effect of lifestyle and environmental factors on the risk of

developing cancer and age of onset.

❖ More than 100 research projects worldwide rely on samples from

this biobank.

❖ Many ethical issues surround the collection, maintenance and use of

the materials in biobanks. Some of those issues are:

▪ Informed consent – all donors must be willing participants

who have been informed about how their donated samples

may be used. Proper communications should also be set up

between the biobanks and the donors. Most importantly, no

one must be compelled to donate samples. Vulnerable

donors, including those who cannot make informed

decisions, need to be respected and protected. Where

relevant, cultural sensitivities must be taken into account.

▪ The information contained in biobanks must be treated

confidentially. The privacy of donors must be respected.

Most biobanks have coded identification systems where

confidential information is not released, except as required

by the researchers.

▪ Some research activities may result in commercially-

valuable discoveries. Biobanks must ensure that the benefits

of research and development are shared with all people

involved, in accordance with the law.
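
The cataloguing and coded identification described above can be sketched in a few lines. This is a hypothetical illustration only: the record fields, the function names and the use of a keyed hash (HMAC-SHA256) to derive a code are assumptions made for this example, not a description of how any particular biobank works.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SampleRecord:
    """Catalogue entry that stores a code in place of the donor's identity."""
    coded_id: str
    sample_type: str
    origin: str
    collected_on: date


def make_coded_id(donor_identifier, secret_key):
    """Derive a stable code from the donor identifier using a secret key."""
    digest = hmac.new(secret_key, donor_identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def catalogue_sample(donor_identifier, secret_key, sample_type, origin, collected_on):
    """Build a catalogue record that never contains the raw donor identifier."""
    return SampleRecord(
        coded_id=make_coded_id(donor_identifier, secret_key),
        sample_type=sample_type,
        origin=origin,
        collected_on=collected_on,
    )


if __name__ == "__main__":
    key = b"example-key-held-only-by-the-biobank"  # hypothetical secret
    record = catalogue_sample("Jane Citizen", key, "breast tissue", "clinic A", date(2020, 3, 1))
    print(record)
```

Because the same donor always maps to the same code, samples from one donor can be linked by researchers, while only the holder of the key can relate a code back to a person.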

➢ Use of Research Data

o The communication of research findings is at the core of the scientific

enterprise. Research data, once verified, are the raw materials of scientific

knowledge construction. For scientists, the generation and publication of

scientific data are the hallmarks of professionalism. Peer-reviewed

publications are the primary means of such communications, but scientists

also use other forms of communications. Most research funding agencies,

such as the Australian Research Council and the National Health and Medical

Research Council, require scientists to publish the data they generate

through funded research programs. When applying for such funding,

scientists have to indicate how they plan to publicise their findings. This

creates transparency, and the community can see the benefits of supporting

scientific research.

o Sometimes, research may produce data that should not be openly shared.

For example, research that has military or security implications, or data that

may be used to achieve harmful ends (e.g. bioterrorism) should not be

published. Furthermore, discoveries with commercial potential may not be

published (at least, critical data may not be revealed).

o Data sharing is beneficial, as it:

❖ Encourages further scientific enquiry and promotes innovation.

❖ Leads to new collaborations between data users and data creators.

❖ Maximises transparency and accountability.

❖ Reduces the cost of duplicating data collection.

o The ethics of data sharing centres on the following questions:

1. What data or information are required to achieve the objectives of

the project?

2. How and by whom will the data or information be generated,

collected and accessed?

3. How and by whom will the data or information be used and

analysed?

4. Will the data or information be disclosed or shared and, if so, with

whom?

5. How will the data or information be stored and disposed of?

6. What are the risks associated with the collection, use and

management of data or information and how can they be

minimised?

7. What is the likelihood and severity of any harm/s that might result?

The Scientific

Research Proposal

Notes

Table of Contents

• Developing the Question & Hypothesis

➢ Reliability, Validity & Accuracy

➢ What makes a source 'reliable'?

➢ Using citations to locate other relevant journal articles

➢ How to find more full-text articles

• Scientific Research Proposal

➢ Plan to Investigate the Scientific Hypothesis

➢ Referencing Protocols

• Methodology & Data Collection

➢ Uncertainty in Experimental Evidence

➢ Use of Errors

➢ Quantitative & Qualitative Research Methods

➢ Methods Used to Obtain Large Data Sets

• Processing Data for Analysis

➢ Impact of Kepler Telescope Data

• Developing the Question & Hypothesis

➢ Reliability, Validity & Accuracy

Reliability

▪ When a scientist repeats an experiment with a different group of people or a different batch of the same chemicals and gets very similar results then those results are said to be reliable. Reliability is measured by a percentage – if you get exactly the same results every time then they are 100% reliable.

▪ Try holding a ruler above a friend’s open hand and dropping it – they have to catch the ruler but may not move until they see the ruler start to move. Note down the measurement where the ruler was caught. Do this ten times and calculate the mean (average) result.

▪ Is the ‘dropping a ruler’ experiment a reliable measure of reaction time?
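
The ten-drop procedure above can be summarised with a short calculation. The sketch below assumes the ruler falls freely from rest, so the catch distance d and the reaction time t are related by d = ½gt². The ten distances are invented for illustration and are not measured data, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev

G = 9.81  # gravitational acceleration in m/s^2 (assumed value near Earth's surface)


def reaction_time_s(catch_distance_cm):
    """Convert a catch distance in centimetres to a reaction time in seconds."""
    distance_m = catch_distance_cm / 100
    return math.sqrt(2 * distance_m / G)


def summarise(catch_distances_cm):
    """Return the mean and sample standard deviation of the reaction times."""
    times = [reaction_time_s(d) for d in catch_distances_cm]
    return mean(times), stdev(times)


# Hypothetical results from ten drops (cm); replace with your own measurements.
catch_distances_cm = [18.5, 21.0, 17.2, 19.8, 20.4, 16.9, 22.1, 18.0, 19.3, 20.7]

if __name__ == "__main__":
    average, spread = summarise(catch_distances_cm)
    print(f"mean reaction time = {average:.3f} s, spread = {spread:.3f} s")
```

A small spread relative to the mean suggests the trials agree with one another, which is one informal way of judging how reliable the procedure is.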

Validity

▪ Validity describes whether the results of an experiment really do measure the concept being tested. Does seeing how far a ruler can drop through someone’s hand really measure reaction time? What other variables may be influencing the results?

▪ Is the ‘dropping a ruler’ experiment a valid measure of reaction time?

Accuracy

▪ Accuracy describes how well a measuring instrument determines the variable it is measuring. It can be employed in two ways:

▪ An accurate measuring instrument, say a thermometer, is one whose readings confirm a known result.

▪ The level of accuracy of a measuring instrument determines the detail to which it can measure. A micrometer measures length to a greater level of accuracy than a ruler which in turn measures length to a greater level of accuracy than a ‘clicker’ wheel.

▪ In order to be accurate in their work, scientists need to first select a measuring instrument that allows an appropriate measure of accuracy (e.g. a micrometer for the diameter of a piece of wire and a ruler marked in mm for its length) and then to calibrate it. Calibrating an instrument involves measuring already known quantities to assess how accurately it is working.

➢ What makes a source 'reliable'?

o I would be looking for journal articles which:

❖ Are published in a high ranking journal (usually ranked according to

how many citations articles in that journal have. See discussion at

https://en.wikipedia.org/wiki/Journal_ranking).

❖ Have a large number of citations (depending on how old the article

is) - meaning that other researchers refer to this paper in their own

research articles. The number of citations is a measure of how highly

regarded the work is by other researchers in the field.

❖ Have authors from well-regarded universities/institutions (but

this is not really as important as the first two).

❖ There may also be circumstances in which reliable information is

available on a website maintained by a reputable institution such

as NASA (for example, in my sample investigation, the NASA Earth

Observatory).

➢ Using citations to locate other relevant journal articles

o When you find a highly reliable and relevant article, look at both the articles

it cites as well as articles which cite that article to find other highly relevant

articles.

o You can find the articles that cite your article using, for example, Google

Scholar. When I search for my review article on "the albedo of earth" by

Stephens et al. on Google Scholar I can see that it has been cited by 30 other

articles. I click on the "citing literature" link to find those articles. In this way

you 'follow your nose' through the research until you find the information

you want and/or come up against the edge of what is known, with articles

published in the last year or two.

➢ How to find more full-text articles

o You can consider using the browser add-on "Unpaywall" which finds legal

full-text versions of articles (often stored in university repositories of their

researchers' work or in preprint archives). If you are having trouble finding a

full-text version of an article you can also sometimes find the researcher's

own personal webpage which may list full-text versions (for example I keep

full-text versions of my articles available on my personal website) as it is in

the researcher's interests that their articles be accessible.

o As mentioned on the "tools" page, joining the state and national

libraries may also help you find full-text papers.

• Scientific Research Proposal

➢ Plan to Investigate the Scientific Hypothesis

o The Overall Strategy

o Methodology

o Data Analysis

o Representation & Communication of the Scientific Research

o Timelines

o Benchmarks

➢ Referencing Protocols

o APA

o Harvard

o MLA

• Methodology & Data Collection

➢ Uncertainty in Experimental Evidence

Systematic Errors

Two types:

✓ Offset uncertainty - all measurements are larger or smaller than the "true" value by a constant amount. For example: a thermometer is not in sufficiently good thermal contact with a hot object, so the readings are all lower than the "true" temperature of the object. See this paper for an example of a systematic offset: https://fathomingphysics.nsw.edu.au/wp-content/uploads/2017/07/TEHumphrey_VCalisa_Phys_Teach_vol_52_iss_3_142_2014.pdf

✓ Gain uncertainty - measurements are larger or smaller than the "true" value by a fixed percentage. For example: a long measuring tape is stretched so that all 1 cm markings on the tape are now actually separated by 1.01 cm. Each reading will give a value that is about 1% lower than the "true" value, because fewer markings now span the same length.

Random Errors

Random errors correspond to the "scatter" in experimental data, and will result in readings that are scattered around the "true" value, usually with a normal distribution. The data in the paper we looked at earlier (https://fathomingphysics.nsw.edu.au/wp-content/uploads/2017/07/TEHumphrey_VCalisa_Phys_Teach_vol_52_iss_3_142_2014.pdf) also shows significant random error, with data points distributed above and below the line of best fit.
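
The three kinds of error described above can be compared with a small simulation. The sketch below is an illustrative model only: it assumes a reading equals gain × true value + offset + normally distributed noise, and the function name, parameter names and numerical values are all hypothetical.

```python
import random
from statistics import mean


def measure(true_value, offset=0.0, gain=1.0, noise_sd=0.0, rng=None):
    """Simulate one reading with optional offset, gain and random error."""
    rng = rng if rng is not None else random
    noise = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
    return gain * true_value + offset + noise


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    for true_value in (20.0, 50.0, 80.0):
        with_offset = measure(true_value, offset=-2.0)  # always 2 units low
        with_gain = measure(true_value, gain=0.99)      # always 1% low
        noisy = [measure(true_value, noise_sd=1.0, rng=rng) for _ in range(1000)]
        print(true_value, with_offset, round(with_gain, 2), round(mean(noisy), 2))
```

Averaging many readings reduces the effect of random error, but it does nothing to remove an offset or gain error; those can only be found by calibrating against known values.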

➢ Use of Errors

➢ Quantitative & Qualitative Research Methods

o Qualitative Research: This can be described as research that cannot easily

be communicated or understood in numerical terms. It usually involves

open ended questions and responses (such as interview questions or case

studies).

o Quantitative Research: Is an approach for testing objective theories by

examining the relationship among variables. The variables can be measured

and analysed numerically.

o Mixed Methods Research: Contains elements of both types of research.

o Different types of scientific inquiry share the following features:

❖ a commitment to deductive testing (i.e. the idea that

experimental/observational evidence determines if a theory can be

accepted)

❖ an experimental design that protects against bias

❖ a consideration of alternative explanations of the results

❖ the interpretation of data (either qualitative or quantitative) to

produce results that are reproducible and generalisable

❖ a discussion of how the research relates to other work done in that

field

➢ Methods Used to Obtain Large Data Sets

o Remote Sensing

❖ Remote sensing is obtaining information about an area or

phenomenon through a device that does not touch the area or

phenomenon under study.

❖ Passive remote sensors detect natural energy that is reflected or

emitted from an observed object or scene (most commonly, reflected

sunlight). For example, a camera or a spectrometer (or your eyes!).

❖ Active remote sensors provide their own energy (electromagnetic

radiation) to illuminate the object or scene they are observing, and

then detect the radiation that is reflected or backscattered from that

object. For example, Radar (Radio detection and ranging) or Lidar

(Light detection and ranging) instruments.

❖ Many remote sensing devices are on-board satellites that monitor the

Earth from space.

o Streamed Data

❖ There are many devices now used in research which can operate in an

autonomous or semi-autonomous mode in which data is recorded

continuously and then "streamed" out of the sensor for further

processing and analysis.

❖ While this technology has opened new opportunities in research, it has

also brought challenges. With advances in information technology it

has become possible to record very large amounts of data in a short

time. In some cases, e.g. the SKA discussed below, it is necessary to

process the data in real time to reduce the amount of data that is

placed in long term storage. In other systems the constraints may be

on the communication link from the instrument. For example, the

Kepler telescope was situated in a Sun-centred orbit and had a limited

capacity (once a month only) radio link back to Earth (see:

https://www.nasa.gov/mission_pages/kepler/spacecraft/index.html).

Similar considerations apply in sensor networks where the

communication back to base is limited by the amount of power

required.

❖ Other examples of streamed data sources include:

▪ Internet of People (consisting of wearable devices)

▪ Social media

▪ Financial transactions

▪ Industrial Internet of Things

▪ Cyberphysical Systems (requiring real-time responses)

▪ Satellite and airborne monitors

▪ National Security

▪ Astronomy

▪ Light Sources

▪ Instruments like the LHC

▪ Sequencers (all involving large volumes of data)

▪ Data Assimilation (where there is sensitivity to latency)

▪ Analysis of Simulation Results

▪ Steering and Control.
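
The idea of processing streamed data in real time, so that only a reduced summary is kept in long-term storage, can be sketched with a running-statistics calculation. The example below uses Welford's online algorithm to update a mean and variance one reading at a time without storing the readings. It is a generic illustration with hypothetical names, not the method used by the SKA, Kepler or any other instrument named above.

```python
class RunningStats:
    """Keeps a running mean and variance without storing the raw readings."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, reading):
        """Fold one new reading into the summary (Welford's algorithm)."""
        self.count += 1
        delta = reading - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (reading - self.mean)

    @property
    def variance(self):
        """Sample variance of the readings seen so far."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


if __name__ == "__main__":
    stats = RunningStats()
    for reading in (2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0):  # stand-in for a sensor stream
        stats.update(reading)
    print(stats.count, stats.mean, stats.variance)
```

Only three numbers are stored no matter how many readings arrive, which is the kind of trade-off a system faces when storage or communication capacity is limited.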

• Processing Data for Analysis

➢ Impact of Kepler Telescope Data

o The Kepler and K2 missions have provided an unprecedented data set with a

precision and duration that will not be rivalled for decades. Even though the

data has already contributed to nearly 2,500 scientific publications so far,

the scientific community continues to extract new discoveries from the

archive data every day.

o To help new users understand where there may be important scientific gains

left to be made in analysing Kepler data, and to encourage the continued

use of the archives, we have prepared a white paper which discusses a non-

exhaustive list of 21 important data analysis projects which can be executed

using the public data that are readily available in the archives today. Each

project contains a link to an issue on the GitHub repository of the white

paper where we invite researchers to discuss their ideas or progress towards

resolving the challenge.

o The studies discussed in the paper show that many of Kepler's contributions

still lie ahead of us, owing to the emergence of complementary new data

sets, novel data analysis methods, and advances in computing power.

The Data, Evidence

and Decisions

Notes

Table of Contents • Patterns & Trends ............................................................................................................... 38

➢ Data vs. Evidence ............................................................................................................ 38

➢ Qualitative vs. Quantitative Data Sets ............................................................................. 38

o Content & Thematic Analysis ....................................................................................... 38

o Descriptive Statistics ................................................................................................... 38

➢ Tools for Data Representation ......................................................................................... 38

o Spreadsheets .............................................................................................................. 38

o Graphical Representations........................................................................................... 38

o Models ....................................................................................................................... 38

o Digital Technologies .................................................................................................... 38

➢ Limitations of Data Analysis & Interpretation .................................................................. 38

o Quantitative Data ........................................................................................................ 38

o Qualitative Data .......................................................................................................... 39

• Statistics in Scientific Research ............................................................................................ 40

➢ Descriptive Statistics ....................................................................................................... 40

o Mean .......................................................................................................................... 40

o Median ....................................................................................................................... 40

o Standard Deviation ..................................................................................................... 40

➢ Performance Measures ................................................................................................... 40

o Error ........................................................................................................................... 40

o Accuracy ..................................................................................................................... 40

o Precision ..................................................................................................................... 40

o Bias ............................................................................................................................. 40

o Data Cleansing ............................................................................................................ 40

➢ Statistical Tests ............................................................................................................... 40

o Student’s t-test ........................................................................................................... 40

o Chi-square test ............................................................................................................ 41

o F-test .......................................................................................................................... 42

➢ Bivariate Correlation ....................................................................................................... 43

o Correlation Coefficient ................................................................................................ 43

➢ Correlation vs. Causation ................................................................................................ 43

• Decisions from Data & Evidence .......................................................................................... 45

➢ Collective & Individual Decision-Making .......................................................................... 45

o Collective Decision-Making .......................................................................................... 45

o Individual Decision-Making.......................................................................................... 46

➢ Impact of New Data on Established Scientific Ideas .......................................................... 46

o Gravitational Waves on General Relativity ................................................................... 46

• Data Modelling ................................................................................................................... 47

➢ Data Modelling Techniques ............................................................................................. 47

o Predictive .................................................................................................................... 47

o Statistical .................................................................................................................... 47

o Descriptive .................................................................................................................. 47

o Graphical .................................................................................................................... 47

• Patterns & Trends

➢ Data vs. Evidence

o Data is just data and has no intrinsic meaning on its own.

o Evidence has to be evidence for or of something; an argument, an opinion, a

viewpoint or a hypothesis.

➢ Qualitative vs. Quantitative Data Sets

o Content & Thematic Analysis

❖ Content analysis is the study of documents and communication

artefacts, which might be texts of various formats, pictures, audio or

video. Social scientists use content analysis to examine patterns in

communication in a replicable and systematic manner.

❖ Thematic analysis is one of the most common forms of analysis

within qualitative research. It emphasizes identifying, analysing and

interpreting patterns of meaning within qualitative data.

o Descriptive Statistics

❖ A descriptive statistic is a summary statistic that quantitatively

describes or summarizes features from a collection of information,

while descriptive statistics is the process of using and analysing

those statistics.
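❖ As a minimal sketch of the idea, the three descriptive statistics named later in these notes (mean, median and standard deviation) can be computed with Python's standard `statistics` module. The data set below is made up purely for illustration.

```python
import statistics

# Hypothetical sample: plant heights in centimetres (made-up values).
heights_cm = [12.0, 15.0, 11.0, 14.0, 18.0]

mean_height = statistics.mean(heights_cm)      # arithmetic average
median_height = statistics.median(heights_cm)  # middle value once sorted
stdev_height = statistics.stdev(heights_cm)    # sample standard deviation (divides by n - 1)

print(f"mean = {mean_height:.2f} cm")
print(f"median = {median_height:.2f} cm")
print(f"standard deviation = {stdev_height:.2f} cm")
```

Each number summarises one feature of the sample: the mean and median describe its centre, and the standard deviation describes its spread.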

➢ Tools for Data Representation

o Spreadsheets

o Graphical Representations

o Models

❖ Physical, computational and/or mathematical

o Digital Technologies

➢ Limitations of Data Analysis & Interpretation

o Data analysis and interpretation is the process of assigning meaning to the data collected and determining the conclusions, significance, and implications of the findings. It is a crucial and exciting step in the research process. In most research studies, analysis follows data collection.

o There are two main methods of interpreting data: quantitative and qualitative.

o Quantitative Data

❖ Quantitative data is statistical and is usually structured in nature, meaning it is more rigid and defined. This kind of data is measured using values and numbers, which makes it a more suitable candidate for data analysis.

❖ E.g.

▪ Experiments

▪ Surveys

▪ Metrics

▪ Tests

o Qualitative Data

❖ Qualitative data is non-statistical and is usually unstructured or semi-structured in nature. This data is not necessarily measured using the hard numbers that are used to develop graphs and charts. Instead, it is categorized based on properties, attributes, labels, and other identifiers.

❖ E.g.

▪ Symbols and Images

▪ Video and audio recordings

▪ Texts and documents

▪ Observations and notes

o There are many issues that researchers should be aware of with respect to

data analysis. Some of those issues are as follows.

❖ Having the necessary skills to analyse

❖ Simultaneously selecting data collection methods and appropriate

analysis

❖ Drawing unbiased conclusions

❖ Unsuitable subgroup analysis

❖ Lack of clearly defined and objective outcome calculation

❖ Providing honest and exact analysis

❖ Data recording process

❖ Split up ‘text’ when analysing qualitative data

❖ Accuracy, authenticity and validity

• Statistics in Scientific Research

➢ Descriptive Statistics

o Mean

o Median

o Standard Deviation

➢ Performance Measures

o Error

o Accuracy

o Precision

o Bias

❖ Bias is any trend or deviation from the truth in data collection, data

analysis, interpretation and publication which can cause false

conclusions. Bias can occur either intentionally or unintentionally.

o Data Cleansing

❖ Data cleansing is the process of detecting and correcting corrupt or

inaccurate records from a record set, table, or database and refers

to identifying incomplete, incorrect, inaccurate or irrelevant parts of

the data and then replacing, modifying, or deleting the dirty or

coarse data.
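❖ A minimal sketch of data cleansing, assuming a made-up list of temperature readings stored as text. The function name `clean_readings` and the plausible range of −50 to 60 °C are illustrative assumptions, not part of these notes.

```python
# Hypothetical raw temperature readings in degrees Celsius, stored as text.
raw_readings = ["21.5", "22.1", "", "n/a", "21.9", "-999", "22.4"]


def clean_readings(values, low=-50.0, high=60.0):
    """Return valid readings as floats.

    Drops incomplete records (blank text), incorrect records
    (non-numeric text) and implausible values outside [low, high].
    """
    cleaned = []
    for value in values:
        try:
            number = float(value)
        except ValueError:
            continue  # blank or non-numeric record: discard it
        if low <= number <= high:
            cleaned.append(number)
    return cleaned


cleaned = clean_readings(raw_readings)
print(cleaned)
```

Here the dirty records are deleted; in other situations they might instead be replaced or corrected, as the definition above notes.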

➢ Statistical Tests

o Student’s t-test

❖ Student’s t-test, in statistics, is a method of testing hypotheses

about the mean of a small sample drawn from a normally

distributed population when the population standard deviation is

unknown.

❖ The t distribution is a family of curves in which the number of

degrees of freedom (the number of independent observations in the

sample minus one) specifies a particular curve. As the sample size

(and thus the degrees of freedom) increases, the t distribution

approaches the bell shape of the standard normal distribution. In

practice, for tests involving the mean of a sample of size greater

than 30, the normal distribution is usually applied.

❖ It is usual first to formulate a null hypothesis, which states that

there is no effective difference between the observed sample mean

and the hypothesized or stated population mean—i.e., that any

measured difference is due only to chance.

❖ In an agricultural study, for example, the null hypothesis could be

that an application of fertilizer has had no effect on crop yield, and

an experiment would be performed to test whether it has increased

the harvest. In general, a t-test may be either two-sided (also

termed two-tailed), stating simply that the means are not

equivalent, or one-sided, specifying whether the observed mean is

larger or smaller than the hypothesized mean. The test statistic t is

then calculated. If the observed t-statistic is more extreme than the

critical value determined by the appropriate reference distribution,

the null hypothesis is rejected. The appropriate reference

distribution for the t-statistic is the t distribution. The critical value

depends on the significance level of the test (the probability of

erroneously rejecting the null hypothesis).

❖ For example, suppose a researcher wishes to test the hypothesis that a sample of size n = 25 with mean x̄ = 79 and standard deviation s = 10 was drawn at random from a population with mean μ = 75 and unknown standard deviation. Using the formula for the t-statistic,

t = (x̄ − μ) / (s / √n) = (79 − 75) / (10 / √25) = 4 / 2 = 2

❖ The calculated t equals 2. For a two-sided test at a common level of significance α = 0.05, the critical values from the t distribution on 24 degrees of freedom are −2.064 and 2.064. The calculated t lies between these values; hence the null hypothesis cannot be rejected at the 5 percent significance level (95 percent confidence). (The confidence level is 1 − α.)

❖ A second application of the t distribution tests the hypothesis that

two independent random samples have the same mean. The t

distribution can also be used to construct confidence intervals for

the true mean of a population (the first application) or for the

difference between two sample means (the second application).
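❖ The one-sample example above (n = 25, sample mean 79, s = 10, μ = 75) can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative, and the critical value 2.064 is copied from the text rather than computed.

```python
import math

n = 25             # sample size
sample_mean = 79   # observed sample mean
s = 10             # sample standard deviation
mu_0 = 75          # hypothesized population mean

# t = (sample mean - hypothesized mean) / (s / sqrt(n))
standard_error = s / math.sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error

# Two-sided critical value for alpha = 0.05 and n - 1 = 24 degrees of
# freedom, taken from the worked example in the notes.
t_critical = 2.064
reject_null = abs(t_stat) > t_critical

print(f"t = {t_stat:.3f}, reject null hypothesis: {reject_null}")
```

Because |t| = 2 is smaller than 2.064, the sketch reaches the same conclusion as the text: the null hypothesis is not rejected.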

o Chi-square test

❖ A chi-square (χ²) statistic is a test that measures how expectations

compare to actual observed data (or model results). The data used

in calculating a chi-square statistic must be random, raw, mutually

exclusive, drawn from independent variables, and drawn from a

large enough sample. For example, the results of tossing a coin 100

times meet these criteria.

❖ There are two main kinds of chi-square tests: the test of

independence, which asks a question of relationship, such as, "Is

there a relationship between gender and SAT scores?"; and the

goodness-of-fit test, which asks something like "If a coin is tossed

100 times, will it come up heads 50 times and tails 50 times?"

❖ For these tests, degrees of freedom are utilized to determine if a

certain null hypothesis can be rejected based on the total number of

variables and samples within the experiment.

❖ For example, when considering students and course choice, a

sample size of 30 or 40 students is likely not large enough to

generate significant data. Getting the same or similar results from a

study using a sample size of 400 or 500 students is more valid.

❖ In another example, consider tossing a coin 100 times. The expected

result of tossing a fair coin 100 times is that heads will come up 50

times and tails will come up 50 times. The actual result might be

that heads will come up 45 times and tails will come up 55 times.

The chi-square statistic shows any discrepancies between the

expected results and the actual results.
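❖ The coin-toss example can be worked through as a goodness-of-fit calculation, where the statistic is the sum of (observed − expected)² / expected over all categories. This is a minimal sketch; the critical value 3.841 (one degree of freedom, α = 0.05) comes from a standard chi-square table and is an assumption added here, not a figure from these notes.

```python
# Observed and expected counts for 100 tosses of a coin.
observed = {"heads": 45, "tails": 55}
expected = {"heads": 50, "tails": 50}

# Chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_square = sum(
    (observed[side] - expected[side]) ** 2 / expected[side]
    for side in observed
)

# Degrees of freedom = number of categories - 1.
degrees_of_freedom = len(observed) - 1

# Critical value for alpha = 0.05 with 1 degree of freedom
# (standard table value, assumed for this sketch).
chi_critical = 3.841
reject_null = chi_square > chi_critical

print(f"chi-square = {chi_square:.2f}, df = {degrees_of_freedom}, "
      f"reject 'fair coin' hypothesis: {reject_null}")
```

The statistic is 25/50 + 25/50 = 1.0, well below the critical value, so a 45/55 split gives no grounds to reject the hypothesis that the coin is fair.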

o F-test

❖ An F statistic is a value you get when you run an ANOVA test or a regression analysis to find out whether the means of two (or more) populations are significantly different. It is similar to a t statistic from a t-test: a t-test will tell you if a single variable is statistically significant, and an F test will tell you if a group of variables are jointly significant.

❖ Simply put, if you have a statistically significant result, it means that your results are unlikely to have happened by chance alone. If you don’t have statistically significant results, your test data do not show an effect; in other words, you can’t reject the null hypothesis.

❖ You can use the F statistic when deciding whether to reject the null hypothesis. In your F test results, you’ll have both an F value and an F critical value.

▪ The value you calculate from your data is called the F value, also known as the F statistic.

▪ The F critical value is the threshold taken from the F distribution for your chosen significance level and degrees of freedom.

❖ In general, if your calculated F value in a test is larger than your F critical value, you can reject the null hypothesis. However, this comparison is only one measure of significance in an F test. You should also consider the p value. The p value is determined by the F statistic and is the probability of obtaining results at least as extreme as yours if the null hypothesis were true.

❖ The F statistic must be used in combination with the p value when

you are deciding if your overall results are significant. Why? If you

have a significant result, it doesn’t mean that all your variables are

significant. The statistic is just comparing the joint effect of all the

variables together.

❖ For example, if you are using the F Statistic in regression analysis

(perhaps for a change in R Squared, the Coefficient of

Determination), you would use the p value to get the “big picture.”

1. If the p value is less than the alpha level, go to Step 2

(otherwise your results are not significant, and you cannot

reject the null hypothesis). A common alpha level for tests is

0.05.

2. Study the individual p values to find out which of the

individual variables are statistically significant.
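❖ As an illustration of where an F value comes from, the sketch below computes a one-way ANOVA F value by hand: the variance between group means divided by the variance within groups. The function name `one_way_f`, the three small data sets and the critical value 5.14 (standard F table, α = 0.05, with 2 and 6 degrees of freedom) are all assumptions made for this example.

```python
def one_way_f(groups):
    """Return (F value, df between, df within) for a one-way ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    # Variation of each value around its own group mean.
    ss_within = sum(
        (value - sum(group) / len(group)) ** 2
        for group in groups
        for value in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical crop yields under three fertilizer treatments.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_value, df_between, df_within = one_way_f(groups)

f_critical = 5.14  # assumed table value for alpha = 0.05, df = (2, 6)
reject_null = f_value > f_critical

print(f"F = {f_value:.2f} on ({df_between}, {df_within}) df, "
      f"reject null hypothesis: {reject_null}")
```

With these made-up numbers the group means (5, 7 and 9) differ far more than the values within each group do, so the F value exceeds the critical value and the null hypothesis of equal means is rejected.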

➢ Bivariate Correlation

o Correlation Coefficient

❖ The Pearson product-moment correlation coefficient is a measure of

the strength of the linear relationship between two variables. It is

referred to as Pearson's correlation or simply as the correlation

coefficient. If the relationship between the variables is not linear,

then the correlation coefficient does not adequately represent the

strength of the relationship between the variables.

❖ The symbol for Pearson's correlation is "ρ" when it is measured in

the population and "r" when it is measured in a sample. Because we

will be dealing almost exclusively with samples, we will use r to

represent Pearson's correlation unless otherwise noted.

❖ Pearson's r can range from -1 to 1. An r of -1 indicates a perfect

negative linear relationship between variables, an r of 0 indicates no

linear relationship between variables, and an r of 1 indicates a

perfect positive linear relationship between variables.
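❖ A minimal sketch of Pearson's r computed from its definition (the covariance of the two variables divided by the product of their spreads). The function name `pearson_r` and the data are illustrative. The third case shows the limitation noted above: a perfect but non-linear relationship can still give r = 0.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])   # y rises in step with x
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])   # y falls in step with x

# y = x squared: a perfect relationship, but not a linear one.
r_nonlinear = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(r_positive, r_negative, r_nonlinear)
```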

➢ Correlation vs. Causation

❖ Correlation is a statistical measure (expressed as a number) that

describes the size and direction of a relationship between two or

more variables. A correlation between variables, however, does not

automatically mean that the change in one variable is the cause of

the change in the values of the other variable.

❖ Causation indicates that one event is the result of the occurrence of

the other event; i.e. there is a causal relationship between the two

events. This is also referred to as cause and effect.

❖ Theoretically, the difference between the two types of relationships

is easy to identify: an action or occurrence can cause another

(e.g. smoking causes an increase in the risk of developing lung

cancer), or it can correlate with another (e.g. smoking is correlated

with alcoholism, but it does not cause alcoholism). In practice,

however, it remains difficult to clearly establish cause and effect,

compared with establishing correlation.

❖ The objective of much research or scientific analysis is to identify the

extent to which one variable relates to another variable. For

example:

▪ Is there a relationship between a person's education level

and their health?

▪ Is pet ownership associated with living longer?

▪ Did a company's marketing campaign increase their product

sales?

❖ These and other questions explore whether a correlation exists between the two variables. If there is a correlation, this may guide further research into whether one action causes the other. Understanding correlation and causality allows policies and programs that aim to bring about a desired outcome to be better targeted.

❖ Causality is an area of statistics that is commonly misunderstood and misused, in the mistaken belief that because the data show a correlation there is necessarily an underlying causal relationship.

❖ The use of a controlled study is the most effective way of

establishing causality between variables. In a controlled study, the

sample or population is split in two, with both groups being

comparable in almost every way. The two groups then receive

different treatments, and the outcomes of each group are assessed.

❖ For example, in medical research, one group may receive a placebo

while the other group is given a new type of medication. If the two

groups have noticeably different outcomes, the different

experiences may have caused the different outcomes.

❖ Due to ethical reasons, there are limits to the use of controlled

studies; it would not be appropriate to use two comparable groups

and have one of them undergo a harmful activity while the other

does not. To overcome this situation, observational studies are

often used to investigate correlation and causation for the

population of interest. The studies can look at the groups'

behaviours and outcomes and observe any changes over time.

❖ The objective of these studies is to provide statistical information to

add to the other sources of information that would be required for

the process of establishing whether or not causality exists between

two variables.
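❖ The gap between correlation and causation can be illustrated with a small simulation. In this sketch a hidden third factor drives two variables that have no effect on each other, yet they end up strongly correlated. Everything here (the variable names, the noise levels, the random seed) is an assumption chosen for illustration, not data from any study.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


random.seed(0)  # fixed seed so the run is repeatable
n = 2000

# A hidden factor that influences both measured variables.
hidden_factor = [random.gauss(0, 1) for _ in range(n)]

# Neither variable appears in the other's formula: there is no causal
# link between them, only a shared cause plus independent noise.
variable_a = [z + random.gauss(0, 0.5) for z in hidden_factor]
variable_b = [z + random.gauss(0, 0.5) for z in hidden_factor]

r_ab = pearson_r(variable_a, variable_b)
print(f"correlation between a and b: {r_ab:.2f}")
```

The correlation comes out strongly positive even though changing one variable would do nothing to the other. An observational study would see only the correlation; a controlled study that set one variable independently of the hidden factor would reveal the absence of a causal effect.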

• Decisions from Data & Evidence

➢ Collective & Individual Decision-Making

o Collective Decision-Making

❖ Group decisions may involve assimilating a huge amount of

information, exploring many different ideas, and drawing on many

strands of experience. And the consequences of the right or wrong

decision may be profound for the team and the organization. For

obvious reasons, decisions made in groups can vary considerably

from those undertaken by individuals. It is this potential divergence

in outcomes that makes group decision making attractive.

❖ Group decision-making is a situation faced when individuals

collectively make a choice from the alternatives before them. The

decision is then no longer attributable to any single individual who is

a member of the group. This is because all the individuals and social

group processes such as social influence contribute to the outcome.

The decisions made by groups are often different from those made

by individuals. There is much debate as to whether this difference

results in decisions that are better or worse.

❖ According to the idea of synergy, decisions made collectively tend to

be more effective than decisions made by a single individual. Factors

that impact other social group behaviors also affect group decisions.

Moreover, when individuals make decisions as part of a group, there

is a tendency to exhibit a bias towards discussing shared information

(i.e. shared information bias), as opposed to unshared information.

Individual decision refers to the decision-making process where an

individual selects the course of action to be followed in the business

from various alternatives whereas collective decision refers to the

group decision which occurs at mutual agreement from the group.

❖ Advantages of group decision-making

▪ Groups generate more complete information and knowledge. By aggregating the resources of several individuals, groups bring more input into the decision process.

▪ In addition to more input, groups can bring heterogeneity to the decision process. They offer increased diversity of views.

▪ A group will almost always outperform even the best individual. So, groups generate higher quality decisions.

▪ Groups lead to increased acceptance of solutions. Many decisions fail after the final choice is made because people don’t accept the solution. Group members who participated in making a decision are likely to enthusiastically support the decision and encourage others to accept it.

❖ Disadvantages of group decision-making

▪ Group decisions are time-consuming. They typically take more time to reach a solution than making the decision alone.

▪ Group decisions are subject to conformity pressures. The desire by group members to be accepted and considered an asset to the group can result in squashing any overt disagreement.

▪ Group decisions can be dominated by one or a few members. If this dominant coalition is composed of low- and medium-ability members, the group’s overall effectiveness will suffer.

▪ Group decisions suffer from ambiguous responsibility. In an individual decision, it’s clear who is accountable for the final outcome. In a group decision, the responsibility of any single member is watered down.

o Individual Decision-Making

❖ Advantages of individual decision-making

▪ An individual generally makes prompt decisions, whereas a group involves many people, which makes decision-making very time-consuming. Moreover, assembling group members takes a lot of time.

▪ Individuals do not escape responsibility. They are accountable for their acts and performance, whereas in a group it is not easy to hold any one person accountable for a wrong decision.

▪ Individual decision-making saves time, money and energy, as individuals generally make prompt and logical decisions, whereas group decision-making involves a lot of time, money and energy.

▪ Individual decisions are more focused and rational than group decisions.

❖ Disadvantages of individual decision-making

▪ A group has the potential to collect more, and more complete, information than an individual when making decisions.

▪ An individual making a decision relies on their own intuition and views, whereas a group has many members, and so many views and approaches, which leads to better decision-making.

▪ A group discovers the hidden talent and core competencies of an organization’s employees.

▪ An individual will not take every member’s interests into consideration, whereas a group will take into account the interests of all members of an organization.

➢ Impact of New Data on Established Scientific Ideas

o Gravitational Waves on General Relativity

❖ Relate to physics principles

• Data Modelling

➢ Data Modelling Techniques

o Predictive

❖ Predictive modelling is a process that uses data and statistics to

predict outcomes with data models. These models can be used to

predict anything from sports outcomes and TV ratings to

technological advances and corporate earnings.
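❖ One simple predictive model is a straight line fitted to past data by least squares and then used to predict an unseen case. The sketch below assumes made-up study-hours and test-score data; the function name `fit_line` is illustrative, and a straight line is only one of many possible model choices.

```python
def fit_line(xs, ys):
    """Least-squares straight line: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past observations: hours studied and test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 55, 59, 61]

slope, intercept = fit_line(hours, scores)

# Use the fitted model to predict the score for 6 hours of study.
predicted_score = intercept + slope * 6

print(f"score = {intercept:.1f} + {slope:.1f} * hours")
print(f"predicted score for 6 hours: {predicted_score:.1f}")
```

The prediction is only as trustworthy as the assumption that the past pattern continues, which is why predictions far outside the range of the original data should be treated with caution.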

o Statistical

❖ A statistical model is a mathematical model that embodies a set of

statistical assumptions concerning the generation of sample data. A

statistical model represents, often in considerably idealized form,

the data-generating process.

o Descriptive

❖ A descriptive model describes a system or other entity and its

relationship to its environment. It is generally used to help specify

and/or understand what the system is, what it does, and how it

does it. A geometric model or spatial model is a descriptive model

that represents geometric and/or spatial relationships.

o Graphical

❖ Using the graph data model, designers describe their system as a

connected graph of nodes and relationships, much as they might do

with ER or object data modelling. Graph data models can be used

for text analysis, creating models that uncover relationships among

data points within documents.
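❖ A minimal sketch of a graph data model for text analysis, using plain Python structures rather than a graph database. The node names, the "MENTIONS" relationship type and both helper functions are hypothetical choices made for this illustration.

```python
# Nodes: each name maps to a label describing what kind of thing it is.
nodes = {
    "doc1": "Document",
    "doc2": "Document",
    "gravity": "Term",
    "waves": "Term",
}

# Relationships: (source node, relationship type, target node).
relationships = [
    ("doc1", "MENTIONS", "gravity"),
    ("doc1", "MENTIONS", "waves"),
    ("doc2", "MENTIONS", "gravity"),
]


def documents_mentioning(term):
    """Return the documents linked to a term by a MENTIONS relationship."""
    return sorted(
        source
        for source, kind, target in relationships
        if kind == "MENTIONS" and target == term and nodes[source] == "Document"
    )


def co_mentioned_terms(term):
    """Return other terms that appear in the same documents as `term`."""
    documents = set(documents_mentioning(term))
    return sorted({
        target
        for source, kind, target in relationships
        if kind == "MENTIONS" and source in documents and target != term
    })


print(documents_mentioning("gravity"))
print(co_mentioned_terms("gravity"))
```

Following relationships from a term to its documents and back out to other terms is how the model uncovers connections between data points that are not stated directly in any single document.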
