Page 1 of 47
The Foundations of
Scientific Thinking
Notes
Contents
• The Development of Modern Science
➢ Epistemology
➢ Influence of Empiricism on Scientific Inquiry
➢ Induction vs Deduction
➢ Parsimony/Occam’s Razor
➢ Falsifiability
➢ Significance of Confirmation Bias
➢ Cultural Contribution Knowledge
➢ Paradigm Shift
• Influences on Current Scientific Thinking
➢ Ethics, Morality, & the Law
➢ Current Influences on Scientific Thinking
➢ Influence of Ethical Frameworks on Scientific Research
➢ Use of Research Data
• The Development of Modern Science
➢ Epistemology
o Epistemology is defined as: ‘a branch of philosophy that investigates the
origin, nature, methods, and limits of human knowledge’.
o Scientific epistemology explores the nature of scientific knowledge. It
consists of three aspects:
1. The Qualities of Scientific Knowledge
▪ Science attempts to explain natural phenomena.
▪ Scientific knowledge is represented as laws and theories.
▪ Laws: describe patterns and relationships in scientific information.
▪ Theories: provide explanations of natural phenomena.
▪ Scientific knowledge is tentative: it remains open to revision.
▪ Science is part of the social and cultural traditions of many human societies.
▪ Scientific ideas are affected by their social and historical setting.
2. The Limitations of Scientific Knowledge
▪ Science does not make moral judgements (e.g. should euthanasia be permitted?).
▪ Science does not make aesthetic judgments (e.g. is Mozart’s music more beautiful than Bach’s?).
▪ Science does not prescribe how to use scientific knowledge (e.g. should genetic engineering be used to develop disease-resistant crops?).
▪ Science does not explore supernatural or paranormal phenomena (e.g. religious ideas and ghosts).
3. How Scientific Knowledge is Generated
▪ The development of scientific knowledge relies on observations, experimental evidence, rational arguments and scepticism.
▪ Scientific knowledge advances through slow and incremental steps (evolutionary progression), as well as giant leaps of understanding (revolutionary progression).
▪ Observations are theory dependent, which influences how scientists obtain and interpret evidence.
▪ There is no universal step-by-step scientific method. Scientific knowledge is acquired through a variety of different methods. The two main lines of reasoning that influence modern science are inductive processes (generalising from specific observations) and deductive processes (deriving specific conclusions from general ideas).
o Science distinguishes itself from other ways of knowing and from other
bodies of knowledge through the use of empirical standards, logical
arguments, and scepticism, as scientists strive for certainty of their
proposed explanation.
o Alternative Ways of Knowing:
▪ Emotion
Explanation: feeling, as opposed to reasoning.
Examples: Can/should we control our emotions? Are emotions the enemy of, or necessary for, good reasoning?
▪ Faith/Belief
Explanation: trust or confidence.
Examples: Can theistic beliefs be considered knowledge because they are produced by a special cognitive faculty or “divine sense”? Does faith meet a psychological need?
▪ Imagination
Explanation: forming new ideas, or images or concepts of external objects not present to the senses.
Examples: What is the role of imagination in producing knowledge about a real world? Can imagination reveal truths that reality hides?
▪ Intuition
Explanation: a form of knowledge that appears in consciousness without obvious deliberation.
Examples: Are there certain things that you have to know before being able to learn anything at all? Should you trust your intuition?
▪ Language
Explanation: a system of communication used by a particular country or community.
Examples: How does language shape knowledge? Is the importance of language cultural?
▪ Memory
Explanation: the faculty by which the brain encodes, stores, and retrieves information.
Examples: Can we know things which are beyond our personal present experience? Can our beliefs contaminate our memory?
▪ Reason
Explanation: a basis or cause, as for some belief, action, fact, event.
Examples: What is the difference between reason and logic? How reliable is inductive reasoning?
▪ Sense Perception
Explanation: understanding gained through the use of one of the senses such as sight, taste, touch or hearing.
Examples: How can we know if our senses are reliable? What is the role of expectation or theory in sense perception?
o Navigation
❖ Early travellers relied on their senses (sense perception) to observe
landforms, wind speed and direction, tides and measures of
distance to navigate (observational knowledge). Celestial navigation
using the positions of stars, constellations and the sun also served as
navigational aids. In those times, travel was restricted to short
distances, or to coastal areas.
❖ With advances in measuring techniques (and geometry), accurate
maps were created. Such calculations indicated that the Earth was a
sphere. The altitude of the North Star provided latitudinal
information. These are examples of knowledge constructed through
memory, language (communication through oral stories, written
accounts and maps) and reasoning.
❖ Later, navigational instruments extended the powers of sense
perception. The compass was an important tool for orienting
travellers to magnetic north (it works at night as well). Other
instruments, such as the astrolabe, sextant, chronometer and chip
log, were designed to identify locations in 3-dimensional space. The
information from these instruments was used to produce highly
refined maps (ways of knowing: reasoning, imagination, intuition,
language).
❖ Modern navigation uses radar, gyroscopic compasses and GPS to
provide positional and kinematic (e.g. speed and acceleration)
information.
❖ Polynesians used natural navigation aids such as the stars, ocean
currents, and wind patterns. They used non-physical devices such as
songs and stories for memorizing the properties of stars, islands,
and navigational routes.
➢ Influence of Empiricism on Scientific Inquiry
o Science is derived from philosophy. The term ‘philosophy’ means the love of
wisdom. One branch of philosophy focuses on developing explanations of
the natural world. This branch was called ‘natural philosophy’.
o Around the 15th century, natural philosophers began to redefine how
knowledge of the natural world should be constructed. Natural philosophy
was the beginning of science. In the 19th century, the British philosopher
William Whewell coined the term ‘scientist’ to describe those who undertook
the type of inquiries pursued by the natural philosophers. Eventually,
science became distinct from other branches of inquiry (such as philosophy,
religion, etc).
o As an example of the common roots of science and philosophy, the highest
research degree awarded by Universities around the world is the Doctor of
Philosophy, even in science. After a person receives a PhD degree, they are
allowed to use the title “doctor” (medical science is the exception to this
rule).
o Empiricism is a branch of philosophy that emphasises ‘prior experience’.
Empiricists say that we can only construct knowledge after collecting
information through our senses. Sensory information extends to
information collected using instruments.
o Therefore, observations are important for knowledge construction. The
information collected through observation eventually becomes evidence
and explanations for natural phenomena. Over time, the evidence and
explanations become knowledge.
o Empiricism was crucial for the separation of natural philosophy from the
other branches of philosophy. It came to define modern science. Most
scientific knowledge is empirical. Empiricism demands that all scientific
information be based on evidence and tested through observations or
experimentation.
➢ Induction vs Deduction
o Induction is the process of generalisation. After collecting information about
specific events, generalisations are drawn. They describe the broad
applications of the conclusions. In science, inductive reasoning allows
explanations of related phenomena to be constructed.
o Deduction is the process of deriving specific knowledge from broad ideas.
Therefore, deductive reasoning is often used to make predictions.
▪ The top panel illustrates inductive thinking. When a leaf
is examined under a microscope, it is seen to be
composed of cells. Examining the leaves of many plants
gives the same result. Therefore, through
inductive reasoning, we may conclude that all plants
are composed of cells. In doing so, the specific
conclusion of each observation is used to synthesise a
generalisation – the Cell Theory. Theories are big ideas
in science – broad explanations of natural phenomena.
▪ The lower panel illustrates deductive thinking. Here, we
start with a big idea – that of the Cell Theory, which
states that all plants are composed of cells. Suppose
you have come across a new and unknown type of
plant. Based on the Cell Theory, you predict that the
unknown plant is composed of cells. This prediction is
called a hypothesis. You then conduct an experiment,
where you observe that the plant is indeed composed
of cells. In this case, we have moved from a general
idea (the Cell Theory) to a specific conclusion (that
the new, unknown plant is composed of cells).
▪ Many of the big ideas or theories in science are the
products of inductive thinking. This example shows
Charles Darwin’s inductive thinking on populations of
organisms.
1. Darwin makes several discrete observations about
how individuals in a population are adapted to
their environments (for example the beaks of
different populations of finches show different
shapes).
2. Based on the five observations shown on the slide,
Darwin makes two inferences.
3. After many such inferences, Darwin develops a
new big idea, which generalises how populations
change over time. This is known as the Theory of
Evolution by Natural Selection.
o Scientific Laws describe the relationships between the variables of a system.
They are usually expressed in the form of mathematical equations. This slide
shows equations in the Chemistry and Physics datasheets.
o Scientific laws are products of inductive reasoning.
➢ Parsimony/Occam’s Razor
o William of Occam was an English friar who lived in the 13th/14th centuries.
Although he did not invent the phrase, he frequently used “Plurality must
never be posited without necessity” in his writings. Many
thinkers before Occam, including the Greek philosophers Aristotle and
Ptolemy, made statements similar to this. A modern-day statement of
Occam’s razor is “Other things being equal, simpler explanations are
generally better than more complex ones”.
o Science works with competing ideas. That means that when scientists are
trying to develop explanations of some phenomenon, they devise alternative
hypotheses. Sometimes, after testing those hypotheses, there may be more
than one plausible hypothesis for a phenomenon. In those situations, using
Occam’s razor may be useful. Occam’s razor says that if the competing
hypotheses explain the evidence equally well, then the simpler hypothesis is
preferred as the explanation for the phenomenon.
▪ The discovery of the electron by J.J. Thomson is an
example of deductive thinking. While studying the
nature of cathode rays, Thomson was exploring the
basis of the Atomic Theory.
▪ In the 19th and early 20th centuries, it was thought that
atoms were electrically-neutral and indivisible
constituents of matter. However, through careful
experimentation and data analyses, Thomson
discovered that atoms were composed of subatomic
particles.
▪ One type of subatomic particle was negatively charged
and was called the electron. As a result of his
discoveries, the Atomic theory was modified.
o There are many historical examples of the use of Occam’s razor. Before the
16th century, the geocentric model of the solar system (Earth at the centre
of the solar system) was dominant. Then, this was replaced with the
heliocentric model (Sun at the centre). The geocentric model required a
number of complicated features (e.g. epicycles) to explain some unusual
phenomena (such as the retrograde motion of Venus). The heliocentric model
did not require such features and is thus a simpler model.
o Scientists do not use Occam’s razor exclusively when accepting ideas in
science.
o The most important factor is evidence.
o Other considerations:
❖ Are some ideas more testable than others?
❖ Are some ideas better at producing broader explanations?
❖ Are some ideas a better fit with existing ideas?
❖ Are some ideas better at generating new areas for investigation?
➢ Falsifiability
o Falsifiability is a method of developing scientific knowledge. It is a type of
deductive reasoning. It claims that all scientific ideas should be falsifiable
through testing (for example, through experimentation). If an idea cannot
be falsified, then it cannot be scientific. For example, creation science and
intelligent design are not considered to be scientific because their
claims cannot be tested.
o While not everyone agrees with the principle of falsification, it has
influenced two aspects of science:
❖ Differentiating scientific ideas from non-scientific ideas
❖ A method to test and verify scientific ideas.
o Falsification has given rise to one method of testing and verifying
scientific ideas. This is known as hypothesis testing.
o Hypotheses are tentative explanations of a narrow set of related
phenomena. For example, consider the hypothesis “particulate pollution in
the atmosphere increases the incidence of asthma”. This hypothesis
proposes an explanation for the increased incidence of asthma. It is a
tentative explanation that is based on observations, but needs to be
verified. In other words, the hypothesis needs to be tested.
o To test the hypothesis, a controlled experiment must be conducted and the
data generated in that experiment analysed.
o Two important features of hypotheses are that:
❖ Hypotheses cannot be proven to be true – they can only be falsified
(this is because of the falsification principle).
❖ Hypotheses can only be rejected (if they are NOT supported by
evidence) or not rejected (if the evidence supports the hypothesis).
o These are important features of hypotheses to bear in mind. The goal of
hypothesis testing is to reject what is false (not supported by evidence).
o Often, hypothesis testing also involves the statistical analysis of
experimental data. So, hypotheses are judged not simply on the data
collected in an investigation, but on the quality of those data.
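The statistical side of hypothesis testing can be sketched in code. The example below is a minimal illustration, not a method taken from these notes: it applies a permutation test to the asthma hypothesis, assuming that the quantity of interest is the difference in mean asthma incidence between high-pollution and low-pollution areas. The data values, the function name `permutation_test` and the 0.05 threshold are all hypothetical choices made for illustration. In this framing, the claim that is rejected or not rejected is the "null" hypothesis that pollution makes no difference.

```python
import random
from statistics import fmean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a one-sided p-value for mean(group_a) > mean(group_b).

    Null hypothesis: the group labels do not matter, so shuffling them
    should often produce a difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = fmean(group_a) - fmean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign the group labels
        diff = fmean(pooled[:n_a]) - fmean(pooled[n_a:])
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical asthma cases per 1,000 residents (invented for illustration).
high_pollution = [62, 58, 71, 66, 69, 74, 60, 65]
low_pollution = [48, 52, 45, 50, 55, 47, 53, 49]

ALPHA = 0.05  # conventional significance threshold (an assumption here)
p_value = permutation_test(high_pollution, low_pollution)
decision = "reject" if p_value < ALPHA else "fail to reject"
print(f"p = {p_value:.4f}: {decision} the null hypothesis")
```

A small p-value means the observed difference would rarely arise by chance if pollution had no effect, so the null hypothesis is rejected. It does not prove that the original hypothesis is true, which is consistent with the falsification principle described above. A comparison of observed groups like this one shows an association only; a controlled design is still needed to argue for cause and effect.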
➢ Significance of Confirmation Bias
o Observations are an important element of scientific inquiry. Inferences can
be influenced by:
❖ Confirmation Bias
❖ Theory-Laden Observation
o Confirmation Bias: the tendency to search for or interpret information in a
way that confirms one’s preconceptions.
o Theory-Dependent Observations: how previous experiences, beliefs and
assumptions affect the inferences drawn from observations.
o No matter whether we use inductive or deductive reasoning, observations
are important. The quality of observations is crucial for initiating inquiries
and investigations. Consider Marshall and Warren’s study on the microbial
cause of gastric ulcers. The observation that Helicobacter pylori is
frequently associated with gastric ulcers was crucial, and led
to the discovery that the bacterium causes the disease.
o Another important aspect of observations is the analysis of data. Identifying
patterns and trends in experimental data is critical for finding evidence that
may support the hypotheses.
o Theory-dependent observations are observations that depend
on theories. This means that prior knowledge of scientific theories may
influence the inferences that we draw from observations. This extends to
the ways that we analyse and interpret observations. Optical illusions are
often used to illustrate theory-dependent observations. In the picture
shown in the slide, the image on the left shows two lines, one vertical and
the other horizontal.
o On initial inspection, the vertical line appears to be longer than the
horizontal line. Yet, when measured, both lines are of the same length.
Although this example is simplistic, it illustrates how the initial
interpretations we make of our observations may be misleading and require
further inquiry. Therefore, how we interpret observations is dependent on
prior experience.
o Theory-dependent observations play an important role in the way ‘experts’
interpret information. For example, an X-ray image may not be informative
to the untrained person, but to an experienced radiologist, the same image
may be very informative. The radiologist may be able to pick up certain
conditions or pathologies, because of their past learning and experiences.
o Theory-laden observations may also be responsible for professional
intuition, where the expert practitioner may be able to arrive at certain
conclusions without conducting much analysis.
o Theory-laden observations can lead people to derive different conclusions
from the same set of observations. As shown on this table, in many scientific
fields, scientists have come to different conclusions while studying the same
phenomena. In some instances, those conclusions have been wrong (e.g.
Aristotle, Ptolemy). In other instances, the different conclusions describe
different aspects of the same phenomena (Newton and Einstein).
o Confirmation bias is not a good thing in science. As the name suggests, it is a
form of bias. That bias improperly confirms a researcher’s belief about the
outcome of an inquiry. There are many reasons why confirmation bias may
occur in a scientific investigation. Some of the reasons are listed on this
slide.
o Poor experimental design or data is a major cause of confirmation bias.
Sometimes, preliminary studies are interpreted as confirmatory studies. For
example, in a recent article, it was claimed that bald men are more likely to
be afflicted with COVID-19. This was only an observation in a couple
of hospitals, and not the result of a well-designed investigation.
Confirmation bias may also occur when correlation is confused with
cause-and-effect.
o Here is an example of confirmation bias in the scientific literature. It is
generally assumed by biologists that ants are more aggressive to ants from
neighbouring nests than to those from their own nest. A research team in
Melbourne decided to examine the scientific papers on the nesting
behaviour of ants. They looked at 79 publications, and noticed that only 29%
of those were designed as blinded studies. Blinded studies are controlled
experiments. In other words, 71% of the published studies did not use a
proper experimental design. The researchers also noted that in the studies
that were not controlled experiments, the assumed pattern of ant behaviour
was reported. However, in the controlled-experiment studies, the reverse
was observed. This occurred because in the uncontrolled studies, the
researchers did not check for aggressive behaviour within each nest. They
simply assumed that ants were less aggressive towards nest mates, compared
to ants from other nests. Thus, the poor experimental design of those
studies resulted in a confirmation bias.
➢ Cultural Contribution Knowledge
o Many cultural knowledge systems have influenced the development of
scientific knowledge.
o Knowledge construction is closely linked with cultural constructs. This means
that knowledge construction depends on the languages used in a society,
the cultural practices and other factors. In the preceding slides, we looked at
how scientific knowledge is constructed. We examined the central role of
empiricism, reasoning tools (such as induction and deduction), Occam’s
razor, falsification, confirmation bias and paradigm shifts in developing
scientific knowledge. Most of the scientific research and knowledge
construction that happens around the world is largely the product of
European schools of thought.
o However, all cultures in the world have systems for constructing knowledge.
In every culture, knowledge is constructed and communicated in ways that
are specific to those cultures. For example, the knowledge systems in
indigenous societies are called Traditional Knowledge. Many governments
are now tapping into traditional knowledge systems, as those systems have
developed different, but relevant, explanations of natural phenomena.
o As with science, cultural observational knowledge is based on developing
inferences from observations. However, there is little or no experimentation
such as that seen in science. Over the years, cultural observational
knowledge has made significant contributions to scientific advancement.
o One example of cultural observational knowledge that is common to many
societies is astronomy. People around the world realised that many natural
phenomena can be attributed to astronomical events. For example, changes
in the seasons, weather and tides were associated with the positions of the
sun and the moon in the sky. Agriculture was dependent on seasonal
information. The patterns of stars (constellations) in the sky could provide
positional and directional information for travel. Therefore, many cultures
developed systems for measuring and analysing astronomical data. As
shown in this slide, observatories have been identified in ancient Mexican
(Mayan), Egyptian, Indian and Chinese societies. Much of this information
has been used to construct knowledge of the Earth (for travel and trade) and
astronomical phenomena.
o The indigenous cultures in Australia are ancient and have existed on this
continent for more than 60,000 years. There were more than 400 Aboriginal
nations in Australia. There were different languages and cultural practices in
those societies. They constructed knowledge of natural phenomena and
transmitted that knowledge mainly in the oral tradition. For example,
Aboriginal societies studied the night sky and developed mythical tales of
constellations and other astronomical phenomena.
o The emu in the sky describes the region of the Milky Way that is adjacent to
the Southern Cross, and forms part of the Dreaming narrative about
creation. Other stories were built around the Pleiades cluster and the
constellation Orion. In addition to mythologies, the night sky also provided
information for seasonal changes, and as guideposts for celestial navigation.
Time, calendars and information about seasons were developed using
astronomical knowledge. Some other uses of astronomical knowledge in
Aboriginal societies are indicated in this slide.
o Aboriginal societies also developed extensive knowledge about local
Australian ecosystems. This knowledge is referred to as Traditional
Ecological Knowledge. That knowledge is currently used in Australian states
and territories for managing ecosystems and landcare. Their understanding
of the role of bushfires in the functioning of local ecosystems is proving to
be critical for modern fire management systems. Another area of traditional
knowledge that has received scrutiny is bush medicine. Traditional
knowledge is being used to identify new substances from native plants that
have medicinal and therapeutic value, including antibiotics, antimicrobials
and antiviral products. Thus, contemporary society benefits from traditional
knowledge as it becomes integrated with scientific knowledge.
o The use of traditional knowledge for the development of medicinal,
therapeutic or health products has implications for commercialisation
practices and intellectual property.
o All civilisations developed knowledge of natural phenomena. As shown in
this slide, the cultural observational knowledge of many civilisations
influenced the development of modern science. The Islamic cultures of the
Middle Ages amalgamated and advanced the knowledge systems of those
civilisations and formed the basis of scientific development in Renaissance
Europe.
o Greece: parallax measurements and geometry; geocentric and heliocentric
models of the solar system
o Egypt: curvature of the Earth (Aristarchus), calendar, brewing, agriculture
o India: metallurgy, surgery, medicine, mathematics, astronomy
o China: metallurgy, printing, explosives, paper, irrigation, acupuncture
o Islamic: medicine, physics, chemistry, biology, astronomy
o Here is an example of cultural observations that enhanced scientific
understanding of natural phenomena. During the Middle Ages, the Islamic
world was a centre of learning. As a result of military conquests and trading
relations, Islamic cultures in the Middle East and North Africa had access to
knowledge and data from many parts of the world. Islamic scholars translated
the works of the ancient Greeks, Romans and Egyptians. They assembled
information about scientific discoveries from far-off places, such as India
and China. Universities in the Middle East were highly-regarded centres of
learning.
o Another example of the contribution of cultural observational knowledge to
science is the discovery of the anti-malarial compound, Artemisinin. A
Chinese literary work, dating back to ~300 A.D., suggested that preparations
of the herb sweet wormwood (Artemisia annua) may provide protection against
malaria. You-You Tu decided to extract the active ingredient from this plant
so as to develop the substance as a therapeutic. After many years of
sustained effort, Tu isolated the extract and showed that it was effective
against both types of malarial parasites. Tu then proceeded to determine the
chemical structure of the active ingredient, which was called Artemisinin
(after the scientific name of the herb). Artemisinin is now part of the
established anti-malarial therapy used around the world to treat the
condition. Research is also underway to develop new therapeutics, based on
the molecular structure of Artemisinin. For her efforts, Tu received the
Nobel Prize in Physiology or Medicine in 2015.
➢ Paradigm Shift
o Thomas Kuhn was a philosopher of science who explored scientific
epistemology. For any scientific discipline, the set of concepts, theories,
research methods and postulates used by scientists makes up the paradigm
of that discipline.
o Kuhn said that the scientific paradigm consists of normal science and puzzle-
solving science. Normal science is everyday science, where scientists
conduct inquiries into the paradigm, for example, by verifying hypotheses.
The ‘discoveries’ of normal science are expected findings, based on the
prevailing paradigm. Over time, anomalies will appear in the prevailing
paradigms.
o The study of those anomalies is referred to as puzzle-solving science.
Scientists conduct investigations to understand the anomalies. The
discoveries in puzzle-solving science are unexpected and lead to paradigm
shifts (also called scientific revolutions). We will explore these ideas
further in the subsequent slides.
o According to Kuhn, a paradigm shift occurs in 3 stages:
❖ In Stage 1, normal science dominates. Scientists go about verifying
the prevailing concepts through observation and experimentation.
During this stage, hypotheses that are supported by evidence will be
retained, while those that are not, are rejected. This builds into a
body of scientific knowledge that forms the paradigms for the
various scientific disciplines.
❖ In Stage 2, scientists note that there are anomalies to the prevailing
paradigms. Anomalies are experimental data or observations that
cannot be explained by contemporary concepts.
❖ In Stage 3, the anomalies force scientists to search for new
explanations. When identified, the new concepts, ideas and
explanations result in a paradigm shift.
1. The prevailing view of how populations change over time was the
model proposed by Lamarck. According to him, the characteristics that
an individual acquires in its lifetime are transmitted to
its offspring. This model is called the Inheritance of Acquired
Characters. For example, the children of a person engaged in
physical labour would develop a stronger physique.
2. There were many anomalies that could not be explained by this
paradigm. One such anomaly came from the experiment conducted by
the German biologist, August Weismann. He took a population
of mice and amputated their tails. Those mice were allowed to
breed. All of the offspring had normal tails. Once again, he
amputated their tails and bred them. He did this for 19
generations, but all of the offspring in every generation had
normal tails. Therefore, this observation was an anomaly to the
theory of the Inheritance of Acquired Characters.
3. Charles Darwin proposed the Theory of Evolution by
Natural Selection, which was a different model from the
Inheritance of Acquired Characters. Eventually the paradigm of
evolutionary biology shifted, and Lamarck’s theory was
discarded. Darwin’s theory of evolution, together with
advances in genetics and cell biology, changed our
understanding of the inheritance of biological traits in populations.
o When anomalies appear in a scientific discipline, there are 2 ways by which
paradigm shifts may occur. The first is theory replacement. This describes
the situation when a new theory replaces an old theory. In modelling the
arrangements and movements of planets, the geocentric model was replaced
with the heliocentric model.
o The second method by which paradigm shifts occur is called theory
modification. Here, the old theory is not replaced, but modified. In the
example shown on this slide, the paradigm of Newtonian mechanics was
modified to include Einstein’s theories of relativity. Both theories are
relevant and valid, but are used to describe different systems.
o Both Galileo and Newton laid the foundations of classical mechanics.
Classical mechanics describes the motion of macroscopic (large) objects,
including those that are at rest. This movement is described in terms of the
masses of objects, and the forces acting on them. Parameters such as distance,
speed, time and space characterise the properties of a moving body. Classical
mechanics was very successful because if we know some initial conditions of a
moving body, then we can predict certain future outcomes, based on the principles of
classical mechanics. For example, if the speed of a moving object is known,
we can calculate the time taken for the object to travel a certain distance.
Classical mechanics applies equally to everything, everyone and
everywhere: for example, Newton’s 2nd Law of motion (F = ma) is as
applicable on the Moon as it is on the Earth. Many aspects of modern
society are based on the principles of classical mechanics, for example,
calculating travel times.
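The two calculations mentioned above (travel time from a known speed, and F = ma applied on the Earth and on the Moon) can be sketched in a few lines of Python. This is only an illustration: the function names and all numerical values are assumptions chosen for the example, not data from these notes.

```python
# Classical mechanics sketch: predicting outcomes from known initial conditions.
# All numbers below are illustrative assumptions.

def travel_time(distance_m: float, speed_m_per_s: float) -> float:
    """Time in seconds to cover a distance at constant speed: t = d / v."""
    if speed_m_per_s <= 0:
        raise ValueError("speed must be positive")
    return distance_m / speed_m_per_s


def net_force(mass_kg: float, acceleration_m_per_s2: float) -> float:
    """Newton's 2nd Law: F = m * a, in newtons."""
    return mass_kg * acceleration_m_per_s2


# A car travelling at a constant 25 m/s (90 km/h) over 100 km.
t = travel_time(100_000, 25)
print(f"Travel time: {t / 3600:.2f} h")

# The same law applies on the Earth and on the Moon; only the
# gravitational acceleration differs (approximate surface values).
G_EARTH = 9.81  # m/s^2
G_MOON = 1.62   # m/s^2
weight_earth = net_force(10, G_EARTH)
weight_moon = net_force(10, G_MOON)
print(f"Weight of 10 kg: {weight_earth:.1f} N on Earth, {weight_moon:.1f} N on the Moon")
```

The point of the sketch is that the same function, with no changes, gives the answer in both places; only the input (the local acceleration) changes.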
o Despite the success of classical mechanics, scientists noted that it could not
explain some specific types of phenomena, for example, very small objects
(e.g. atoms), very large objects (e.g. stars), or objects moving very fast (near
the speed of light). Indeed, Newtonian mechanics was incompatible with
Maxwell’s description of electromagnetism.
o Einstein explored some of these anomalies using a combination of ideas in
physics and mathematics. Special relativity describes the physics of particles
and waves moving at, or close to, the speed of light. On the other hand,
general relativity deals with the physics of large objects (gravitation).
Relativity did not displace the principles of Newtonian physics, but modified
them. For example, special relativity describes an equation for calculating
the momentum of a particle that is moving close to the speed of light.
However, for a particle that is moving at slower speeds, Einstein’s equation
for special relativity approximates to the equation for momentum in
classical mechanics. Special relativity also introduced new concepts in
physics, such as length contraction, time dilation, relativistic mass, a
universal speed limit, and mass–energy equivalence (E = mc²). General
relativity extends Newton’s Theory of Universal Gravitation. According to
Newton’s model, gravity is an attractive force between two objects.
However, General Relativity extends that idea to describe gravitation as the
warping of space-time around massive objects. Indeed, some aspects of
classical mechanics are seen to be an approximation of special relativity at
low velocities, and special relativity is an approximation of general relativity
in low gravitational fields.
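The claim that the relativistic momentum equation approximates the classical one at low speeds can be checked numerically. The sketch below assumes the standard special-relativity expression p = γmv, with γ = 1/√(1 − v²/c²), and compares it with the classical p = mv. The function names and the chosen speeds are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def classical_momentum(mass_kg: float, speed: float) -> float:
    """Classical momentum: p = m * v."""
    return mass_kg * speed


def lorentz_factor(speed: float) -> float:
    """gamma = 1 / sqrt(1 - v^2 / c^2), defined for 0 <= v < c."""
    if not 0 <= speed < C:
        raise ValueError("speed must satisfy 0 <= v < c")
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)


def relativistic_momentum(mass_kg: float, speed: float) -> float:
    """Relativistic momentum: p = gamma * m * v."""
    return lorentz_factor(speed) * mass_kg * speed


# Ratio of relativistic to classical momentum for a 1 kg body
# at increasing fractions of the speed of light.
for fraction in (1e-6, 0.1, 0.5, 0.9, 0.99):
    v = fraction * C
    ratio = relativistic_momentum(1.0, v) / classical_momentum(1.0, v)
    print(f"v = {fraction:g} c  ->  p_rel / p_classical = {ratio:.6f}")
```

At everyday speeds the ratio is indistinguishable from 1, which is why Newtonian mechanics remains valid there. The ratio grows without bound as v approaches c.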
o Modern scientific practice is based on philosophical thinking that developed
over many centuries in Europe, as well as the cultural observational
knowledge of many societies around the world. Modern science is a
powerful method of inquiry. The findings of scientific inquiry eventually
develop into a dynamic body of knowledge, known as science. Empiricism is
a strong force that shapes scientific inquiry – the emphasis on evidence and
experience is central to the development of scientific understanding.
Page 19 of 47
• Influences on Current Scientific Thinking
➢ Ethics, Morality, & the Law
o Consider the following scenario: imagine you are driving a car and are
approaching a traffic intersection. The light has just turned red. What do you
do? You will, I hope, come to a stop at the intersection. Now, think about
your reasons for stopping at the intersection – did you do it because it is
illegal to cross the intersection when the light is red? Or did you think about
safety issues of not following the traffic signals? What if, after having
stopped at the intersection, you notice that no other vehicles are
approaching the intersection – will you be
tempted to cross the intersection even
though the lights are still red?
o Since we live in societies among many
other people, our behaviours will affect
the people around us. The decisions we
make may also affect the other living
things with whom we share the planet, as
well as affect its environments. Therefore,
in all societies, rules of behaviour and
norms are developed in the interest of the
greater good. Essentially, there are three
influences on human behaviour: ethics, morality and the law.
o Scientific ethics is concerned with the truth and integrity of scientific
practice. The processes used to conduct investigations, analyse and
communicate the findings should all be based on honesty. That honesty is
one of the most powerful features of science.
o Ethically Questionable Practices
❖ One team of scientists claimed to have achieved cold fusion
(generating energy from a nuclear fusion reaction at ambient
temperatures).
❖ The case of cold fusion shows another aspect of questionable
scientific ethics. Stanley Pons and Martin Fleischmann at the
University of Utah concluded that they had found evidence of
deuterium fusion occurring at room temperature (this was a ‘holy
grail’ of energy research). Rather than publishing their findings in a
peer-reviewed journal, they announced their findings at a press
conference. However, other scientists could not replicate Pons and
Fleischmann’s experiments. Later that year, a U.S. Department
of Energy panel concluded that there was no convincing evidence of
cold fusion. Although their work was not considered to be scientific
fraud, it was unethical as they did not follow the scientific process.
Page 20 of 47
o Ethical Frameworks
❖ The principle of autonomy: making voluntary and informed
decisions (i.e. capacity to act intentionally, with understanding, and
without controlling influences)
❖ The principle of non-maleficence: no subject in a study is
intentionally harmed or injured, either through acts of commission
or omission
❖ The principle of beneficence: beneficial outcomes are produced, and
positive steps are taken to prevent and to remove harm from the
patient
❖ The principle of justice: Equal access to care, benefits,
compensation
❖ The principle of confidentiality: maintaining anonymity and privacy.
❖ The principle of non-deception: maintaining open and truthful
communications
➢ Current Influences on Scientific Thinking
o Economic
o Political
o Global
➢ Influence of Ethical Frameworks on Scientific Research
o Human Research
❖ Human experimentation refers to scientific investigations of humans
(it excludes studies in other areas, such as social science, education,
etc.). Human experimentation may involve manipulation (e.g.
clinical trials), or be purely observational. The history of human
experimentation is a mixed one. Although our knowledge of human
biology advanced in leaps and bounds through experimentation,
many of those studies would not be conducted in the original manner
today.
❖ For example, vaccination is a powerful medical therapy that has
improved the human condition worldwide. The English doctor,
Edward Jenner, is credited with the first scientific demonstration of
vaccination (Jenner inoculated people with cowpox material; the older
practice of inoculating with smallpox material is called variolation).
However, by today’s standards, Jenner’s studies
would be considered to be unethical.
❖ When the details of the Tuskegee Study of Untreated Syphilis
experiment were revealed to the public, there was an
international uproar. In this study, 399 syphilitic and 201 non-syphilitic
African-American men were part of a study to determine the
physiological effects of syphilis infections on humans. After reading
the synopsis of the research on the website indicated in the slide,
you may notice the following:
▪ The participants were not told that they were involved in a
human experiment.
▪ The participants were given inducements to take part in the
experiment.
▪ The participants were denied treatment for syphilis
infections, even though the treatment became available
during the study.
❖ All human experimentation in Australia is governed by the ethical
frameworks developed by the National Health and Medical Research
Council, also referred to as the NHMRC. These ethical frameworks
are based on the same principles of universal ethics. The four key
frameworks are:
❖ In Australia, at institutions that undertake human research, all
research proposals must be approved by a Human Research Ethics
Committee (HREC). Human research cannot be conducted unless an
ethics permit has been obtained.
❖ HRECs consist of researchers, non-researchers and community
members.
❖ The HRECs evaluate applications based on:
Page 22 of 47
1. How is the research question/theme identified or developed?
2. How do the research methods align with the research aims?
3. How will the researchers and the participants engage with
one another?
4. How will the research data or information be collected,
stored, and used?
5. How will the results or outcomes be communicated?
6. What will happen to the data and information upon
completion of the project?
o Experimentation on Animals
❖ Several ethical frameworks also govern the use of animals in
research. Animals are used in many areas of study. The reasons for
using animals in experimentation are two-fold:
1. To learn about the biology and behaviour of the animals
themselves – this is important for veterinary science,
agriculture, management of wildlife (e.g. zoos, aquaria, parks)
and conservation.
2. To use animals as models of human biology – since there are
many biochemical, physiological and genetic similarities
between many animals and humans, animal models can
provide a wealth of information about how human biology
works. This is described further below.
❖ There are several reasons why animals are used for biomedical
research:
1. Animals are biologically very similar to humans (mice and
humans share more than 98% genetic similarity)
2. Animals are susceptible to many of the same health problems
as humans – cancer, diabetes, heart disease, etc.
3. With a shorter life cycle than humans, animal models can be
studied throughout their whole life span and across several
generations.
Page 23 of 47
❖ There are both scientific and ethical imperatives for looking after
animals in research. Animals that are not cared for usually
experience stress, which tends to affect other physiological systems.
This may lead to anomalous results, as well as results that are not
reproducible.
❖ The ethical frameworks for animal experimentation apply to
vertebrate animals. These are animals with backbones (fish,
amphibians, reptiles, birds and mammals). This is because
vertebrates can experience pain, while current evidence suggests
that invertebrates do not experience pain.
❖ In Australia, individual states and territories are responsible for
overseeing animal ethics. As for human experimentation,
institutions must have Animal Ethics Committees (AECs) to review all
research involving vertebrate animals. Researchers must obtain
ethics permits to conduct their research.
❖ AECs focus on the care of animals in experiments, as well as the
disposal of animals after the investigations are completed. There are
strict guidelines on how animals should be housed, fed, cleaned and
maintained during the investigations. If animals are to be
euthanised, then the researchers must use approved methods to kill
the animals. These methods have been approved by expert
committees so as to reduce the stress and pain burden on animals.
❖ The 3R rule was devised to reduce reliance on the use of animals in
experiments:
▪ Replacement of animals with other methods – where
possible, viable alternatives to the use of animals should be
explored.
▪ Reduction in the number of animals used – researchers
should use the minimum number of animals in their
experiments. There are statistical models that they can use
to determine the minimum number of animals that can be
used in an experiment without affecting the reliability and
validity of their findings.
▪ Refinement of techniques used to minimise the adverse
impact on animals – researchers should always use the
latest findings regarding the manipulation of animals so as
to minimise pain and stress on them.
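As an illustration of the "Reduction" principle, one common statistical approach to choosing a minimum group size is a power analysis. The sketch below uses the textbook normal-approximation formula for comparing the means of two groups, n = 2((z₁₋α/₂ + z₁₋β)σ/δ)² per group. This is an assumption about the kind of model meant here, not necessarily the method any particular ethics committee requires, and the function name, parameters and example values are hypothetical.

```python
import math
from statistics import NormalDist


def animals_per_group(effect_size: float, std_dev: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate minimum animals per group for a two-group comparison of means.

    effect_size: smallest difference between group means worth detecting
    std_dev:     expected standard deviation of the measurement
    alpha:       two-sided significance level
    power:       desired probability of detecting the effect if it exists
    """
    if effect_size <= 0 or std_dev <= 0:
        raise ValueError("effect_size and std_dev must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    n = 2 * ((z_alpha + z_beta) * std_dev / effect_size) ** 2
    return math.ceil(n)  # round up: a fraction of an animal cannot be used


# Hypothetical example: detect a difference of one standard deviation.
print(animals_per_group(effect_size=1.0, std_dev=1.0))
# A larger expected effect needs fewer animals.
print(animals_per_group(effect_size=2.0, std_dev=1.0))
```

The normal approximation slightly underestimates the group size needed for a t-test with small samples, so in practice the result would be treated as a lower bound and checked with dedicated statistical software.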
❖ This is a pathway for the discovery of new medical treatments. In
vitro and in silico refer to studies that are conducted with cell or
tissue cultures, or to discoveries made with molecular arrays and
computer simulations. Small animal research is usually the starting
point for the development of a new therapeutic. Once it is proven,
then the therapeutic is tested on large animals. Large animal studies
provide important information about physiological responses to the
therapeutic. Only when its potential is realised in large animals will
the product be used in human experiments (clinical trials). Both
animal and human ethics apply at multiple stages in this discovery
pipeline.
Page 24 of 47
o Biobanks
❖ Biomedical researchers often need access to biological samples for
their experiments. In some types of research, they may need access
to samples from specific sectors of the population (e.g. for disease
tissues). Biobanks are repositories of biological samples that
researchers can use. To ensure that researchers have access to all
relevant information about the samples in the biobanks, there is a
detailed cataloguing process for all samples.
❖ Details of the type, origin, date of collection and other
information are kept in a database. The tissue samples are stored in
various ways, but usually in cold storage (-70 °C or liquid nitrogen).
One example of the use of biobank materials is the research
conducted by the Kathleen Cuningham Foundation National
Consortium for Research on Familial Breast Cancer (KConFaB).
❖ There, the researchers are looking to determine the:
▪ population rates of mutations in breast cancer genes;
▪ kinds of mutations that predispose to breast and ovarian
cancer;
▪ risk of breast and other types of cancer;
▪ age at which cancers occur; and
▪ effect of lifestyle and environmental factors on the risk of
developing cancer and age of onset.
❖ More than 100 research projects worldwide rely on samples from
this biobank.
❖ Many ethical issues surround the collection, maintenance and use of
the materials in biobanks. Some of those issues are listed
below.
▪ Informed consent – all donors must be willing participants
who have been informed about how their donated samples
may be used. Proper communications should also be set up
between the biobanks and the donors. Most importantly, no
one must be compelled to donate samples. Vulnerable
donors, including those who cannot make informed
decisions, need to be respected and protected. Where
relevant, cultural sensitivities must be taken into account.
▪ The information contained in biobanks must be treated
confidentially. The privacy of donors must be respected.
Most biobanks have coded identification systems where
confidential information is not released, except as required
by the researchers.
▪ Some research activities may result in commercially-
valuable discoveries. Biobanks must ensure that the benefits
of research and development are shared with all people
involved, in accordance with the law.
➢ Use of Research Data
o The communication of research findings is at the core of the scientific
enterprise. Research data, once verified, are the raw materials of scientific
knowledge construction. For scientists, the generation and publication of
scientific data are the hallmarks of professionalism. Peer-reviewed
publications are the primary means of such communications, but scientists
also use other forms of communications. Most research funding agencies,
such as the Australian Research Council and the National Health and Medical
Research Council, require scientists to publish the data they generate
through funded research programs. When applying for such funding,
scientists have to indicate how they plan to publicise their findings. This
creates transparency, and the community can see the benefits of supporting
scientific research.
o Sometimes, research may produce data that should not be openly shared.
For example, research that has military or security implications, or data that
may be used to achieve harmful ends (e.g. bioterrorism) should not be
published. Furthermore, discoveries with commercial potential may not be
published (or at least, critical data may not be revealed).
o Data sharing is beneficial, as it:
❖ Encourages further scientific enquiry and promotes innovation.
❖ Leads to new collaborations between data users and data creators.
❖ Maximises transparency and accountability.
❖ Reduces the cost of duplicating data collection.
o The ethics of data sharing centres on the following questions:
1. What data or information are required to achieve the objectives of
the project?
2. How and by whom will the data or information be generated,
collected and accessed?
3. How and by whom will the data or information be used and
analysed?
4. Will the data or information be disclosed or shared and, if so, with
whom?
5. How will the data or information be stored and disposed of?
6. What are the risks associated with the collection, use and
management of data or information and how can they be
minimised?
7. What is the likelihood and severity of any harm/s that might result?
Page 26 of 47
The Scientific
Research Proposal
Notes
Page 27 of 47
Table of Contents
• Developing the Question & Hypothesis ............................................................................... 28
➢ Reliability, Validity & Accuracy ........................................................................................ 28
➢ What makes a source 'reliable'? ...................................................................................... 28
➢ Using citations to locate other relevant journal articles .................................................... 29
➢ How to find more full-text articles ................................................................................... 29
• Scientific Research Proposal ................................................................................................ 30
➢ Plan to Investigate the Scientific Hypothesis .................................................................... 30
➢ Referencing Protocols ..................................................................................................... 30
• Methodology & Data Collection .......................................................................................... 31
➢ Uncertainty in Experimental Evidence ............................................................................. 31
➢ Use of Errors ................................................................................................................... 31
➢ Quantitative & Qualitative Research Methods ................................................................. 32
➢ Methods Used to Obtain Large Data Sets ......................................................................... 32
• Processing Data for Analysis ............................................................................................... 34
➢ Impact of Kepler Telescope Data ..................................................................................... 34
Page 28 of 47
• Developing the Question & Hypothesis
➢ Reliability, Validity & Accuracy
Reliability
▪ When a scientist repeats an experiment with a different group of people or a different batch of the same chemicals and gets very similar results then those results are said to be reliable. Reliability is measured by a percentage – if you get exactly the same results every time then they are 100% reliable.
▪ Try holding a ruler above a friend’s open hand and dropping it – they have to catch the ruler but may not move until they see the ruler start to move. Note down the measurement where the ruler was caught. Do this ten times and calculate the mean (average) result.
▪ Is the ‘dropping a ruler’ experiment a reliable measure of reaction time?
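The ruler-drop trial above can be analysed with a short script. This is a minimal sketch: the ten catch distances are invented for illustration, and the conversion to reaction time assumes the ruler falls freely from rest, so that d = ½gt² and t = √(2d/g). The notes do not state that formula; it is the standard free-fall result.

```python
import math
from statistics import mean, stdev

G = 9.81  # gravitational acceleration near the Earth's surface, m/s^2

# Hypothetical catch distances (cm) from ten repeated trials.
catch_distances_cm = [17.5, 19.0, 16.0, 18.5, 20.0, 17.0, 18.0, 19.5, 16.5, 18.0]


def reaction_time_s(distance_cm: float) -> float:
    """Reaction time from free-fall distance, assuming the ruler starts at rest."""
    distance_m = distance_cm / 100
    return math.sqrt(2 * distance_m / G)


mean_distance = mean(catch_distances_cm)
spread = stdev(catch_distances_cm)  # sample standard deviation of the trials
times = [reaction_time_s(d) for d in catch_distances_cm]

print(f"Mean catch distance: {mean_distance:.1f} cm (spread {spread:.1f} cm)")
print(f"Mean reaction time:  {mean(times):.3f} s")
```

A small spread relative to the mean suggests the trials are repeatable, which bears on reliability. Whether the catch distance really reflects reaction time alone is a separate question of validity.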
Validity
▪ Validity describes whether the results of an experiment really do measure the concept being tested. Does seeing how far a ruler can drop through someone’s hand really measure reaction time? What other variables may be influencing the results?
▪ Is the ‘dropping a ruler’ experiment a valid measure of reaction time?
Accuracy
▪ Accuracy describes how well a measuring instrument determines the variable it is measuring. The term can be used in two ways:
▪ An accurate measuring instrument, say a thermometer, is one whose readings confirm a known result.
▪ The level of accuracy of a measuring instrument determines the detail to which it can measure. A micrometer measures length to a greater level of accuracy than a ruler which in turn measures length to a greater level of accuracy than a ‘clicker’ wheel.
▪ In order to be accurate in their work, scientists need to first select a measuring instrument that allows an appropriate measure of accuracy (e.g. a micrometer for the diameter of a piece of wire and a ruler marked in mm for its length), and then to calibrate it. Calibrating an instrument involves measuring already known quantities to assess how accurately it is working.
➢ What makes a source 'reliable'?
o I would be looking for journal articles which:
❖ Are published in a high ranking journal (usually ranked according to
how many citations articles in that journal have. See discussion at
https://en.wikipedia.org/wiki/Journal_ranking).
❖ Have a large number of citations (depending on how old the article
is) - meaning that other researchers refer to this paper in their own
research articles. The number of citations is a measure of how highly
regarded the work is by other researchers in the field.
❖ Have authors from well-regarded universities/institutions (but
this is not really as important as the first two).
❖ There may also be circumstances in which reliable information is
available on a website maintained by a reputable institution such
as NASA (for example, in my sample investigation, the NASA Earth
Observatory).
➢ Using citations to locate other relevant journal articles
o When you find a highly reliable and relevant article, look at both the articles
it cites as well as articles which cite that article to find other highly relevant
articles.
o You can find the articles that cite your article using, for example, Google
Scholar. When I search for my review article on "the albedo of earth" by
Stephens et al. on Google Scholar I can see that it has been cited by 30 other
articles. I click on the "citing literature" link to find those articles. In this way
you 'follow your nose' through the research until you find the information
you want and/or come up against the edge of what is known, with articles
published in the last year or two.
➢ How to find more full-text articles
o You can consider using the browser add-on "Unpaywall" which finds legal
full-text versions of articles (often stored in university repositories of their
researchers' work or in preprint archives). If you are having trouble finding a
full-text version of an article you can also sometimes find the researcher's
own personal webpage which may list full-text versions (for example I keep
full-text versions of my articles available on my personal website) as it is in
the researcher's interests that their articles be accessible.
o As mentioned on the "tools" page, joining the state and national
libraries may also help you find full-text papers.
Page 30 of 47
• Scientific Research Proposal
➢ Plan to Investigate the Scientific Hypothesis
o The Overall Strategy
o Methodology
o Data Analysis
o Representation & Communication of the Scientific Research
o Timelines
o Benchmarks
➢ Referencing Protocols
o APA
o Harvard
o MLA
Page 31 of 47
• Methodology & Data Collection
➢ Uncertainty in Experimental Evidence
Systematic Errors
Two types:
✓ Offset uncertainty - all measurements are larger or smaller than the "true" value by a constant
amount. For example: a thermometer is not in sufficiently good thermal contact with a hot object -
the readings are all lower than the "true" temperature of the object. See this paper for an example of a systematic offset: https://fathomingphysics.nsw.edu.au/wp-content/uploads/2017/07/TEHumphrey_VCalisa_Phys_Teach_vol_52_iss_3_142_2014.pdf
✓ Gain uncertainty - measurements are larger or smaller than the "true" value by a fixed percentage.
For example: A long measuring tape is stretched so that all 1cm markings on the tape are now actually separated by 1.01cm. Each reading will give a value that is about 1% lower than the "true" value, because fewer markings fit along any given length.
Random Errors
Random errors correspond to the "scatter" in experimental data, and will result in readings that are scattered around the "true" value, usually with a normal distribution. The data in the paper we looked at earlier (https://fathomingphysics.nsw.edu.au/wp-content/uploads/2017/07/TEHumphrey_VCalisa_Phys_Teach_vol_52_iss_3_142_2014.pdf) also shows significant random error, with data points distributed above and below the line of best fit.
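The difference between the two kinds of systematic error and random error can be seen in a small simulation. This is a sketch under stated assumptions: the reading model (reading = gain × true value + offset + noise), the function name and all the numbers are hypothetical, chosen only to make the three effects visible.

```python
import random
from statistics import mean, stdev


def simulate_readings(true_value: float, n: int, offset: float = 0.0,
                      gain: float = 1.0, noise_sd: float = 0.0,
                      seed: int = 0) -> list[float]:
    """Simulate n instrument readings of a fixed true value.

    offset:   constant amount added to every reading (offset uncertainty)
    gain:     constant factor applied to every reading (gain uncertainty)
    noise_sd: standard deviation of normally distributed scatter (random error)
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [gain * true_value + offset + rng.gauss(0.0, noise_sd)
            for _ in range(n)]


TRUE_LENGTH_CM = 100.0

offset_only = simulate_readings(TRUE_LENGTH_CM, 1000, offset=-0.5)
gain_only = simulate_readings(TRUE_LENGTH_CM, 1000, gain=0.99)
random_only = simulate_readings(TRUE_LENGTH_CM, 1000, noise_sd=0.3)

for label, readings in [("offset", offset_only),
                        ("gain", gain_only),
                        ("random", random_only)]:
    print(f"{label:>6}: mean = {mean(readings):.3f} cm, "
          f"scatter = {stdev(readings):.3f} cm")
```

Averaging many readings shrinks the effect of random error, since the mean of the noisy readings stays close to the true value. It does nothing for the offset or gain error, which shift every reading the same way and can only be removed by calibration.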
➢ Use of Errors
Page 32 of 47
➢ Quantitative & Qualitative Research Methods
o Qualitative Research: This can be described as research that cannot easily
be communicated or understood in numerical terms. It usually involves
open ended questions and responses (such as interview questions or case
studies).
o Quantitative Research: This is an approach for testing objective theories by
examining the relationships among variables. The variables can be measured
and analysed numerically.
o Mixed Methods Research: Contains elements of both types of research.
o Different Types of Scientific Inquiry
❖ a commitment to deductive testing (i.e. the idea that
experimental/observational evidence determines if a theory can be
accepted)
❖ an experimental design that protects against bias
❖ a consideration of alternative explanations of the results
❖ the interpretation of data (either qualitative or quantitative) to
produce results that are reproducible and generalisable
❖ a discussion of how the research relates to other work done in that
field
➢ Methods Used to Obtain Large Data Sets
o Remote Sensing
❖ Remote sensing is obtaining information about an area or
phenomenon through a device that does not touch the area or
phenomenon under study.
❖ Passive remote sensors detect natural energy that is reflected or
emitted from an observed object or scene (most commonly, reflected
sunlight). For example, a camera or a spectrometer (or your eyes!).
❖ Active remote sensors provide their own energy (electromagnetic
radiation) to illuminate the object or scene they are observing, and
then detect the radiation that is reflected or backscattered from that
object. For example, Radar (Radio detection and ranging) or Lidar
(Light detection and ranging) instruments.
❖ Many remote sensing devices are on-board satellites that monitor the
Earth from space.
o Streamed Data
❖ There are many devices now used in research which can operate in an
autonomous or semi-autonomous mode in which data is recorded
continuously and then "streamed" out of the sensor for further
processing and analysis.
❖ While this technology has opened new opportunities in research, it has
also brought challenges. With advances in information technology it
has become possible to record very large amounts of data in a short
time. In some cases, e.g. the SKA discussed below, it is necessary to
process the data in real time to reduce the amount of data that is
placed in long term storage. In other systems the constraints may be
on the communication link from the instrument. For example, the
Kepler telescope was situated in a Sun-centred orbit and had a limited
capacity (once a month only) radio link back to Earth (see:
https://www.nasa.gov/mission_pages/kepler/spacecraft/index.html).
Similar considerations apply in sensor networks where the
communication back to base is limited by the amount of power
required.
❖ Other sources of streamed data include:
▪ Internet of People (consisting of wearable devices)
▪ Social media
▪ Financial transactions
▪ Industrial Internet of Things
▪ Cyberphysical Systems (requiring real-time responses)
▪ Satellite and airborne monitors
▪ National Security
▪ Astronomy
▪ Light Sources
▪ Instruments like the LHC
▪ Sequencers (all involving large volumes of data)
▪ Data Assimilation (where there is sensitivity to latency)
▪ Analysis of Simulation Results
▪ Steering and Control.
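The idea of processing streamed data in real time, so that less has to be kept in long-term storage, can be sketched with a Python generator that averages the incoming samples in fixed-size blocks. This is only an illustration of the general idea: `sensor_stream` is a hypothetical stand-in for a real instrument, and block averaging is one simple reduction among many, not the method used by any instrument named above.

```python
from typing import Iterable, Iterator


def sensor_stream(n_samples: int) -> Iterator[float]:
    """Hypothetical stand-in for an instrument that emits one reading at a time."""
    for i in range(n_samples):
        yield float(i % 10)


def block_averages(samples: Iterable[float], block_size: int) -> Iterator[float]:
    """Reduce a stream on the fly by yielding the mean of each block of samples.

    Only a running total and a counter are held in memory, so the full
    stream never needs to be stored.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    total, count = 0.0, 0
    for sample in samples:
        total += sample
        count += 1
        if count == block_size:
            yield total / count
            total, count = 0.0, 0
    if count:  # emit the final, partially filled block
        yield total / count


# 1000 raw readings are reduced to 10 stored values.
reduced = list(block_averages(sensor_stream(1000), block_size=100))
print(len(reduced), reduced[:3])
```

The trade-off is that detail within each block is lost, which is why the choice of what to discard has to be made before the data are collected.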
Page 34 of 47
• Processing Data for Analysis
➢ Impact of Kepler Telescope Data
o The Kepler and K2 missions have provided an unprecedented data set with a
precision and duration that will not be rivalled for decades. Even though the
data has already contributed to nearly 2,500 scientific publications so far,
the scientific community continues to extract new discoveries from the
archive data every day.
o To help new users understand where there may be important scientific gains
left to be made in analysing Kepler data, and to encourage the continued
use of the archives, we have prepared a white paper which discusses a non-
exhaustive list of 21 important data analysis projects which can be executed
using the public data that are readily available in the archives today. Each
project contains a link to an issue on the GitHub repository of the white
paper where we invite researchers to discuss their ideas or progress towards
resolving the challenge.
o The studies discussed in the paper show that many of Kepler's contributions
still lie ahead of us, owing to the emergence of complementary new data
sets, novel data analysis methods, and advances in computing power.
Page 35 of 47
The Data, Evidence
and Decisions
Notes
Page 36 of 47
Table of Contents
• Patterns & Trends ............................................................................................................... 38
➢ Data vs. Evidence ............................................................................................................ 38
➢ Qualitative vs. Quantitative Data Sets ............................................................................. 38
o Content & Thematic Analysis ....................................................................................... 38
o Descriptive Statistics ................................................................................................... 38
➢ Tools for Data Representation ......................................................................................... 38
o Spreadsheets .............................................................................................................. 38
o Graphical Representations........................................................................................... 38
o Models ....................................................................................................................... 38
o Digital Technologies .................................................................................................... 38
➢ Limitations of Data Analysis & Interpretation .................................................................. 38
o Quantitative Data ........................................................................................................ 38
o Qualitative Data .......................................................................................................... 39
• Statistics in Scientific Research ............................................................................................ 40
➢ Descriptive Statistics ....................................................................................................... 40
o Mean .......................................................................................................................... 40
o Median ....................................................................................................................... 40
o Standard Deviation ..................................................................................................... 40
➢ Performance Measures ................................................................................................... 40
o Error ........................................................................................................................... 40
o Accuracy ..................................................................................................................... 40
o Precision ..................................................................................................................... 40
o Bias ............................................................................................................................. 40
o Data Cleansing ............................................................................................................ 40
➢ Statistical Tests ............................................................................................................... 40
o Student’s t-test ........................................................................................................... 40
o Chi-square test ............................................................................................................ 41
o F-test .......................................................................................................................... 42
➢ Bivariate Correlation ....................................................................................................... 43
o Correlation Coefficient ................................................................................................ 43
➢ Correlation vs. Causation ................................................................................................ 43
• Decisions from Data & Evidence .......................................................................................... 45
➢ Collective & Individual Decision-Making .......................................................................... 45
o Collective Decision-Making .......................................................................................... 45
o Individual Decision-Making.......................................................................................... 46
➢ Impact of New Data on Established Scientific Ideas .......................................................... 46
o Gravitational Waves on General Relativity ................................................................... 46
• Data Modelling ................................................................................................................... 47
➢ Data Modelling Techniques ............................................................................................. 47
o Predictive .................................................................................................................... 47
o Statistical .................................................................................................................... 47
o Descriptive .................................................................................................................. 47
o Graphical .................................................................................................................... 47
• Patterns & Trends
➢ Data vs. Evidence
o Data is just data and has no intrinsic meaning on its own.
o Evidence has to be evidence for or of something: an argument, an opinion, a
viewpoint or a hypothesis.
➢ Qualitative vs. Quantitative Data Sets
o Content & Thematic Analysis
❖ Content analysis is the study of documents and communication
artefacts, which might be texts of various formats, pictures, audio or
video. Social scientists use content analysis to examine patterns in
communication in a replicable and systematic manner.
❖ Thematic analysis is one of the most common forms of analysis
within qualitative research. It emphasizes identifying, analysing and
interpreting patterns of meaning within qualitative data.
o Descriptive Statistics
❖ A descriptive statistic is a summary statistic that quantitatively
describes or summarizes features from a collection of information,
while descriptive statistics is the process of using and analysing
those statistics.
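As a minimal sketch of what summarizing a collection of information looks like in practice, the snippet below computes the three descriptive statistics these notes list later (mean, median and standard deviation). The sample values are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical sample of measurements (illustrative values only)
sample = [4, 8, 6, 5, 3, 10]

mean = statistics.mean(sample)      # arithmetic average of the values
median = statistics.median(sample)  # middle value of the sorted data
stdev = statistics.stdev(sample)    # sample standard deviation (n - 1 denominator)

print(mean, median, round(stdev, 3))
```

Note that `statistics.stdev` treats the data as a sample; `statistics.pstdev` would be the choice if the values were the whole population.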
➢ Tools for Data Representation
o Spreadsheets
o Graphical Representations
o Models
❖ Physical, computational and/or mathematical
o Digital Technologies
➢ Limitations of Data Analysis & Interpretation
o Data analysis and interpretation is the process of assigning meaning to
the data collected and determining the conclusions, significance, and
implications of the findings. It is a crucial and exciting step in the process
of research. In most research studies, analysis follows data collection.
o There are two main methods of interpreting data: quantitative and qualitative.
o Quantitative Data
❖ Quantitative data is statistical and is usually structured in nature,
meaning it is more rigid and defined. This kind of data is measured
using values and numbers, which makes it a more suitable candidate
for data analysis.
❖ E.g.
▪ Experiments
▪ Surveys
▪ Metrics
▪ Tests
o Qualitative Data
❖ Qualitative data is non-statistical and is usually unstructured or
semi-structured in nature. This data isn’t necessarily measured using
the hard numbers that are used to develop graphs and charts. Instead,
it’s categorized based on properties, attributes, labels, and other
identifiers.
❖ E.g.
▪ Symbols and Images
▪ Video and audio recordings
▪ Texts and documents
▪ Observations and notes
o There are many issues that researchers should be aware of with respect to
data analysis. Some of those issues are as follows.
❖ Having the necessary skills to analyse the data
❖ Selecting data collection methods and an appropriate analysis
at the same time
❖ Drawing unbiased conclusions
❖ Unsuitable subgroup analysis
❖ Lack of clearly defined and objective outcome measurements
❖ Providing honest and exact analysis
❖ The data recording process
❖ Splitting up ‘text’ when analysing qualitative data
❖ Accuracy, authenticity and validity
• Statistics in Scientific Research
➢ Descriptive Statistics
o Mean
o Median
o Standard Deviation
➢ Performance Measures
o Error
o Accuracy
o Precision
o Bias
❖ Bias is any trend or deviation from the truth in data collection, data
analysis, interpretation and publication which can cause false
conclusions. Bias can occur either intentionally or unintentionally.
o Data Cleansing
❖ Data cleansing is the process of detecting and correcting corrupt or
inaccurate records in a record set, table, or database. It involves
identifying incomplete, incorrect, inaccurate or irrelevant parts of
the data and then replacing, modifying, or deleting the dirty or
coarse data.
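The cleansing steps described above (identify incomplete, incorrect or irrelevant entries, then modify or delete them) can be sketched roughly as follows. The records, the field layout and the plausible temperature range are all assumptions made up for this illustration, not part of the notes.

```python
# Hypothetical raw records: (sample_id, temperature in °C as typed in)
raw_records = [
    ("A1", "21.5"),
    ("A2", ""),        # incomplete: missing value
    ("A3", "abc"),     # incorrect: not a number
    ("A4", "22.1"),
    ("A4", "22.1"),    # duplicate record
    ("A5", "950.0"),   # implausible value for the assumed range
]

def clean(records, low=-50.0, high=60.0):
    """Return de-duplicated records whose readings are numeric and in range.

    The range limits are assumed values for this sketch.
    """
    seen = set()
    cleaned = []
    for sample_id, text in records:
        try:
            value = float(text)
        except ValueError:
            continue  # drop incomplete or non-numeric entries
        if not low <= value <= high:
            continue  # drop values outside the assumed plausible range
        if (sample_id, value) in seen:
            continue  # drop exact duplicates
        seen.add((sample_id, value))
        cleaned.append((sample_id, value))
    return cleaned

cleaned_records = clean(raw_records)
print(cleaned_records)
```

Deleting is only one option; depending on the study, a dirty value might instead be corrected from the original source or flagged for review.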
➢ Statistical Tests
o Student’s t-test
❖ Student’s t-test, in statistics, is a method of testing hypotheses
about the mean of a small sample drawn from a normally
distributed population when the population standard deviation is
unknown.
❖ The t distribution is a family of curves in which the number of
degrees of freedom (the number of independent observations in the
sample minus one) specifies a particular curve. As the sample size
(and thus the degrees of freedom) increases, the t distribution
approaches the bell shape of the standard normal distribution. In
practice, for tests involving the mean of a sample of size greater
than 30, the normal distribution is usually applied.
❖ It is usual first to formulate a null hypothesis, which states that
there is no effective difference between the observed sample mean
and the hypothesized or stated population mean—i.e., that any
measured difference is due only to chance.
❖ In an agricultural study, for example, the null hypothesis could be
that an application of fertilizer has had no effect on crop yield, and
an experiment would be performed to test whether it has increased
the harvest. In general, a t-test may be either two-sided (also
termed two-tailed), stating simply that the means are not
equivalent, or one-sided, specifying whether the observed mean is
larger or smaller than the hypothesized mean. The test statistic t is
then calculated. If the observed t-statistic is more extreme than the
critical value determined by the appropriate reference distribution,
the null hypothesis is rejected. The appropriate reference
distribution for the t-statistic is the t distribution. The critical value
depends on the significance level of the test (the probability of
erroneously rejecting the null hypothesis).
❖ For example, suppose a researcher wishes to test the hypothesis
that a sample of size n = 25 with mean x̄ = 79 and standard deviation
s = 10 was drawn at random from a population with mean μ = 75
and unknown standard deviation. Using the formula for the t-statistic,
t = (x̄ − μ) / (s / √n) = (79 − 75) / (10 / √25) = 2
❖ The calculated t equals 2. For a two-sided test at a common level of
significance α = 0.05, the critical values from the t distribution on 24
degrees of freedom are −2.064 and 2.064. The calculated t does not
exceed these values; hence the null hypothesis cannot be rejected
with 95 percent confidence. (The confidence level is 1 − α.)
❖ A second application of the t distribution tests the hypothesis that
two independent random samples have the same mean. The t
distribution can also be used to construct confidence intervals for
the true mean of a population (the first application) or for the
difference between two sample means (the second application).
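The worked one-sample example above (n = 25, sample mean 79, s = 10, μ = 75) can be checked with a short sketch. The function name `one_sample_t` is hypothetical; the critical value 2.064 is the one quoted in these notes for a two-sided test at α = 0.05 with 24 degrees of freedom.

```python
import math

def one_sample_t(sample_mean, pop_mean, sample_sd, n):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    return (sample_mean - pop_mean) / (sample_sd / math.sqrt(n))

t = one_sample_t(sample_mean=79, pop_mean=75, sample_sd=10, n=25)

critical = 2.064                    # two-sided, alpha = 0.05, 24 degrees of freedom
reject_null = abs(t) > critical     # two-sided test compares |t| with the critical value

print(t, reject_null)
```

Because |t| = 2 is smaller than 2.064, the sketch reaches the same conclusion as the notes: the null hypothesis cannot be rejected at this significance level.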
o Chi-square test
❖ A chi-square (χ²) statistic measures how expectations
compare to actual observed data (or model results). The data used
in calculating a chi-square statistic must be random, raw, mutually
exclusive, drawn from independent variables, and drawn from a
large enough sample. For example, the results of tossing a coin 100
times meet these criteria.
❖ There are two main kinds of chi-square tests: the test of
independence, which asks a question of relationship, such as, "Is
there a relationship between gender and SAT scores?"; and the
goodness-of-fit test, which asks something like "If a coin is tossed
100 times, will it come up heads 50 times and tails 50 times?"
❖ For these tests, degrees of freedom are utilized to determine if a
certain null hypothesis can be rejected based on the total number of
variables and samples within the experiment.
❖ For example, when considering students and course choice, a
sample size of 30 or 40 students is likely not large enough to
generate significant data. Getting the same or similar results from a
study using a sample size of 400 or 500 students is more valid.
❖ In another example, consider tossing a coin 100 times. The expected
result of tossing a fair coin 100 times is that heads will come up 50
times and tails will come up 50 times. The actual result might be
that heads will come up 45 times and tails will come up 55 times.
The chi-square statistic shows any discrepancies between the
expected results and the actual results.
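For the coin example above, the goodness-of-fit statistic is the sum of (observed − expected)² / expected over the categories. The sketch below applies that to 45 heads and 55 tails. The critical value 3.841 (α = 0.05, 1 degree of freedom) is a standard table value and is an addition here, not something stated in the notes.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [45, 55]   # heads, tails actually seen
expected = [50, 50]   # heads, tails expected from a fair coin

chi2 = chi_square(observed, expected)   # 25/50 + 25/50

critical = 3.841      # table value for alpha = 0.05, 1 degree of freedom (assumption)
reject_fair_coin = chi2 > critical

print(chi2, reject_fair_coin)
```

A statistic of 1.0 is well below the critical value, so a 45/55 split gives no reason to reject the hypothesis that the coin is fair.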
o F-test
❖ An F statistic is a value you get when you run an ANOVA test or a
regression analysis to find out if the means between two
populations are significantly different. It’s similar to a t statistic
from a t-test: a t-test will tell you if a single variable is statistically
significant and an F-test will tell you if a group of variables are jointly
significant.
❖ Simply put, if you have a significant result, it means that your results
likely did not happen by chance. If you don’t have statistically
significant results, your test data does not show an effect;
in other words, you can’t reject the null hypothesis.
❖ You can use the F statistic when deciding to support or reject the
null hypothesis. In your F test results, you’ll have both an F value
and an F critical value.
▪ The value you calculate from your data is called the F statistic
or F value (without the “critical” part).
▪ The F critical value is the specific value you compare your
F value to.
❖ In general, if your calculated F value in a test is larger than your F
critical value, you can reject the null hypothesis. However, the statistic is
only one measure of significance in an F Test. You should also
consider the p value. The p value is determined by the F statistic and
is the probability of getting results at least this extreme by chance
if the null hypothesis is true.
❖ The F statistic must be used in combination with the p value when
you are deciding if your overall results are significant. Why? If you
have a significant result, it doesn’t mean that all your variables are
significant. The statistic is just comparing the joint effect of all the
variables together.
❖ For example, if you are using the F Statistic in regression analysis
(perhaps for a change in R Squared, the Coefficient of
Determination), you would use the p value to get the “big picture.”
1. If the p value is less than the alpha level, go to Step 2
(otherwise your results are not significant, and you cannot
reject the null hypothesis). A common alpha level for tests is
0.05.
2. Study the individual p values to find out which of the
individual variables are statistically significant.
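To make the comparison of a calculated F value with an F critical value concrete, here is a rough sketch of a one-way ANOVA F statistic computed by hand. The three groups are hypothetical data, the function name `one_way_f` is made up for this example, and the critical value 5.14 (α = 0.05 with 2 and 6 degrees of freedom) is a standard table value that does not appear in the notes.

```python
def one_way_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical measurements from three treatment groups
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

f_value = one_way_f(groups)
f_critical = 5.14                  # table value, alpha = 0.05, df = (2, 6) (assumption)
reject_null = f_value > f_critical

print(round(f_value, 2), reject_null)
```

Here the F value is far larger than the critical value, so the group means are unlikely to be equal. As the steps above say, this only shows a joint effect; it does not say which group differs.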
➢ Bivariate Correlation
o Correlation Coefficient
❖ The Pearson product-moment correlation coefficient is a measure of
the strength of the linear relationship between two variables. It is
referred to as Pearson's correlation or simply as the correlation
coefficient. If the relationship between the variables is not linear,
then the correlation coefficient does not adequately represent the
strength of the relationship between the variables.
❖ The symbol for Pearson's correlation is "ρ" when it is measured in
the population and "r" when it is measured in a sample. Because we
will be dealing almost exclusively with samples, we will use r to
represent Pearson's correlation unless otherwise noted.
❖ Pearson's r can range from -1 to 1. An r of -1 indicates a perfect
negative linear relationship between variables, an r of 0 indicates no
linear relationship between variables, and an r of 1 indicates a
perfect positive linear relationship between variables.
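A small sketch of Pearson's r may help fix the three reference points above (−1, 0 and 1). The function `pearson_r` and all data values are illustrative assumptions. The last data set is deliberately non-linear (y = x²) to show the limitation mentioned earlier: the variables are perfectly related, yet r is 0.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation: covariance scaled by both spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

r_positive = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])   # y rises with x
r_negative = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])   # y falls as x rises
r_curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])    # y = x squared

print(r_positive, r_negative, r_curved)
```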
➢ Correlation vs. Causation
❖ Correlation is a statistical measure (expressed as a number) that
describes the size and direction of a relationship between two or
more variables. A correlation between variables, however, does not
automatically mean that the change in one variable is the cause of
the change in the values of the other variable.
❖ Causation indicates that one event is the result of the occurrence of
the other event; i.e. there is a causal relationship between the two
events. This is also referred to as cause and effect.
❖ Theoretically, the difference between the two types of relationships
is easy to identify: an action or occurrence can cause another
(e.g. smoking causes an increase in the risk of developing lung
cancer), or it can correlate with another (e.g. smoking is correlated
with alcoholism, but it does not cause alcoholism). In practice,
however, it remains difficult to clearly establish cause and effect,
compared with establishing correlation.
❖ The objective of much research or scientific analysis is to identify the
extent to which one variable relates to another variable. For
example:
▪ Is there a relationship between a person's education level
and their health?
▪ Is pet ownership associated with living longer?
▪ Did a company's marketing campaign increase their product
sales?
❖ These and other questions explore whether a correlation exists
between the two variables. If there is a correlation, this
may guide further research into whether one action
causes the other. Understanding correlation and causality
allows policies and programs that aim to bring about a desired
outcome to be better targeted.
❖ Causality is an area of statistics that is commonly misunderstood
and misused, in the mistaken belief that because the data
shows a correlation there is necessarily an underlying causal
relationship.
❖ The use of a controlled study is the most effective way of
establishing causality between variables. In a controlled study, the
sample or population is split in two, with both groups being
comparable in almost every way. The two groups then receive
different treatments, and the outcomes of each group are assessed.
❖ For example, in medical research, one group may receive a placebo
while the other group is given a new type of medication. If the two
groups have noticeably different outcomes, the different
experiences may have caused the different outcomes.
❖ Due to ethical reasons, there are limits to the use of controlled
studies; it would not be appropriate to use two comparable groups
and have one of them undergo a harmful activity while the other
does not. To overcome this situation, observational studies are
often used to investigate correlation and causation for the
population of interest. The studies can look at the groups'
behaviours and outcomes and observe any changes over time.
❖ The objective of these studies is to provide statistical information to
add to the other sources of information that would be required for
the process of establishing whether or not causality exists between
two variables.
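The controlled-study idea described above (split a comparable sample in two, treat the groups differently, compare outcomes) can be simulated as a rough sketch. Everything here is assumed for illustration: the outcome scale, the sample size of 1000, and a true treatment effect of 5 units.

```python
import random
import statistics

random.seed(1)        # fixed seed so the sketch is repeatable

TRUE_EFFECT = 5.0     # assumed effect of the hypothetical medication

# Hypothetical baseline outcomes for 1000 comparable participants
baseline = [random.gauss(50, 10) for _ in range(1000)]

# Random assignment: shuffle, then split the sample in two
random.shuffle(baseline)
placebo = baseline[:500]
treated = [b + TRUE_EFFECT for b in baseline[500:]]

# With random assignment, the groups differ systematically only in treatment,
# so the difference in mean outcomes estimates the causal effect.
difference = statistics.mean(treated) - statistics.mean(placebo)

print(round(difference, 2))
```

The estimate lands near the assumed effect rather than exactly on it, because chance differences between the two groups remain. An observational study has no such random assignment, which is why it can add evidence but cannot by itself rule out other explanations.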
• Decisions from Data & Evidence
➢ Collective & Individual Decision-Making
o Collective Decision-Making
❖ Group decisions may involve assimilating a huge amount of
information, exploring many different ideas, and drawing on many
strands of experience. The consequences of the right or wrong
decision may be profound for the team and the organization. For
obvious reasons, decisions made in groups can vary considerably
from those undertaken by individuals. It is this potential divergence
in outcomes that makes group decision-making attractive.
❖ Group decision-making is a situation faced when individuals
collectively make a choice from the alternatives before them. The
decision is then no longer attributable to any single individual who is
a member of the group. This is because all the individuals and social
group processes such as social influence contribute to the outcome.
The decisions made by groups are often different from those made
by individuals. There is much debate as to whether this difference
results in decisions that are better or worse.
❖ According to the idea of synergy, decisions made collectively tend to
be more effective than decisions made by a single individual. Factors
that impact other social group behaviors also affect group decisions.
Moreover, when individuals make decisions as part of a group, there
is a tendency to exhibit a bias towards discussing shared information
(i.e. shared information bias), as opposed to unshared information.
❖ Individual decision-making is the process in which an
individual selects the course of action to be followed in the business
from various alternatives, whereas collective decision-making is a
group decision reached by mutual agreement within the group.
❖ Advantages
▪ Groups generate more complete information and knowledge.
▪ By aggregating the resources of several individuals, groups bring more input into the
decision process.
▪ In addition to more input, groups can bring heterogeneity to the decision process. They
offer increased diversity of views.
▪ A group will almost always outperform even the best individual, so groups generate higher
quality decisions.
▪ Groups lead to increased acceptance of solutions. Many decisions fail after the final
choice is made because people don’t accept the solution. Group members who
participated in making a decision are likely to enthusiastically support the decision and
encourage others to accept it.
❖ Disadvantages
▪ Group decisions are time-consuming. They typically take more time to reach a solution
than making the decision alone.
▪ Group decisions are subject to conformity pressures. The desire by group members to be
accepted and considered an asset to the group can result in squashing any overt
disagreement.
▪ Group decisions can be dominated by one or a few members. If this dominant coalition is
composed of low and medium ability members, the group’s overall effectiveness
will suffer.
▪ Group decisions suffer from ambiguous responsibility. In an individual decision, it’s
clear who is accountable for the final outcome. In a group decision, the
responsibility of any single member is watered down.
o Individual Decision-Making
❖ Advantages
▪ An individual generally makes prompt decisions, whereas a group is dominated by
various people, which makes decision-making very time-consuming. Moreover, assembling group
members consumes a lot of time.
▪ Individuals do not escape responsibilities. They are accountable for their acts and
performance, whereas in a group it is not easy to hold any one person accountable for a wrong
decision.
▪ Individual decision-making saves time, money and energy, as individuals generally make prompt and
logical decisions, whereas group decision-making involves a lot of time, money
and energy.
▪ Individual decisions are more focused and rational compared to group decisions.
❖ Disadvantages
▪ A group has the potential to collect more and fuller information than an individual
when making decisions.
▪ An individual making a decision uses only their own intuition and views, whereas a group has
many members, and so many views and approaches, and hence better decision-making.
▪ A group discovers the hidden talent and core competencies of employees of an organization.
▪ An individual will not take every member’s interest into consideration, whereas a group will
take into account the interests of all members of an organization.
➢ Impact of New Data on Established Scientific Ideas
o Gravitational Waves on General Relativity
❖ Relate to physics principles
• Data Modelling
➢ Data Modelling Techniques
o Predictive
❖ Predictive modelling is a process that uses data and statistics to
predict outcomes with data models. These models can be used to
predict anything from sports outcomes and TV ratings to
technological advances and corporate earnings.
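One of the simplest predictive models is a straight line fitted by least squares and then used to predict an unseen value. The sketch below is only an illustration of that idea: the study-hours and test-score figures are hypothetical and deliberately exactly linear, and `fit_line` is a made-up helper name.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied and test score (exactly linear on purpose)
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, scores)

# Use the fitted model to predict the score for 6 hours of study
predicted_score = slope * 6 + intercept

print(slope, intercept, predicted_score)
```

Real data would scatter around the line, and a prediction outside the observed range (here, beyond 5 hours) should be treated with more caution than one inside it.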
o Statistical
❖ A statistical model is a mathematical model that embodies a set of
statistical assumptions concerning the generation of sample data. A
statistical model represents, often in considerably idealized form,
the data-generating process.
o Descriptive
❖ A descriptive model describes a system or other entity and its
relationship to its environment. It is generally used to help specify
and/or understand what the system is, what it does, and how it
does it. A geometric model or spatial model is a descriptive model
that represents geometric and/or spatial relationships.
o Graphical
❖ Using the graph data model, designers describe their system as a
connected graph of nodes and relationships, much as they might do
with ER or object data modelling. Graph data models can be used
for text analysis, creating models that uncover relationships among
data points within documents.
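As a rough sketch of a graph of nodes and relationships used for text analysis, the snippet below stores each relationship as a (node, relationship, node) triple and finds documents linked through a shared topic. The document names, topics, the "mentions" relationship and the helper `related_documents` are all hypothetical.

```python
# Hypothetical graph: documents and topics are nodes,
# and each triple is one "mentions" relationship between them.
edges = [
    ("doc1", "mentions", "gravity"),
    ("doc1", "mentions", "relativity"),
    ("doc2", "mentions", "relativity"),
    ("doc3", "mentions", "statistics"),
]

def related_documents(edges, doc):
    """Documents that share at least one mentioned topic with `doc`."""
    topics = {target for source, _, target in edges if source == doc}
    return sorted({source for source, _, target in edges
                   if target in topics and source != doc})

related = related_documents(edges, "doc1")

print(related)
```

Here doc1 and doc2 are connected because both mention relativity, while doc3 shares no topic with doc1 and is left out.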