bioinformatics

BIOINFORMATICS

CHIRAG THAKKAR (MCA-37)

IIND SEM

• Introduction

• History

• Need for bioinformatics

• Computational evolutionary biology

• Success

• Software and tools

CONTENTS

• Bioinformatics is the application of information

technology to store, organize and analyse the

vast amount of biological data.

• The stored data is available in the form of

sequences and structures of proteins and

nucleic acids (the information carrier).

• The biological information of nucleic acids is

available as sequences while the data of

proteins is available as sequences and

structures

INTRODUCTION

• Sequences are represented in a single dimension, whereas a structure holds the three-dimensional data of the sequence.
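The difference between the two representations can be sketched in Python. This is only an illustration: the three-residue peptide, the `Residue` class and all coordinates are invented for the example and do not come from any real database entry.

```python
import math
from dataclasses import dataclass

# One-dimensional representation: a plain string over the amino-acid alphabet.
sequence = "MKT"  # hypothetical three-residue peptide


@dataclass
class Residue:
    """One residue of a structure: its letter plus a 3-D position."""
    letter: str
    x: float
    y: float
    z: float


# Three-dimensional representation: the same residues with coordinates.
# The coordinates are invented for illustration only.
structure = [
    Residue("M", 0.0, 0.0, 0.0),
    Residue("K", 3.0, 4.0, 0.0),
    Residue("T", 3.0, 4.0, 12.0),
]


def distance(a, b):
    """Euclidean distance between two residues."""
    return math.sqrt((a.x - b.x) ** 2 + (a.y - b.y) ** 2 + (a.z - b.z) ** 2)


# The sequence can be read back from the structure, but not the reverse.
sequence_from_structure = "".join(r.letter for r in structure)
first_to_second = distance(structure[0], structure[1])
```

The structure carries strictly more information: spatial questions such as the distance between two residues cannot be answered from the string alone.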

Biologists

collect molecular data:

DNA & Protein sequences,

gene expression, etc.

Computer scientists

(+Mathematicians, Statisticians, etc.)

Develop tools, software, algorithms

to store and analyze the data.

Bioinformaticians

Study biological questions by

analyzing molecular data

The field of science in which biology, computer science

and information technology merge into a single

discipline.

• Over the 10 years starting from 1981, the following events occurred…

• 579 human genes had been mapped.

• A method for automated DNA sequencing was invented.

• The Human Genome Organisation (HUGO) was founded. It is an international organization of scientists involved in the Human Genome Project.

• The first complete genome was published for the bacterium Haemophilus influenzae (in 1995).

HISTORY

• After 10 years…

• By 1991, a total of 1879 human genes had been

mapped.

• In 1993, Genethon, a human genome research center in France, produced a physical map of the human genome.

• After 3 years…

• Genethon published the final version of the Human Genetic Map. This concluded the first phase of the Human Genome Project.

• Bioinformatics was fuelled by the need to create huge databases.

• Examples are GenBank, EMBL and the DNA Data Bank of Japan (DDBJ).

• They store and compare the DNA sequence data coming from the human genome and other genome sequencing projects.

• Today, bioinformatics also covers protein structure analysis, gene and protein functional information, data from patients, pre-clinical and clinical trials, and the metabolic pathways of numerous species.

• The first bioinformatics databases were constructed

a few years after the first protein sequences began

to become available.

• Today, a huge variety of data resources of different types and sizes is available in the public domain through the Internet (www.ncbi.nlm.nih.gov).

• All of the original databases were organized in a very simple way, with data entries stored in flat files, often as a single large text file. Later on, lookup indexes were added to allow convenient keyword searching of header information.
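The flat-file-plus-index idea can be sketched as follows. This is a minimal illustration, not the format of any particular database: the FASTA-like layout, the entry identifiers, the sequences and the function names are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical flat file: each entry is a ">" header line followed by
# sequence lines. Identifiers and sequences are invented for illustration.
FLAT_FILE = """\
>ID001 human insulin precursor
MALWMRLLPL
>ID002 mouse insulin precursor
MALLVHFLPL
>ID003 human hemoglobin alpha
MVLSPADKTN
"""


def parse_entries(text):
    """Return a list of (header, sequence) tuples from flat-file text."""
    entries = []
    header, seq_lines = None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if header is not None:
                entries.append((header, "".join(seq_lines)))
            header, seq_lines = line[1:].strip(), []
        elif line.strip():
            seq_lines.append(line.strip())
    if header is not None:
        entries.append((header, "".join(seq_lines)))
    return entries


def build_keyword_index(entries):
    """Map each lowercase header word to the positions of matching entries."""
    index = defaultdict(list)
    for position, (header, _) in enumerate(entries):
        for word in set(header.lower().split()):
            index[word].append(position)
    return index


def search(index, entries, keyword):
    """Look a keyword up in the index instead of scanning the whole file."""
    return [entries[i] for i in index.get(keyword.lower(), [])]


entries = parse_entries(FLAT_FILE)
index = build_keyword_index(entries)
human_hits = [header for header, _ in search(index, entries, "human")]
```

The index is built once; each later keyword search is then a dictionary lookup rather than a pass over every entry.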

• Bioinformatics uses many areas of computer

science, statistics, mathematics and engineering to process

biological data.

• Complex machines are used to read in biological data at a much

faster rate than before.

• Analyzing biological data may involve algorithms in artificial

intelligence, soft computing, data mining, image processing,

and simulation.

• The algorithms in turn depend on theoretical foundations such

as discrete mathematics, control theory, system theory, information

theory, and statistics.

• Commonly used software tools and technologies in the field include Java, C#, XML, Perl, C, C++, Python, R, SQL, CUDA, MATLAB, and spreadsheet applications.

• the development of new algorithms (mathematical formulas) and

statistics with which to assess relationships among members of

NEED FOR BIOINFORMATICS

• Evolutionary biology is the study of the origin and descent of species, as well as their change over time. Informatics has assisted evolutionary biologists in this research.

COMPUTATIONAL EVOLUTIONARY

BIOLOGY

• 1) Analysis of gene expression

SUCCESS

• 2) Analysis of regulation

• One can then apply clustering algorithms to that

expression data to determine which genes are

co-expressed
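A minimal sketch of this idea in Python: genes whose expression profiles are strongly correlated are placed in the same group. The grouping used here is a simple correlation-threshold, single-linkage scheme chosen for brevity, not a specific published method; the gene names, expression values and the 0.9 threshold are invented for the example.

```python
import math

# Invented expression levels of four genes across four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.1, 6.0, 4.1, 2.0],
}


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def cluster_coexpressed(profiles, threshold=0.9):
    """Group genes so that each gene correlates above the threshold
    with at least one other member of its cluster (single linkage)."""
    clusters = []
    for gene in profiles:
        # Find every existing cluster this gene is linked to.
        linked = [
            c for c in clusters
            if any(pearson(profiles[gene], profiles[other]) >= threshold
                   for other in c)
        ]
        merged = {gene}
        for c in linked:
            merged |= c
            clusters.remove(c)
        clusters.append(merged)
    return clusters


clusters = cluster_coexpressed(expression)
groups = sorted(sorted(c) for c in clusters)
```

With these invented values, geneA and geneB rise together and geneC and geneD fall together, so two groups are formed. Real analyses normally use established clustering methods and normalised data.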

• 3) Analysis of protein expression

• Bioinformatics is very much involved in making sense of protein microarray and high-throughput (HT) mass spectrometry (MS) data.

• The MS analysis involves the problem of matching large amounts of mass data against predicted masses from protein sequence databases.
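The matching step can be sketched as below. It is a simplified illustration under stated assumptions: peptides are unmodified, observed values are neutral monoisotopic masses, and a fixed tolerance in daltons is used. The peptide list, the observed masses and the function names are invented; the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # one water is added for the free peptide termini


def peptide_mass(peptide):
    """Predicted monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


def match_masses(observed, peptides, tolerance=0.01):
    """For each observed mass, list the database peptides whose
    predicted mass lies within the tolerance (in daltons)."""
    predicted = {p: peptide_mass(p) for p in peptides}
    return {
        mass: [p for p, pm in predicted.items() if abs(pm - mass) <= tolerance]
        for mass in observed
    }


# Hypothetical database peptides and observed masses.
database_peptides = ["GAS", "VLK"]
observed_masses = [233.10, 358.26, 500.00]
matches = match_masses(observed_masses, database_peptides)
```

A production search would also handle charge states, modifications and scoring of ambiguous matches, which this sketch leaves out.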

• 4) Analysis of mutations in cancer

• Bioinformaticians continue to produce

specialized automated systems to manage the

sheer volume of sequence data produced, and

they create new algorithms and software to

compare the sequencing results to the growing

collection of human genome sequences

and germline polymorphisms
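The comparison described above can be sketched in a much reduced form. The example assumes the tumour read is already aligned to the reference, has the same length, and differs only by single-base substitutions; the sequences, the polymorphism set and the function names are invented for illustration.

```python
def find_substitutions(reference, sample):
    """Return (position, reference_base, sample_base) for every mismatch.
    Positions are 1-based; both sequences must already be aligned."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def somatic_candidates(reference, tumour, germline_polymorphisms):
    """Keep only substitutions that are not known germline polymorphisms."""
    return [
        variant
        for variant in find_substitutions(reference, tumour)
        if variant not in germline_polymorphisms
    ]


# Invented example data.
reference = "ACGTACGTAC"
tumour = "ACGAACGTTC"
known_germline = {(9, "A", "T")}  # hypothetical known polymorphism

all_variants = find_substitutions(reference, tumour)
candidates = somatic_candidates(reference, tumour, known_germline)
```

Here two substitutions are found, and the one at position 9 is discarded because it matches a known germline polymorphism, leaving a single candidate somatic mutation.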

• 5) Comparative genomics

• 6) High-throughput image analysis

• Computational technologies are used to

accelerate or fully automate the processing,

quantification and analysis of large amounts of

high-information-content biomedical imagery.

• Benefits: accuracy, objectivity and high speed.

• Open-source bioinformatics software

• Many free and open-source software tools exist, and their number has continued to grow.

• The range of open-source software packages includes titles such as Bioconductor, BioPerl, Biopython, BioJava, BioRuby, Bioclipse, EMBOSS, .NET Bio, Taverna workbench, and UGENE.

• In order to maintain this tradition and create further opportunities, the non-profit Open Bioinformatics Foundation has supported the annual Bioinformatics Open Source Conference (BOSC) since 2000.

SOFTWARE AND TOOLS

• Web services in bioinformatics

• The main advantage is that end users do not have to deal with software and database maintenance overheads.

http://www.ncbi.nlm.nih.gov/
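As an illustration of using such a web service, the sketch below builds a request URL for the NCBI E-utilities `efetch` endpoint, which returns database records over HTTP. The helper name `build_efetch_url` is hypothetical, and the accession `NM_000546` is used only as an example identifier. The code constructs the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, database="nuccore"):
    """Build a URL asking efetch for one record in FASTA text format."""
    params = {
        "db": database,      # which NCBI database to query
        "id": accession,     # record identifier, e.g. an accession number
        "rettype": "fasta",  # requested record type
        "retmode": "text",   # plain-text response
    }
    return EFETCH_URL + "?" + urlencode(params)


url = build_efetch_url("NM_000546")  # example accession, for illustration
```

The record could then be downloaded with `urllib.request.urlopen(url)`, which needs network access. The database itself stays on the provider's servers, so the end user has nothing to install or keep up to date.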


• Bioinformatics workflow management

systems

• A Bioinformatics workflow management

system is a specialized form of a workflow

management system designed specifically to

compose and execute a series of computational

or data manipulation steps, or a workflow, in a

Bioinformatics application.
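The core idea, a named series of steps where each step consumes the previous step's output, can be sketched in a few lines. This is a toy illustration and not how any particular workflow system is implemented: the step functions, the `run_workflow` helper and the input string are all invented for the example.

```python
def clean(raw):
    """Remove whitespace and convert the sequence to upper case."""
    return "".join(raw.split()).upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_content(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def run_workflow(steps, data):
    """Run each (name, function) step in order, feeding every step the
    output of the previous one, and record which steps were executed."""
    log = []
    for name, func in steps:
        data = func(data)
        log.append(name)
    return data, log


workflow = [
    ("clean", clean),
    ("reverse_complement", reverse_complement),
    ("gc_content", gc_content),
]
result, log = run_workflow(workflow, "atg c\ngtac")
```

Dedicated workflow systems add what this sketch lacks, such as graphical composition, provenance tracking and execution on remote resources.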


• Rosalind

• Rosalind is an educational resource and web

project for learning bioinformatics

through problem solving and computer

programming.

• bioinfo.mbb.yale.edu

• www.ncbi.nlm.nih.gov

• bioinformaticsweb.net

• www.oxfordjournals.org

• www.umass.edu

REFERENCES

