The Human Genome Project
Vimal Jayaprakash & Sandhya.K
S1S2 Biotechnology and Biochemical Engineering
Mohandas College of Engineering & Technology
Nedumangad

Abstract
The Human Genome Project (HGP) is an international scientific research project whose primary goals are to
determine the sequence of chemical base pairs that make up DNA and to identify and map the approximately
20,000–25,000 genes of the human genome from both a physical and a functional standpoint. The project began in
1990 and was initially headed by the Office of Biological and Environmental Research in the U.S. Department of
Energy's Office of Science. The best estimates of total genome size indicate that about 92.3% of the genome has
been completed, and the centromeres and telomeres are likely to remain unsequenced until new technology is
developed that facilitates their sequencing. Most of the remaining DNA is highly repetitive and unlikely to
contain genes, but this cannot be known for certain until it is entirely sequenced. The roles of junk DNA, the
evolution of the genome, the differences between individuals, and many other questions are still the subject of
intense interest in laboratories all over the world. This mega project is coordinated by the U.S. Department of
Energy and the National Institutes of Health. During the early years of the project, the Wellcome Trust (U.K.)
became a major partner, and other countries such as Japan, Germany, China and France contributed significantly.
It is anticipated that detailed knowledge of the human genome will provide new avenues for advances in medicine
and biotechnology. The project's goals included not only identifying all of the approximately 24,000 genes in the
human genome, but also addressing the ethical, legal, and social issues (ELSI) that might arise from the
availability of genetic information. Five percent of the annual budget was allocated to addressing these issues.
Keywords: HGP (Human Genome Project), ELSI (Ethical, Legal and Social Issues)

Introduction
The Human Genome Project (HGP) is an
international scientific research project whose
primary goals are to determine the sequence of
chemical base pairs that make up DNA and to
identify and map the approximately 20,000–25,000
genes of the human genome from both a physical and
a functional standpoint.
The project began in 1990 and was headed by the
Office of Biological and Environmental Research in
the U.S. Department of Energy's Office of Science.
Francis Collins directed the National Institutes of
Health National Human Genome Research Institute
efforts. A working draft of the genome was
announced in 2000 and a complete one in 2003, with
further, more detailed analysis still being published.
A parallel project was conducted outside of
government by the Celera Corporation, which was
formally launched in 1998. Most of the government-
sponsored sequencing was performed in universities
and research centers from the United States, the
United Kingdom, Japan, France, Germany, and
China. The mapping of human genes is an important
step in the development of medicines and other
aspects of health care.
While the objective of the Human Genome Project is
to understand the genetic makeup of the human
species, the project has also focused on several
nonhuman organisms such as E. coli, the fruit fly,
and the laboratory mouse. It remains one of the
largest single investigative projects in modern
science. The Human Genome Project is called a mega
project mainly because of its scale. The human
genome has approximately 3 × 10^9 bp; at a
sequencing cost of US $3 per bp (the estimate at the
beginning of the project), the total cost would be
9 billion US dollars. Further, if the sequence were
stored in typed form in books, with 1000 letters per
page and 1000 pages per book, then about 3000 such
books (3300 for a genome of 3.3 × 10^9 bp) would
be required to store the DNA sequence from a single
human cell. The enormous amount of data expected
also necessitated the use of high-speed computational
devices for data storage, retrieval, and analysis.
The Human Genome Project originally aimed to map
the nucleotides contained in a human haploid
reference genome (more than three billion). Several
groups have announced efforts to extend this to
diploid human genomes including the International
HapMap Project, Applied Biosystems, Perlegen,
Illumina, JCVI, Personal Genome Project, and
Roche-454.
The "genome" of any given individual (except for
identical twins and cloned organisms) is unique;
mapping "the human genome" involves sequencing
multiple variations of each gene. The project did not
study the entire DNA found in human cells; some
heterochromatic areas (about 8% of the total genome)
remain un-sequenced.
Background
The project grew out of several years of work
supported by the United States Department of
Energy. A 1987 report stated
boldly, "The ultimate goal of this initiative is to
understand the human genome" and "knowledge of
the human as necessary to the continuing progress of
medicine and other health sciences as knowledge of
human anatomy has been for the present state of
medicine." Candidate technologies were already
being considered for the proposed undertaking at
least as early as 1985.
The $3-billion project was formally founded in 1990
by the United States Department of Energy and the
U.S. National Institutes of Health, and was expected
to take 15 years. In addition to the United States, the
international consortium comprised geneticists in the
United Kingdom, France, Germany, Japan, China,
and India.
Due to widespread international cooperation and
advances in the field of genomics (especially in
sequence analysis), as well as major advances in
computing technology, a 'rough draft' of the genome
was finished in 2000 (announced jointly by then US
president Bill Clinton and the British Prime Minister
Tony Blair on June 26, 2000). Ongoing sequencing
led to the announcement of the essentially complete
genome in April 2003, 2 years earlier than planned.
In May 2006, another milestone was passed on the
way to completion of the project, when the sequence
of the last chromosome was published.
History
In 1976, the genome of the RNA virus bacteriophage
MS2 became the first complete genome to be
determined, by Walter Fiers and his team at the
University of Ghent (Ghent, Belgium). The idea for
the shotgun technique came from the use of an
algorithm that combined sequence information from
many small fragments of DNA to reconstruct a
genome. This technique was pioneered by Frederick
Sanger to sequence the genome of phage ΦX174, a
virus (bacteriophage) that primarily infects
bacteria; in 1977 it became the first fully
sequenced DNA genome. The technique was called
shotgun sequencing because the genome was broken
into millions of pieces, as if it had been blasted
with a shotgun. To scale up the method, both the
sequencing and the genome assembly had to be
automated, as they were in the 1980s.
Those techniques were shown to be applicable to the
sequencing of the first free-living bacterial genome
(1.8 million base pairs), that of Haemophilus
influenzae, in 1995, and of the first animal genome
(~100 Mbp). This involved automated sequencers and
longer individual reads, approximately 500 base
pairs at that time. Paired sequences separated by a
fixed distance of around 2000 base pairs were
critical elements enabling the development of the first
genome assembly programs for reconstructing
large regions of genomes (known as 'contigs').
Three years later, in 1998, the announcement by the
newly formed Celera Genomics that it would scale
up the pairwise end sequencing method to the human
genome was greeted with skepticism in some circles.
The shotgun technique breaks the DNA into
fragments of various sizes, ranging from 2,000 to
300,000 base pairs in length, forming what is called a
DNA "library". Using an automated DNA sequencer,
the DNA is read in 800 bp lengths from both ends of
each fragment. Using a complex genome assembly
algorithm and a supercomputer, the pieces are
combined and the genome is reconstructed from
the millions of short, 800 base pair reads.
The success of both the publicly and the privately
funded efforts hinged upon a new, more highly
automated capillary DNA sequencing machine, the
Applied Biosystems 3700, which ran the DNA samples
through an extremely fine capillary tube rather than a
flat gel. Even more critical was the development of a
new, larger-scale genome assembly program that
could handle the 30–50 million sequences that would
be required to sequence the entire human genome
with this method. At the time, no such program
existed. One of the first major projects at Celera
Genomics was the development of this assembler,
which was written in parallel with the construction of
a large, highly automated genome sequencing
factory. Development of the assembler was led by
Gene Myers. The first version of this assembler was
demonstrated in 2000, when the Celera team joined
forces with Professor Gerald Rubin to sequence the
fruit fly Drosophila melanogaster using the
whole-genome shotgun method. At 130 million base
pairs, it was at least 10 times larger than any genome
previously shotgun assembled. One year later, the
Celera team published their assembly of the three
billion base pair human genome.
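
The idea of rebuilding a sequence from overlapping fragments can be sketched in a few lines of Python. This is only a toy illustration, assuming error-free reads and a sequence without repeats; the function names (overlap, greedy_assemble) and the example sequence are invented for this sketch, and it is not the algorithm used by Celera or by the public project.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no such overlap of at least min_len characters exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=4):
    """Repeatedly merge the two pieces with the largest overlap.

    Toy assumption: reads contain no errors and the source sequence has
    no repeated stretch of min_len letters, so every overlap is genuine.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing left to join; several contigs remain
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]
    return contigs


if __name__ == "__main__":
    genome = "ATGCGTACCTGAAGTCCGATTGCA"          # made-up 24-letter example
    reads = [genome[i:i + 10] for i in range(0, 15, 2)]
    print(greedy_assemble(reads[::-1]))
```

Real genomes contain long repeats and real reads contain errors, which is why the paired-end information and the large-scale assembler described above were needed.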
The Human Genome Project was a 13-year mega
project, launched in 1990 and completed in 2003.
It is closely associated with the branch of
biology called bioinformatics. The Human Genome
Project international consortium announced the
publication of a draft sequence and analysis of
the human genome, the genetic blueprint for the
human being. Both the American company Celera,
led by Craig Venter, and the large international
collaboration of distinguished scientists led by
Francis Collins, director of the National Human
Genome Research Institute, U.S., published
their findings.
This mega project is coordinated by the U.S.
Department of Energy and the National Institutes of
Health. During the early years of the project, the
Wellcome Trust (U.K.) became a major partner, and
other countries such as Japan, Germany, China and
France contributed significantly. Already the atlas
has revealed some startling facts. The two factors
that made this project a success are:
1. Genetic engineering techniques, with
which it is possible to isolate and clone any
segment of DNA.
2. The availability of simple and fast technologies
for determining DNA sequences.
Being the most complex organisms, human beings
were expected to have more than 100,000 genes, the
combinations of DNA that provide commands for
every characteristic of the body. Instead, studies
show that humans have only about 30,000 genes:
around the same as mice, three times as many as
flies, and only five times as many as bacteria.
Scientists reported that not only are the numbers
similar, the genes themselves, barring a few, are
alike in mice and men.
In a companion volume to the Book of Life, scientists
have created a catalogue of 1.4 million single-letter
differences, or single-nucleotide polymorphisms
(SNPs) – and specified their exact locations in the
human genome. This SNP map, the world's largest
publicly available catalogue of SNPs, promises to
revolutionize both mapping diseases and tracing
human history. The sequence information from the
consortium has been immediately and freely released
to the world, with no restrictions on its use or
redistribution. The information is scanned daily by
scientists in academia and industry, as well as
commercial database companies, providing key
information services to bio-technologists. Already,
many genes have been identified from the genome
sequence, including more than 30 that play a direct
role in human diseases. By dating the three million
repeat elements and examining the pattern of
interspersed repeats on the Y chromosome, scientists
estimated the relative mutation rates in the X and the
Y chromosomes and in the male and the female germ
lines. They found that the ratio of mutations in males
versus females is 2:1. Scientists point to several possible
reasons for the higher mutation rate in the male germ
line, including the fact that there are a greater number
of cell divisions involved in the formation of sperm
than in the formation of eggs.
State Of Completion
There are multiple definitions of the "complete
sequence of the human genome". According to some
of these definitions, the genome has already been
completely sequenced, and according to other
definitions, the genome has yet to be completely
sequenced. The genome has been completely
sequenced using the definition employed by the
International Human Genome Project. A graphical
history of the human genome project shows that most
of the human genome was complete by the end of
2003. However, there are a number of regions of the
human genome that can be considered unfinished:
• First, the central regions of each
chromosome, known as centromeres, are
highly repetitive DNA sequences that are
difficult to sequence using current
technology. The centromeres are millions
(possibly tens of millions) of base pairs long
and for the most part these are entirely un-
sequenced.
• Second, the ends of the chromosomes, called
telomeres, are also highly repetitive, and for
most of the 46 chromosome ends these too
are incomplete. It is not known precisely
how much sequence remains before the
telomeres of each chromosome are reached,
but as with the centromeres, current
technological constraints are prohibitive.
• Third, there are several loci in each
individual's genome that contain members of
multigene families that are difficult to
disentangle with shotgun sequencing
methods – these multigene families often
encode proteins important for immune
functions.
• Other than these regions, there remain a few
dozen gaps scattered around the genome,
some of them rather large, but there is hope
that all these will be closed in the next
couple of years.
In summary: the best estimates of total genome size
indicate that about 92.3% of the genome has been
completed and it is likely that the centromeres and
telomeres will remain un-sequenced until new
technology is developed that facilitates their
sequencing. Most of the remaining DNA is highly
repetitive and unlikely to contain genes, but it cannot
be truly known until it is entirely sequenced.
Understanding the functions of all the genes and their
regulation is far from complete. The roles of junk
DNA, the evolution of the genome, the differences
between individuals, and many other questions are
still the subject of intense interest by laboratories all
over the world.
Goals
The sequence of the human DNA is stored in
databases available to anyone on the Internet. The
U.S. National Center for Biotechnology Information
(and sister organizations in Europe and Japan) house
the gene sequence in a database known as GenBank,
along with sequences of known and hypothetical
genes and proteins. Other organizations such as the
University of California, Santa Cruz, and
Ensembl present additional data and annotation and
powerful tools for visualizing and searching it.
Computer programs have been developed to analyze
the data, because the data itself is difficult to interpret
without such programs.
The process of identifying the boundaries between
genes and other features in a raw DNA sequence is
called genome annotation and is the domain of
bioinformatics. While expert biologists make the best
annotators, their work proceeds slowly, and computer
programs are increasingly used to meet the high-
throughput demands of genome sequencing projects.
The best current technologies for annotation make
use of statistical models that take advantage of
parallels between DNA sequences and human
language, using concepts from computer science such
as formal grammars.
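
As a rough illustration of what "finding boundaries" in a raw sequence means, the sketch below scans one strand for open reading frames: stretches that begin with the start codon ATG and run, in steps of three letters, to a stop codon. This is a deliberately simple stand-in, not the statistical or grammar-based models mentioned above; the function name find_orfs and the min_codons threshold are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) positions of ATG...stop stretches on one strand.

    Positions are 0-based and end is exclusive, so seq[start:end] is the
    reading frame including its stop codon. Only the forward strand is
    scanned; real annotation tools also check the reverse complement.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next start codon
    return sorted(orfs)


if __name__ == "__main__":
    example = "CCATGAAATTTTAGGG"               # made-up sequence
    for start, end in find_orfs(example):
        print(start, end, example[start:end])
```

Human genes are interrupted by introns and surrounded by regulatory regions, so a scan like this would miss most of them; it only shows the kind of pattern matching that annotation programs build upon.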
Another, often overlooked, goal of the HGP is the
study of its ethical, legal, and social implications. It is
important to research these issues and find the most
appropriate solutions before they become large
dilemmas whose effect will manifest in the form of
major political concerns.
All humans have unique gene sequences. Therefore
the data published by the HGP does not represent the
exact sequence of each and every individual's
genome. It is the combined "reference genome" of a
small number of anonymous donors. The HGP genome is a scaffold for future work in identifying
differences among individuals. Most of the current
effort in identifying differences among individuals
involves single-nucleotide polymorphisms and the
HapMap.
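
A single-nucleotide polymorphism is simply a position where two sequences differ by one letter. The following minimal sketch lists such positions, assuming the two sequences are already aligned and of equal length (no insertions or deletions); the function name find_snps and the example sequences are hypothetical.

```python
def find_snps(reference, sample):
    """List (position, reference_base, sample_base) for every mismatch.

    Assumes both sequences are pre-aligned and have the same length;
    insertions and deletions are outside the scope of this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [(pos, ref, alt)
            for pos, (ref, alt) in enumerate(zip(reference, sample))
            if ref != alt]


if __name__ == "__main__":
    print(find_snps("ACGTACGT", "ACGTTCGT"))
```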
Findings
Key findings of the draft (2001) and complete (2004)
genome sequences include:
1. There are approximately 20,500 genes in human
beings, the same range as in mice and twice that of
roundworms. Understanding how these genes express
themselves will provide clues to how diseases are
caused.
2. Between 1.1% and 1.4% of the genome's sequence
codes for proteins.
3. The human genome has significantly more
segmental duplications (nearly identical, repeated
sections of DNA) than other mammalian genomes.
These sections may underlie the creation of new
primate-specific genes.
4. At the time when the draft sequence was published,
fewer than 7% of protein families appeared to be
vertebrate-specific.
How It Was Accomplished
[Figure: The first printout of the human genome to be
presented as a series of books, displayed at the
Wellcome Collection, London.]
The Human Genome Project was started in 1990 with
the goal of sequencing and identifying all three
billion chemical units in the human genetic
instruction set, finding the genetic roots of disease
and then developing treatments. With the sequence in
hand, the next step was to identify the genetic
variants that increase the risk for common diseases
like cancer and diabetes.
It was far too expensive at that time to think of
sequencing patients’ whole genomes. So the National
Institutes of Health embraced the idea for a
"shortcut", which was to look just at sites on the
genome where many people have a variant DNA unit.
The theory behind the shortcut was that since the
major diseases are common, so too would be the
genetic variants that caused them. Natural selection
keeps the human genome free of variants that damage
health before children are grown, the theory held, but
fails against variants that strike later in life, allowing
them to become quite common. (In 2002 the National
Institutes of Health started a $138 million project
called the HapMap to catalog the common variants in
European, East Asian and African genomes.)
The genome was broken into smaller pieces,
approximately 150,000 base pairs in length. These
pieces were then ligated into a type of vector known
as "bacterial artificial chromosomes", or BACs,
which are derived from bacterial chromosomes that
have been genetically engineered. The vectors
containing the fragments can be inserted into bacteria,
where they are copied by the bacterial DNA
replication machinery. Each of these pieces was then
sequenced separately as a small "shotgun" project
and then assembled. The larger, 150,000-base-pair
pieces are then put together to reconstruct the
chromosomes. This is known as
the "hierarchical shotgun" approach, because the
genome is first broken into relatively large chunks,
which are then mapped to chromosomes before being
selected for sequencing.
Funding came from the US government through the
National Institutes of Health, and from
a UK charity organization, the Wellcome Trust, as
well as numerous other groups from around the
world. The funding supported a number of large
sequencing centers including those at Whitehead
Institute, the Sanger Centre, Washington University
in St. Louis, and Baylor College of Medicine.
The Human Genome Project is considered a mega
project because the human genome has
approximately 3.3 billion base pairs; if the cost of
sequencing is US $3 per base pair, then the
approximate cost will be US $10 billion.
If the sequence obtained were to be stored in book
form, with 1000 base pairs recorded on each page
and 1000 pages in each book, then
3300 such books would be needed to store
the complete genome. However, if expressed in units
of computer data storage, 3.3 billion base pairs
recorded at 2 bits per pair would equal 825 million
bytes (about 786 MiB) of raw data. This is
comparable to a fully loaded data CD.
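
The figures in the last two paragraphs can be checked with a short calculation. The sketch below uses only the numbers given in the text (3.3 billion base pairs, US $3 per base pair, 1000 letters per page, 1000 pages per book, 2 bits per base); the function names and the particular 2-bit code for A, C, G and T are choices made for this illustration.

```python
GENOME_BP = 3_300_000_000      # base pairs, figure from the text
COST_PER_BP_USD = 3            # early cost estimate from the text
LETTERS_PER_PAGE = 1000
PAGES_PER_BOOK = 1000
BITS_PER_BASE = 2              # four bases fit in two bits

# One possible 2-bit code; the assignment of bits to bases is arbitrary.
BASE_CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def project_estimates(genome_bp=GENOME_BP):
    """Cost, number of books and raw storage size for a genome."""
    size_bytes = genome_bp * BITS_PER_BASE // 8
    return {
        "cost_usd": genome_bp * COST_PER_BP_USD,
        "books": genome_bp // (LETTERS_PER_PAGE * PAGES_PER_BOOK),
        "bytes": size_bytes,
        "mib": size_bytes / 2**20,
    }


def pack_2bit(seq):
    """Pack a DNA string into bytes, four bases per byte."""
    value = 0
    for base in seq:
        value = (value << BITS_PER_BASE) | BASE_CODE[base]
    return value.to_bytes((BITS_PER_BASE * len(seq) + 7) // 8, "big")


if __name__ == "__main__":
    for name, number in project_estimates().items():
        print(name, number)
    print(pack_2bit("ACGT"))
```

At US $3 per base pair the exact product is US $9.9 billion, which the text rounds to 10 billion. The raw size comes to 825 million bytes, which is about 786 when counted in units of 2^20 bytes, the 786 megabyte figure.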
Public Versus Private Approaches
In 1998, a similar, privately funded quest was
launched by the American researcher Craig Venter,
and his firm Celera Genomics. Venter was a scientist
at the NIH during the early 1990s when the project
was initiated. The $300,000,000 Celera effort was
intended to proceed at a faster pace and at a fraction
of the cost of the roughly $3 billion publicly funded
project.
Celera used a technique called whole genome
shotgun sequencing, employing pairwise end
sequencing, which had been used to sequence
bacterial genomes of up to six million base pairs in
length, but not for anything nearly as large as the
three billion base pair human genome.
Celera initially announced that it would seek patent
protection on "only 200–300" genes, but later
amended this to seeking "intellectual property
protection" on "fully-characterized important
structures" amounting to 100–300 targets. The firm
eventually filed preliminary ("place-holder") patent
applications on 6,500 whole or partial genes. Celera
also promised to publish their findings in accordance
with the terms of the 1996 "Bermuda Statement," by
releasing new data annually (the HGP released its
new data daily), although, unlike the publicly funded
project, they would not permit free redistribution or
scientific use of the data. The publicly funded
competitor UC Santa Cruz was compelled to publish
the first draft of the human genome before Celera for
this reason. On July 7, 2000, the UCSC Genome
Bioinformatics Group released a first working draft
on the web. The scientific community downloaded
one-half trillion bytes of information from the UCSC
genome server in the first 24 hours of free and
unrestricted access to the first ever assembled
blueprint of our human species.
In March 2000, President Clinton announced that the
genome sequence could not be patented, and should
be made freely available to all researchers. The
statement sent Celera's stock plummeting and
dragged down the biotechnology-heavy Nasdaq. The
biotechnology sector lost about $50 billion in market
capitalization in two days.
Although the working draft was announced in June
2000, it was not until February 2001 that Celera and
the HGP scientists published details of their drafts.
Special issues of Nature (which published the
publicly funded project's scientific paper) and
Science (which published Celera's paper) described
the methods used to produce the draft sequence and
offered analysis of the sequence. These drafts
covered about 83% of the genome (90% of the
euchromatic regions with 150,000 gaps and the order
and orientation of many segments not yet
established). In February 2001, at the time of the joint
publications, press releases announced that the
project had been completed by both groups.
Improved drafts were announced in 2003 and 2005,
filling in approximately 92% of the sequence to date.
The competition proved to be very good for the
project, spurring the public groups to modify their
strategy in order to accelerate progress. The rivals at
UC Santa Cruz initially agreed to pool their data, but
the agreement fell apart when Celera refused to
deposit its data in the unrestricted public database
GenBank. Celera had incorporated the public data
into their genome, but forbade the public effort to use
Celera data.
The HGP is the best known of many international
genome projects aimed at sequencing the DNA of a
specific organism. While the human DNA sequence
offers the most tangible benefits, important
developments in biology and medicine are predicted
as a result of the sequencing of model organisms,
including mice, fruit flies, zebrafish, yeast,
nematodes, plants, and many microbial organisms
and parasites.
In 2004, researchers from the International Human
Genome Sequencing Consortium (IHGSC) of the
HGP announced a new estimate of 20,000 to 25,000
genes in the human genome. Previously 30,000 to
40,000 had been predicted, while estimates at the
start of the project ranged as high as
2,000,000. The number continues to fluctuate, and it
is now expected that it will take many years to agree
on a precise value for the number of genes in the
human genome.
