Bioinformatics: The Machine Learning Approach, Second Edition

Bibliographic Information
Author : Baldi, Pierre
Author 2 :
Contributor : Brunak, Soren
Publisher : Massachusetts Institute of Technology
City of publication : Massachusetts
Year of publication : 2001
ISBN : 0-262-02506-X
Subject : Bioinformatics - Molecular biology - Computer simulation
Classification : 572.801 13 Bal B
Language : English
Edition : 2nd ed.
Pages : 477 pp.: ill.
Collection Type

E-Book

Library Category

No category

Abstract
Bioinformatics continues to evolve very rapidly, hence the need for a new edition. In the past three years, full-genome sequencing has blossomed with the completion of the sequence of the fly and the first draft of the Human Genome Project. In addition, several other high-throughput/combinatorial technologies, such as DNA microarrays and mass spectrometry, have progressed considerably. Altogether, these high-throughput technologies are capable of rapidly producing terabytes of data that are too overwhelming for conventional biological approaches. As a result, the need for computer/statistical/machine learning techniques is stronger today rather than weaker. In all areas of biological and medical research, the role of the computer has been dramatically enhanced over the last five to ten years. While the first wave of computational analysis did focus on sequence analysis, where many highly important unsolved problems still remain, the current and future needs will in particular concern sophisticated integration of extremely diverse sets of data. Genome-wide gene expression measurements using DNA microarrays are, in essence, a realization of tens of thousands of Northern blots. As a result, computational support in experiment design, processing of results, and interpretation of results has become essential. These developments have greatly widened the scope of bioinformatics.

The main focus of the book is on methods, not on the history of a rapidly evolving field. While we have tried to quote the relevant literature in detail, we have concentrated our main effort on presenting a number of techniques, and perhaps a general way of thinking that we hope will prove useful. We have tried to illustrate each method with a number of results, often but not always drawn from our own practice.

Chapter 1 provides an introduction to sequence data in the context of molecular biology, and to sequence analysis. It contains in particular an overview of genomes and proteomes, the DNA and protein “universes” created by evolution that are becoming available in the public databases. It presents an overview of genomes and their sizes, and other comparative material that, if not original, is hard to find in other textbooks. Chapter 2 is the most important theoretical chapter, since it lays the foundations for all machine-learning techniques, and shows explicitly how one must reason in the presence of uncertainty. It describes a general way of thinking about sequence problems: the Bayesian statistical framework for inference and induction. Chapter 3 is a warm-up chapter that illustrates the general Bayesian probabilistic framework. It develops in some detail a few classical examples, which are used in the following chapters. Chapter 4 contains a brief treatment of many of the basic algorithms required for Bayesian inference, machine learning, and sequence applications, in order to compute expectations and optimize cost functions. These include various forms of dynamic programming, gradient-descent and EM algorithms, as well as a number of stochastic algorithms, such as Markov chain Monte Carlo (MCMC) algorithms.

Chapter 5 provides an introduction to the theory of neural networks. It contains definitions of the basic concepts, a short derivation of the “backpropagation” learning algorithm, as well as a simple proof of the fact that neural networks are universal approximators. Chapter 6 contains a selected list of applications of neural network techniques to sequence analysis problems. Chapter 7 contains a fairly detailed introduction to hidden Markov models (HMMs), and the corresponding dynamic programming algorithms (forward, backward, and Viterbi algorithms) as well as learning algorithms (EM, gradient-descent, etc.). Chapter 8 contains a selected list of applications of hidden Markov models to both protein and DNA/RNA problems. It demonstrates how HMMs can be used, among other things, to model protein families, derive large multiple alignments, classify sequences, and search large databases of complete or fragment sequences.

Chapter 10 presents phylogenetic trees and, consistent with the framework of Chapter 2, the inevitable underlying probabilistic models of evolution. Chapter 11 covers formal grammars and the Chomsky hierarchy. Stochastic grammars provide a new class of models for biological sequences, which generalize both HMMs and the simple dice model. Chapter 12 focuses primarily on the analysis of DNA microarray gene expression data, once again by generalizing the die model. Chapter 13 contains an overview of current database resources and other information that is publicly available over the Internet, together with a list of useful directions to interesting WWW sites and pointers.
Inventory
# Inventory number Available for loan In-stock status
1 8905/P1/2020.c1 Yes
2 8906/P1/2020.c2 Yes
3 8907/P1/2020.c3 Yes
4 8908/P1/2020.c4 Yes
5 8909/P1/2020.c5 Yes
6 8910/P1/2020.c6 Yes