These notes are about non-trivial sequence similarities that cannot easily be found with standard database search methods. They form an introductory course for students and for biologists who want to know more about a particular sequence. The notes deal primarily with protein sequences, but the concepts and most of the methods are applicable to nucleic acid sequences as well.
The experience of previous courses has shown that the main problem for a biologist is to understand the seemingly complicated and vague terminology of this young field. This is the subject of the first part of this course.
In Chapter II, which follows, we describe the most commonly used concepts of distant similarity searching. In the subsequent chapter we describe the principles of the main strategies, with pointers to servers available on the Internet. In the last section we give a practical exercise that you can follow either through a "built-in" example or with your own sequence. It consists of finding homologs, then building and verifying a new pattern. For each problem, WWW links lead you to a server dedicated to solving it.
So, how should you read this tutorial? First of all, with patience. Some parts may be too elementary for you; others may be too jargon-ridden. We are trying to improve this, and you can help by sending suggestions.
Note: The boxes are links to detailed explanations on a subject. The boxes are literature references and short notes. Click on the figures to get an enlarged and readable form. At the bottom of each document you will see a navigation bar (e.g. here below). Clicking on it will take you directly to the main chapter indicated.