Bioinformatics - Lab 1tm

Text Mining

Tamara Polajnar



The purpose of this lab is to demonstrate the need for text mining (TM) in Systems Biology. TM is a method of extracting knowledge from a large text collection. What counts as useful knowledge depends on the needs of the user. In Systems Biology one of the goals is to represent an organism by a schematic diagram, as a system of interactions between its basic components such as genes and proteins. Many interactions are observed independently of each other by different research teams around the world. Some interactions may be a byproduct of research conducted for a specific purpose not directly related to the mission of mapping all the interactions in the organism.

Let us for the moment concentrate on interactions between proteins. Proteins are products of genes. Each protein has specific purposes in the cell; proteins are like intelligent agents performing tasks in the cell. Proteins can influence each other by binding, inhibiting, activating, etc. Interacting proteins can form networks (pathways). Pathways are classified according to their purpose as metabolic, signalling, transcription, etc.

In the first part of this lab we will examine one signalling pathway. This pathway is a small part of a larger network involved in regulating cell division. If the pathway is damaged, it can lead to uncontrolled cell division, which is a cause of cancer. Thus this pathway is highly important to cancer researchers. We will use available medical research tools to explore different aspects of this pathway.


You are expected to spend roughly half the practical on each of the parts, so please move on to the second part at half past the hour.



The tools which we will use in this lab are:




The RKIP pathway, shown in the image below, is fully described in the paper by Cho et al. It is essential to understanding how cancer works. Researchers perform experiments on cells containing this pathway by removing or altering parts of the pathway, stimulating the receptor proteins which initialise cascade reactions, or adding drugs. The structure and function of this pathway are slowly revealed through many such experiments. The RKIP pathway is formed from the part of the MAPK pathway that contains Raf-1, through the interaction of Raf-1 with the Raf kinase inhibitory protein (RKIP).


The RKIP pathway.


Raf-1, RKIP, ERK-PP, etc. are all proteins whose interactions are represented by this pathway. Raf-1/RKIP denotes a product of their interaction, whereas the PP in MEK-PP or ERK-PP denotes the activated form of the protein. This means that ERK-PP is not, in the strictest sense, synonymous with ERK. However, there is no standard way of writing this name, so researchers may also write it as pERK or ERKPP.
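
The naming problem described above can be made concrete with a small sketch. The Python below maps the name variants mentioned in this handout to one canonical label by lower-casing each name and removing punctuation. The table `CANONICAL` and the functions `normalise_key` and `canonical_name` are hypothetical names introduced only for illustration; they are not part of any of the tools used in this lab, and a real table would be filled in from the synonyms you collect in Part 1.

```python
import re
from typing import Optional

# Hypothetical synonym table (an assumption for illustration only).
# Keys are normalised names; values are the labels used in this handout.
CANONICAL = {
    "erkpp": "ERK-PP",  # covers "ERK-PP" and "ERKPP"
    "perk": "ERK-PP",   # covers "pERK"
    "mekpp": "MEK-PP",
    "raf1": "Raf-1",
    "rkip": "RKIP",
}


def normalise_key(name: str) -> str:
    """Lower-case a name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def canonical_name(name: str) -> Optional[str]:
    """Return the canonical label for a name variant, or None if unknown."""
    return CANONICAL.get(normalise_key(name))


for variant in ["ERK-PP", "pERK", "ERKPP", "ERK"]:
    print(variant, "->", canonical_name(variant))
```

Plain ERK is deliberately absent from the table, so it is not merged with the activated form. Note also that lower-casing makes pERK indistinguishable from PERK, which is the name of an unrelated kinase; a simple lookup like this one cannot resolve that kind of ambiguity without looking at case or context.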



Part 1


As discussed before, in this part we will use some of the on-line tools to familiarise ourselves with different aspects of this pathway and to gain a full appreciation of the difficulties biologists may face when trying to find specific information using the standard information access tools.

Some of these resources may take a long time to load, in particular the Gene Ontology AmiGO tool, so please try to do some of these searches in parallel in different browser tabs/windows.

Spend roughly 10 minutes per question.
  1. Getting familiar with the pathway. In this exercise locate different components of the pathway and write down their synonyms. Try to understand the function of the pathway by looking at different species where it occurs and by following various links.
    • Open the KEGG site in one browser window. Looking at the whole MAPK pathway, can you see where the RKIP pathway would fit on this network? Which of the components from this network are also in the MAPK pathway? Click on the components to find out more about them. Which pathways share these same components?
    • Open Entrez Gene in another window and search for RKIP. What do you notice about this particular protein? Write down its synonyms.
    • For further information look on The Gene Ontology site, but beware that it takes a while to load.
  2. Reading about the pathway.
    • Use PubMed and Web of Knowledge to do various searches about the pathway, using different kinds of keywords which you think would get you more information about the components or the pathway from literature.
  3. Using specialist websites.
    • Go to EBIMed and try searching for the components. What kinds of results are you getting?
    • Pick five components and search for them using ChiliBot. How does the resulting map compare with the original pathway? On the left, click on Edit Synonyms. Add your own synonyms. Try searching again.
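
When repeating the searches above with several synonyms, it can help to build the query string systematically. The following is a minimal sketch, assuming the search tool accepts quoted phrases combined with upper-case AND and OR, as PubMed does; the other tools may use a different syntax. The function names `or_group` and `build_query` are hypothetical and serve only as illustration.

```python
def or_group(synonyms):
    """Quote each synonym and join the set with OR inside parentheses."""
    quoted = ['"{}"'.format(s) for s in synonyms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*synonym_sets):
    """Combine one OR-group per component with AND, skipping empty sets."""
    return " AND ".join(or_group(s) for s in synonym_sets if s)


# Name variants taken from this handout; extend with the synonyms you find.
query = build_query(["ERK-PP", "pERK", "ERKPP"], ["RKIP"])
print(query)
# prints: ("ERK-PP" OR "pERK" OR "ERKPP") AND ("RKIP")
```

Each component contributes one group of alternatives, so a document matches if it mentions at least one variant of every component. Compare the number of results with and without the extra synonyms to see how much a single spelling misses.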