The trouble with proteins…
Unlike DNA, with its four basic molecules, proteomics is not so simple: one gene does not produce just one protein. Genes (pages) are made up of a series of sub-structures called exons (paragraphs), which can be combined in different ways to give rise to a whole series of very similar but distinct proteins. To complicate things further, once proteins are made they are decorated with various other chemicals such as phosphates, sugars or fats. These decorations drastically affect the function of the protein; for example, phosphate usually acts as an on-off switch, and sugars can tell the protein where to go in the cell. By comparison, determining the sequence of the human genome was simple, since there are only 46 molecules, albeit huge ones, made up of 4 building blocks or letters.
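To get a feel for how quickly exon combinations add up, here is a toy sketch in Python. It assumes, purely for illustration, that any non-empty selection of exons kept in their original order counts as a variant; real splicing is far more constrained. The exon labels and the function name `splice_variants` are made up for this example.

```python
from itertools import combinations


def splice_variants(exons):
    """Return every non-empty selection of exons, kept in their original order.

    Toy model only: real cells do not produce every possible combination.
    """
    variants = []
    for k in range(1, len(exons) + 1):
        # combinations() preserves the input order, like exons along a gene
        for combo in combinations(exons, k):
            variants.append("-".join(combo))
    return variants


exons = ["E1", "E2", "E3", "E4"]  # hypothetical exon labels
variants = splice_variants(exons)
print(len(variants))  # 2**4 - 1 = 15 possible selections
```

Even in this simplified picture, a gene with only four exons could in principle give 15 different products, and each extra exon roughly doubles that number.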
The amino acids
Proteins have 20 building blocks, each of which can be modified or decorated after the protein is built. Hence the study of the protein version of the genome, the 'proteome', must deal with 20,000 genes that can be arranged to give some 1,000,000 proteins, which in turn can be decorated with over 300 different chemicals. Not only that: proteomics, the study of the proteome, must also define which proteins are being produced in a certain type of cell at a specific time, how they are modified, where they are in the cell, what they are in contact with and finally, and most difficult, what the function of each protein is.
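The figures quoted above can be put side by side with a little arithmetic. The sketch below uses only the round numbers from the text; the chain length of 10 is an arbitrary choice made for illustration, and the function names are hypothetical.

```python
# Round figures taken from the text
GENES = 20_000
PROTEINS = 1_000_000
DNA_LETTERS = 4
AMINO_ACIDS = 20


def proteins_per_gene(proteins, genes):
    """Average number of distinct proteins arising from one gene."""
    return proteins / genes


def sequence_count(alphabet_size, length):
    """Number of different chains of a given length over an alphabet."""
    return alphabet_size ** length


print(proteins_per_gene(PROTEINS, GENES))  # 50.0
print(sequence_count(DNA_LETTERS, 10))     # 1048576
print(sequence_count(AMINO_ACIDS, 10))     # 10240000000000
```

On these numbers each gene gives rise to about 50 proteins on average, and a chain of just 10 units can be written in roughly ten million times more ways with 20 amino acids than with 4 DNA letters, before any decoration is considered.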
Schematic of post-translational modification of Histone H4