Overview of Rational Drug Design
© Copyright 2017 Herbert J. Bernstein
Drug Discovery versus Drug Design
For most of history, new medications have been introduced to medical practice by hit-or-miss experimentation: trying new uses for
existing drugs, testing minerals, plants, animal products and other substances found in nature, or synthesizing new substances.
One had to extract or synthesize thousands of potential leads and then slowly and patiently
test them in vitro (in the lab) or in vivo (in animals and human subjects) to discover useful activity, hopefully
without injuring or killing the subjects. This is the process of drug discovery. See https://en.wikipedia.org/wiki/Drug_discovery.
Ligands and Targets
The action of most drugs involves at least two molecules: the drug itself and the "target", a molecule in a biological pathway
whose normal action the drug either inhibits or promotes. In many cases the drug is a ligand that binds in an active site on the
target. Usually the drug is a small molecule and the target is a macromolecule,
most commonly a G protein-coupled receptor (GPCR) or a kinase.
Figure: a rendering of a portion of the surface of Protein Data Bank entry
5F19, "The Crystal Structure of Aspirin Acetylated Human Cyclooxygenase-2" by Lucido, M.J., Orlando, B.J., Vecchio, A.J., Malkowski, M.G.
(2016) Biochemistry 55: 1226-1238.
As the three-dimensional structures of an increasing number
of small molecules and macromolecules became known, it became increasingly feasible to move from screening in vitro
or in vivo to at least a preliminary screening in silico, if only to reject the least promising leads. The
combination of highly automated laboratory techniques, large databases of annotated chemical and biological data, and computational
chemistry has resulted in high-throughput screening and in a move toward the ability to design drugs that may exist only as a computer model.
Representing Molecules in Computers
In order to be able to work with molecules in computers, we need some representation of those molecules. Internally computers work only
with numbers. We can put letters and words and other information into computers by assigning a distinct number to each letter, word or
item of information, so a good starting point is to represent molecules as strings of letters and digits, i.e. to describe each molecule
in words. We may lose some detail, especially in terms of fine-grained dynamics and steric constraints, in doing so, but it is a good start.
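As a small illustration of assigning a number to each character, the Python sketch below converts a short formula string into character codes and back. The choice of "H2O" as the example string is ours; any string of letters and digits would behave the same way.

```python
# Each character of a string is stored as a number (its Unicode code point).
text = "H2O"

# ord() gives the number assigned to a character.
codes = [ord(ch) for ch in text]
print(codes)  # [72, 50, 79]

# chr() reverses the mapping, so no information is lost.
restored = "".join(chr(n) for n in codes)
print(restored)  # H2O
```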
For example, consider aspirin, as shown by pubchem.ncbi.nlm.nih.gov. This simple, but important, molecule
consists of 9 carbon atoms, 8 hydrogen atoms and 4 oxygen atoms, so we can describe it by chemical formula as
C9H8O4, but that tells us almost nothing about where the various atoms are placed relative to one
another. Perhaps we should use a formula that puts related atoms near one another in the formula, as in this NIOSH formula for aspirin,
CH3COOC6H4COOH. That helps, but with a little more effort we can use the SMILES representation,
CC(=O)OC1=CC=CC=C1C(=O)O, or the InChI representation, InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12), to give us enough
information for an accurate representation as a 2-dimensional chemical diagram.
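The two formulas for aspirin describe the same set of atoms, which can be checked with a short Python sketch. The function count_atoms and the table ATOMIC_WEIGHT are our own illustrative names, the atomic weights are rounded standard values, and the parser only handles flat formulas without parentheses.

```python
import re
from collections import Counter

# Rounded standard atomic weights in g/mol; only the elements needed here.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999}

def count_atoms(formula):
    """Count atoms in a flat formula such as 'C9H8O4' or 'CH3COOC6H4COOH'.

    An element symbol is one capital letter plus an optional lowercase letter,
    followed by an optional count. Parentheses are not supported.
    """
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts

molecular = count_atoms("C9H8O4")
condensed = count_atoms("CH3COOC6H4COOH")

# Both notations give 9 carbon, 8 hydrogen and 4 oxygen atoms.
print(dict(molecular))
print(molecular == condensed)

# Approximate molecular weight from the atom counts.
weight = sum(ATOMIC_WEIGHT[element] * n for element, n in molecular.items())
print(round(weight, 2))
```

The atom counts agree, but neither formula records which atoms are bonded to which. That connectivity is what SMILES and InChI add.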
If we need more detail we can record the actual 3-dimensional coordinates of the atoms, using CIF,
as in COD entry 1515581, which is obtained from the Crystallography Open Database
http://www.crystallography.net/cod/1515581.html.
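To make the idea of recording coordinates concrete, here is a minimal Python sketch that reads the atom-site loop from CIF-like text. The fragment is hypothetical: its labels and fractional coordinates are illustrative placeholders, not values from COD entry 1515581, and the function name parse_atom_site_loop is our own. The sketch assumes whitespace-separated values with no quoting, multi-line fields or standard uncertainties such as 0.1234(5), so real CIF files are better read with a dedicated CIF library.

```python
# Hypothetical CIF fragment. The atoms and coordinates are placeholders
# chosen for illustration, NOT data from COD entry 1515581.
CIF_TEXT = """\
data_example
loop_
_atom_site_label
_atom_site_type_symbol
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
C1 C 0.10 0.20 0.30
O1 O 0.25 0.50 0.75
"""

def parse_atom_site_loop(text):
    """Return one dict per atom from the first atom_site loop in the text."""
    tags, atoms, in_loop = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        if line == "loop_":
            if atoms:                     # a new loop ends the atom_site loop
                break
            tags, in_loop = [], True      # start collecting a loop header
        elif not in_loop:
            continue                      # data items outside any loop
        elif line.startswith("_"):
            if atoms:                     # a new data item ends the loop
                break
            tags.append(line)
        elif tags and tags[0].startswith("_atom_site_"):
            atoms.append(dict(zip(tags, line.split())))
        # rows belonging to other loops are ignored
    return atoms

atoms = parse_atom_site_loop(CIF_TEXT)
for atom in atoms:
    xyz = tuple(float(atom["_atom_site_fract_" + axis]) for axis in "xyz")
    print(atom["_atom_site_label"], atom["_atom_site_type_symbol"], xyz)
```

The coordinates in such a loop are fractions of the unit cell edges. Converting them to Cartesian positions also requires the cell parameters recorded elsewhere in the file.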
There are many more representations of chemical information. A good starting point is to consider the Crystallographic
Information Framework (CIF), the Chemical Markup Language (CML), SMILES and InChI.
For CIF, look at the documentation and software at
http://www.iucr.org/resources/cif/
and the original paper on CIF at
http://scripts.iucr.org/cgi-bin/paper?es0164
For CML, look at
http://www.xml-cml.org/ and the paper at
http://pubs.rsc.org/en/content/articlehtml/2001/nj/b008780g
For SMILES, look at the Wikipedia article at
https://en.wikipedia.org/wiki/Simplified_molecular-input_line-entry_system
and the paper at
http://pubs.acs.org/doi/pdf/10.1021/ci00057a005.
There is a good tutorial on the rules of SMILES at
http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html
For InChI, see the Wikipedia article at
https://en.wikipedia.org/wiki/International_Chemical_Identifier
and the paper at
http://jcheminf.springeropen.com/articles/10.1186/s13321-015-0068-4
and the YouTube video at https://www.youtube.com/watch?v=rAnJ5toz26c
Differences in Representing Small Molecule Ligands and Macromolecule Targets
Much of rational drug design software is based on knowledge of three-dimensional atomic
models of molecules. If you have accurate observations of element types and relative positions of atoms, you
can make good estimates of bonding patterns and charges. With some techniques, you can even observe
charge densities in addition to atomic positions. However, it becomes increasingly difficult to
estimate the positions of individual atoms the larger a molecule gets. Therefore you may not have
accurate observations of individual atom positions for the larger macromolecules. Instead what you
are more likely to be able to observe are aggregations of atoms into groups or residues.
Therefore the primary representations of ligands are likely to be in terms of individual atoms (see
the periodic table [Mendeleev, Dmitrii. "The relation between the properties and atomic weights of the elements"
Journal of the Russian Chemical Society 1 (1869): 60-77]
in https://en.wikipedia.org/wiki/Periodic_table),
while the primary representations of macromolecules are likely to be in terms of sequences of residues,
either amino acids (see https://en.wikipedia.org/wiki/Amino_acid) or
nucleic acids (see https://en.wikipedia.org/wiki/Nucleic_acid).
This difference makes the representation of ligands simpler than the representation of macromolecules.
The Periodic Table
Interactive dynamic periodic table from http://www.ptable.com
The periodic table helps us to understand which elements are likely to bond to which other elements, forming compounds that
will interact in predictable ways, an essential part of rational drug design.
Table of Amino Acids
This table of amino acids (from the RasMol Manual, http://www.openrasmol.org/doc/) helps us to understand how the most common residues in
proteins will interact.
| Predefined Set | ala | arg | asn | asp | cys | glu | gln | gly | his | ile | leu | lys | met | phe | pro | ser | thr | trp | tyr | val |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| One-letter code | A | R | N | D | C | E | Q | G | H | I | L | K | M | F | P | S | T | W | Y | V |
| acidic |   |   |   | * |   | * |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
| acyclic | * | * | * | * | * | * | * | * |   | * | * | * | * |   |   | * | * |   |   | * |
| aliphatic | * |   |   |   |   |   |   | * |   | * | * |   |   |   |   |   |   |   |   | * |
| aromatic |   |   |   |   |   |   |   |   | * |   |   |   |   | * |   |   |   | * | * |   |
| basic |   | * |   |   |   |   |   |   | * |   |   | * |   |   |   |   |   |   |   |   |
| buried | * |   |   |   | * |   |   |   |   | * | * |   | * | * |   |   |   | * |   | * |
| charged |   | * |   | * |   | * |   |   | * |   |   | * |   |   |   |   |   |   |   |   |
| cyclic |   |   |   |   |   |   |   |   | * |   |   |   |   | * | * |   |   | * | * |   |
| hydrophobic | * |   |   |   |   |   |   | * |   | * | * |   | * | * | * |   |   | * | * | * |
| large |   | * |   |   |   | * | * |   | * | * | * | * | * | * |   |   |   | * | * |   |
| medium |   |   | * | * | * |   |   |   |   |   |   |   |   |   | * |   | * |   |   | * |
| negative |   |   |   | * |   | * |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
| neutral | * |   | * |   | * |   | * | * | * | * | * |   | * | * | * | * | * | * | * | * |
| polar |   | * | * | * | * | * | * |   | * |   |   | * |   |   |   | * | * |   |   |   |
| positive |   | * |   |   |   |   |   |   | * |   |   | * |   |   |   |   |   |   |   |   |
| small | * |   |   |   |   |   |   | * |   |   |   |   |   |   |   | * |   |   |   |   |
| surface |   | * | * | * |   | * | * | * | * |   |   | * |   |   | * | * | * |   | * |   |
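The predefined sets in the table above can be held in a small lookup structure, which is one simple way to annotate a residue sequence. In the Python sketch below the set memberships are transcribed from the table; the names PREDEFINED_SETS, sets_for and annotate, and the four-residue example sequence, are our own illustrative choices.

```python
# Members of each predefined set as one-letter amino acid codes,
# transcribed from the table of amino acids above.
PREDEFINED_SETS = {
    "acidic": "DE",
    "acyclic": "ARNDCEQGILKMSTV",
    "aliphatic": "AGILV",
    "aromatic": "HFWY",
    "basic": "RHK",
    "buried": "ACILMFWV",
    "charged": "RDEHK",
    "cyclic": "HFPWY",
    "hydrophobic": "AGILMFPWYV",
    "large": "REQHILKMFWY",
    "medium": "NDCPTV",
    "negative": "DE",
    "neutral": "ANCQGHILMFPSTWYV",
    "polar": "RNDCEQHKST",
    "positive": "RHK",
    "small": "AGS",
    "surface": "RNDEQGHKPSTY",
}

def sets_for(residue):
    """Return the sorted names of all predefined sets containing a residue."""
    return sorted(name for name, members in PREDEFINED_SETS.items()
                  if residue in members)

def annotate(sequence, set_name):
    """Mark each residue of a sequence with '*' if it belongs to set_name."""
    members = PREDEFINED_SETS[set_name]
    return "".join("*" if residue in members else "-" for residue in sequence)

# Hypothetical four-residue sequence, used only to exercise the functions.
sequence = "ADKW"
print(sets_for("D"))
print(annotate(sequence, "charged"))      # -**-
print(annotate(sequence, "hydrophobic"))  # *--*
```

Some pairs of sets divide all twenty residues between them (cyclic and acyclic, buried and surface), while others overlap: in this table histidine is marked both positive and neutral.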
Rational Drug Design Processes
- Designation of Targets
  - Molecules in a disease-related pathway for which a change in activity may impact
    the course of the disease
  - The targets must be druggable
  - May be identified by sequence or structural homology to known druggable targets
  - Need to identify potential active sites
- Designation of Drug Leads
  - Usually small molecule ligands
  - Molecules complementary to identified active sites
  - May have the structures of the targets to match against
  - May have pharmacophore defined by other known ligands (https://en.wikipedia.org/wiki/Pharmacophore)
  - May be based on quantitative structure-activity relationship (QSAR) models (https://en.wikipedia.org/wiki/Quantitative_structure%E2%80%93activity_relationship)
  - If an accurate 3D target binding site structure is available, may try virtual screening by docking models of ligands to the model of the binding site
  - May screen to avoid binding to the wrong targets
  - Optimize lead characteristics to achieve druggability with acceptable toxicity
For some desirable characteristics of a drug, see admet.html
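The lead-selection steps listed above can be pictured as a filter over candidate ligands. The Python sketch below is only a schematic under stated assumptions: the candidate names, the scores, the thresholds and the convention that a higher score means stronger predicted binding are all hypothetical, and no real docking or QSAR program is being modeled.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    target_score: float      # hypothetical predicted binding to the intended target
    off_target_score: float  # hypothetical strongest binding to any wrong target

def screen(candidates, min_target=0.7, max_off_target=0.3):
    """Keep candidates predicted to bind the target but not the wrong targets.

    Survivors are returned strongest first. The thresholds are arbitrary
    illustrative values, not recommended cutoffs.
    """
    kept = [c for c in candidates
            if c.target_score >= min_target
            and c.off_target_score <= max_off_target]
    return sorted(kept, key=lambda c: c.target_score, reverse=True)

# Hypothetical candidates with made-up scores.
candidates = [
    Candidate("lead-A", 0.91, 0.10),
    Candidate("lead-B", 0.85, 0.60),  # binds well, but also binds a wrong target
    Candidate("lead-C", 0.40, 0.05),  # selective, but binds the target weakly
    Candidate("lead-D", 0.75, 0.20),
]

survivors = screen(candidates)
print([c.name for c in survivors])  # ['lead-A', 'lead-D']
```

In practice the surviving leads would then go through the optimization step, where druggability and toxicity are weighed against each other.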