
Challenges in Structural Biology and Protein Kinases

From the document The DART-Europe E-theses Portal (pages 33-37)

Although Imatinib (Gleevec) is highly selective for the specific inactive form of the Bcr-Abl kinase, it also binds three other non-target proteins. Inhibitor profiling of Dasatinib, another drug prescribed for chronic myelogenous leukemia (CML) that targets the same protein, shows lower selectivity: it binds a much broader range of 35 other kinases (Hantschel, et al., 2008). The difference between the two drugs lies in the mode of inhibition. Imatinib is a Type II inhibitor that targets the inactive conformation of Abl kinase, whereas Dasatinib is a Type I inhibitor that binds the active conformation. Interestingly, Dasatinib binds Bcr-Abl with 300 times higher affinity than Imatinib and is therefore less affected by common resistance mutants of the protein. This study demonstrated how two drugs designed against the same target can have completely different inhibition profiles depending solely on which form of the protein they bind. Both compounds are proven, successful clinical inhibitors, and together they show that a good structural understanding of the active and inactive conformations of a protein kinase is a prerequisite for successful drug design. Furthermore, inhibitors are commonly used as research tools to establish the in vivo role of a specific kinase: the kinase is neutralized and the effect on the pathways involved is studied.

Challenges in Structural Biology and Protein Kinases

Structural biology is the field concerned with the molecular structure of biological macromolecules, especially proteins. The objective is commonly to understand the relationship between their structure and function, and how alterations to the structure affect that function. As demonstrated in the previous sections, structural knowledge of protein kinases is important both for structure-based drug design aimed at treating diseases and for studying various cellular signalling pathways.



One of the early products of the various genome sequencing projects was a large catalogue of open reading frames encoding both known and unknown proteins. A branch of structural biology, structural genomics, arose to convert these data into 3D protein structures. Protein Structure Initiatives (PSI) were created with the ambitious aim of expressing proteins and determining structures covering a large region of sequence space, so that computational tools could extend the coverage more thoroughly in a subsequent step (Burley, 2000). These efforts resulted in “structure pipelines” capable of processing a multitude of targets in a high-throughput fashion. At the front end of such pipelines is target selection, which establishes a list of proteins that fit the aim of the project, e.g. novel structures or folds, or targets with significance to human health. The next steps are automated high-throughput cloning, expression, purification, crystallization, data collection and processing. Finally, semi-automatic publication methods are intended to annotate the new structures and deposit them in the PDB (Service, 2008).

Structural genomics has not been without criticism, generally related to the choice of targets and, in the case of the PSI, its focus on fold space coverage. Some commentators suggest that it has overlooked difficult proteins and has not deposited enough novel or valuable protein structures. However, it is indisputable that it has driven methodological progress in most areas of structure determination. Beyond this, protein bioinformatics has developed significantly, with benefits in tools ranging from structure solution to prediction of function (Watson, et al., 2007). Moreover, the Critical Assessment of Structure Prediction (CASP) project has shown significant improvements in structure prediction accuracy since its start in 1994 (Cozzetto, et al., 2009). It has even been shown that structure prediction models can sometimes be used for structure determination by molecular replacement (Qian, et al., 2007).

Finally, and importantly, a consequence of the protein structure initiatives, alongside the numerous new structures, is the generation of accurate metrics describing the difficulties in expressing, purifying and crystallising proteins.



Large amounts of soluble purified protein are required for structural characterisation and inhibitor screening (Blundell, et al., 2002). For all the recent advances in structural biology, a major bottleneck remains: obtaining soluble protein. Protein expression in Escherichia coli remains the standard for heterologous protein production because of well-established recombinant DNA manipulation techniques, low cost, and the ease of isotope and heavy-atom labelling. However, these advantages are accompanied by problems of aggregation and degradation (Dobson, 2004), which are relatively common in E. coli.

Sometimes targets can be expressed more successfully in eukaryotic expression hosts such as mammalian, yeast or insect cells, but these are far from universal solutions and insufficient expression still accounts for the lack of structural characterization of a large number of proteins.



It is not always clear why viral and mammalian targets are more difficult to express than those from bacteria, but one common observation is that they have a more complex domain organization. In contrast to prokaryotic proteins, they are often larger and comprise multiple domains connected by longer linkers or low-complexity regions (Ward, et al., 2004). Frequently they are subunits of multi-component complexes, or at least require interaction with partners for stability, either via binding or post-translational modification. Expression of such proteins in full-length form in E. coli frequently results in aggregation or degradation. Even when the protein can be expressed in alternative hosts, it may not be in a crystallisable form because of high flexibility.

When an entire protein cannot be expressed, it is common practice to isolate separate domains. Classically, the identification of domains within proteins relies on the conservation of regions between homologues present in the sequence databases. The success of this approach is therefore entirely dependent on the presence of sequenced homologues, and when none are present in the databases, studies can be blocked. This is often the situation for viral genes, many of which mutate rapidly and have poorly understood evolutionary histories. Furthermore, approaches based on sequence analysis sometimes fail to yield large amounts of soluble material even when homologous sequences with useful sequence conservation are available, as in the case of human protein kinases. The conclusion is that expression of stable domains is an intricate process that depends on many unpredictable underlying factors beyond those identifiable from sequence analyses, including folding efficiency, requirement for chaperones, ligands and protein partners, toxicity, mRNA secondary structure, codon usage and intramolecular stabilisation from other parts of the protein.
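The conservation-based step described above can be illustrated with a minimal sketch. It assumes a pre-computed multiple sequence alignment supplied as equal-length strings with "-" for gaps. The function names, the scoring rule (fraction of sequences sharing the most common residue in a column), the threshold values and the toy alignment are all hypothetical choices made for illustration, not a method taken from the text. As the paragraph notes, a conserved segment found this way says nothing about whether the corresponding construct will express solubly.

```python
from collections import Counter


def column_conservation(alignment):
    """Score each alignment column as the fraction of sequences that
    share the most common non-gap residue in that column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)  # column made only of gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def conserved_segments(scores, threshold=0.75, min_length=4):
    """Return (start, end) pairs (0-based, end exclusive) for runs of
    at least min_length columns scoring >= threshold."""
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                segments.append((start, i))
            start = None
    if start is not None and len(scores) - start >= min_length:
        segments.append((start, len(scores)))
    return segments


# Invented toy alignment, for illustration only (not real kinase sequences).
toy_alignment = [
    "MKTAYGSLKVLDFG",
    "MRSPWGSLKVLDFG",
    "LQD-HGSLKVLDFG",
    "MNEVCGSLKVIDFG",
]

if __name__ == "__main__":
    scores = column_conservation(toy_alignment)
    print(conserved_segments(scores))
```

With no homologues in the alignment there is nothing to score, which is the situation the text describes for many rapidly mutating viral genes.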



Since the first protein kinase structure, that of PKA-C in 1991, our knowledge of this class of proteins has increased significantly, canonizing the various structural elements of the catalytic domain and elucidating some spectacular mechanisms of activity regulation. While structural genomics initiatives have contributed significantly to the improved structural coverage of the human kinome (Marsden and Knapp, 2008), alternative approaches are still required for kinases that resist overexpression.


It is perhaps surprising that many protein kinases have resisted protein expression, since the structure of their catalytic domains appears well defined and highly conserved. Some difficulties may be linked to the loose definition of the C-terminal limit of the catalytic domain (as discussed earlier, the C-terminal region sometimes folds back onto the domain and/or stabilizes it). Other likely problems are poor folding efficiency and the necessity or heterogeneity of post-translational modifications. Because kinases are enzymes, their overexpression may also result in nonspecific, toxic activity in the heterologous expression system.

Some of the factors that complicate expression of kinases can start to be addressed using synthetic genes designed to reduce the mRNA secondary structure or codon limitations that lower translation efficiency in E. coli. However, this technology is still in development and the real factors that enhance expression are still being defined (Welch, et al., 2009). Inactive mutants are often used to reduce toxicity, as is coexpression of phosphatases. Even with these tricks, there is still a need for precise domain definition, since a good construct is a basic prerequisite for further expression optimisation. Boundaries compatible with protein expression do not always coincide with those predicted from sequence conservation in alignments. One approach that can address this is the generation of random construct diversity coupled to a screen or selection for soluble protein fragments, borrowing strategies from the directed evolution methods of protein engineering.
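The idea of random construct diversity can be sketched in silico, with the caveat that this is only an analogy: in practice such libraries are made at the DNA level and the solubility screen is experimental. The sketch below enumerates every contiguous fragment of a protein sequence above a minimum length and draws a random subset, each member representing one candidate construct with its own N- and C-terminal boundaries. The function names, parameters and the toy sequence are hypothetical and were chosen for illustration.

```python
import random


def enumerate_fragments(sequence, min_length):
    """List every (start, end) pair (0-based, end exclusive) that defines
    a contiguous fragment of at least min_length residues."""
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    n = len(sequence)
    return [
        (start, end)
        for start in range(n)
        for end in range(start + min_length, n + 1)
    ]


def sample_construct_library(sequence, n_constructs, min_length, seed=None):
    """Draw n_constructs distinct random fragments from the sequence.

    Returns a sorted list of (start, end, fragment) tuples. The seed is
    only there to make the illustration reproducible.
    """
    candidates = enumerate_fragments(sequence, min_length)
    if n_constructs > len(candidates):
        raise ValueError("more constructs requested than distinct fragments")
    rng = random.Random(seed)
    chosen = rng.sample(candidates, n_constructs)
    return [(s, e, sequence[s:e]) for s, e in sorted(chosen)]


if __name__ == "__main__":
    # Invented toy sequence, for illustration only.
    toy_sequence = "MKTAYGSLKVLDFGLARHTDDEMTGYV"
    for start, end, fragment in sample_construct_library(
        toy_sequence, n_constructs=5, min_length=15, seed=0
    ):
        print(start, end, fragment)
```

Unlike the alignment-based approach, nothing here relies on homologues: the boundaries are unbiased, and it is the downstream screen or selection that identifies which fragments are soluble.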
