Carbohydrate and Bacterial Binding Specificity of Human Intelectin-1

By

Christine R. Isabella

M.S. Biochemistry, University of Wisconsin – Madison, 2017

B.S. Molecular and Cellular Biology, University of Puget Sound, 2012

SUBMITTED TO THE DEPARTMENT OF CHEMISTRY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN CHEMISTRY AT THE MASSACHUSETTS INSTITUTE OF TECHNOLOGY

February 2021

© 2021 Massachusetts Institute of Technology. All Rights Reserved.

Signature of Author: _____________________________________________________________ Department of Chemistry January 11, 2021

Certified By: ___________________________________________________________________ Laura L. Kiessling Novartis Professor of Chemistry Thesis Supervisor

Accepted By: ___________________________________________________________________ Adam P. Willard Associate Professor of Chemistry Chair, Department Committee on Graduate Students

This doctoral thesis has been examined by a committee of professors from the Department of Chemistry as follows:

Barbara Imperiali _______________________________________________________________ Thesis Committee Chair Class of 1922 Professor of Biology and Chemistry

Laura L. Kiessling _______________________________________________________________ Thesis Supervisor Novartis Professor of Chemistry

Ronald T. Raines ________________________________________________________________ Thesis Committee Member Roger and Georges Firmenich Professor of Natural Products Chemistry

Eric J. Alm ____________________________________________________________________ Thesis Committee Member

Carbohydrate and Bacterial Binding Specificity of Human Intelectin-1

By

Christine R. Isabella

Submitted to the Department of Chemistry on January 15, 2021 in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Chemistry

ABSTRACT

The mucosal surfaces of the human body exist in close contact with complex communities of resident microorganisms termed the microbiome. The microbiome is crucial for host health, and therefore the host must discern which microbes may colonize and which must be cleared. Human soluble lectins are secreted carbohydrate-binding proteins that bind microbes by specific recognition of cell surface glycans. Many soluble lectins are important mucosal innate immune factors, as lectin binding to microbes can result in their clearance from the host. However, the glycan and microbial binding specificities of lectins are poorly defined. In this thesis, I aim to address this gap with a focus on human intelectin-1 (hItln-1). In Chapter 1, I review the recently identified class of lectins, the X-type lectins. The X-type lectins, or intelectins, are found throughout chordates and share highly conserved sequences, but their biological roles are not well understood. However, their expression patterns and microbial binding specificity suggest a role in regulation of the microbiome. In Chapter 2, I build on previous work to further define hItln-1 carbohydrate specificity. These studies reveal that carbohydrate conformation is stabilized by stereoelectronic effects, and that carbohydrates are bound by hItln-1 in their stabilized conformation. In Chapter 3, I turn to bacterial cell recognition by hItln-1 and determine that hItln-1 displays competitive binding to bacterial strains in a mixture. These studies reveal the need to assay lectin-bacteria recognition against diverse microbial communities to understand their binding specificity in a biological context. In Chapter 4, I develop lectin-sequencing (lectin-SEQ) as a method for identifying bacterial targets of lectins in native communities. Using the human stool microbiome, I assess binding to stool bacteria by hItln-1 and surfactant protein-D (SP-D). Lectin-SEQ reveals that hItln-1 recognizes health-promoting commensal bacteria, while SP-D recognizes pathogenic bacteria. These results indicate a novel role for hItln-1 in promoting colonization of commensal bacteria.

Thesis Supervisor: Laura L. Kiessling Title: Novartis Professor of Chemistry

Acknowledgements

It turns out that a Ph.D. is true type II fun. There were countless times when I thought I wouldn’t do it, and for all of those times there were people there who supported and encouraged me, and those who ensured that, in the end, it all looked like fun. To all of you, I owe a debt of gratitude.

First, I must thank Laura. You have always encouraged me and believed in my abilities more than I believed in myself. I will always be grateful for your push to dive deep into lectin-SEQ and your interest and excitement in seeing it through to the finish line. I appreciate, too, the moments away from the science when you took the time to support me. I feel very lucky to have an advisor who truly cares about me as a person and a scientist.

The Kiessling Group has been an incredible place to learn and live and grow over the past five years. I was drawn to the lab in large part because of the people, and I still feel lucky to spend my days with you all. I have to thank Darryl, for recruiting me, for teaching me all of the hIntL secrets, for believing in my abilities, and for passing down such an awesome project. I learned so much from you in the first months of my time in the lab that set the stage for my time in the Kiessling lab, and to this day, I hear your voice in my head talking about all of the cool hIntL projects we could do. Heather, I will never forget the first time you talked to me. You offered me a pour-over coffee and I felt like I was welcomed into a club. I admire you so much as a scientist and a person, and I learned so much from you, including but not limited to how to most efficiently order Jimmy John’s to the lab. Alex, we really went through it all together. You have been a source of encouragement to me day after day throughout the years and I am appreciative of you always making the time to troubleshoot experiments, chat, and snack. I hope one day we can return to Bull Shoals and live our best lives. Sayaka, thank you for always sharing your snacks with me and reminding me to eat when I am hangry. Thank you for always taking the time to help me think through experiments and results. And thank you for just being the most kind and sincere and caring person, and for running TWO half marathons with me that I was undertrained for. Finally, I owe so much to LectinLand for embodying the “all hands on deck” motto. I am grateful for having such a wonderful team with so much engagement and excitement about all of our various projects. Thank you all for teaching me, working with me, spending hours sorting bacteria with me, and just generally being the best subgroup.

I want to next thank those who made UW – Madison feel like home. To my IPiB family: moving to a new institution made me realize how special the people and the relationships built at UW truly are. To Delia, my soul twin and little sister, thank you for everything. To Dylan (Dylz), our QQ and Dominoes study sessions got me through so much, and have been greatly missed the last three years. And I’m guessing you really need a haircut. To Dana, I am yet to meet a comparable cooking partner. To Mark, my bike has been lonely, and I haven’t fallen off of it recently. To Karl, I hope my life legacy is the flong cape, and I am so glad that I get to still see you in Boston. To Anji, you have been an inspiration to me since the day I met you and I can’t wait to see what you do next. Ian and Sue, I’m so glad that we were all Discertators together, and that it led to so many adventures in Wisconsin and Boston. From bike camping to Friday Fish Fry to candlepin bowling, you guys have been incredible friends. I am also grateful that I got to collaborate with Ian and had the opportunity to learn so much about protein crystallography from him.

When the Kiessling Group moved to MIT, I was lucky to meet incredible people who have made my experience in Cambridge unforgettable. Lisa, you were my first MIT friend and you have always made me feel welcome here. I am grateful to have you as a source of encouragement and inspiration, for teaching me everything about job hunting, and for countless runs…and pastries. Smrithi, thank you for always showing up to lab with a smile and ready to make lectins in cells that were basically dead. You brought much needed light to my days and real perseverance to our project and I will always be sad that a pandemic cut your time in the lab short. I have had so much fun working with you and watching you grow over the years, and I am excited to see what you do next. Katherine, thank you for being a great friend, only sometimes being distracting, and for always being passionate. I am grateful for the time we spent as co-presidents of WIC, for ceramics (RIP), and for always having you to talk to and knowing that you will go to bat for me and/or probably literally fight people if I need you to. Victoria, you are wise beyond your years. Thank you for always keeping the perspective that life is full of pain and that you can’t ruin a project with one experiment. Janet, we might not have a snack shelf, but we have many snack adventures. I am grateful for our quarantine walks and for your listening ear and constant encouragement. I don’t know if I would have made it through the last few months if it weren’t for both of you, Sugo Sunday, Caturday, and boba. I believe you will truly move with motion one day.

There are a few very special people who I wouldn’t have made it through without. Liz, I don’t really know how it took us until 5 years after college to know each other, but you are one of my favorite humans. Thank you for randomly moving to Boston for a year so that we could be friends forever. You seriously were the friend I needed that year. Nacey, I don’t even know how it happened but now you are my best friends. Nate, thank you for striking the appropriate balance between making fun of me for how aloof I am and also reminding me that I am smart and good at things. Because of you I can always remember that this whole PhD was basically just a really really long V2. Stacey, thank you for always being down for the beach, slumber parties, and shopping. And corn dogs. Cecie, I don’t even have words for what you mean to me. Thank you for being the best SW and being there every moment of every day for me. For all of the adventures and movie nights. And for also sometimes telling me to suck it up. Finally, I have to thank Dave for being in this with me and still putting up with me. Thank you for listening to, encouraging, and supporting me, for putting up with all of my tears, and for literally and figuratively pulling me up cliffs.

Lastly and most of all, I have to thank my family. My brother, Adam, for doing everything first in life so that I could follow in his footsteps. Thank you for being such a supportive older brother and for always believing that I could do this. And my parents, thank you for supporting me in everything over the years, for visiting me in all corners of the country, and for always wanting and finding a way to provide the best for me.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
List of Abbreviations

Chapter 1: X-type lectins: soluble lectins with microbial glycan specificity and elusive biological function
1.1 Abstract
1.2 Introduction
1.3 X-type lectins in chordates
1.4 X-lectin structure
1.5 Xenopus intelectins
1.5.1 Cortical granule lectins
1.5.2 Serum lectins
1.5.3 Intestinal lectins
1.5.4 Embryonic epidermal lectin
1.6 Mouse intelectins
1.7 Sheep intelectins
1.8 Human intelectins
1.8.1 Glycan recognition
1.8.2 Human omentin
1.8.3 Intelectins in diseases associated with microbial dysbiosis
1.9 Conclusions
1.10 Acknowledgements
1.11 References

Chapter 2: Stereoelectronic effects impact glycan recognition
2.1 Abstract
2.2 Introduction
2.3 Results
2.3.1 hItln-1 binding to microbial monosaccharides
2.3.2 Structure of hItln-1 bound to allyl-KO
2.3.3 Bioinformatic analysis of glycan conformation
2.3.4 Computational analysis of glycan conformation and recognition
2.4 Discussion
2.5 Conclusions
2.6.2 Chemical synthesis of glycans
2.6.3 Biolayer interferometry (BLItz)
2.6.4 ELISA
2.6.5 Protein X-ray crystallography
2.6.6 Bioinformatics
2.6.7 Computational Analysis
2.7 Funding Sources
2.8 Acknowledgements
2.9 References
2.10 Supplemental Information

Chapter 3: Human Intelectin-1 specificity for microbe binding in synthetic communities
3.1 Abstract
3.2 Introduction
3.3 Results
3.3.1 Recognition of microbial strains by hItln-1
3.3.2 hItln-1 binding affinity for Gram-positive and Gram-negative bacteria
3.3.3 hItln-1 binding to synthetic microbial communities
3.3.4 hItln-1 competitive binding in mixed microbial communities
3.4 Discussion
3.5 Conclusions
3.6 Materials and Methods
3.6.1 hItln-1 Expression and Purification
3.6.2 hItln-1 Binding to Bacterial Strains
3.6.3 hItln-1 Binding to Synthetic Communities
3.7 Acknowledgements
3.8 References
3.9 Supplemental Information

Chapter 4: Lectin-sequencing for analyzing microbial communities
4.1 Abstract
4.2 Introduction
4.3 Results
4.3.1 Soluble lectins bind bacteria from stool
4.3.2 16S Sequencing reveals patterns of hItln-1 binding to stool bacteria
4.3.3 Metagenomic sequencing identifies lectin-bound bacteria
4.3.4 Lectin binding levels are altered in IBD
4.3.5 HItln-1 and MBL binding to the healthy and IBD microbiota
4.4 Discussion
4.5 Materials and Methods
4.5.1 Protein expression and purification
4.5.4 Flow cytometry and FACS of donor stool samples
4.5.5 Fluorescence microscopy
4.5.6 Nucleic acid extraction
4.5.7 16S sequencing
4.5.8 Metagenomic sequencing
4.6 Acknowledgements
4.7 References

List of Figures

1-1. X-type lectin structures
1-2. Carbohydrate binding sites from select X-type lectins
1-3. Alignment of X-lectin amino acid sequences
2-1. Factors contributing to lectin–carbohydrate binding and recognition
2-2. Human intelectin-1 (hItln-1) binding to monosaccharides in a biolayer interferometry (BLI) competition assay
2-3. Evaluation of BSA-conjugated sugars as ligands for hItln-1 using ELISA
2-4. Structure of hItln-1 bound to allyl-α-KO
2-5. Bioinformatic analysis of exocyclic vicinal diol-containing glycans in the PDB
2-6. Observed and accommodated ligand conformations in hItln-1 binding site
2-7. Stabilizing stereoelectronic effects of preferred rotamers of the proximal side chain C–C bond of KO, L,D-heptose, and D,D-heptose
3-1. Binding of hItln-1 to fixed bacterial strains
3-2. hItln-1 binding affinity for microbial cell surfaces
3-3. Binding of hItln-1 to E. fergusonii is inhibited in a mixed community
3-4. Competitive inhibition of hItln-1 binding in microbial communities
3-S1. Competitive hItln-1 binding in communities is dependent on lectin concentration and washing
4-1. Lectin trimeric structures and binding ligands
4-2. Soluble lectins bind the human microbiome
4-3. 16S lectin-sequencing of hItln-1 sorted stool bacteria
4-4. Lectin-SEQ of healthy donor stool samples with metagenomic sequencing
4-5. HItln-1 and SP-D binding to the IBD microbiome
4-6. HItln-1 and MBL binding levels healthy, UC and CD donor stool microbiome
4-1S. Sequence level enrichment plot of lectin-SEQ with metagenomics

List of Tables

1-1. GenBank accession codes for aligned sequences in Figure 1-2
2-1. IC50 values of ligands and corresponding changes in free energy of binding compared to KO
2-2. Data collection and refinement statistics for the crystal structure of hItln-1 bound to allyl-α-KO
2-3. Summary of conformational analysis results
2-4. NBO Donor-acceptor interaction energies and calculated ΔENBO of bond rotation
2-S1. Conformational analysis of saccharides containing exocyclic diols in the PDB
2-S2. Cartesian coordinates of saccharides optimized at the M06-2X/6-311+G(d,p); IEFPCM:water level of theory
3-1. Summary of hItln-1 binding to microbes
3-2. Example synthetic microbial community mixtures assayed for hItln-1 binding

List of Abbreviations

AMPs Antimicrobial peptides

BA Bifidobacterium angulatum

BCSDB Bacterial Carbohydrate Structure Data Base

BLI Biolayer interferometry

BO Bacteroides ovatus

BP Bacteroides plebeius

BSA Bovine serum albumin

CD Crohn’s disease

CPS Capsular polysaccharide

CRD Carbohydrate recognition domain

D,D-heptose D-glycero-α-D-manno-heptose

DFT Density functional theory

EF Escherichia fergusonii

ELISA Enzyme-linked immunosorbent-like assay

ELLA Enzyme-linked lectin assay

FACS Fluorescence-activated cell sorting

FBG Fibrinogen-like

FSC-A Forward scatter

GlcNAc N-Acetyl-D-glucosamine

GlyP D-glycerol-1-phosphate

Gro-1-P Glycerol 1-phosphate

hItln-1 Human intelectin-1

hItln-647 Human intelectin-1 Alexa Fluor 647

hItln-2 Human intelectin-2

HRP Horseradish peroxidase

HUVECs Human umbilical vein endothelial cells

IBD Inflammatory bowel disease

IHC Immunohistochemistry

IL Interleukin

KDO 3-deoxy-D-manno-oct-2-ulosonic acid

KEGG Kyoto Encyclopedia of Genes and Genomes

KO D-glycero-D-talo-oct-2-ulosonic acid

L,D-heptose L-glycero-α-D-manno-heptose

lectin-SEQ Lectin-sequencing

LfR Lactoferrin receptor

LPS Lipopolysaccharide

LR Lactobacillus reuteri

MBL Mannose-binding lectin

MBL-555 Mannose-binding lectin Alexa Fluor 555

MCP Monocyte chemotactic protein

mItln-1 Mouse intelectin-1

mItln-2 Mouse intelectin-2

NBO Natural bond orbital

NMR Nuclear magnetic resonance

OTUs Operational taxonomic units

OVA Ovalbumin

PCoA Principal coordinate analysis

PDB Protein Data Bank

PP Proteus penneri

sMCP-1 Sheep mast cell protease

SMCs Smooth muscle cells

SP-D Surfactant protein-D

STAT6 Signal transducer and activator of transcription 6

TA Teichoic acid

TFF Trefoil factor

TFF3 Trefoil factor 3

Th2 T helper type 2

TLR Toll like receptor

TNF-α Tumor necrosis factor alpha

UC Ulcerative colitis

V109D Human intelectin-1 V109D

XCGL Xenopus laevis cortical granule lectin

XCGL2 Xenopus laevis cortical granule lectin 2

XCL Xenopus laevis serum lectin

XCL-2 Xenopus laevis serum lectin 2

xIntl-3 Xenopus laevis Intelectin 3

xIntl-4 Xenopus laevis Intelectin 4

XL-35 Xenopus laevis lectin 35kDa

β-D-Galf β-D-Galactofuranose

β-Galf β-D-Galactofuranose

ΔΔG Relative free energy

Chapter 1

X-Type lectins: soluble lectins with microbial glycan specificity and elusive biological function

1.1 Abstract

The X-type lectins are a recently identified class of calcium-dependent lectins that lack the C-type lectin fold. The X-type lectins are found throughout the animal kingdom and show high levels of sequence homology across chordates. Still, the numbers of intelectin genes and their expression patterns vary widely between species. While X-type lectins have been suggested to function in innate immunity against microbes, the biological function of most X-type lectins remains elusive. In this review, I summarize critical features of intelectin protein structure, glycan recognition, and current understanding of biological functions. By analyzing data from multiple species, my goal is to illuminate areas where insights from individual species are either unique or broadly applicable toward understanding the intelectins. Additionally, I aim to

1.2 Introduction

Animals have an integral, yet complicated, relationship with microbes. On one hand, the surfaces of the body that contact the environment must maintain a barrier to protect the animal tissues from microbial infection. On the other hand, microbes that reside at epithelial surfaces play important roles for their animal host.1 Most notable is the mutualistic relationship between an animal host and its gut microbiome. The latter breaks down components of the diet to provide important metabolites to the host. A host must therefore have specialized defenses at the epithelial surfaces to distinguish between microorganisms that could become pathogenic, commensal, or symbiotic.2-4 To make such decisions, the immune system can exploit the carbohydrates that coat all cells on earth.5, 6 In particular, lectins, which are non-antibody carbohydrate binding proteins, can distinguish glycan residues, and many lectins play important innate immune roles in the host.7-9

An under-studied class of lectins, the X-type lectins, was identified in chordates and proposed to function in innate immunity.10 Since the first X-type lectin was discovered in Xenopus laevis oocytes in 1982, homologous proteins have been identified in chordates from tunicates to humans.10, 11 This family of lectins has also been termed the intelectins, for the discovery of an X-type lectin in the mouse intestine (intestinal lectin).12 The expression of the X-type lectins in skin, the intestinal mucosa, and serum suggests a role in innate defense against microorganisms. Indeed, some of these lectins can bind glycans presented on the surface of microorganisms, though the exact biological function of intelectins in many species remains elusive.

The first human intelectin (hItln-1) was discovered in 2001.13-15 In the 20 years since, numerous structural, biochemical, and biological insights have added to our understanding of the range of functional roles played by the X-type lectins in chordates. In the present review, I first explore the evolutionary conservation of the X-type lectins. I then cover the structural features of X-type lectins, comparing the structures from various species. Finally, I review the literature on X-type lectins, with a focus on Xenopus laevis and the mammalian X-type lectins: Xenopus for the historical perspective, and mammalian lectins because they are most pertinent for the results described in this thesis. I highlight glycan recognition and putative biological roles of these proteins as well as future directions for their study. While I have not focused on them here for clarity, the invertebrate and fish X-type lectins should not be overlooked.

1.3 X-Type lectins in chordates

X-Type lectins are found from the earliest chordates, amphioxus and tunicates, to humans, with few exceptions. Homologous protein sequences are also present in the marine phyla Placozoa and Cnidaria. A recent evolutionary analysis placed Trichoplax adhaerens, a placozoan and one of the simplest animals, as the likely origin of intelectins.11 These analyses suggest the X-type lectins have been present throughout the evolution of animal species.

There is an apparent correlation between the number of intelectin genes and the evolutionary age of an organism. For example, amphioxus species have as many as 12 intelectin genes and the tunicate Ciona intestinalis has 21 intelectins.16 Many marine-dwelling vertebrates also have many intelectins. Xenopus laevis has eight known intelectins, which are described in this review, while zebrafish, Danio rerio, have seven intelectins.17 Land mammals have fewer intelectins. Laboratory strains of mice have one to six intelectins, depending on their numbers of duplications of the Itln locus,18 sheep have three intelectins, and humans have two. This pattern suggests that intelectins play an important role in innate immunity, where more copies were

explanation is that marine animals face additional challenges in protection from environmental microbes at their surface. Fish and frog intelectins are expressed in the intestine as well as in reproductive tissue, embryos, skin slime and gill tissue, suggesting that these intelectins could play important roles in protecting the animal from microbes in their environment. Indeed, many of the lectins expressed by marine animals can agglutinate bacteria, further suggesting a role in protection from microbes.19-24

Interestingly, animals of the order Carnivora lack intelectins. The only exception is the giant panda, which has an intelectin-1-like partial sequence identified by a Basic Local Alignment Search Tool25 search for sequences similar to hItln-1. The sequence aligns to positions 128-306 of hItln-1 and contains the calcium binding and aromatic residues corresponding to the hItln-1 ligand binding pocket. Though the giant panda is a member of the order Carnivora, it is actually a herbivore that consumes almost entirely bamboo. The presence of an intelectin in plant-consuming organisms suggests a different microbial requirement for the host to access dietary nutrients compared to predatory carnivores.26 Birds and bats also lack intelectins. Because of the requirements for flight, birds and bats have very short digestive tracts and less reliance on a microbiome.27 The tinamou, emu, and kiwi are exceptions that have intelectin genes. These large, ground-dwelling birds from the clade Palaeognathae have omnivorous diets and do not fly, suggesting a different relationship with their microbiomes compared to flying birds. Taken together, these insights suggest that intelectins have a role in the microbiome.

1.4 X-Type lectin structure

All X-type lectins characterized to date require calcium ions for carbohydrate binding; however, they lack sequence similarity to the calcium-dependent C-type lectins.28, 29 In the N-terminal region of their carbohydrate recognition domain, many X-type lectins contain a fibrinogen-like (FBG) domain consisting of about 45 amino acids (residues 37-82 in hItln-1, highlighted in Figure 1-1A, B).30 Another class of lectins, the ficolins, also contains an FBG domain and is thought to be the most similar to the intelectins. However, the FBG domain is the only conserved motif between the two classes. Moreover, the ficolins and the X-type lectins have distinct structures. Our group has solved crystal structures for hItln-1 and Xenopus embryonic epidermal lectin (XEEL), and analysis of these structures revealed that the X-type lectins are, indeed, a novel lectin class displaying a unique fold.30, 31

The 1.8-Å resolution structure of apo-hItln-1 determined by X-ray crystallography (PDB 4WMQ) revealed that hItln-1 is a disulfide-linked homotrimer (Figure 1-1A).31 Each monomer has a globular structure consisting of two highly twisted β-sheet structures surrounded by seven short α-helices. The structure also contains many random coil regions that are mainly located at the interface between monomers. Each monomer has three divalent calcium ions: two structural calcium ions are buried, and the third is solvent accessible and sits in the carbohydrate binding pocket. In addition to the intermolecular disulfide between residues C31 and C48, each monomer has four intramolecular disulfide bonds (Figure 1-1B). The cysteines that form the intermolecular disulfide between hItln-1 monomers are not conserved among all intelectins. XEEL, for example, lacks these cysteines in the carbohydrate recognition domain (CRD). However, the XEEL CRD does form trimers in solution, and additionally crystallized as a trimer.30 Mouse intelectin-1 (mItln-1) also lacks the analogous cysteines, but similarly forms trimers in solution (unpublished data).

Figure 1-1. X-type lectin structures. (A) X-ray crystal structure of hItln-1 bound to allyl-β-D-Galf (β-Galf, PDB 4WMY). Each monomer is depicted in white with the fibrinogen-like (FBG) domain highlighted in teal; intermolecular disulfides in yellow spheres; intramolecular disulfides in yellow sticks. (B) Depiction of sequence features of hItln-1 showing the signal peptide, N-linked glycosylation site, intramolecular disulfides, cysteines that form intermolecular disulfides highlighted in yellow, fibrinogen domain highlighted in teal, carbohydrate binding pocket highlighted in gray. Amino acid positions are above sequence features. (C) Predicted hexameric structure of XEEL bound to glycerol 1-phosphate (Gro-1-P) adapted from Wangkanont et al.30 The hexamer is a dimer of trimers, with each trimer shown in white or light blue (CRD structure is derived from PDB 4WN0). The predicted N-terminal trimerization domain is shown as a helical bundle from PDB 2SIV. Two of the six predicted Cys-24–Cys-42 intermolecular disulfide bonds are depicted as yellow spheres. (D) Depiction of sequence features of XEEL, with the same feature representations as (B) and the additional helical domain highlighted in light blue.

The carbohydrate binding pockets of each monomer in hItln-1 and XEEL are oriented on one face of the trimer. This binding site arrangement allows the lectin to take advantage of multivalency when engaging carbohydrates displayed on a surface, such as a microbial cell glycocalyx.32 On the opposite face of the trimer, the FBG domains are exposed and are poised for potential protein–protein interactions (Figure 1-1A). In other proteins containing FBG domains, such as fibrinogen, tenascins, and angiopoietins, the FBG domain has been shown to mediate protein–protein interactions to promote tissue repair in response to injury and infection.33 The potential for the FBG domain of human intelectin to participate in protein–protein interactions suggests that it could play a role in signal transduction between the bacteria it recognizes and the host organism.

The XEEL CRD is structurally very similar to hItln-1. The protein, however, harbors an additional N-terminal domain that is not a conserved feature of the X-type lectin family (Figure 1-1C, D). This domain, consisting of residues 22-47, is predicted to be helical and to form an anti-parallel six-helix bundle with six intermolecular disulfide bonds between Cys-24 and Cys-42.30 The resulting full-length structure of XEEL is a disulfide-linked hexamer with a predicted barbell-like arrangement (Figure 1-1C, D). The barbell structure explains the ability of XEEL to agglutinate bacteria.30

The structures of hItln-1 and XEEL determined by X-ray crystallography have afforded an understanding of the X-type lectin carbohydrate recognition mode. In the complex of hItln-1 and allyl-β-D-galactofuranose (allyl-β-D-Galf, PDB 4WMY),31 the protein structure is not altered by the addition of the ligand, consistent with the lock-and-key binding model common among lectins.34 This structure also explained the binding specificity of hItln-1, as determined by glycan array, for saccharides containing exocyclic, vicinal diols. The exocyclic diol of allyl-β-D-Galf coordinates to the solvent-exposed divalent calcium ion and sits in an aromatic box formed by W288 and Y297 (Figure 1-2A). The aromatic box is a conserved feature of the binding pockets in mouse intelectin-1 (mItln-1) and XEEL, both of which have been shown to bind β-D-Galf.30, 31 In mItln-1, the Trp and Tyr residues of the aromatic box are conserved, while XEEL has Trp residues in both positions. Nevertheless, the crystal structure of XEEL bound to glycerol 1-phosphate (Gro-1-P) reveals remarkable structural similarity between the hItln-1 and XEEL binding sites (Figure 1-2).

Alignment of the amino acid sequences of intelectins from human, mouse, sheep, Xenopus, zebrafish, catfish and lamprey using Clustal Omega35, 36 reveals a high conservation of calcium and ligand binding residues (Figure 1-3, Table 1-1). Analysis of the residues that directly complex the structural calcium ions and the calcium ion in the ligand binding site using ggseqlogo37 shows remarkable sequence identity across species. In contrast, the ligand binding residues are more varied (Figure 1-3B). In hItln-1, W288 and Y297 make an aromatic box (Figure 1-2A) that is preserved in many species (Figure 1-3). Notably, though, some X-type lectins including human intelectin-2 (hItln-2), mouse intelectin-2 (mItln-2) and Xenopus cortical granule lectins (XCGL and XCGL2) do not have the conserved aromatic box (Figure 1-2, Figure 1-3A), indicating divergent carbohydrate recognition profiles. While the binding specificities of hItln-2 and mItln-2 are not known, glycan array data suggest that XCGL binds Galα(1-3)GalNAc.30 In silico docking of Galα(1-3)GalNAc in the modeled XCGL binding pocket shows that, even without extensive optimization, the ligand fits remarkably well into the space created by the tryptophan to asparagine change in the binding pocket (Figure 1-2D). Additionally, the presence of a phenylalanine adjacent to the GalNAc C1 suggests that an extended saccharide could take advantage of stacking interactions with the aromatic ring.
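The kind of comparison described above, pairwise identity between aligned sequences and conservation at a single alignment column, can be illustrated with a minimal Python sketch. It is not the Clustal Omega or ggseqlogo workflow used for Figure 1-3. The function names and the short toy alignment are hypothetical and are not real intelectin sequences; the sketch only assumes that the input sequences are already aligned, with "-" marking gaps.

```python
from collections import Counter


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def column_conservation(alignment: list[str], column: int) -> tuple[str, float]:
    """Most common non-gap residue in one alignment column and its frequency."""
    residues = [seq[column] for seq in alignment if seq[column] != "-"]
    if not residues:
        raise ValueError("column contains only gaps")
    residue, count = Counter(residues).most_common(1)[0]
    return residue, count / len(residues)


# Hypothetical toy alignment (NOT real intelectin sequences), for illustration only.
toy_alignment = [
    "GDWNE-Y",
    "GDWNEAY",
    "GDFNE-N",
    "GDWSEAW",
]

if __name__ == "__main__":
    print(percent_identity(toy_alignment[0], toy_alignment[2]))
    print(column_conservation(toy_alignment, 2))
```

Gapped columns are excluded from the identity denominator here; other conventions (for example, dividing by the full alignment length) give different percentages, so values from this sketch are not directly comparable with those quoted in the text.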

Figure 1-2. Carbohydrate binding sites from select X-type lectins. (A) Carbohydrate binding site of hItln-1 bound to allyl-β-D-Galf (PDB 4WMY) and model of hItln-2 binding site. (B) Models of mItln-1 and mItln-2. (C) Carbohydrate binding site of XEEL bound to glycerol 1-phosphate (Gro-1-P, PDB 4WN0) and model of XCGL. (D) Modeled XCGL binding site complexed with Galα(1-3)GalNAc, generated via in silico docking of Galα(1-3)GalNAc by aligning the calcium-coordinating hydroxyls with the bound Gro-1-P in XEEL (PDB 4WN0) to match coordination geometry. Side chains altered in the aromatic box are highlighted in cyan. Models were built using SWISS-MODEL. HItln-1 (PDB 4WMQ) was the template for hItln-2, mItln-1, and mItln-2. XEEL (PDB 4WMO) was the template for XCGL. All ligands are shown in grey; calcium ions in green; and ordered water molecules in the binding pocket in red.

Figure 1-3. Alignment of X-lectin amino acid sequences. (A) Sequence alignment of human, mouse, sheep, Xenopus, zebrafish, catfish and lamprey intelectins. Residues coordinating structural calcium (blue), ligand binding site calcium (green) and those interacting with the ligand (orange) are highlighted. Sequence length for each full sequence is shown on the right, and the position corresponding to hItln-1 for each highlighted residue is shown at the top. (B) Sequence logo showing amino acid conservation across the sequences shown in (A).

Table 1-1. GenBank accession codes for aligned sequences in Figure 1-3

GenBank accession code    Species                      Protein
BAD98810.1                Lethenteron camtschaticum    itlnb
X82626                    Xenopus laevis               XCGL
BF232570                  X. laevis                    XCGL2
AB061238                  X. laevis                    XCL-1
AB061238                  X. laevis                    XCL-2
NP_001085762              X. laevis                    XCL-3
BC087616                  X. laevis                    XEEL
BAL14267.1                Silurus asotus               itln-gill
BAL14266.1                S. asotus                    itln-skin/kidney
U583680                   Danio rerio                  zItln1
EU583682                  D. rerio                     zItln2
EU583681                  D. rerio                     zItln3
XP_027821289              Ovis aries                   sItln-1
XP_027821293              O. aries                     sItln-2
CAP09695                  O. aries                     sItln-3
AAU88049                  Mus musculus                 mItln-1
AAO60215                  M. musculus                  mItln-2
BC020664                  Homo sapiens                 hItln-1
AY358905                  H. sapiens                   hItln-2

The ability to produce recombinant intelectins to delineate their biological and biochemical properties is valuable. However, the structures of these proteins have revealed that their processing and modification are complex. Many intelectins possess a signal peptide that is cleaved upon secretion, numerous inter- and intramolecular disulfides, and N-linked glycosylation required for proper folding and solubility. Thus, intelectins are recalcitrant to recombinant expression in Escherichia coli. Moreover, we have observed that improperly folded intelectins can display non-specific but calcium-dependent carbohydrate binding.9 Therefore,

1.5 Xenopus intelectins

The X-type lectin family is named for the discovery of the first proteins of this class in Xenopus laevis.10 In 1974, Jerry Hedrick and colleagues observed that contents of the X. laevis egg cortical granule caused calcium-dependent agglutination of the egg jelly coat to block polyspermy.38 This agglutination action was later ascribed to the presence of a D-galactopyranoside binding lectin found in the oocyte, embryo and the cortical granules of the egg.39-41 This lectin, named both XCGL (X. laevis cortical granule lectin) and XL-35 (X. laevis lectin 35 kDa), accounts for more than 70% of the protein in the X. laevis egg cortical granules.28, 41 To date, at least eight X-type lectins have been identified in X. laevis: the cortical granule lectins (XCGL and XCGL2), the serum lectins (XCL-1, XCL-2, and XCL-3), the embryonic epidermal lectin (XEEL), and most recently, the intestinal lectins (xIntl-3 and xIntl-4).

1.5.1 Cortical granule lectins

As mentioned previously, the cortical granule lectin was the first identified X-type lectin. Cortical granules are secretory organelles associated with preventing polyspermy, and the presence of a lectin at these sites is intriguing. XCGL was identified in the cortical granules and fertilization envelope by purification from oocytes using a melibiose affinity column.38, 42 The result was a glycosylated protein that could agglutinate rabbit erythrocytes in a

calcium-dependent manner.39 Later, Quill and Hedrick determined that XCGL purified as oligomers of 10 to 12 glycosylated monomers. They then purified egg jelly to isolate large molecular weight mucin-like glycoproteins, and developed an enzyme-linked lectin assay (ELLA) to identify the native XCGL ligand. This assay revealed that α-galactosidases strongly inhibited lectin binding, indicating that XCGL binds α-galactosides of glycoproteins in the egg jelly.43 Upon release

from the cortical granules at fertilization, the highly oligomeric XCGL engages in high avidity interactions with mucin glycans in the egg jelly. The result is a tightly crosslinked egg jelly that forms an impenetrable fertilization envelope.44 This fertilization envelope acts as a physical barrier to polyspermy, and may additionally protect the embryo from microorganisms present in the environment.

The second cortical granule protein, XCGL2, shares 87.5% sequence identity with XCGL. Both are expressed at the highest level in unfertilized eggs, and their expression decreases throughout embryogenesis.45 Sequence alignment of XCGL and XCGL2 shows that most variability occurs in the N-terminal domain, while identical amino acid residues are present in the ligand-binding site.30 Together, these data indicate that the two cortical granule lectins share the same ligands. Interestingly, XCGL and XCGL2 are the only Xenopus X-type lectins that differ from XEEL in the residues involved in carbohydrate binding. Where XEEL forms an aromatic box with W317 and W326, XCGL and XCGL2 have phenylalanine and asparagine residues, respectively. The difference in these two amino acid residues influences whether the lectins engage in self-carbohydrate epitope recognition, as with the cortical granule lectins, or microbial glycan recognition, as with XEEL and the other X-type lectins.30

1.5.2 Serum lectins

At least three X-type lectins have been identified in the serum of X. laevis: XCL-1, XCL-2, and XCL-3. In 1985, a serum lectin was identified by Roberson and colleagues, and shown to differ from XCGL in amino acid composition. Despite the sequence differences, the lectin retained galactoside-binding ability.46 Still, it is unclear whether this is one of the serum lectins later characterized and discussed herein. In 2007, Ishino and colleagues cloned the cDNA of a

calcium-dependent lectin isolated from adult Xenopus serum (XCL-1). They then used primers targeting the regions with high conservation in other known X-type lectins to identify potential X-type lectins expressed during tail regeneration in X. laevis tadpoles. In this way, they

identified Xenopus calcium-dependent serum lectin 2 (XCL-2), which shares 60% amino acid identity with XCL-1.47 Finally, XCL-3 was identified from a DNA database48 based on sequence similarity to XCL-1 and -2.49 XCL-1, -2, and -3 are differentially expressed in adult frog

tissues.49 Alignment of the amino acid sequences of XCL-1, -2, and -3 with XEEL shows that XCL-1 and -2 have conserved residues at both the structural and carbohydrate-binding calcium ion coordination sites. XCL-3 has conserved residues at almost all sites; the only exception is V305, which corresponds to W317 in the ligand-binding site of XEEL (Figure 1-3).30 The consequence of this change is not clear but, taken with the additional non-conserved C-terminal residues in XCL-3, it likely indicates a different carbohydrate binding specificity. XCL-1, -2, and -3 differ most greatly from each other in their N-terminal sequences downstream of the signal peptide. These differences could affect the oligomerization states of the mature proteins.

Nagata and colleagues performed more in-depth analyses of XCL-1 using a monoclonal antibody specific to the lectin. They determined that expression of the gene encoding XCL-1 is induced in response to lipopolysaccharide (LPS) injection, and additionally observed calcium-dependent binding to Staphylococcus aureus and LPS, and to a lesser extent, to Escherichia coli. Binding of XCL-1 to bacteria was inhibited by the pentoses ribose and xylose.49 However, this binding may not be indicative of XCL-1 glycan specificity, as the carbohydrates used for this experiment had a free reducing end, allowing the carbohydrate to equilibrate between the ring-closed and linear forms in solution. Based on sequence similarity to XEEL, XCL-1 likely shares the conserved mechanism of recognition of glycans with exocyclic vicinal diol groups.30

1.5.3 Intestinal lectins

Two novel intelectins, xIntl-3 and xIntl-4, were recently identified in X. laevis intestine.50 The proteins were purified by galactose-sepharose affinity and characterized by N-terminal amino acid sequencing followed by cDNA cloning. XIntl-3 is expressed most highly in the intestine, while xIntl-4 is most strongly expressed in liver, lung, and kidney.

Immunohistochemistry showed xIntl-3 localized in the mucus granules of goblet cells in the intestine and rectum, and injection of LPS increased the xIntl-3 content throughout the intestine and rectum. XIntl-3 formed multimers with as many as 12 copies. Comparison of xIntl-3 with XEEL reveals highly conserved binding site residues, suggesting that xIntl-3 also functions in recognizing microbial glycans. Indeed, it can agglutinate E. coli in a calcium-dependent manner.50

1.5.4 Embryonic epidermal lectin

The Xenopus embryonic epidermal lectin (XEEL) was identified by Nagata and colleagues in 2003. Analysis of cDNA from X. laevis embryo led to the identification of a 342 amino acid protein containing a signal sequence and fibrinogen-like motif that localized to epidermal cells in the embryonic stage.51 The protein shares 62-70% identity with XCGL, mouse intelectin-1, and the human intelectins. Nagata later determined XEEL to be a disulfide-linked homohexamer with highest expression at the hatching stage of the embryo.52 Recently, Wangkanont and colleagues determined the structure of XEEL using X-ray crystallography, as described in detail in section 1.4. This structure revealed a conserved binding site structure and ligand binding mode between XEEL and human Itln-1 for recognizing microbial glycans containing an exocyclic vicinal diol.30 Taken together, these studies suggest a role of

XEEL in innate immunity for the hatching embryo. XEEL is secreted into the environmental water and can entrap microbes via agglutination, thereby regulating the microbes at the surface of the developing hatchling.

1.6 Mouse intelectins

The first mouse intelectin was identified by Komiya et al. in 1998 from a large scale in situ hybridization screen on intestinal tissue from BALB/c mice. Because the transcript showed homology to XCGL and appeared to be expressed in the intestine, it was named intelectin, for intestinal lectin.12 Since then, additional intelectins have been identified in mice and, importantly, the number of intelectins varies between common laboratory mouse strains. The aforementioned intelectin is mouse intelectin-1 (mItln-1, also called Itlna), which is present in all mouse strains. However, it is the only intelectin present in C57BL/6J mice.18, 53 Other common laboratory strains of mice have undergone expansion of the intelectin locus on chromosome 1 to contain up to six intelectin genes. 129S7 mice have six intelectins: mItln-1, -2, -5, and -6 are full-length proteins with highly conserved but not identical sequences, whereas mItln-3 and -4 contain early stop codons.18 The variation in the number of intelectin genes matters because mice are critical immunological models, and the potential function of intelectins in innate immunity may affect the interpretation of results obtained with different mouse strains.

MItln-1 and mItln-2 are the only mouse intelectins that have been further studied. MItln-1 is expressed solely in the Paneth cells of the mouse small intestine. The lectin’s binding pocket has residues identical to those of hItln-1;12, 18 it therefore likely has an identical glycan binding profile. However, mItln-1 lacks both the cysteine residues corresponding to those that form the intermolecular disulfide bonds in the hItln-1 homotrimer and the glycosylation site (N163)

present in hItln-1.54 We hypothesize that mItln-1 is a non-covalent homotrimer, due to the observation that the XEEL CRD is a non-covalent homotrimer in solution (described above,

Figure 1-1C),30 as well as the observation that mItln-1 binds immobilized galactofuranose with a similar affinity to hItln-1,31, 55 suggesting that it is in a multimeric form that takes advantage of multivalent binding to its ligands. MItln-1 expression in small intestinal Paneth cells was shown to increase three and a half fold in response to microbial colonization of germ free NMRI mice,56 suggesting that mItln-1 likely serves an innate immune function at the mucosal surface of the small intestine. In contrast to mItln-1, mItln-2 is a goblet cell product expressed in the mouse lung and intestine.18, 53 The binding site of mItln-2 is distinct from those of both hItln-1 and hItln-2 (Figure 1-2B). Where hItln-1 has the aromatic box consisting of W288 and Y297, mItln-2 has an alanine and tyrosine at the analogous positions. In hItln-2, the analogous positions are tryptophan and serine, respectively. Thus, human and mouse Itln-2 are likely to display different glycan binding specificities, and the mouse Itln-2 should not be viewed as a model for the human Itln-2. To date, there are no known ligands of mItln-2, but it is upregulated in response to parasitic nematode infection.53, 57-59 An intriguing possibility is that the lectin could recognize glycans specific to nematodes.
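The binding-site comparisons made so far can be collected into a small lookup, sketched here in Python as a reading aid. The residue pairs are taken from the text (positions analogous to hItln-1 W288 and Y297). The names `BOX_RESIDUES` and `has_aromatic_box` are hypothetical, and treating a box as "aromatic" only when both positions are Trp, Tyr, or Phe is a simplifying assumption of this sketch, not a criterion stated in the cited work.

```python
# Side chains counted as aromatic in this sketch (an assumption, see lead-in).
AROMATIC = {"W", "Y", "F"}

# (residue analogous to hItln-1 W288, residue analogous to hItln-1 Y297),
# as described in the text for each lectin.
BOX_RESIDUES = {
    "hItln-1": ("W", "Y"),
    "mItln-1": ("W", "Y"),
    "XEEL": ("W", "W"),
    "XCGL": ("F", "N"),
    "XCGL2": ("F", "N"),
    "mItln-2": ("A", "Y"),
    "hItln-2": ("W", "S"),
}


def has_aromatic_box(lectin: str) -> bool:
    """True when both box positions carry an aromatic side chain."""
    return all(residue in AROMATIC for residue in BOX_RESIDUES[lectin])


if __name__ == "__main__":
    for name in BOX_RESIDUES:
        print(name, BOX_RESIDUES[name], has_aromatic_box(name))
```

Under this simple criterion, hItln-1, mItln-1, and XEEL retain the box, whereas XCGL, XCGL2, mItln-2, and hItln-2 do not, matching the grouping described in the text.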

Studies of mouse intelectins show that the intelectin proteins are upregulated by the T helper type 2 (Th2) innate immune response in response to infection by pathogenic organisms.58 The Th2 response is characterized by the cytokines interleukin (IL) -4 and IL-13 and is the primary response that drives allergic asthma and parasitic helminth expulsion.60 IL-4 and IL-13 activate phosphorylation of signal transducer and activator of transcription 6 (STAT6), a

transcription factor that activates genes involved in humoral immunity.60-62 By infecting BALB/c mice (resistant to nematode infection) and AKR mice (susceptible to infection, lack the Th2

response) with the intestinal nematode Trichuris muris, Datta et al. showed that only in BALB/c mice is mItln-1 expression upregulated after challenge. A later study showed mItln-1 and mItln-2 upregulation in response to Nippostrongylus brasiliensis in a STAT6-dependent manner.59 However, when both mouse intelectin genes were constitutively expressed in the lung, there was no sign of enhanced clearance of N. brasiliensis,59 indicating that the lectins themselves do not directly clear the parasite. Rather, they are either expressed alongside other STAT6-dependent genes that are responsible for clearance, or they are part of a clearance pathway whose other components must also be expressed in a STAT6-dependent manner. While the aforementioned studies implicate both mItln-1 and -2 in helminth clearance, another study pointed toward mItln-2 as the critical player. Pemberton et al. showed that when infected with Trichinella spiralis, strains that have mItln-2 (129/SvEv and BALB/c) can expel the worms, while strains that do not have mItln-2 (C57BL/6 and C57BL/10) display delayed worm expulsion.53

The mouse intelectins have also been implicated in the Th2 allergic asthma response. When FVB/NCrl mice were challenged with ovalbumin (OVA) to initiate an allergic airway response, the expression of both mItln-1 and mItln-2 genes in airway mucus cells was significantly increased. In addition, upregulation of the production of monocyte chemotactic proteins (MCP) -1 and -3 appears to depend on mItln expression. The aforementioned proteins increased with the intelectins upon OVA challenge, but when mouse lung epithelial cells were treated with mItln shRNA, MCP-1 and -3 were not upregulated in response to IL-13.63 Other genes expressed in asthma in mouse in response to IL-13 are also likely involved in glycan-protein interactions, including trefoil factor (TFF) 1, TFF2, and the mucins Muc5AC and Muc5B.64, 65

1.7 Sheep intelectins

Sheep have three intelectins with expression primarily at the mucosa of the airway, lung and gut. These intelectins, sItln-1, sItln-2, and sItln-3, were named by the order in which they were identified. The naming of sheep intelectins has no relation to their homology to other mammalian intelectins. A sequence comparison of sheep and human intelectins indicates that sItln-1 and sItln-2 have 80.4% and 86.6% identity to hItln-1, respectively. Both have identical residues corresponding to the W288 and Y297 in the hItln-1 carbohydrate binding pocket (Figure 1-3B). This analysis suggests that both sItln-1 and sItln-2 recognize glycans similar to those bound by hItln-1, with sItln-2 being the more similar of the two to hItln-1. Sheep Itln-3 has a serine in place of the Y297 of hItln-1; therefore, sItln-3 maps to the hItln-2 binding site residues. Alignment of the two sequences shows that sItln-3 and hItln-2 are 82.4% identical (Figure 1-3A). The glycan specificities of both sItln-3 and hItln-2 are unknown.

In uninfected sheep, the three sheep intelectins display distinct tissue distributions. SItln-1 is expressed in the lung, abomasum, colon, gastric lymph node and terminal rectum. SItln-2 is strongly expressed in the abomasum (a ruminant stomach component), but not detected in other tissues. SItln-3 is expressed widely in the digestive tract, including the abomasum, jejunum, colon, rectum, and ileal Peyer’s patches.66 By immunohistochemistry, the intelectins appear to be expressed and secreted with the mucus. They are localized to goblet cells in the lung, mucus neck cells in the abomasum, mucus cells in the colon, gastric mucus, and free mucus in the ileum.66-68 Indeed, sItln-2 has been shown to interact with the mucin Muc5AC from gastric mucus of sheep.69 Purified sheep gastric mucus separated by SDS-PAGE showed a high molecular weight band that, when analyzed by mass spectrometry, contained both Muc5AC and sItln-2. The purified mucin required both SDS and reducing agent, along with boiling, to

dissociate sItln-2 from the Muc5AC,69 indicating a reasonably high affinity association between the two molecules. SItln-2 shares binding site residues with hItln-1, and hItln-1 does not appear to bind mammalian glycans;31 consequently, it seems unlikely that sItln-2 binds the Muc5AC O-glycans.

The focus of studies of the sheep intelectins has been the Th2 immune response to asthma, allergens, or parasitic infection. Because of the prevalence of nematode infection in livestock, a major emphasis has been on parasitic nematode infection. French and colleagues identified an intelectin in the sheep airway goblet cells that is upregulated by the Th2 cytokine, IL-4.67 This finding follows on the observations that mouse strains either susceptible or resistant to nematode infection have differential expression of intelectins,53 and humans with Th2

asthmatic responses have a high level of intelectin expression.64 Shortly thereafter, the second intelectin, sItln-2, was identified in sheep abomasum and observed to be upregulated in response to infection with the nematode Teladorsagia circumcincta. SItln-2 localizes to the mucus neck cells and the gastric mucus of the abomasum, and was upregulated along with galectin-14, IL-4, and sheep mast cell protease (sMCP-1) in response to T. circumcincta infection.68 Proteomic analysis of the gastric epithelia of sheep capable of preventing the establishment of the parasite (considered immune sheep) showed that sItln-2 levels are higher than those of naïve sheep.70 When the parasite was reintroduced to previously infected sheep, they upregulated sItln-2 more rapidly than naïve sheep did upon initial infection.68 SItln-1 is also upregulated in the abomasum in response to T. circumcincta infection.66 Upon infection with another parasitic worm, Haemonchus contortus, sItln-2 was the most highly upregulated gene in the early stages of infection, followed by trefoil factor 3 (TFF3).71 The parallels in the expression of intelectins and trefoil factors are interesting. The trefoil factor family of proteins was recently shown to have lectin activity with

binding specificity for mucin glycans.72 Members of this lectin class are known to play a role in protection of the mucosal epithelium by altering the rheological properties of mucus.73, 74 Finally, infection of the lung with the parasitic worm Dictyocaulus filaria caused upregulation of sItln-2 and sItln-3, which do not normally show expression in the lung.66

The size of parasitic nematodes explains why they cannot be phagocytosed. Rather, via the Th2 immune response, tissue repair pathways are activated to strengthen the epithelial barrier and expel the nematodes. Expulsion is achieved through multiple mechanisms including

epithelial hyperproliferation and increased mucus production.75 It is unclear whether the

observed increase in sItln expression in response to nematode infection is simply a byproduct of the expansion of goblet cells and increased mucus secretion or if sItlns themselves play a direct role in the defense against nematodes. However, the data implicate intelectins in the nematode response. Bolstering these studies in sheep is the finding that mice lacking mItln-2 are

susceptible to parasitic infection, while those expressing mItln-2 are resistant.53 The molecular mechanism by which intelectins participate in nematode resistance is unknown. New tools such as expanded glycan microarrays to identify the carbohydrate ligands of the various intelectins and intelectin knockout model species will be crucial to understanding the role of these proteins in response to nematode infection.

1.8 Human intelectins

Humans have two intelectins, hItln-1 and hItln-2, which share 83% identity and differ in their binding pocket residues (Figures 1-2A, 1-3) and N-terminal regions. Three separate reports identified the human intelectins in 2001.13-15 Using cDNA libraries from small intestine, two homologs of XL-35 were identified and termed HL-1 and HL-2.13 In tandem, hItln-1 was

fetal tissues showed that hItln-1 is expressed very highly in the intestine.14 A proteomic analysis of a preterm infant revealed that hItln-1 was one of the 20 most abundant proteins present in stool throughout the first month of life.76 Expression patterns in adult tissues differ, with the highest mRNA expression observed in heart and lower expression observed in the small intestine, colon, thymus, salivary gland, ovary, testis and pancreas.13-15 HItln-2, in contrast, was only expressed in small intestine.13 Single cell RNA sequencing data from human intestinal biopsy tissue shows that hItln-1 is a goblet cell product, while hItln-2 is a Paneth cell product.77, 78 Similarly, in the adult lung, hItln-1 was shown to localize to goblet cells by immunohistochemistry.79 The human proteins omentin and lactoferrin receptor (LfR) are identical to hItln-1, and research has implicated this protein, under whichever name, in innate immunity, metabolic disorders, cancers, asthma and inflammatory bowel disease (IBD). In contrast, we could not find studies that focus on hItln-2.

1.8.1 Glycan recognition

Glycan recognition by hItln-1 has been extensively characterized. While early studies of hItln-1 suggested it to be a furanose binding lectin,15 further characterization revealed that it does not have specificity for furanose sugars. The lectin specificity rests in its engagement of glycans with an exocyclic vicinal diol moiety found in the microbial monosaccharides β-Galf, D-glycero-D-talo-oct-2-ulosonic acid (KO), and 3-deoxy-D-manno-oct-2-ulosonic acid (KDO), and in Gro-1-P.31, 80 A likely explanation for the confusion around glycan specificity is that the initial studies used carbohydrates that had a free anomeric hydroxyl (free reducing end). As a result, the carbohydrates tested could adopt either the linear

or closed ring forms in solution. The linear form of the sugars would display diols that could be bound by hItln-1.

Although the heptose sugars contain an exocyclic vicinal diol, they are not ligands of hItln-1. Their inability to bind effectively to the lectin has been attributed to their prevalent solution conformations, stabilized by stereoelectronic effects, which are sterically incompatible with the hItln-1 binding site.80 The prevalent mammalian saccharide, sialic acid, is also not a ligand of hItln-1. In silico docking studies suggest that it too is sterically occluded from the binding pocket.31 HItln-1 therefore has evolved to have highly specific glycan recognition, despite a seemingly simple minimal binding epitope. Specifically, the exclusion of the mammalian building block sialic acid and of L,D-heptose, which is the most abundant bacterial monosaccharide,81 suggests that hItln-1 has evolved to recognize distinct bacterial species.

1.8.2 Human omentin

An adipokine identified in human omental fat was named omentin82 and then later determined to be identical in sequence to human Itln-1.83 Omentin, which I will refer to as hItln-1 for consistency, is detected in human serum and, while the levels of serum hItln-1 are highly variable, it has been measured as high as 800 ng/mL in healthy human sera.84 Serum omentin concentration has been observed to correlate inversely with body mass index, waist circumference, leptin levels, and insulin resistance;85 thus, hItln-1 is decreased in many metabolic and inflammatory conditions. Lower levels are also observed in many diseases including type 2 diabetes86-88 and polycystic ovary syndrome.89 Patients with IBD, including Crohn’s disease (CD) and ulcerative colitis (UC), have decreased omentin expression in omental tissue83 as well as decreased serum omentin levels.90 Additional conditions and diseases with altered serum hItln-1 levels are compiled in reference 84.84

Studies of omentin in humans and mice have pointed toward a potential anti-inflammatory role. In vitro studies of smooth muscle cells (SMCs) and human umbilical vein endothelial cells (HUVECs) have shown that omentin inhibits tumor necrosis factor alpha (TNF-α) activation of nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB), and the resultant expression of downstream effector molecules.91-93 Additionally, when human monocytes were differentiated to macrophages in the presence of recombinant omentin, the resulting macrophages were the anti-inflammatory M2-type macrophages, as indicated by expression of PPAR-γ.94 Recombinant omentin was also able to attenuate toll like receptor (TLR) -4 stimulation by LPS and downstream activation of NF-κB in human macrophages from the U937 cell line.95 It is unclear from this study whether the effect of omentin is due to its interaction with a receptor or to its sequestering LPS, as core LPS contains the hItln-1 ligands KO and KDO. Nevertheless, an in vivo study in mice investigating the effect of omentin in inflammatory bone disease showed that, in an omentin knockout mouse model, bone tissue has increased expression of TNF-α and the inflammatory cytokines IL-1α, IL-1β, and IL-6. Treatment with an adenovirus containing the mouse omentin gene rescued the inflammatory phenotype and increased the prevalence of M2 macrophages in bone marrow.96 While these studies provide consistent evidence of omentin’s anti-inflammatory role both in vitro and in vivo, it should be noted that all recombinant omentin used in the studies discussed here was from suppliers that expressed the recombinant protein in E. coli or the source was not indicated. As discussed earlier, recombinant X-type lectins produced in E. coli lack N-glycosylation and would not be expected to be properly folded. Without a demonstration that the

recombinant protein is functional, such studies should be interpreted cautiously. To further elucidate the anti-inflammatory mechanism of omentin, mammalian produced recombinant proteins should be utilized and additional follow-up studies in mouse models will be necessary.

1.8.3 Intelectins in diseases associated with microbial dysbiosis

Genome-wide association studies (GWAS) have linked the hItln-1 polymorphism, V109D, to asthma and CD.97, 98 Interestingly, aspartic acid is the most conserved residue at this position across intelectins. In human Itln-1, valine is the predominant residue at position 109, suggesting that a valine at position 109 may confer some functional benefit. However, further studies to understand potential structural and biological consequences of the two variants will be required to gain insight into their contribution to disease states. With regard to hItln-1 production, high levels are observed in the mucus of the lung during allergic asthma,99 and single cell RNA sequencing of human intestinal tissue has shown hITLN1 expression to be upregulated in CD and UC.77, 78 Asthma and IBD are each associated with dysbiosis of the microbiota.100, 101 The observations provided here, along with the binding specificity of hItln-1 for microbial glycans, strongly suggest a role of hItln-1 in regulation of the mucosa-associated microbiota. To date, very few studies have examined the ability of hItln-1 to bind microbes,31, 55 and none have specifically examined binding to components of the human gut microbiota.

1.9 Conclusions

In this review, I have summarized the research to date into the Xenopus, sheep, mouse and human X-type lectins. Throughout chordates, the intelectins share high levels of sequence homology. Most of the variability between intelectin sequences occurs at the N-terminus, which

governs oligomerization state, such as is observed for the XEEL extended N-terminus (Figure

1-1B).30 Additionally, there is variability in the residues of the carbohydrate recognition domain that confer glycan-binding specificity (Figure 1-2 and 1-3), suggesting that individual intelectins have evolved specificity for different glycan structures. While the glycan specificity of hItln-1 is well-defined, no glycan ligands have been identified for intelectins with varied binding site residues. Expanded glycan arrays will be crucial for determining the glycan specificities of intelectins and other soluble lectins. Missing from the current glycan arrays are representative glycans from fungi, helminths, and commensal microbes. The ability to interrogate glycans from these organisms will expand our understanding of protein carbohydrate interactions relevant to mucosal innate immunity.

While intelectin sequences across species are highly related, intelectin expression is variable. A recurrent theme in mammals is intelectin expression at mucosal sites, including the lung and the gastrointestinal tract. At these sites, expression of the intelectins is upregulated upon infection or in disease states, strongly suggesting a role in innate immunity at mucosal surfaces. The finding that intelectin production is upregulated by the Th2 response is intriguing. The Th2 response also results in goblet cell differentiation and increased mucin secretion, as well as expression of other proteins including trefoil factors and galectins, which have known roles in crosslinking mucins to protect the epithelial surface.72-74 Intelectins are produced by the mucus-secreting cells and have been shown to co-purify with the mucin Muc5Ac in sheep.77 While intelectin does not recognize mammalian glycans, it does have a fibrinogen-like domain that could participate in protein-protein interactions. Further, determining whether intelectins are able to alter mucus properties, and whether their microbial binding abilities allow them to entrap or anchor microbes to the mucus layer, will provide clues to intelectin biological function.


Finally, despite a proposed role in innate immunity, recognition of microbial glycans, and expression at mucosal surfaces, the ability of intelectins to interact with microbes resident to the human microbiome has not been explored. I address this gap with the experiments described in this thesis. Herein, I have further defined the glycan specificity of human intelectin-1, revealing that the heptoses are not ligands despite containing an exocyclic diol. This suggests a more selective microbial binding profile than previously proposed, as L,D-heptose is the most abundant bacterial carbohydrate moiety.81 I then analyze the ability of hItln-1 to recognize gut commensal bacteria. To this end, I developed a new strategy, termed lectin-sequencing, for identifying bacterial targets of lectins in native microbial communities such as the human gut microbiome. By lectin-sequencing, I identify bacterial species bound by hItln-1 in healthy human stool samples and reveal that hItln-1 recognizes many Gram-positive, butyrate-producing bacteria that are associated with a healthy microbiota. These findings suggest a role for hItln-1 in microbiome regulation that has not previously been proposed for lectins: that it may be a positive effector of microbial colonization.


1.10 Acknowledgements

I would like to thank Professor Laura L. Kiessling and all of LectinLand for thoughtful and stimulating conversations about intelectins and for humoring me when I want to talk about panda intelectins, and Dr. Austin Kruger for humoring me when I want to talk about coral intelectins. I would also like to thank Forrest FitzGerald, a rotation student who worked with me on the sequence alignments. Finally, I would like to thank Katherine Taylor, Melanie Halim, and Dr. Sayaka Masuko for feedback and edits in preparing this chapter.


Figure 1-1. X-type lectin structures. (A) X-ray crystal structure of hItln-1 bound to allyl-β-D-Galf (β-Galf, PDB 4WMY)
Figure 1-2. Carbohydrate binding sites from select X-type lectins. (A) Carbohydrate binding site of hItln-1 bound to allyl-β-D-Galf (PDB 4WMY) and model of hItln-2 binding site
Table 1-1. GenBank accession codes for aligned sequences in Figure 1-2
Figure 2-1. Factors contributing to lectin–
