Lignin monomer biosynthesis

From Purdue Genomics Database Facility

This work was supported by the National Science Foundation

Nicholas D. Bonawitz (nbonawit@purdue.edu), Jing-Ke Weng (wengj@purdue.edu), and Clint Chapple (chapple@purdue.edu)

Department of Biochemistry, Purdue University, West Lafayette, IN USA

Introduction

The three monomers of lignin (from Wikipedia)
The phenylpropanoid pathway

Lignin is a branched phenolic polymer present in vascular plants that is required to provide structural strength to the secondary cell wall1. The appearance of lignin represents a crucial step in the evolution of land plants, as it allowed them for the first time to grow tall by providing rigidity to the stem and by resisting negative vascular pressure during transpiration. Bryophytes such as mosses do not synthesize lignin, lack true vasculature, and are only capable of growing low to the ground. Lignin has drawn increased attention from the scientific community recently due to its impact on cellulosic biofuels production. As a component of bulk biomass, lignin inhibits the extraction of fermentable sugars from cellulose and hemicellulose, and a better understanding of its synthesis may allow for the rational manipulation of lignin in future biofuels crops to minimize these detrimental effects2.

The lignin polymer is composed of three different monomers, or monolignols--sinapyl alcohol, coniferyl alcohol, and p-coumaryl alcohol--which upon incorporation into lignin are referred to as syringyl (S), guaiacyl (G), or hydroxyphenyl (H) units, respectively. The proportion of S, G, and H lignin varies among different cell types and among different species, and can substantially impact the physical properties of the lignin polymer3. In gymnosperms, lignin is primarily composed of G subunits, with smaller amounts of H, and S units are completely absent. Interestingly, S lignin is found in both angiosperms and lycophytes, and phylogenetic analysis suggests that these lineages independently evolved the ability to synthesize syringyl monolignol4.

All three monolignols are products of the phenylpropanoid pathway, a plant-specific pathway that generates a number of secondary metabolites using phenylalanine as a starting material. Following deamination of phenylalanine to form cinnamic acid, the synthesis of monolignols can be thought of as a sequence of chain reductions and ring modifications (oxidations and methylations). Recent work has greatly increased our understanding of this pathway5, and the enzymes responsible for carrying out lignin biosynthesis have been identified in Arabidopsis. Below we describe the individual steps in the pathway of lignin monomer biosynthesis, and the genes encoding the putative Selaginella enzymes responsible for carrying them out.

(Note: in many cases, the traditional name of the enzyme may not reflect recent insights on the mechanisms of the phenylpropanoid pathway. For example, the enzyme responsible for carrying out hydroxylation of the 5-position of the aromatic ring of the lignin precursor is misleadingly called ferulate 5-hydroxylase, even though it is now known that F5H shows a strong preference for hydroxylating the aldehyde or alcohol derivatives of ferulate rather than the carboxylic acid form. Nevertheless, these enzymes are typically referred to by their traditional names in the literature and these common names will be retained here.)
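As an orientation aid, the sequence of steps described in the enzyme sections of this page can be written down as a small reaction graph. The following is only a minimal sketch in Python: the names `STEPS` and `route` are hypothetical, the substrates and products are transcribed from the text of this page, and feruloyl CoA as the product of CCoAOMT is inferred from the CCoAOMT and CCR descriptions rather than stated outright. The route to p-coumaryl alcohol is left out because this page does not name the step that produces p-coumaraldehyde.

```python
from collections import deque

# (substrate, enzyme, product) triples transcribed from the enzyme sections.
# Illustrative only: this is a reading aid, not a curated pathway model.
STEPS = [
    ("phenylalanine", "PAL", "cinnamate"),
    ("cinnamate", "C4H", "p-coumarate"),
    ("p-coumarate", "4CL", "p-coumaroyl CoA"),
    ("p-coumaroyl CoA", "HCT", "p-coumaroyl shikimate/quinate"),
    ("p-coumaroyl shikimate/quinate", "C3'H", "caffeoyl shikimate/quinate"),
    ("caffeoyl shikimate/quinate", "HCT", "caffeoyl CoA"),
    # Product inferred: CCoAOMT methylates caffeoyl CoA, CCR reduces feruloyl CoA.
    ("caffeoyl CoA", "CCoAOMT", "feruloyl CoA"),
    ("feruloyl CoA", "CCR", "coniferaldehyde"),
    ("coniferaldehyde", "CAD", "coniferyl alcohol"),
    ("coniferaldehyde", "F5H", "5-hydroxyconiferaldehyde"),
    ("coniferyl alcohol", "F5H", "5-hydroxyconiferyl alcohol"),
    ("5-hydroxyconiferaldehyde", "COMT", "sinapaldehyde"),
    ("5-hydroxyconiferyl alcohol", "COMT", "sinapyl alcohol"),
    ("sinapaldehyde", "CAD", "sinapyl alcohol"),
]


def route(start, goal):
    """Breadth-first search; returns a shortest list of (enzyme, product) steps."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        compound, path = queue.popleft()
        if compound == goal:
            return path
        for substrate, enzyme, product in STEPS:
            if substrate == compound and product not in seen:
                seen.add(product)
                queue.append((product, path + [(enzyme, product)]))
    return None  # goal not reachable from start in this sketch


g_route = route("phenylalanine", "coniferyl alcohol")
s_route = route("phenylalanine", "sinapyl alcohol")
print(" -> ".join(enzyme for enzyme, _ in g_route))
```

In this sketch the route to sinapyl alcohol is two steps longer than the route to coniferyl alcohol, and the extra enzymes are F5H and COMT, which matches the description of those two enzymes as specific to S lignin.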

The enzymes of lignin biosynthesis

Phenylalanine ammonia lyase (PAL)

The first step in the general phenylpropanoid pathway, PAL (EC4.3.1.5) catalyzes the conversion of phenylalanine to cinnamic acid and ammonia (see reaction at KEGG)6. The Arabidopsis genome encodes four PALs, PAL1 (AT2G37040), PAL2 (AT3G53260), PAL3 (AT5G04230), and PAL4 (AT3G10340). Searching the Selaginella genome revealed nine sequences that likely encode Selaginella PAL homologs. Three of these genes, SmPAL2, SmPAL3, and SmPAL4, appear to be due to tandem duplication. The SmPAL3-2 allele appears to have been further duplicated on only one contig, the resulting alleles being designated SmPAL3-2a and SmPAL3-2b.

PAL1-1, PAL1-2

PAL2-1, PAL2-2

PAL3-1, PAL3-2a, PAL3-2b

PAL4-1, PAL4-2
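The sequence names listed above appear to follow a pattern of gene name, hyphen, allele number, and an optional letter for a further duplicated copy. A minimal Python sketch of how such names could be grouped by gene is given below. The pattern `NAME_RE` and the function `group_by_gene` are hypothetical and are based only on the names shown on this page, not on a published nomenclature rule.

```python
import re
from collections import defaultdict

# The nine PAL sequence names listed on this page.
NAMES = ["PAL1-1", "PAL1-2", "PAL2-1", "PAL2-2",
         "PAL3-1", "PAL3-2a", "PAL3-2b", "PAL4-1", "PAL4-2"]

# Assumed pattern: <gene>-<allele digit><optional copy letter>, e.g. PAL3-2a.
NAME_RE = re.compile(r"^(?P<gene>.+)-(?P<allele>\d)(?P<copy>[a-z]?)$")


def group_by_gene(names):
    """Map each gene name to the list of allele labels found for it."""
    groups = defaultdict(list)
    for name in names:
        match = NAME_RE.match(name)
        if match is None:
            raise ValueError(f"unrecognized sequence name: {name}")
        groups[match.group("gene")].append(match.group("allele") + match.group("copy"))
    return dict(groups)


groups = group_by_gene(NAMES)
for gene, alleles in groups.items():
    print(gene, alleles)
```

Grouping this way gives four genes and nine sequences in total, with PAL3 carrying the extra copy that the text attributes to a duplication on one contig.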

Phylogenetic tree of PAL sequences

Trans-cinnamate 4-hydroxylase (C4H)

In the second step of lignin monomer biosynthesis, cinnamate is hydroxylated at the 4-position, yielding p-coumarate, by the enzyme C4H (EC1.14.13.11; see reaction at KEGG)7. C4H is a P450-dependent monooxygenase (CYP73A5 according to P450 nomenclature), and like PAL is required for the synthesis of flavonoids as well as lignin. The Selaginella genome appears to encode a single ortholog of Arabidopsis C4H (AT2G30490). For a phylogenetic tree of C4H, see the separate page on cytochrome P450-dependent monooxygenases.

C4H1-1, C4H1-2

4-Coumarate:CoA ligase (4CL)

The enzyme 4CL (EC 6.2.1.12) converts p-coumarate to its activated thioester form, p-coumaroyl CoA (see reaction at KEGG)8. This reaction is the last step in the general phenylpropanoid pathway, as p-coumaroyl CoA can be acted upon by HCT (described below) to continue through the lignin monomer pathway, or, alternatively, can be shunted into flavonoid biosynthesis by the action of the chalcone synthase enzyme. The Arabidopsis genome was recently shown to encode four functional 4CL enzymes, designated At4CL1 (AT1G51680), At4CL2 (AT3G21240), At4CL3 (AT1G65060), and At4CL5 (AT3G21230, also referred to as 4CL4). All four of the 4CL isoforms described in Arabidopsis are able to convert p-coumarate, caffeate, and ferulate to their activated thioester forms. However, only At4CL5 is capable of taking sinapate as a substrate. The Selaginella genome appears to contain two genes encoding 4CL enzymes.

4CL1-1, 4CL1-2

4CL2-1, 4CL2-2
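The substrate preferences reported for the Arabidopsis 4CL isoforms can be summarized as a small lookup table. The Python sketch below encodes only what is stated in this section; the names `SUBSTRATES` and `isoforms_accepting` are hypothetical, and the table says nothing about relative catalytic efficiency or about the Selaginella enzymes.

```python
# Substrates accepted by all four Arabidopsis isoforms, per the text above.
COMMON = {"p-coumarate", "caffeate", "ferulate"}

SUBSTRATES = {
    "At4CL1": set(COMMON),
    "At4CL2": set(COMMON),
    "At4CL3": set(COMMON),
    "At4CL5": COMMON | {"sinapate"},  # the only isoform reported to take sinapate
}


def isoforms_accepting(substrate):
    """Return the sorted names of isoforms listed as accepting the substrate."""
    return sorted(name for name, accepted in SUBSTRATES.items() if substrate in accepted)


print(isoforms_accepting("sinapate"))
print(isoforms_accepting("ferulate"))
```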

Phylogenetic tree of 4CL sequences

Hydroxycinnamoyl-CoA:shikimate/quinate hydroxycinnamoyl transferase (HCT)

HCT (EC2.3.1.133) catalyzes the formation of an ester bond between p-coumaroyl CoA and either shikimate or quinate (see reaction at KEGG here or here)9. The resulting ester, p-coumaroyl shikimate/quinate, then acts as the substrate for the next step of lignin monomer biosynthesis, hydroxylation of the 3-position of the aromatic ring of p-coumarate (i.e. the 3’ position of the p-coumaroyl ester) to form caffeoyl shikimate/quinate. HCT is also thought to be responsible for converting the caffeoyl ester back into caffeoyl CoA, and could thus be thought of as carrying out two steps of lignin biosynthesis. HCT is a member of the newly described BAHD family of acyltransferases, a diverse family of enzymes, the majority of whose functions are unknown10. BAHD acyltransferases are the subject of a separate page in this Wiki. The Selaginella genome appears to encode one ortholog of Arabidopsis HCT (AT5G48930), which has been named according to an attempt to create a standardized nomenclature for the BAHD acyltransferases. For a phylogenetic tree of HCT, see the separate page on BAHD acyltransferases.

BAHDe7-1 and BAHDe7-2

p-Coumaroyl shikimate/quinate 3'-hydroxylase (C3'H)

C3’H (EC1.14.13.36), like C4H and F5H, is a cytochrome P450-dependent monooxygenase (CYP98A3) and is responsible for hydroxylation of the aromatic ring of the lignin monomer precursor, in this case at the 3-position (see reaction at KEGG here or here)11,12. For reasons that remain unclear, Arabidopsis C3’H (AT2G40890) takes an esterified form of p-coumarate as a substrate (p-coumaroyl shikimate or p-coumaroyl quinate). The genome of Selaginella appears to contain one ortholog of C3’H. For a phylogenetic tree of C3'H, see the separate page on cytochrome P450-dependent monooxygenases.

C3'H1-1, C3'H1-2

Caffeoyl-CoA O-methyltransferase (CCoAOMT)

CCoAOMT (EC2.1.1.104) carries out the first methylation of the monolignol aromatic ring, in this case at the 3-position (see reaction at KEGG)13. Although the COMT enzyme (see below) was originally thought to carry out methylation at both the 3 and the 5 positions of the free monolignol, it was subsequently shown that methylation of the 3-hydroxyl group is carried out in angiosperms by the separate CCoAOMT enzyme, which takes the CoA thioester of caffeic acid as a substrate. Analysis of the Selaginella genome failed to identify an unambiguous ortholog of Arabidopsis CCoAOMT (AT1G67980). This could mean either that one of the related sequences identified in the Selaginella genome functions as a CCoAOMT, or alternatively, that methylation of the 3-hydroxyl group of the monolignol precursor is carried out by a non-orthologous enzyme.

Phylogenetic tree of CCoAOMT sequences

Cinnamoyl-CoA reductase (CCR)

The CCR enzyme (EC1.2.1.44) is responsible for reduction of feruloyl CoA to coniferaldehyde (see reaction at KEGG)14. Two CCR enzymes are found in Arabidopsis, designated CCR1 (AT1G15950) and CCR2 (AT1G80820). CCR1 was shown to be identical to the IRX4 gene, which was isolated on the basis of its irregular xylem phenotype. The Selaginella genome appears to encode a single homolog of CCR.

CCR1-1, CCR1-2

Phylogenetic tree of CCR sequences

Ferulate 5-hydroxylase (F5H)

The cytochrome P450-dependent monooxygenase F5H (EC1.14.13.-) is responsible for the hydroxylation of the 5-position of the aromatic ring of coniferaldehyde or coniferyl alcohol, forming 5-hydroxyconiferaldehyde or 5-hydroxyconiferyl alcohol, respectively (see reaction at KEGG here or here)15. Because F5H is the first step of the phenylpropanoid pathway specific for the synthesis of S lignin, it plays a key role in determining the ratio of S to G subunits in the lignin polymer. Interestingly, the presence of S lignin is highly restricted in vascular plant lineages. Ferns, cycads, and conifers do not contain S lignin, and it was previously believed by many to be an angiosperm-specific development. However, recent work has shown that the lignin of Selaginella moellendorffii does contain syringyl units, and that the gene encoding Selaginella F5H can complement the phenotype of an F5H-deficient Arabidopsis mutant (fah1). This appears to be an example of convergent evolution, since the gene encoding Selaginella F5H is not orthologous with Arabidopsis F5H (AT4G36220; CYP84A1). For a phylogenetic tree of F5H, see the separate page on cytochrome P450-dependent monooxygenases.

F5H1-1, F5H1-2

Caffeic acid O-methyltransferase (COMT)

COMT (EC2.1.1.68) is responsible for the final ring modification of monolignol biosynthesis, methylation of the 5-hydroxyl group of 5-hydroxyconiferaldehyde or 5-hydroxyconiferyl alcohol, yielding sinapaldehyde or sinapyl alcohol, respectively (see reaction at KEGG here or here)16. This ring modification is required for the synthesis of syringyl subunits and S lignin, but is dispensable for the production of guaiacyl subunits and G lignin. Phylogenetic analysis failed to identify an unambiguous Selaginella ortholog of Arabidopsis COMT (AT5G54160), though we are currently carrying out experiments to identify the true COMT(s) from several possible related genes.

Phylogenetic tree of COMT sequences

Cinnamyl alcohol dehydrogenase (CAD)

The CAD enzyme (EC1.1.1.195) is responsible for the last step in lignin monomer biosynthesis, catalyzing the conversion of coniferaldehyde, sinapaldehyde, and p-coumaraldehyde to coniferyl alcohol, sinapyl alcohol, and p-coumaryl alcohol, respectively (see reaction at KEGG here, here, or here)17. These alcohols, in turn, serve as the substrates for extracellular peroxidases and laccases and are added into the growing lignin polymer. Although there are nine genes annotated as CADs in Arabidopsis, only three of these have been shown to be catalytically active and involved in lignin biosynthesis. CadC (AT3G19450) and CadD (AT4G34230) are thought to be the most important of these three, and may be partially redundant. Our phylogenetic analysis indicates that the Selaginella genome contains a single ortholog of CadC and CadD. The third CAD enzyme in Arabidopsis, CadG (AT1G72680), appears to be represented in Selaginella by two orthologs.

CadC/CadD ortholog

CAD1-1, CAD1-2

CadG orthologs

CAD2-1, CAD2-2

CAD3-1, CAD3-2

Phylogenetic tree of CAD sequences
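For readers who want to look up the Arabidopsis genes mentioned in the sections above, the locus identifiers can be collected into one table and checked against the usual AGI locus format (AT, a chromosome code, G, and five digits). The Python sketch below is illustrative only: the names `ARABIDOPSIS_LOCI`, `AGI_RE`, and `by_chromosome` are hypothetical, and the identifiers are copied from this page rather than verified against a database.

```python
import re
from collections import defaultdict

# AGI locus format: "AT" + chromosome (1-5, C or M) + "G" + five digits.
AGI_RE = re.compile(r"^AT[1-5CM]G\d{5}$")

# Locus identifiers as quoted in the enzyme sections of this page.
ARABIDOPSIS_LOCI = {
    "PAL1": "AT2G37040", "PAL2": "AT3G53260", "PAL3": "AT5G04230", "PAL4": "AT3G10340",
    "C4H": "AT2G30490",
    "4CL1": "AT1G51680", "4CL2": "AT3G21240", "4CL3": "AT1G65060", "4CL5": "AT3G21230",
    "HCT": "AT5G48930",
    "C3'H": "AT2G40890",
    "CCoAOMT": "AT1G67980",
    "CCR1": "AT1G15950", "CCR2": "AT1G80820",
    "F5H": "AT4G36220",
    "COMT": "AT5G54160",
    "CadC": "AT3G19450", "CadD": "AT4G34230", "CadG": "AT1G72680",
}


def by_chromosome(loci):
    """Group gene names by the chromosome code embedded in each locus identifier."""
    groups = defaultdict(list)
    for gene, locus in loci.items():
        if not AGI_RE.match(locus):
            raise ValueError(f"{gene}: {locus} does not look like an AGI locus")
        groups[locus[2]].append(gene)  # third character is the chromosome code
    return dict(groups)


chromosomes = by_chromosome(ARABIDOPSIS_LOCI)
for chromosome in sorted(chromosomes):
    print(chromosome, chromosomes[chromosome])
```

A format check of this kind only catches transcription slips such as a missing digit; it cannot confirm that an identifier refers to the intended gene.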

References

1. Ralph et al, Phytochem Rev 2004, 3: 29-60

2. Weng et al, Curr Opin Biotechnol 2008, 19: 166-172

3. Boerjan et al, Annu Rev Plant Biol 2003, 54: 519-546

4. Weng et al, Proc Natl Acad Sci USA 2008, 105: 7887-7892

5. Humphreys and Chapple, Curr Opin Plant Biol 2002, 5: 224-229

6. Wanner et al, Plant Mol Biol 1995, 27: 327-338

7. Mizutani et al, Plant Physiol 1997, 113: 755-763

8. Hamberger and Hahlbrock, Proc Natl Acad Sci USA 2004, 101: 2209-2214

9. Hoffmann et al, Plant Cell 2004, 16: 1446-1465

10. D'Auria, Curr Opin Plant Biol 2006, 9: 331-340

11. Schoch et al, J Biol Chem 2001, 276: 36566-36574

12. Ruegger and Chapple, Plant J 2002, 30: 33-45

13. Zou and Taylor, Plant Physiol and Biochem 1994, 32: 423-427

14. Lacombe et al, Plant J 1997, 11: 429-441

15. Meyer et al, Proc Natl Acad Sci USA 1996, 93: 6869-6874

16. Zhang et al, Biochim Biophys Acta 1997, 1353: 199-202

17. Kim et al, Proc Natl Acad Sci USA 2004, 101: 1455-1460
