Huntingtin structure is orchestrated by HAP40 and shows a polyglutamine expansion-specific interaction with exon 1

Harding, Rachel J.; Deme, Justin C.; Hevler, Johannes F.; Tamara, Sem; Lemak, Alexander; Cantle, Jeffrey P.; Szewczyk, Magdalena M.; Begeja, Nola; Goss, Siobhan; Zuo, Xiaobing; Loppnau, Peter; Seitova, Alma; Hutchinson, Ashley; Fan, Lixin; Truant, Ray; Schapira, Matthieu; Carroll, Jeffrey B.; Heck, Albert J. R.; Lea, Susan M.; Arrowsmith, Cheryl H.

doi:10.1038/s42003-021-02895-4

Published: 08 December 2021

Communications Biology volume 4, Article number: 1374 (2021)

Abstract

Huntington’s disease results from expansion of a glutamine-coding CAG tract in the huntingtin (HTT) gene, producing an aberrantly functioning form of HTT. Both wildtype and disease-state HTT form a hetero-dimer with HAP40 of unknown functional relevance. We demonstrate in vivo and in cell models that HTT and HAP40 cellular abundance are coupled. Integrating data from a 2.6 Å cryo-electron microscopy structure, cross-linking mass spectrometry, small-angle X-ray scattering, and modeling, we provide a near-atomic-level view of HTT, its molecular interaction surfaces and compacted domain architecture, orchestrated by HAP40. Native mass spectrometry reveals a remarkably stable hetero-dimer, potentially explaining the cellular inter-dependence of HTT and HAP40. The exon 1 region of HTT is dynamic but shows greater conformational variety in the polyglutamine expanded mutant than wildtype exon 1. Our data provide a foundation for future functional and drug discovery studies targeting Huntington’s disease and illuminate the structural consequences of HTT polyglutamine expansion.

Introduction

The autosomal-dominant neurodegenerative disorder Huntington’s disease (HD) is caused by the expansion of a CAG repeat tract at the 5′ end of the huntingtin gene above a critical threshold of ~35 repeats¹. CAG tract expansion corresponds to an expanded polyglutamine tract of the Huntingtin (HTT) protein, which functions aberrantly compared to its unexpanded form². Polyglutamine expanded HTT is thought to be responsible for disrupting a wide range of cellular processes, including proteostasis^3,4, transcription^5,6, mitochondrial function⁷, axonal transport⁸ and synaptic function⁹. HD patients experience a range of physical, cognitive and psychological symptoms, and longer repeat expansions are associated with earlier disease onset¹⁰. The prognosis for HD patients is poor, with an average life expectancy of just 18 years from the point of symptom onset. There are currently no disease-modifying therapies available to HD patients.
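The repeat-length threshold described above can be illustrated with a minimal sketch. The function names, the toy sequences and the strict "greater than 35" cut-off are assumptions made for illustration only; the regular expression is not reading-frame aware and does not handle interrupted tracts, so this is not a genotyping method.

```python
import re

# Assumption: illustrative cut-off based on the ~35 repeat threshold quoted in the text.
EXPANSION_THRESHOLD = 35


def longest_cag_run(dna: str) -> int:
    """Return the number of repeats in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def is_expanded(dna: str, threshold: int = EXPANSION_THRESHOLD) -> bool:
    """True when the longest CAG run exceeds the (assumed) threshold."""
    return longest_cag_run(dna) > threshold


# Hypothetical toy sequences, not real HTT sequence.
toy_q23 = "ATG" + "CAG" * 23 + "CCGCCA"
toy_q54 = "ATG" + "CAG" * 54 + "CCGCCA"

for name, seq in (("Q23", toy_q23), ("Q54", toy_q54)):
    print(name, longest_cag_run(seq), is_expanded(seq))
```

The two toy lengths mirror the Q23 and Q54 constructs used later in the study.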

HTT is a 3144 amino acid protein composed of namesake HEAT (Huntingtin, Elongation factor 3, protein phosphatase 2A, TOR1) repeats and is hypothesised to function as a scaffold for larger multi-protein assemblies^11,12. Many proteomics and interaction studies suggest HTT has an extensive interactome of hundreds of proteins, but the only biophysically and structurally validated interactor of HTT is the so-called 40-kDa HTT-associated protein HAP40^13,14, an interaction partner conserved through evolution^15,16. HAP40 is a TPR domain protein with suggested functions in endocytosis^17,18,19. An earlier 4 Å mid-resolution cryo-electron microscopy (cryo-EM) model of HTT in complex with HAP40 reveals that the HEAT subdomains of HTT wrap around HAP40 across a large interaction interface²⁰. Biophysical and biochemical analyses comparing purified HTT and HTT-HAP40 samples have revealed that HAP40-bound forms of HTT exhibit reduced aggregation propensity, greater structural stability and monodispersity as well as conformational homogeneity^20,21. Consequently, apo HTT is a more difficult sample to work with for structural and biophysical characterisation, and several studies to date have required cross-linking approaches to constrain the HTT molecule to facilitate its analysis, suggesting HTT-HAP40 interactions may stabilise HTT^22,23. The biological function of the HTT-HAP40 complex, however, remains elusive, and it is not clear if the function of this complex differs from apo HTT in vivo. It is also not yet understood whether HTT is constitutively bound to HAP40 or whether apo and HAP40-bound forms of HTT perform different functions in the cell.

Current structural information for the full-length HTT molecule provides little information on the N-terminal exon 1 region of the protein spanning residues 1–90, which contains the critical polyglutamine and polyproline tracts. This region of the protein is unresolved in the HTT-HAP40 cryo-EM model (PDBID: 6EZ8)²⁰ and therefore the influence of the tract expansion on HTT structure–function remains the subject of investigation. Although many studies have focussed on understanding the effects of polyglutamine expansion on exon 1 in isolation^24,25,26, there is still very little known about exon 1 in the context of the full-length HTT protein molecule, either in the apo form or in the complex with HAP40. The intrinsically disordered region (IDR), which spans residues 400–660, is subject to a range of post-translational modifications (PTMs), is postulated to be critical in mediating various protein interactions^21,27,28, and is also unresolved in the cryo-EM structure. Understanding the function of both wild-type (WT) and expanded forms of HTT is critical as many potential HD treatments currently under clinical investigation aim to lower HTT expression, using either allele-selective or non-selective approaches²⁹. Deeper biological insight into the determinants of cellular HTT protein levels, as well as normal and expanded HTT cellular function, would help direct which approaches should be prioritised for long-term patient therapies.

Here we report in vivo studies that show a strong correlation of HTT and HAP40 levels in different genetic backgrounds, providing evidence for the importance of the HTT-HAP40 complex in a physiological setting. Combining the power of multiple complementary structural techniques, we have built a model of the missing regions of our high-resolution (2.6 Å) model of HTT-HAP40, including the biologically critical exon 1 region of HTT and the N-terminal region of HAP40. We demonstrate the remarkable stability of the HTT-HAP40 complex, potentially explaining in vivo codependence of these two proteins and providing important insight for future drug developments in pursuit of treating HD.

Results

HAP40 levels are dependent on HTT

The HTT-associated protein HAP40 co-evolved with HTT¹⁵ and a HAP40 orthologue has been identified in many species, including invertebrates¹⁶. This suggests that HTT and HAP40 may have functions and/or physical interactions that are co-dependent. To investigate the in vivo relationship between HTT and HAP40, we analysed the levels of both proteins and their mRNA transcripts in liver tissue from mice with biallelic knock out of Htt compared to that of WT mice (Fig. 1a–d). Hap40 protein levels were significantly reduced in hepatocyte-specific Htt knock out mice compared to WT mice, and a statistically significant correlation was observed between Htt and Hap40 protein levels. Importantly, the mRNA transcript levels of Hap40 did not change appreciably in the knockout mice, and the Htt and Hap40 transcript levels were not correlated.

**Fig. 1: HTT lowering reduces HAP40 protein levels but not mRNA levels in vivo.**

Next, we assessed changes to HTT and HAP40 protein levels in hTERT-immortalised RPE1 cells following treatment with the HTT-lowering drug branaplam. Branaplam is a splicing modulator that lowers expression of the HTT protein by changing pre-mRNA transcript processing to retain a poison exon, leading to the inclusion of a premature STOP codon³⁰. HAP40 is encoded by the intron-less F8A gene, which lies entirely within intron 22 of the F8 gene, so it is not anticipated to be a target of branaplam¹⁹. Following treatment, HTT levels were significantly lowered in a dose-dependent manner, even at very low doses of branaplam. HAP40 levels, as measured by western blot analysis, were also significantly lower in an apparent dose-dependent manner and a statistically significant correlation was observed between HTT and HAP40 levels as a function of dose (Fig. 1e, f). Together, these data suggest that HAP40 protein stability and/or abundance is dependent on HTT protein levels.
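As a minimal sketch of how a correlation between two sets of protein levels can be quantified, the snippet below computes the Pearson coefficient r and its square (which equals R² for a simple linear fit). The band intensities are hypothetical placeholder values, not data from this study, and the function name is an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical normalised western blot intensities across increasing drug dose.
htt_levels = [1.00, 0.80, 0.55, 0.35, 0.20]
hap40_levels = [1.00, 0.85, 0.70, 0.50, 0.45]

r = pearson_r(htt_levels, hap40_levels)
print(f"r = {r:.3f}, R^2 = {r * r:.3f}")
```

Assessing statistical significance would additionally require a test on r (for example a t-test with n - 2 degrees of freedom), which is omitted here.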

High-resolution structure of HTT-HAP40 complex

HTT-HAP40 was expressed in insect cells and purified as previously described²¹. We determined the structure of HTT-HAP40 (PDBID: 6X9O) to a nominal resolution of 2.6 Å using cryo-EM (Fig. 2a, b and Supplementary Fig. 1), improving substantially upon the previously published 4 Å model (PDBID: 6EZ8)²⁰ and two recent models (PDBIDs: 7DXJ [3.6 Å] and 7DKK [4.1 Å])³¹. Like all previous models, flexible regions accounting for ~25% of the HTT-HAP40 complex, including exon 1 and the IDR, were not resolved in our high-resolution maps (Fig. 2c). However, our improved resolution permits more confident positioning of amino acid side chains in the structured regions thereby enabling more precise analysis of key structural features and surfaces.

**Fig. 2: HAP40 stabilises the structure of HTT via extensive interactions across all subdomains.**

The overall structure of the complex is similar to the previously published model (PDBID: 6EZ8) with a root-mean-square deviation of 1.9 Å across the models when superposed. However, key differences exist between the two models (Fig. 2d). Two additional C-terminal α-helices in the HTT C-HEAT domain spanning residues 3105–3137 are resolved in our model (all residue numbering based on HTT NCBI reference NP_002102.4 sequence), whereas the resolution of two N-terminal α-helices of HAP40 spanning residues 42–82 is lost. The unmodified native HAP40 C-terminus in our model is able to thread into the centre of the C-HEAT domain (Fig. 2e). This extended interaction of HAP40 with HTT may be responsible for a small shift we observe of the C-HEAT domain, which pivots ~5° relative to the previous model, reducing the interaction interface of HTT-HAP40 from ~5350 to ~4700 Å². One potential reason for this difference is that the C-terminus of HAP40 in our construct is unmodified, whereas Guo and colleagues²⁰ used a C-terminal Strep-tag in their expression construct, which is unresolved in their model. The differences observed for the HTT and HAP40 interface when comparing our high-resolution structural model (PDBID: 6X9O) and the previous mid-resolution model (PDBID: 6EZ8) indicate that the extensive interaction interface is able to accommodate some variation.
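For readers less familiar with the metric, a root-mean-square deviation between two already-superposed models can be sketched as follows. This assumes the two coordinate lists contain matched atoms in the same order and that superposition has been done beforehand; the function name and toy coordinates are illustrative only and do not reproduce the comparison reported here.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between matched, pre-superposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


# Hypothetical toy coordinates in Å: model_b is model_a shifted along x.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
model_b = [(x + 1.9, y, z) for x, y, z in model_a]

print(f"RMSD = {rmsd(model_a, model_b):.2f} Å")
```

In practice an optimal superposition (for example with the Kabsch algorithm) would be computed first; that step is deliberately left out to keep the sketch short.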

Our high-resolution model enables a comprehensive analysis of the surface-charge features of the HTT-HAP40 complex. The HTT–HAP40 interface is predominantly formed by extensive hydrophobic interactions between the two proteins (Fig. 2f). Previous analysis of this interface has also highlighted a charge-based interaction between the BRIDGE domain of HTT and the C-terminal region of the HAP40 TPR domain²⁰. Interestingly, the N-HEAT domain of HTT has a defined positively charged tract spanning almost 40 Å in length and 5–10 Å in width formed between two stacked HEAT repeats in the N-HEAT solenoid (Fig. 2f arrow). We also conducted sequence conservation analysis of both HTT and HAP40, which we mapped to the high-resolution structure of the complex (Supplementary Data 1 and 2). Interestingly this revealed surfaces on the HAP40-exposed face of the protein as highly conserved, with extended regions of strict conservation partially spanning the C-HEAT domain, BRIDGE and N-HEAT (Fig. 2g). However, the opposite face is less conserved, while the HTT–HAP40 interface is moderately conserved for both HTT and HAP40. The HTT-HAP40 model was analysed for ligand-able pockets, which were assessed for druggability according to factors such as the volume and depth of apparent pockets and their surface charge and hydrophobicity properties. One of the most promising pockets, which is predicted to be ligand-able, lies at the HTT-HAP40 interface and is lined by residues from the N-terminal region of the HAP40 TPR domain as well as the HTT N-HEAT domain (Fig. 2h and Supplementary Table 1). The high resolution of our HTT-HAP40 model provides a foundation for virtual screening of such pockets and other structure-based drug-discovery efforts towards the identification of HTT ligands, which may be developed into proteolysis-targeting chimeras³² or PET ligands³³ to function as, or monitor, HTT-lowering therapies²⁹, a critical area of focus for HD drug discovery.

Our 2.6 Å structure is of sufficient resolution to allow the identification of PTMs. However, no PTMs were observed for any of the resolved residues in the HTT-HAP40 complex. Native mass spectrometry (MS) analysis, on the other hand, revealed the high purity of our HTT-HAP40 samples, albeit that a small mass difference (compared to the theoretical mass) was observed, consistent with the presence of a few PTMs (Supplementary Fig. 2a, d). Further analysis of the HTT-HAP40 complex upon Caspase6 digestion revealed these PTMs to be primarily phosphorylations (at least two), which could be mapped to the regions spanning 586–2647 and 2647–3144 of the HTT sequence (Supplementary Fig. 2b–d). Based on the cumulative evidence from the MS data, these modifications reside within the two flexible portions of HTT not resolved in our cryo-EM maps. Although many studies have identified numerous different sites and possible PTMs of the HTT protein^21,27,28,34, these approaches have so far been qualitative and do not give us a good understanding of the key proteoforms the HD community is studying in either in vitro or in vivo models. Our quantitative top–down and middle–down MS approaches suggest many PTMs are in fact only present at very low abundance, at least in our samples expressed in insect cells.

We attempted to separately purify HTT and HAP40 for comparison to the complex. As reported by Guo and colleagues²⁰, we were also unable to express recombinant HAP40 alone, although it is readily expressed in the presence of HTT, a trend that parallels our in vivo observations and cell biology experiments. In the absence of HAP40, we and others have shown that recombinant HTT self-associates and is conformationally heterogeneous in vitro^21,22,34. Cryo-EM analysis of our apo HTT samples yielded a 12 Å resolution envelope (Fig. 3a, b and Supplementary Data 3). Despite the low resolution of this envelope, it is possible to identify the N-HEAT domain, with its central cavity, as well as the C-HEAT domain. The HTT portion of our HTT-HAP40 model can be fitted into this envelope. Comparison of this envelope with the previously reported apo HTT cryo-EM envelopes that were stabilised by cross-linking (EMD4937 and EMD10793)²² shows a less collapsed arrangement of the HTT subdomains. The difference in resolution between apo HTT and HTT-HAP40 samples observed by cryo-EM analysis emphasises the importance of HAP40 in stabilising the HTT protein and constraining the HEAT repeat subdomains into a more rigid conformation, further supporting the idea that this is a critical interaction for modulating HTT structure and function.

**Fig. 3: HTT HEAT domains are conformationally flexible in the absence of HAP40.**

Native top–down MS uses gas-phase activation to dissociate protein complexes enabling identification of complex composition and subunit stoichiometries. The most commonly used activation method using collisions with neutral gas molecules typically results in dissociation of a non-covalent complex into constituent subunits. Interestingly, our native top–down MS analysis of the intact HTT-HAP40 complex (Fig. 4a, b) primarily resulted in backbone fragmentation of HTT, eliminating both N- and C-terminal fragments (Fig. 4c–g). Remarkably, the vast majority of concomitantly formed high-mass dissociation products retained HAP40 (Fig. 4f), suggesting that the extensive hydrophobic interaction interface we observe in our high-resolution model keeps the HTT-HAP40 complex exceptionally stable. Similarly, gas-phase activation of Caspase6-treated HTT-HAP40 revealed that HAP40 remained intact and bound to HTT even at the highest activation energies, whereas the N- and C-terminal fragments of HTT produced upon digestion were readily dissociating from the complex (Supplementary Fig. 2c).

**Fig. 4: HTT and HAP40 form a very stable non-covalent complex that withstands dissociation.**

The recombinant samples of HTT-HAP40 were found to be highly monodisperse (Fig. 4b), displaying optimal biophysical properties (see also Supplementary Fig. 3a). Systematically screening the thermal stability of the HTT-HAP40 complex using a differential scanning fluorimetry (DSF) assay indicates that the complex is highly stable under a broad range of buffer, pH and salt conditions (Supplementary Fig. 3b, c). Destabilisation of the complex structure was only observed at low pH (Fig. 4h). Similarly, the interaction between HTT and HAP40 is retained upon mild proteolysis of the complex (Fig. 4i, j, all data in Supplementary Fig. 3d). For example, in an attempt to fragment HTT with Caspase6 treatment as previously described³⁵, we found that the HTT-HAP40 complex remains associated under native conditions. The same samples, when analysed under denaturing conditions used in western blots, showed apparent HTT cleavage products. These observations suggest caution when drawing conclusions about proteolytic fragments of HTT observed in western blot analyses of biological samples. Taken together, our studies reveal the high structural stability of the HTT-HAP40 complex with resistance to dissociation by native top–down MS, or proteolytic cleavage in solution. These data further support the high codependence of HTT and HAP40 protein levels in cell and animal models of HD and possibly HD patients.

Polyglutamine expansion modulates the dynamic sampling of conformational space by exon 1

Next, we sought to understand how the disease-causing polyglutamine expansions affect HTT structure. Our structural, biophysical and biochemical data presented so far focus on WT HTT (23 glutamines; Q23) and illustrate the importance of HAP40 in stabilising and orienting the HEAT repeat subdomains of HTT. However, 25% of the complex is not resolved in the cryo-EM maps, including many functionally important regions of the protein such as exon 1 (residues 1–90), which harbours the polyglutamine repeat region, and the IDR (residues 407–665). To further investigate the HTT protein structure in its entirety and the influence of polyglutamine expansion within exon 1, we repeated the DSF and proteolysis studies using HTT-HAP40 samples containing either a pathological HD HTT with 54 glutamines (Q54) or an HTT with a partially deleted exon 1 (Δexon 1; comprising residues 80–3144, missing N17, polyglutamine and proline-rich domain). We found that both the Q54 expansion and the removal of exon 1 had no detectable effects on the stability of the HTT-HAP40 complexes compared to the canonical Q23 complex (Supplementary Fig. 3).

To better describe the structure of exon 1 and the effects of the polyglutamine expansion on the HTT-HAP40 complex, we performed cross-linking mass spectrometry (XL-MS) experiments^36,37,38 using the IMAC-enrichable lysine cross-linker, PhoX³⁹. For cross-linking experiments, an optimised PhoX concentration was used, for which no cross-linker-induced protein aggregation was observed by mass photometry (Supplementary Fig. 4a). For Q23, Q54 and Δexon 1 isoforms of HTT-HAP40, we mapped approximately 120 cross-links for each sample (Supplementary Data File 7) which were highly reproducible (Supplementary Fig. 4b, c). When analysed against the 6X9O model of HTT-HAP40, the vast majority of cross-links map to regions unresolved in the cryo-EM maps (Fig. 5a), thereby providing valuable restraints for structural modelling of a more complete HTT-HAP40 complex. The mean distance of cross-links observed for resolved regions of the cryo-EM model was significantly below the 25 Å distance limit of PhoX in all three data sets (Q23: 7 cross-links—mean distance 13.7 Å; Q54: 11 cross-links—mean distance 14.8 Å; Δexon 1: 12 cross-links—mean distance 14.9 Å; Supplementary Data File 7). This is in line with the mass photometry data and indicates that there is a low probability of intermolecular cross-links between HTT molecules, e.g. from aggregation, being included in our data sets (Supplementary Fig. 4a).
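The distance check described above can be sketched in a few lines: for each cross-linked residue pair that is resolved in a model, measure the Cα–Cα distance and compare it with the cross-linker limit. The 25 Å limit comes from the text; the residue labels, coordinates and function name are hypothetical and chosen only to make the example self-contained.

```python
from math import dist
from statistics import mean

PHOX_LIMIT_A = 25.0  # distance limit for PhoX quoted in the text, in Å


def crosslink_report(ca_coords, crosslinks, limit=PHOX_LIMIT_A):
    """Summarise Cα–Cα distances for cross-linked residue pairs."""
    if not crosslinks:
        raise ValueError("no cross-links supplied")
    distances = [dist(ca_coords[a], ca_coords[b]) for a, b in crosslinks]
    return {
        "n": len(distances),
        "mean": mean(distances),
        "violations": [pair for pair, d in zip(crosslinks, distances) if d > limit],
    }


# Hypothetical Cα coordinates (Å) keyed by made-up lysine labels.
toy_coords = {
    "K100": (0.0, 0.0, 0.0),
    "K150": (3.0, 4.0, 0.0),
    "K900": (0.0, 0.0, 12.0),
    "K2000": (40.0, 0.0, 0.0),
}
toy_links = [("K100", "K150"), ("K100", "K900"), ("K100", "K2000")]

report = crosslink_report(toy_coords, toy_links)
print(report)
```

A cross-link flagged as a violation in a rigid model is not necessarily wrong; it may instead report on flexibility or on an intermolecular contact, which is why the aggregation control matters.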

**Fig. 5: Exon 1 is highly flexible and conformationally dynamic in the context of the full-length protein.**

We obtained very similar cross-link data for the three different HTT-HAP40 constructs (Supplementary Fig. 5b), which span all subdomains of HTT and also HAP40, indicating good cross-linking efficiency (Supplementary Fig. 5a, c). Of particular note are the large number of exon 1 PhoX cross-links in the HTT-HAP40 Q23 and Q54 samples mediated via lysine-6 or lysine-9 within the N-terminal 17 residues (N17 region) of exon 1 (Fig. 5b). N17 is reported to play key roles for the HTT protein including modulating cellular localisation, aggregation and toxicity^40,41,42 and is proposed to interact with distal parts of HTT⁴³.

For both samples (Q23 and Q54), N17 is found to contact several regions of the N-HEAT domain as well as the cryo-EM unresolved N-terminal region of HAP40, via lysine-32 and lysine-40. Interestingly, N17 of Q54 showed additional cross-links to the more distant C-HEAT domain (Fig. 5b and Supplementary Fig. 4b). Finally, the largest uninterrupted stretch of the HTT-HAP40 protein that is unresolved in the cryo-EM maps is the IDR. We identified cross-links which indicate that this region makes intra-domain contacts as well as contacts with the neighbouring N-HEAT domain.

Size-exclusion chromatography multi-angle light scattering (SEC-MALS) analysis of this same series of samples shows no significant difference in mass but does indicate a small shift in the peak for the elution volume of the HTT-HAP40 Δexon 1 complex compared to Q23 and Q54 complex samples (Fig. 6a). Together with the XL-MS data, this suggests that there are subtle structural differences between the Q23, Q54 and Δexon 1 HTT-HAP40 complexes. To further interpret the cross-linking data in the context of the three-dimensional (3D) structure of the HTT-HAP40 complex, we performed small-angle X-ray scattering (SAXS) analysis of our samples to assess any changes to their global structures (Supplementary Data 4–6). We have previously reported SAXS data for HTT-HAP40 Q23²¹. This revealed that the particle size was significantly larger than the cryo-EM model, which likely accounts for the ~25% of the protein not resolved in cryo-EM maps and therefore not modelled in the structure. Similar analysis of the HTT-HAP40 Q54 and HTT-HAP40 Δexon 1 and comparison with our previous Q23 data shows that polyglutamine expansion or deletion of exon 1 has only very modest effects on the SAXS profiles (Fig. 6b–d). HTT-HAP40 Q54 is slightly larger than the HTT-HAP40 Q23, whereas HTT-HAP40 Δexon 1 samples are slightly smaller, as might be expected, but overall the SAXS-determined parameters for the three samples are very similar (Fig. 6e) as we would expect for samples with highly similar structural cores³¹. In line with that, the SAXS-calculated particle envelopes for the three samples are also very similar in size and shape (Supplementary Fig. 6a).
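To illustrate why a model lacking flexible regions appears smaller than the particle seen by SAXS, the sketch below computes an unweighted radius of gyration from coordinates. Treating every point as equal mass, as well as the toy coordinates and function name, are simplifying assumptions; this is not how experimental SAXS parameters are derived from scattering curves.

```python
from math import sqrt


def radius_of_gyration(coords):
    """Unweighted radius of gyration: RMS distance of points from their centroid."""
    if not coords:
        raise ValueError("no coordinates supplied")
    n = len(coords)
    centroid = tuple(sum(c[k] for c in coords) / n for k in range(3))
    mean_sq = sum(
        sum((c[k] - centroid[k]) ** 2 for k in range(3)) for c in coords
    ) / n
    return sqrt(mean_sq)


# Hypothetical toy points (Å): a compact core, then the same core plus an extended tail.
core = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]
core_with_tail = core + [(10.0, 0.0, 0.0), (14.0, 0.0, 0.0)]

print(f"core Rg = {radius_of_gyration(core):.2f} Å")
print(f"core + tail Rg = {radius_of_gyration(core_with_tail):.2f} Å")
```

Adding the extended points increases the computed value, which is the qualitative effect expected when unresolved flexible regions are included.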

**Fig. 6: Polyglutamine expansion or deletion of exon 1 has modest effects on the full-length HTT-HAP40 SAXS profile.**

Next, we modelled the complete structures of HTT-HAP40, including flexible and disordered regions, integrating our cryo-EM, SAXS and XL-MS data, similar to other studies using integrated approaches to study disordered protein structures^37,38. Coarse-grain modelling molecular dynamics (MD) simulations were performed and an ensemble of models that best fit both the cross-linking and SAXS data for HTT-HAP40 was calculated for all three variants of the HTT-HAP40 complex (Supplementary Fig. 6b, c and Supplementary Data 8–11). This modelling approach assumed that the residues with known coordinates in the cryo-EM model form a quasi-rigid complex, whereas the residues with missing coordinates are flexible. As expected from our cross-linking results, the suggested conformations adopted by exon 1 in the ensemble model of Q54 HTT-HAP40 complex are skewed compared to the Q23 ensemble with exon 1 interacting with many more surfaces of the Q54 HTT-HAP40 complex (Fig. 7a). Mapping our PhoX exon 1 cross-linked residues for each sample to a representative model from each ensemble reveals how exon 1 Q23 cross-links are largely constrained to the N-HEAT domain, whereas exon 1 Q54 cross-links are also found on the C-HEAT domain (Supplementary Fig. 4b). Exon 1 of our HTT-HAP40 Q54 ensemble appears to explore a larger volume of conformational space, and this seems to have a knock-on effect on the conformational space occupied by the IDR (Fig. 7b).

**Fig. 7: Insights from integrated modelling of full-length HTT-HAP40 combining cryo-EM, SAXS and cross-linking mass spectrometry data.**

Modelling of our HTT-HAP40 structure indicates that the exon 1 region of the Q23 HTT is long enough to make cross-links with the C-HEAT domain, but we do not observe such cross-links in our PhoX data sets. This suggests that the additional cross-links observed for the polyglutamine expanded form of HTT-HAP40 may not be driven solely by the length of the exon 1 region. For all ensembles, the IDR is differentially constrained and occluded from adopting certain conformations depending on the conformational space occupied by exon 1, suggesting that polyglutamine and exon 1-mediated structural changes propagate to the IDR. Despite exon 1 and the IDR being separated by the first HEAT repeat domain spanning aa. 98–406, our modelling suggests that they are proximal in the full-length HTT-HAP40 complex, indicating that structural changes to one of these regions have the potential for a knock-on effect on the other. For the HTT-HAP40 Q54 model ensemble, which suggests exon 1 adopts the most diverse conformations, the IDR is the most constrained, occupying a more finite space. However, for the HTT-HAP40 Δexon 1 model ensemble, the IDR is not occluded and so adopts a much wider range of conformations.

We used unconstrained MD simulations to analyse the contact frequency of experimentally identified cross-links in exon 1 and the IDR of HTT-HAP40 Q23 and Q54 (Supplementary Tables 2–5). All cross-links experimentally identified for the exon 1 region of HTT-HAP40 Q23 are also observed for the Q54 form of the protein. However, exon 1–C-HEAT cross-links that are uniquely identified in our HTT-HAP40 Q54 experiments have similar frequencies in our unconstrained simulations of HTT-HAP40 Q23 and Q54, supporting our conclusion that the experimental identification of exon 1–C-HEAT cross-links for HTT-HAP40 Q54 is significant (Supplementary Fig. 7a). Similarly, analysis of experimentally identified IDR cross-links in these simulations shows that the unique exon 1–IDR cross-link in the HTT-HAP40 Q54 data set has a similar frequency in both simulations, again indicating that experimental identification of this cross-link shows a significant difference in the conformational space occupied by exon 1 upon polyglutamine expansion (Supplementary Fig. 7b). These findings also support our use of experimental constraints, including SAXS and cross-linking data, to generate more realistic and statistically representative ensembles of the possible conformations of all flexible regions of HTT-HAP40 to help identify the subtle structural differences caused by polyglutamine expansion.
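A contact-frequency analysis of this kind can be sketched as the fraction of simulation frames in which two residues lie within a chosen cut-off. The 25 Å cut-off used below is an assumption borrowed from the PhoX distance limit; the text does not state which cut-off was applied to the simulations, and the labels, frames and function name are hypothetical.

```python
from math import dist

CONTACT_CUTOFF_A = 25.0  # assumption: reuse the PhoX distance limit as the contact cut-off


def contact_frequency(frames, res_a, res_b, cutoff=CONTACT_CUTOFF_A):
    """Fraction of frames in which res_a and res_b are within `cutoff` Å."""
    if not frames:
        raise ValueError("no frames supplied")
    hits = sum(1 for frame in frames if dist(frame[res_a], frame[res_b]) <= cutoff)
    return hits / len(frames)


# Hypothetical trajectory: each frame maps a made-up residue label to coordinates in Å.
toy_frames = [
    {"K6": (0.0, 0.0, 0.0), "K3000": (10.0, 0.0, 0.0)},
    {"K6": (0.0, 0.0, 0.0), "K3000": (30.0, 0.0, 0.0)},
    {"K6": (0.0, 0.0, 0.0), "K3000": (24.0, 0.0, 0.0)},
    {"K6": (0.0, 0.0, 0.0), "K3000": (60.0, 0.0, 0.0)},
]

print(contact_frequency(toy_frames, "K6", "K3000"))
```

Comparing such frequencies between two simulated variants gives a baseline against which experimentally unique cross-links can be judged.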

Together, our data suggest that, while polyglutamine expansion does not affect the core HEAT repeat structure, it does affect the conformational space occupied by not only the exon 1 region but also the IDR.

Discussion

We present new insights for the HTT-HAP40 structure, highlighting the close relationship between HTT and HAP40 as well as unveiling the effect of the polyglutamine expansion, thereby contributing to a richer understanding of HTT and its relationship with HAP40.

HTT is reported to interact with hundreds of different proteins¹⁴ but very few have been validated and the only interaction partner resolved by structural methods is HAP40. HAP40 is thought to have coevolved with HTT¹⁵ and orthologues have been identified in species back to flies¹⁶. The codependence of HTT and HAP40 is highlighted by our in vivo analysis of HTT and HAP40 levels in mice, which shows a strong correlation of the two proteins. We also demonstrate that this relationship is independent of the RNA transcript levels of HTT and HAP40. Additionally, we showed in human cells that pharmacological lowering of HTT protein levels with the drug branaplam also reduced HAP40, and a statistically significant correlation (R² = 0.85) was observed between HTT and HAP40 levels as a function of dose. Overall, this suggests that HAP40 protein stability and/or abundance is dependent on HTT protein levels.

It remains to be seen whether HTT and HAP40 are in fact constitutively bound to each other, or if they may exist independently or in complex with other binding partners. HAP40 plays an important role in stabilising HTT conformation, as we have shown with our biophysical and structural comparison of apo and HAP40-bound HTT samples, but the molecular mechanisms of how HAP40 functions in endosome transport^17,18 or in modulating HTT toxicity in HD models¹⁶ remain to be determined. Interestingly, despite the exceptional stability of the HTT–HAP40 interaction, complex integrity was not maintained in our DSF assay at low pH, conditions similar to those of the local environment of the endosome.

How polyglutamine expansion of HTT contributes to changes in protein structure–function remains a critical and unanswered question in HD research. Previously, we have observed that changes in polyglutamine tract length seem to have minimal effects on the biophysical properties of HTT and HTT-HAP40 samples²¹ and studies have shown that these regions are dispensable for HTT function in mice⁴⁴. Similarly, in this study, we find no significant differences between our Q23, Q54 and Δexon 1 HTT-HAP40 samples when assessing monodispersity by mass photometry and native MS, thermal stability in a systematic buffer screen by DSF, or stability in proteolysis experiments. The structural differences of Q23, Q54 and Δexon 1 HTT-HAP40 samples are not resolved within the high-resolution cryo-EM maps we calculated. Our experiments using lower-resolution structural methods such as SAXS and mass spectrometry, which do consider the complete protein molecule, also show modest differences between the samples. One way we might reconcile this observation with what we know about HD pathology and HTT biology in physiological conditions is that our experimental systems do not capture subtle, low-abundance or slowly occurring differences between the samples, which could be important in HD progression that occurs very slowly, over decades of a patient’s lifetime. Alternatively, it may be that models of HD pathogenesis which posit large changes in HTT’s globular structure caused by polyglutamine expansion²³ are incorrect.

Notwithstanding the above caveats, our XL-MS studies provide some of the first insight into the structure of the exon 1 portion of the protein in the context of the full-length, HAP40-bound form of HTT. In both Q23 and Q54 samples, exon 1 appears to be highly dynamic and able to adopt multiple conformations. Our data suggest differences in the conformational ensembles of the unexpanded and expanded forms of exon 1 in the context of the full-length HTT protein (Supplementary Movie 1). Specifically, the expanded Q54 forms of exon 1 appear to sample different conformational space than unexpanded Q23. This is not just due to the additional length of this form of exon 1, conferring a higher degree of flexibility and extension to different regions of the protein but perhaps some biophysical consequence of a longer polyglutamine tract. This is the opposite of what has been reported for HTT exon 1 protein in isolation, where polyglutamine expansion compacts the exon 1 structure^45,46,47,48. Our data suggest that in the context of the full-length HAP40-bound HTT protein, exon 1 is not compact, but flexible and conformationally dynamic while retaining moderate structural organisation. Our modelling studies interestingly suggest that the change in exon 1 conformational sampling upon polyglutamine expansion may have consequent effects on the relative conformations and orientations of the IDR, a novel insight to HTT structure. Both exon 1 and the IDR have been highlighted as functionally important regions of HTT, as sites of dynamic PTMs and protease recognition concentrate in these regions. Our results suggest that structural changes in exon 1 induced by polyglutamine expansion could influence the accessibility of the IDR to partner proteins that modify residues within the IDR, despite the relatively rigid intervening regions between them. 
The flexibility we observe for exon 1 in both Q23 WT and Q54 mutant HTT-HAP40 supports the hypothesis that polyglutamine tracts can function as sensors, sampling and responding to their local environment⁴⁹.

Overall, our findings show that HTT is stabilised by interaction with HAP40 through an extensive hydrophobic interface with its distinct HEAT repeat subdomains, creating a highly stable complex. Expanded and unexpanded exon 1 remains highly dynamic in the context of this complex, sampling a vast range of conformational space and interacting with different regions of both HTT and HAP40. We present novel insight into the structural differences of WT and mutant HTT, which suggests that the conformational constraints of WT and mutant exon 1 are different and that models of HD pathogenesis relying on the hypothesis that polyglutamine expansion drives large-scale changes in HTT conformation may need to be re-examined.

Methods

In vivo HTT-HAP40 protein and RNA transcript levels

To generate samples with genetic reduction of HTT levels in the liver, mice in which the first exon of Htt is flanked by LoxP sites⁵⁰ were crossed with mice expressing CRE recombinase from the Alb promoter (JAX:003574). Fresh frozen livers from WT and LKO mice were collected at 5–6 months of age for subsequent protein and RNA analyses. Protein lysates were prepared for western blotting using non-denaturing lysis buffer (20 mM Tris HCl pH 8, 127 mM NaCl, 1% NP-40, 2 mM EDTA), with 50 µg of protein separated using 3–8% tris-acetate gels (Invitrogen EA0378) and transferred using an iBlot2 transfer system (Invitrogen IB21001). Probing with antibodies against HTT (Abcam EPR5526; 1:1000) and HAP40 (Novus NBP2-54731; 1:500) was performed overnight at 4 °C with gentle shaking, followed by incubation with near infrared secondary antibodies (Licor 926–68073; 1:10,000). Signal was normalised to total protein in the lane (Licor 926–11010). Imaging was performed using an Odyssey imager and signal quantitated using ImageStudio (Licor). RNA samples were prepared using RNeasy Lipid Mini Kit (Qiagen) with on-column DNase digestion. cDNA was prepared using Superscript III (Invitrogen 18080-051) with random hexamer primers. Multiplexed quantitative PCR was performed on a Step One thermocycler (Applied Biosystems) using TaqMan assays and 200 ng cDNA template per reaction. FAM-labelled Htt and Hap40 probes were run in separate reactions, each with VIC-labelled ActB probes as internal control (ThermoFisher Mm01213820_m1, Mm03016217_s1, and Mm02619580_g1, respectively). Each biological replicate represents the average of three technical replicates, with relative quantification performed using delta-delta Ct calculation⁵¹. Mouse strain: C57Bl/6J. Sex: western blot all females; qPCR mixed male/female. Age: western blot 5 months ± 1 week; qPCR 5.5 months ± 2 weeks. All procedures were reviewed and approved by the animal care and use committee at Western Washington University.
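The delta-delta Ct relative quantification mentioned above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: it assumes roughly 100% amplification efficiency (a fold change of 2 per cycle), and the Ct values and function names are hypothetical.

```python
def mean(values):
    """Average of technical replicates."""
    return sum(values) / len(values)


def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by 2^(-ddCt), assuming ~100% PCR efficiency."""
    d_ct_sample = ct_target_sample - ct_ref_sample      # normalise to reference gene
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_sample - d_ct_control                   # compare with control group
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values, three technical replicates each (illustration only)
wt_htt, wt_actb = [24.1, 24.0, 24.2], [18.0, 18.1, 17.9]
lko_htt, lko_actb = [26.1, 26.0, 26.2], [18.0, 18.1, 17.9]

fold = fold_change_ddct(mean(lko_htt), mean(lko_actb),
                        mean(wt_htt), mean(wt_actb))
print(f"Htt in LKO relative to WT: {fold:.2f}")
```

With these made-up numbers the target transcript appears two cycles later in the knockout sample after normalisation, which corresponds to a quarter of the control level.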

Cell culture and branaplam treatment experiments

The hTERT-immortalised retinal pigment epithelial cell line, RPE1 (ATCC—CRL-4000) was cultured in Dulbecco’s Modified Eagle Medium/F12 1:1 media supplemented with 10% fetal bovine serum and 0.01 mg/mL hygromycin B at 37 °C in a 5% CO₂ incubator. Cell lines were not authenticated after purchase from supplier. Cells were negative for mycoplasma. At approximately 40% cell confluency, cells were treated with branaplam (Selleckchem—(6E)-3-(1H-Pyrazol-4-yl)-6-[3-(2,2,6,6-tetramethylpiperidin-4-yl)oxy-1H-pyridazin-6-ylidene]cyclohexa-2,4-dien-1-one) dissolved in dimethyl sulfoxide and then diluted with culture media to final concentrations of 5, 10, 25, 50, and 100 nM for 72 h. Cells were prepared for western blotting using RIPA buffer (50 mM Tris-HCl pH 8.0, 150 mM NaCl, 1% NP-40, 0.25% sodium deoxycholate, 1 mM EDTA, protease and phosphatase inhibitors (Roche)), with 60 μg protein separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) on a 4–20% gradient gel (Bio-Rad) and transferred to a polyvinylidene difluoride membrane (Millipore). Membranes were blocked in TBS-T (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1% Tween-20) containing 5% skim milk powder for 1 h then cut into three sections to be probed with primary antibodies against HTT (Millipore MAB2166; 1:2500), HAP40 (LS‑C167891, LSBio; 1:500), and vinculin (EPR8185, Abcam; 1:2500) in the same buffer overnight at 4 °C. Membranes were washed four times with TBS-T and then probed with horseradish peroxidase secondary antibodies (Abcam) for 30 min at room temperature (RT). After washing as above, membranes were incubated with ECL (Millipore) and imaged with a DNR MicroChemi chemiluminescence detector. Signal was quantified with ImageJ using the “Analyze Gel” option. HTT and HAP40 signals were normalised to the vinculin signal from the corresponding lane. Experiments were completed with three independent replicates.

Protein expression constructs

HTT Q23, HTT Q54 and HAP40 constructs used in this study have been previously described²¹ and are available through Addgene with accession numbers 111726, 111727 and 124060, respectively. HTT Δexon 1 clones spanning HTT aa. 80–3144 were also cloned into the pBMDEL vector. A PCR product encoding HTT from residues P76 to C3140 was amplified from cDNA (Kazusa clone FHC15881) using primers FWD (ttaagaaggagatatactatgCCGGCTGTGGCTGAGGAGC) and REV (gattggaagtagaggttctctgcGCAGGTGGTGACCTTGTGG). PCR products were inserted using the In-Fusion cloning kit (Clontech) into the pBMDEL that had been linearised with BfuAI. The HTT-coding sequences of expression constructs were confirmed by DNA sequencing. The sequences were also confirmed by Addgene where these reagents have been deposited. This clone is available through Addgene with accession number 162274.

Protein expression and purification

HTT and HTT-HAP40 protein samples were expressed in insect cells and purified using a similar protocol as previously described²¹. Briefly, Sf9 cells were infected with P3 recombinant baculovirus and grown until viability dropped to 80–85%, normally after ∼72 h post-infection. For HTT-HAP40 complex production, a 1:1 ratio of HTT:HAP40 P3 recombinant baculovirus was used for infection. Cells were harvested, lysed with freeze–thaw cycles and then clarified by centrifugation. HTT protein samples were purified by FLAG-affinity chromatography. FLAG eluted samples were bound to Heparin FF cartridge (GE) and washed with 10 CV 20 mM HEPES pH 7.4, 50 mM KCl, 1 mM TCEP, 2.5% glycerol and eluted with a gradient from 50 mM KCl buffer to 1 M KCl buffer over 10 CV. All samples were purified with a final gel filtration step, using a Superose6 10/300 column in 20 mM HEPES pH 7.4, 300 mM NaCl, 1 mM TCEP, 2.5% (v/v) glycerol. HTT-HAP40 samples were further purified with an additional Ni-affinity chromatography step prior to gel filtration. Fractions of the peaks corresponding to the HTT monomer or HTT-HAP40 heterodimer were pooled, concentrated, aliquoted and flash frozen prior to use in downstream experiments. Sample purity was assessed by SDS-PAGE. The sample identities were confirmed by native mass spectrometry (Fig. 5).

SDS-PAGE and western blot analysis

SDS-PAGE and western blot analysis were performed according to standard protocols. Primary antibodies used in western blots are anti-HTT EPR5526 (Abcam), anti-HTT D7F7 (Cell Signaling Technologies) and anti-Flag #F4799 (Sigma). Secondary antibodies used in western blots are goat-anti-rabbit IgG-IR800 (LI-COR) and donkey anti-mouse IgG-IR680 (LI-COR). Membranes were visualised on an Odyssey® CLx Imaging System (LI-COR).

DSF analysis of HTT samples

HTT samples were diluted in different buffer conditions and incubated at RT for 15 min before the addition of Sypro Orange (Invitrogen) to a final concentration of 5×. The final protein concentration was 0.15 mg/mL. Measurements were performed using a Light Cycler 480 II instrument from Roche Applied Science over the course of 20–95 °C. Temperature scan curves were fitted to a Boltzmann sigmoid function, and the transition temperature values were obtained from the midpoint of the transition.
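The Boltzmann sigmoid fit used to extract the transition temperature can be illustrated with a small sketch. It is written under several assumptions that are not in the text: the plateaus are fixed to the minimum and maximum of the fitted window, the fit is a coarse grid search rather than a proper non-linear least-squares routine, and the data are synthetic. The function names are hypothetical.

```python
import math


def boltzmann(temp, f_min, f_max, tm, slope):
    """Boltzmann sigmoid; the signal is halfway between plateaus at temp == tm."""
    return f_min + (f_max - f_min) / (1.0 + math.exp((tm - temp) / slope))


def fit_tm(temps, signal, slopes=(0.5, 1.0, 1.5, 2.0, 3.0, 4.0), step=0.1):
    """Grid search for the Tm and slope that minimise the squared error."""
    f_min, f_max = min(signal), max(signal)   # assumption: plateaus from the data
    t_lo, t_hi = min(temps), max(temps)
    n_steps = int(round((t_hi - t_lo) / step))
    best = None
    for i in range(n_steps + 1):
        tm = t_lo + step * i
        for slope in slopes:
            sse = sum((boltzmann(t, f_min, f_max, tm, slope) - y) ** 2
                      for t, y in zip(temps, signal))
            if best is None or sse < best[0]:
                best = (sse, tm, slope)
    return best[1], best[2]


# Synthetic melting curve over the 20-95 °C scan range (illustration only)
temps = [float(t) for t in range(20, 96)]
signal = [boltzmann(t, 100.0, 1000.0, 52.0, 2.0) for t in temps]

tm, slope = fit_tm(temps, signal)
print(f"fitted Tm = {tm:.1f} °C, slope = {slope}")
```

Real DSF traces usually decline after the unfolding peak, so in practice only the transition region would be passed to such a fit.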

Caspase6 proteolysis of HTT protein samples

HTT protein samples were mixed with recombinant Caspase6 (Enzo Life Sciences) in a ratio of 100 U caspase6 to 1 pmol of HTT in 20 mM HEPES pH 7.4, 150 mM NaCl and 1 mM TCEP with a final protein concentration of ~1 μM. The reaction and control mixture without caspase6 were incubated at RT for 16 h and then analysed by SDS-PAGE, blue native PAGE and analytical gel filtration using a Superose6 10/300 column in 20 mM HEPES pH 7.4, 150 mM NaCl and 1 mM TCEP.

Cross-linking of HTT-HAP40 samples with PhoX

For cross-linking experiments, HTT-HAP40 samples (HTTQ23-HAP40, HTTQ54-HAP40, HTT Δexon 1-HAP40) were diluted to a protein concentration of 1 mg/mL using cross-linking buffer (20 mM HEPES pH 7.4, 300 mM NaCl, 2.5% glycerol, 1 mM TCEP). HTT-HAP40 samples were treated with an optimised concentration of PhoX cross-linker to avoid protein aggregation (Supplementary Fig. 4a). After incubation with PhoX (0.5 mM) for 30 min at RT, the reaction was quenched for an additional 30 min at RT by the addition of Tris HCl (1 M, pH 7.5) to a final concentration of 50 mM. Protein digestion was performed in 100 mM Tris-HCl, pH 8.5, 1% SDC, 5 mM TCEP and 30 mM CAA, with the addition of Lys-C and Trypsin proteases (1:25 and 1:100 ratio (w/w)) overnight at 37 °C. The reaction was stopped by addition of TFA to a final concentration of 0.1% or until pH ~2. Next, peptides were desalted using an Oasis HLB plate, before IMAC enrichment of cross-linked peptides as previously described³⁹. Four technical replicates were completed for each form of HTT-HAP40.

LC-MS analysis of cross-linked HTT-HAP40 samples

For LC-MS analysis, the samples were re-suspended in 2% formic acid (FA) and analysed using an UltiMate™ 3000 RSLCnano System (Thermo Fisher Scientific) coupled on-line to either a Q Exactive HF-X (Thermo Fisher Scientific) or an Orbitrap Exploris 480 (Thermo Fisher Scientific). First, peptides were trapped for 5 min in solvent A (0.1% FA in water), using a 100-µm inner diameter 2-cm trap column (packed in-house with ReproSil-Pur C18-AQ, 3 µm) prior to separation on an analytical column (50 cm in length, 75 µm inner diameter; packed in-house with Poroshell 120 EC-C18, 2.7 µm). Peptides were eluted following a 45 or 55 min gradient from 9–35% solvent B (80% ACN, 0.1% FA), or 9–41% solvent B, respectively. On the Q Exactive HF-X, full-scan MS spectra from 375 to 1600 Da were acquired in the Orbitrap at a resolution of 60,000 with the automatic gain control (AGC) target set to 3 × 10⁶ and maximum injection time (IT) of 120 ms. For measurements on the Orbitrap Exploris 480, full-scan MS spectra from 375 to 2200 m/z were acquired in the Orbitrap at a resolution of 60,000 with the AGC target set to 2 × 10⁶ and maximum IT of 25 ms. Only peptides with charge states 3–8 were fragmented, and dynamic exclusion properties were set to n = 1, for a duration of 10 s (Q Exactive HF-X) and 15 s (Orbitrap Exploris 480). Fragmentation was performed using a stepped HCD collision energy mode (27, 30, 33% Q Exactive HF-X; 20, 28, 36% Orbitrap Exploris 480) in the ion trap, and fragment spectra were acquired in the Orbitrap at a resolution of 30,000 after accumulating a target value of 1 × 10⁵ with an isolation window of 1.4 m/z and maximum IT of 54 ms (Q Exactive HF-X) and 55 ms (Orbitrap Exploris 480).

Data analysis of HTT-HAP40 cross-links

Raw files for cross-linked HTT-HAP40 samples were analysed using the XlinkX node⁵² in the Proteome Discoverer (PD) software suite 2.5 (Thermo Fisher Scientific), with signal to noise threshold set to 1.4. Trypsin was set as the digestion enzyme (max two allowed missed cleavages), the precursor tolerance set to 10 ppm and the maximum false discovery rate set to 1%. Additionally, carbamidomethylation (cysteine) was set as a fixed modification and acetylation (protein N-terminus) and oxidation (methionine) were set as dynamic modifications. Cross-links obtained for respective HTTQ-HAP40 samples were filtered (only cross-links identified with an XlinkX score >40 were considered) and further validated using our recently deposited structure of HTTQ23-HAP40 (PDBID: 6X9O) (EMD-22106). Contact maps and circos plots were generated in R (http://www.R-project.org/) using the circlize⁵³ and XLmaps⁵⁴ packages.

Mass photometry

Mass photometry analysis was performed on a Refeyn OneMP instrument (Oxford, UK), which was calibrated using a native marker protein mixture (NativeMark Unstained Protein Standard, Thermo Scientific). The marker contained proteins in the wide mass range up to 1.2 MDa. Four proteins were used to generate a standard calibration curve, with following rounded average masses: 66, 146, 480, and 1048 kDa. The experiments were conducted using glass coverslips, extensively cleaned through several rounds of washing with Milli-Q water and isopropanol. A set of 4–6 gaskets made of clear silicone was placed onto the thoroughly dried glass surface to create wells for sample load. Typically, 1 µL of HTT samples was applied to 19 µL of phosphate-buffered saline (PBS) resulting in a final concentration of ∼5 nM. Movies consisting of 6000 final frames were recorded using the AcquireMP software at a 100 Hz framerate. Particle landing events were automatically detected amounting to ∼3000 per acquisition. The data were analysed using the DiscoverMP software. Average masses of HTT proteins and HTT-HAP40 complexes were determined by taking the value at the mode of the normal distribution fitted into the histograms of particle masses. Finally, probability density function was calculated and drawn over the histogram to produce the final mass profile. Measurement and analysis of mass photometry data were done for the following samples: HTT-Q23-HAP40, HTT-Q54-HAP40, and HTT-∆exon 1-HAP40.

Intact mass and middle–down MS sample preparation

Sample preparation: Samples containing HTT-HAP40 complexes were digested using human Caspase6 (Enzo Life Sciences, Farmingdale, USA) by adding 200 U of the enzyme to the 20 µg of the protein. The mixture was stored in PBS for 24 h. Following the digestion, samples were diluted to the final concentration of 500 ng/µL with 2% FA. Approximately 2 µg of the sample were injected for a single intact mass LC-MS or middle–down LC-MS/MS experiment.

LC-MS(/MS) for intact and middle–down MS

Produced peptides of HTT were separated using a Vanquish Flex UHPLC (Thermo Fisher Scientific, Bremen, Germany) coupled on-line to an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Fisher Scientific, San Jose, USA) via reversed-phase analytical column (MAbPac, 1 mm × 150 mm, Thermo Fisher Scientific). The column compartment and preheater were kept at 80 °C during the measurements to ensure efficient unfolding and separation of the analysed peptides. Analytes were separated and measured for 22 min at a flow rate of 150 µL/min. Elution was conducted using A (Milli-Q H2O/0.1% CH₂O₂) and B (C₂H₃N/0.1% CH₂O₂) mobile phases. In the first minute, B was increased from 10 to 30%, followed by 30 to 57% B gradient over 14 min, 1 min 57 to 95% B ramp-up, 95% B for 1 min and equilibration of the column at 10% B for 4 min.

During data acquisition, Lumos Fusion instrument was set to Intact Protein and Low Pressure mode. MS1 resolution of 7500 (determined at 200 m/z and equivalent to 16 ms transient signal length) was used, which enables optimal detection of protein ions above 30 kDa in mass. Mass range of 500–3000 m/z, the AGC target of 250%, and a max IT of 50 ms were used for recording of MS1 scans. Two µscans were averaged in the time domain and recorded for the 7500 resolution scans during the LC-MS experiment and 5 µscans when tandem MS (MS/MS) was performed. MS/MS scans were recorded at a resolution setting of 120,000 (determined at 200 m/z and equivalent to 16 ms transient signal length), 10,000% AGC target, 250 ms max IT, and 5 µscans, for the single most abundant peak detected in the preceding MS1 scan. The selected ions were mass-isolated by a quadrupole in a 4 m/z window and accumulated to an estimate of 5e6 ions prior to the gas-phase activation. Two separate LC-MS/MS runs were recorded per sample with either higher-energy collisional dissociation (HCD) or electron transfer dissociation (ETD) used for fragmentation. For ETD, the following parameters were used: ETD reaction time—16 ms, max IT of the ETD reagent—200 ms, and the AGC target of the ETD reagent—1e6. For HCD, 30 V activation energy was used. MS/MS scans were acquired with the minimum intensity of the precursor set to 5e4 and the range of 350–5000 m/z using quadrupole in the high mass isolation mode.

Data analysis of intact and middle–down MS

LC-MS data were deconvoluted with the ReSpect algorithm in BioPharma Finder 3.2 (Thermo Fisher Scientific, San Jose, USA). ReSpect parameters: precursor m/z tolerance—0.2 Th, target mass—50 kDa, relative abundance threshold—0%, mass range—3–100 kDa; tolerance—30 ppm, charge range—3–100. MS1 and MS2 masses were recalibrated using an external calibrant mixture of intact proteins (Pierce™ Intact Protein Standard Mix, Thermo Scientific) measured before and after each HTT sample. Iterative sequence adjustments of putative HTT peptides were made until the exact precursor and fragment masses matched, to determine a final set of HTT peptides generated by the Caspase6 enzyme. HCD fragments of HTT peptides were used solely to confirm identified sequences. Phosphorylation was matched as an 80 Da variable modification mass, added to the mass of the identified HTT peptides. Visualisation was done in R extended with the ggplot2 package.

Native (top–down) MS sample preparation

Samples were stored at −80 °C in buffer containing 20 mM HEPES pH 7.4, 300 mM NaCl, 2.5% (v/v) glycerol, 1 mM TCEP. Approximately 40 µg of HTT-Q23, HTT-Q54, HTT-∆exon 1, and their respective complexes with HAP40 were buffer-exchanged into 150 mM aqueous ammonium acetate (pH = 7.5) using P-6 Bio-Spin gel filtration columns (Bio-Rad, Veenendaal, the Netherlands). The resulting protein concentration was estimated to be ~2–5 µM before native MS analysis. For the recording of denaturing MS, samples were spiked with FA to a final concentration of 2% immediately before the MS measurement.

Native (top–down) data acquisition

HTT-containing samples were directly injected into a Q Exactive Ultra-High Mass Range (UHMR) Orbitrap mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) using in-house pulled and gold-coated borosilicate capillaries. The following mass spectrometer parameters were used: capillary voltage—1.5 kV, positive ion mode, source temperature—250 °C, S-lens radio frequency (RF) level—200, IT—mostly 200 ms, noise level parameter—3.64. In-source trapping with a desolvation voltage of −100 V was used to desolvate the proteinaceous ions efficiently. No additional acceleration voltage was used in the back-end of the instrument. The AGC was switched to fixed. Resolutions of 4375 and 8750 (both at m/z = 200 Th) were used, representing 16 and 32 ms transients, respectively. Ion guide optics and the voltage gradient throughout the instrument were manually adjusted for optimal transmission and detection of HTT and HTT-HAP40 ions. The HCD cell was filled with nitrogen, and the trapping gas pressure was set to a setting value of 3 or 4, corresponding to ~2e-10–4e-10 mBar for the ultra-high vacuum readout of the instrument. The instrument was calibrated in the m/z range of interest using a concentrated aqueous cesium iodide (CsI) solution. Acquisition of the spectra was usually performed by averaging 100–200 µscans in the time domain. Peaks corresponding to the protein complex of interest were isolated with a 20 Th window for single charge state isolation and a 2000 Th window for charge-state ensemble isolation. In both cases, isolated HTT-HAP40 ions were investigated for dissociation using elevated HCD voltages, with direct eV setting varied in the range 1–500 V. For detection of high-m/z dissociation product ions, mass analyser detection mode and transmission RF settings were set to “high m/z”. For detection of low-m/z fragment ions, all relevant instrument settings were set to “low m/z”, and the instrument resolution was increased to 140,000 (at m/z = 200 Th).

Data analysis for native (top–down) MS

Raw native MS and high-m/z native top–down MS data were processed with UniDec⁵⁵ to obtain zero-charged mass spectra. Native top–down MS data recorded with high resolution (140,000) were deconvoluted using the Xtract algorithm within FreeStyle software (1.7SP1; Thermo Fisher Scientific). The resulting zero-charge fragments were matched to the theoretical fragments produced for HTT and HAP40 using in-house scripts with 5 ppm mass tolerance. Theoretical fragment intensities were derived from the corresponding fragments obtained upon deconvolution of raw native mass spectrum. Final visualisation was performed in R extended with ggplot2 library.

Cryo‐EM sample preparation and data acquisition

HTT was diluted to 0.4 mg/mL in 20 mM HEPES pH 7.5, 300 mM NaCl, 1 mM TCEP and adsorbed to glow-discharged holey carbon-coated grids (Quantifoil 300 mesh, Au R1.2/1.3) for 10 s. Grids were then blotted with filter paper for 2 s at 100% humidity at 4 °C and frozen in liquid ethane using a Vitrobot Mark IV (Thermo Fisher Scientific).

HTT-HAP40 was diluted to 0.2 mg/mL in 25 mM HEPES pH 7.4, 300 mM NaCl, 0.025% w/v CHAPS and 1 mM DTT and adsorbed onto gently glow-discharged suspended monolayer graphene grids (Graphenea) for 60 s. Grids were then blotted with filter paper for 1 s at 100% humidity, 4 °C and frozen in liquid ethane using a Vitrobot Mark IV (Thermo Fisher Scientific).

Data were collected in counting mode on a Titan Krios G3 (FEI) operating at 300 kV with a BioQuantum imaging filter (Gatan) and K2 direct detection camera (Gatan) at ×165,000 magnification, pixel size of 0.822 Å. Movies were collected over 32 fractions at a dose rate of 6.0 e⁻/Å²/s, exposure time of 8 s, resulting in a total dose of 48.0 e⁻/Å².
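As a quick consistency check on the exposure settings quoted above, the total and per-fraction dose follow directly from the dose rate, exposure time and number of fractions. The sketch below only restates that arithmetic; the variable names are illustrative.

```python
# Values taken from the data collection description above
dose_rate = 6.0      # electrons per Å^2 per second
exposure_s = 8.0     # seconds per movie
fractions = 32       # frames per movie

total_dose = dose_rate * exposure_s          # electrons per Å^2 per movie
dose_per_fraction = total_dose / fractions   # electrons per Å^2 per frame

print(f"total dose: {total_dose} e-/Å^2")
print(f"dose per fraction: {dose_per_fraction} e-/Å^2")
```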

Cryo-EM data processing

For apo HTT, patched motion correction and dose weighting were performed using MotionCor implemented in RELION 3.0⁵⁶. Contrast transfer function parameters were estimated using CTFFIND4⁵⁷. Particles were picked in SIMPLE 3.0⁵⁸ and processed in RELION 3.0. Six hundred and sixty-nine movies were collected in total and 108,883 particles extracted. Particles were subjected to one round of reference-free two-dimensional (2D) classification against 100 classes (k = 100) using a soft circular mask of 180 Å in diameter in RELION. A subset of 25,424 particles were recovered at this stage and subjected to 3D auto-refinement in RELION using a 40 Å low-pass-filtered map of HTT-HAP40 (EMDB 3984) as initial reference. This generated a ~12 Å map based on gold-standard Fourier shell correlation curves using the 0.143 criterion as calculated within RELION.

For HTT-HAP40 (Supplementary Fig. 1), 15,003 movies were processed in real time using the SIMPLE 3.0 pipeline, using SIMPLE-unblur for patched motion correction, SIMPLE-CTFFIND for patched CTF estimation and SIMPLE-picker for particle picking. After initial 2D classification in SIMPLE 3.0 using the cluster2D_stream module (k = 500), cleaned particles were imported into RELION and subjected to reference-free 2D classification (k = 200) using a 180 Å soft circular mask. An ab initio map, generated from a selected subset of particles (372,226), was subsequently lowpass filtered to 40 Å and used as reference for coarse-sampled (7.5°) 3D classification (k = 4) with a 180 Å soft spherical mask against the same particle subset. Particles (102,729) belonging to the most defined, highest resolution class were selected for 3D auto-refinement against its corresponding map, lowpass filtered to 40 Å, using a soft mask covering the protein which generated a 3.5 Å volume. This map was lowpass filtered to 40 Å and used as initial reference for a multi-step 3D classification (k = 5, 15 iterations at 7.5° followed by 5 iterations at 3.75°), with 180 Å soft spherical mask, against the full cleaned data set of 2,240,373 particles. Selected particles (647,468) from the highest resolution class were subjected to masked 3D auto-refinement against its reference map, lowpass filtered to 15 Å, yielding a 3.1 Å volume. CTF refinement using per-particle defocus plus beamtilt estimation further improved map quality to 3.0 Å. Bayesian particle polishing followed by an additional round of CTF refinement with per-particle defocus plus beamtilt estimation on a larger box size (448 × 448) generated a final volume with global resolution of 2.6 Å as assessed by Gold standard Fourier shell correlations using the 0.143 criterion within RELION. Map local resolution estimation was calculated within Relion (Supplementary Fig. 1). 
Additional rounds of 3D classification using either global/local searches or classification only without alignment did not improve map quality.

Model building and refinement

The model for HTT-HAP40 (Table 1) was generated by rigid body fitting the 4 Å HTT-HAP40 model²⁰ (PDBID: 6EZ8) into our globally-sharpened, local resolution filtered 2.6 Å map followed by multiple rounds of manual real-space refinement using Coot v. 0.95⁵⁹ and automated real-space refinement in PHENIX v. 1.18.2–38746⁶⁰ using secondary structure, rotamer and Ramachandran restraints. HTT-HAP40 model was validated using MolProbity⁶¹ within PHENIX. Figures were prepared using UCSF ChimeraX v.1.1⁶² and PyMOL v.2.4.0 (The PyMOL Molecular Graphics System, v.2.0; Schrödinger).

Table 1 Cryo-EM data collection, refinement and validation statistics.

SAXS data collection and analysis

SAXS experiments were performed at beamline 12-ID-B of the Advanced Photon Source (APS) at Argonne National Laboratory. The energy of the X-ray beam was 13.3 keV (wavelength λ = 0.9322 Å), and two set-ups (small- and wide-angle X-ray scattering) were used simultaneously to cover a scattering q range of 0.006 < q < 2.6 Å⁻¹, where q = (4π/λ)sinθ, and 2θ is the scattering angle. For HTT-HAP40 Q54, thirty two-dimensional (2D) images were recorded for buffer or sample solutions using a flow cell, with an exposure time of 0.8 s to reduce radiation damage and obtain good statistics. The flow cell is made of a cylindrical quartz capillary 1.5 mm in diameter and 10 µm wall thickness. Concentration-series measurements for this sample were carried out at 300 K with concentrations of 0.5, 1.0, and 2.0 mg/mL in 20 mM HEPES, pH 7.5, 300 mM NaCl, 2.5% (v/v) glycerol and 1 mM TCEP. No radiation damage was observed as confirmed by the absence of systematic signal changes in sequentially collected X-ray scattering images. The 2D images were corrected for solid angle of each pixel, and reduced to one-dimensional (1D) scattering profiles using the Matlab software package at the beamlines. The 1D SAXS profiles were grouped by sample and averaged.
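The relationship between beam energy, wavelength and momentum transfer quoted above can be checked with a short sketch. The conversion constant hc ≈ 12.398 keV·Å is standard; the function names are hypothetical and the scattering angle in the example is arbitrary.

```python
import math

HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV·Å


def wavelength_angstrom(energy_kev):
    """Photon wavelength in Å for a given beam energy in keV."""
    return HC_KEV_ANGSTROM / energy_kev


def momentum_transfer(two_theta_deg, wavelength):
    """q = (4π/λ) sin θ, where 2θ is the full scattering angle (degrees)."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi / wavelength * math.sin(theta)


lam = wavelength_angstrom(13.3)
print(f"wavelength at 13.3 keV: {lam:.4f} Å")
print(f"q at 2θ = 1°: {momentum_transfer(1.0, lam):.4f} Å^-1")
```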

For HTT-HAP40 Δexon 1, data were collected using an in-line FPLC AKTA micro set-up with a Superose6 Increase 10/300 GL size exclusion column in 20 mM HEPES, pH 7.5, 300 mM NaCl, 2.5% (v/v) glycerol and 1 mM TCEP. A 150 μL sample loop was used and the stock sample concentration was 5 mg/mL. The sample passed through the FPLC column and was fed to the flow cell for SAXS measurements. The SAXS data were collected every 2 s and the X-ray exposure time was set to 0.75 s. Only the SAXS data collected above the half maximum of the elution peak, about 50–100 frames, were averaged and used for further analysis. Background data were collected before and after the peak (100 frames each); the frames collected before the peak gave the better background and were used for the background subtraction.

SAXS data were analysed with the software package ATSAS 2.8⁶³. The experimental radius of gyration, R_g, was calculated from data at low q values using the Guinier approximation. The pair distance distribution function, P(r), the maximum dimension of the protein, D_max, and R_g in real space were calculated with the indirect Fourier transform using the program GNOM⁶⁴. Estimation of the molecular weight of samples was obtained by both SAXMOW^65,66 and by using volume of correlation, Vc⁶⁷. The theoretical scattering intensity of the atomic structure model was calculated using FoXS⁶⁸. Ab initio shape reconstructions (molecular envelopes) were performed using both bead modelling with DAMMIF⁶⁹ and calculating 3D particle electron densities directly from SAXS data with DENSS⁷⁰ (Supplementary Fig. 6a).
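The Guinier approximation used to estimate R_g states that, at low q, ln I(q) ≈ ln I(0) − (R_g²/3)q², so R_g follows from the slope of a straight-line fit of ln I against q². The sketch below shows this with a plain least-squares fit on synthetic data. It is an illustration of the principle rather than the ATSAS implementation, and the function name and test values are hypothetical.

```python
import math


def guinier_fit(q, intensity):
    """Fit ln I = ln I0 - (Rg^2 / 3) q^2 by least squares; return (Rg, I0)."""
    x = [qi * qi for qi in q]
    y = [math.log(i) for i in intensity]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rg = math.sqrt(-3.0 * slope)   # slope is negative for a well-behaved sample
    return rg, math.exp(intercept)


# Synthetic low-q data for a particle with Rg = 50 Å (illustration only)
true_rg, true_i0 = 50.0, 1000.0
q_values = [0.006 + 0.001 * i for i in range(20)]   # q·Rg stays below ~1.3
i_values = [true_i0 * math.exp(-(qi * true_rg) ** 2 / 3.0) for qi in q_values]

rg, i0 = guinier_fit(q_values, i_values)
print(f"Rg = {rg:.2f} Å, I(0) = {i0:.1f}")
```

The approximation is only valid in the low-q region, commonly taken as q·R_g below about 1.3 for globular particles, which is why the synthetic q range is truncated.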

Coarse-grained MD simulations

We used a Gō-like coarse-grained model of HTT-HAP40 for structural modelling of the complex as described previously²¹. In this model, amino acid residues in the proteins are represented as single beads located at their C_α positions and interacting via appropriate bonding, bending, torsion-angle, and non-bonding potential. A Gō-like model⁷¹ was employed to maintain the structured, globular domains as quasi-rigid in the simulation. The model was built based on the experimental EM structure of the complex (PDB 6X9O). The EM structure of the complex is missing ~26% of the residues. We assume that the residues with known coordinates form a quasi-rigid part of the complex while the residues with missing coordinates are flexible.

For the flexible regions, we adopt a simple model in which adjacent amino acid beads are joined together into a polymer chain by means of virtual bond and angle interactions with a quadratic potential

$$V_{b}=K_{b}(b-b_{0})^{2};\quad V_{\alpha}=K_{\alpha}(\alpha-\alpha_{0})^{2}$$

with the constants K_b = 50 kcal/mol and K_α = 1.75 kcal/mol and the equilibrium values b₀ = 3.8 Å and α₀ = 112° for bonds and angles, respectively. The excluded volume between non-bonded beads was treated with a purely repulsive potential

$${V}_{R}={\varepsilon }_{R}{({\sigma }_{R}/{r}_{ij})}^{12}$$

where r_ij is the inter-bead distance, ${\sigma }_{R}$ = 4 Å, and ${\varepsilon }_{R}$ = 2.0 kcal/mol.
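The three energy terms defined above for the flexible regions can be written out as a short sketch. The constants are those quoted in the text. The unit conventions are assumptions made here for illustration (K_b taken per Å², K_α taken per rad²), since the text gives both only in kcal/mol, and the function names are hypothetical.

```python
import math

# Constants as quoted in the text; the per-unit conventions are assumptions
K_B = 50.0                      # kcal/mol, assumed per Å^2
B0 = 3.8                        # Å, equilibrium virtual bond length
K_ALPHA = 1.75                  # kcal/mol, assumed per rad^2
ALPHA0 = math.radians(112.0)    # equilibrium virtual bond angle
EPS_R = 2.0                     # kcal/mol
SIGMA_R = 4.0                   # Å


def bond_energy(b):
    """Quadratic virtual-bond term V_b for a bead-bead distance b (Å)."""
    return K_B * (b - B0) ** 2


def angle_energy(alpha_deg):
    """Quadratic virtual-angle term V_alpha for an angle given in degrees."""
    return K_ALPHA * (math.radians(alpha_deg) - ALPHA0) ** 2


def repulsion_energy(r_ij):
    """Purely repulsive excluded-volume term V_R for non-bonded beads (Å)."""
    return EPS_R * (SIGMA_R / r_ij) ** 12


for r in (3.0, 4.0, 6.0, 8.0):
    print(f"r = {r:.1f} Å -> V_R = {repulsion_energy(r):.4f} kcal/mol")
```

The twelfth-power dependence makes the repulsion negligible beyond a few ångström past σ_R, so beads in the flexible regions interact only when they come close to overlapping.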

We used experimentally observed cross-links to improve the sampling of the flexible regions of the model. Because of the structural flexibility, not all observed cross-links are compatible with one another: no single conformation of the complex has a geometry in which all cross-links can form. Therefore, only a subset of compatible cross-links can be included in MD simulations as distance restraints. We implemented a procedure in which the MD simulation is run with a set of randomly selected restraints for a period of time, after which a new set of randomly selected restraints is generated. This was repeated ~2000 times along the MD trajectory. A harmonic potential, normally used for distance restraints, is not suitable for this procedure, since a newly selected distance restraint could be incompatible with the current conformation (the corresponding C_α–C_α distance is large and produces very large forces), which would cause the MD simulation to terminate. This problem can be avoided by using a sigmoidal function as the potential for a distance restraint, since at large distances the sigmoidal potential produces forces close to zero. This method has previously been used to successfully model other dynamic complexes⁷².

To account for the experimentally observed cross-links, we introduced in the force field a distance restraint term given by the following potential:

$$V_{\mathrm{XL}}(t) = \sum_{p=1}^{N_{\mathrm{c}}} \sum_{k=1}^{N_{\mathrm{XL}}} \delta_{\xi_p(t)}^{k}\, V_l^k;\quad V_l^k = \frac{K_{\mathrm{XL}}}{1 + e^{-\beta (l_k(t) - l_0)}}$$

The inner sum runs over all cross-links and the outer sum over the selected active cross-links: N_XL is the number of cross-links; N_c is the number of selected active cross-links; l_k is the C_α–C_α distance between the residues involved in the kth cross-link; l₀ = 25 Å is the upper bound for PhoX cross-links; β = 0.5 is the slope of the sigmoidal function; K_XL = 10 kcal/mol is the force constant; δ_i^k is the Kronecker delta; and ξ_p(t) is a random integer selected from the interval [1, N_XL]. We chose to keep a small number of randomly selected restraints active, N_c = 5, whose indices ξ_p(t) are updated every τ_XL = 0.5 ns during the MD simulation.
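The behaviour that motivates the sigmoidal form can be checked numerically. The sketch below is illustrative only and is not the force-field implementation used here: it evaluates V_l and the magnitude of its derivative, compares the latter with a harmonic restraint at a large distance, and draws N_c active indices at random. Drawing indices independently (so repeats are possible) is an assumption based on a literal reading of the double sum; the helper names and the example distances are hypothetical.

```python
import math
import random

K_XL = 10.0    # force constant, kcal/mol (from the text)
BETA = 0.5     # slope of the sigmoid (from the text)
L0 = 25.0      # upper distance bound for PhoX cross-links, Å (from the text)
N_ACTIVE = 5   # N_c, number of active restraints (from the text)

def sigmoid_restraint(l):
    """Restraint energy V_l for a C-alpha to C-alpha distance l (Å)."""
    return K_XL / (1.0 + math.exp(-BETA * (l - L0)))

def sigmoid_force(l):
    """Magnitude of dV_l/dl; it peaks at l = L0 and decays on both sides."""
    e = math.exp(-BETA * (l - L0))
    return K_XL * BETA * e / (1.0 + e) ** 2

def harmonic_force(l):
    """Magnitude of the force from K_XL * (l - L0)^2, for comparison only."""
    return 2.0 * K_XL * abs(l - L0)

def select_active(n_crosslinks, rng, n_active=N_ACTIVE):
    """Draw the indices xi_p (0-based here) of the active cross-links.

    Assumption: indices are drawn independently, so repeats are possible,
    which is how the double sum with the Kronecker delta reads literally.
    """
    return [rng.randrange(n_crosslinks) for _ in range(n_active)]

def restraint_energy(distances, active):
    """V_XL for one frame: sum of V_l over the currently active cross-links."""
    return sum(sigmoid_restraint(distances[k]) for k in active)

rng = random.Random(0)  # seeded so the draw is reproducible
distances = [12.0, 18.0, 25.0, 40.0, 200.0, 22.0, 31.0, 9.0]  # hypothetical, Å
active = select_active(len(distances), rng)
energy = restraint_energy(distances, active)

# A restraint that is far from satisfied barely pulls under the sigmoid,
# whereas the harmonic form would apply a very large force.
far_sigmoid = sigmoid_force(200.0)
far_harmonic = harmonic_force(200.0)
```

In a simulation loop, `select_active` would be called again at every τ_XL interval, so that over the trajectory each cross-link is active for some windows even though no single conformation satisfies all of them.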

Fitting structural ensemble to SAXS data

The goodness-of-fit of an ensemble of structural models of the complex to the SAXS data was evaluated by comparing the ensemble average profile, I_avrg(q), with the experimental profile, I_exp(q):

$$\chi_{\mathrm{SAXS}} = \left[\frac{1}{N_q} \sum_{i=1}^{N_q} \left[\frac{I_{\mathrm{exp}}(q_i) - \alpha \cdot I_{\mathrm{avrg}}(q_i)}{\sigma(q_i)}\right]^{2}\right]^{1/2}$$

where,

$$\alpha = \frac{\sum_{i=1}^{N_q} I_{\mathrm{exp}}(q_i) \cdot I_{\mathrm{avrg}}(q_i)}{\sum_{i=1}^{N_q} I_{\mathrm{exp}}(q_i) \cdot I_{\mathrm{exp}}(q_i)},$$

and,

$$I_{\mathrm{avrg}}(q_i) = \sum_{k=1}^{N_{\mathrm{ens}}} I_{\mathrm{calc}}^{k}(q_i) \cdot w_k$$

Here $I_{\mathrm{calc}}^{k}(q)$ is the scattering intensity predicted for the kth conformation, N_q is the number of experimental points, σ(q) is the experimental error, N_ens is the number of conformations in the ensemble and w_k is the weight of the kth conformation.
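For concreteness, the three expressions above can be transcribed into a few lines of code. This is a sketch only: the profiles, errors and weights are hypothetical toy values, and the weights are supplied by hand rather than optimised. The scale factor α is implemented exactly as written in the text; note that the least-squares scale factor more commonly seen has I_avrg² rather than I_exp² in the denominator, and the two coincide when the scaled profiles match.

```python
import math

def ensemble_average(profiles, weights):
    """I_avrg(q_i) = sum_k w_k * I_calc^k(q_i)."""
    n_q = len(profiles[0])
    return [sum(w * p[i] for w, p in zip(weights, profiles)) for i in range(n_q)]

def scale_factor(i_exp, i_avrg):
    """alpha, transcribed from the expression given in the text."""
    numerator = sum(e * a for e, a in zip(i_exp, i_avrg))
    denominator = sum(e * e for e in i_exp)
    return numerator / denominator

def chi_saxs(i_exp, sigma, profiles, weights):
    """Root-mean-square error-weighted residual between data and ensemble."""
    i_avrg = ensemble_average(profiles, weights)
    alpha = scale_factor(i_exp, i_avrg)
    total = sum(((e - alpha * a) / s) ** 2 for e, a, s in zip(i_exp, i_avrg, sigma))
    return math.sqrt(total / len(i_exp))

# Hypothetical two-member ensemble sampled at four q points.
profile_a = [10.0, 8.0, 5.0, 2.0]
profile_b = [12.0, 9.0, 4.0, 1.0]
true_weights = [0.25, 0.75]

# Build "experimental" data from the ensemble itself, with unit errors.
i_exp = ensemble_average([profile_a, profile_b], true_weights)
sigma = [1.0] * len(i_exp)

chi_true = chi_saxs(i_exp, sigma, [profile_a, profile_b], true_weights)
chi_single = chi_saxs(i_exp, sigma, [profile_a, profile_b], [1.0, 0.0])
```

The correct weights reproduce the data, whereas assigning all weight to one conformer leaves a residual; a weight-optimisation method searches for the w_k that minimise this quantity.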

The optimal weights for each ensemble member were obtained with the SES method⁷³ by minimising the discrepancy between the ensemble average profile and the experimental scattering data. Theoretical scattering profiles for each conformation in the ensemble were calculated in the q range 0 < q < 0.30 Å⁻¹ using FoXS⁶⁸.

Validation of effects of polyglutamine expansion on ensemble structure with MD simulation

Unconstrained MD trajectories, each 800 ns long, were calculated for both the HTT-HAP40 Q23 and HTT-HAP40 Q54 complexes, saving frames every 100 ps. This yielded two ensembles of structures, each consisting of 8000 models, which were used to analyse distances between lysine residues for which experimental cross-links were observed. Focussing on the cross-links observed for exon 1 and IDR residues (Supplementary Tables 3–6), we assessed the contact frequency of each cross-link in the ensemble.
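One plausible way to compute a contact frequency from such an ensemble is sketched below. It is an assumption-laden illustration rather than the analysis script used here: a cross-link is counted as "in contact" in a frame when the C_α–C_α distance is at most 25 Å, reusing the PhoX upper bound quoted for the restraints, and the residue labels and coordinates are hypothetical. The first lines simply confirm the frame count implied by the trajectory length and saving interval.

```python
import math

# Frames per trajectory implied by the text: 800 ns sampled every 100 ps.
TRAJECTORY_PS = 800 * 1000
SAVE_INTERVAL_PS = 100
N_FRAMES = TRAJECTORY_PS // SAVE_INTERVAL_PS

def contact_frequency(frames, res_i, res_j, cutoff=25.0):
    """Fraction of frames in which two C-alpha beads lie within cutoff (Å).

    frames is a list of dicts mapping a residue label to (x, y, z).
    The 25 Å default reuses the PhoX upper bound quoted for the restraints;
    the threshold used for the published analysis is an assumption here.
    """
    hits = sum(1 for frame in frames if math.dist(frame[res_i], frame[res_j]) <= cutoff)
    return hits / len(frames)

# Toy ensemble (hypothetical labels and coordinates): separations of 10-40 Å.
toy_frames = [
    {"K_a": (0.0, 0.0, 0.0), "K_b": (d, 0.0, 0.0)}
    for d in (10.0, 20.0, 30.0, 40.0)
]
frequency = contact_frequency(toy_frames, "K_a", "K_b")
```

Comparing this fraction for the same lysine pair between the Q23 and Q54 ensembles gives a per-cross-link measure of how often each simulated complex visits a conformation compatible with that cross-link.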

Size-exclusion chromatography multi-angle light scattering

The absolute molar masses and mass distributions of purified protein samples of HTT-HAP40 Q23, HTT-HAP40 Q54 and HTT-HAP40 Δexon 1 at 1 mg/mL were determined using SEC-MALS. Samples were injected through a Superose 6 10/300 GL column (GE Healthcare) equilibrated in 20 mM HEPES, pH 7.5, 300 mM NaCl, 2.5% (v/v) glycerol and 1 mM TCEP, followed in-line by a Dawn Heleos-II light scattering detector (Wyatt Technologies) and a 2414 refractive index detector (Waters). Molecular mass calculations were performed using ASTRA 6.1.1.17 (Wyatt Technologies) assuming a dn/dc value of 0.185 mL/g.

In silico analysis of the HTT-HAP40 protein complex structure

HTT-HAP40 models were analysed using PyMOL⁷⁴ and APBS⁷⁵. For conservation analysis, HTT and HAP40 orthologues were extracted from Ensembl, parsed to remove low-quality or partial sequences and then aligned using Clustal⁷⁶. Multiple sequence alignments were then analysed using ConSurf⁷⁷ and conservation scores mapped to the HTT-HAP40 (PDBID: 6X9O) structure in PyMOL. Ligandable pocket analysis was completed as previously reported⁷⁸. Briefly, HTT-HAP40 model PDB files were loaded in ICM (Molsoft, San Diego). Proteins were protonated, optimal positions of the added polar hydrogens were generated, and the correct orientations of side-chain amide groups for glutamine and asparagine and the most favourable histidine isomers were identified. The PocketFinder algorithm implemented in ICM, which uses a transformation of the Lennard–Jones potential to identify ligand-binding envelopes regardless of the presence of bound ligands, was then applied⁷⁹. Residues with side-chain heavy atoms within 2.8 Å of the molecular envelope were identified as lining the pocket.

Statistics and reproducibility

Experiments were performed at least 2–3 times in distinct technical replicates to confirm reproducibility. Sample sizes and statistical analyses used in this study are described above in “Methods” and are also detailed in figure legends.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

All Supplementary Data files can be accessed via Zenodo⁸⁰. Raw and preprocessed mass spectrometry data used in this study are deposited in Figshare with identifier 839⁸¹ and in PRIDE under accession PXD028313. Also available through these links are CSM tables that show the scores and CSMs identified in our XLMS data sets, as well as the mass error. Cryo-EM maps can be downloaded at EMDB 22106 and model coordinates at PDBID 6X9O. All expression constructs are available through Addgene.

References

Donaldson, J., Powell, S., Rickards, N., Holmans, P. & Jones, L. What is the pathogenic CAG expansion length in Huntington’s disease? J. Huntingtons Dis. 10, 175–202 (2021).
Saudou, F. & Humbert, S. The biology of huntingtin. Neuron 89, 910–926 (2016).
Harding, R. J. & Tong, Y. Proteostasis in Huntington’s disease: disease mechanisms and therapeutic opportunities. Acta Pharmacol. Sin. 39, 754–769 (2018).
Koyuncu, S., Fatima, A., Gutierrez-Garcia, R. & Vilchez, D. Proteostasis of huntingtin in health and disease. Int. J. Mol. Sci. 18, 1568 (2017).
Gao, R. et al. Mutant huntingtin impairs PNKP and ATXN3, disrupting DNA repair and transcription. eLife 8, e42988 (2019).
Poplawski, G. H. D. et al. Injured adult neurons regress to an embryonic transcriptional growth state. Nature 581, 77–82 (2020).
Carmo, C., Naia, L., Lopes, C. & Rego, A. C. Mitochondrial dysfunction in Huntington’s disease. Adv. Exp. Med. Biol. 1049, 59–83 (2018).
Vitet, H., Brandt, V. & Saudou, F. Traffic signaling: new functions of huntingtin and axonal transport in neurological disease. Curr. Opin. Neurobiol. 63, 122–130 (2020).
Smith-Dijak, A. I., Sepers, M. D. & Raymond, L. A. Alterations in synaptic function and plasticity in Huntington disease. J. Neurochem. 150, 346–365 (2019).
McColgan, P. & Tabrizi, S. J. Huntington’s disease: a clinical review. Eur. J. Neurol. 25, 24–34 (2018).
Maiuri, T. et al. Huntingtin is a scaffolding protein in the ATM oxidative DNA damage response complex. Hum. Mol. Genet. 26, 395–406 (2017).
Rui, Y.-N. et al. Huntingtin functions as a scaffold for selective macroautophagy. Nat. Cell Biol. 17, 262–275 (2015).
Shirasaki, D. I. et al. Network organization of the huntingtin proteomic interactome in mammalian brain. Neuron 75, 41–57 (2012).
Wanker, E. E., Ast, A., Schindler, F., Trepte, P. & Schnoegl, S. The pathobiology of perturbed mutant huntingtin protein-protein interactions in Huntington’s disease. J. Neurochem. 151, 507–519 (2019).
Seefelder, M. et al. The evolution of the huntingtin-associated protein 40 (HAP40) in conjunction with huntingtin. BMC Evol. Biol. 20, 162 (2020).
Xu, S. et al. HAP40 is a conserved central regulator of Huntingtin and a specific modulator of mutant Huntingtin toxicity. Preprint at bioRxiv https://doi.org/10.1101/2020.05.27.119552 (2020).
Pal, A., Severin, F., Lommer, B., Shevchenko, A. & Zerial, M. Huntingtin-HAP40 complex is a novel Rab5 effector that regulates early endosome motility and is up-regulated in Huntington’s disease. J. Cell Biol. 172, 605–618 (2006).
Pal, A., Severin, F., Höpfner, S. & Zerial, M. Regulation of endosome dynamics by Rab5 and Huntingtin-HAP40 effector complex in physiological versus pathological conditions. Methods Enzymol. 438, 239–257 (2008).
Peters, M. F. & Ross, C. A. Isolation of a 40-kDa Huntingtin-associated protein. J. Biol. Chem. 276, 3188–3194 (2001).
Guo, Q. et al. The cryo-electron microscopy structure of huntingtin. Nature https://doi.org/10.1038/nature25502 (2018).
Harding, R. J. et al. Design and characterization of mutant and wild-type huntingtin proteins produced from a toolkit of scalable eukaryotic expression systems. J. Biol. Chem. https://doi.org/10.1074/jbc.RA118.007204 (2019).
Jung, T. et al. The polyglutamine expansion at the N-terminal of huntingtin protein modulates the dynamic configuration and phosphorylation of the C-terminal HEAT domain. Structure 28, 1035.e8–1050.e8 (2020).
Vijayvargia, R. et al. Huntingtin’s spherical solenoid structure enables polyglutamine tract-dependent modulation of its structure and function. eLife 5, e11184 (2016).
Boatz, J. C. et al. Protofilament structure and supramolecular polymorphism of aggregated mutant huntingtin exon 1. J. Mol. Biol. 432, 4722–4744 (2020).
Falk, A. S. et al. Structural model of the proline-rich domain of huntingtin exon-1 fibrils. Biophys. J. 119, 2019–2028 (2020).
Matlahov, I. & van der Wel, P. C. Conformational studies of pathogenic expanded polyglutamine protein deposits from Huntington’s disease. Exp. Biol. Med. 244, 1584–1595 (2019).
Ratovitski, T. et al. Post-translational modifications (PTMs), identified on endogenous huntingtin, cluster within proteolytic domains between HEAT repeats. J. Proteome Res. https://doi.org/10.1021/acs.jproteome.6b00991 (2017).
Schilling, B. et al. Huntingtin phosphorylation sites mapped by mass spectrometry modulation of cleavage and toxicity. J. Biol. Chem. 281, 23686–23697 (2006).
Tabrizi, S. J., Ghosh, R. & Leavitt, B. R. Huntingtin lowering strategies for disease modification in Huntington’s disease. Neuron 101, 801–819 (2019).
McAllister, G. Oral HTT lowering therapies. In EHDN Plenary Meeting (CHDI Foundation, 2020).
Huang, B. et al. Pathological polyQ expansion does not alter the conformation of the Huntingtin-HAP40 complex. Structure https://doi.org/10.1016/j.str.2021.04.003 (2021).
Deshaies, R. J. Protein degradation: prime time for PROTACs. Nat. Chem. Biol. 11, 634–635 (2015).
Liu, L. et al. Imaging mutant huntingtin aggregates: development of a potential PET ligand. J. Med. Chem. 63, 8608–8633 (2020).
Huang, B. et al. Scalable production in human cells and biochemical characterization of full-length normal and mutant huntingtin. PLoS ONE 10, e0121055 (2015).
Graham, R. K. et al. Cleavage at the caspase-6 site is required for neuronal dysfunction and degeneration due to mutant huntingtin. Cell 125, 1179–1191 (2006).
Liu, F., Lössl, P., Scheltema, R., Viner, R. & Heck, A. J. R. Optimized fragmentation schemes and data analysis strategies for proteome-wide cross-link identification. Nat. Commun. 8, 15473 (2017).
Mintseris, J. & Gygi, S. P. High-density chemical cross-linking for modeling protein interactions. Proc. Natl Acad. Sci. USA 117, 93–102 (2020).
Greber, B. J. et al. Architecture of the large subunit of the mammalian mitochondrial ribosome. Nature 505, 515–519 (2014).
Steigenberger, B., Pieters, R. J., Heck, A. J. R. & Scheltema, R. A. PhoX: an IMAC-enrichable cross-linking reagent. ACS Cent. Sci. 5, 1514–1522 (2019).
Gu, X. et al. N17 modifies mutant Huntingtin nuclear pathogenesis and severity of disease in HD BAC transgenic mice. Neuron 85, 726–741 (2015).
Jayaraman, M. et al. Kinetically competing huntingtin aggregation pathways control amyloid polymorphism and properties. Biochemistry 51, 2706–2716 (2012).
Maiuri, T., Woloshansky, T., Xia, J. & Truant, R. The huntingtin N17 domain is a multifunctional CRM1 and Ran-dependent nuclear and cilial export signal. Hum. Mol. Genet. 22, 1383–1394 (2013).
Caron, N. S., Desmond, C. R., Xia, J. & Truant, R. Polyglutamine domain flexibility mediates the proximity between flanking sequences in huntingtin. Proc. Natl Acad. Sci. USA 110, 14610–14615 (2013).
André, E. A., Braatz, E. M., Liu, J.-P. & Zeitlin, S. O. Generation and characterization of knock-in mouse models expressing versions of huntingtin with either an N17 or a combined PolyQ and proline-rich region deletion. J. Huntingtons Dis. 6, 47–62 (2017).
Bravo-Arredondo, J. M. et al. The folding equilibrium of huntingtin exon 1 monomer depends on its polyglutamine tract. J. Biol. Chem. 293, 19613–19623 (2018).
Newcombe, E. A. et al. Tadpole-like conformations of huntingtin exon 1 are characterized by conformational heterogeneity that persists regardless of polyglutamine length. J. Mol. Biol. 430, 1442–1458 (2018).
Warner, J. B. et al. Monomeric huntingtin exon 1 has similar overall structural features for wild-type and pathological polyglutamine lengths. J. Am. Chem. Soc. 139, 14456–14469 (2017).
Peters-Libeu, C. et al. Disease-associated polyglutamine stretches in monomeric huntingtin adopt a compact structure. J. Mol. Biol. 421, 587–600 (2012).
Gerbich, T. M. & Gladfelter, A. S. Moving beyond disease to function: physiological roles for polyglutamine-rich sequences in cell decisions. Curr. Opin. Cell Biol. 69, 120–126 (2021).
Dragatsis, I., Levine, M. S. & Zeitlin, S. Inactivation of Hdh in the brain and testis results in progressive neurodegeneration and sterility in mice. Nat. Genet. 26, 300–306 (2000).
Livak, K. J. & Schmittgen, T. D. Analysis of relative gene expression data using real-time quantitative PCR and the 2−ΔΔCT method. Methods 25, 402–408 (2001).
Klykov, O. et al. Efficient and robust proteome-wide approaches for cross-linking mass spectrometry. Nat. Protoc. 13, 2964–2990 (2018).
Gu, Z., Gu, L., Eils, R., Schlesner, M. & Brors, B. circlize implements and enhances circular visualization in R. Bioinformatics 30, 2811–2812 (2014).
Schweppe, D. K., Chavez, J. D. & Bruce, J. E. XLmap: an R package to visualize and score protein structure models based on sites of protein cross-linking. Bioinformatics 32, 306–308 (2016).
Marty, M. T. et al. Bayesian deconvolution of mass and ion mobility spectra: from binary interactions to polydisperse ensembles. Anal. Chem. 87, 4370–4376 (2015).
Zivanov, J., Nakane, T. & Scheres, S. H. W. A Bayesian approach to beam-induced motion correction in cryo-EM single-particle analysis. IUCrJ 6, 5–17 (2019).
Rohou, A. & Grigorieff, N. CTFFIND4: Fast and accurate defocus estimation from electron micrographs. J. Struct. Biol. 192, 216–221 (2015).
Caesar, J. et al. SIMPLE 3.0. Stream single-particle cryo-EM analysis in real time. J. Struct. Biol. X 4, 100040 (2020).
Brown, A. et al. Tools for macromolecular model building and refinement into electron cryo-microscopy reconstructions. Acta Crystallogr. D Biol. Crystallogr. 71, 136–153 (2015).
Afonine, P. V. et al. Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallogr. Sect. Struct. Biol. 74, 531–544 (2018).
Prisant, M. G., Williams, C. J., Chen, V. B., Richardson, J. S. & Richardson, D. C. New tools in MolProbity validation: CaBLAM for CryoEM backbone, UnDowser to rethink “waters,” and NGL Viewer to recapture online 3D graphics. Protein Sci. 29, 315–329 (2020).
Pettersen, E. F. et al. UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Sci. 30, 70–82 (2021).
Franke, D. et al. ATSAS 2.8: a comprehensive data analysis suite for small-angle scattering from macromolecular solutions. J. Appl. Crystallogr. 50, 1212–1225 (2017).
Svergun, D., Barberato, C. & Koch, M. H. J. CRYSOL – a program to evaluate X-ray solution scattering of biological macromolecules from atomic coordinates. J. Appl. Crystallogr. 28, 768–773 (1995).
Fischer, H. et al. Determination of the molecular weight of proteins in solution from a single small-angle X-ray scattering measurement on a relative scale. J. Appl. Crystallogr. 43, 101–109 (2010).
Piiadov, V., Ares de Araújo, E., Oliveira Neto, M., Craievich, A. F. & Polikarpov, I. SAXSMoW 2.0: online calculator of the molecular weight of proteins in dilute solution from experimental SAXS data measured on a relative scale. Protein Sci. 28, 454–463 (2019).
Rambo, R. P. & Tainer, J. A. Accurate assessment of mass, models and resolution by small-angle scattering. Nature 496, 477–481 (2013).
Schneidman-Duhovny, D., Hammel, M. & Sali, A. FoXS: a web server for rapid computation and fitting of SAXS profiles. Nucleic Acids Res. 38, W540–W544 (2010).
Franke, D. & Svergun, D. I. DAMMIF, a program for rapid ab-initio shape determination in small-angle scattering. J. Appl. Crystallogr. 42, 342–346 (2009).
Grant, T. D. Ab initio electron density determination directly from solution scattering data. Nat. Methods 15, 191–193 (2018).
Clementi, C., Nymeyer, H. & Onuchic, J. N. Topological and energetic factors: what determines the structural details of the transition state ensemble and ‘en-route’ intermediates for protein folding? An investigation for small globular proteins. J. Mol. Biol. 298, 937–953 (2000).
Kaustov, L. et al. The MLL1 trimeric catalytic complex is a dynamic conformational ensemble stabilized by multiple weak interactions. Nucleic Acids Res. 47, 9433–9447 (2019).
Berlin, K. et al. Recovering a representative conformational ensemble from underdetermined macromolecular structural data. J. Am. Chem. Soc. 135, 16595–16609 (2013).
The PyMOL Molecular Graphics System, Version 1.8 (Schrödinger, LLC., 2015).
Dolinsky, T. J. et al. PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations. Nucleic Acids Res. 35, W522–W525 (2007).
Madeira, F. et al. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res. 47, W636–W641 (2019).
Ashkenazy, H. et al. ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules. Nucleic Acids Res. 44, W344–W350 (2016).
Yazdani, S. et al. Genetic Variability of the SARS-CoV-2 Pocketome. J. Proteome Res. 8, 4212–4215 (2021).
An, J., Totrov, M. & Abagyan, R. Pocketome via comprehensive identification and classification of ligand binding envelopes. Mol. Cell. Proteomics MCP 4, 752–761 (2005).
Harding, R. J. et al. Huntingtin structure is orchestrated by HAP40 and shows a polyglutamine expansion-specific interaction with exon 1 datasets. zenodo https://doi.org/10.5281/zenodo.5514263 (2021).
Harding, R. J. et al. Raw and preprocessed MS data for ‘Huntingtin structure is orchestrated by HAP40 and shows a polyglutamine expansion-specific interaction with exon 1’. figshare https://doi.org/10.23644/uu.14338937.v1 (2021).
Bogdanos, D. P., Gao, B. & Gershwin, M. E. in Comprehensive Physiology 567–598 (American Cancer Society, 2013).

Acknowledgements

We acknowledge the use of the SAXS Core Facility of the Center for Cancer Research (CCR), NCI, National Institutes of Health. NCI SAXS Core is funded by FNLCR contract HHSN261200800001E and the intramural research programme of the NIH, NCI, CCR. This research used 12-ID-B beamline of the Advanced Photon Source, a United States Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory under Contract No. DE-AC02-06CH11357. This research was supported by the CHDI Foundation (to R.J.H., C.H.A., J.B.C.), the Huntington Society of Canada (to R.J.H., C.H.A.), the Wellcome Trust #219477 (to S.M.L., J.D.) and the EU Horizon 2020 programme INFRAIA project Epic-XS Project 823839 (to J.F.H., S.T., A.J.R.H.). R.J.H. is the recipient of the Huntington’s Disease Society of America Berman Topper Career Development Fellowship. The Structural Genomics Consortium is a registered charity (no: 1097737) that receives funds from AbbVie, Bayer AG, Boehringer Ingelheim, Genentech, Genome Canada through Ontario Genomics Institute [OGI-196], the EU and EFPIA through the Innovative Medicines Initiative 2 Joint Undertaking [EUbOPEN grant 875510], Janssen, Merck KGaA (aka EMD in Canada and US), Pfizer, Takeda and the Wellcome Trust [106169/ZZ14/Z].

Author information

Authors and Affiliations

Structural Genomics Consortium, University of Toronto, Toronto, ON, M5G 1L7, Canada
Rachel J. Harding, Magdalena M. Szewczyk, Peter Loppnau, Alma Seitova, Ashley Hutchinson, Matthieu Schapira & Cheryl H. Arrowsmith
Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford, OX1 3RE, UK
Justin C. Deme & Susan M. Lea
Central Oxford Structural Molecular Imaging Centre, University of Oxford, South Parks Road, Oxford, OX1 3RE, UK
Justin C. Deme & Susan M. Lea
Center for Structural Biology, Center for Cancer Research, National Cancer Institute, Frederick, MD, 21702, USA
Justin C. Deme & Susan M. Lea
Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute of Pharmaceutical Sciences, Utrecht University, Padualaan 8, 3584 CH, Utrecht, The Netherlands
Johannes F. Hevler, Sem Tamara & Albert J. R. Heck
Netherlands Proteomics Center, Padualaan 8, 3584 CH, Utrecht, The Netherlands
Johannes F. Hevler, Sem Tamara & Albert J. R. Heck
Princess Margaret Cancer Centre and Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada
Alexander Lemak & Cheryl H. Arrowsmith
Behavioral Neuroscience Program, Department of Psychology, Western Washington University, Bellingham, WA, 98225, USA
Jeffrey P. Cantle & Jeffrey B. Carroll
Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, L8S 4K1, Canada
Nola Begeja, Siobhan Goss & Ray Truant
X-ray Science Division, Argonne National Laboratory, Lemont, IL, 60439, USA
Xiaobing Zuo
Basic Science Program, Frederick National Laboratory for Cancer Research, SAXS Core of NCI, National Institutes of Health, Frederick, MD, 21701, USA
Lixin Fan
Department of Pharmacology and Toxicology, University of Toronto, Toronto, ON, M5S 1A8, Canada
Matthieu Schapira

Contributions

CryoEM experiments and data processing were completed by J.D.; all mass spectrometry experiments were completed by J.F.H. and S.T.; SAXS data collection was completed by X.Z.; mouse experiments were completed by J.P.C.; cell biology experiments were completed by N.B. and S.G.; modelling experiments were completed by A.L.; HTT/caspase-6 western blots were completed by M.M.S.; cloning and baculoviral production were completed by P.L., A.S. and A.H.; all other experiments were completed by R.J.H. R.J.H. conceived the project, designed and conducted experiments, analysed and interpreted data, supervised the project and wrote the manuscript. J.D., J.F.H., S.T., A.L., J.P.C., M.S., N.B. and X.Z. designed and conducted experiments, analysed and interpreted data and contributed to drafting and editing the manuscript. M.M.S., A.H., A.S., P.L. and S.G. conducted experiments and analysed data. R.T., A.J.R.H., J.B.C., C.H.A., S.M.L. and L.F. supervised the work, analysed and interpreted data and contributed to drafting and editing the manuscript.

Corresponding authors

Correspondence to Rachel J. Harding or Cheryl H. Arrowsmith.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review information

Communications Biology thanks the anonymous reviewers for their contribution to the peer review of this work. Primary handling editor: Anam Akhtar. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Peer Review File

Supplementary Information

Description of Additional Supplementary Files

Supplementary Movie 1

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Supplementary Data 6

Supplementary Data 7

Supplementary Data 8

Supplementary Data 9

Supplementary Data 10

Supplementary Data 11

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Harding, R.J., Deme, J.C., Hevler, J.F. et al. Huntingtin structure is orchestrated by HAP40 and shows a polyglutamine expansion-specific interaction with exon 1. Commun Biol 4, 1374 (2021). https://doi.org/10.1038/s42003-021-02895-4

Received: 27 July 2021
Accepted: 09 November 2021
Published: 08 December 2021
DOI: https://doi.org/10.1038/s42003-021-02895-4
