Revisiting the TALE repeat.
Transcription activator-like (TAL) effectors specifically bind to double stranded (ds) DNA through a central domain of tandem repeats. Each TAL effector (TALE) repeat comprises 33-35 amino acids and recognizes one specific DNA base through a highly variable residue at a fixed position in the repeat. Structural studies have revealed the molecular basis of DNA recognition by TALE repeats. Examination of the overall structure reveals that the basic building block of TALE protein, namely a helical hairpin, is one-helix shifted from the previously defined TALE motif. Here we wish to suggest a structure-based re-demarcation of the TALE repeat which starts with the residues that bind to the DNA backbone phosphate and concludes with the base-recognition hyper-variable residue. This new numbering system is consistent with the α-solenoid superfamily to which TALE belongs, and reflects the structural integrity of TAL effectors. In addition, it confers integral number of TALE repeats that matches the number of bound DNA bases. We then present fifteen crystal structures of engineered dHax3 variants in complex with target DNA molecules, which elucidate the structural basis for the recognition of bases adenine (A) and guanine (G) by reported or uncharacterized TALE codes. Finally, we analyzed the sequence-structure correlation of the amino acid residues within a TALE repeat. The structural analyses reported here may advance the mechanistic understanding of TALE proteins and facilitate the design of TALEN with improved affinity and specificity.