It can be helpful when looking at a protein sequence to be able to quickly identify regions that are hydrophilic, acidic, basic, lipophilic, and so on.
Some online tools like Clustal Omega colorize sequences well, but wouldn't it be nice to have this kind of highlighting in Notepad++?
The EnhanceAnyLexer method should be preferred if you have Notepad++ 8.4.3 or later (any version where EnhanceAnyLexer can be installed).
- Install the EnhanceAnyLexer plugin.
- Add protein_lexer_udl.xml into the userDefineLang folder as a child of the NotepadPlus element.
- Add protein_lexer.ini into EnhanceAnyLexerConfig.ini.
- Now this installation of Notepad++ will have colored proteins!
I've created a script with the PythonScript plugin that colorizes protein files. See the PythonScript plugin's page for info on how to install the plugin.
Once you've installed PythonScript in Notepad++, download the attached protein_lexer.py and drop it into the plugins/PythonScript/scripts subfolder of your Notepad++ installation directory.
If you just want to colorize a file without always running the script at startup, you can just run it from the Plugins->PythonScript->Scripts drop-down menu whenever you open a protein file.
You can set the script to run on startup by opening plugins/PythonScript/scripts/setup.py and adding two lines to import protein_lexer. Then you can go to Plugins->PythonScript->Configuration... from the main menu and change the Initialisation combo box value to ATSTARTUP.
Once the script runs, it will automatically colorize fasta and clustal_num files whenever they are opened in the editor.
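As a rough illustration of the extension check this implies, here is a minimal sketch in plain Python. The names EXTENSIONS and is_protein_file are hypothetical and are not taken from protein_lexer.py; only the two extensions themselves come from this README. The real script also has to hook into Notepad++ through PythonScript, which is not shown here.

```python
import os

# Hypothetical name; only the two extensions come from this README.
EXTENSIONS = {"fasta", "clustal_num"}


def is_protein_file(path):
    """Return True if the file's extension is one we want to colorize."""
    # splitext gives e.g. ".fasta"; drop the dot and ignore case.
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    return ext in EXTENSIONS
```

Adding another extension would then just mean adding a string to the set.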
You can add more file extensions and customize the colors for each type of amino acid by editing protein_lexer.py.
The styles are the all-caps tuples of three ints (red, green, blue) near the top of the file, e.g.
ACID_STYLE = (0xbe, 0, 0) # red
The file extensions are defined just above the styles.
By default the colors for amino acids are as follows:
Any characters other than the standard one-letter codes will also be colored black.
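To make the style tuples more concrete, here is a minimal sketch of how a residue-to-color lookup could be organized. Only the value of ACID_STYLE is quoted from protein_lexer.py. Every other name, the other color values, and the grouping of residues are assumptions made for illustration (one common chemistry-based grouping), not the script's actual defaults.

```python
# Only ACID_STYLE's value is quoted from protein_lexer.py; the rest is assumed.
ACID_STYLE = (0xbe, 0, 0)              # red
BASIC_STYLE = (0, 0, 0xbe)             # assumed: blue
POLAR_STYLE = (0, 0x80, 0)             # assumed: green
LIPOPHILIC_STYLE = (0x80, 0x80, 0x80)  # assumed: gray
DEFAULT_STYLE = (0, 0, 0)              # black, for any other character

# Assumed grouping of the 20 standard one-letter codes.
GROUPS = {
    "DE": ACID_STYLE,
    "KRH": BASIC_STYLE,
    "STNQCY": POLAR_STYLE,
    "AVLIMFWPG": LIPOPHILIC_STYLE,
}
STYLE_BY_RESIDUE = {
    aa: style for letters, style in GROUPS.items() for aa in letters
}


def style_for(char):
    """Look up the color for one character, falling back to black."""
    return STYLE_BY_RESIDUE.get(char, DEFAULT_STYLE)


def style_runs(sequence):
    """Collapse a sequence into (start, length, style) runs of equal style."""
    runs = []
    for i, ch in enumerate(sequence):
        style = style_for(ch)
        if runs and runs[-1][2] == style:
            start, length, _ = runs[-1]
            runs[-1] = (start, length + 1, style)
        else:
            runs.append((i, 1, style))
    return runs
```

Grouping adjacent characters into runs means a styling call would only be needed once per run rather than once per character.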
- The lexer is quite slow. For example, there's a noticeable delay in lexing even a 5 KB file. It's faster when loading a file than when re-lexing a file that is already open, though. Not sure if there's any way to fix this.
- Consider only applying styles to large blocks of several (say 8+) contiguous UPPERCASELETTERS. That might reduce performance though.
- For some more recent versions of Notepad++ (this probably doesn't apply for anything before 8.4.6), the little colored swatch at the side of a line that indicates if there was a saved or unsaved change since the file was opened will consume the entire line once the plugin has been run (see below). Not sure how to fix this.
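The idea in the list above of only styling blocks of 8+ contiguous uppercase letters could be sketched with a regular expression. This is an untested idea rather than something protein_lexer.py does. MIN_RUN, RUN_PATTERN, and sequence_blocks are hypothetical names, and the sketch only covers ASCII uppercase letters.

```python
import re

MIN_RUN = 8  # assumed threshold, taken from the "say 8+" suggestion
RUN_PATTERN = re.compile(r"[A-Z]{%d,}" % MIN_RUN)


def sequence_blocks(text):
    """Return (start, end) offsets of each run of MIN_RUN+ uppercase letters."""
    return [(m.start(), m.end()) for m in RUN_PATTERN.finditer(text)]
```

Only the returned spans would then be passed to the styling step, so headers and short uppercase words would be left alone. Whether the extra regex pass costs more than it saves would have to be measured.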
For reference, it should look like this: