Background
There appears to be a bug in Perl versions older than 5.14 that affects Sys::SigAction (SigAction.pm), a module that Maker2 needs in order to run. After installation, running the Maker2 Perl script fails with the following error message:
Assertion ((sv)->sv_flags & (0x00200000|0x00400000|0x00800000)) failed: file "mg.c", line 88 at /home/mderbyshire/local/perl5/lib/Sys/SigAction.pm line 145.
Compilation failed in require at ./maker line 45.
BEGIN failed--compilation aborted at ./maker line 45.
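To see whether a given machine is likely to be affected, the version of the perl on the PATH can be compared against 5.14. The following is a minimal shell sketch: the function name perl_is_affected is made up for illustration, and the 5.14 cutoff is simply the one stated above rather than something checked against the Perl changelog.

```shell
# Succeed (exit status 0) if the given Perl version string, e.g. "5.10.1",
# is older than 5.14. The name perl_is_affected is hypothetical.
perl_is_affected() {
    major=${1%%.*}          # text before the first dot
    rest=${1#*.}            # text after the first dot
    minor=${rest%%.*}       # second version component
    [ "$major" -lt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -lt 14 ]; }
}

# Report on whichever perl is currently first on the PATH, if there is one.
if command -v perl >/dev/null 2>&1; then
    current=$(perl -e 'printf "%vd", $^V')   # e.g. 5.24.1
    if perl_is_affected "$current"; then
        echo "perl $current is older than 5.14 and may hit the assertion"
    else
        echo "perl $current is 5.14 or newer"
    fi
fi
```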
Installing a newer Perl without the bug
There is a patch for this bug, but I opted to install an alternate version of Perl. The installation instructions here worked for version 5.24.1 and were written for a SuSE system, which has a bug in the header used by ODBM_File. First download and unpack the tarball. Then swap the programming environment from cray to gnu. Then run Configure, accepting the default answer to every question, disabling the ODBM_File extension and specifying gcc as the compiler. Then run make:
tar xvzf perl-5.24.1.tar.gz
module swap PrgEnv-cray PrgEnv-gnu
cd perl-5.24.1
./Configure -des -Dprefix=/home/localperl -Dnoextensions=ODBM_File -Dcc=gcc
make
Before running the above commands, create the directory /home/localperl. The build is configured to install into this local directory because users on Pawsey do not have root access.
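The directory step can be sketched as a small check to run before ./Configure. This is only an illustration: ensure_prefix is a hypothetical helper name, and the path is the one used in this document.

```shell
# Create the install prefix if needed and confirm it is a writable directory.
# ensure_prefix is a hypothetical helper name.
ensure_prefix() {
    mkdir -p "$1" && [ -d "$1" ] && [ -w "$1" ]
}

# Intended use, before ./Configure (path taken from the text above):
#   ensure_prefix /home/localperl || echo "cannot write to /home/localperl" >&2
```

Checking first avoids discovering a permissions problem only when make install runs at the end of a long build.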
Then test the build:
make test
cd t
./perl harness
This will result in some failures. The harness command gives more information:
(Wstat: 1536 Tests: 60 Failed: 6)
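The pass rate for a summary line like this one can be worked out with a short shell function. This is a sketch that assumes the line has the "Tests: N Failed: M" layout shown above; pass_rate is a made-up name.

```shell
# Print the integer percentage of passed tests for one harness summary line.
# pass_rate is a hypothetical helper; it expects "Tests: N" and "Failed: M".
pass_rate() {
    printf '%s\n' "$1" | awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "Tests:")  tests  = $(i + 1)
            if ($i == "Failed:") failed = $(i + 1)   # "6)" is read as 6
        }
        if (tests > 0) printf "%d\n", (tests - failed) * 100 / tests
    }'
}

pass_rate "(Wstat: 1536 Tests: 60 Failed: 6)"   # prints 90
```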
So 90% of the tests in that test script passed (54 of 60). Then install and check the new binary:
make install
user@host:~> /home/localperl/bin/perl -v
This is perl 5, version 24, subversion 1 (v5.24.1) built for x86_64-linux
Copyright 1987-2017, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.
So, it seems like the Perl interpreter works, at least at the surface level. Then add it to the system path ahead of the existing Perl distribution, and add its libraries to the Perl library path:
export PATH=/home/localperl/bin:$PATH
export PERL5LIB=/home/localperl/lib:$PERL5LIB
From anywhere on the system, you should now be able to do this:
user@host:~> perl -v
This is perl 5, version 24, subversion 1 (v5.24.1) built for x86_64-linux
Copyright 1987-2017, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.
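The version banner alone does not say which binary was run, so it can help to confirm that the first perl on the PATH really is the new build. A minimal sketch, with perl_resolves_to as a hypothetical helper name and the install path taken from this document:

```shell
# Succeed if the first perl found on PATH is exactly the expected binary.
# perl_resolves_to is a hypothetical helper name.
perl_resolves_to() {
    found=$(command -v perl) || return 1
    [ "$found" = "$1" ]
}

if perl_resolves_to /home/localperl/bin/perl; then
    echo "perl on PATH is the new build"
else
    echo "perl on PATH is not /home/localperl/bin/perl; check the order of PATH" >&2
fi
```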
Then, re-build Maker2 from source:
perl Build.PL
./Build installdeps
./Build install
Now try Maker2 again:
user@host> ./maker
Argument "2.56_01" isn't numeric in numeric ge (>=) at /home/mderbyshire/localperl/lib/site_perl/5.24.1/x86_64-linux/forks.pm line 1570.
ERROR: Control files not found
MAKER version 2.31.9
Usage:
maker [options] <maker_opts> <maker_bopts> <maker_exe>
Description:
MAKER is a program that produces gene annotations in GFF3 format using
evidence such as EST alignments and protein homology. MAKER can be used to
produce gene annotations for new genomes as well as update annotations
from existing genome databases.
The three input arguments are control files that specify how MAKER should
behave. All options for MAKER should be set in the control files, but a
few can also be set on the command line. Command line options provide a
convenient machanism to override commonly altered control file values.
MAKER will automatically search for the control files in the current
working directory if they are not specified on the command line.
Input files listed in the control options files must be in fasta format
unless otherwise specified. Please see MAKER documentation to learn more
about control file configuration. MAKER will automatically try and
locate the user control files in the current working directory if these
arguments are not supplied when initializing MAKER.
It is important to note that MAKER does not try and recalculated data that
it has already calculated. For example, if you run an analysis twice on
the same dataset you will notice that MAKER does not rerun any of the
BLAST analyses, but instead uses the blast analyses stored from the
previous run. To force MAKER to rerun all analyses, use the -f flag.
MAKER also supports parallelization via MPI on computer clusters. Just
launch MAKER via mpiexec (i.e. mpiexec -n 40 maker). MPI support must be
configured during the MAKER installation process for this to work though
Options:
-genome|g <file> Overrides the genome file path in the control files
-RM_off|R Turns all repeat masking options off.
-datastore/ Forcably turn on/off MAKER's two deep directory
nodatastore structure for output. Always on by default.
-old_struct Use the old directory styles (MAKER 2.26 and lower)
-base <string> Set the base name MAKER uses to save output files.
MAKER uses the input genome file name by default.
-tries|t <integer> Run contigs up to the specified number of tries.
-cpus|c <integer> Tells how many cpus to use for BLAST analysis.
Note: this is for BLAST and not for MPI!
-force|f Forces MAKER to delete old files before running again.
This will require all blast analyses to be rerun.
-again|a recaculate all annotations and output files even if no
settings have changed. Does not delete old analyses.
-quiet|q Regular quiet. Only a handlful of status messages.
-qq Even more quiet. There are no status messages.
-dsindex Quickly generate datastore index file. Note that this
will not check if run settings have changed on contigs
-nolock Turn off file locks. May be usful on some file systems,
but can cause race conditions if running in parallel.
-TMP Specify temporary directory to use.
-CTL Generate empty control files in the current directory.
-OPTS Generates just the maker_opts.ctl file.
-BOPTS Generates just the maker_bopts.ctl file.
-EXE Generates just the maker_exe.ctl file.
-MWAS <option> Easy way to control mwas_server for web-based GUI
options: STOP
START
RESTART
-version Prints the MAKER version.
-help|? Prints this usage statement.
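Note: the first line of this output is a Perl warning raised inside forks.pm, because the Storable version string "2.56_01" contains an underscore and so is not a plain number. The second line only means that MAKER found no control files in the current directory, which is expected at this point: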
Argument "2.56_01" isn't numeric in numeric ge (>=) at /home/mderbyshire/localperl/lib/site_perl/5.24.1/x86_64-linux/forks.pm line 1570.
ERROR: Control files not found
The warning can be removed with a (pretty dodgy) hack. WARNING: DO NOT ATTEMPT THIS UNLESS YOU AT LEAST SORT OF KNOW WHAT YOU ARE DOING; THERE IS A RISK OF BREAKING THE forks MODULE IN THE LOCAL PERL INSTALLATION. The offending line of code is line 1570 of forks.pm (installed under site_perl in the above Perl distribution, so it is an add-on module rather than part of the Perl core):
1570 local $Storable::Deparse = 1 if $Storable::VERSION >= 2.05;
1571 local $Storable::Eval = 1 if $Storable::VERSION >= 2.05;
To hack our way out of this situation, we can just remove the '_01' from the Storable version. First copy $Storable::VERSION into a new variable, strip the suffix from the copy, then use that variable in the two comparisons:
1570 my $version = $Storable::VERSION;
1571 $version =~ s/_01//g;
1572 local $Storable::Deparse = 1 if $version >= 2.05;
1573 local $Storable::Eval = 1 if $version >= 2.05;
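Because this edit changes an installed module by hand, it may be worth keeping an untouched copy so the hack can be undone. A minimal shell sketch follows; backup_once and restore_backup are hypothetical helper names, and the ".orig" suffix is an arbitrary choice.

```shell
# Copy FILE to FILE.orig unless a backup already exists, so that repeated
# runs never overwrite the pristine copy. Helper names are hypothetical.
backup_once() {
    [ -e "$1.orig" ] || cp -p "$1" "$1.orig"
}

# Put the pristine copy back, undoing any manual edits to FILE.
restore_backup() {
    [ -e "$1.orig" ] && cp -pf "$1.orig" "$1"
}

# Intended use (path as reported in the warning above):
#   backup_once /home/mderbyshire/localperl/lib/site_perl/5.24.1/x86_64-linux/forks.pm
```

Run backup_once before opening forks.pm in an editor; restore_backup returns the file to its installed state if the module stops loading.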
Then you can run Maker2 without a hitch:
user@host> maker
MAKER version 2.31.9
Usage:
maker [options] <maker_opts> <maker_bopts> <maker_exe>
...
Now try running Maker on the example data. First cd into the example data directory and create maker config files:
cd data
maker -CTL
Then set the following options in maker_opts.ctl:
genome=dpp_contig.fasta
est=dpp_est.fasta
protein=dpp_protein.fasta
est2genome=1
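These settings can also be applied from the shell instead of a text editor. The sketch below assumes that the generated maker_opts.ctl already lists each option in the form "key= #comment"; that layout is an assumption, and set_opt is a hypothetical helper that appends the option if it is not found.

```shell
# Set KEY=VALUE in a control file, keeping any trailing "#comment" on the line.
# If the key is absent, append it. set_opt is a hypothetical helper; VALUE must
# not contain "|" or "&", which are special in the sed expression below.
set_opt() {
    file=$1 key=$2 value=$3
    if grep -q "^$key=" "$file"; then
        sed "s|^$key=[^#]*|$key=$value |" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Intended use, from the directory holding the generated control files:
#   set_opt maker_opts.ctl genome dpp_contig.fasta
#   set_opt maker_opts.ctl est dpp_est.fasta
#   set_opt maker_opts.ctl protein dpp_protein.fasta
#   set_opt maker_opts.ctl est2genome 1
```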
Then run maker:
maker
...
flattening protein clusters
prepare section files
processing the chunk divide
processing contig output
Maker is now finished!!!
Start_time: 1486874187
End_time: 1486874296
Elapsed: 109
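The elapsed figure is just the difference between the two timestamps, which look like Unix epoch seconds (an assumption; the output does not state the unit). The arithmetic can be checked in the shell:

```shell
# Timestamps copied from the MAKER output above.
start=1486874187
end=1486874296
echo "Elapsed: $((end - start)) seconds"   # prints: Elapsed: 109 seconds
```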
Seems to have worked! (Mwahahahahaha)