This protocol uses the Makefile ensemblgenomes_FTP_client.mk, which can also be used to install organisms from Ensembl Genomes, as explained in Installing genomes from Ensembl Genomes.
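This procedure supports the installation of arbitrary genomes from any source, provided that 4 input files are available, named with the following extensions:
SPECIES_RSAT_ID.dna.toplevel.fa
SPECIES_RSAT_ID.dna_rm.genome.fa
SPECIES_RSAT_ID.gtf
SPECIES_RSAT_ID.pep.all.fa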
where SPECIES_RSAT_ID is a string identifying this organism and its annotation.
Note: parse-gtf also accepts GFF3 files, but the script expects the .gtf extension.
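Before running the installation, it may help to confirm that the four input files are in place. The following is a minimal sketch, not part of RSAT: the function name check_genome_inputs is hypothetical, and it assumes the files are named SPECIES_RSAT_ID followed by .dna.toplevel.fa, .dna_rm.genome.fa, .gtf and .pep.all.fa inside a single directory.

```shell
# Hypothetical helper (not an RSAT command): report which of the four
# expected input files are missing or empty in GENOME_DIR.
# Usage: check_genome_inputs GENOME_DIR SPECIES_RSAT_ID
# The body runs in a subshell so its variables do not leak to the caller.
check_genome_inputs() (
    genome_dir="$1"
    species_id="$2"
    missing=0
    # Assumed naming scheme: SPECIES_RSAT_ID.<extension>
    for ext in dna.toplevel.fa dna_rm.genome.fa gtf pep.all.fa; do
        if [ ! -s "${genome_dir}/${species_id}.${ext}" ]; then
            echo "missing or empty: ${genome_dir}/${species_id}.${ext}" >&2
            missing=$((missing + 1))
        fi
    done
    # Succeed only if every file was found
    [ "$missing" -eq 0 ]
)

# Example call, assuming $RSAT and $SPECIES_RSAT_ID are already set:
# check_genome_inputs "$RSAT/data/genomes/${SPECIES_RSAT_ID}/genome" "$SPECIES_RSAT_ID"
```

The function returns a non-zero status when a file is absent, so it can guard the make command with &&.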
For instance, to install assembly Wm82.a2.v1 of Glycine max from JGI Phytozome, we could do:
cd $RSAT
SPECIES_RSAT_ID=Glycine_max.Wm82.a2.v1.JGI
mkdir -p $RSAT/data/genomes/${SPECIES_RSAT_ID}/genome
# copy the 4 input files there (.dna.toplevel.fa, .dna_rm.genome.fa, .gtf, .pep.all.fa)
make -f makefiles/ensemblgenomes_FTP_client.mk SPECIES=Glycine_max \
  SPECIES_DIR=/var/www/html/rsat/data/genomes/$SPECIES_RSAT_ID TAXON_ID=3847 GTF_SOURCE=JGI \
  SPECIES_RSAT_ID=$SPECIES_RSAT_ID install_from_gtf
Note that the TAXON_ID can be looked up in the NCBI Taxonomy database (https://www.ncbi.nlm.nih.gov/taxonomy).
The newly installed species will be added to $RSAT/data/supported_organisms.tab and should then be listed by the following command:
supported-organisms
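To check for one specific organism rather than reading the whole list, the table file can be searched directly. This is a rough sketch under assumptions: the function name organism_is_listed is hypothetical, and the table is assumed to be tab-separated with comment or header lines starting with ; or #. The identifier is matched against every field, since the column layout is not specified here.

```shell
# Hypothetical helper (not an RSAT command): succeed if ORGANISM_ID occurs
# as a whole tab-separated field on any non-comment line of TABLE_FILE.
# Usage: organism_is_listed TABLE_FILE ORGANISM_ID
organism_is_listed() (
    awk -F '\t' -v id="$2" '
        /^[;#]/ { next }    # assumed comment/header prefixes
        { for (i = 1; i <= NF; i++) if ($i == id) { found = 1; exit } }
        END { exit (found ? 0 : 1) }
    ' "$1"
)

# Example call, assuming $RSAT and $SPECIES_RSAT_ID are already set:
# organism_is_listed "$RSAT/data/supported_organisms.tab" "$SPECIES_RSAT_ID" && echo "installed"
```

Matching whole fields avoids false positives from identifiers that share a prefix, such as other assemblies of the same species.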