Class SnpEffPredictorFactory

java.lang.Object
org.snpeff.snpEffect.factory.SnpEffPredictorFactory
Direct Known Subclasses:
SnpEffPredictorFactoryFeatures, SnpEffPredictorFactoryGenesFile, SnpEffPredictorFactoryGff, SnpEffPredictorFactoryKnownGene, SnpEffPredictorFactoryRefSeq

public abstract class SnpEffPredictorFactory extends Object
This class creates a SnpEffectPredictor from a file (or a set of files) and a configuration.
Author:
pcingola
  • Field Details

    • MARK

      public static final int MARK
    • MIN_TOTAL_FRAME_COUNT

      public static int MIN_TOTAL_FRAME_COUNT
  • Constructor Details

    • SnpEffPredictorFactory

      public SnpEffPredictorFactory(Config config, int inOffset)
  • Method Details

    • add

      protected void add(Cds cds)
    • add

      protected void add(Chromosome chromo)
    • add

      protected Exon add(Exon exon)
      Add an exon
      Parameters:
      exon -
      Returns:
      The exon added. Note: If an exon with the same ID already exists, the old exon is returned. If an exon exists with the same ID and the same coordinates, a new exon is added with a different ID.
    • add

      protected void add(Gene gene)
      Add a Gene
    • add

      protected void add(Marker marker)
      Add a generic Marker
    • add

      protected void add(Transcript tr)
      Add a transcript
    • addMarker

      protected void addMarker(Marker marker, boolean unique)
      Add a marker to the collection
    • addSequences

      protected void addSequences(String chr, String chrSeq)
      Add genomic reference sequences
    • adjustChromosomes

      protected void adjustChromosomes()
      Adjust chromosome length using gene information. This is used when the sequence is not available (which makes sense only for test cases and debugging).
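As an illustration of deriving a chromosome length from gene information, here is a minimal standalone sketch. It does not use the SnpEff classes: `ChromoLengthSketch`, `GeneSpan` and `lengthsFromGenes` are hypothetical names. The rule "length = largest gene end + 1" (with zero-based, closed coordinates) is an assumption made for the example, not a statement of what this method does internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ChromoLengthSketch {
    // Hypothetical minimal gene record: chromosome name plus a
    // zero-based, closed interval [start, end].
    static class GeneSpan {
        final String chr;
        final int start;
        final int end;

        GeneSpan(String chr, int start, int end) {
            this.chr = chr;
            this.start = start;
            this.end = end;
        }
    }

    // Assumption: estimate each chromosome's length as (largest gene end + 1).
    static Map<String, Integer> lengthsFromGenes(List<GeneSpan> genes) {
        Map<String, Integer> lengths = new HashMap<>();
        for (GeneSpan g : genes) {
            lengths.merge(g.chr, g.end + 1, Integer::max);
        }
        return lengths;
    }

    public static void main(String[] args) {
        List<GeneSpan> genes = new ArrayList<>();
        genes.add(new GeneSpan("chr1", 100, 999));
        genes.add(new GeneSpan("chr1", 5000, 7999));
        genes.add(new GeneSpan("chr2", 0, 49));

        Map<String, Integer> lengths = lengthsFromGenes(genes);
        System.out.println("chr1: " + lengths.get("chr1")); // chr1: 8000
        System.out.println("chr2: " + lengths.get("chr2")); // chr2: 50
    }
}
```

An estimate of this kind is only a lower bound on the real chromosome length, which is consistent with the note that it makes sense for test cases and debugging.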
    • adjustTranscripts

      protected void adjustTranscripts()
      Adjust transcripts: recalculate start, end, strand, etc.
    • beforeExonSequences

      protected void beforeExonSequences()
      Perform some actions before reading sequences
    • codingFromCds

      protected void codingFromCds()
      Only coding transcripts have CDS: make sure that transcripts having CDS are protein coding. It might not always be "precise" though:

        $ grep CDS genes.gtf | cut -f 2 | ~/snpEff/scripts/uniqCount.pl
        113 IG_C_gene
        64 IG_D_gene
        24 IG_J_gene
        366 IG_V_gene
        21 TR_C_gene
        3 TR_D_gene
        82 TR_J_gene
        296 TR_V_gene
        461 non_stop_decay
        63322 nonsense_mediated_decay
        905 polymorphic_pseudogene
        34 processed_transcript
        1340112 protein_coding
    • collapseZeroLenIntrons

      protected void collapseZeroLenIntrons()
      Collapse exons having zero size introns between them
    • create

      public abstract SnpEffectPredictor create()
    • createRandSequences

      protected void createRandSequences()
      Create random sequences for exons. Note: This is only used for test cases!
    • deleteRedundant

      protected void deleteRedundant()
      Consolidate transcripts: if two exons are right next to each other, join them. E.g. exon1:1234-2345, exon2:2346-2400 => exon:1234-2400. This happens mostly in GTF files, where the stop codon is specified separately from the exon info.
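The join of adjacent exons described above can be sketched as follows. This is a standalone illustration under stated assumptions, not the SnpEff implementation: `JoinAdjacentExonsSketch`, `Interval` and `joinAdjacent` are hypothetical names, intervals are closed, and only exact adjacency (next start equals previous end plus one) is joined. Overlapping intervals are deliberately left alone.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class JoinAdjacentExonsSketch {
    // Hypothetical minimal exon: closed interval [start, end].
    static class Interval {
        final int start;
        final int end;

        Interval(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return start + "-" + end;
        }
    }

    // Join intervals that sit exactly next to each other.
    static List<Interval> joinAdjacent(List<Interval> exons) {
        List<Interval> sorted = new ArrayList<>(exons);
        sorted.sort(Comparator.comparingInt((Interval iv) -> iv.start));

        List<Interval> joined = new ArrayList<>();
        for (Interval ex : sorted) {
            Interval prev = joined.isEmpty() ? null : joined.get(joined.size() - 1);
            if (prev != null && prev.end + 1 == ex.start) {
                // Adjacent: replace the previous interval with the joined one
                joined.set(joined.size() - 1, new Interval(prev.start, ex.end));
            } else {
                joined.add(ex);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<Interval> exons = new ArrayList<>();
        exons.add(new Interval(2346, 2400)); // e.g. a separately listed stop codon
        exons.add(new Interval(1234, 2345));
        exons.add(new Interval(3000, 3100));
        System.out.println(joinAdjacent(exons)); // [1234-2400, 3000-3100]
    }
}
```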
    • exonsFromCds

      protected void exonsFromCds()
      Create exons from CDS info
    • exonsFromCds

      protected void exonsFromCds(Transcript tr)
      Create exons from CDS info. WARNING: We might end up with redundant exons if some exons existed before this process.
      Parameters:
      tr - Transcript with CDS info, but no exons
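To make the idea concrete, here is a minimal standalone sketch of deriving exons from CDS intervals. It does not use the SnpEff `Transcript`, `Cds` or `Exon` classes: `ExonsFromCdsSketch`, `TranscriptStub` and `Interval` are hypothetical stand-ins. Skipping transcripts that already have exons is a choice made in this sketch to sidestep the redundancy mentioned in the warning; it is not necessarily what the real method does.

```java
import java.util.ArrayList;
import java.util.List;

class ExonsFromCdsSketch {
    // Hypothetical closed interval [start, end].
    static class Interval {
        final int start;
        final int end;

        Interval(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return start + "-" + end;
        }
    }

    // Hypothetical transcript stand-in holding CDS and exon intervals.
    static class TranscriptStub {
        final List<Interval> cdss = new ArrayList<>();
        final List<Interval> exons = new ArrayList<>();
    }

    // Create one exon per CDS interval; returns the number of exons created.
    static int exonsFromCds(TranscriptStub tr) {
        // Sketch-only guard: do nothing if exons already exist,
        // so that no redundant exons are produced.
        if (!tr.exons.isEmpty()) return 0;

        for (Interval cds : tr.cdss) {
            tr.exons.add(new Interval(cds.start, cds.end));
        }
        return tr.exons.size();
    }

    public static void main(String[] args) {
        TranscriptStub tr = new TranscriptStub();
        tr.cdss.add(new Interval(100, 199));
        tr.cdss.add(new Interval(300, 450));

        System.out.println(exonsFromCds(tr)); // 2
        System.out.println(tr.exons);         // [100-199, 300-450]
        System.out.println(exonsFromCds(tr)); // 0 (exons already present)
    }
}
```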
    • findGene

      protected Gene findGene(String id)
    • findGene

      protected Gene findGene(String geneId, String id)
    • findMarker

      protected Marker findMarker(String id)
    • findTranscript

      protected Transcript findTranscript(String id)
    • findTranscript

      protected Transcript findTranscript(String trId, String id)
    • getOrCreateChromosome

      protected Chromosome getOrCreateChromosome(String chromoName)
      Get a chromosome. If it doesn't exist, create it
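The get-or-create behaviour described above is a common lookup pattern. A minimal standalone sketch using `Map.computeIfAbsent` from the Java standard library is shown below. `GetOrCreateSketch` and `ChromoStub` are hypothetical names, and the real method may store chromosomes differently (for example, in the genome object rather than a local map).

```java
import java.util.HashMap;
import java.util.Map;

class GetOrCreateSketch {
    // Hypothetical stand-in for a chromosome object.
    static class ChromoStub {
        final String name;

        ChromoStub(String name) {
            this.name = name;
        }
    }

    private final Map<String, ChromoStub> chromosomesByName = new HashMap<>();

    // Return the existing chromosome for this name, creating it on first use.
    ChromoStub getOrCreateChromosome(String chromoName) {
        return chromosomesByName.computeIfAbsent(chromoName, ChromoStub::new);
    }

    int size() {
        return chromosomesByName.size();
    }

    public static void main(String[] args) {
        GetOrCreateSketch sketch = new GetOrCreateSketch();
        ChromoStub first = sketch.getOrCreateChromosome("chr1");
        ChromoStub second = sketch.getOrCreateChromosome("chr1");
        System.out.println(first == second); // true
        System.out.println(sketch.size());   // 1
    }
}
```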
    • getProteinByTrId

      public Map<String,String> getProteinByTrId()
    • parsePosition

      protected int parsePosition(String posStr)
      Parse a string as a 'position'. Note: It subtracts 'inOffset' so that all coordinates are zero-based.
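The offset arithmetic can be sketched in a few lines. This is a standalone illustration, not the SnpEff code: `ParsePositionSketch` is a hypothetical class, its `inOffset` field mirrors the constructor argument of the same name, and the use of `Integer.parseInt` with `trim()` is an assumption about how the string is parsed.

```java
class ParsePositionSketch {
    // Mirrors the 'inOffset' constructor argument: 1 for one-based input
    // coordinates, 0 for input that is already zero-based.
    private final int inOffset;

    ParsePositionSketch(int inOffset) {
        this.inOffset = inOffset;
    }

    // Parse a coordinate string and shift it so the result is zero-based.
    int parsePosition(String posStr) {
        return Integer.parseInt(posStr.trim()) - inOffset;
    }

    public static void main(String[] args) {
        ParsePositionSketch oneBased = new ParsePositionSketch(1);
        System.out.println(oneBased.parsePosition("1234")); // 1233

        ParsePositionSketch zeroBased = new ParsePositionSketch(0);
        System.out.println(zeroBased.parsePosition("1234")); // 1234
    }
}
```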
    • readExonSequences

      protected void readExonSequences()
      Read exon sequences from a FASTA file
    • replaceTranscript

      protected void replaceTranscript(Transcript trOld, Transcript trNew)
    • setCircularCorrectLargeGap

      public void setCircularCorrectLargeGap(boolean circularCorrectLargeGap)
    • setCreateRandSequences

      public void setCreateRandSequences(boolean createRandSequences)
    • setDebug

      public void setDebug(boolean debug)
    • setFastaFile

      public void setFastaFile(String fastaFile)
    • setFileName

      public void setFileName(String fileName)
    • setRandom

      public void setRandom(Random random)
    • setReadSequences

      public void setReadSequences(boolean readSequences)
      Set whether sequences are read. Note: This is only used for debugging and testing.
    • setStoreSequences

      public void setStoreSequences(boolean storeSequences)
    • setVerbose

      public void setVerbose(boolean verbose)
    • showChromoNamesDifferences

      protected String showChromoNamesDifferences()
      Show differences in chromosome names