Using Read-Split-Run

    The options required by the tool are explained below. All fields are required, but some have default values that you can and should adjust to fit your needs.

    • Experiment Settings

      The experiment settings allow you to perform the analysis in different ways, according to your needs.

      • Mode

        Analytical or Comparative
        • Analytical mode simply finds the split regions in your data.
        • Comparative mode shows the regions that two sets of data share and the regions found in only one of them. It runs the pipeline twice, once for each set, and produces a side-by-side comparison of the identified regions.
      • Reads Type

        Specifies whether your reads are single-end or paired-end. Your reads provider should be able to tell you which applies. This is a feature of bowtie.

        • Single: not paired. All reads are treated as normal.
        • Paired: reads are treated as being part of a set and are handled specially.
      • Replicates

        Does your experiment have replicate data? This drop-down allows you to add more data fields to the experiment. At present, we have room for up to 3 replicates. This number may change; if you would like to run an analysis with additional replicates, you may contact us at help@bioinf1.indstate.edu.

    • Data

    • Genome Selection

      Select the genome to which you want to align your reads. We have three genomes currently available, listed by their "common name" in the table below, along with the assembly we use for alignments.

      Common Name   Species        Assembly
      Mouse         Mus musculus   mm9sp35
      Human         Homo sapiens   hg19sp101
    • Select File(s)

      Provide your reads files in these fields. The only format currently accepted is plain-text FASTQ; gzipped FASTQ may be supported in the future.
      Note: The fields are ordered first by experiment (in the case of a comparative run), then by replicate, then by mate (left or right, in the case of paired reads). The field labelled "Select RNA-Seq File" should be the left mate of your first replicate for your first experiment...
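
      The ordering described in the note can be sketched as a nested enumeration. This is only an illustration: the function name and the tuple layout are hypothetical, and the labels shown on the actual form may differ.

```python
from itertools import product


def field_order(experiments=1, replicates=1, paired=False):
    """Return (experiment, replicate, mate) tuples in upload-field order.

    Hypothetical helper: experiment varies slowest, then replicate,
    then mate (left before right when the reads are paired).
    """
    mates = ["left", "right"] if paired else ["single"]
    return list(product(range(1, experiments + 1),
                        range(1, replicates + 1),
                        mates))


# A comparative run (2 experiments) with 2 replicates of paired reads.
for experiment, replicate, mate in field_order(2, 2, paired=True):
    print(f"experiment {experiment}, replicate {replicate}, {mate} mate")
```

      In this sketch the first field is always experiment 1, replicate 1, left mate, which matches the description of the "Select RNA-Seq File" field above.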

    • Pre-processing

      For the bowtie alignment, and for our own formatting, certain properties of your data must be known, and a few parameters are left for you to decide.

    • Quality Encoding

      The encoding used for the quality data of the reads must be specified (this is a bowtie requirement). If you know which encoding was used, select it from the drop-down list; otherwise, press the "check" button to let our server guess.
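
      How the server guesses is not described here. A common heuristic, shown below as a sketch only, is to look at the lowest character code in the quality lines. The function name, the returned labels, and the 40,000-line limit applied here are assumptions, and the code assumes four-line FASTQ records (no wrapped sequences).

```python
from itertools import islice


def guess_quality_encoding(fastq_path, max_lines=40_000):
    """Guess the quality encoding of a plain-text FASTQ file.

    Heuristic only (not necessarily what the server does):
      lowest code < 59  -> Phred+33
      lowest code < 64  -> Solexa+64
      otherwise         -> Phred+64
    Returns None if no quality line was seen.
    """
    lowest = None
    with open(fastq_path, "r", encoding="ascii") as handle:
        for index, line in enumerate(islice(handle, max_lines)):
            if index % 4 != 3:          # quality is the 4th line of each record
                continue
            quality = line.rstrip("\r\n")
            if quality:
                line_min = min(ord(ch) for ch in quality)
                lowest = line_min if lowest is None else min(lowest, line_min)
    if lowest is None:
        return None
    if lowest < 59:
        return "phred33"
    if lowest < 64:
        return "solexa64"
    return "phred64"
```

      A file that contains only high-quality bases can be misclassified by a heuristic like this, so select the encoding yourself whenever you know it.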

    • Length of Reads

      The length of the reads must be consistent. Our process relies on splitting the reads into specific segments (see Minimum Split Size, below), so unevenly sized reads will throw off the process. Press the "check" button adjacent to the Quality Encoding drop-down to have the server analyze your reads.
      The length of your reads IN ALL FILES must be EXACTLY THE SAME. No exceptions. If a mismatch is detected, an error will be shown and you will need to provide a different file.
      Also note: our guesser reads only the first 40,000 lines of each reads file. If a mismatch occurs further along, the check will pass and you will receive somewhat unpredictable results. Please take care.
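
      Because the server's check stops after 40,000 lines, you may want to verify whole files yourself before uploading. The following is a minimal sketch, not the server's code; the function names are hypothetical and four-line FASTQ records are assumed. Pass max_lines=None to scan every line.

```python
from itertools import islice


def read_lengths(fastq_path, max_lines=None):
    """Collect the distinct read lengths found in one FASTQ file."""
    lengths = set()
    with open(fastq_path, "r") as handle:
        for index, line in enumerate(islice(handle, max_lines)):
            if index % 4 == 1:          # sequence is the 2nd line of each record
                lengths.add(len(line.strip()))
    return lengths


def common_read_length(fastq_paths, max_lines=None):
    """Return the single read length shared by all files, or raise ValueError."""
    lengths = set()
    for path in fastq_paths:
        lengths |= read_lengths(path, max_lines)
    if not lengths:
        raise ValueError("no reads found")
    if len(lengths) > 1:
        raise ValueError(f"read lengths differ: {sorted(lengths)}")
    return lengths.pop()
```

      Calling common_read_length with max_lines=40_000 imitates the limited check described above, so a mismatch beyond that point would go unnoticed.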

    • Minimum Split Size

      The pipeline takes the unmapped reads from the first alignment and splits each of them into two segments. The first segment starts at this size and grows progressively larger until it reaches the read length less this number. For example, if your reads are 33 base pairs long and your minimum split is 15 bp, you get pairs of segments of lengths 15/18, 16/17, 17/16, and 18/15. If you specify more than half the read length as a minimum, the minimum is treated as (read length - this number).
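
      The arithmetic above can be reproduced with a few lines of code. This is a sketch of the rule as described, not the pipeline's own implementation; the function name is hypothetical and a step of one base is inferred from the example.

```python
def split_lengths(read_length, min_split):
    """Return the (first, second) segment lengths for one read.

    If min_split is more than half the read length, it is treated as
    read_length - min_split, as described above.
    """
    if not 0 < min_split < read_length:
        raise ValueError("min_split must be between 1 and read_length - 1")
    smallest = min(min_split, read_length - min_split)
    return [(first, read_length - first)
            for first in range(smallest, read_length - smallest + 1)]


print(split_lengths(33, 15))
# [(15, 18), (16, 17), (17, 16), (18, 15)]
```

      With a minimum of 18 on the same 33 bp reads, the effective minimum becomes 33 - 18 = 15, so the result is the same four pairs.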

    • Maximum good alignments allowed per read

      A bowtie option. If a given read maps to more than this number of places on the genome, it is discarded. A read that maps too many times is not a good candidate: either its sequence is too common or something may have gone wrong. Adjust this number to your liking.

    • Candidate Selection

    • Minimum distance between candidate pairs

      When selecting pairs of reads as possible "spliced regions," we consider them viable only if they are at least this far apart. Reads that are too close together may be false positives.

    • Maximum distance between candidate pairs

      Candidate pairs cannot be farther apart than this value. This helps eliminate candidate pairs that are not part of the same gene.
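
      Taken together, the minimum and maximum distances define a window that a candidate pair must fall into. The sketch below illustrates the idea under stated assumptions: the distance is measured between the mapped start positions of the two segments, both bounds are inclusive, and both segments must map to the same chromosome. The page does not say how the pipeline measures distance, and the function name is hypothetical.

```python
def is_candidate_pair(left, right, min_distance, max_distance):
    """Check whether two mapped segments form a viable candidate pair.

    left and right are (chromosome, start_position) tuples.
    Assumed rule: same chromosome and
    min_distance <= |right_start - left_start| <= max_distance.
    """
    left_chrom, left_start = left
    right_chrom, right_start = right
    if left_chrom != right_chrom:
        return False
    distance = abs(right_start - left_start)
    return min_distance <= distance <= max_distance


print(is_candidate_pair(("chr1", 1_000), ("chr1", 6_000), 500, 100_000))
```

      Raising the minimum removes more near-adjacent pairs, while lowering the maximum removes more pairs that are likely to span different genes.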

    • Read mapping region boundary buffer

      When determining whether reads support one another (that is, whether they help to show that a given gene exhibits spliced-region encoding), their starting positions must differ by no more than this tolerance.

    • Minimum number of supporting reads

      Report only genes exhibiting this many (or more) supporting reads.
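
      The boundary buffer and the minimum number of supporting reads work together. The sketch below shows one way such a filter could behave; it is not the pipeline's algorithm. It groups candidates by position rather than by gene, compares each candidate to the first member of a group, and uses hypothetical function names.

```python
def group_supporting(candidates, buffer):
    """Group (left_start, right_start) candidates that support one another.

    A candidate joins a group when both of its start positions lie
    within `buffer` of the group's first member (a simplification).
    """
    groups = []
    for candidate in sorted(candidates):
        for group in groups:
            anchor = group[0]
            if (abs(candidate[0] - anchor[0]) <= buffer
                    and abs(candidate[1] - anchor[1]) <= buffer):
                group.append(candidate)
                break
        else:
            groups.append([candidate])
    return groups


def well_supported(candidates, buffer, min_support):
    """Keep only the groups with at least min_support members."""
    return [group for group in group_supporting(candidates, buffer)
            if len(group) >= min_support]


candidates = [(100, 5000), (102, 5003), (98, 4999), (900, 9000)]
print(well_supported(candidates, buffer=5, min_support=2))
```

      In this example the three candidates near positions 100 and 5000 fall in one group and are kept, while the lone candidate at (900, 9000) is dropped.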

    • Email

      When the job is finished, an email containing a download link to your results will be sent to this address. If the results file is smaller than 10 MB, an archive of it will also be attached.