ec.app.sequence
Class ThreadedSequenceFeatureInterpreter
java.lang.Object
ec.app.sequence.ThreadedSequenceFeatureInterpreter
public class ThreadedSequenceFeatureInterpreter
- extends java.lang.Object
This class reads the features stored in a file (the hall of fame output)
and generates files in the LibSVM format. It also writes the features
separately, to the file SSCleanFeatures.txt, for those who want to study
them.
Feature matching is computed in parallel. The degree of parallelism is
controlled by the threads input argument. The sequences are split into chunks:
the first threads-1 threads each get an equal share, and the last thread gets
all remaining sequences when the total does not divide evenly. For best
throughput, the number of threads should equal the number of cores/processors
on the machine.
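The chunking described above can be sketched as follows. This is a minimal illustration under assumptions, not the class's actual code: the names ChunkSketch and chunkRanges are hypothetical, and it assumes the equal share is the integer quotient of the sequence count by the thread count, with the last thread absorbing the remainder.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of ec.app.sequence.
final class ChunkSketch {

    // Returns one {start, endExclusive} index pair per thread.
    // Assumption: the first (threads - 1) chunks are equal in size and
    // the last chunk takes whatever is left over.
    static int[][] chunkRanges(int total, int threads) {
        if (threads < 1) {
            throw new IllegalArgumentException("threads must be >= 1");
        }
        int share = total / threads;
        int[][] ranges = new int[threads][2];
        for (int t = 0; t < threads; t++) {
            int start = t * share;
            int end = (t == threads - 1) ? total : start + share;
            ranges[t][0] = start;
            ranges[t][1] = end;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // One thread per core, as the description recommends.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores: " + cores);
        System.out.println(Arrays.deepToString(chunkRanges(10, 3)));
    }
}
```

With 10 sequences and 3 threads, the ranges are [0, 3), [3, 6) and [6, 10), so the last thread handles one extra sequence.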
The class also performs some simple simplification: expressions such as
(AND true true) or (OR (NOT false)) are reduced.
In the future, features could be reduced further to remove redundancy such as
matchesAtPosition motif3 AGT @ 45 AND matchesAtPosition motif1 T at 47,
but it is still an open question whether such redundancy (bloat) is good or bad in some cases.
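The kind of simplification mentioned above, reducing (AND true true) or (OR (NOT false)), amounts to constant folding on a boolean expression tree. The sketch below is an assumption about how that could look, not the class's implementation: BoolExpr and simplify are hypothetical names, and dropping a constant that cannot change the result (for example the true in (AND true x)) goes slightly beyond the two cases the description names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical expression tree; the real feature representation may differ.
final class BoolExpr {
    final String op;            // "AND", "OR", "NOT", "true", "false", or a leaf feature
    final List<BoolExpr> args;  // empty for constants and leaf features

    BoolExpr(String op, BoolExpr... args) {
        this.op = op;
        this.args = Arrays.asList(args);
    }

    private boolean isConst(String value) {
        return args.isEmpty() && op.equals(value);
    }

    // Folds constants bottom-up: (AND true true) -> true, (OR (NOT false)) -> true.
    static BoolExpr simplify(BoolExpr e) {
        if (e.args.isEmpty()) {
            return e; // constant or leaf feature: nothing to fold
        }
        List<BoolExpr> kids = new ArrayList<>();
        for (BoolExpr a : e.args) {
            kids.add(simplify(a));
        }
        if (e.op.equals("NOT") && kids.size() == 1) {
            if (kids.get(0).isConst("true")) return new BoolExpr("false");
            if (kids.get(0).isConst("false")) return new BoolExpr("true");
        } else if (e.op.equals("AND") || e.op.equals("OR")) {
            boolean isAnd = e.op.equals("AND");
            String absorbing = isAnd ? "false" : "true"; // decides the result alone
            String identity = isAnd ? "true" : "false";  // never changes the result
            List<BoolExpr> kept = new ArrayList<>();
            for (BoolExpr k : kids) {
                if (k.isConst(absorbing)) return new BoolExpr(absorbing);
                if (!k.isConst(identity)) kept.add(k);
            }
            if (kept.isEmpty()) return new BoolExpr(identity);
            if (kept.size() == 1) return kept.get(0);
            kids = kept;
        }
        return new BoolExpr(e.op, kids.toArray(new BoolExpr[0]));
    }

    @Override
    public String toString() {
        if (args.isEmpty()) return op;
        StringBuilder sb = new StringBuilder("(").append(op);
        for (BoolExpr a : args) sb.append(' ').append(a);
        return sb.append(')').toString();
    }
}
```

Expressions built only from leaf features, such as (AND m1 m2), pass through unchanged, so the redundancy question raised above is left untouched by this sketch.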
- Author:
- udaykamath
Field Summary
boolean cleanOnly
static org.biojava.utils.regex.PatternFactory factory
Method Summary
void close()
void generateLibSVMFile(java.io.File gpFile, int threads)
static void main(java.lang.String[] args)
void quickWrite(java.lang.String data)
- All threads use this method, so it should be synchronized.
void setup(java.lang.String fileName)
- Reads the sequences from the file, labeled +1 or -1, and places each in the matching bucket.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
cleanOnly
public boolean cleanOnly
factory
public static org.biojava.utils.regex.PatternFactory factory
ThreadedSequenceFeatureInterpreter
public ThreadedSequenceFeatureInterpreter()
quickWrite
public void quickWrite(java.lang.String data)
throws java.lang.Exception
- All threads use this method, so it should be synchronized.
- Parameters:
data
-
- Throws:
java.lang.Exception
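Because every worker thread writes through the same method, the writes need mutual exclusion. The sketch below shows one way to do that with a synchronized method. It is an illustration only: SharedWriterSketch and demo are hypothetical names, the newline after each record is an assumption, and the sketch throws IOException rather than the broader Exception declared above.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for an output shared by several worker threads.
final class SharedWriterSketch {
    private final Writer out;

    SharedWriterSketch(Writer out) {
        this.out = new BufferedWriter(out);
    }

    // synchronized: one thread at a time appends, so records never interleave.
    synchronized void quickWrite(String data) throws IOException {
        out.write(data);
        out.write('\n');
    }

    synchronized void close() throws IOException {
        out.close(); // flushes the buffer before closing
    }

    // Demo: several threads append through the same writer.
    static String demo(int threads, int linesPerThread) {
        StringWriter sink = new StringWriter();
        SharedWriterSketch w = new SharedWriterSketch(sink);
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool[t] = new Thread(() -> {
                try {
                    for (int i = 0; i < linesPerThread; i++) {
                        w.quickWrite("thread" + id + " line" + i);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            pool[t].start();
        }
        try {
            for (Thread th : pool) {
                th.join();
            }
            w.close();
        } catch (InterruptedException | IOException e) {
            throw new IllegalStateException(e);
        }
        return sink.toString();
    }
}
```

The order of lines across threads is not deterministic, but each line arrives whole because the record and its newline are written under the same lock.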
close
public void close()
throws java.lang.Exception
- Throws:
java.lang.Exception
generateLibSVMFile
public void generateLibSVMFile(java.io.File gpFile,
int threads)
throws java.lang.Exception
- Throws:
java.lang.Exception
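For orientation, LibSVM's sparse text format holds one example per line, written as a label followed by index:value pairs with indices starting at 1 in ascending order. The sketch below formats such a line. It is not taken from this class: LibSvmLineSketch and toLine are hypothetical names, and it assumes each feature contributes a binary value (1 when it matches the sequence), with non-matching features omitted.

```java
// Hypothetical formatter for illustration; not part of ec.app.sequence.
final class LibSvmLineSketch {

    // Formats one sequence as "<label> <index>:<value> ...".
    // Assumption: features are binary, so only matching features are
    // written, each with value 1. Indices are 1-based and ascending.
    static String toLine(int label, boolean[] matches) {
        StringBuilder sb = new StringBuilder(label > 0 ? "+1" : "-1");
        for (int i = 0; i < matches.length; i++) {
            if (matches[i]) {
                sb.append(' ').append(i + 1).append(":1");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A positive sequence matching the first and third features.
        System.out.println(toLine(1, new boolean[] {true, false, true}));
    }
}
```

A positive sequence that matches the first and third of three features produces the line `+1 1:1 3:1`.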
setup
public void setup(java.lang.String fileName)
- Reads the sequences from the file, labeled +1 or -1, and places each in the
matching bucket. It also initializes the factory for IUPAC parsing, etc.
- Parameters:
fileName
-
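The bucketing step can be sketched as below. The input layout is not documented here, so this sketch assumes, purely for illustration, that each line holds a label (+1 or -1) followed by whitespace and the sequence. LabelBucketSketch and bucket are hypothetical names, and the sketch takes lines already read into memory instead of a file name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of ec.app.sequence.
final class LabelBucketSketch {

    // Groups sequences by label.
    // Assumed line layout (not confirmed by the source): "<label> <sequence>".
    static Map<Integer, List<String>> bucket(List<String> lines) {
        Map<Integer, List<String>> buckets = new HashMap<>();
        buckets.put(1, new ArrayList<>());
        buckets.put(-1, new ArrayList<>());
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split("\\s+", 2);
            if (parts.length < 2) {
                throw new IllegalArgumentException("missing sequence: " + line);
            }
            int label = Integer.parseInt(parts[0]); // accepts "+1" and "-1"
            if (label != 1 && label != -1) {
                throw new IllegalArgumentException("unexpected label: " + parts[0]);
            }
            buckets.get(label).add(parts[1]);
        }
        return buckets;
    }
}
```

Keeping the two classes in separate lists makes it straightforward to hand each bucket, or a chunk of it, to a worker thread later.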
main
public static void main(java.lang.String[] args)