Releases: althonos/pyhmmer
v0.8.0
PyHMMER has been accepted for publication in Bioinformatics. Paper accessible here: doi:10.1093/bioinformatics/btad214.
Added
- `pyhmmer.hmmer.jackhmmer` function to run several JackHMMER iterative searches in parallel using multithreading (#35, by @zdk123).
- `HMM.to_profile` shortcut method to allocate and configure a new `Profile` object.
Fixed
- Type annotations of `Pipeline.iterate_seq` and `Pipeline.iterate_hmm`.
- Potential memory leak on exceptions raised by `HMMPressedFile.read`.
- `Offsets.profile` not recording offsets properly, causing `pyhmmer.hmmer.hmmpress` to produce invalid pressed files (#37).
Changed
- `HMM.__init__` and `HMM.sample` now take the `Alphabet` as the first argument, for consistency with the rest of the API.
- `HMM` now requires a `name` argument.
Removed
- Deprecated `ignore_gaps` argument in `SequenceFile.__init__`.
- Deprecated `Sequence.taxonomy_id` property.
v0.7.4
Added
- Recipes page to the documentation with code example for loading multiple HMM files (#24, by @zdk123).
Fixed
- `TraceAligner` methods causing a segfault when passed an uninitialized HMM (#36).
Changed
- `HMM` default constructor now always creates a valid HMM (with respect to probability arrays).
- `TraceAligner` now validates the input `HMM` before calling the HMMER code.
- Use stack allocation for all error buffers instead of creating empty `bytearray` objects where applicable.
v0.7.3
v0.7.2
Added
- `easel.GeneticCode` class wrapping an `ESL_GENCODE` struct for configuring translation.
- `DigitalSequence.translate` method to translate a nucleotide sequence to a protein sequence. Metadata is copied from the source sequence to its translation (#31, by @valentynbez).
Deprecated
- `Sequence.taxonomy_id` property, as it is not used by Easel and its implementation is not consistent (see EddyRivasLab/easel#68).
v0.7.1
Added
- Missing `__reduce__` method to `TopHits`.
Fixed
- Build detection of available platform functions in `setup.py`.
v0.7.0
Added
- `Bitfield.zeros` and `Bitfield.ones` classmethods for constructing an empty bitfield of known size.
- `Bitfield.copy` method to copy a bitfield object.
- `SequenceBlock` and `OptimizedProfileBlock` classes to store Python objects next to a contiguous array of pointers for iterating with the GIL released.
- `SequenceFile.read_block` method to read a whole sequence block from a file.
- `HMM.sample` class method to generate an HMM at random given a `Randomness` source.
- `hmmscan` function to scan a profile database with sequence queries.
- `deepcopy` implementations to `HMM`, `Profile` and `OptimizedProfile` classes of `plan7`.
- `rewind` method to `HMMFile`, `HMMPressedFile` and `SequenceFile` to reset a file back to its initial position.
- `name` attribute to `HMMFile`, `HMMPressedFile`, `MSAFile` and `SequenceFile` to expose the path of a file (when it was created from a path).
- `local` property to `Profile` and `OptimizedProfile`, indicating whether a profile is in local or global mode.
- `multihit` property to `Profile` and `OptimizedProfile`, indicating whether a profile is in unihit or multihit mode, with a setter taking care of the reconfiguration.
- `Domain.included` and `Domain.reported` settable properties to report the inclusion and reporting status of a single domain.
- `TopHits.included` and `TopHits.reported` sized iterators to iterate only on included and reported hits.
- `Domains.included` and `Domains.reported` sized iterators to iterate only on included and reported domains.
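To illustrate what "sized iterator" means in the entries above, here is a minimal, library-independent sketch. The `IncludedHits` class and the dictionary-based hits are hypothetical stand-ins and are not PyHMMER's implementation; the point is only that the object supports both iteration and `len()`.

```python
class IncludedHits:
    """Hypothetical sized iterator over the entries flagged as included."""

    def __init__(self, hits):
        # Keep only the entries whose "included" flag is set.
        self._hits = [hit for hit in hits if hit["included"]]
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._hits):
            raise StopIteration
        hit = self._hits[self._index]
        self._index += 1
        return hit

    def __len__(self):
        # Number of items remaining, so len() stays consistent while iterating.
        return len(self._hits) - self._index


hits = [
    {"name": "a", "included": True},
    {"name": "b", "included": False},
    {"name": "c", "included": True},
]
included = IncludedHits(hits)
count = len(included)                       # known before iterating
names = [hit["name"] for hit in included]   # consumes the iterator
```

Knowing the length up front lets callers preallocate or report progress without first materializing the filtered hits into a list.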
Changed
- `Bitfield`, `Vector` and `Matrix` can now be created from an iterable.
- `Pipeline` search methods now expect a `DigitalSequenceBlock` or a `SequenceFile` for the target sequence database.
- `Pipeline` scan methods now expect an `OptimizedProfileBlock` or a `HMMPressedFile` for the target profile database.
- `TraceAligner` now expects a `DigitalSequenceBlock` for the sequences to align to the HMM.
- `Profile.configure` now uses a default value of 400 for the `L` argument.
- `hmmsearch`, `nhmmer` and `phmmer` support being given a single query instead of requiring an iterable.
- `HMMPressedFile` can now be created, closed and used as a context manager directly without having to manage the source `HMMFile`.
- Renamed `Profile.optimized` method to `Profile.to_optimized`.
- Replaced `Randomness.is_fast` method with the `Randomness.fast` property.
- Rewrite handling of `Hit` flags using settable properties (`Hit.included`, `Hit.reported`, `Hit.new`, `Hit.dropped`, `Hit.duplicate`) instead of methods.
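The move from flag methods to settable properties can be pictured with a small sketch. Everything below is hypothetical: the `Hit` class, the bit values and the `_flags` field are illustrative assumptions, not the actual HMMER flag layout or the PyHMMER code.

```python
class Hit:
    """Hypothetical hit whose status bits are exposed as settable properties."""

    # Assumed bit positions, chosen for illustration only.
    _REPORTED = 1 << 0
    _INCLUDED = 1 << 1

    def __init__(self):
        self._flags = 0

    def _get(self, bit):
        return bool(self._flags & bit)

    def _set(self, bit, value):
        if value:
            self._flags |= bit    # switch the bit on
        else:
            self._flags &= ~bit   # switch the bit off

    @property
    def reported(self):
        return self._get(self._REPORTED)

    @reported.setter
    def reported(self, value):
        self._set(self._REPORTED, value)

    @property
    def included(self):
        return self._get(self._INCLUDED)

    @included.setter
    def included(self, value):
        self._set(self._INCLUDED, value)


hit = Hit()
hit.reported = True
hit.included = True
hit.included = False   # plain assignment replaces a dedicated "drop" method
```

With properties, reading and changing a status use the same attribute name, which is why separate methods such as `manually_include` become unnecessary.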
Fixed
- Memory leak in the `LongTargetsPipeline` search loop.
- PyPy behaviour change of `readinto` methods now expecting `unsigned char*` instead of `char*` memoryview.
- `NULL`-pointer dereference in `Pipeline.search_hmm` when given a query without name.
- `LongTargetsPipeline` not recording the query name and accession.
- Memory leak caused by using a non-default prior scheme when constructing a `Builder`.
Removed
- `PipelineSearchTargets`, replaced in functionality with `easel.DigitalSequenceBlock`.
- `is_local` and `is_multihit` methods of `Profile` and `OptimizedProfile`, replaced with equivalent properties.
- `Hit.manually_drop` and `Hit.manually_include` methods, replaced with the different `Hit` properties.
v0.6.3
Fixed
- Error not being raised on alphabet detection failure in `SequenceFile` or `MSAFile`.
- Add check in `DigitalSequence` constructor to make sure encoded characters are in valid range (#25).
Added
- `SequenceFile.guess_alphabet` and `MSAFile.guess_alphabet` to guess the alphabet from an open file.
- `Alphabet.encode` and `Alphabet.decode` to convert raw sequences between digital and text format.
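As a rough picture of what converting between text and digital format involves, and of the kind of range check mentioned for this release, here is a pure-Python sketch. The four-letter `SYMBOLS` table and both functions are hypothetical and much simpler than Easel's alphabets; they do not reproduce the `Alphabet.encode` or `Alphabet.decode` implementations.

```python
SYMBOLS = "ACGT"  # assumed toy alphabet, not Easel's full symbol set


def encode(text):
    """Map each character to its index in SYMBOLS (the 'digital' form)."""
    try:
        return bytes(SYMBOLS.index(char) for char in text.upper())
    except ValueError as err:
        raise ValueError(f"invalid symbol in {text!r}") from err


def decode(codes):
    """Map digital codes back to text, rejecting out-of-range values."""
    for code in codes:
        if code >= len(SYMBOLS):
            raise ValueError(f"code {code} is outside the alphabet")
    return "".join(SYMBOLS[code] for code in codes)


digital = encode("acgt")
text = decode(digital)
```

Rejecting out-of-range codes up front matters because a digital sequence is later used to index into per-symbol arrays.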
v0.6.2
Changed
- `hmmsearch`, `phmmer` and `nhmmer` functions will reduce the requested number of threads to the number of queries, if it can be detected using `operator.length_hint`.
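The thread-count reduction described above can be sketched with the standard library alone. The helper name `effective_threads` is hypothetical, and treating a hint of zero as "unknown" is an assumption of this sketch rather than documented PyHMMER behaviour.

```python
import operator


def effective_threads(queries, requested):
    """Cap the requested thread count by the number of queries, if known."""
    # length_hint returns len(queries) when available, otherwise the value
    # of __length_hint__, otherwise the default given here (0).
    hint = operator.length_hint(queries, 0)
    if hint > 0:
        return min(requested, hint)
    return requested


from_list = effective_threads(["q1", "q2"], 8)              # length is known
from_generator = effective_threads((q for q in "abc"), 8)   # no hint available
```

A list reports its length, so the count is capped; a generator offers no hint, so the requested value is kept.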
Added
- Documentation for loading all HMMs from an `HMMFile` object at once (#23).
- List of projects depending on PyHMMER to the `Examples` page of the documentation.
v0.6.1
Added
- `pickle` protocol support for `TopHits` objects, using the HMMER network serialization.
- `TopHits.write` method to write hits to a file in tabular format.
- `query_name` and `query_accession` properties to `TopHits` objects to access the name and accession of the query that produced the hits.
Fixed
- Extraction of filename from file-like objects in the `HMMFile` constructor.
- Use `os.cpu_count` instead of `multiprocessing.cpu_count` where applicable to preserve OS scheduling.
- Wrong return type in docstring of `HMM.insert_emissions`.
- `TopHits.searched_nodes` returning the searched number of residues instead of the searched number of model nodes.
- Unsound decoding of pickled `MatrixF` or `VectorF` when data comes from a source of different endianness.
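The endianness issue in the last entry is easy to reproduce with the standard `struct` module. This is only an illustration of the general problem, assuming 32-bit floats and a byte order recorded next to the payload; it is not the serialization format PyHMMER uses.

```python
import struct

values = (0.25, 1.5, -3.0)

# Serialize as big-endian 32-bit floats and remember the byte order used.
payload = struct.pack(">3f", *values)
byteorder = "big"

# Decode with an explicit byte order instead of the machine's native one,
# so the result does not depend on where the data is read.
prefix = ">" if byteorder == "big" else "<"
count = len(payload) // 4
decoded = struct.unpack(f"{prefix}{count}f", payload)

# Reading the same bytes with the opposite order gives different numbers.
swapped = struct.unpack(f"<{count}f", payload)
```

Storing the byte order alongside the raw buffer is what makes a safe round trip possible between machines of different endianness.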
Changed
- Rewrite `pyhmmer.hmmer` threading code using `collections.deque` instead of `queue.Queue` to store the queries and results.
- Reduce memory consumption of `pyhmmer.hmmer` by reducing the number of semaphores and event flags used concurrently.
- Make `pyhmmer.hmmer` main threads block on query insertion rather than result retrieval to make sure worker threads are never idling.
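The general pattern behind these threading changes (a deque shared with worker threads, with the main thread blocking when it inserts a query) can be sketched as follows. This is a simplified illustration under stated assumptions, not the `pyhmmer.hmmer` implementation: the function name `run_queries`, the two semaphores and the squaring "work" are all hypothetical.

```python
import collections
import threading


def run_queries(values, n_workers=2):
    """Square each value using worker threads fed through a deque."""
    queue = collections.deque()                  # append/popleft are thread-safe
    free_slots = threading.Semaphore(n_workers)  # bounds the pending queries
    pending = threading.Semaphore(0)             # counts items in the deque
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            pending.acquire()          # wait until something is queued
            item = queue.popleft()
            if item is None:           # sentinel: no more work
                return
            free_slots.release()       # let the main thread insert again
            index, value = item
            with lock:
                results[index] = value * value

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()

    for index, value in enumerate(values):
        free_slots.acquire()           # the main thread blocks on insertion
        queue.append((index, value))
        pending.release()

    for _ in threads:                  # one sentinel per worker
        queue.append(None)
        pending.release()
    for thread in threads:
        thread.join()

    # Return the results in the order the queries were submitted.
    return [results[i] for i in range(len(results))]


squares = run_queries([1, 2, 3, 4])
```

Blocking on insertion keeps a small backlog of queries ready, so a worker that finishes can pick up the next one immediately.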
v0.6.0
Added
- `pyhmmer.daemon` module with a client implementation to communicate with a `hmmpgmd` server.
- `Pipeline.arguments` method to get a list of CLI arguments from the parameters used to initialize the `Pipeline`.
- Setters for `name`, `accession` and `description` properties of `plan7.Hit`.
- Constructor for individual `plan7.Trace` objects outside a `plan7.Traces` list.
- `plan7.Trace.from_sequence` constructor to create a faux trace from a single sequence.
- `manually_include` and `manually_drop` methods to `plan7.Hit` for manually selecting the inclusion status of a `Hit` in a `TopHits` instance.
- `compare_ranking` method to `plan7.TopHits` for comparing the order of the hits to a previous run on the same targets stored in an `easel.KeyHash` object.
- `Pipeline.iterate_seq` and `Pipeline.iterate_hmm` to run iterative queries like JackHMMER.
- `repr` implementations for `easel.MSAFile`, `easel.SequenceFile` and `easel.HMMFile` showing the path or file object they were created from.
- `repr` implementation for `easel.Randomness` showing the seed and the RNG algorithm in use.
- `str` implementation for `plan7.Alignment` using HMMER original code to display a domain alignment like in search/scan results.
Changed
- `plan7.Trace.posterior_probabilities` property may now be `None` in case no memory is allocated for the posteriors in the `P7_TRACE` struct.
- `TopHits.to_msa` can now add additional sequences passed as arguments to the alignment.
- `plan7.HMMPressedFile` now raises an exception on attempts to create a new instance manually.
- `ignore_gaps` argument of `easel.SequenceFile` is now deprecated.
- `repr` implementations for `easel` types now use the fully qualified class name.
Fixed
- `easel.SequenceFile.readinto` docstring not rendering properly in documentation.
- Type annotations of `hits_included` and `hits_reported` of `plan7.TopHits` marking these properties as `bool` instead of `int`.
- Setters of `name`, `accession`, `description` and `author` properties of `easel.MSA` crashing when given `None` values.
- Exception value raised from Easel code not being properly extracted.
- Plain strings being used in example for `easel.TextSequence` and `easel.TextMSA` constructors where byte strings are expected (#20).