Takes SeqRecordExpanded objects and creates datasets for phylogenetic software

Dataset-creator

Dataset creator for phylogenetic software

  • Free software: BSD license

Installation

pip install dataset_creator

Usage

The list of SeqRecordExpanded objects should be sorted by gene_code first, then by voucher_code.
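
The ordering described above can be sketched with a two-key sort. This is a minimal illustration, not part of the package: `Record` is a hypothetical stand-in that only carries the two attributes used for sorting, so the snippet runs without seqrecord_expanded installed. Real SeqRecordExpanded objects should sort the same way, assuming they expose `gene_code` and `voucher_code` as attributes.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical stand-in for SeqRecordExpanded (assumption: the real objects
# expose gene_code and voucher_code as plain attributes).
Record = namedtuple("Record", ["gene_code", "voucher_code"])

records = [
    Record("wingless", "CP100-11"),
    Record("RpS5", "CP100-11"),
    Record("wingless", "CP100-10"),
    Record("RpS5", "CP100-10"),
]

# Sort by gene_code first, then by voucher_code within each gene.
sorted_records = sorted(records, key=attrgetter("gene_code", "voucher_code"))

for record in sorted_records:
    print(record.gene_code, record.voucher_code)
```

Note that the comparison is a plain string comparison, so uppercase letters sort before lowercase ones.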

>>> from seqrecord_expanded import SeqRecord
>>> from dataset_creator import Dataset
>>>
>>> # `table` is the Translation Table code based on NCBI
>>> seq_record1 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='RpS5',
...                         table=1, voucher_code='CP100-10',
...                         taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_record2 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='RpS5',
...                         table=1, voucher_code='CP100-10',
...                         taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_record3 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='wingless',
...                         table=1, voucher_code='CP100-10',
...                         taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_record4 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='wingless',
...                         table=1, voucher_code='CP100-10',
...                         taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_records = [
...    seq_record1, seq_record2, seq_record3, seq_record4,
... ]
>>> # codon positions can be 1st, 2nd, 3rd, 1st-2nd, ALL (default)
>>> dataset = Dataset(seq_records, format='NEXUS', partitioning='by gene',
...                   codon_positions='1st',
...                   )
>>> print(dataset.dataset_str)
"""#NEXUS
blah blah
"""
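
To illustrate what selecting a codon position means, here is a rough sketch of how bases could be picked out of a sequence given its reading frame. It is not the package's implementation: `codon_position_slice` is a hypothetical helper, and it assumes that `reading_frame` (1, 2 or 3) is the codon position of the first base in the sequence. The library may define reading frames differently.

```python
def codon_position_slice(sequence, reading_frame, position):
    """Return the bases of `sequence` that fall on one codon position.

    Assumption: `reading_frame` (1, 2 or 3) is the codon position of the
    first base in `sequence`. `position` is the codon position to keep.
    """
    if reading_frame not in (1, 2, 3) or position not in (1, 2, 3):
        raise ValueError("reading_frame and position must be 1, 2 or 3")
    # Index of the first base that sits on the requested codon position.
    offset = (position - reading_frame) % 3
    # Every third base from there shares the same codon position.
    return sequence[offset::3]


first = codon_position_slice("ACTACCTA", reading_frame=2, position=1)
second = codon_position_slice("ACTACCTA", reading_frame=2, position=2)
third = codon_position_slice("ACTACCTA", reading_frame=2, position=3)
print(first, second, third)
```

Under this assumption, a missing reading frame makes the split impossible, which is why reading frames matter when a dataset is divided by codon positions.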

Development

To run all the tests, run:

tox

Changelog

0.1.2 (2015-09-30)

  • Creates datasets as degenerated sequences using the method by Zwick et al.

0.1.1 (2015-09-30)

  • It will issue errors for missing reading frames only when they are strictly necessary to build the dataset (for example, when datasets need to be divided by codon positions).

  • Added documentation using sphinx-doc

  • Creates datasets as amino acid sequences.

0.1.0 (2015-09-23)

  • Creates datasets in NEXUS, TNT, FASTA, PHYLIP and MEGA formats.

0.0.1 (2015-06-10)

  • First release on PyPI.

