pysam: htslib interface for python
Author: Andreas Heger, Kevin Jacobs and contributors
Date: Apr 09, 2023
Version: 0.21.0
Pysam is a python module for reading, manipulating and writing genomic data sets.
Pysam is a wrapper of the htslib C-API and provides facilities to read and write SAM/BAM/VCF/BCF/BED/GFF/GTF/FASTA/FASTQ files as well as access to the command line functionality of the samtools and bcftools packages. The module supports compression and random access through indexing.
This module provides a low-level wrapper around the htslib C-API, written using cython, and a high-level, pythonic API for convenient access to the data within genomic file formats.
The current version wraps htslib-1.17, samtools-1.17, and bcftools-1.17.
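As a rough illustration of the high-level API and the indexed random access described above, the following sketch writes a single alignment to a BAM file, indexes it, and fetches it back by region. The header, read name, and sequence are made up for this example, and the `roundtrip_demo` helper is hypothetical, not part of pysam. The function returns `None` when pysam is not installed.

```python
import os
import tempfile

try:
    import pysam
except ImportError:  # pysam is a third-party package and may be absent
    pysam = None


def roundtrip_demo():
    """Write one alignment to a BAM file, index it, and fetch it back."""
    if pysam is None:
        return None

    # Hypothetical header and read, invented for this example.
    header = {"HD": {"VN": "1.0"}, "SQ": [{"SN": "chr1", "LN": 1575}]}
    read = pysam.AlignedSegment()
    read.query_name = "read_1"
    read.query_sequence = "AGCTTAGCTAGCTACCTATATCTTGGTCTTGGCCG"  # 35 bases
    read.flag = 0
    read.reference_id = 0         # index into header["SQ"], i.e. chr1
    read.reference_start = 32     # 0-based start position
    read.mapping_quality = 20
    read.cigartuples = [(0, 35)]  # 35M: 35 aligned bases

    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.bam")
        with pysam.AlignmentFile(path, "wb", header=header) as out:
            out.write(read)
        pysam.index(path)  # builds the .bai index, like `samtools index`
        with pysam.AlignmentFile(path, "rb") as bam:
            # fetch() uses the index to retrieve reads overlapping a region
            return [(r.query_name, r.reference_start, r.reference_end)
                    for r in bam.fetch("chr1", 0, 100)]


print(roundtrip_demo())
```

Coordinates in this sketch are 0-based and half-open, so `reference_end` is one past the last aligned base (here 32 + 35 = 67).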
To install the latest release, type:
pip install pysam
See the Installation notes for details.
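One way to check that the installation worked is to query the installed version from Python. This is a minimal sketch: the `installed_pysam_version` helper is hypothetical, and it returns `None` instead of raising an error when pysam cannot be found.

```python
import importlib.util


def installed_pysam_version():
    """Return the installed pysam version string, or None if pysam is absent."""
    if importlib.util.find_spec("pysam") is None:
        return None
    import pysam
    return pysam.__version__


print(installed_pysam_version())
```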
Contents
- Introduction
- API
- Working with BAM/CRAM/SAM-formatted files
- Using samtools commands within python
- Working with tabix-indexed files
- Working with VCF/BCF formatted files
- Extending pysam
- Installing pysam
- FAQ
  - How should I cite pysam
  - Is pysam thread-safe?
  - pysam coordinates are wrong
  - Calling pysam.fetch() confuses existing iterators
  - AlignmentFile.fetch does not show unmapped reads
  - I can’t call AlignmentFile.fetch on a file without an index
  - BAM files with a large number of reference sequences are slow
  - Weirdness with spliced reads in samfile.pileup(chr,start,end) given spliced alignments from an RNA-seq bam file
  - I can’t edit quality scores in place
  - Why is there no SNPCaller class anymore?
  - I get an error ‘PileupProxy accessed after iterator finished’
  - Pysam won’t compile
  - ImportError: cannot import name csamtools
- Developer’s guide
- Release notes
  - Release 0.19.1
  - Release 0.19.0
  - Release 0.18.0
  - Release 0.17.0
  - Release 0.16.0
  - Release 0.15.4
  - Release 0.15.3
  - Release 0.15.2
  - Release 0.15.1
  - Release 0.15.0
  - Release 0.14.1
  - Release 0.14.0
  - Release 0.13.0
  - Release 0.12.0.1
  - Release 0.12.0
  - Release 0.11.2.2
  - Release 0.11.2.1
  - Release 0.11.2
  - Release 0.11.1
  - Release 0.11.0
  - Release 0.10.0
  - Release 0.9.1
  - Release 0.9.0
  - Release 0.8.4
  - Release 0.8.3
  - Release 0.8.2.1
  - Release 0.8.2
  - Release 0.8.1
  - Release 0.8.0
  - Release 0.7.8
  - Release 0.7.7
  - Release 0.7.6
  - Release 0.7.5
  - Release 0.7.4
  - Release 0.7.3
  - Release 0.7.2
  - Release 0.7.1
  - Release 0.7
  - Release 0.6
  - Release 0.5
  - Release 0.4
  - Release 0.3
- Benchmarking
- Glossary
References
[Li.2009] The Sequence Alignment/Map format and SAMtools. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup. Bioinformatics. 2009 Aug 15;25(16):2078-9. Epub 2009 Jun 8 btp352. PMID: 19505943.
[Bonfield.2021] HTSlib: C library for reading/writing high-throughput sequencing data. Bonfield JK, Marshall J, Danecek P, Li H, Ohan V, Whitwham A, Keane T, Davies RM. GigaScience (2021) 10(2) giab007. PMID: 33594436.
[Danecek.2021] Twelve years of SAMtools and BCFtools. Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. GigaScience (2021) 10(2) giab008. PMID: 33590861.
See also
- Information about htslib: http://www.htslib.org
- The samtools homepage: http://samtools.sourceforge.net
- The cython C-extensions for python: https://cython.org/
- The python language: https://www.python.org