SciPy Website

Site Navigation

Table Of Contents

Topical Software

This page indexes add-on software and other resources relevant to SciPy, categorized by scientific discipline or computational topic. It is intended to be exhaustive. If you know of an unlisted resource, see About This Page, below.

About This Page

As the links in each section become numerous, they may be placed on separate pages of their own, so be sure to visit those pages as well, if they’re relevant to your search. The listings are roughly organized by topic, with introductory resources first, more general topics next, and discipline-specific resources last.

Unless otherwise indicated, all packages listed here are provided under some form of open source license.

If you distribute or know of a resource that is not listed here, please add a listing. Make an account, login, visit the wiki. You’ll now see an “edit” tab at the top. Click it and go to it. If you need help, there is a help link at the top of the page. Please follow the general format of the existing listings. Be as brief as you can, but make sure you include in your text a link to the resource’s home page and some keywords that potential users might search for to find the resource. Finally, when you have saved, check out your listing and test the link to make sure it works.

If you wish to restructure the page (e.g., break out a section into its own page, change the section headings, etc.), please propose the idea to the SciPy-dev@scipy.org mailing list first and get community feedback.

General Python resources

Tutorials and texts

Some generic Python/programming tutorials:

Scientific computing with Python

  • A tutorial focused on interactive data analysis] for astronomy, but of generic utility to most scientific users. Developed at the STSCI, available for free download including all data files necessary to run the examples.
  • Jacek Generowicz’s Python Courses.
  • The Mathesaurus: A Python/MATLAB/Octave/SciLab/R/Gnuplot/IDL/Axiom cross-language reference by Vidar Gundersen.
  • A reference guide for IDL users listing IDL commands and their rough equivalents in Python/Numarray, mostly applicable to Numarray’s successor NumPy.
  • Software Carpentry is an open source course on basic software development skills for people with backgrounds in science, engineering, and medicine.
  • A tutorial for SciPy by Dave Kuhlman.

Working environments

  • IPython is an interactive environment with many features geared towards efficient work in typical scientific usage. It borrows many ideas from the interactive shells of Mathematica, IDL, Matlab and similar packages. It includes special support for the matplotlib and gnuplot plotting packages. IPython also has support for (X)Emacs, to be used as a full IDE with IPython as the interactive Python shell.
  • Sage is a monolithic software distribution which includes a Mathematica-like notebook interface which runs in a web browser. Cells of the notebook can run Python and Cython (as well as Sage’s own preprocessed Python dialect) and display plots and interactive widgets.
  • Spyder: A Qt based IDE suited to developing scientific applications. Includes integrated and external Python consoles, code checking built into the editor, a graphical class browser and full support for matplotlib graphs.
  • Pymacs: a tool which, once started from Emacs, allows both-way communication between Emacs Lisp and Python.
  • Pythonika: A MathLink module for Mathematica allowing to write Python code within a Mathematica notebook and transparently translating objects between the two.
  • tmPython, the TeXmacs Python plugin, provides a ‘Mathematica Notebook’ styled interface to Python from within TeXmacs. Blocks of code can be run independently of their order, and full multiline blocks, i.e., function remain visible and easily modifiable.

Science: basic tools

  • ScientificPython: another collection of Python modules for scientific computing. It includes basic geometry (vectors, tensors, transformations, vector and tensor fields), quaternions, automatic derivatives, (linear) interpolation, polynomials, elementary statistics, nonlinear least-squares fits, unit calculations, Fortran-compatible text formatting, 3D visualization via VRML, and two Tk widgets for simple line plots and 3D wireframe models. There are also interfaces to the netCDF library (portable structured binary files), to MPI (Message Passing Interface, message-based parallel programming), and to BSPlib (Bulk Synchronous Parallel programming). Much of this functionality has been incorporated into SciPy, but not all.
  • PyROOT, a run-time based Python binding to the ROOT framework: ROOT is a complete system for development of scientific applications, from math and graphics libraries, to efficient storage and reading of huge data sets, to distributed analysis. The Python bindings are based on run-time type information, such that you can add your own C++ classes on the fly to the system with a one-liner and down-casting as well as pointer manipulations become unnecessary. Using RTTI keeps memory and call overhead down to a minimum, resulting in bindings that are more light-weight and faster than any of the “standard” bindings generators.
  • PAIDA, is a pure Python scientific analysis tool for working with common objects in physics, such as histograms and ntuples, which provides an AIDA interface.
  • NetworkX, Python package for the creation, manipulation, and study of the structure, dynamics, and function of complex networks.
  • PyTrilinos Python interface to Trilinos, a framework for solving large-scale, complex multi-physics engineering and scientific problems.
  • PyAMG, a library of Algebraic Multigrid (AMG) solvers for large scale linear algebra problems.
  • PyDX provides automatic differentiation, arbitrary precision arithmetic, interval arithmetic, interval ODE solver, differential geometry constructs.

Python Wrappers for Existing Numerical Libraries

Running Code Written In Other Languages

C/C++

  • Cython is an amicable fork of the venerable Pyrex package which contains many advanced features not found in Pyrex, including very natural support for handling NumPy arrays as well as the PEP-3118 buffer protocol. It is quite popular within the Python scientific computing community.
  • ctypes is a module in the Python standard library (bundled with Python 2.5 and later) that allows you to create and manipulate C data types in Python, and to call functions in dynamic link libraries/shared DLLs. It allows wrapping these libraries in pure Python. NumPy arrays include a ctypes property allowing easy passing of the arrays to code wrapped with ctypes. See the cookbook entry for more information.
  • SWIG: SWIG is a software development tool that connects programs written in C and C++ with a variety of high-level programming languages. SWIG is primarily used with common scripting languages such as Perl, Python, Tcl/Tk and Ruby. The SWIG Typemaps page SWIG modifications for usage with Numeric arrays.
  • Boost.Python: a C++ library which enables seamless interoperability between C++ and Python. The PythonInfo Wiki contains a good howto reference. C++-sig at Python.org is devoted to Boost.
  • Pyrex is a Python dialect that lets you write code that mixes Python and C data types any way you want, and compiles it into a C extension for Python. See also Cython.
  • PyCxx: CXX/Objects is a set of C++ facilities to make it easier to write Python extensions. The chief way in which PyCXX makes it easier to write Python extensions is that it greatly increases the probability that your program will not make a reference-counting error and will not have to continually check error returns from the Python C API.
  • SIP: a tool for automatically generating Python bindings for C and C++ libraries. SIP was originally developed in 1998 for PyQt, Python bindings for the Qt GUI toolkit, but is suitable for generating bindings for any C or C++ library.

Inlining C/C++

  • Weave is a module included in SciPy that allows the inclusion of C/C++ within Python code. It has facilities for automatic creation of C/C++ based Python extension modules, as well as for direct inlining of C/C++ code in Python sources. The latter combines the scripting flexibility of Python with the execution speed of compiled C/C++, while handling automatically all module generation details.
  • Instant is a Python module that allows for instant in-lining of C and C++ code in Python, similar to Weave. It is a small Python module built on top of SWIG.

Fortran

  • F2PY: provides a connection between the Python and Fortran languages. F2PY is a Python extension tool for creating Python C/API modules from (handwritten or F2PY generated) signature files (or directly from Fortran sources).

R

  • RPy is a Python interface to the R Programming Language. It can manage all kinds of R objects and can execute arbitrary R functions (including the graphic functions). All errors from the R language are converted to Python exceptions. Any module installed for the R system can be used from within Python.

MATLAB

  • mlabwrap: A high-level Python-to-MATLAB bridge. Instead of opening connections to the MATLAB engine and executing statements, MATLAB functions are exposed as Python functions and complicated structures as proxy objects.
  • pythoncall is a MATLAB-to-Python bridge. Runs a Python interpreter inside MATLAB, and allows transferring data (matrices etc.) between the Python and Matlab workspaces.

Converting Code From Other Array Languages

  • i2py converts code from IDL, the Interactive Data Language from ITT, into NumPy expressions.
  • pym aims to translate MATLAB m-files into analogous NumPy statements (see “Source” tab, or browse).
  • OMPC is a MATLAB-to-Python compiler and support library that reads MATLAB m-files and translates them into code that can run in a Python interpreter with the accompanying OMPC runtime.

Plotting, data visualization, 3-D programming

Tools with a (mostly) 2-D focus

  • matplotlib: a Python 2-D plotting library which produces publication quality figures using in a variety of hardcopy formats (PNG, JPG, PS, SVG) and interactive GUI environments (WX, GTK, Tkinter, FLTK, Qt) across platforms. matplotlib can be used in Python scripts, interactively from the Python shell (ala MATLAB or Mathematica), in web application servers generating dynamic charts, or embedded in GUI applications. For interactive use, IPython provides a special mode which integrates with matplotlib. See the matplotlib Cookbook for recipes.
  • Chaco: Chaco is a Python toolkit for producing interactive plotting applications. Chaco applications can range from simple line plotting scripts up to GUI applications for interactively exploring different aspects of interrelated data. As an open-source project being developed by Enthought, Chaco leverages other Enthought technologies such as Kiva, Enable, and Traits to produce highly interactive plots of publication quality. See the recent SciPy presentation slides for an introduction.
  • PyQwt: a set of Python bindings for the Qwt C++ class library which extends the Qt framework with widgets for scientific and engineering applications. It provides a widget to plot 2-dimensional data and various widgets to display and control bounded or unbounded floating point values.
  • HippoDraw:a highly interactive data analysis environment. It is written in C++ with the Qt library from Nokia (formerly Trolltech). It includes Python bindings, and has a number of features for the kinds of data analysis typical of High Energy physics environments, as it includes native support for ROOT NTuples. It is well optimized for real-time data collection and display.
  • Biggles: a module for creating publication-quality 2D scientific plots. It supports multiple output formats (postscript, x11, png, svg, gif), understands simple TeX, and sports a high-level, elegant interface.
  • Gnuplot.py is a Python package that interfaces to gnuplot, the popular open-source plotting program. It allows you to use gnuplot from within Python to plot arrays of data from memory, data files, or mathematical functions. If you use Python to perform computations or as ‘glue’ for numerical programs, you can use this package to plot data on the fly as they are computed. IPython includes additional enhancements to Gnuplot.py (but which require the base package) to make it more efficient in interactive usage.
  • Pylab console is a Python console using GTK that allows to display matplotlib figures inline. Any call to plot, imshow, matshow or show functions actually produces a Figure that is inserted within the console.
  • Graceplot is a Python interface to the Grace 2D plotting program.
  • disipyl: an object-oriented wrapper around the DISLIN plotting library, written in the computer language Python. disipyl provides a set of classes which represent various aspects of DISLIN plots, as well as providing some easy to use classes for creating commonly used plot formats (e.g. scatter plots, histograms, 3-D surface plots). A major goal in designing the library was to facilitate interactive data exploration and plot creation.
  • PyChart is a library for creating Encapsulated Postscript, PDF, PNG, or SVG charts. It currently supports line plots, bar plots, range-fill plots, and pie charts.
  • pygame, though intended for writing games using Python, contains general-purpose multimedia libraries that definitely have other applications in visualization.
  • PyNGL is a Python module for creating publication-quality 2D visualizations, with emphasis in the geosciences. PyNGL can create contours, vectors, streamlines, XY plots, and overlay any one of these on several map projections. PyNGL’s graphics are based on the same high-quality graphics as the NCAR Command Language and NCAR Graphics.
  • Veusz : a scientific plotting package written in Python. It uses PyQt and Numarray. Veusz is designed to produce publication-ready Postscript output.
  • ppgplot is a Python module that provides bindings to the PGPLOT graphics subroutine library popular among astronomers (v 1.3 works with Numeric and numarray, but porting to NumPy is very easy).

Data visualization (mostly 3-D, surfaces and volumetric rendering)

  • MayaVi is a free, easy to use scientific data visualizer. It is written in Python and uses the amazing Visualization Toolkit (VTK) for the graphics. It provides a GUI written using Tkinter. MayaVi supports visualizations of scalar, vector and tensor data in a variety of ways, including meshes, surfaces and volumetric rendering. MayaVi can be used both as a standalone GUI program and as a Python library to be driven by other Python programs.
  • Mayavi2 is the successor of MayaVi. It is vastly superior to MayaVi1, has a Pythonic API, supports NumPy arrays transparently, provides a powerful application, reusable library and a powerful pylab like equivalent called mlab for rapid 3D plotting.
  • Py-OpenDX : Py-OpenDX is a Python binding for the OpenDX API. Currently only the DXLink library is wrapped, though this may be expanded in the future to cover other DX libraries such as CallModule and DXLite.
  • Py2DX is a Python binding for the OpenDX API based on Py-OpenDX. Mavis is a visualisation software built using this interfacce and the OpenDX library.
  • IVuPy (I-View-Py) serves to develop Python programs for 3D visualization of huge data sets using Qt and PyQt. IVuPy interfaces more than 600 classes of two of the Coin3D C++ libraries to Python, integrates very well with PyQt, and is fun to program. Coin3D is a scene graph library, and is optimized for speed. In comparison with VTK, Coin3D is more low level and lacks many of VTK’s advanced visualization and imaging algorithms.
  • Pivy is another Coin3D binding for Python. Pivy allows the development of Coin3D applications and extensions in Python, interactive modification of Coin3D programs from within the Python interpreter at runtime and incorporation of Scripting Nodes into the scene graph which are capable of executing Python code and callbacks. Installation instructions for Ubuntu 7.04 using the latest Coin (v 2.4.6) and SoQt (v 1.4.1) can be found at the Pivy Wiki.
  • Mat3D provides a few routines for basic 3D plotting. It makes use of OpenGL and is written in Python and Tk. One can interact (rotate and zoom) with with the generated graph and the view can be saved to an image.
  • S2PLOT is a three-dimensional plotting library based on OpenGL with support for standard and enhanced display devices. The S2PLOT library was written in C and can be used with C, C++, FORTRAN and Python programs on GNU/Linux, Apple/OSX and GNU/Cygwin systems. The library is currently closed-source, but free for commercial and academic use. They are hoping for an open source release towards the end of 2008.

LaTeX, PostScript, diagram generation

  • PyX is a package for the creation of encapsulated PostScript figures. It provides both an abstraction of PostScript and a TeX/LaTeX interface. Complex tasks like 2-D and 3-D plots in publication-ready quality are built out of these primitives.
  • Pyepix is a wrapper for the ePiX plotting library for LaTeX.
  • pydot is a Python interface to Graphviz’s Dot language. It provides an interface for creating both directed and non directed graphs from Python. Currently all attributes implemented in the Dot language are supported (up to Graphviz 1.10). Output can be inlined in Postscript into interactive scientific environments like TeXmacs, or output in any of the format’s supported by the Graphviz tools dot, neato, twopi.
  • Dot2TeX is another tool in the Dot/Graphviz/LaTeX family, this is a Graphviz to LaTeX converter. The purpose of dot2tex is to give graphs generated by Graphviz a more LaTeX friendly look and feel. This is accomplished by converting xdot output from Graphviz to a series of PSTricks or PGF/TikZ commands.
  • pyreport runs a script and captures the output (matplotlib graphics included). Generates a LaTeX or pdf report out of it, including literal comments and pretty printed code.

Other 3D programming tools

  • VPython: a Python module that offers real-time 3D output, and is easily usable by novice programmers.
  • OpenRM Scene Graph: is a developer’s toolkit that implements a scene graph API, and which uses OpenGL for hardware accelerated rendering. OpenRM is intended to be used to construct high performance, portable graphics and scientific visualization applications on Unix/Linux/Windows platforms.
  • Panda3D is an open source game and simulation engine.
  • Python Computer Graphics Kit is a collection of Python modules that contain the basic types and functions required for creating 3D computer graphics images.
  • PyGeo is a Dynamic 3-D geometry laboratory. PyGeo may be used to explore the most basic concepts of Euclidean geometry at an introductory level, including by elementary schools students and their teachers. But is particularly suitable for exploring more advanced geometric topics — such as projective geometry and the geometry of complex numbers.
  • Python 3-D software collection: A small collection of pointers to Python software for working in three dimensions.
  • pythonOCC provides Python bindings for OpenCascade, a 3D modeling & numerical simulation library (related projects <http://qtocc.sourceforge.net/links-related.html>`_).
  • PyGTS is a Python package used to construct, manipulate, and perform computations on 3D triangulated surfaces. It is a hand-crafted and Pythonic binding for the GNU Triangulated Surface (GTS) Library.

Numerical Optimization

  • OpenOpt (license: BSD) – numerical optimization framework with some new solvers and connections to lots of other. It allows connection of any-licensed software, while scipy.optimize allows only copyleft-free one (like BSD, MIT). Other features are convenient standard interface for all solvers, graphical output, automatic 1st derivatives check and much more. You can optimize FuncDesigner models with automatic differentiation. The OpenOpt website also hosts a numerical optimization forum.
  • CVXOPT is a tool for convex optimization which defines its own matrix-like object, interfaces to FFTW, BLAS, and LAPACK. Licensed under the GNU GPL version 3.
  • pycplex A Python interface to the ILOG CPLEX Callable Library.

Expression Evaluation Optimizers

  • Numexpr accepts NumPy array expressions as strings, rewrites them to optimize execution time and memory use, and executes them much faster than NumPy usually can.
  • PyDX is a package for working with calculus (differential geometry), arbitrary precision arithmetic (using gmpy), and interval arithmetic. PyDX uses lazy computation techniques to greatly enhance performance of the resulting functions.
  • Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays, melding some aspects of a computer algebra system (CAS) with aspects of an optimizing compiler.

Automatic differentiation

Not to be confused with numerical differentiation via finite differences or with symbolic differentiation provided by Maxima, SymPy, etc. See the Wikipedia entry on automatic differentiation for an explanation of the differences.

  • FuncDesigner - A tool for building mathematical functions interactively which can be automatically differentiated and optimized using OpenOpt.
  • ScientificPython - see modules Scientific.Functions.FirstDerivatives and Scientific.Functions.Derivatives
  • pycppad - wrapper for CppAD, a second-order forward/reverse automatic differentiation package.
  • PYADOLC is a Python module to differentiate complex algorithms written in Python. It wraps the functionality of the library ADOL-C (C++).
  • Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It also allows you to compute the gradient of an expression with respect to another. Symbolic expressions may be compiled into functions, which work on the same data structures as NumPy, allowing for easy interoperability.
  • PyDX is a package for working with calculus (differential geometry), arbitrary precision arithmetic (using gmpy), and interval arithmetic. PyDX uses lazy computation techniques to greatly enhance performance of the resulting functions. PyDX provides, among other things, multivariate automatic differentiation (to arbitrary order).

Finite differences derivatives approximation

Data Storage / Database

  • PyTables: PyTables is a hierarchical database package designed to efficiently manage very large amounts of data. It is built on top of the HDF5 library and the NumPy package.
  • pyhdf: pyhdf is a Python interface to the NCSA HDF4 library. Among the numerous components offered by HDF4, the following are currently supported by pyhdf: SD (Scientific Dataset), VS (Vdata), V (Vgroup) and HDF (common declarations).

Parallel and distributed programming

For a brief discussion of parallel programming within NumPy/SciPy, see ParallelProgramming.

  • PyMPI: Distributed Parallel Programming for Python! This package builds on traditional Python by enabling users to write distributed, parallel programs based on MPI message passing primitives. General Python objects can be messaged between processors.
  • Pypar: Parallel Programming in the spirit of Python! Pypar is an efficient but easy-to-use module that allows programs/scripts written in the Python programming language to run in parallel on multiple processors and communicate using message passing. Pypar provides bindings to an important subset of the message passing interface standard MPI.
  • MPI for Python: Object Oriented Python bindings for the Message Passing Interface. This module provides MPI suport to run Python scripts in parallel. It is constructed on top of the MPI-1 specification, but provides an object oriented interface which closely follows stantard MPI-2 C++ bindings. Any picklable Python object can be communicated. There is support for point-to-point (sends, receives) and collective (broadcasts, scatters, gathers) communications as well as group and communicator (inter, intra and topologies) management.
  • A discussion on Python and MPI: very useful discussion on this topic, carried at the CSC Climate Wiki.
  • PyPVM: A Python interface to Parallel Virtual Machine (PVM), a portable heterogeneous message-passing system. It provides tools for interprocess communication, process spawning, and execution on multiple architectures.
  • Module Scientific.BSP in Konrad Hinsen’s ScientificPython provides an experimental interface to the Bulk Synchronous Parallel (BSP) model of parallel programming (note the link to the BSP tutorial on the ScientificPython page). Module Scientific.MPI provides an MPI interface. The BSP model is an alternative to MPI and PVM message passing model. It is said to be easier to use than the message passing model, and is guaranteed to be deadlock-free.
  • Pyro: PYthon Remote Objects (Pyro) provides an object-oriented form of RPC. It is a Distributed Object Technology system written entirely in Python, designed to be very easy to use. Never worry about writing network communication code again, when using Pyro you just write your Python objects like you would normally. With only a few lines of extra code, Pyro takes care of the network communication between your objects once you split them over different machines on the network. All the gory socket programming details are taken care of, you just call a method on a remote object as if it were a local object!
  • PyXG: Object oriented Python interface to Apple’s Xgrid. PyXG makes it possible to submit and manage Xgrid jobs and tasks from within interactive Python sessions or standalone scripts. It provides an extremely lightweight method for performing independent parallel tasks on a cluster of Macintosh computers.
  • Pyslice is a specialized templating system that replaces variables in a template data set with numbers taken from all combinations of variables. It creates a dataset from input template files for each combination of variables in the series and can optionally run a simulation or submit a simulation run to a gueue against each created data set. For example: create all possible combination of datasets that represent the ‘flow’ variable with numbers from 10 to 20 by 2 and the ‘level’ variable with 24 values taken from a normal distribution with a mean of 104 and standard deviation of 5.
  • Python::OpenCL is a Boost.Python-based interface to OpenCL. OpenCL is a standard for parallel programming on heterogeneous devices including CPUs, GPUs, and others processors. It provides a common language C-like language for executing code on those devices, as well as APIs to setup the computations. Python::OpenCL aims at being an easy-to-use Python wrapper around the OpenCL library.

Partial differential equation (PDE) solvers

  • FiPy
  • SfePy
  • Hermes

Topic guides, organized by scientific field

Astronomy

  • PyFITS: interface to FITS formatted files under the Python scripting language and PyRAF, the Python-based interface to IRAF.
  • PyRAF is a new command language for running IRAF tasks that is based on the Python scripting language.
  • BOTEC is a simple astrophysical and orbital mechanics calculator, including a database of all named Solar System objects.
  • AstroLib is an open source effort to develop general astronomical utilities akin to those available in the IDL ASTRON package.
  • APLpy is a Python module aimed at producing publication-quality plots of astronomical imaging data in FITS format.
  • A tutorial on using Python for interactive data analysis in astronomy.
  • ParselTongue is a Python interface to classic AIPS for the calibration, data analysis, image display etc. of (primarily) Radio Astronomy data.
  • Casa is a suite of C++ application libraries for the reduction and analysis of radioastronomical data (derived from the former AIPS++ package) with a Python scripting interface.
  • Healpy is a Python package for using and plotting HEALpix data (e.g. spherical surface maps such as WMAP data).
  • Pysolar is a collection of Python libraries for simulating the irradiation of any point on earth by the sun. Pysolar includes code for extremely precise ephemeris calculations, and more. Could be also grouped under engineering tools.

Artificial intelligence & machine learning

  • See also the Bayesian Statistics section below
  • ffnet is a library for feed-forward neural networks written in Python, uses NumPy arrays and SciPy optimizers.
  • pyem is a tool for Gaussian mixture models. It implements EM algorithm for Gaussian mixtures (including full matrix covariance), BIC criterion for clustering. Since October 2006, it is included in SciPy toolbox.
  • Orange is a component-based data mining software package written partly in Python.
  • A Python neural network tutorial with a simple implementation based on http://arctrix.com/nas/python/bpnn.py
  • pymorph Morphology Toolbox The pymorph Morphology Toolbox for Python is a powerful collection of latest state-of-the-art gray-scale morphological tools that can be applied to image segmentation, non-linear filtering, pattern recognition and image analysis. Pymorph was originally written by Roberto A. Lutofu and Rubens C. Machado but is now maintained by Luís Pedro Coelho.
  • Plearn A C++ library for machine learning with a Python interface (PyPlearn)
  • ELEFANT We aim at developing an open source machine learning platform which will become the platform of choice for prototyping and deploying machine learning algorithms.
  • Bayes Blocks The library is a C++/Python implementation of the variational building block framework using variational Bayesian learning.
  • Monte Python: A machine learning library written in pure Python. The focus is on gradient based learning. Monte includes neural networks, conditional random fields, logistic regression and more.

Bayesian statistics

  • PyMC: PyMC is a Python module that provides a Markov chain Monte Carlo (MCMC) toolkit, making Bayesian simulation models relatively easy to implement. PyMC relieves users of the need for re-implementing MCMC algorithms and associated utilities, such as plotting and statistical summary. This allows the modelers to concentrate on important aspects of the problem at hand, rather than the mundane details of Bayesian statistical simulation.

Biology (including Neuroscience)

  • Brian: a simulator for spiking neural networks in Python.
  • BioPython: an international association of developers of freely available Python tools for computational molecular biology.
  • Python For Structural BioInformatics Tutorial: This tutorial will demonstrate the utility of the interpreted programming language Python for the rapid development of component-based applications for structural bioinformatics. We will introduce the language itself, along with some of its most important extension modules. Bio-informatics specific extensions will also be described and we will demonstrate how these components have been assembled to create custom applications.
  • PySAT: Python Seqeuence Analysis Tools (Version 1.0): PySAT is a collection of bioinformatics tools written entirely in python. A paper describing these tools.
  • Python Protein Annotators’ Assistant In this project, a software tool has been developed which, given a list of protein identifiers, e.g. as returned by a BLAST or FASTA search, clusters the identifiers around keywords and phrases that might indicate the functions performed by the protein that was used in the original search query.
  • Python/Tk Viewer for the NCBI Taxonomy Database: A viewer for the NCBI taxonomy database, written in Python/Tk, was developed in 1998.
  • PyPhy : A phylogenomic approach to microbial evolution: PyPhy is a set of Python scripts and modules for automatic, large-scale reconstructions of phylogenetic relationships of complete microbial genomes.
  • PySCeS: the Python Simulator for Cellular Systems: PySCes includes tools for the simulation and analysis of cellular systems (GPL).
  • SloppyCell: SloppyCell is a software environment for simulation and analysis of biomolecular networks developed by the groups of Jim Sethna and Chris Myers at Cornell University.
  • PyDSTool: PyDSTool is an integrated simulation, modeling and analysis package for dynamical systems used in scientific computing, and includes special toolboxes for computational neuroscience, biomechanics, and systems biology applications.
  • Epigrass: Epidemiological Geo-Referenced Analysis and Simulation System. Simulation and analysis of epidemics over networks.
  • NIPY: The neuroimaging in Python project is an environment for the analysis of structural and functional neuroimaging data. It currently has a full system for general linear modeling of functional magnetic resonance imaging (FMRI).
  • PsychoPy: create psychology stimuli in Python

Dynamical systems

  • PyDSTool is an integrated simulation, modeling and analysis package for dynamical systems (ODEs, DDEs, DAEs, maps, time-series, hybrid systems). Continuation and bifurcation analysis tools are built-in, via PyCont. It also contains a library of general classes useful for scientific computing, including an enhanced array class and wrappers for SciPy algorithms. Application-specific utilities are also provided for systems biology, computational neuroscience, and biomechanics. Development of complex systems models is simplified using symbolic math capabilities and compositional model-building classes. These can be “compiled” automatically into dynamically-linked C code or Python simulators.
  • SimPy is an object-oriented, process-based discrete-event simulation language based on standard Python SimPy provides the modeler with components of a simulation model including processes, for active components like customers, messages, and vehicles, and resources, for passive components that form limited capacity congestion points like servers, checkout counters, and tunnels. It also provides monitor variables to aid in gathering statistics. Random variates are provided by the standard Python random module. SimPy comes with data collection capabilities, GUI and plotting packages. It can be easily interfaced to other packages, such as plotting, statistics, GUI, spreadsheets, and data bases.
  • Pyarie is a continuous modeling environment useful for modeling systems of ordinary differential equations. The system is designed to be modular so that state variables and relationships, as well as complete models, can be re-used and re-defined and combined. Multiple integration methods are supplied for ODEs, and tools for optimization and linear programming are currently being built. Pyarie is being designed so little to no knowledge of programming is necessary for its use, but with full access to its structures, so that programmers can extend the system at will and use it as a powerful continuous modeling programming language.
  • Model-Builder is a GUI-based application for building and simulation of ODE (Ordinary Differential Equations) models. Models are defined in mathematical notation, with no coding required by the user. Results can be exported in CSV format. Graphical output based on matplotlib include time-series plots, state-space plots, Spectrogram, Continuous wavelet transforms of time series. It also includes a sensitivity and uncertainty analysis module. Ideal for classroom use.
  • VFGEN is a source code generator for differential equations and delay differential equations. The equations are defined once in an XML format, and then VFGEN is used to generate the functions that implement the equations in a wide variety of formats. Python users will be interested in the SciPy, PyGSL, and PyDSTool commands provided by VFGEN.

Economics and Econometrics

  • pyTrix is a small set of utilities for economics and econometrics, including pyGAUSS (GAUSS command analogues for use in Python).

Electromagnetic

  • PyFemax is a library for computation of electro-magnetic waves in accelerator cavities.
  • FiPy
  • FEval

Geosciences

  • CDAT is a suite of tools for analysis of climate models. CDMS is the most commonly used submodule.
  • Jeff Whitaker has made a number of useful tools for atmospheric modelers, including the basemap toolkit for matplotlib, and a NumPy compatible netCDF4 interface.
  • seawater is a package for computing properties of seawater (UNESCO 1981 and UNESCO 1983).
  • Fluid is a series of routines for calulating properties of fluids (air and seawater), and their interactions (e.g., wind stress).
  • atmqty computes atmospheric quantities on Earth that are directly derivative (i.e. not requiring time integration or modeling) from standard state variables.
  • TAPPy - Tidal Analysis Program in Python decomposes an hourly time-series of water levels into tidal components. It uses SciPy’s least squares optimization.
  • PyClimate performs EOF analysis, downscaling by means of CCA and analogs (in the PC and CCC spaces), linear digital filters, kernel based probability density function estimation and access to DCDFLIB.C library from Python, amongst many other things.
  • CliMT is an object-oriented climate modeling and diagnostics toolkit.
  • ClimPy is an early-stage hydrologic-oriented library.
  • The Unofficial Python GIS SIG is a mailing list for software developers, programmers, and architects, for discussion of general issues around the Python language (and variants), platforms, and geographic information systems.

Molecular modeling

  • MGLTOOLS is a comprehensive set of tools for molecular interaction calculations and visualization.
  • The Molecular Modelling Toolkit (MMTK) is a library for molecular simulation applications. In addition to providing ready-to-use implementations of standard algorithms, MMTK serves as a code basis that can be easily extended and modified to deal with standard and non-standard problems in molecular simulations.
  • Biskit is an object-oriented platform for structural bioinformatics research. Structure and trajectory objects tightly integrate with NumPy allowing, for example, fast take and compress operations on molecules or trajectory frames. Biskit integrates many external programs (e.g. XPlor, Modeller, Amber, DSSP, T-Coffee, Hmmer...) into workflows and supports parallelization via a high-level access to PyPVM.
  • PyMOL is a molecular graphics system with an embedded Python interpreter designed for real-time visualization and rapid generation of high-quality molecular graphics images and animations.
  • UCSF Chimera is a highly extensible, interactive molecular graphics program. It is the successor to UCSF Midas and MidasPlus; however, it has been completely redesigned to maximize extensibility and leverage advances in hardware. UCSF Chimera can be downloaded free of charge for academic, government, non-profit, and personal use.
  • The Python Macromolecular Library (mmLib) is a software toolkit and library of routines for the analysis and manipulation of macromolecular structural models. It provides a range of useful software components for parsing mmCIF, PDB, and MTZ files, a library of atomic elements and monomers, an object-oriented data structure describing biological macromolecules, and an OpenGL molecular viewer.
  • MDTools for Python is a Python module which provides a set of classes useful for the analysis and modification of protein structures. Current capabilities include reading PSF files, reading and writing (X-PLOR style) pdb and dcd files, calculating phi-psi angles and other properties for arbitrary selections of residues, and parsing output from NAMD into an easy-to-manipulate data object.
  • BALL - Biochemical Algorithms Library is a set of libraries and applications for molecular modeling and visualization. OpenGL and Qt are the underlying C++ layers; some components are LGPL licensed, others GPL.
  • SloppyCell is a software environment for simulation and analysis of biomolecular networks developed by the groups of Jim Sethna and Chris Myers at Cornell University.
  • PyVib2 is a program for analyzing vibrational motion and vibrational spectra. The program is supposed to be an open source “all-in-one” solution for scientists working in the field of vibrational spectroscopy (Raman and IR) and vibrational optical activity (ROA and VCD). It is based on NumPy, matplotlib, VTK and Pmw.

Signal processing

  • GNU Radio is a free software development toolkit that provides the signal processing runtime and processing blocks to implement software radios using readily-available, low-cost external RF hardware and commodity processors. GNU Radio applications are primarily written using the Python programming language, while the supplied, performance-critical signal processing path is implemented in C++ using processor floating point extensions where available. Thus, the developer is able to implement real-time, high-throughput radio systems in a simple-to-use, rapid-application-development environment. While not primarily a simulation tool, GNU Radio does support development of signal processing algorithms using pre-recorded or generated data, avoiding the need for actual RF hardware.
  • pysamplerate is a small wrapper for Source Rabbit Code (http://www.mega-nerd.com/SRC/), still incomplete, but which can be used now for high quality resampling of audio signals, even for non-rational ratio.
  • audiolab is a small library to import data from audio files to NumPy arrays, and export NumPy arrays to audio files. It uses libsndfile for the IO (http://www.mega-nerd.com/libsndfile/), which means many formats are available, including wav, aiff, HTK format and FLAC, an open source lossless compressed format. Previously known as pyaudio (not to confuse with pyaudio), now part of scikits.
  • PyWavelets is a user-friendly Python package to compute various kinds of Discrete Wavelet Transform.
  • PyAudiere is a very flexible and easy to use audio library for Python users. Available methods allow you to read soundfiles of various formats into memory and play them, or stream them if they are large. You can pass sound buffers as NumPy arrays of float32’s to play (non-blocking). You can also create pure tones, square waves, or ‘on-line’ white or pink noise. All of these functions can be utilized concurrently.
  • CMU Sphinx is a free automatic speech recognition system. The SphinxTrain package for training acoustic models includes Python modules for reading and writing Sphinx-format acoustic feature and HMM parameter files to/from NumPy arrays.

Symbolic math, number theory, etc.

  • Swiginac are a set of SWIG wrappers around GINAC, a C++ symbolic math library.
  • NZMATH is a Python based number theory oriented calculation system developed at Tokyo Metropolitan University. It contains routines for factorization, gcd, lattice reduction, factorial, finite fields, and other such goodies. Unfortunately short on documentation, but contains a lot of useful stuff if you can find it.
  • Sage is a comprehensive environment with support for research in algebra, geometry and number theory. It wraps existing libraries and provides new ones for elliptic curves, modular forms, linear and non-commutative algebras, and a lot more.
  • SymPy is a Python library for symbolic mathematics. It aims to become a full-featured computer algebra system (CAS) while keeping the code as simple as possible in order to be comprehensible and easily extensible. SymPy is written entirely in Python and does not require any external libraries, except optionally for plotting support.
  • Python bindings for CLNUM, a library which provides exact rationals and arbitrary precision floating point, orders of magnitude faster (and more full-featured) than the Decimal.py module from Python’s standard library. From the same site, the ratfun module provides rational function approximations, and rpncalc is a full RPN interactive Python-based calculator.
  • DecInt is a Python class that provides support for operations on very large decimal integers. Conversion to and from the decimal string representation is very fast; the multiplication and division algorithms are asymptotically faster than the native Python ones.
  • Kayali is a Qt based Computer Algebra System (CAS) written in Python. It is essentially a front end GUI for Maxima and Gnuplot.

Miscellaneous

  • Pybliographer: a tool for managing bibliographic databases. It can be used for searching, editing, reformatting, etc. In fact, it’s a simple framework that provides easy to use python classes and functions, and therefore can be extended to many uses (generating HTML pages according to bibliographic searches, etc). In addition to the scripting environment, a graphical Gnome interface is available. It provides powerful editing capabilities, a nice hierarchical search mechanism, direct insertion of references into LyX and Kile, direct queries on Medline, and more. It currently supports the following file formats: BibTeX, ISI, Medline, Ovid, Refer.
  • py2tex: format Python source code as LaTeX. Note that this site contains an older release of the same code, don’t be confused.
  • pyreport: runs a script and captures the output (pylab graphics included). Generates a LaTeX or pdf report out of it, including litteral comments and pretty printed code.
  • Vision Egg: produce stimuli for vision research experiments
  • PsychoPy: a freeware library for vision research experiments (and analyse data) with an emphasis on psychophysics.
  • PyEPL: the Python Experiment Programing Library. A free library to create experiments ranging from simple display of stimuli and recording of responses (including audio) to the creation of interactive virtual reality environments.
  • Pythonica: a Python implementation of a symbolic math program, based upon the fantastic precedent set by Mathematica.
  • Module dependency graph:a few scripts to glue modulefinder.py into graphviz, producing import dependency pictures pretty enough for use as a poster, and containing enough information to be a core part of my process for understanding physical dependencies.
  • Modular Toolkit for Data Processing (MDP): a library to implement data processing elements (nodes) and to combine them into data processing sequences (flows). Already implemented nodes include Principal Component Analysis (PCA), Independent Component Analysis (ICA), Slow Feature Analysis (SFA), and Growing Neural Gas.
  • FiPy: FiPy is an object oriented, partial differential equation (PDE) solver, written in Python , based on a standard finite volume (FV) approach. The framework has been developed in the Metallurgy Division and Center for Theoretical and Computational Materials Science (CTCMS), in the Materials Science and Engineering Laboratory (MSEL) at the National Institute of Standards and Technology (NIST).
  • SfePy: SfePy is a finite element solver written in Python, with the time demanding parts implemented in C and interfaced by SWIG. It can be used to solve various problems described by partial differential equations in 2D or 3D, for example the linear elasticity, hyperelasticity, heat conduction, Navier-Stokes, Biot, and other problems. As a research code it is used to implement models derived by the theory of homogenization, with applications in modeling of porous media (for example bones or soft tissue organs) or phononic materials.
  • Hermes: Hermes is a free C++/Python library for rapid prototyping of adaptive FEM and hp-FEM solvers developed by an open source community around the hp-FEM group at the University of Nevada, Reno.
  • FEval: FEval is useful for conversion between many finite element file formats. The main functionality is extraction of model data in the physical domain, for example to calculate flow lines.
  • CSC Climate Wiki: wiki for the Climate Systems Center (CSC) at the University of Chicago. Topics include climate research, the philosophy of modularizing climate models, the use of Python in climate modeling, and software packages produced by CSC. This site contains a lot of useful information about Python for scientific computing.
  • peak-o-mat: peak-o-mat is a curve fitting program for the spectrocopist. It is especially designed for batch cleaning, conversion and fitting of spectra from visibile optics expriments if you’re facing a large number of similar spectra.
  • scalar: The scalar package is designed to represent physical scalars and to eliminate errors involving implicit physical units (e.g., confusing angular degrees and radians). It comes with a complete implementation of the standard metric system of units and many standard non-metric units. It also allows the user to easily define a specialized or reduced set of appropriate physical units for any particular application or domain. Once an application has been developed and tested, the scalar class can be switched off for production runs to achieve the execution efficiency of operations on built-in numeric types, which can be up to two orders of magnitude faster. A user guide is provided.