conda: a cross-platform package manager for any binary distribution (scipy 2014)

Post on 10-May-2015

Conda: A Cross-Platform Package Manager for Any Binary Distribution

or, Solving the Packaging Problem

Aaron Meurer

Ilan Schnell

Continuum Analytics, Inc.

What is the packaging problem?

History

Two sides

• Installing (the user)

• Building (the developer)

Installing

• setup.py install

• easy_install

• pip

• apt-get

• rpm

• emerge

• homebrew

• port

• fink

• …

setup.py install

• fine if it’s pure Python, not so much if it isn’t

• you have to have compilers installed

distutils.errors.DistutilsError: Setup script exited with error: command 'gcc' failed with exit status 1

setup.py install

You are your own package manager

pip

• Only works with Python

• Not so great for scientific packages that depend on big C libraries

• Try installing h5py if you don’t have HDF5

pip

You are a “self integrator”

Building

Problems

• distutils is not really designed for compiled packages

• numpy.distutils “fork”

• setuptools is overcomplicated

• import setuptools monkeypatches distutils

• Entry points require pkg_resources

• pkg_resources.DistributionNotFound: flake8==2.1.0

• Each egg adds an entry to sys.path

• import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)
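To make that last one-liner less opaque, here is a minimal sketch of what an easy-install.pth file does, replayed on a plain list instead of the real sys.path. The function name and the example paths are hypothetical; only the three steps mirror the .pth file.

```python
def apply_easy_install_pth(sys_path, egg_paths, egg_insert=0):
    """Replay the steps of an easy-install.pth file on a copy of sys_path."""
    path = list(sys_path)
    plen = len(path)                    # first line: sys.__plen = len(sys.path)
    path.extend(egg_paths)              # middle lines: every egg gets appended
    new = path[plen:]                   # last line: cut the new entries off the end
    del path[plen:]
    path[egg_insert:egg_insert] = new   # and splice them in near the front
    return path, egg_insert + len(new)  # the updated sys.__egginsert


# Hypothetical paths, for illustration only
path, next_insert = apply_easy_install_pth(
    ["/usr/lib/python2.7", "/usr/lib/python2.7/site-packages"],
    ["/usr/lib/python2.7/site-packages/flake8-2.1.0-py2.7.egg"],
)
print(path)
```

The net effect is that every egg ends up ahead of the standard library directories, which is why each installed egg makes sys.path longer and harder to reason about.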

Package maintainers hate having packages that no one can install

What is the packaging problem?

What about wheels?

• Python package specific

• Can’t build wheels for C libraries

• Can’t make a wheel for Python itself

• Still doesn’t address the problem that some metadata is only in the package itself

• You are still a “self integrator”

System Packaging solutions

Linux

• yum (rpm)

• apt-get (dpkg)

OS X

• macports

• homebrew

• fink

Windows

• chocolatey

• npackd

Cross-platform

• conda

Conda

• System level package manager (Python agnostic)

• Python, hdf5, and h5py are all conda packages

• Cross platform (works on Windows, OS X, and Linux)

• Doesn’t require administrator privileges

• Installs binaries (no more compiler woes)

• Metadata stored separately in the repository index

• Uses a SAT solver to resolve dependencies before packages are installed
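The idea behind resolving dependencies up front can be sketched with a toy example. This is not conda's solver: it replaces the SAT solver with brute-force enumeration, and the INDEX contents are made up for illustration. What it shares with the real thing is that a complete, consistent set of versions is found from the index metadata alone, before anything is installed.

```python
from itertools import product

# Hypothetical repository index: (name, version) -> {dependency: allowed versions}
INDEX = {
    ("h5py", "2.3"): {"python": {"2.7", "3.4"}, "hdf5": {"1.8.13"}},
    ("hdf5", "1.8.13"): {},
    ("hdf5", "1.8.9"): {},
    ("python", "2.7"): {},
    ("python", "3.4"): {},
}


def solve(requested):
    """Return one consistent {name: version} assignment, or None.

    `requested` maps package names to sets of acceptable versions.
    Each package is either absent (None) or present at exactly one version.
    """
    names = sorted({name for name, _ in INDEX})
    choices = [[None] + sorted(v for n, v in INDEX if n == name) for name in names]
    for combo in product(*choices):
        chosen = dict(zip(names, combo))
        # Every requested package must be present at an accepted version
        if any(chosen.get(n) not in ok for n, ok in requested.items()):
            continue
        # Every selected package must have all of its dependencies satisfied
        if all(chosen[dep] in allowed
               for n, v in chosen.items() if v is not None
               for dep, allowed in INDEX[(n, v)].items()):
            return {n: v for n, v in chosen.items() if v is not None}
    return None


print(solve({"h5py": {"2.3"}, "python": {"3.4"}}))
print(solve({"h5py": {"2.3"}, "hdf5": {"1.8.9"}}))  # conflicting request
```

A conflicting request is rejected as a whole, so nothing is left half-installed.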

Basic conda usage

• Install a package: conda install sympy

• List all installed packages: conda list

• Search for packages: conda search llvm

• Create a new environment: conda create -n py3k python=3

• Remove a package: conda remove nose

• Get help: conda install --help

Advanced usage

• Install a package in an environment: conda install -n py3k sympy

• Update all packages: conda update --all

• Export list of packages: conda list --export > packages.txt

• Install packages from an export: conda install --file packages.txt

• See package history: conda list --revisions

• Revert to a revision: conda install --revision 23

• Remove unused packages and cached tarballs: conda clean -pt

What is a conda package?

Just a tar.bz2 file with the files from the package, and some metadata

/lib
/include
/bin
/man
/info
    files
    index.json

Files are not Python specific.

Any kind of program at all can be a conda package.

Metadata is static.
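Because a package is just a tarball with static metadata, it can be inspected with nothing but the standard library. The sketch below builds a tiny stand-in package in memory and reads its info/index.json back; the package name, its contents, and the helper names are invented for illustration.

```python
import io
import json
import tarfile


def make_package(index, files):
    """Build a minimal package-like tar.bz2 in memory and return its bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        def add(name, data):
            member = tarfile.TarInfo(name)
            member.size = len(data)
            tar.addfile(member, io.BytesIO(data))

        for name, data in files.items():
            add(name, data)
        # Metadata lives under info/: the file list and the static index
        add("info/files", "\n".join(sorted(files)).encode() + b"\n")
        add("info/index.json", json.dumps(index).encode())
    return buf.getvalue()


def read_index(package_bytes):
    """Read the static metadata without running any code from the package."""
    with tarfile.open(fileobj=io.BytesIO(package_bytes), mode="r:bz2") as tar:
        return json.load(tar.extractfile("info/index.json"))


# Hypothetical package, not a real one
pkg = make_package(
    {"name": "hello", "version": "1.0", "build": "0", "depends": ["python"]},
    {"bin/hello": b"#!/bin/sh\necho hello\n"},
)
print(read_index(pkg)["depends"])
```

Nothing from the package has to be executed to learn its dependencies, which is the practical meaning of "metadata is static".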

Python Agnostic

• A conda package can be anything

• Python packages

• Python itself

• C libraries (GDAL, netCDF4, dynd, …)

• R

• Node JS

• Perl

Installation

• The tarball is unarchived in the pkgs directory

• Files are hard-linked to the install path

• Shebang lines and other instances of a place-holder prefix are replaced with the install prefix

• The metadata is updated, so that conda knows that it is installed

• post-link script is run (these are rare)

And that’s it

conda install sympy
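The install steps listed above can be sketched in a few lines of Python. This is a simplified illustration rather than conda's code: PLACEHOLDER, the hello-1.0-0 package, and link_package are made up, and the sketch assumes that files containing the placeholder are copied and rewritten while everything else is hard-linked. The tarball extraction step is skipped by creating the extracted package directory directly.

```python
import json
import os
import tempfile

# Hypothetical placeholder; stands in for the prefix recorded at build time
PLACEHOLDER = "/opt/placeholder_prefix"


def link_package(pkg_dir, prefix):
    """Link an already-extracted package from the pkgs cache into `prefix`."""
    linked = []
    for root, _dirs, files in os.walk(pkg_dir):
        for fname in files:
            src = os.path.join(root, fname)
            rel = os.path.relpath(src, pkg_dir)
            if rel.split(os.sep)[0] == "info":
                continue  # package metadata is not linked into the prefix
            dst = os.path.join(prefix, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            with open(src, "rb") as f:
                data = f.read()
            if PLACEHOLDER.encode() in data:
                # Assumption: files embedding the prefix are copied and rewritten
                with open(dst, "wb") as f:
                    f.write(data.replace(PLACEHOLDER.encode(), prefix.encode()))
            else:
                os.link(src, dst)  # hard link: same file, no extra disk space
            linked.append(rel)
    # Record the install so the prefix knows what it contains
    meta_dir = os.path.join(prefix, "conda-meta")
    os.makedirs(meta_dir, exist_ok=True)
    record = os.path.join(meta_dir, os.path.basename(pkg_dir) + ".json")
    with open(record, "w") as f:
        json.dump({"files": sorted(linked)}, f)
    return sorted(linked)


# Demo with a made-up package in a temporary directory
root = tempfile.mkdtemp()
pkg = os.path.join(root, "pkgs", "hello-1.0-0")
env = os.path.join(root, "envs", "demo")
os.makedirs(os.path.join(pkg, "bin"))
os.makedirs(os.path.join(pkg, "lib"))
with open(os.path.join(pkg, "bin", "hello"), "w") as f:
    f.write("#!" + PLACEHOLDER + "/bin/python\nprint('hello')\n")
with open(os.path.join(pkg, "lib", "data.txt"), "w") as f:
    f.write("no prefix here\n")

print(link_package(pkg, env))
```

Since no build step runs at install time, installing is mostly file-system bookkeeping.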

Environments

• Environments are simple: just link the package to a different directory

• Hard-links are very cheap, and very fast

• Conda environments are completely independent installations of everything

• No fiddling with PYTHONPATH or symlinking site-packages

• “Activating” an environment just means changing your PATH so that its bin/ or Scripts/ comes first.

conda create -n py3k python=3.4

• Unix: source activate py3k

• Windows: activate py3k
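Since activation is only a PATH change, its effect can be sketched as a pure function. The function name and the example prefixes are hypothetical, and the real activate scripts do more housekeeping (such as changing the prompt); the sketch covers only the PATH reordering described above.

```python
import ntpath
import posixpath


def activated_path(path, env_prefix, windows=False):
    """Return PATH with the environment's executable directory placed first."""
    pathmod, sep, subdir = (ntpath, ";", "Scripts") if windows else (posixpath, ":", "bin")
    env_bin = pathmod.join(env_prefix, subdir)
    # Drop empty entries and any earlier copy of the same directory
    rest = [p for p in path.split(sep) if p and p != env_bin]
    return sep.join([env_bin] + rest)


# Hypothetical install locations
print(activated_path("/usr/local/bin:/usr/bin", "/home/user/anaconda/envs/py3k"))
print(activated_path(r"C:\Windows;C:\Windows\System32", r"C:\Anaconda\envs\py3k", windows=True))
```

After this change, typing python finds the environment's interpreter first, and deactivating is a matter of restoring the old PATH.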

Environments

/pkgs
    /python-3.4.1-0
        /bin/python
    /sympy-0.7.5-0
        /bin/isympy
        /lib/python3.4/site-packages/sympy

/envs
    /sympy-env
        /bin/python
        /bin/isympy
        /lib/python3.4/site-packages/sympy
    /test
        /bin/python

Hard links: the files under /envs are hard links to the files under /pkgs

Environments

Uses:

• Testing (Python 2.6, 2.7, 3.3)

• Development

• Trying new packages from PyPI

• Separating deployed apps with different dependency needs

• Trying new versions of Python

• Reproducible science

Building

Conda Recipes

• meta.yaml contains metadata

• build.sh is the build script for Unix and bld.bat is the build script for Windows

Recipe files:

• meta.yaml

• build.sh

• bld.bat

• fix.patch (optional)

• run_test.py (optional)

• post-link.sh (optional)

conda build path/to/recipe/

Example meta.yaml
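The slide's own example is not preserved in this transcript. As a stand-in, the sketch below writes a minimal, hypothetical recipe to a temporary directory; the package name example-pkg, the example.invalid URLs, and the write_recipe helper are all invented, and the meta.yaml shows only a handful of common fields.

```python
import os
import tempfile

# Hypothetical recipe contents, for illustration only
META_YAML = """\
package:
  name: example-pkg
  version: "1.0"

source:
  url: https://example.invalid/example-pkg-1.0.tar.gz

build:
  number: 0

requirements:
  build:
    - python
    - setuptools
  run:
    - python

test:
  imports:
    - example_pkg

about:
  home: https://example.invalid/
  license: BSD
"""

BUILD_SH = "#!/bin/bash\n$PYTHON setup.py install\n"
BLD_BAT = '"%PYTHON%" setup.py install\nif errorlevel 1 exit 1\n'


def write_recipe(recipe_dir):
    """Write the three core recipe files and return the directory listing."""
    os.makedirs(recipe_dir, exist_ok=True)
    for fname, text in [("meta.yaml", META_YAML),
                        ("build.sh", BUILD_SH),
                        ("bld.bat", BLD_BAT)]:
        with open(os.path.join(recipe_dir, fname), "w") as f:
            f.write(text)
    return sorted(os.listdir(recipe_dir))


recipe_dir = os.path.join(tempfile.mkdtemp(), "example-pkg")
print(write_recipe(recipe_dir))
```

Pointing conda build at such a directory is the step shown on the previous slide; the source URL here is a placeholder, so this particular recipe would not actually build.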

Conda Recipes

• Lots more

• Command line entry points

• Fine-grained control over conda’s relocation logic

• Inequalities for versions of dependencies (like >=1.2,<2.0)

• “Preprocessing selectors” allow using the same meta.yaml for many platforms

• See http://conda.pydata.org/docs/build.html for full documentation
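To show what a version inequality such as >=1.2,<2.0 means, here is a minimal matcher. It is a sketch, not conda's version logic: it assumes purely numeric dotted versions and treats the comma as "and"; the names OPS, parse, and matches are made up.

```python
import operator

OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
       "!=": operator.ne, ">": operator.gt, "<": operator.lt}


def parse(version):
    """Turn '1.10' into (1, 10) so that comparison is numeric, not textual."""
    return tuple(int(part) for part in version.split("."))


def matches(version, spec):
    """Check a version against a comma-separated list of inequalities."""
    for clause in spec.split(","):
        # Two-character operators are tried first so '>=' is not read as '>'
        for symbol in (">=", "<=", "==", "!=", ">", "<"):
            if clause.startswith(symbol):
                if not OPS[symbol](parse(version), parse(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError("unsupported clause: %r" % clause)
    return True


print(matches("1.5", ">=1.2,<2.0"))
print(matches("1.10", ">=1.2,<2.0"))
print(matches("2.0", ">=1.2,<2.0"))
```

Comparing tuples of integers rather than strings is what makes 1.10 count as newer than 1.2.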

conda build path/to/recipe/

• conda build is only a convenient wrapper

• You can also build packages manually just by following the package specification (http://conda.pydata.org/docs/spec.html)

Sharing

• Once you have a conda package, the easiest way to share it is to upload it to Binstar

• Others can install your package with

conda install -c binstar_username package

• Or add your channel to their configuration with

conda config --add channels binstar_username

Self Hosting

• You can also self-host

• Store packages in a directory by platform (osx-64, linux-32, linux-64, win-32, win-64)

• Run conda index on that directory to generate the repodata.json

• Serve this up, or use a file:// url as a channel

• Binstar is just a very convenient hosted wrapper around conda index

conda index directory/osx-64
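Roughly speaking, indexing a platform directory means collecting each package's static metadata into a single repodata.json. The sketch below is a simplified stand-in for conda index, not its implementation: build_repodata and the hello package are hypothetical, and a real index may record more fields than the ones shown.

```python
import hashlib
import io
import json
import os
import tarfile
import tempfile


def build_repodata(platform_dir):
    """Collect info/index.json from every .tar.bz2 in one platform directory."""
    packages = {}
    for fname in sorted(os.listdir(platform_dir)):
        if not fname.endswith(".tar.bz2"):
            continue
        path = os.path.join(platform_dir, fname)
        with tarfile.open(path, "r:bz2") as tar:
            entry = json.load(tar.extractfile("info/index.json"))
        with open(path, "rb") as f:
            entry["md5"] = hashlib.md5(f.read()).hexdigest()
        entry["size"] = os.path.getsize(path)
        packages[fname] = entry
    repodata = {"info": {}, "packages": packages}
    with open(os.path.join(platform_dir, "repodata.json"), "w") as f:
        json.dump(repodata, f, indent=2, sort_keys=True)
    return repodata


# Demo: a made-up channel containing one tiny package
channel = os.path.join(tempfile.mkdtemp(), "linux-64")
os.makedirs(channel)
index = {"name": "hello", "version": "1.0", "build": "0",
         "build_number": 0, "depends": ["python"]}
payload = json.dumps(index).encode()
with tarfile.open(os.path.join(channel, "hello-1.0-0.tar.bz2"), "w:bz2") as tar:
    member = tarfile.TarInfo("info/index.json")
    member.size = len(payload)
    tar.addfile(member, io.BytesIO(payload))

repodata = build_repodata(channel)
print(repodata["packages"]["hello-1.0-0.tar.bz2"]["depends"])
```

A client only needs repodata.json to resolve dependencies, so the packages themselves are downloaded only once a solution has been found.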

Final words

• conda is completely open source (BSD) https://github.com/conda/conda

• We have a mailing list (conda@continuum.io)

• A big thanks to Continuum for paying me to work on open source

Thanks!

Sean Ross-Ross (principal binstar.org developer)

Bryan Van de Ven (original conda author)

Ilan Schnell (principal conda developer)

Travis Oliphant (Continuum CEO)
