
Relationship between SciPy and NumPy

scipy numpy python

278

Community wiki · Tech Community · 2 years ago

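SciPy appears to provide most (but not all [1]) of NumPy's functions in its own namespace. In other words, if there is a function named numpy.foo, there is almost certainly a scipy.foo. Most of the time the two appear to be exactly the same, often even pointing to the same function object.

Sometimes they are different. To give an example that came up recently:

numpy.log10 is a ufunc that returns NaN for negative arguments;
scipy.log10 returns complex values for negative arguments and does not appear to be a ufunc.

The same can be said of log, log2 and logn, but not of log1p [2].

On the other hand, numpy.exp and scipy.exp appear to be different names for the same ufunc. This is also true of scipy.log1p and numpy.log1p.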
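
The difference described above can be reproduced with NumPy alone. This is a minimal sketch that assumes the old top-level scipy.log10 was an alias for numpy.lib.scimath.log10 (also reachable as numpy.emath.log10), as the first answer below reports. Recent SciPy releases no longer expose scipy.log10, so that stand-in is used here.

```python
import numpy as np

x = -10.0

# The ufunc stays in the real domain: a negative input yields NaN
# (the RuntimeWarning is silenced so the example runs quietly).
with np.errstate(invalid="ignore"):
    real_result = np.log10(x)

# The scimath variant (assumed stand-in for the old scipy.log10)
# switches to the complex domain instead.
complex_result = np.emath.log10(x)

print(type(np.log10).__name__, real_result)
print(type(np.emath.log10).__name__, complex_result)
```

The first line reports a ufunc, the second a plain Python function, which matches the observation that the two names are not the same object.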

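Another example is numpy.linalg.solve vs scipy.linalg.solve. They are similar, but the latter offers some additional features over the former.

Why the apparent duplication? If this is meant to be a wholesale import of numpy into the scipy namespace, why the subtle differences in behaviour and the missing functions? Is there some overarching logic that would help clear up the confusion?

[1] numpy.min, numpy.max, numpy.abs and a few others have no counterparts in the scipy namespace.

[2] Tested using NumPy 1.5.1 and SciPy 0.9.0rc2.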

8 answers | latest 7 years ago

153

user davidism 8 years ago

Last time I checked, the scipy __init__ method executes

from numpy import *

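so that the whole numpy namespace is included in scipy when the scipy module is imported.

The log10 behaviour you describe is interesting, because both versions come from numpy. One is a ufunc, the other is a numpy.lib function. Why scipy prefers the library function over the ufunc, I don't know off the top of my head.

EDIT: In fact, I can answer the log10 question. Looking in the scipy __init__ method I see this: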

# Import numpy symbols to scipy name space
import numpy as _num
from numpy import oldnumeric
from numpy import *
from numpy.random import rand, randn
from numpy.fft import fft, ifft
from numpy.lib.scimath import *

The log10 function you get in scipy comes from numpy.lib.scimath. Looking at that code, it says:

"""
Wrapper functions to more user-friendly calling of certain math functions
whose output data-type is different than the input data-type in certain
domains of the input.

For example, for functions like log() with branch cuts, the versions in this
module provide the mathematically valid answers in the complex plane:

>>> import math
>>> from numpy.lib import scimath
>>> scimath.log(-math.exp(1)) == (1+1j*math.pi)
True

Similarly, sqrt(), other base logarithms, power() and trig functions are
correctly handled.  See their respective docstrings for specific examples.
"""

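That module seems to overlay the base numpy ufuncs for sqrt, log, log2, logn, log10, power, arccos, arcsin and arctanh. That explains the behaviour you are seeing. The underlying design reason for doing it this way is probably buried in a mailing list post somewhere.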
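
How the second star-import ends up shadowing the ufuncs can be sketched by replaying the two relevant import lines in a scratch namespace. The `ns` dictionary is a hypothetical stand-in for the old scipy package namespace, not SciPy's actual module, and the sketch assumes a NumPy version in which numpy.lib.scimath is importable.

```python
import numpy as np
import numpy.lib.scimath as scimath

# Scratch namespace standing in for the old scipy/__init__.py (an assumption
# for illustration; modern SciPy no longer re-exports these names).
ns = {}
exec("from numpy import *", ns)              # binds the ufunc versions first
first_log10 = ns["log10"]
exec("from numpy.lib.scimath import *", ns)  # rebinds the names scimath exports

print("log10 before:", first_log10 is np.log10)
print("log10 after: ", ns["log10"] is scimath.log10)
print("exp untouched:", ns["exp"] is np.exp)
print("names rebound by scimath:", sorted(scimath.__all__))
```

Because the later import wins, only the names that scimath exports change meaning; everything else, such as exp, keeps pointing at the NumPy ufunc.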

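John D. Cook 14 years ago

From the SciPy Reference Guide:

... all of the Numpy functions have been subsumed into the scipy namespace so that all of those functions are available without additionally importing Numpy.

The intention is that users should not have to know the distinction between the scipy and numpy namespaces, though apparently you have found an exception.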

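redtuna 7 years ago

According to the SciPy FAQ, some functions are in NumPy for historical reasons, although they should properly live only in SciPy:

What is the difference between NumPy and SciPy?

In an ideal world, NumPy would contain nothing but the array data type and the most basic operations: indexing, sorting, reshaping, basic elementwise functions, et cetera. All numerical code would reside in SciPy. However, one of NumPy's important goals is compatibility, so NumPy tries to retain all features supported by either of its predecessors. Thus NumPy contains some linear algebra functions, even though these more properly belong in SciPy. In any case, SciPy contains more fully-featured versions of the linear algebra modules, as well as many other numerical algorithms. If you are doing scientific computing with python, you should probably install both NumPy and SciPy. Most new features belong in SciPy rather than NumPy.

That explains why scipy.linalg.solve offers some additional features over numpy.linalg.solve.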
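
One such additional feature can be illustrated with current releases. This sketch assumes a SciPy version that accepts the `assume_a` keyword of scipy.linalg.solve (the modern counterpart of the `sym_pos` flag in older signatures); the matrix values are illustrative.

```python
import numpy as np
from scipy import linalg as sla

# A small symmetric positive definite system (illustrative values).
a = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

# NumPy: a general-purpose solver with no structure hints.
x_np = np.linalg.solve(a, b)

# SciPy: the caller may declare the structure of `a`, so that a solver
# suited to positive definite matrices can be selected.
x_sp = sla.solve(a, b, assume_a="pos")

print(x_np, x_sp)
```

Both calls return the same solution here; the difference lies in the options SciPy exposes, not in the answer for a well-behaved system.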

I had not seen SethMMorton's answer to the related question.

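shortorian 14 years ago

From the introduction to SciPy documentation:

Another useful command is source. When given a function written in Python as an argument, it prints out a listing of the source code for that function. This can be helpful in learning about an algorithm or understanding exactly what a function is doing with its arguments. Also don't forget about the Python command dir, which can be used to look at the namespace of a module or package.

I think this will allow someone with enough knowledge of all the packages involved to pick apart exactly what the differences are between some scipy and numpy functions (it didn't help me with the log10 question at all). I definitely don't have that knowledge, but source does indicate that scipy.linalg.solve and numpy.linalg.solve interact with lapack in different ways: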

Python 2.4.3 (#1, May  5 2011, 18:44:23) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
>>> import scipy
>>> import scipy.linalg
>>> import numpy
>>> scipy.source(scipy.linalg.solve)
In file: /usr/lib64/python2.4/site-packages/scipy/linalg/basic.py

def solve(a, b, sym_pos=0, lower=0, overwrite_a=0, overwrite_b=0,
          debug = 0):
    """ solve(a, b, sym_pos=0, lower=0, overwrite_a=0, overwrite_b=0) -> x

    Solve a linear system of equations a * x = b for x.

    Inputs:

      a -- An N x N matrix.
      b -- An N x nrhs matrix or N vector.
      sym_pos -- Assume a is symmetric and positive definite.
      lower -- Assume a is lower triangular, otherwise upper one.
               Only used if sym_pos is true.
      overwrite_y - Discard data in y, where y is a or b.

    Outputs:

      x -- The solution to the system a * x = b
    """
    a1, b1 = map(asarray_chkfinite,(a,b))
    if len(a1.shape) != 2 or a1.shape[0] != a1.shape[1]:
        raise ValueError, 'expected square matrix'
    if a1.shape[0] != b1.shape[0]:
        raise ValueError, 'incompatible dimensions'
    overwrite_a = overwrite_a or (a1 is not a and not hasattr(a,'__array__'))
    overwrite_b = overwrite_b or (b1 is not b and not hasattr(b,'__array__'))
    if debug:
        print 'solve:overwrite_a=',overwrite_a
        print 'solve:overwrite_b=',overwrite_b
    if sym_pos:
        posv, = get_lapack_funcs(('posv',),(a1,b1))
        c,x,info = posv(a1,b1,
                        lower = lower,
                        overwrite_a=overwrite_a,
                        overwrite_b=overwrite_b)
    else:
        gesv, = get_lapack_funcs(('gesv',),(a1,b1))
        lu,piv,x,info = gesv(a1,b1,
                             overwrite_a=overwrite_a,
                             overwrite_b=overwrite_b)

    if info==0:
        return x
    if info>0:
        raise LinAlgError, "singular matrix"
    raise ValueError,\
          'illegal value in %-th argument of internal gesv|posv'%(-info)

>>> scipy.source(numpy.linalg.solve)
In file: /usr/lib64/python2.4/site-packages/numpy/linalg/linalg.py

def solve(a, b):
    """
    Solve the equation ``a x = b`` for ``x``.

    Parameters
    ----------
    a : array_like, shape (M, M)
        Input equation coefficients.
    b : array_like, shape (M,)
        Equation target values.

    Returns
    -------
    x : array, shape (M,)

    Raises
    ------
    LinAlgError
        If `a` is singular or not square.

    Examples
    --------
    Solve the system of equations ``3 * x0 + x1 = 9`` and ``x0 + 2 * x1 = 8``:

    >>> a = np.array([[3,1], [1,2]])
    >>> b = np.array([9,8])
    >>> x = np.linalg.solve(a, b)
    >>> x
    array([ 2.,  3.])

    Check that the solution is correct:

    >>> (np.dot(a, x) == b).all()
    True

    """
    a, _ = _makearray(a)
    b, wrap = _makearray(b)
    one_eq = len(b.shape) == 1
    if one_eq:
        b = b[:, newaxis]
    _assertRank2(a, b)
    _assertSquareness(a)
    n_eq = a.shape[0]
    n_rhs = b.shape[1]
    if n_eq != b.shape[0]:
        raise LinAlgError, 'Incompatible dimensions'
    t, result_t = _commonType(a, b)
#    lapack_routine = _findLapackRoutine('gesv', t)
    if isComplexType(t):
        lapack_routine = lapack_lite.zgesv
    else:
        lapack_routine = lapack_lite.dgesv
    a, b = _fastCopyAndTranspose(t, a, b)
    pivots = zeros(n_eq, fortran_int)
    results = lapack_routine(n_eq, n_rhs, a, n_eq, pivots, b, n_eq, 0)
    if results['info'] > 0:
        raise LinAlgError, 'Singular matrix'
    if one_eq:
        return wrap(b.ravel().astype(result_t))
    else:
        return wrap(b.transpose().astype(result_t))
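
On current installations the same kind of inspection as in the session above can be approximated with the standard library. This sketch swaps in `inspect.getsource` for `scipy.source`; the helper `show_source` is a hypothetical name, and the fallback branch exists because objects implemented in C have no Python source to show.

```python
import inspect
import json

import numpy as np


def show_source(obj):
    """Return the Python source of obj, or a short note if none is available."""
    try:
        return inspect.getsource(obj)
    except (TypeError, OSError):
        return "<no Python source available for %r>" % (obj,)


# A pure-Python standard library function, so the source can be retrieved.
print(show_source(json.dumps).splitlines()[0])

# A ufunc is implemented in C, so the fallback note is returned instead.
print(show_source(np.exp))

# dir() lists a namespace; here, the public names of numpy.linalg.
public_names = [name for name in dir(np.linalg) if not name.startswith("_")]
print(public_names)
```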

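This is also my first post, so if I should change something here please let me know.

Mu Mind hora 14 years ago

From Wikipedia ( http://en.wikipedia.org/wiki/NumPy#History ):

The Numeric code was adapted to make it more maintainable and flexible enough to implement the novel features of Numarray. This new project was part of SciPy. To avoid installing a whole package just to get an array object, this new package was separated and called NumPy.

scipy depends on numpy and imports many numpy functions into its namespace for convenience.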

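DaveP 14 years ago

Regarding the linalg package: the scipy functions call lapack and blas, which are available in highly optimised versions on many platforms and offer very good performance, particularly for operations on reasonably large dense matrices. On the other hand, they are not easy libraries to compile, requiring a fortran compiler and many platform-specific tweaks to get full performance. Therefore, numpy provides simple implementations of many common linear algebra functions, which are good enough for many purposes.

Vlad Bezden 8 years ago

From 'Quantitative Economics':

SciPy is a package that contains various tools that are built on top of NumPy, using its array data type and related functionality

In fact, when we import SciPy we also get NumPy, as can be seen from the SciPy initialization file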

# Import numpy symbols to scipy name space
import numpy as _num
linalg = None
from numpy import *
from numpy.random import rand, randn
from numpy.fft import fft, ifft
from numpy.lib.scimath import *

__all__  = []
__all__ += _num.__all__
__all__ += ['randn', 'rand', 'fft', 'ifft']

del _num
# Remove the linalg imported from numpy so that the scipy.linalg package can be
# imported.
del linalg
__all__.remove('linalg')

However, it is more common and better practice to use NumPy functionality explicitly

import numpy as np

a = np.identity(3)

What is useful in SciPy is the functionality in its subpackages

scipy.optimize, scipy.integrate, scipy.stats, etc.
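
A brief illustration of that division of labour, with NumPy supplying the array functions and SciPy subpackages supplying the numerical routines. The integrand and the equation being solved are illustrative choices, not taken from the lectures.

```python
import numpy as np
from scipy import integrate, optimize

# Integrate sin(x) over [0, pi]; quad returns the value and an error estimate.
area, abs_err = integrate.quad(np.sin, 0.0, np.pi)

# Find the root of x**2 - 2 on [0, 2], where the function changes sign.
root = optimize.brentq(lambda x: x**2 - 2.0, 0.0, 2.0)

print(area, root)
```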

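jbbiomed 7 years ago

In addition to the SciPy FAQ, which describes the duplication as being mainly for backwards compatibility, the NumPy documentation clarifies further:

Optionally SciPy-accelerated routines (numpy.dual)

Aliases for functions which may be accelerated by Scipy.

SciPy can be built to use accelerated or otherwise improved libraries for FFTs, linear algebra, and special functions. This module allows developers to transparently support these accelerated functions when SciPy is available but still support users who have only installed NumPy.

For brevity, these are:

Linear algebra
FFT
The modified Bessel function of the first kind, order 0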
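
The idea behind such aliases can be sketched with an ordinary import fallback. This is not the implementation of numpy.dual (which was later deprecated), only a minimal pattern under the assumption that the functions used exist under the same name in both scipy.linalg and numpy.linalg; the `backend` variable is a hypothetical name added for illustration.

```python
import numpy as np

# Prefer SciPy's implementation when it is installed, otherwise use NumPy's.
try:
    from scipy import linalg
    backend = "scipy"
except ImportError:
    from numpy import linalg
    backend = "numpy"

a = np.array([[2.0, 0.0], [0.0, 4.0]])
print(backend, linalg.det(a), linalg.inv(a))
```

Code written against `linalg` in this way runs with either package, which is the kind of transparency the quoted passage describes.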

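Also, from the SciPy Tutorial:

The top level of SciPy also contains functions from NumPy and numpy.lib.scimath. However, it is better to use them directly from the NumPy module instead.

So, for new applications, you should prefer the NumPy version of the array operations that are duplicated in the top level of SciPy. For the domains listed above, you should prefer the SciPy versions and, if necessary, check backwards compatibility in NumPy.

In my personal experience, most of the array functions I use exist in the top level of NumPy (with the exception of random). However, all the domain-specific routines exist in subpackages of SciPy, so I rarely use anything from the top level of SciPy.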