Thursday, November 19, 2015

Saturday, April 4, 2015

R's sapply in Numpy

One quick blog entry about sapply and numpy:

Usually, sapply corresponds to a list comprehension in Python,
i.e. [f(x) for x in range(1, 11)] is sapply(1:10, f).
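To make the correspondence concrete, here is a minimal sketch with a made-up scalar function (the name square is only an illustration, it is not from any library). Note that R's 1:10 includes 10, while Python's range excludes its stop value, so the matching range is range(1, 11).

```python
def square(x):
    # Hypothetical example function: maps a number to a number.
    return x ** 2

# Python counterpart of R's sapply(1:10, square)
squares = [square(x) for x in range(1, 11)]
print(squares)
```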

But when the function you apply maps a number to a vector, the result is a matrix. R's sapply raises the rank of the output to two on its own, while in NumPy you have to lift the input vector to rank two yourself, via vector[:, np.newaxis], before handing it to np.apply_along_axis.

In [1]: import numpy as np

In [2]: def f(x):
   ...:     return np.array([4,2,3,9.1,-1]) * x
   ...: 

In [3]: f(9)
Out[3]: array([ 36. ,  18. ,  27. ,  81.9,  -9. ])

In [4]: vals = np.linspace(-1,1,20)

In [5]: np.apply_along_axis(f, 1, vals[:, np.newaxis])
Out[5]: 
array([[-4.        , -2.        , -3.        , -9.1       ,  1.        ],
       [-3.57894737, -1.78947368, -2.68421053, -8.14210526,  0.89473684],
       [-3.15789474, -1.57894737, -2.36842105, -7.18421053,  0.78947368],
       [-2.73684211, -1.36842105, -2.05263158, -6.22631579,  0.68421053],
       [-2.31578947, -1.15789474, -1.73684211, -5.26842105,  0.57894737],
       [-1.89473684, -0.94736842, -1.42105263, -4.31052632,  0.47368421],
       [-1.47368421, -0.73684211, -1.10526316, -3.35263158,  0.36842105],
       [-1.05263158, -0.52631579, -0.78947368, -2.39473684,  0.26315789],
       [-0.63157895, -0.31578947, -0.47368421, -1.43684211,  0.15789474],
       [-0.21052632, -0.10526316, -0.15789474, -0.47894737,  0.05263158],
       [ 0.21052632,  0.10526316,  0.15789474,  0.47894737, -0.05263158],
       [ 0.63157895,  0.31578947,  0.47368421,  1.43684211, -0.15789474],
       [ 1.05263158,  0.52631579,  0.78947368,  2.39473684, -0.26315789],
       [ 1.47368421,  0.73684211,  1.10526316,  3.35263158, -0.36842105],
       [ 1.89473684,  0.94736842,  1.42105263,  4.31052632, -0.47368421],
       [ 2.31578947,  1.15789474,  1.73684211,  5.26842105, -0.57894737],
       [ 2.73684211,  1.36842105,  2.05263158,  6.22631579, -0.68421053],
       [ 3.15789474,  1.57894737,  2.36842105,  7.18421053, -0.78947368],
       [ 3.57894737,  1.78947368,  2.68421053,  8.14210526, -0.89473684],
       [ 4.        ,  2.        ,  3.        ,  9.1       , -1.        ]])
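
What np.apply_along_axis does here can also be spelled out by hand. With axis 1 and an input of shape (20, 1), f receives each row as a one-element 1-D array, and the returned length-5 vectors become the rows of the result. The following is a minimal sketch, assuming the f and vals from the session above; the names by_hand and by_broadcast are only illustrative.

```python
import numpy as np

def f(x):
    return np.array([4, 2, 3, 9.1, -1]) * x

vals = np.linspace(-1, 1, 20)

# The call from the session: one row of output per input value.
result = np.apply_along_axis(f, 1, vals[:, np.newaxis])

# The same matrix built with a list comprehension, then stacked.
by_hand = np.array([f(x) for x in vals])

# Since f is a plain product, broadcasting yields the matrix directly.
by_broadcast = vals[:, np.newaxis] * np.array([4, 2, 3, 9.1, -1])

print(result.shape)
print(np.allclose(result, by_hand), np.allclose(result, by_broadcast))
```

One difference to keep in mind: R's sapply puts each result into a column, so the R matrix for the same inputs would be 5 x 20, the transpose of the array shown here.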

It also works for a function with an additional argument, here weights: anything passed to np.apply_along_axis after the array is forwarded to f. Below, the weights are defined in the variable w.

In [6]: def f(x, weights):
   ...:     return weights * x
   ...: 

In [7]: w = np.array([-2,-1,5,1.1])

In [8]: np.apply_along_axis(f, 1, vals[:, np.newaxis], w)
Out[8]: 
array([[ 2.        ,  1.        , -5.        , -1.1       ],
       [ 1.78947368,  0.89473684, -4.47368421, -0.98421053],
       [ 1.57894737,  0.78947368, -3.94736842, -0.86842105],
       [ 1.36842105,  0.68421053, -3.42105263, -0.75263158],
       [ 1.15789474,  0.57894737, -2.89473684, -0.63684211],
       [ 0.94736842,  0.47368421, -2.36842105, -0.52105263],
       [ 0.73684211,  0.36842105, -1.84210526, -0.40526316],
       [ 0.52631579,  0.26315789, -1.31578947, -0.28947368],
       [ 0.31578947,  0.15789474, -0.78947368, -0.17368421],
       [ 0.10526316,  0.05263158, -0.26315789, -0.05789474],
       [-0.10526316, -0.05263158,  0.26315789,  0.05789474],
       [-0.31578947, -0.15789474,  0.78947368,  0.17368421],
       [-0.52631579, -0.26315789,  1.31578947,  0.28947368],
       [-0.73684211, -0.36842105,  1.84210526,  0.40526316],
       [-0.94736842, -0.47368421,  2.36842105,  0.52105263],
       [-1.15789474, -0.57894737,  2.89473684,  0.63684211],
       [-1.36842105, -0.68421053,  3.42105263,  0.75263158],
       [-1.57894737, -0.78947368,  3.94736842,  0.86842105],
       [-1.78947368, -0.89473684,  4.47368421,  0.98421053],
       [-2.        , -1.        ,  5.        ,  1.1       ]])
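
As a sketch of how the extra argument travels: np.apply_along_axis forwards additional positional and keyword arguments to the function, so weights can also be passed by name. For this particular f, which is just an elementwise product, np.outer should give the same matrix without any Python-level loop. The variable names positional, keyword and outer are only illustrative.

```python
import numpy as np

def f(x, weights):
    return weights * x

vals = np.linspace(-1, 1, 20)
w = np.array([-2, -1, 5, 1.1])

# Extra positional argument, as in the session above.
positional = np.apply_along_axis(f, 1, vals[:, np.newaxis], w)

# The same call with the extra argument passed by keyword.
keyword = np.apply_along_axis(f, 1, vals[:, np.newaxis], weights=w)

# For a plain product, the outer product of vals and w is equivalent.
outer = np.outer(vals, w)

print(positional.shape)
print(np.allclose(positional, keyword), np.allclose(positional, outer))
```

The np.outer shortcut only applies because f multiplies its inputs; for a general number-to-vector function, apply_along_axis or a list comprehension is still the way to go.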