Chris@87: """
Chris@87: Set operations for 1D numeric arrays based on sorting.
Chris@87: 
Chris@87: :Contains:
Chris@87:   ediff1d,
Chris@87:   unique,
Chris@87:   intersect1d,
Chris@87:   setxor1d,
Chris@87:   in1d,
Chris@87:   union1d,
Chris@87:   setdiff1d
Chris@87: 
Chris@87: :Notes:
Chris@87: 
Chris@87: For floating point arrays, inaccurate results may appear due to usual round-off
Chris@87: and floating point comparison issues.
Chris@87: 
Chris@87: Speed could be gained in some operations by an implementation of
Chris@87: sort(), that can provide directly the permutation vectors, avoiding
Chris@87: thus calls to argsort().
Chris@87: 
Chris@87: To do: Optionally return indices analogously to unique for all functions.
Chris@87: 
Chris@87: :Author: Robert Cimrman
Chris@87: 
Chris@87: """
Chris@87: from __future__ import division, absolute_import, print_function
Chris@87: 
Chris@87: import numpy as np
Chris@87: 
Chris@87: 
Chris@87: __all__ = [
Chris@87:     'ediff1d', 'intersect1d', 'setxor1d', 'union1d', 'setdiff1d', 'unique',
Chris@87:     'in1d'
Chris@87:     ]
Chris@87: 
Chris@87: 
Chris@87: def ediff1d(ary, to_end=None, to_begin=None):
Chris@87:     """
Chris@87:     The differences between consecutive elements of an array.
Chris@87: 
Chris@87:     Parameters
Chris@87:     ----------
Chris@87:     ary : array_like
Chris@87:         If necessary, will be flattened before the differences are taken.
Chris@87:     to_end : array_like, optional
Chris@87:         Number(s) to append at the end of the returned differences.
Chris@87:     to_begin : array_like, optional
Chris@87:         Number(s) to prepend at the beginning of the returned differences.
Chris@87: 
Chris@87:     Returns
Chris@87:     -------
Chris@87:     ediff1d : ndarray
Chris@87:         The differences. Loosely, this is ``ary.flat[1:] - ary.flat[:-1]``.
Chris@87: 
Chris@87:     See Also
Chris@87:     --------
Chris@87:     diff, gradient
Chris@87: 
Chris@87:     Notes
Chris@87:     -----
Chris@87:     When applied to masked arrays, this function drops the mask information
Chris@87:     if the `to_begin` and/or `to_end` parameters are used.
Chris@87: 
Chris@87:     Examples
Chris@87:     --------
Chris@87:     >>> x = np.array([1, 2, 4, 7, 0])
Chris@87:     >>> np.ediff1d(x)
Chris@87:     array([ 1,  2,  3, -7])
Chris@87: 
Chris@87:     >>> np.ediff1d(x, to_begin=-99, to_end=np.array([88, 99]))
Chris@87:     array([-99,   1,   2,   3,  -7,  88,  99])
Chris@87: 
Chris@87:     The returned array is always 1D.
Chris@87: 
Chris@87:     >>> y = [[1, 2, 4], [1, 6, 24]]
Chris@87:     >>> np.ediff1d(y)
Chris@87:     array([ 1,  2, -3,  5, 18])
Chris@87: 
Chris@87:     """
Chris@87:     ary = np.asanyarray(ary).flat
Chris@87:     ed = ary[1:] - ary[:-1]
Chris@87:     arrays = [ed]
Chris@87:     if to_begin is not None:
Chris@87:         arrays.insert(0, to_begin)
Chris@87:     if to_end is not None:
Chris@87:         arrays.append(to_end)
Chris@87: 
Chris@87:     if len(arrays) != 1:
Chris@87:         # We'll save ourselves a copy of a potentially large array in
Chris@87:         # the common case where neither to_begin or to_end was given.
Chris@87:         ed = np.hstack(arrays)
Chris@87: 
Chris@87:     return ed
Chris@87: 
Chris@87: def unique(ar, return_index=False, return_inverse=False, return_counts=False):
Chris@87:     """
Chris@87:     Find the unique elements of an array.
Chris@87: 
Chris@87:     Returns the sorted unique elements of an array. There are two optional
Chris@87:     outputs in addition to the unique elements: the indices of the input array
Chris@87:     that give the unique values, and the indices of the unique array that
Chris@87:     reconstruct the input array.
Chris@87: 
Chris@87:     Parameters
Chris@87:     ----------
Chris@87:     ar : array_like
Chris@87:         Input array. This will be flattened if it is not already 1-D.
Chris@87:     return_index : bool, optional
Chris@87:         If True, also return the indices of `ar` that result in the unique
Chris@87:         array.
Chris@87:     return_inverse : bool, optional
Chris@87:         If True, also return the indices of the unique array that can be used
Chris@87:         to reconstruct `ar`.
Chris@87:     return_counts : bool, optional
Chris@87:         .. versionadded:: 1.9.0
Chris@87:         If True, also return the number of times each unique value comes up
Chris@87:         in `ar`.
Chris@87: 
Chris@87:     Returns
Chris@87:     -------
Chris@87:     unique : ndarray
Chris@87:         The sorted unique values.
Chris@87:     unique_indices : ndarray, optional
Chris@87:         The indices of the first occurrences of the unique values in the
Chris@87:         (flattened) original array. Only provided if `return_index` is True.
Chris@87:     unique_inverse : ndarray, optional
Chris@87:         The indices to reconstruct the (flattened) original array from the
Chris@87:         unique array. Only provided if `return_inverse` is True.
Chris@87:     unique_counts : ndarray, optional
Chris@87:         .. versionadded:: 1.9.0
Chris@87:         The number of times each of the unique values comes up in the
Chris@87:         original array. Only provided if `return_counts` is True.
Chris@87: 
Chris@87:     See Also
Chris@87:     --------
Chris@87:     numpy.lib.arraysetops : Module with a number of other functions for
Chris@87:                             performing set operations on arrays.
Chris@87: 
Chris@87:     Examples
Chris@87:     --------
Chris@87:     >>> np.unique([1, 1, 2, 2, 3, 3])
Chris@87:     array([1, 2, 3])
Chris@87:     >>> a = np.array([[1, 1], [2, 3]])
Chris@87:     >>> np.unique(a)
Chris@87:     array([1, 2, 3])
Chris@87: 
Chris@87:     Return the indices of the original array that give the unique values:
Chris@87: 
Chris@87:     >>> a = np.array(['a', 'b', 'b', 'c', 'a'])
Chris@87:     >>> u, indices = np.unique(a, return_index=True)
Chris@87:     >>> u
Chris@87:     array(['a', 'b', 'c'],
Chris@87:            dtype='|S1')
Chris@87:     >>> indices
Chris@87:     array([0, 1, 3])
Chris@87:     >>> a[indices]
Chris@87:     array(['a', 'b', 'c'],
Chris@87:            dtype='|S1')
Chris@87: 
Chris@87:     Reconstruct the input array from the unique values:
Chris@87: 
Chris@87:     >>> a = np.array([1, 2, 6, 4, 2, 3, 2])
Chris@87:     >>> u, indices = np.unique(a, return_inverse=True)
Chris@87:     >>> u
Chris@87:     array([1, 2, 3, 4, 6])
Chris@87:     >>> indices
Chris@87:     array([0, 1, 4, 3, 1, 2, 1])
Chris@87:     >>> u[indices]
Chris@87:     array([1, 2, 6, 4, 2, 3, 2])
Chris@87: 
Chris@87:     """
Chris@87:     ar = np.asanyarray(ar).flatten()
Chris@87: 
Chris@87:     optional_indices = return_index or return_inverse
Chris@87:     optional_returns = optional_indices or return_counts
Chris@87: 
Chris@87:     if ar.size == 0:
Chris@87:         if not optional_returns:
Chris@87:             ret = ar
Chris@87:         else:
Chris@87:             ret = (ar,)
Chris@87:             if return_index:
Chris@87:                 ret += (np.empty(0, np.bool),)
Chris@87:             if return_inverse:
Chris@87:                 ret += (np.empty(0, np.bool),)
Chris@87:             if return_counts:
Chris@87:                 ret += (np.empty(0, np.intp),)
Chris@87:         return ret
Chris@87: 
Chris@87:     if optional_indices:
Chris@87:         perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')
Chris@87:         aux = ar[perm]
Chris@87:     else:
Chris@87:         ar.sort()
Chris@87:         aux = ar
Chris@87:     flag = np.concatenate(([True], aux[1:] != aux[:-1]))
Chris@87: 
Chris@87:     if not optional_returns:
Chris@87:         ret = aux[flag]
Chris@87:     else:
Chris@87:         ret = (aux[flag],)
Chris@87:         if return_index:
Chris@87:             ret += (perm[flag],)
Chris@87:         if return_inverse:
Chris@87:             iflag = np.cumsum(flag) - 1
Chris@87:             iperm = perm.argsort()
Chris@87:             ret += (np.take(iflag, iperm),)
Chris@87:         if return_counts:
Chris@87:             idx = np.concatenate(np.nonzero(flag) + ([ar.size],))
Chris@87:             ret += (np.diff(idx),)
Chris@87:     return ret
Chris@87: 
Chris@87: def intersect1d(ar1, ar2, assume_unique=False):
Chris@87:     """
Chris@87:     Find the intersection of two arrays.
Chris@87: 
Chris@87:     Return the sorted, unique values that are in both of the input arrays.
Chris@87: 
Chris@87:     Parameters
Chris@87:     ----------
Chris@87:     ar1, ar2 : array_like
Chris@87:         Input arrays.
Chris@87:     assume_unique : bool
Chris@87:         If True, the input arrays are both assumed to be unique, which
Chris@87:         can speed up the calculation.  Default is False.
Chris@87: 
Chris@87:     Returns
Chris@87:     -------
Chris@87:     intersect1d : ndarray
Chris@87:         Sorted 1D array of common and unique elements.
Chris@87: 
Chris@87:     See Also
Chris@87:     --------
Chris@87:     numpy.lib.arraysetops : Module with a number of other functions for
Chris@87:                             performing set operations on arrays.
Chris@87: 
Chris@87:     Examples
Chris@87:     --------
Chris@87:     >>> np.intersect1d([1, 3, 4, 3], [3, 1, 2, 1])
Chris@87:     array([1, 3])
Chris@87: 
Chris@87:     """
Chris@87:     if not assume_unique:
Chris@87:         # Might be faster than unique( intersect1d( ar1, ar2 ) )?
Chris@87:         ar1 = unique(ar1)
Chris@87:         ar2 = unique(ar2)
Chris@87:     aux = np.concatenate((ar1, ar2))
Chris@87:     aux.sort()
Chris@87:     return aux[:-1][aux[1:] == aux[:-1]]
Chris@87: 
Chris@87: def setxor1d(ar1, ar2, assume_unique=False):
Chris@87:     """
Chris@87:     Find the set exclusive-or of two arrays.
Chris@87: 
Chris@87:     Return the sorted, unique values that are in only one (not both) of the
Chris@87:     input arrays.
Chris@87: 
Chris@87:     Parameters
Chris@87:     ----------
Chris@87:     ar1, ar2 : array_like
Chris@87:         Input arrays.
Chris@87:     assume_unique : bool
Chris@87:         If True, the input arrays are both assumed to be unique, which
Chris@87:         can speed up the calculation.  Default is False.
Chris@87: 
Chris@87:     Returns
Chris@87:     -------
Chris@87:     setxor1d : ndarray
Chris@87:         Sorted 1D array of unique values that are in only one of the input
Chris@87:         arrays.
Chris@87: 
Chris@87:     Examples
Chris@87:     --------
Chris@87:     >>> a = np.array([1, 2, 3, 2, 4])
Chris@87:     >>> b = np.array([2, 3, 5, 7, 5])
Chris@87:     >>> np.setxor1d(a,b)
Chris@87:     array([1, 4, 5, 7])
Chris@87: 
Chris@87:     """
Chris@87:     if not assume_unique:
Chris@87:         ar1 = unique(ar1)
Chris@87:         ar2 = unique(ar2)
Chris@87: 
Chris@87:     aux = np.concatenate((ar1, ar2))
Chris@87:     if aux.size == 0:
Chris@87:         return aux
Chris@87: 
Chris@87:     aux.sort()
Chris@87: #    flag = ediff1d( aux, to_end = 1, to_begin = 1 ) == 0
Chris@87:     flag = np.concatenate(([True], aux[1:] != aux[:-1], [True]))
Chris@87: #    flag2 = ediff1d( flag ) == 0
Chris@87:     flag2 = flag[1:] == flag[:-1]
Chris@87:     return aux[flag2]
Chris@87: 
Chris@87: def in1d(ar1, ar2, assume_unique=False, invert=False):
Chris@87:     """
Chris@87:     Test whether each element of a 1-D array is also present in a second array.
Chris@87: 
Chris@87:     Returns a boolean array the same length as `ar1` that is True
Chris@87:     where an element of `ar1` is in `ar2` and False otherwise.
Chris@87: 
Chris@87:     Parameters
Chris@87:     ----------
Chris@87:     ar1 : (M,) array_like
Chris@87:         Input array.
Chris@87:     ar2 : array_like
Chris@87:         The values against which to test each value of `ar1`.
Chris@87:     assume_unique : bool, optional
Chris@87:         If True, the input arrays are both assumed to be unique, which
Chris@87:         can speed up the calculation.  Default is False.
Chris@87:     invert : bool, optional
Chris@87:         If True, the values in the returned array are inverted (that is,
Chris@87:         False where an element of `ar1` is in `ar2` and True otherwise).
Chris@87:         Default is False. ``np.in1d(a, b, invert=True)`` is equivalent
Chris@87:         to (but is faster than) ``np.invert(in1d(a, b))``.
Chris@87: 
Chris@87:         .. versionadded:: 1.8.0
Chris@87: 
Chris@87:     Returns
Chris@87:     -------
Chris@87:     in1d : (M,) ndarray, bool
Chris@87:         The values `ar1[in1d]` are in `ar2`.
Chris@87: 
Chris@87:     See Also
Chris@87:     --------
Chris@87:     numpy.lib.arraysetops : Module with a number of other functions for
Chris@87:                             performing set operations on arrays.
Chris@87: 
Chris@87:     Notes
Chris@87:     -----
Chris@87:     `in1d` can be considered as an element-wise function version of the
Chris@87:     python keyword `in`, for 1-D sequences. ``in1d(a, b)`` is roughly
Chris@87:     equivalent to ``np.array([item in b for item in a])``.
Chris@87: 
Chris@87:     .. versionadded:: 1.4.0
Chris@87: 
Chris@87:     Examples
Chris@87:     --------
Chris@87:     >>> test = np.array([0, 1, 2, 5, 0])
Chris@87:     >>> states = [0, 2]
Chris@87:     >>> mask = np.in1d(test, states)
Chris@87:     >>> mask
Chris@87:     array([ True, False,  True, False,  True], dtype=bool)
Chris@87:     >>> test[mask]
Chris@87:     array([0, 2, 0])
Chris@87:     >>> mask = np.in1d(test, states, invert=True)
Chris@87:     >>> mask
Chris@87:     array([False,  True, False,  True, False], dtype=bool)
Chris@87:     >>> test[mask]
Chris@87:     array([1, 5])
Chris@87:     """
Chris@87:     # Ravel both arrays, behavior for the first array could be different
Chris@87:     ar1 = np.asarray(ar1).ravel()
Chris@87:     ar2 = np.asarray(ar2).ravel()
Chris@87: 
Chris@87:     # This code is significantly faster when the condition is satisfied.
Chris@87:     if len(ar2) < 10 * len(ar1) ** 0.145:
Chris@87:         if invert:
Chris@87:             mask = np.ones(len(ar1), dtype=np.bool)
Chris@87:             for a in ar2:
Chris@87:                 mask &= (ar1 != a)
Chris@87:         else:
Chris@87:             mask = np.zeros(len(ar1), dtype=np.bool)
Chris@87:             for a in ar2:
Chris@87:                 mask |= (ar1 == a)
Chris@87:         return mask
Chris@87: 
Chris@87:     # Otherwise use sorting
Chris@87:     if not assume_unique:
Chris@87:         ar1, rev_idx = np.unique(ar1, return_inverse=True)
Chris@87:         ar2 = np.unique(ar2)
Chris@87: 
Chris@87:     ar = np.concatenate((ar1, ar2))
Chris@87:     # We need this to be a stable sort, so always use 'mergesort'
Chris@87:     # here. The values from the first array should always come before
Chris@87:     # the values from the second array.
Chris@87:     order = ar.argsort(kind='mergesort')
Chris@87:     sar = ar[order]
Chris@87:     if invert:
Chris@87:         bool_ar = (sar[1:] != sar[:-1])
Chris@87:     else:
Chris@87:         bool_ar = (sar[1:] == sar[:-1])
Chris@87:     flag = np.concatenate((bool_ar, [invert]))
Chris@87:     indx = order.argsort(kind='mergesort')[:len(ar1)]
Chris@87: 
Chris@87:     if assume_unique:
Chris@87:         return flag[indx]
Chris@87:     else:
Chris@87:         return flag[indx][rev_idx]
Chris@87: 
Chris@87: def union1d(ar1, ar2):
Chris@87:     """
Chris@87:     Find the union of two arrays.
Chris@87: 
Chris@87:     Return the unique, sorted array of values that are in either of the two
Chris@87:     input arrays.
Chris@87: 
Chris@87:     Parameters
Chris@87:     ----------
Chris@87:     ar1, ar2 : array_like
Chris@87:         Input arrays. They are flattened if they are not already 1D.
Chris@87: 
Chris@87:     Returns
Chris@87:     -------
Chris@87:     union1d : ndarray
Chris@87:         Unique, sorted union of the input arrays.
Chris@87: 
Chris@87:     See Also
Chris@87:     --------
Chris@87:     numpy.lib.arraysetops : Module with a number of other functions for
Chris@87:                             performing set operations on arrays.
Chris@87: 
Chris@87:     Examples
Chris@87:     --------
Chris@87:     >>> np.union1d([-1, 0, 1], [-2, 0, 2])
Chris@87:     array([-2, -1,  0,  1,  2])
Chris@87: 
Chris@87:     """
Chris@87:     return unique(np.concatenate((ar1, ar2)))
Chris@87: 
Chris@87: def setdiff1d(ar1, ar2, assume_unique=False):
Chris@87:     """
Chris@87:     Find the set difference of two arrays.
Chris@87: 
Chris@87:     Return the sorted, unique values in `ar1` that are not in `ar2`.
Chris@87: 
Chris@87:     Parameters
Chris@87:     ----------
Chris@87:     ar1 : array_like
Chris@87:         Input array.
Chris@87:     ar2 : array_like
Chris@87:         Input comparison array.
Chris@87:     assume_unique : bool
Chris@87:         If True, the input arrays are both assumed to be unique, which
Chris@87:         can speed up the calculation.  Default is False.
Chris@87: 
Chris@87:     Returns
Chris@87:     -------
Chris@87:     setdiff1d : ndarray
Chris@87:         Sorted 1D array of values in `ar1` that are not in `ar2`.
Chris@87: 
Chris@87:     See Also
Chris@87:     --------
Chris@87:     numpy.lib.arraysetops : Module with a number of other functions for
Chris@87:                             performing set operations on arrays.
Chris@87: 
Chris@87:     Examples
Chris@87:     --------
Chris@87:     >>> a = np.array([1, 2, 3, 2, 4, 1])
Chris@87:     >>> b = np.array([3, 4, 5, 6])
Chris@87:     >>> np.setdiff1d(a, b)
Chris@87:     array([1, 2])
Chris@87: 
Chris@87:     """
Chris@87:     if not assume_unique:
Chris@87:         ar1 = unique(ar1)
Chris@87:         ar2 = unique(ar2)
Chris@87:     aux = in1d(ar1, ar2, assume_unique=True)
Chris@87:     if aux.size == 0:
Chris@87:         return aux
Chris@87:     else:
Chris@87:         return np.asarray(ar1)[aux == 0]