Class Quantile


  • public final class Quantile
    extends Object
    Provides quantile computation.

    For values of length n:

    • The result is NaN if n = 0.
    • The result is values[0] if n = 1.
    • Otherwise the result is computed using the Quantile.EstimationMethod.
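The three cases above can be sketched as follows. This is a minimal illustration, not the library's implementation. The class `QuantileSketch` and its `quantile` method are hypothetical names, the input is assumed to be already sorted, and the interpolation rule (linear interpolation at position (n - 1)p) is one common continuous estimator chosen here as an assumption; the configured Quantile.EstimationMethod may use a different rule.

```java
import java.util.function.IntToDoubleFunction;

/** Hypothetical sketch of the n = 0, n = 1 and n > 1 cases; not the library code. */
final class QuantileSketch {
    /**
     * Computes a quantile of n sorted values.
     * Assumption: linear interpolation at position h = (n - 1) * p.
     */
    static double quantile(int n, IntToDoubleFunction sorted, double p) {
        if (n < 0) {
            throw new IllegalArgumentException("Invalid size: " + n);
        }
        // The negated form also rejects NaN probabilities.
        if (!(p >= 0 && p <= 1)) {
            throw new IllegalArgumentException("Invalid probability: " + p);
        }
        if (n == 0) {
            return Double.NaN;               // no data
        }
        if (n == 1) {
            return sorted.applyAsDouble(0);  // single value
        }
        double h = (n - 1) * p;              // fractional index into the sorted data
        int lo = (int) Math.floor(h);
        int hi = Math.min(lo + 1, n - 1);
        double a = sorted.applyAsDouble(lo);
        double b = sorted.applyAsDouble(hi);
        return a + (h - lo) * (b - a);       // interpolate between the neighbours
    }
}
```

Under these assumptions, the sorted values {1, 2, 3, 4} with p = 0.5 give h = 1.5 and an interpolated result of 2.5.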

    Computation of multiple quantiles handles duplicate and unordered probabilities. Passing ordered probabilities is recommended when the order is already known, as this can improve efficiency; for example, probabilities uniformly spaced through the array data, or a pair that identifies extreme values in the data such as [0.001, 0.999].

    This implementation respects the ordering imposed by Double.compare(double, double) for NaN values. If a NaN occurs in the selected positions in the fully sorted values then the result is NaN.
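The ordering referred to here can be observed with the JDK alone. The sketch below uses only standard library calls; the class `NaNOrderingSketch` and its `sortedCopy` helper are hypothetical names for illustration. It relies on the documented behaviour that `Arrays.sort(double[])` uses the same total order as `Double.compare`, so NaN values end up last in a fully sorted array.

```java
import java.util.Arrays;

/** Hypothetical helper showing where NaN lands in a fully sorted array. */
final class NaNOrderingSketch {
    /** Returns a sorted copy; the argument is left unchanged. */
    static double[] sortedCopy(double... values) {
        double[] a = values.clone();
        // Uses the total order of Double.compare: NaN is greater than
        // every other value, including positive infinity.
        Arrays.sort(a);
        return a;
    }
}
```

For the input {3, NaN, 1} the sorted copy is {1, 3, NaN}, so any quantile that selects the last position would be NaN.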

    The NaNPolicy can be used to change the behaviour on NaN values.

    Instances of this class are immutable and thread-safe.

    Support for long arrays

    The result on long values can be returned as a double or a long using a StatisticResult.

    The double result is computed within 1 ULP of the exact result. In some cases this may be outside the range defined by the minimum and maximum of the input array following rounding to a 53-bit floating point representation. For example a quantile of an array containing only Long.MAX_VALUE as a double is 2^63, which is the closest representation of 2^63 - 1.
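The rounding described above can be checked with plain JDK arithmetic. The following is a small sketch; the class `LongPrecisionSketch` and the helper `isExactAsDouble` are hypothetical names, and the code only demonstrates primitive conversion, not the library's quantile computation.

```java
import java.math.BigDecimal;

/** Hypothetical helper: tests whether a long survives conversion to double. */
final class LongPrecisionSketch {
    /** Returns true if the value is exactly representable as a double. */
    static boolean isExactAsDouble(long v) {
        // BigDecimal(double) holds the exact binary value, so the comparison
        // is not affected by a saturating cast back to long.
        return new BigDecimal((double) v).compareTo(BigDecimal.valueOf(v)) == 0;
    }
}
```

Here `(double) Long.MAX_VALUE == 0x1p63` evaluates to true, and `isExactAsDouble(Long.MAX_VALUE)` returns false because the nearest double lies one above the largest long.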

    The long result is returned using the nearest whole number. In the event of ties the result is rounded towards positive infinity. This value will always be within the range defined by the minimum and maximum of the input array. Due to interpolation it may be a value not observed in the input values.
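The tie-breaking rule can be illustrated with `Math.round(double)`, which the JDK documents as rounding ties towards positive infinity. This is a sketch under stated assumptions: the class `NearestLongSketch` and the method `midpointRounded` are hypothetical, and the double arithmetic is only exact for small magnitudes. It does not reproduce the extended precision arithmetic needed for arbitrary long values.

```java
/** Hypothetical sketch of rounding an interpolated value to the nearest long. */
final class NearestLongSketch {
    /**
     * Rounds the midpoint of two values to the nearest whole number.
     * Assumption: a and b are small enough that the double arithmetic is exact.
     */
    static long midpointRounded(long a, long b) {
        double mid = a + 0.5 * (b - a);
        // Math.round returns the closest long; ties round towards positive infinity.
        return Math.round(mid);
    }
}
```

The midpoint of 1 and 2 is 1.5 and rounds to 2; the midpoint of -2 and -1 is -1.5 and rounds to -1. Both results stay within the range of the two inputs.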

    Interpolation between two long values requires extended precision floating-point arithmetic. This can be avoided using a discontinuous Quantile.EstimationMethod. In this case the long quantile will be a value observed in the input values.

    If the array length n is zero the result as a double is NaN and the result as a long will raise an ArithmeticException.

    When multiple quantile results are required as only one of the primitive types, they can be converted to a primitive array using a stream, for example:

    
     long[] values = ...
     double[] p = Quantile.probabilities(10);
     Quantile q = Quantile.withDefaults();
     long[] result = Arrays.stream(q.evaluate(values, p))
                           .mapToLong(StatisticResult::getAsLong)
                           .toArray();
     
    Since:
    1.1
    See Also:
    with(NaNPolicy), Quantile (Wikipedia)
    • Method Detail

      • withCopy

        public Quantile withCopy​(boolean v)
        Return an instance with the configured copy behaviour. If false then the input array will be modified by the call to evaluate the quantiles; otherwise the computation uses a copy of the data.
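The effect of the copy flag on the caller's array can be sketched without the library. The class `CopyBehaviourSketch` and the method `smallest` are hypothetical names. As a simplification the sketch fully sorts the working array, whereas the documentation here only states that the input may be partially sorted.

```java
import java.util.Arrays;

/** Hypothetical sketch of copy versus in-place computation. */
final class CopyBehaviourSketch {
    /**
     * Returns the smallest value (the p = 0 quantile) of a non-empty array.
     * If copy is false the caller's array is reordered by the call.
     */
    static double smallest(double[] values, boolean copy) {
        double[] work = copy ? values.clone() : values;
        Arrays.sort(work);   // simplification: a full sort instead of a partial sort
        return work[0];
    }
}
```

With copy set to true the input {3, 1, 2} is left untouched; with copy set to false the same call leaves the input as {1, 2, 3}.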
        Parameters:
        v - Value.
        Returns:
        an instance
      • with

        public Quantile with​(NaNPolicy v)
        Return an instance with the configured NaNPolicy.

        Note: This implementation respects the ordering imposed by Double.compare(double, double) for NaN values: NaN is considered greater than all other values, and all NaN values are equal. The NaNPolicy changes the computation of the statistic in the presence of NaN values.

        • NaNPolicy.INCLUDE: NaN values are moved to the end of the data; the size of the data includes the NaN values and the quantile will be NaN if any value used for quantile interpolation is NaN.
        • NaNPolicy.EXCLUDE: NaN values are moved to the end of the data; the size of the data excludes the NaN values and the quantile will never be NaN for non-zero size. If all data are NaN then the size is zero and the result is NaN.
        • NaNPolicy.ERROR: An exception is raised if the data contains NaN values.

        Note that the result is identical for all policies if no NaN values are present.
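The difference between the policies comes down to where NaN values are placed and which size is used. The sketch below illustrates that idea only; the class `NaNPolicySketch` and the method `moveNaNToEnd` are hypothetical and are not part of the library API.

```java
/** Hypothetical sketch of the data preparation implied by the NaN policies. */
final class NaNPolicySketch {
    /**
     * Moves all NaN values to the end of the array in place.
     * Returns the number of non-NaN values.
     */
    static int moveNaNToEnd(double[] a) {
        int end = a.length;
        // Scan from the back; positions in [end, a.length) always hold NaN.
        for (int i = a.length - 1; i >= 0; i--) {
            if (Double.isNaN(a[i])) {
                end--;
                a[i] = a[end];       // pull a non-NaN value (or itself) forward
                a[end] = Double.NaN;
            }
        }
        return end;
    }
}
```

For {1, NaN, 3, NaN, 5} the method returns 3. In terms of the policies above, an INCLUDE-style computation would use the full length 5, an EXCLUDE-style computation would use the returned size 3, and an ERROR-style computation would raise an exception because the returned size is smaller than the length.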

        Parameters:
        v - Value.
        Returns:
        an instance
      • probabilities

        public static double[] probabilities​(int n)
        Generate n evenly spaced probabilities in the range [0, 1].
         1/(n + 1), 2/(n + 1), ..., n/(n + 1)
         
        Parameters:
        n - Number of probabilities.
        Returns:
        the probabilities
        Throws:
        IllegalArgumentException - if n < 1
      • probabilities

        public static double[] probabilities​(int n,
                                             double p1,
                                             double p2)
        Generate n evenly spaced probabilities in the range [p1, p2].
         w = p2 - p1
         p1 + w/(n + 1), p1 + 2w/(n + 1), ..., p1 + nw/(n + 1)
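The spacing formula above can be written out directly. This is a sketch of the documented formula, not the library source; the class name `ProbabilitiesSketch` is hypothetical and the argument checks follow the conditions listed under Throws.

```java
/** Hypothetical sketch of the evenly spaced probabilities formula. */
final class ProbabilitiesSketch {
    /** Returns p1 + k * w / (n + 1) for k = 1..n, where w = p2 - p1. */
    static double[] probabilities(int n, double p1, double p2) {
        if (n < 1) {
            throw new IllegalArgumentException("n < 1: " + n);
        }
        // Requires 0 <= p1 < p2 <= 1; the negated form also rejects NaN.
        if (!(p1 >= 0 && p2 <= 1 && p1 < p2)) {
            throw new IllegalArgumentException("Invalid range: " + p1 + ", " + p2);
        }
        double w = p2 - p1;
        double[] p = new double[n];
        for (int k = 1; k <= n; k++) {
            p[k - 1] = p1 + k * w / (n + 1);
        }
        return p;
    }
}
```

With n = 3 over [0, 1] the sketch returns 0.25, 0.5, 0.75; over [0.5, 1] it returns 0.625, 0.75, 0.875. The end points p1 and p2 are never included in the output.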
         
        Parameters:
        n - Number of probabilities.
        p1 - Lower probability.
        p2 - Upper probability.
        Returns:
        the probabilities
        Throws:
        IllegalArgumentException - if n < 1; if the probabilities are not in the range [0, 1]; or p2 <= p1.
      • evaluate

        public double evaluate​(double[] values,
                               double p)
        Evaluate the p-th quantile of the values.

        Note: This method may partially sort the input values if not configured to copy the input data.

        Performance

        This method is not recommended for repeated calls that compute different quantiles of the same values. Use the evaluate(double[], double...) method instead, which provides better performance.

        Parameters:
        values - Values.
        p - Probability for the quantile to compute.
        Returns:
        the quantile
        Throws:
        IllegalArgumentException - if the probability p is not in the range [0, 1]; or if the values contain NaN and the configuration is NaNPolicy.ERROR
        See Also:
        evaluate(double[], double...), with(NaNPolicy)
      • evaluateRange

        public double evaluateRange​(double[] values,
                                    int from,
                                    int to,
                                    double p)
        Evaluate the p-th quantile of the specified range of values.

        Note: This method may partially sort the input values if not configured to copy the input data.

        Performance

        This method is not recommended for repeated calls that compute different quantiles of the same values. Use the evaluateRange(double[], int, int, double...) method instead, which provides better performance.

        Parameters:
        values - Values.
        from - Inclusive start of the range.
        to - Exclusive end of the range.
        p - Probability for the quantile to compute.
        Returns:
        the quantile
        Throws:
        IllegalArgumentException - if the probability p is not in the range [0, 1]; or if the values contain NaN and the configuration is NaNPolicy.ERROR
        IndexOutOfBoundsException - if the sub-range is out of bounds
        Since:
        1.2
        See Also:
        evaluateRange(double[], int, int, double...), with(NaNPolicy)
      • evaluate

        public double[] evaluate​(double[] values,
                                 double... p)
        Evaluate the p-th quantiles of the values.

        Note: This method may partially sort the input values if not configured to copy the input data.

        Parameters:
        values - Values.
        p - Probabilities for the quantiles to compute.
        Returns:
        the quantiles
        Throws:
        IllegalArgumentException - if any probability p is not in the range [0, 1]; no probabilities are specified; or if the values contain NaN and the configuration is NaNPolicy.ERROR
        See Also:
        with(NaNPolicy)
      • evaluateRange

        public double[] evaluateRange​(double[] values,
                                      int from,
                                      int to,
                                      double... p)
        Evaluate the p-th quantiles of the specified range of values.

        Note: This method may partially sort the input values if not configured to copy the input data.

        Parameters:
        values - Values.
        from - Inclusive start of the range.
        to - Exclusive end of the range.
        p - Probabilities for the quantiles to compute.
        Returns:
        the quantiles
        Throws:
        IllegalArgumentException - if any probability p is not in the range [0, 1]; no probabilities are specified; or if the values contain NaN and the configuration is NaNPolicy.ERROR
        IndexOutOfBoundsException - if the sub-range is out of bounds
        Since:
        1.2
        See Also:
        with(NaNPolicy)
      • evaluate

        public double evaluate​(int[] values,
                               double p)
        Evaluate the p-th quantile of the values.

        Note: This method may partially sort the input values if not configured to copy the input data.

        Performance

        This method is not recommended for repeated calls that compute different quantiles of the same values. Use the evaluate(int[], double...) method instead, which provides better performance.

        Parameters:
        values - Values.
        p - Probability for the quantile to compute.
        Returns:
        the quantile
        Throws:
        IllegalArgumentException - if the probability p is not in the range [0, 1]
        See Also:
        evaluate(int[], double...)
      • evaluateRange

        public double evaluateRange​(int[] values,
                                    int from,
                                    int to,
                                    double p)
        Evaluate the p-th quantile of the specified range of values.

        Note: This method may partially sort the input values if not configured to copy the input data.

        Performance

        This method is not recommended for repeated calls that compute different quantiles of the same values. Use the evaluateRange(int[], int, int, double...) method instead, which provides better performance.

        Parameters:
        values - Values.
        from - Inclusive start of the range.
        to - Exclusive end of the range.
        p - Probability for the quantile to compute.
        Returns:
        the quantile
        Throws:
        IllegalArgumentException - if the probability p is not in the range [0, 1]
        IndexOutOfBoundsException - if the sub-range is out of bounds
        Since:
        1.2
        See Also:
        evaluateRange(int[], int, int, double...)
      • evaluate

        public double[] evaluate​(int[] values,
                                 double... p)
        Evaluate the p-th quantiles of the values.

        Note: This method may partially sort the input values if not configured to copy the input data.

        Parameters:
        values - Values.
        p - Probabilities for the quantiles to compute.
        Returns:
        the quantiles
        Throws:
        IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
      • evaluateRange

        public double[] evaluateRange​(int[] values,
                                      int from,
                                      int to,
                                      double... p)
        Evaluate the p-th quantiles of the specified range of values.

        Note: This method may partially sort the input values if not configured to copy the input data.

        Parameters:
        values - Values.
        from - Inclusive start of the range.
        to - Exclusive end of the range.
        p - Probabilities for the quantiles to compute.
        Returns:
        the quantiles
        Throws:
        IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
        IndexOutOfBoundsException - if the sub-range is out of bounds
        Since:
        1.2
      • evaluate

        public StatisticResult evaluate​(long[] values,
                                        double p)
        Evaluate the p-th quantile of the values.

        Note: This method may partially sort the input values if not configured to copy the input data.

        Performance

        This method is not recommended for repeated calls that compute different quantiles of the same values. Use the evaluate(long[], double...) method instead, which provides better performance.

        Parameters:
        values - Values.
        p - Probability for the quantile to compute.
        Returns:
        the quantile
        Throws:
        IllegalArgumentException - if the probability p is not in the range [0, 1]
        Since:
        1.3
        See Also:
        evaluate(long[], double...)
      • evaluateRange

        public StatisticResult evaluateRange​(long[] values,
                                             int from,
                                             int to,
                                             double p)
        Evaluate the p-th quantile of the specified range of values.

        Note: This method may partially sort the input values if not configured to copy the input data.

        Performance

        This method is not recommended for repeated calls that compute different quantiles of the same values. Use the evaluateRange(long[], int, int, double...) method instead, which provides better performance.

        Parameters:
        values - Values.
        from - Inclusive start of the range.
        to - Exclusive end of the range.
        p - Probability for the quantile to compute.
        Returns:
        the quantile
        Throws:
        IllegalArgumentException - if the probability p is not in the range [0, 1]
        IndexOutOfBoundsException - if the sub-range is out of bounds
        Since:
        1.3
        See Also:
        evaluateRange(long[], int, int, double...)
      • evaluate

        public StatisticResult[] evaluate​(long[] values,
                                          double... p)
        Evaluate the p-th quantiles of the values.

        Note: This method may partially sort the input values if not configured to copy the input data.

        Parameters:
        values - Values.
        p - Probabilities for the quantiles to compute.
        Returns:
        the quantiles
        Throws:
        IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
        Since:
        1.3
      • evaluateRange

        public StatisticResult[] evaluateRange​(long[] values,
                                               int from,
                                               int to,
                                               double... p)
        Evaluate the p-th quantiles of the specified range of values.

        Note: This method may partially sort the input values if not configured to copy the input data.

        Parameters:
        values - Values.
        from - Inclusive start of the range.
        to - Exclusive end of the range.
        p - Probabilities for the quantiles to compute.
        Returns:
        the quantiles
        Throws:
        IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
        IndexOutOfBoundsException - if the sub-range is out of bounds
        Since:
        1.3
      • evaluate

        public double evaluate​(int n,
                               IntToDoubleFunction values,
                               double p)
        Evaluate the p-th quantile of the sorted values provided as a double.

        This method can be used when the values of known size are already sorted. It can be used for primitive types not supported by other evaluation methods. Numeric types byte, short and float can be converted to type double without loss of precision.

        
         short[] x = ...
         Arrays.sort(x);
         double q = Quantile.withDefaults().evaluate(x.length, i -> x[i], 0.05);
         

        If the sorted array is a long datatype this method can lose information about the precision of the quantiles due to primitive type conversion. Use the method evaluateAsLong(int, IntToLongFunction, double) to compute the long quantile result.

        Parameters:
        n - Size of the values.
        values - Values function.
        p - Probability for the quantile to compute.
        Returns:
        the quantile
        Throws:
        IllegalArgumentException - if n < 0; or if the probability p is not in the range [0, 1].
        See Also:
        evaluateAsLong(int, IntToLongFunction, double)
      • evaluate

        public double[] evaluate​(int n,
                                 IntToDoubleFunction values,
                                 double... p)
        Evaluate the p-th quantiles of the sorted values provided as a double.

        This method can be used when the values of known size are already sorted. It can be used for primitive types not supported by other evaluation methods. Numeric types byte, short and float can be converted to type double without loss of precision.

        
         short[] x = ...
         Arrays.sort(x);
         double[] q = Quantile.withDefaults().evaluate(x.length, i -> x[i], 0.25, 0.5, 0.75);
         

        If the sorted array is a long datatype this method can lose information about the precision of the quantiles due to primitive type conversion. Use the method evaluateAsLong(int, IntToLongFunction, double...) to compute the long quantile result.

        Parameters:
        n - Size of the values.
        values - Values function.
        p - Probabilities for the quantiles to compute.
        Returns:
        the quantiles
        Throws:
        IllegalArgumentException - if n < 0; if any probability p is not in the range [0, 1]; or no probabilities are specified.
        See Also:
        evaluateAsLong(int, IntToLongFunction, double...)
      • evaluateAsLong

        public StatisticResult evaluateAsLong​(int n,
                                              IntToLongFunction values,
                                              double p)
        Evaluate the p-th quantile of the sorted values provided as a long.

        This method can be used when the values of known size are already sorted.

        
         long[] x = ...
         Arrays.sort(x);
         StatisticResult q = Quantile.withDefaults()
                                     .evaluateAsLong(x.length, i -> x[i], 0.05);
         

        Note: It is not recommended to sort data for use only in the quantile computation. The evaluate(long[], double) method will partially sort the data as required and in most cases will be more efficient.

        Parameters:
        n - Size of the values.
        values - Values function.
        p - Probability for the quantile to compute.
        Returns:
        the quantile
        Throws:
        IllegalArgumentException - if n < 0; or if the probability p is not in the range [0, 1].
        Since:
        1.3
      • evaluateAsLong

        public StatisticResult[] evaluateAsLong​(int n,
                                                IntToLongFunction values,
                                                double... p)
        Evaluate the p-th quantiles of the sorted values provided as a long.

        This method can be used when the values of known size are already sorted.

        
         long[] x = ...
         Arrays.sort(x);
         StatisticResult[] q = Quantile.withDefaults()
                                       .evaluateAsLong(x.length, i -> x[i], 0.25, 0.5, 0.75);
         

        Note: It is not recommended to sort data for use only in the quantile computation. The evaluate(long[], double...) method will partially sort the data as required and in most cases will be more efficient.

        Parameters:
        n - Size of the values.
        values - Values function.
        p - Probabilities for the quantiles to compute.
        Returns:
        the quantiles
        Throws:
        IllegalArgumentException - if n < 0; if any probability p is not in the range [0, 1]; or no probabilities are specified.
        Since:
        1.3