jcuda.jcudpp
Class JCudpp

java.lang.Object
  extended by jcuda.jcudpp.JCudpp

public class JCudpp
extends java.lang.Object

Java bindings for the public interface of CUDPP, the CUDA Data Parallel Primitives Library.
(http://code.google.com/p/cudpp)

Most comments are taken from the CUDPP library documentation.


Field Summary
static int CUDPP_HASH_KEY_NOT_FOUND
          The value that indicates that a hash key was not found
 
Method Summary
static int cudppCompact(CUDPPHandle planHandle, Pointer d_out, Pointer d_numValidElements, Pointer d_in, Pointer d_isValid, long numElements)
          Given an array d_in and an array of 1/0 flags in d_isValid, returns in d_out a compacted array containing only the "valid" values from d_in.
static int cudppCreate(CUDPPHandle theCudpp)
          Creates an instance of the CUDPP library, and returns a handle.
static int cudppDestroy(CUDPPHandle theCudpp)
          Destroys an instance of the CUDPP library given its handle.
static int cudppDestroyHashTable(CUDPPHandle cudppHandle, CUDPPHandle plan)
          Destroys a hash table given its handle.
static int cudppDestroyPlan(CUDPPHandle plan)
          Destroy a CUDPP Plan.
static int cudppDestroySparseMatrix(CUDPPHandle sparseMatrixHandle)
          Destroy a CUDPP Sparse Matrix Object.
static int cudppHashInsert(CUDPPHandle plan, Pointer d_keys, Pointer d_vals, long num)
          Inserts keys and values into a CUDPP hash table.
static int cudppHashRetrieve(CUDPPHandle plan, Pointer d_keys, Pointer d_vals, long num)
          Retrieves values, given keys, from a CUDPP hash table.
static int cudppHashTable(CUDPPHandle cudppHandle, CUDPPHandle plan, CUDPPHashTableConfig config)
          Creates a CUDPP hash table in GPU memory given an input hash table configuration; returns the plan for that hash table.
static int cudppMultiScan(CUDPPHandle planHandle, Pointer d_out, Pointer d_in, long numElements, long numRows)
          Performs numRows parallel scan operations of numElements each on its input (d_in) and places the output in d_out, with the scan parameters set by config.
static int cudppMultivalueHashGetAllValues(CUDPPHandle plan, Pointer d_vals)
          Retrieves a pointer to the values array in a multivalue hash table.
static int cudppMultivalueHashGetValuesSize(CUDPPHandle plan, int[] size)
          Retrieves the size of the values array in a multivalue hash table.
static int cudppPlan(CUDPPHandle cudppHandle, CUDPPHandle planHandle, CUDPPConfiguration config, long n, long rows, long rowPitch)
          Create a CUDPP plan.
static int cudppRand(CUDPPHandle planHandle, Pointer d_out, long numElements)
          Rand puts numElements random 32-bit elements into d_out.
static int cudppRandSeed(CUDPPHandle planHandle, int seed)
          Sets the seed used for rand.
static int cudppReduce(CUDPPHandle planHandle, Pointer d_out, Pointer d_in, long numElements)
          Reduces an array to a single element using a binary associative operator.
static int cudppScan(CUDPPHandle planHandle, Pointer d_out, Pointer d_in, long numElements)
          Performs a scan operation of numElements on its input in GPU memory (d_in) and places the output in GPU memory (d_out), with the scan parameters specified in the plan pointed to by planHandle.
static int cudppSegmentedScan(CUDPPHandle planHandle, Pointer d_out, Pointer d_idata, Pointer d_iflags, long numElements)
          Performs a segmented scan operation of numElements on its input in GPU memory (d_idata) and places the output in GPU memory (d_out), with the scan parameters specified in the plan pointed to by planHandle.
static int cudppSort(CUDPPHandle planHandle, Pointer d_keys, Pointer d_values, long numElements)
          Sorts key-value pairs or keys only.
static int cudppSparseMatrix(CUDPPHandle cudppHandle, CUDPPHandle sparseMatrixHandle, CUDPPConfiguration config, long numNonZeroElements, long numRows, Pointer A, Pointer h_rowIndices, Pointer h_indices)
          Create a CUDPP Sparse Matrix Object.
static int cudppSparseMatrixVectorMultiply(CUDPPHandle sparseMatrixHandle, Pointer d_y, Pointer d_x)
          Perform matrix-vector multiply y = A*x for arbitrary sparse matrix A and vector x.
static int cudppTridiagonal(CUDPPHandle planHandle, Pointer a, Pointer b, Pointer c, Pointer d, Pointer x, int systemSize, int numSystems)
          Solves tridiagonal linear systems.
static void initialize()
          Initializes the native library.
static void setExceptionsEnabled(boolean enabled)
          Enables or disables exceptions.
static void setLogLevel(LogLevel logLevel)
          Set the specified log level for the JCudpp library.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CUDPP_HASH_KEY_NOT_FOUND

public static final int CUDPP_HASH_KEY_NOT_FOUND
The value that indicates that a hash key was not found

See Also:
Constant Field Values
Method Detail

setLogLevel

public static void setLogLevel(LogLevel logLevel)
Set the specified log level for the JCudpp library.

Currently supported log levels:
LOG_QUIET: Never print anything
LOG_ERROR: Print error messages
LOG_TRACE: Print a trace of all native function calls

Parameters:
logLevel - The log level to use.

setExceptionsEnabled

public static void setExceptionsEnabled(boolean enabled)
Enables or disables exceptions. By default, the methods of this class only return the CUDPPResult error code from the underlying CUDA function. If exceptions are enabled, a CudaException with a detailed error message will be thrown if a method is about to return a result code that is not CUDPPResult.CUDPP_SUCCESS

Parameters:
enabled - Whether exceptions are enabled

initialize

public static void initialize()
Initializes the native library. Note that this method does not have to be called explicitly by the user of the library: The library will automatically be initialized with the first function call.


cudppCreate

public static int cudppCreate(CUDPPHandle theCudpp)
Creates an instance of the CUDPP library, and returns a handle.

cudppCreate() must be called before any other CUDPP function. In a multi-GPU application that uses multiple CUDA contexts, cudppCreate() must be called once for each CUDA context. Each call returns a different handle, because each CUDA context (and the host thread that owns it) must use a separate instance of the CUDPP library.
Parameters:
  [in,out]    theCudpp    a pointer to the CUDPPHandle for the created CUDPP instance.
 

Parameters:
theCudpp - The handle
Returns:
CUDPPResult indicating success or error condition

cudppDestroy

public static int cudppDestroy(CUDPPHandle theCudpp)
Destroys an instance of the CUDPP library given its handle.

cudppDestroy() should be called once for each handle created using cudppCreate(), to ensure proper resource cleanup of all library instances.

Parameters:
 [in]    theCudpp    the handle to the CUDPP instance to destroy.
 

Parameters:
theCudpp - The handle
Returns:
CUDPPResult indicating success or error condition

cudppPlan

public static int cudppPlan(CUDPPHandle cudppHandle,
                            CUDPPHandle planHandle,
                            CUDPPConfiguration config,
                            long n,
                            long rows,
                            long rowPitch)
Create a CUDPP plan.

A plan is a data structure containing state and intermediate storage space that CUDPP uses to execute algorithms on data. A plan is created by passing to cudppPlan() a CUDPPConfiguration that specifies the algorithm, operator, datatype, and options. The size of the data must also be passed to cudppPlan(), in the numElements, numRows, and rowPitch arguments. These sizes are used to allocate internal storage space at the time the plan is created. The CUDPP planner may use the sizes, options, and information about the present hardware to choose optimal settings.

Note that numElements is the maximum size of the array to be processed with this plan. That means that a plan may be re-used to process (for example, to sort or scan) smaller arrays.

Parameters:
     [in]    cudppHandle A handle to an instance of the CUDPP library used for resource management 
     [out]   planHandle  A pointer to an opaque handle to the internal plan
     [in]    config  The configuration struct specifying algorithm and options
     [in]    numElements     The maximum number of elements to be processed
     [in]    numRows     The number of rows (for 2D operations) to be processed
     [in]    rowPitch    The pitch of the rows of input data, in elements
 


cudppDestroyPlan

public static int cudppDestroyPlan(CUDPPHandle plan)
Destroy a CUDPP Plan.

Deletes the plan referred to by the given handle and all associated internal storage.

Parameters:
      [in]    plan  The CUDPPHandle to the plan to be destroyed
 


cudppScan

public static int cudppScan(CUDPPHandle planHandle,
                            Pointer d_out,
                            Pointer d_in,
                            long numElements)
Performs a scan operation of numElements on its input in GPU memory (d_in) and places the output in GPU memory (d_out), with the scan parameters specified in the plan pointed to by planHandle.
The input to a scan operation is an input array, a binary associative operator (like + or max), and an identity element for that operator (+'s identity is 0). The output of scan is the same size as its input. Informally, the output at each element is the result of the operator applied to each input that comes before it. For instance, the output of sum-scan at each element is the sum of all the input elements before that input.

More formally, for associative operator +, out[i] = in[0] + in[1] + ... + in[i-1].

CUDPP supports "exclusive" and "inclusive" scans. For the ADD operator, an exclusive scan computes the sum of all input elements before the current element, while an inclusive scan computes the sum of all input elements up to and including the current element.

Before calling scan, create an internal plan using cudppPlan().
After you are finished with the scan plan, clean up with cudppDestroyPlan().

Parameters:
     [in]    planHandle  Handle to plan for this scan
     [out]   d_out   output of scan, in GPU memory
     [in]    d_in    input to scan, in GPU memory
     [in]    numElements     number of elements to scan
 

See Also:
cudppPlan(jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPConfiguration, long, long, long), cudppDestroyPlan(jcuda.jcudpp.CUDPPHandle)
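As an illustration, the following plain-Java CPU reference (a hypothetical helper class, not part of JCudpp) shows the exclusive and inclusive sum-scan semantics that cudppScan computes on the GPU:

```java
import java.util.Arrays;

// CPU reference for the ADD operator, illustrating the semantics of cudppScan.
// The real cudppScan operates on GPU memory via jcuda.Pointer.
public class ScanReference {

    // Exclusive sum-scan: out[i] = in[0] + ... + in[i-1], with out[0] = 0 (the identity).
    public static int[] exclusiveSumScan(int[] in) {
        int[] out = new int[in.length];
        int sum = 0;
        for (int i = 0; i < in.length; i++) {
            out[i] = sum;   // sum of everything before element i
            sum += in[i];
        }
        return out;
    }

    // Inclusive sum-scan: out[i] = in[0] + ... + in[i].
    public static int[] inclusiveSumScan(int[] in) {
        int[] out = new int[in.length];
        int sum = 0;
        for (int i = 0; i < in.length; i++) {
            sum += in[i];   // includes element i itself
            out[i] = sum;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] in = { 3, 1, 7, 0, 4 };
        System.out.println(Arrays.toString(exclusiveSumScan(in))); // [0, 3, 4, 11, 11]
        System.out.println(Arrays.toString(inclusiveSumScan(in))); // [3, 4, 11, 11, 15]
    }
}
```

Whether the exclusive or inclusive variant is computed is selected by options in the CUDPPConfiguration passed to cudppPlan().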

cudppMultiScan

public static int cudppMultiScan(CUDPPHandle planHandle,
                                 Pointer d_out,
                                 Pointer d_in,
                                 long numElements,
                                 long numRows)
Performs numRows parallel scan operations of numElements each on its input (d_in) and places the output in d_out, with the scan parameters set by config. Exactly like cudppScan except that it runs on multiple rows in parallel.

Note that to achieve good performance with cudppMultiScan, the device arrays passed to it should be allocated so that all rows are aligned to the correct boundaries for the architecture the app is running on. The easiest way to do this is to use cudaMallocPitch() to allocate a 2D array on the device, and to pass the device pitch it returns to cudppPlan() via the rowPitch parameter.

Parameters:
     [in]    planHandle  handle to CUDPPScanPlan
     [out]   d_out   output of scan, in GPU memory
     [in]    d_in    input to scan, in GPU memory
     [in]    numElements     number of elements (per row) to scan
     [in]    numRows     number of rows to scan in parallel
 

See Also:
cudppScan(jcuda.jcudpp.CUDPPHandle, jcuda.Pointer, jcuda.Pointer, long), cudppPlan(jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPConfiguration, long, long, long)

cudppSegmentedScan

public static int cudppSegmentedScan(CUDPPHandle planHandle,
                                     Pointer d_out,
                                     Pointer d_idata,
                                     Pointer d_iflags,
                                     long numElements)
Performs a segmented scan operation of numElements on its input in GPU memory (d_idata) and places the output in GPU memory (d_out), with the scan parameters specified in the plan pointed to by planHandle.

The input to a segmented scan operation is an input array of data, an input array of flags which demarcate segments, a binary associative operator (like + or max), and an identity element for that operator (+'s identity is 0). The array of flags is the same length as the input, with 1 marking the first element of a segment and 0 otherwise. The output of segmented scan is the same size as its input. Informally, the output at each element is the result of the operator applied to each input that comes before it in that segment. For instance, the output of segmented sum-scan at each element is the sum of all the input elements before that input in that segment.

More formally, for associative operator +, out[i] = in[k] + in[k+1] + ... + in[i-1], where k is the index of the first element of the segment in which i lies.

We support both "exclusive" and "inclusive" variants. For a segmented sum-scan, the exclusive variant computes the sum of all input elements before the current element in that segment, while the inclusive variant computes the sum of all input elements up to and including the current element, in that segment.

Before calling segmented scan, create an internal plan using cudppPlan().
After you are finished with the scan plan, clean up with cudppDestroyPlan().
Parameters:
     [in]    planHandle  Handle to plan for this scan
     [out]   d_out   output of segmented scan, in GPU memory
     [in]    d_idata     input data to segmented scan, in GPU memory
     [in]    d_iflags    input flags to segmented scan, in GPU memory
     [in]    numElements     number of elements to perform segmented scan on
 

See Also:
cudppPlan(jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPConfiguration, long, long, long), cudppDestroyPlan(jcuda.jcudpp.CUDPPHandle)
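The segment-restart behavior can be illustrated with a plain-Java CPU reference (a hypothetical helper class, not part of JCudpp); the real cudppSegmentedScan performs this in parallel on the GPU:

```java
import java.util.Arrays;

// CPU reference for exclusive segmented sum-scan, illustrating cudppSegmentedScan.
// flags[i] == 1 marks the first element of a segment.
public class SegmentedScanReference {

    public static int[] exclusiveSegmentedSumScan(int[] in, int[] flags) {
        int[] out = new int[in.length];
        int sum = 0;
        for (int i = 0; i < in.length; i++) {
            if (flags[i] == 1) {
                sum = 0;    // a new segment restarts the scan at the identity
            }
            out[i] = sum;   // sum of the preceding elements of this segment
            sum += in[i];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] in    = { 1, 2, 3, 4, 5 };
        int[] flags = { 1, 0, 1, 0, 0 };
        System.out.println(Arrays.toString(exclusiveSegmentedSumScan(in, flags))); // [0, 1, 0, 3, 7]
    }
}
```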

cudppCompact

public static int cudppCompact(CUDPPHandle planHandle,
                               Pointer d_out,
                               Pointer d_numValidElements,
                               Pointer d_in,
                               Pointer d_isValid,
                               long numElements)
Given an array d_in and an array of 1/0 flags in d_isValid, returns in d_out a compacted array containing only the "valid" values from d_in.

Takes as input an array of elements in GPU memory (d_in) and an equal-sized unsigned int array in GPU memory (d_isValid) that indicates which of those input elements are valid. The output is a packed array, in GPU memory, of only those elements marked as valid.
Internally, uses cudppScan.

Example:
  d_in      = [ a b c d e f ]
  d_isValid = [ 1 0 1 1 0 1 ]
  d_out     = [ a c d f ]
 

Parameters:
     [in]    planHandle  handle to CUDPPCompactPlan
     [out]   d_out   compacted output
     [out]   d_numValidElements  set during cudppCompact; holds the number
             of elements in d_in that were marked valid in d_isValid
     [in]    d_in    input to compact
     [in]    d_isValid   which elements in d_in are valid
     [in]    numElements     number of elements in d_in
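The compaction semantics can be sketched with a plain-Java CPU reference (a hypothetical helper class, not part of JCudpp); the real cudppCompact operates on GPU memory:

```java
import java.util.Arrays;

// CPU reference illustrating the semantics of cudppCompact:
// keep only the elements of d_in whose flag in d_isValid is 1.
public class CompactReference {

    public static char[] compact(char[] in, int[] isValid) {
        int count = 0;
        for (int flag : isValid) {
            if (flag == 1) count++;
        }
        // count corresponds to the d_numValidElements output
        char[] out = new char[count];
        int j = 0;
        for (int i = 0; i < in.length; i++) {
            if (isValid[i] == 1) {
                out[j++] = in[i];   // valid elements keep their relative order
            }
        }
        return out;
    }

    public static void main(String[] args) {
        char[] in   = { 'a', 'b', 'c', 'd', 'e', 'f' };
        int[] valid = { 1, 0, 1, 1, 0, 1 };
        System.out.println(Arrays.toString(compact(in, valid))); // [a, c, d, f]
    }
}
```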
 


cudppReduce

public static int cudppReduce(CUDPPHandle planHandle,
                              Pointer d_out,
                              Pointer d_in,
                              long numElements)
Reduces an array to a single element using a binary associative operator.

For example, if the operator is CUDPP_ADD, then:
 d_in    = [ 3 2 0 1 -4 5 0 -1 ]
 d_out   = [ 6 ]
 
If the operator is CUDPP_MIN, then:
 d_in    = [ 3 2 0 1 -4 5 0 -1 ]
 d_out   = [ -4 ]
 
Limits: numElements must be at least 1, and is currently limited only by the addressable memory in CUDA (and the output accuracy is limited by numerical precision).

Parameters:
 [in]    planHandle  handle to CUDPPReducePlan
 [out]   d_out   Output of reduce (a single element) in GPU memory. Must be a pointer to an array of at least a single element.
 [in]    d_in    Input array to reduce in GPU memory. Must be a pointer to an array of at least numElements elements.
 [in]    numElements the number of elements to reduce.
 

Returns:
CUDPPResult indicating success or error condition
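The two example reductions above can be reproduced with a plain-Java CPU reference (a hypothetical helper class, not part of JCudpp); the operator is selected in the CUDPPConfiguration of the plan:

```java
// CPU reference illustrating cudppReduce with the CUDPP_ADD and CUDPP_MIN operators.
public class ReduceReference {

    public static int reduceAdd(int[] in) {
        int acc = 0;                    // identity of +
        for (int v : in) acc += v;
        return acc;
    }

    public static int reduceMin(int[] in) {
        int acc = Integer.MAX_VALUE;    // identity of min
        for (int v : in) acc = Math.min(acc, v);
        return acc;
    }

    public static void main(String[] args) {
        int[] in = { 3, 2, 0, 1, -4, 5, 0, -1 };
        System.out.println(reduceAdd(in)); // 6
        System.out.println(reduceMin(in)); // -4
    }
}
```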

cudppSort

public static int cudppSort(CUDPPHandle planHandle,
                            Pointer d_keys,
                            Pointer d_values,
                            long numElements)
Sorts key-value pairs or keys only.

Takes as input an array of keys in GPU memory (d_keys) and an optional array of corresponding values, and outputs sorted arrays of keys and (optionally) values in place. Key-value and key-only sort is selected through the configuration of the plan, using the options CUDPP_OPTION_KEYS_ONLY and CUDPP_OPTION_KEY_VALUE_PAIRS.

Supported key types are CUDPP_FLOAT and CUDPP_UINT. Values can be any 32-bit type (internally, values are treated only as a payload and cast to unsigned int).

Parameters:
    [in]    planHandle  handle to CUDPPSortPlan
    [out]   d_keys  keys by which key-value pairs will be sorted
    [in]    d_values    values to be sorted
    [in]    numElements     number of elements in d_keys and d_values
 

See Also:
cudppPlan(jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPConfiguration, long, long, long), CUDPPConfiguration, CUDPPAlgorithm
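The key-value sort semantics can be illustrated with a plain-Java CPU reference (a hypothetical helper class, not part of JCudpp): keys are sorted, and each value travels with its key as an opaque payload, just as cudppSort treats values.

```java
import java.util.Arrays;
import java.util.Comparator;

// CPU reference illustrating the key-value sort performed by cudppSort.
public class SortReference {

    public static void sortPairs(int[] keys, int[] values) {
        // Sort an index permutation by key, then apply it to both arrays.
        Integer[] order = new Integer[keys.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingInt(i -> keys[i]));
        int[] k = keys.clone();
        int[] v = values.clone();
        for (int i = 0; i < order.length; i++) {
            keys[i]   = k[order[i]];
            values[i] = v[order[i]];   // value follows its key
        }
    }

    public static void main(String[] args) {
        int[] keys   = { 30, 10, 20 };
        int[] values = { 3, 1, 2 };
        sortPairs(keys, values);
        System.out.println(Arrays.toString(keys));   // [10, 20, 30]
        System.out.println(Arrays.toString(values)); // [1, 2, 3]
    }
}
```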

cudppSparseMatrix

public static int cudppSparseMatrix(CUDPPHandle cudppHandle,
                                    CUDPPHandle sparseMatrixHandle,
                                    CUDPPConfiguration config,
                                    long numNonZeroElements,
                                    long numRows,
                                    Pointer A,
                                    Pointer h_rowIndices,
                                    Pointer h_indices)
Create a CUDPP Sparse Matrix Object.

The sparse matrix plan is a data structure containing state and intermediate storage space that CUDPP uses to perform sparse matrix-dense vector multiply. This plan is created by passing to cudppSparseMatrix() a CUDPPConfiguration that specifies the algorithm (sparse matrix-dense vector multiply) and datatype, along with the sparse matrix itself in CSR format. The number of non-zero elements in the sparse matrix must also be passed as numNonZeroElements. This is used to allocate internal storage space at the time the sparse matrix plan is created.

Parameters:
     [in]    cudppHandle A handle to an instance of the CUDPP library used for resource management 
     [out]   sparseMatrixHandle  A pointer to an opaque handle to the sparse matrix object
     [in]    config  The configuration struct specifying algorithm and options
     [in]    numNonZeroElements  The number of non zero elements in the sparse matrix
     [in]    numRows     This is the number of rows in y, x and A for y = A * x
     [in]    A   The matrix data
     [in]    h_rowIndices    An array containing the index of the start of each row in A
     [in]    h_indices   An array containing the index of each nonzero element in A
 


cudppDestroySparseMatrix

public static int cudppDestroySparseMatrix(CUDPPHandle sparseMatrixHandle)
Destroy a CUDPP Sparse Matrix Object.

Deletes the sparse matrix data and plan referred to by sparseMatrixHandle and all associated internal storage.

Parameters:
     [in]    sparseMatrixHandle  The CUDPPHandle to the matrix object to be destroyed
 


cudppSparseMatrixVectorMultiply

public static int cudppSparseMatrixVectorMultiply(CUDPPHandle sparseMatrixHandle,
                                                  Pointer d_y,
                                                  Pointer d_x)
Perform matrix-vector multiply y = A*x for arbitrary sparse matrix A and vector x.

Given a matrix object handle (which has been initialized using cudppSparseMatrix()), this function multiplies the input vector d_x by the matrix referred to by sparseMatrixHandle, returning the result in d_y.

Parameters:
         sparseMatrixHandle  Handle to a sparse matrix object created
                             with cudppSparseMatrix()
         d_y     The output vector, y
         d_x     The input vector, x
 

See Also:
cudppSparseMatrix(jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPConfiguration, long, long, jcuda.Pointer, jcuda.Pointer, jcuda.Pointer), cudppDestroySparseMatrix(jcuda.jcudpp.CUDPPHandle)
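The CSR data layout expected by cudppSparseMatrix() and the product computed here can be sketched with a plain-Java CPU reference (a hypothetical helper class, not part of JCudpp):

```java
import java.util.Arrays;

// CPU reference for y = A * x with A in CSR format, illustrating the data
// layout used by cudppSparseMatrix / cudppSparseMatrixVectorMultiply:
// rowIndices[r] is the offset of row r's first nonzero in a,
// indices[k] is the column of the k-th nonzero, and a[k] is its value.
public class SpmvReference {

    public static float[] multiply(float[] a, int[] rowIndices, int[] indices,
                                   float[] x, int numRows) {
        float[] y = new float[numRows];
        for (int r = 0; r < numRows; r++) {
            int end = (r + 1 < numRows) ? rowIndices[r + 1] : a.length;
            float sum = 0f;
            for (int k = rowIndices[r]; k < end; k++) {
                sum += a[k] * x[indices[k]];   // accumulate row r's nonzeros
            }
            y[r] = sum;
        }
        return y;
    }

    public static void main(String[] args) {
        // A = [ 1 0 2 ]
        //     [ 0 3 0 ]
        //     [ 4 0 5 ]
        float[] a        = { 1, 2, 3, 4, 5 };
        int[] rowIndices = { 0, 2, 3 };
        int[] indices    = { 0, 2, 1, 0, 2 };
        float[] x        = { 1, 1, 1 };
        System.out.println(Arrays.toString(multiply(a, rowIndices, indices, x, 3)));
        // [3.0, 3.0, 9.0]
    }
}
```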

cudppRand

public static int cudppRand(CUDPPHandle planHandle,
                            Pointer d_out,
                            long numElements)
Rand puts numElements random 32-bit elements into d_out.

Outputs numElements random values to d_out. d_out must be of type unsigned int, allocated in device memory.

The algorithm used for the random number generation is stored in planHandle. Depending on the specification of the pseudo-random number generator (PRNG), the generator may have one or more seeds. To set the seed, use cudppRandSeed().

Currently only the MD5 PRNG is supported. We may provide more rand routines in the future.

Parameters:
    [in]    planHandle  Handle to plan for rand
    [in]    numElements     number of elements in d_out.
    [out]   d_out   output of rand, in GPU memory. Should be an array of unsigned integers.
 

See Also:
CUDPPConfiguration, CUDPPAlgorithm

cudppRandSeed

public static int cudppRandSeed(CUDPPHandle planHandle,
                                int seed)
Sets the seed used for rand.

The seed is crucial to any random number generator as it allows a sequence of random numbers to be replicated. Since there may be multiple different rand algorithms in CUDPP, cudppRandSeed uses planHandle to determine which seed to set. Each rand algorithm has its own unique set of seeds depending on what the algorithm needs.

Parameters:
    [in]    planHandle  the handle to the plan which specifies which rand seed to set
    [in]    seed    the value which the internal cudpp seed will be set to


cudppTridiagonal

public static int cudppTridiagonal(CUDPPHandle planHandle,
                                   Pointer a,
                                   Pointer b,
                                   Pointer c,
                                   Pointer d,
                                   Pointer x,
                                   int systemSize,
                                   int numSystems)
Solves tridiagonal linear systems.

The solver uses a hybrid CR-PCR algorithm described in our papers "Fast Tridiagonal Solvers on the GPU" and "A Hybrid Method for Solving Tridiagonal Systems on the GPU" (see the References bibliography). Please refer to the papers for a complete description of the basic CR (Cyclic Reduction) and PCR (Parallel Cyclic Reduction) algorithms and their hybrid variants.

Both float and double data types are supported.
Both power-of-two and non-power-of-two system sizes are supported.
The maximum system size could be limited by the maximum number of threads of a CUDA block, the number of registers per multiprocessor, and the amount of shared memory available. For example, on the GTX 280 GPU, the maximum system size is 512 for the float datatype, and 256 for the double datatype, which is limited by the size of shared memory in this case.
The maximum number of systems is 65535, which is the maximum number of one-dimensional blocks that can be launched in a single kernel call. Users can launch the kernel multiple times to solve more systems if required.

Parameters:
 [in]    planHandle  Handle to plan for tridiagonal solver
 [in]    a   Lower diagonal
 [in]    b   Main diagonal
 [in]    c   Upper diagonal
 [in]    d   Right hand side
 [out]   x   Solution vector
 [in]    systemSize  The size of the linear system
 [in]    numSystems  The number of systems to be solved
 

Returns:
CUDPPResult indicating success or error condition
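The system being solved can be stated with a serial CPU reference, the classic Thomas algorithm (a hypothetical helper class, not part of JCudpp; CUDPP itself uses the parallel hybrid CR-PCR algorithm on the GPU). For each system of size n:
b[0]*x[0] + c[0]*x[1] = d[0]; a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for 0 < i < n-1; a[n-1]*x[n-2] + b[n-1]*x[n-1] = d[n-1].

```java
import java.util.Arrays;

// Serial Thomas-algorithm reference showing the tridiagonal system
// that cudppTridiagonal solves (one system; a[0] and c[n-1] are unused).
public class TridiagonalReference {

    public static double[] solve(double[] a, double[] b, double[] c, double[] d) {
        int n = b.length;
        double[] cp = new double[n];   // modified upper diagonal
        double[] dp = new double[n];   // modified right-hand side
        cp[0] = c[0] / b[0];
        dp[0] = d[0] / b[0];
        for (int i = 1; i < n; i++) {  // forward elimination
            double m = b[i] - a[i] * cp[i - 1];
            cp[i] = c[i] / m;
            dp[i] = (d[i] - a[i] * dp[i - 1]) / m;
        }
        double[] x = new double[n];
        x[n - 1] = dp[n - 1];
        for (int i = n - 2; i >= 0; i--) {  // back substitution
            x[i] = dp[i] - cp[i] * x[i + 1];
        }
        return x;
    }

    public static void main(String[] args) {
        // 2x + y = 3,  x + 2y + z = 4,  y + 2z = 3  ->  x = y = z = 1
        double[] a = { 0, 1, 1 };
        double[] b = { 2, 2, 2 };
        double[] c = { 1, 1, 0 };
        double[] d = { 3, 4, 3 };
        System.out.println(Arrays.toString(solve(a, b, c, d)));
    }
}
```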

cudppHashTable

public static int cudppHashTable(CUDPPHandle cudppHandle,
                                 CUDPPHandle plan,
                                 CUDPPHashTableConfig config)
Creates a CUDPP hash table in GPU memory given an input hash table configuration; returns the plan for that hash table.

Requires a CUDPPHandle for the CUDPP instance (to ensure thread safety); call cudppCreate() to get this handle.

The hash table implementation requires compute capability 2.0 or higher (64-bit atomic operations).

Hash table types and input parameters are discussed in CUDPPHashTableType and CUDPPHashTableConfig.

After you are finished with the hash table, clean up with cudppDestroyHashTable().

Parameters:
 [in]    cudppHandle Handle to CUDPP instance
 [out]   plan    Handle to hash table instance
 [in]    config  Configuration for hash table to be created
 

Returns:
CUDPPResult indicating if creation was successful
See Also:
cudppCreate(jcuda.jcudpp.CUDPPHandle), cudppDestroyHashTable(jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHandle), CUDPPHashTableType, CUDPPHashTableConfig

cudppDestroyHashTable

public static int cudppDestroyHashTable(CUDPPHandle cudppHandle,
                                        CUDPPHandle plan)
Destroys a hash table given its handle.

Requires a CUDPPHandle for the CUDPP instance (to ensure thread safety); call cudppCreate() to get this handle.

Requires a CUDPPHandle for the hash table instance; call cudppHashTable() to get this handle.

Parameters:
 [in]    cudppHandle Handle to CUDPP instance
 [in]    plan    Handle to hash table instance
 

Returns:
CUDPPResult indicating if destruction was successful
See Also:
cudppHashTable(jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHashTableConfig)

cudppHashInsert

public static int cudppHashInsert(CUDPPHandle plan,
                                  Pointer d_keys,
                                  Pointer d_vals,
                                  long num)
Inserts keys and values into a CUDPP hash table.

Requires a CUDPPHandle for the hash table instance; call cudppHashTable() to create the hash table and get this handle.

d_keys and d_values should be in GPU memory. These should be pointers to arrays of unsigned ints.

Parameters:
 [in]    plan    Handle to hash table instance
 [in]    d_keys  GPU pointer to keys to be inserted
 [in]    d_vals  GPU pointer to values to be inserted
 [in]    num Number of keys/values to be inserted
 

Returns:
CUDPPResult indicating if insertion was successful
See Also:
cudppHashTable(jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHashTableConfig), cudppHashRetrieve(jcuda.jcudpp.CUDPPHandle, jcuda.Pointer, jcuda.Pointer, long)

cudppHashRetrieve

public static int cudppHashRetrieve(CUDPPHandle plan,
                                    Pointer d_keys,
                                    Pointer d_vals,
                                    long num)
Retrieves values, given keys, from a CUDPP hash table.

Requires a CUDPPHandle for the hash table instance; call cudppHashTable() to create the hash table and get this handle.

d_keys and d_values should be in GPU memory. These should be pointers to arrays of unsigned ints.

Parameters:
 [in]    plan    Handle to hash table instance
 [in]    d_keys  GPU pointer to keys to be retrieved
 [out]   d_vals  GPU pointer to values to be retrieved
 [in]    num Number of keys/values to be retrieved
 

Returns:
CUDPPResult indicating if retrieval was successful
See Also:
cudppHashTable(jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHashTableConfig)
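The insert/retrieve semantics, including the CUDPP_HASH_KEY_NOT_FOUND sentinel returned for missing keys, can be illustrated with a plain-Java analogue (a hypothetical helper class, not part of JCudpp; a HashMap stands in for the GPU table, and the sentinel value below is a stand-in, not the actual constant):

```java
import java.util.HashMap;
import java.util.Map;

// CPU analogue of cudppHashInsert / cudppHashRetrieve: retrieving a key
// that was never inserted yields a "key not found" sentinel value.
public class HashReference {

    // Stand-in for JCudpp.CUDPP_HASH_KEY_NOT_FOUND (assumed value).
    public static final int KEY_NOT_FOUND = 0xFFFFFFFF;

    public static int[] retrieve(Map<Integer, Integer> table, int[] keys) {
        int[] vals = new int[keys.length];
        for (int i = 0; i < keys.length; i++) {
            Integer v = table.get(keys[i]);
            vals[i] = (v != null) ? v : KEY_NOT_FOUND;
        }
        return vals;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> table = new HashMap<>();
        table.put(10, 100);   // "insert" key/value pairs
        table.put(20, 200);
        int[] vals = retrieve(table, new int[] { 10, 20, 30 });
        System.out.println(vals[0] + " " + vals[1]);  // 100 200
        System.out.println(vals[2] == KEY_NOT_FOUND); // true
    }
}
```

In the real API, d_keys and d_vals are unsigned int arrays in GPU memory and the table is referenced through its plan handle.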

cudppMultivalueHashGetValuesSize

public static int cudppMultivalueHashGetValuesSize(CUDPPHandle plan,
                                                   int[] size)
Retrieves the size of the values array in a multivalue hash table.

Only relevant for multivalue hash tables.

Requires a CUDPPHandle for the hash table instance; call cudppHashTable() to get this handle.

Parameters:
 [in]    plan    Handle to hash table instance
 [out]   size    Pointer to size of multivalue hash table
 

Returns:
CUDPPResult indicating if operation was successful
See Also:
cudppHashTable(jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHashTableConfig), cudppMultivalueHashGetAllValues(jcuda.jcudpp.CUDPPHandle, jcuda.Pointer)

cudppMultivalueHashGetAllValues

public static int cudppMultivalueHashGetAllValues(CUDPPHandle plan,
                                                  Pointer d_vals)
Retrieves a pointer to the values array in a multivalue hash table.

Only relevant for multivalue hash tables.

Requires a CUDPPHandle for the hash table instance; call cudppHashTable() to get this handle.

Parameters:
 [in]    plan    Handle to hash table instance
 [out]   d_vals  Pointer to pointer of values (in GPU memory)
 

Returns:
CUDPPResult indicating if operation was successful
See Also:
cudppHashTable(jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHandle, jcuda.jcudpp.CUDPPHashTableConfig), cudppMultivalueHashGetValuesSize(jcuda.jcudpp.CUDPPHandle, int[])