cudnn library (1)

Upload: ariel-lenis

Post on 02-Jun-2018

232 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/10/2019 CUDNN Library (1)

    1/38

  • 8/10/2019 CUDNN Library (1)

    2/38

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 2

  • 8/10/2019 CUDNN Library (1)

    3/38

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 1

    Chapter 1.INTRODUCTION

    NVIDIAcuDNN is a GPU-accelerated library of primitives for deep neural networks.It provides highly tuned implementations of routines arising frequently in DNNapplications:

    Convolution forward and backward, including cross-correlation

    Pooling forward and backward

    Softmax forward and backward

    Neuron activations forward and backward:

    Rectified linear (ReLU)

    Sigmoid

    Hyperbolic tangent (TANH)

    Tensor transformation functions

    cuDNN's convolution routines aim for performance competitive with the fastest GEMM

    (matrix multiply) based implementations of such routines while using significantly lessmemory.

    cuDNN features customizable data layouts, supporting flexible dimension ordering,striding, and subregions for the 4D tensors used as inputs and outputs to all of itsroutines. This flexibility allows easy integration into any neural network implementationand avoids the input/output transposition steps sometimes necessary with GEMM-basedconvolutions.

    cuDNN offers a context-based API that allows for easy multithreading and (optional)interoperability with CUDA streams.

  • 8/10/2019 CUDNN Library (1)

    4/38

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 2

    Chapter 2.GENERAL DESCRIPTION

    2.1. Programming ModelThe cuDNN Library exposes a Host API but assumes that for operations using the GPUthe data is directly accessible from the device.

    The application must initialize the handle to the cuDNN library context by calling thecudnnCreate()function. Then, the handle is explicitly passed to every subsequentlibrary function call that operate on GPU data. Once the application finishes using thelibrary, it must call the function cudnnDestroy()to release the resources associatedwith the cuDNN library context. This approach allows the user to explicitly controlthe library setup when using multiple host threads and multiple GPUs. For example,the application can use cudaSetDevice()to associate different devices with differenthost threads and in each of those host threads it can initialize a unique handle to the

    cuDNN library context, which will use the particular device associated with thathost thread. Then, the cuDNN library function calls made with different handle willautomatically dispatch the computation to different devices. The device associatedwith a particular cuDNN context is assumed to remain unchanged between thecorresponding cudnnCreate()and cudnnDestroy()calls. In order for the cuDNNlibrary to use a different device within the same host thread, the application must set thenew device to be used by calling cudaSetDevice()and then create another cuDNNcontext, which will be associated with the new device, by calling cudnnCreate().

    2.2. Thread SafetyThe library is thread safe and its functions can be called from multiple host threads,even with the same handle. When multiple threads share the same handle, extreme careneeds to be taken when the handle configuration is changed because that change willaffect potentially subsequent cuDNN calls in all threads. It is even more true for thedestruction of the handle. So it is not recommended that multiple threads share the samecuDNN handle.

  • 8/10/2019 CUDNN Library (1)

    5/38

    General Description

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 3

    2.3. ReproducibilityBy design, most of cuDNN API routines from a given version generate the same bit-wise results at every run when executed on GPUs with the same architecture andthe same number of SMs. However, bit-wise reproducibility is not guaranteed acrossversions, as the implementation of a given routine may change. With the current release,the following routines do not guarantee reproducibility because they use atomic addoperations:

    cudnnConvolutionBackwardFilter

    cudnnConvolutionBackwardData

    2.4. RequirementscuDNN supports NVIDIA GPUs of compute capability 3.0 and higher and requires an

    NVIDIA Driver compatible with CUDA Toolkit 6.5.

  • 8/10/2019 CUDNN Library (1)

    6/38

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 4

    Chapter 3.CUDNN DATATYPES REFERENCE

    This chapter describes all the types and enums of the cuDNN library API.

    3.1. cudnnHandle_tcudnnHandle_tis a pointer to an opaque structure holding the cuDNN library context.The cuDNN library context must be created using cudnnCreate()and the returnedhandle must be passed to all subsequent library function calls. The context should bedestroyed at the end using cudnnDestroy(). The context is associated with only oneGPU device, the current device at the time of the call to cudnnCreate(). Howevermultiple contexts can be created on the same GPU device.

    3.2. cudnnStatus_tcudnnStatus_tis an enumerated type used for function status returns. All cuDNNlibrary functions return their status, which can be one of the following values:

    Value Meaning

    CUDNN_STATUS_SUCCESS The operation completed successfully.

    CUDNN_STATUS_NOT_INITIALIZED The cuDNN library was not initialized properly.

    This error is usually returned when a call to

    cudnnCreate()fails or when cudnnCreate()has not been called prior to calling another cuDNN

    routine. In the former case, it is usually due

    to an error in the CUDA Runtime API called by

    cudnnCreate()or by an error in the hardwaresetup.

    CUDNN_STATUS_ALLOC_FAILED Resource allocation failed inside the cuDNN

    library. This is usually caused by an internal

    cudaMalloc()failure.

    To correct: prior to the function call, deallocate

    previously allocated memory as much as possible.

  • 8/10/2019 CUDNN Library (1)

    7/38

    cuDNN Datatypes Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 5

    Value Meaning

    CUDNN_STATUS_BAD_PARAM An incorrect value or parameter was passed to the

    function.

    To correct: ensure that all the parameters being

    passed have valid values.

    CUDNN_STATUS_ARCH_MISMATCH The function requires a feature absent fromthe current GPU device. Note that cuDNN only

    supports devices with compute capabilities greater

    than or equal to 3.0.

    To correct: compile and run the application on a

    device with appropriate compute capability.

    CUDNN_STATUS_MAPPING_ERROR An access to GPU memory space failed, which isusually caused by a failure to bind a texture.

    To correct: prior to the function call, unbind any

    previously bound textures.

    Otherwise, this may indicate an internal error/bug

    in the library.

    CUDNN_STATUS_EXECUTION_FAILED The GPU program failed to execute. This is usuallycaused by a failure to launch some cuDNN kernel

    on the GPU, which can occur for multiple reasons.

    To correct: check that the hardware, an

    appropriate version of the driver, and the cuDNN

    library are correctly installed.

    Otherwise, this may indicate a internal error/bug

    in the library.

    CUDNN_STATUS_INTERNAL_ERROR An internal cuDNN operation failed.

    CUDNN_STATUS_NOT_SUPPORTED The functionality requested is not presently

    supported by cuDNN.

    CUDNN_STATUS_LICENSE_ERROR The functionality requested requires some licenseand an error was detected when trying to check

    the current licensing. This error can happen if

    the license is not present or is expired or if the

    environment variable NVIDIA_LICENSE_FILE is not

    set properly.

    3.3. cudnnTensor4dDescriptor_t

    cudnnCreateTensor4dDescriptor_t is a pointer to an opaque structure holdingthe description of a generic 4D dataset. cudnnCreateTensor4dDescriptor()is used to create one instance, and cudnnSetTensor4dDescriptor()orcudnnSetTensor4dDescriptorEx() must be used to initialize this instance.

  • 8/10/2019 CUDNN Library (1)

    8/38

    cuDNN Datatypes Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 6

    3.4. cudnnFilterDescriptor_tcudnnFilterDescriptor_tis a pointer to an opaque structure holding the descriptionof a filter dataset. cudnnCreateFilterDescriptor()is used to create one instance,and cudnnSetFilterDescriptor()must be used to initialize this instance.

    3.5. cudnnConvolutionDescriptor_tcudnnConvolutionDescriptor_tis a pointer to an opaque structure holding thedescription of a convolution operation. cudnnCreateFilterDescriptor()is used tocreate one instance, and cudnnSetFilterDescriptor()must be used to initialize thisinstance.

    3.6. cudnnPoolingDescriptor_tcudnnPoolingDescriptor_tis a pointer to an opaque structure holding thedescription of a pooling operation. cudnnCreatePoolingDescriptor()is used tocreate one instance, and cudnnSetPoolingDescriptor()must be used to initializethis instance.

    3.7. cudnnDataType_tcudnnDataType_tis an enumerated type indicating the data type to which a tensordescriptor or filter descriptor refers.

    Value Meaning

    CUDNN_DATA_FLOAT The data is 32-bit single-precision floating point

    (float).

    CUDNN_DATA_DOUBLE The data is 64-bit double-precision floating point

    (double).

    3.8. cudnnTensorFormat_tcudnnTensorFormat_tis an enumerated type used bycudnnSetTensor4dDescriptor()to create a tensor with a pre-defined layout.

    Value Meaning

    CUDNN_TENSOR_NCHW This tensor format specifies that the data is laid

    out in the following order: image, features map,

    rows, columns. The strides are implicitly defined

    in such a way that the data are contiguous in

    memory with no padding between images, feature

    maps, rows, and columns; the columns are the

  • 8/10/2019 CUDNN Library (1)

    9/38

    cuDNN Datatypes Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 7

    Value Meaning

    inner dimension and the images are the outermost

    dimension.

    CUDNN_TENSOR_NHWC This tensor format specifies that the data is laid

    out in the following order: image, rows, columns,

    features maps. The strides are implicitly defined insuch a way that the data are contiguous in memory

    with no padding between images, rows, columns,

    and features maps; the feature maps are the

    inner dimension and the images are the outermost

    dimension.

    3.9. cudnnAddMode_tcudnnAddMode_tis an enumerated type used by cudnnAddTensor4d()to specify howa bias tensor is added to an input/output tensor.

    Value Meaning

    CUDNN_ADD_IMAGEor CUDNN_ADD_SAME_HW In this mode, the bias tensor is defined as one

    image with one feature map. This image will be

    added to every feature map of every image of the

    input/output tensor.

    CUDNN_ADD_FEATURE_MAPor

    CUDNN_ADD_SAME_CHW

    In this mode, the bias tensor is defined as one

    image with multiple feature maps. This image

    will be added to every image of the input/output

    tensor.

    CUDNN_ADD_SAME_C In this mode, the bias tensor is defined as oneimage with multiple feature maps of dimension

    1x1; it can be seen as an vector of feature maps.Each feature map of the bias tensor will be added

    to the corresponding feature map of all height-by-

    width pixels of every image of the input/output

    tensor.

    CUDNN_ADD_FULL_TENSOR In this mode, the bias tensor has the same

    dimensions as the input/output tensor. It will be

    added point-wise to the input/output tensor.

    3.10. cudnnConvolutionMode_t

    cudnnConvolutionMode_tis an enumerated type used bycudnnSetConvolutionDescriptor() to configure a convolution descriptor. Thefilter used for the convolution can be applied in two different ways, correspondingmathematically to a convolution or to a cross-correlation. (A cross-correlation isequivalent to a convolution with its filter rotated by 180 degrees.)

  • 8/10/2019 CUDNN Library (1)

    10/38

    cuDNN Datatypes Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 8

    Value Meaning

    CUDNN_CONVOLUTION In this mode, a convolution operation will be done

    when applying the filter to the images.

    CUDNN_CROSS_CORRELATION In this mode, a cross-correlation operation will be

    done when applying the filter to the images.

    3.11. cudnnConvolutionPath_tcudnnConvolutionPath_tis an enumerated type used by the helper routinecudnnGetOutputTensor4dDim()to select the results to output.

    Value Meaning

    CUDNN_CONVOLUTION_FWD cudnnGetOutputTensor4dDim()will returndimensions related to the output tensor of the

    forward convolution.

    CUDNN_CONVOLUTION_WEIGHT_GRAD cudnnGetOutputTensor4dDim()will return thedimensions of the output filter produced while

    computing the gradients, which is part of the

    backward convolution.

    CUDNN_CONVOLUTION_DATA_GRAD cudnnGetOutputTensor4dDim()will return the

    dimensions of the output tensor produced while

    computing the gradients, which is part of the

    backward convolution.

    3.12. cudnnAccumulateResult_tcudnnAccumulateResult_tis an enumerated type used bycudnnConvolutionForward(), cudnnConvolutionBackwardFilter() andcudnnConvolutionBackwardData() to specify whether those routines accumulatetheir results with the output tensor or simply write them to it, overwriting the previousvalue.

    Value Meaning

    CUDNN_RESULT_ACCUMULATE The results are accumulated with (added to the

    previous value of) the output tensor.

    CUDNN_RESULT_NO_ACCUMULATE The results overwrite the output tensor.

    3.13. cudnnSoftmaxAlgorithm_tcudnnSoftmaxAlgorithm_tis used to select an implementation of the softmaxfunction used in cudnnSoftmaxForward()and cudnnSoftmaxBackward().

  • 8/10/2019 CUDNN Library (1)

    11/38

  • 8/10/2019 CUDNN Library (1)

    12/38

  • 8/10/2019 CUDNN Library (1)

    13/38

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 11

    Chapter 4.CUDNN API REFERENCE

    This chapter describes the API of all the routines of the cuDNN library.

    4.1. cudnnCreatecudnnStatus_t cudnnCreate(cudnnHandle_t *handle)

    This function initializes the cuDNN library and creates a handle to an opaquestructure holding the cuDNN library context. It allocates hardware resources onthe host and device and must be called prior to making any other cuDNN librarycalls. The cuDNN library context is tied to the current CUDA device. To use thelibrary on multiple devices, one cuDNN handle needs to be created for each device.For a given device, multiple cuDNN handles with different configurations (e.g.,different current CUDA streams) may be created. Because cudnnCreateallocatessome internal resources, the release of those resources by calling cudnnDestroywill

    implicitly call cudaDeviceSynchronize; therefore, the recommended best practiceis to call cudnnCreate/cudnnDestroyoutside of performance-critical code paths.For multithreaded applications that use the same device from different threads, therecommended programming model is to create one (or a few, as is convenient) cuDNNhandle(s) per thread and use that cuDNN handle for the entire life of the thread.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The initialization succeeded.

    CUDNN_STATUS_NOT_INITIALIZED CUDA Runtime API initialization failed.

    CUDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

    4.2. cudnnDestroycudnnStatus_t cudnnDestroy(cudnnHandle_t handle)

    This function releases hardware resources used by the cuDNN library. This functionis usually the last call with a particular handle to the cuDNN library. BecausecudnnCreateallocates some internal resources, the release of those resources by

  • 8/10/2019 CUDNN Library (1)

    14/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 12

    calling cudnnDestroywill implicitly call cudaDeviceSynchronize; therefore,the recommended best practice is to call cudnnCreate/cudnnDestroyoutside ofperformance-critical code paths.

    Return Value Meaning

    CUDNN_STATUS_SUCCESSThe cuDNN context destruction was successful.

    CUDNN_STATUS_NOT_INITIALIZED The library was not initialized.

    4.3. cudnnSetStreamcudnnStatus_t cudnnSetStream(cudnnHandle_t handle, cudaStream_t streamId)

    This function sets the cuDNN library stream, which will be used to execute allsubsequent calls to the cuDNN library functions with that particular handle. If thecuDNN library stream is not set, all kernels use the default (NULL) stream. In particular,this routine can be used to change the stream between kernel launches and then to reset

    the cuDNN library stream back toNULL.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The stream was set successfully.

    4.4. cudnnGetStreamcudnnStatus_t cudnnGetStream(cudnnHandle_t handle, cudaStream_t *streamId)

    This function gets the cuDNN library stream, which is being used to execute all calls tothe cuDNN library functions. If the cuDNN library stream is not set, all kernels use the

    defaultNULLstream.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The stream was returned successfully.

    4.5. cudnnCreateTensor4dDescriptorcudnnStatus_t cudnnCreateTensor4dDescriptor(cudnnTensor4dDescriptor_t*tensorDesc)

    This function creates a Tensor4D descriptor object by allocating the memory needed to

    hold its opaque structure.Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was created successfully.

    CUDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

  • 8/10/2019 CUDNN Library (1)

    15/38

  • 8/10/2019 CUDNN Library (1)

    16/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 14

    This function initializes a previously created Tensor4D descriptor object, similarly tocudnnSetTensor4dDescriptorbut with the strides explicitly passed as parameters.This can be used to lay out the 4D tensor in any order or simply to define gaps betweendimensions.

    At present, some cuDNN routines have limited support for strides; forexample,wStride==1is sometimes required. Those routines will return

    CUDNN_STATUS_NOT_SUPPORTED if a Tensor4D object with an unsupported stride is

    used. cudnnTransformTensor4dcan be used to convert the data to a supported

    layout.

    Param In/out Meaning

    tensorDesc input/

    output

    Handle to a previously created tensor descriptor.

    datatype input Data type.

    n input Number of images.

    c input Number of feature maps per image.

    h input Height of each feature map.

    w input Width of each feature map.

    nStride input Stride between two consecutive images.

    cStride input Stride between two consecutive feature maps.

    hStride input Stride between two consecutive rows.

    wStride input Stride between two consecutive columns.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was set successfully.

    CUDNN_STATUS_BAD_PARAM At least one of the parameters n,c,h,wor

    nStride,cStride,hStride,wStrideis negative

    or dataTypehas an invalid enumerant value.

    4.8. cudnnGetTensor4dDescriptorcudnnStatus_tcudnnGetTensor4dDescriptor( cudnnTensor4dDescriptor_t tensorDesc, cudnnDataType_t *dataType, int*n, int*c, int*h, int*w, int*nStride, int*cStride, int*hStride, int*wStride )

  • 8/10/2019 CUDNN Library (1)

    17/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 15

    This function queries the parameters of the previouly initialized Tensor4D descriptorobject.

    Param In/out Meaning

    tensorDesc input Handle to a previously insitialized tensor descriptor.

    datatype output Data type.

    n output Number of images.

    c output Number of feature maps per image.

    h output Height of each feature map.

    w output Width of each feature map.

    nStride output Stride between two consecutive images.

    cStride output Stride between two consecutive feature maps.

    hStride output Stride between two consecutive rows.

    wStride output Stride between two consecutive columns.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The operation succeeded.

    4.9. cudnnDestroyTensor4dDescriptorcudnnStatus_t cudnnDestroyTensor4dDescriptor(cudnnTensor4dDescriptor_t

    tensorDesc)

    This function destroys a previously created Tensor4D descriptor object.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was destroyed successfully.

    4.10. cudnnTransformTensor4dcudnnStatus_tcudnnTransformTensor4d( cudnnHandle_t handle, cudnnTensor4dDescriptor_t srcDesc, constvoid *srcData, cudnnTensor4dDescriptor_t destDesc, void *destData )

    This function copies the data from one tensor to another tensor with a differentlayout. Those descriptors need to have the same dimensions but not necessarily thesame strides. The input and output tensors must not overlap in any way (i.e., tensorscannot be transformed in place). This function can be used to convert a tensor with anunsupported format to a supported one.

  • 8/10/2019 CUDNN Library (1)

    18/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 16

    Param In/out Meaning

    handle input Handle to a previously created cuDNN context.

    srcDesc input Handle to a previously initialized tensor descriptor.

    srcData input Pointer to data of the tensor described by the srcDescdescriptor.

    destDesc input Handle to a previously initialized tensor descriptor.

    destData output Pointer to data of the tensor described by the destDescdescriptor.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The function launched successfully.

    CUDNN_STATUS_BAD_PARAM The dimensions n,c,h,wor the dataTypeof thetwo tensor descriptors are different.

    CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.

    4.11. cudnnAddTensor4dcudnnStatus_tcudnnAddTensor4d( cudnnHandle_t handle, cudnnAddMode_t mode, constvoid *alpha, cudnnTensor4dDescriptor_t biasDesc, constvoid *biasData, cudnnTensor4dDescriptor_t srcDestDesc, void *srcDestData )

    This function adds the scaled values of one tensor to another tensor. Themodeparametercan be used to select different ways of performing the scaled addition. The amountof data described by thebiasDescdescriptor must match exactly the amount of dataneeded to perform the addition. Therefore, the following conditions must be met:

    Except for the CUDNN_ADD_SAME_Cmode, the dimensions h,wof the two tensorsmust match.

    In the case of CUDNN_ADD_IMAGEmode, the dimensions n,cof the bias tensor mustbe 1.

    In the case of CUDNN_ADD_FEATURE_MAPmode, the dimension nof the bias tensormust be 1 and the dimension cof the two tensors must match.

    In the case of CUDNN_ADD_FULL_TENSORmode, the dimensions n,cof the two

    tensors must match. In the case of CUDNN_ADD_SAME_Cmode, the dimensions n,w,hof the bias tensor

    must be 1 and the dimension cof the two tensors must match.

    Param In/out Meaning

    handle input Handle to a previously created cuDNN context.

    biasDesc input Handle to a previously initialized tensor descriptor.

  • 8/10/2019 CUDNN Library (1)

    19/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 17

    Param In/out Meaning

    mode input Addition mode that describe how the addition is performed.

    alpha input Scalar factor to be applied to every data element of the bias tensor before

    it is added to the output tensor.

    srcData input Pointer to data of the tensor described by thebiasDescdescriptor.

    srcDestDesc input/

    output

    Handle to a previously initialized tensor descriptor.

    srcDestData input/

    output

    Pointer to data of the tensor described by the srcDestDescdescriptor.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The function executed successfully.

    CUDNN_STATUS_BAD_PARAM The dimensions n,c,h,wof the bias tensor referto an amount of data that is incompatible with the

    modeparameter and the output tensor dimensionsor the dataTypeof the two tensor descriptors are

    different.

    CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.

    4.12. cudnnCreateFilterDescriptorcudnnStatus_t cudnnCreateFilterDescriptor(cudnnFilterDescriptor_t *filterDesc)

    This function creates a filter descriptor object by allocating the memory needed to holdits opaque structure,

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was created successfully.

    CUDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

    4.13. cudnnSetFilterDescriptorcudnnStatus_tcudnnSetFilterDescriptor( cudnnFilterDescriptor_t filterDesc,

    cudnnDataType_t dataType, intk, intc, inth, intw )

    This function initializes a previously created filter descriptor object. Filters layout mustbe contiguous in memory.

  • 8/10/2019 CUDNN Library (1)

    20/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 18

    Param In/out Meaning

    filterDesc input/

    output

    Handle to a previously created filter descriptor.

    datatype input Data type.

    k input Number of output feature maps.

    c input Number of input feature maps.

    h input Height of each filter.

    w input Width of each filter.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was set successfully.

    CUDNN_STATUS_BAD_PARAM At least one of the parameters k,c,h,wisnegative or dataTypehas an invalid enumerantvalue.

    4.14. cudnnGetFilterDescriptorcudnnStatus_tcudnnGetFilterDescriptor( cudnnFilterDescriptor_t filterDesc, cudnnDataType_t *dataType, int*k, int*c, int*h, int*w )

    This function queries the parameters of the previouly initialized filter descriptor object.

    Param In/out Meaning

    filterDesc input Handle to a previously created filter descriptor.

    datatype output Data type.

    k output Number of output feature maps.

    c output Number of input feature maps.

    h output Height of each filter.

    w output Width of each filter.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was set successfully.

  • 8/10/2019 CUDNN Library (1)

    21/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 19

    4.15. cudnnDestroyFilterDescriptorcudnnStatus_t cudnnDestroyFilterDescriptor(cudnnFilterdDescriptor_t filterDesc)

    This function destroys a previously created Tensor4D descriptor object.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was destroyed successfully.

    4.16. cudnnCreateConvolutionDescriptorcudnnStatus_t cudnnCreateConvolutionDescriptor(cudnnConvolutionDescriptor_t*convDesc)

    This function creates a convolution descriptor object by allocating the memory needed tohold its opaque structure,

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was created successfully.

    CUDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

    4.17. cudnnSetConvolutionDescriptorcudnnStatus_tcudnnSetConvolutionDescriptor( cudnnConvolutionDescriptor_t convDesc, cudnnTensor4dDescriptor_t inputTensorDesc, cudnnFilterDescriptor_t filterDesc, intpad_h, intpad_w, intu, intv, intupscalex, intupscaley, cudnnConvolutionMode_t mode )

    This function initializes a previously created convolution descriptor object, accordingto an input tensor descriptor and a filter descriptor passed as parameter. This functionassumes that the tensor and filter descriptors corresponds to the formard convolutionpath and checks if their settings are valid. That same convolution descriptor can bereused in the backward path provided it corresponds to the same layer.

    Param In/out Meaning

    convDesc input/

    output

    Handle to a previously created convolution descriptor.

    inputTensorDesc input Input tensor descriptor used for that layer on the forward path.

    filterDesc input Filter descriptor used for that layer on the forward path.

  • 8/10/2019 CUDNN Library (1)

    22/38

  • 8/10/2019 CUDNN Library (1)

    23/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 21

    This function initializes a previously created convolution descriptor object. It is similarto cudnnSetConvolutionDescriptorbut every parameter of the convolution must bepassed explicitly.

    Param In/out Meaning

    convDesc input/

    output

    Handle to a previously created convolution descriptor.

    n input Number of images.

    c input Number of input feature maps.

    h input Height of each input feature map.

    w input Width of each input feature map.

    k input Number of output feature maps.

    r input Height of each filter.

    s input Width of each filter.

    pad_h input zero-padding height: number of rows of zeros implicitly concatenated onto

    the top and onto the bottom of input images.

    pad_w input zero-padding width: number of columns of zeros implicitly concatenated

    onto the left and onto the right of input images.

    u input Vertical filter stride.

    v input Horizontal filter stride.

    upscalex input Upscale the input in x-direction.

    upscaley input Upscale the input in y-direction.

    mode input Selects between CUDNN_CONVOLUTIONand CUDNN_CROSS_CORRELATION.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was set successfully.

    CUDNN_STATUS_BAD_PARAM At least one of the following conditions are met:

    One of the parameters u,vis negative.

    The parametermodehas an invalid enumerantvalue.

    CUDNN_STATUS_NOT_SUPPORTED The parameter upscalexor upscaleyis not 1.

  • 8/10/2019 CUDNN Library (1)

    24/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 22

    4.19. cudnnGetOutputTensor4dDimcudnnStatus_tcudnnGetOutputTensor4dDim( constcudnnConvolutionDescriptor_t convDesc, cudnnConvolutionPath_t path, int*n, int*c, int*h, int*w )

    This function returns the dimensions of a convolution's output, given the convolutiondescriptor and the direction of the convolution. This function can help to setup theoutput tensor and allocate the proper amount of memory prior to launch the actualconvolution.

    Param In/out Meaning

    convDesc input Handle to a previously created convolution descriptor.

    path input Enumerant to specify the direction of the convolution.

    n output Number of output images.

    c output Number of output feature maps per image.

    h output Height of each output feature map.

    w output Width of each output feature map.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_BAD_PARAM Thepathparameter has an invalid enumerantvalue.

    CUDNN_STATUS_SUCCESS The object was set successfully.

    4.20. cudnnDestroyFilterDescriptorcudnnStatus_t cudnnDestroyConvolutionDescriptor(cudnnConvolutionDescriptor_tconvDesc)

    This function destroys a previously created convolution descriptor object.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was destroyed successfully.

  • 8/10/2019 CUDNN Library (1)

    25/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 23

    4.21. cudnnConvolutionForwardcudnnStatus_tcudnnConvolutionForward( cudnnHandle_t handle, cudnnTensor4dDescriptor_t srcDesc, constvoid *srcData, cudnnFilterDescriptor_t filterDesc, constvoid *filterData, cudnnConvolutionDescriptor_t convDesc, cudnnTensor4dDescriptor_t destDesc, void *destData, cudnnAccumulateResult_t accumulate )

    This function executes convolutions or cross-correlations over srcusing the specifiedfilters, returning results in dest.

    Param In/out Meaning

    handle input Handle to a previously created cuDNN context.

    srcDesc input Handle to a previously initialized tensor descriptor.

    srcData input Data pointer to GPU memory associated with the tensor descriptor

    srcDesc.

    filterDesc input Handle to a previously initialized filter descriptor.

    filterData input Data pointer to GPU memory associated with the filter descriptor

    filterDesc.

    convDesc input Previously initialized convolution descriptor.

    destDesc input Handle to a previously initialized tensor descriptor.

    destData input/

    output

    Data pointer to GPU memory associated with the tensor descriptor

    destDescthat carries the result of the convolution.

    accumulate input Enumerant that specifies whether the convolution accumulates with or

    overwrites the output tensor.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The operation was launched successfully.

    CUDNN_STATUS_MAPPING_ERROR An error occured during the texture binding of thefilter data.

    CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.

  • 8/10/2019 CUDNN Library (1)

    26/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 24

    4.22. cudnnConvolutionBackwardBiascudnnStatus_tcudnnConvolutionBackwardBias( cudnnHandle_t handle, cudnnTensor4dDescriptor_t srcDesc, constvoid *srcData, cudnnTensor4dDescriptor_t destDesc, void *destData, cudnnAccumulateResult_t accumulate )

    This function computes the convolution gradient with respect to the bias, which is thesum of every element belonging to the same feature map across all of the images of theinput tensor. Therefore, the number of elements produced is equal to the number offeatures maps of the input tensor.

    Param In/out Meaning

    handle input Handle to a previously created cuDNN context.

    srcDesc input Handle to the previously initialized input tensor descriptor.

    srcData input Data pointer to GPU memory associated with the tensor descriptor

    srcDesc.

    destDesc input Handle to the previously initialized output tensor descriptor.

    destData output Data pointer to GPU memory associated with the output tensor descriptor

    destDesc.

    accumulate input Enumerant that specifies whether the convolution accumulates with or

    overwrites the output tensor.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The operation was launched successfully.

    CUDNN_STATUS_BAD_PARAM At least one of the following conditions are met:

    One of the parameters n,h,wof the output

    tensor is not 1.

    The numbers of feature maps of the input

    tensor and output tensor differ.

    The dataTypeof the two tensor descriptors

    are different.

    CUDNN_STATUS_NOT_SUPPORTED At least one of the following conditions are met: The width stride of the input tensor is not 1.

    The height stride and the width of the input

    tensor differ.

    The feature map stride of the output tensor is

    not 1.

  • 8/10/2019 CUDNN Library (1)

    27/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 25

    4.23. cudnnConvolutionBackwardFiltercudnnStatus_tcudnnConvolutionBackwardFilter( cudnnHandle_t handle, cudnnTensor4dDescriptor_t srcDesc, constvoid *srcData, cudnnTensor4dDescriptor_t diffDesc, constvoid *diffData, cudnnConvolutionDescriptor_t convDesc, cudnnFilterDescriptor_t gradDesc, void *gradData, cudnnAccumulateResult_t accumulate )

    This function computes the convolution gradient with respect to the filter coefficients.

    Param In/out Meaning

    handle input Handle to a previously created cuDNN context.

    srcDesc input Handle to a previously initialized tensor descriptor.

    srcData input Data pointer to GPU memory associated with the tensor descriptor

    srcDesc.

    diffDesc input Handle to the previously initialized input differential tensor descriptor.

    diffData input Data pointer to GPU memory associated with the input differential tensor

    descriptor diffDesc.

    convDesc input Previously initialized convolution descriptor.

    gradDesc input Handle to a previously initialized filter descriptor.

    gradData input/

    output

    Data pointer to GPU memory associated with the filter descriptor

    gradDescthat carries the result.

    accumulate input Enumerant that specifies whether the convolution accumulates with or

    overwrites the output tensor.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The operation was launched successfully.

    CUDNN_STATUS_NOT_SUPPORTED The requested operation is not currently supportedin cuDNN. Your diffDesc is likely not in NCHW

    format.

    CUDNN_STATUS_MAPPING_ERROR An error occurs during the texture binding of thefilter data.

    CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.

  • 8/10/2019 CUDNN Library (1)

    28/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 26

    4.24. cudnnConvolutionBackwardDatacudnnStatus_tcudnnConvolutionBackwardData( cudnnHandle_t handle, cudnnFilterDescriptor_t filterDesc, constvoid *filterData, cudnnTensor4dDescriptor_t diffDesc, constvoid *diffData, cudnnConvolutionDescriptor_t convDesc, cudnnTensor4dDescriptor_t gradDesc, void *gradData, cudnnAccumulateResult_t accumulate );

    This function computes the convolution gradient with respect to the ouput tensor.

    Param In/out Meaning

    handle input Handle to a previously created cuDNN context.

    filterDesc input Handle to a previously initialized filter descriptor.

    filterData input Data pointer to GPU memory associated with the filter descriptor

    filterDesc.

    diffDesc input Handle to the previously initialized input differential tensor descriptor.

    diffData input Data pointer to GPU memory associated with the input differential tensor

    descriptor diffDesc.

    convDesc input Previously initialized convolution descriptor.

    gradDesc input Handle to the previously initialized output tensor descriptor.

    gradData input/

    output

    Data pointer to GPU memory associated with the output tensor descriptor

    gradDescthat carries the result.

    accumulate input Enumerant that specifies whether the convolution accumulates with or

    overwrites the output tensor.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The operation was launched successfully.

    CUDNN_STATUS_NOT_SUPPORTED The requested operation is not currently supportedin cuDNN. Your diffDesc is likely not in NCHW

    format.

    CUDNN_STATUS_MAPPING_ERROR An error occurs during the texture binding of the

    filter data or the input differential tensor data

    CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.

  • 8/10/2019 CUDNN Library (1)

    29/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 27

    4.25. cudnnSoftmaxForwardcudnnStatus_tcudnnSoftmaxForward( cudnnHandle_t handle, cudnnSoftmaxAlgorithm_t algorithm, cudnnSoftmaxMode_t mode, cudnnTensor4dDescriptor_t srcDesc, constvoid *srcData, cudnnTensor4dDescriptor_t destDesc, void *destData )

    This routine computes the softmax function.

    Param In/out Meaning

    handle input Handle to a previously created cuDNN context.

    algorithm input Enumerant to specify the softmax algorithm.

    mode input Enumerant to specify the softmax mode.

    srcDesc input Handle to the previously initialized input tensor descriptor.

    srcData input Data pointer to GPU memory associated with the tensor descriptor

    srcDesc.

    destDesc input Handle to the previously initialized output tensor descriptor.

    destData output Data pointer to GPU memory associated with the output tensor descriptor

    destDesc.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The function launched successfully.

    CUDNN_STATUS_BAD_PARAM At least one of the following conditions are met:

    The dimensions n,c,h,wof the input tensor

    and output tensors differ.

    The datatypeof the input tensor and output

    tensors differ.

    The parameters algorithmormodehave aninvalid enumerant value.

    CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.

  • 8/10/2019 CUDNN Library (1)

    30/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 28

    4.26. cudnnSoftmaxBackwardcudnnStatus_tcudnnSoftmaxBackward( cudnnHandle_t handle, cudnnSoftmaxAlgorithm_t algorithm, cudnnSoftmaxMode_t mode, cudnnTensor4dDescriptor_t srcDesc, constvoid *srcData, cudnnTensor4dDescriptor_t srcDiffDesc, constvoid *srcDiffData, cudnnTensor4dDescriptor_t destDiffDesc, void *destDiffData )

    This routine computes the gradient of the softmax function.

    Param In/out Meaning

    handle input Handle to a previously created cuDNN context.

    algorithm input Enumerant to specify the softmax algorithm.

    mode input Enumerant to specify the softmax mode.

    srcDesc input Handle to the previously initialized input tensor descriptor.

    srcData input Data pointer to GPU memory associated with the tensor descriptor

    srcDesc.

    srcDiffDesc input Handle to the previously initialized input differential tensor descriptor.

    srcDiffData input Data pointer to GPU memory associated with the tensor descriptor

    srcDiffData.

    destDiffDesc input Handle to the previously initialized output differential tensor descriptor.

    destDiffData output Data pointer to GPU memory associated with the output tensor descriptordestDiffDesc.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The function launched successfully.

    CUDNN_STATUS_BAD_PARAM At least one of the following conditions are met:

    The dimensions n,c,h,wof the srcDesc,

    srcDiffDescand destDiffDesctensorsdiffer.

    The strides nStride, cStride, hStride,wStrideof the srcDescand srcDiffDesctensors differ.

    The datatypeof the three tensors differs.

    CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.

  • 8/10/2019 CUDNN Library (1)

    31/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 29

    4.27. cudnnCreatePoolingDescriptorcudnnStatus_t cudnnCreatePoolingDescriptor( cudnnPoolingDescriptor_t*poolingDesc )

    This function creates a pooling descriptor object by allocating the memory needed tohold its opaque structure,

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was created successfully.

    CUDNN_STATUS_ALLOC_FAILED The resources could not be allocated.

    4.28. cudnnSetPoolingDescriptorcudnnStatus_t

    cudnnSetPoolingDescriptor( cudnnPoolingDescriptor_t poolingDesc, cudnnPoolingMode_t mode, intwindowHeight, intwindowWidth, intverticalStride, inthorizontalStride )

    This function initializes a previously created pooling descriptor object.

    Param In/out Meaning

    poolingDesc input/

    output

    Handle to a previously created pooling descriptor.

    mode input Enumerant to specify the pooling mode.

    windowHeight input Height of the pooling window.

    windowWidth input Width of the pooling window.

    verticalStride input Pooling vertical stride.

    horizontalStride input Pooling horizontal stride.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was set successfully.

    CUDNN_STATUS_BAD_PARAM At least one of the parameterswindowHeight,windowWidth, verticalStride,

    horizontalStrideis negative ormodehas an

    invalid enumerant value.

  • 8/10/2019 CUDNN Library (1)

    32/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 30

    4.29. cudnnGetPoolingDescriptorcudnnStatus_tcudnnGetPoolingDescriptor( cudnnPoolingDescriptor_t poolingDesc, cudnnPoolingMode_t *mode, int*windowHeight, int*windowWidth, int*verticalStride, int*horizontalStride )

    This function queries a previously created pooling descriptor object.

    Param In/out Meaning

    poolingDesc input Handle to a previously created pooling descriptor.

    mode output Enumerant to specify the pooling mode.

    windowHeight output Height of the pooling window.

    windowWidth output Width of the pooling window.

    verticalStride output Pooling vertical stride.

    horizontalStride output Pooling horizontal stride.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was set successfully.

    4.30. cudnnDestroyPoolingDescriptorcudnnStatus_t cudnnDestroyPoolingDescriptor( cudnnPoolingDescriptor_tpoolingDesc )

    This function destroys a previously created pooling descriptor object.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The object was destroyed successfully.

    4.31. cudnnPoolingForwardcudnnStatus_tcudnnPoolingForward( cudnnHandle_t handle, cudnnPoolingDescriptor_t poolingDesc, cudnnTensor4dDescriptor_t srcDesc, constvoid *srcData, cudnnTensor4dDescriptor_t destDesc, void *destData )

    This function computes pooling of input values (i.e., the maximum or average of severaladjacent values) to produce an output with smaller height and/or width.

  • 8/10/2019 CUDNN Library (1)

    33/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 31

    Param In/out Meaning

    handle input Handle to a previously created cuDNN context.

    poolingDesc input Handle to a previously initialized pooling descriptor.

    srcDesc input Handle to the previously initialized input tensor descriptor.

    srcData input Data pointer to GPU memory associated with the tensor descriptor

    srcDesc.

    destDesc input Handle to the previously initialized output tensor descriptor.

    destData output Data pointer to GPU memory associated with the output tensor descriptor

    destDesc.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The function launched successfully.

    CUDNN_STATUS_BAD_PARAM At least one of the following conditions are met:

    The dimensions n,cof the input tensor and

    output tensors differ.

    The datatypeof the input tensor and output

    tensors differs.

    CUDNN_STATUS_NOT_SUPPORTED ThewStrideof input tensor or output tensor is

    not 1.

    CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.

    4.32. cudnnPoolingBackwardcudnnStatus_tcudnnPoolingBackward( cudnnHandle_t handle, cudnnPoolingDescriptor_t poolingDesc, cudnnTensor4dDescriptor_t srcDesc, constvoid *srcData, cudnnTensor4dDescriptor_t srcDiffDesc, constvoid *srcDiffData, cudnnTensor4dDescriptor_t destDesc, constvoid *destData, cudnnTensor4dDescriptor_t destDiffDesc, void *destDiffData )

    This function computes the gradient of a pooling operation.

    Param In/out Meaning

    handle input Handle to a previously created cuDNN context.

    poolingDesc input Handle to the previously initialized pooling descriptor.

    srcDesc input Handle to the previously initialized input tensor descriptor.

  • 8/10/2019 CUDNN Library (1)

    34/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 32

    Param In/out Meaning

    srcData input Data pointer to GPU memory associated with the tensor descriptor

    srcDesc.

    srcDiffDesc input Handle to the previously initialized input differential tensor descriptor.

    srcDiffData input Data pointer to GPU memory associated with the tensor descriptorsrcDiffData.

    destDesc input Handle to the previously initialized output tensor descriptor.

    destData input Data pointer to GPU memory associated with the output tensor descriptor

    destDesc.

    destDiffDesc input Handle to the previously initialized output differential tensor descriptor.

    destDiffData output Data pointer to GPU memory associated with the output tensor descriptor

    destDiffDesc.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The function launched successfully.

    CUDNN_STATUS_BAD_PARAM At least one of the following conditions are met:

    The dimensions n,c,h,wof the srcDescand

    srcDiffDesctensors differ.

    The strides nStride, cStride, hStride,wStrideof the srcDescand srcDiffDesctensors differ.

    The dimensions n,c,h,wof the destDescand destDiffDesctensors differ.

    The stridesnStride, cStride,hStride, wStrideof the destDescand

    destDiffDesctensors differ.

    The datatypeof the four tensors differ.

    CUDNN_STATUS_NOT_SUPPORTED ThewStrideof input tensor or output tensor isnot 1.

    CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.

    4.33. cudnnActivationForward

    cudnnStatus_tcudnnActivationForward( cudnnHandle_t handle, cudnnActivationMode_t mode, cudnnTensor4dDescriptor_t srcDesc, constvoid *srcData, cudnnTensor4dDescriptor_t destDesc, void *destData )

    This routine applies a specified neuron activation function element-wise over each inputvalue.

  • 8/10/2019 CUDNN Library (1)

    35/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 33

    Param In/out Meaning

    handle input Handle to a previously created cuDNN context.

    mode input Enumerant to specify the activation mode.

    srcDesc input Handle to the previously initialized input tensor descriptor.

    srcData input Data pointer to GPU memory associated with the tensor descriptor

    srcDesc.

    destDesc input Handle to the previously initialized output tensor descriptor.

    destData output Data pointer to GPU memory associated with the output tensor descriptor

    destDesc.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The function launched successfully.

    CUDNN_STATUS_BAD_PARAM The parametermodehas an invalid enumerantvalue.

    CUDNN_STATUS_NOT_SUPPORTED At least one of the following conditions are met:

    The dimensions n,c,h,wof the input tensorand output tensors differ.

    The datatypeof the input tensor and outputtensors differs.

    CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.

    4.34. cudnnActivationBackwardcudnnStatus_tcudnnActivationBackward( cudnnHandle_t handle, cudnnActivationMode_t mode, cudnnTensor4dDescriptor_t srcDesc, constvoid *srcData, cudnnTensor4dDescriptor_t srcDiffDesc, constvoid *srcDiffData, cudnnTensor4dDescriptor_t destDesc, constvoid *destData, cudnnTensor4dDescriptor_t destDiffDesc, void *destDiffData )

    This routine computes the gradient of a neuron activation function.

    Param In/out Meaning

    handle input Handle to a previously created cuDNN context.

    mode input Enumerant to specify the activation mode.

    srcDesc input Handle to the previously initialized input tensor descriptor.

  • 8/10/2019 CUDNN Library (1)

    36/38

    cuDNN API Reference

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 34

    Param In/out Meaning

    srcData input Data pointer to GPU memory associated with the tensor descriptor

    srcDesc.

    srcDiffDesc input Handle to the previously initialized input differential tensor descriptor.

    srcDiffData input Data pointer to GPU memory associated with the tensor descriptorsrcDiffData.

    destDesc input Handle to the previously initialized output tensor descriptor.

    destData input Data pointer to GPU memory associated with the output tensor descriptor

    destDesc.

    destDiffDesc input Handle to the previously initialized output differential tensor descriptor.

    destDiffData output Data pointer to GPU memory associated with the output tensor descriptor

    destDiffDesc.

    The possible error values returned by this function and their meanings are listed below.

    Return Value Meaning

    CUDNN_STATUS_SUCCESS The function launched successfully.

    CUDNN_STATUS_BAD_PARAM The parametermodehas an invalid enumerant

    value.

    CUDNN_STATUS_NOT_SUPPORTED At least one of the following conditions are met:

    The dimensions n,c,h,wof the four tensorsdiffer.

    The strides nStride, cStride, hStride,

    wStrideof the input tensor and the input

    differential tensor differ.

    The strides nStride, cStride, hStride,

    wStrideof the output tensor and the output

    differential tensor differ.

    The datatypeof the four tensors differs.

    CUDNN_STATUS_EXECUTION_FAILED The function failed to launch on the GPU.

  • 8/10/2019 CUDNN Library (1)

    37/38

    www.nvidia.comcuDNN Library DU-06702-001_v6.5 | 35

    Chapter 5.ACKNOWLEDGMENTS

    Some of the cuDNN library routines were derived from code developed by theUniversity of Tennessee and are subject to the Modified Berkeley Software DistributionLicense as follows:

    Copyright (c) 2010 The University of Tennessee.

    All rights reserved.

    Redistribution and use in source and binary forms, with or withoutmodification, are permitted provided that the following conditions aremet: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer listed in this license in the documentation and/or other materials provided with the distribution. * Neither the name of the copyright holders nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOTLIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FORA PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHTOWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOTLIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANYTHEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USEOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

  • 8/10/2019 CUDNN Library (1)

    38/38