The error seems to be due to the internal (in-kernel) declaration. For example, the kernel is as follows:
    extern "C" __global__ void ComputationdClustersInternelOnGPU(int numTokenSrc, int numWordSrc,
            int srcLength, char *src, int *srctokensSFIndices, int *srctokensLength, int *srcIndices,
            int *srcStartIndices, int totalLengthDistinct, char *patternRemoved, int numTokenPattern,
            int numWordPattern, int patternLength, char *pattern, int *patterntokensSFIndices,
            int *patterntokensLength, int *patternIndices, int patternStartIndices, float WordsFinal)
    {
        int ix = blockIdx.x * blockDim.x + threadIdx.x;

        // dX is allocated here, inside the kernel, with device-side malloc
        int *dX = (int *) malloc(srcLength * totalLengthDistinct * sizeof(int));

        if (ix < totalLengthDistinct) {
            for (int i = 0; i < srcLength; i++) {
                if (src[i] == ',')
                    dX[ix * srcLength + i] = 0;
                else {
                    if (src[i] == patternRemoved[ix])
                        dX[ix * srcLength + i] = srcIndices[i];
                    else if (src[i] != patternRemoved[ix])
                        dX[ix * srcLength + i] = dX[ix * srcLength + i - 1];
                }
            }
        }
        __syncthreads();

        for (int i = 0; i < srcLength * totalLengthDistinct; i++) {
            printf("Elements of an array");
            printf("%d\n", dX[i]);
        }
    }
When I run this kernel, it prints all zeros for the matrix dX. However, when I allocate dX on the host and pass it to the kernel, the matrix comes out correct. What is wrong with declaring dX inside the device code? It is only an int matrix of size srcLength * totalLengthDistinct.
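For comparison, the working version allocates dX on the host side and passes the pointer into the kernel. Below is a rough sketch of what I mean (the names d_dX and h_dX, the zero-initialization, and the omitted launch arguments are placeholders, not my exact code):

    // Working variant: allocate dX with cudaMalloc on the host,
    // pass d_dX to the kernel as an extra int* parameter, and copy it back afterwards.
    size_t dXBytes = (size_t) srcLength * (size_t) totalLengthDistinct * sizeof(int);

    int *d_dX = NULL;
    cudaMalloc((void **) &d_dX, dXBytes);
    cudaMemset(d_dX, 0, dXBytes);

    // ... kernel launch goes here, with d_dX appended to the argument list ...
    cudaDeviceSynchronize();   // also flushes any in-kernel printf output

    int *h_dX = (int *) malloc(dXBytes);
    cudaMemcpy(h_dX, d_dX, dXBytes, cudaMemcpyDeviceToHost);

    cudaFree(d_dX);
    free(h_dX);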
Can you help me, please?