Linking CUDA and C++: Undefined symbols for architecture i386 (answers)

【Question title】: Linking CUDA and C++: Undefined symbols for architecture i386
【Posted】: 2011-10-03 13:32:54
【Question】:

I have tried really hard to get this working, without success, and I hope someone can help. I have two source files.

Main.cpp

#include <stdio.h>
#include "Math.h"
#include <math.h>
#include <iostream>

int cuda_function(int a, int b);
int callKnn(void);

int main(void)
{
    int x = cuda_function(1, 2);
    int f = callKnn();
    std::cout << f << std::endl;
    return 1;
}

CudaFunctions.cu

#include <cuda.h>
#include <stdio.h>
#include "Math.h"
#include <math.h>
#include "cuda.h"
#include <time.h>
#include "knn_cuda_without_indexes.cu"

__global__ void kernel(int a, int b)
{
  //statements
}

int cuda_function2(int a, int b)
{
    return 2;
}

int callKnn(void)
{   
    // Variables and parameters
    float* ref;                 // Pointer to reference point array
    float* query;               // Pointer to query point array
    float* dist;                // Pointer to distance array
    int    ref_nb     = 4096;   // Reference point number, max=65535
    int    query_nb   = 4096;   // Query point number,     max=65535
    int    dim        = 32;     // Dimension of points
    int    k          = 20;     // Nearest neighbors to consider
    int    iterations = 100;
    int    i;

    // Memory allocation
    ref    = (float *) malloc(ref_nb   * dim * sizeof(float));
    query  = (float *) malloc(query_nb * dim * sizeof(float));
    dist   = (float *) malloc(query_nb * sizeof(float));

    // Init 
    srand(time(NULL));
    for (i=0 ; i<ref_nb   * dim ; i++) ref[i]    = (float)rand() / (float)RAND_MAX;
    for (i=0 ; i<query_nb * dim ; i++) query[i]  = (float)rand() / (float)RAND_MAX;

    // Variables for duration evaluation
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    float elapsed_time;

    // Display informations
    printf("Number of reference points      : %6d\n", ref_nb  );
    printf("Number of query points          : %6d\n", query_nb);
    printf("Dimension of points             : %4d\n", dim     );
    printf("Number of neighbors to consider : %4d\n", k       );
    printf("Processing kNN search           :"                );

    // Call kNN search CUDA
    cudaEventRecord(start, 0);
    for (i=0; i<iterations; i++)
        knn(ref, ref_nb, query, query_nb, dim, k, dist);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&elapsed_time, start, stop);
    printf(" done in %f s for %d iterations (%f s by iteration)\n", elapsed_time/1000, iterations, elapsed_time/(iterations*1000));

    // Destroy cuda event object and free memory
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    free(dist);
    free(query);
    free(ref);

    return 1;
}

I tried to build it from the terminal with the following commands:

g++ -c Main.cpp -m32
nvcc -c CudaFunctions.cu -lcuda -D_CRT_SECURE_NO_DEPRECATE
nvcc -o mytest Main.o CudaFunctions.o

but I get the following errors:

Undefined symbols for architecture i386:
  "cuda_function(int, int)", referenced from:
      _main in Main.o
  "_cuInit", referenced from:
      knn(float*, int, float*, int, int, int, float*)in CudaFunctions.o
  "_cuCtxCreate_v2", referenced from:
      knn(float*, int, float*, int, int, int, float*)in CudaFunctions.o
  "_cuMemGetInfo_v2", referenced from:
      knn(float*, int, float*, int, int, int, float*)in CudaFunctions.o
  "_cuCtxDetach", referenced from:
      knn(float*, int, float*, int, int, int, float*)in CudaFunctions.o
ld: symbol(s) not found for architecture i386
collect2: ld returned 1 exit status

I don't know whether this has something to do with the #include statements or with the header files. I have run out of ideas to try.

【Comments】:

Do you also need to tell nvcc that your CUDA code is 32-bit?
I tried that. Same error.

Tags: c++ macos linker cuda

【Solution 1】:

The first undefined symbol

"cuda_function(int, int)", referenced from:
   _main in Main.o

occurs because CudaFunctions.cu defines cuda_function2, not cuda_function. Correct the name in either CudaFunctions.cu or Main.cpp so that the declaration and the definition match.
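To make the name fix concrete, here is a minimal C++ sketch of a declaration and a definition that agree. It assumes you choose to rename the definition in CudaFunctions.cu rather than the call in Main.cpp. Both pieces are placed in one file only for brevity, and the body is the placeholder from the question.

```cpp
// Sketch only: in the question the declaration lives in Main.cpp and the
// definition lives in CudaFunctions.cu. They are shown together here.

// Declaration, exactly as written in Main.cpp.
int cuda_function(int a, int b);

// Definition, renamed from cuda_function2 so that it matches the declaration.
// The body is the placeholder from the question and ignores its arguments.
int cuda_function(int a, int b)
{
    return 2;
}
```

With the names matching, the call cuda_function(1, 2) in Main.cpp can be resolved at link time. The remaining undefined symbols (_cuInit and the others) are a separate problem.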

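The remaining undefined symbols are caused by not linking correctly against libcuda.dylib, which is where those symbols live. In the question, -lcuda is passed to the compile-only step (-c), where no linking takes place. Try moving the -lcuda argument to the second nvcc command line, the one that actually links the program together. Better yet, try omitting the -lcuda argument entirely, because it isn't necessary.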

【Comments】:

Hi Jared. This is the error I get: ld: duplicate symbol _main in CudaFunctions.o and Main.o for architecture i386 collect2: ld returned 1 exit status
I tried going to Developer/GPU Computing/C and typing make clean make x86_64=1 but got the same result.
I have a different example with the same problem on another forum. forums.nvidia.com/index.php?showtopic=211771