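【Posted】:2020-07-23 15:17:18
【Question】:
I am computing word similarity on Ubuntu 18.04 LTS with the PyTorch C++ API (1.5.1, CUDA 10.1), using the torch::nn::Embedding module with pretrained word vectors (glove.300d). I believe I have moved everything I can to the GPU, but when I run the program it still reports the following (the full error log is at the end of the question):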
Expected object of device type cuda but got device type cpu for
argument #1 'self' in call to _th_index_select
(checked_dense_tensor_unwrap at /pytorch/aten/src/ATen/Utils.h:72)
I checked my model initialization in main.cpp, and the initialization on its own runs fine.
SimilarityModel simiModel(args, 400000, 300);
simiModel.to(device);
//model forward
torch::Tensor data = ids.index({Slice(i*batch_size, (i+1)*batch_size), Slice()}).to(torch::kInt64).to(device); //take a batch
tie(score, indice) = simiModel.forward(data); //forward and transfer score, indice to cpu for further calculation
This is how I define SimilarityModel in Similarity.h:
class SimilarityModel : public torch::nn::Module {
public:
int64_t topk; // num of top words;
Dictionary dict;
int64_t vocab_size;
int64_t embedding_dim;
torch::nn::Embedding embedding{nullptr};
vector<vector<float> > vec_embed;
SimilarityModel(unordered_map<string, string> args, int64_t vocab_size, int64_t embed_dim);
tuple<torch::Tensor, torch::Tensor> forward(torch::Tensor x);
};
I initialize the embedding in the SimilarityModel constructor in Similarity.cpp:
SimilarityModel::SimilarityModel(unordered_map<string, string> args, int64_t vocab_size, int64_t embed_dim)
:embedding(vocab_size, embed_dim) { //Embedding initialize
this->topk = stoi(args["topk"]);
vector<vector<float> > pre_embed;
tie(pre_embed, dict) = loadwordvec(args); //load pretrained wordvec from txt file
this->vocab_size = int64_t(dict.size());
this->embedding_dim = int64_t(pre_embed[0].size());
this->vec_embed = pre_embed;
this->dict = dict;
vector<float> temp_embed;
for(const auto& i : pre_embed) //flatten to 1-d
for(const auto& j : i)
temp_embed.push_back(j);
torch::Tensor data = torch::from_blob(temp_embed.data(), {this->vocab_size, this->embedding_dim}, torch::TensorOptions().dtype(torch::kFloat32)).clone(); //vector to tensor
register_module("embedding", embedding);
this->embedding = embedding.from_pretrained(data, torch::nn::EmbeddingFromPretrainedOptions().freeze(true));
}
And the forward function in Similarity.cpp:
tuple<torch::Tensor, torch::Tensor> SimilarityModel::forward(torch::Tensor x) {
auto cuda_available = torch::cuda::is_available(); //copy to gpu
torch::Device device(cuda_available ? torch::kCUDA : torch::kCPU);
torch::Tensor wordvec;
wordvec = this->embedding->forward(x).to(device); //python:embedding(x)
torch::Tensor similarity_score = wordvec.matmul(this->embedding->weight.transpose(0, 1)).to(device);
torch::Tensor score, indice;
tie(score, indice) = similarity_score.topk(this->topk, -1, true, true); //Tensor.topk(int64_t k, int64_t dim, bool largest = true, bool sorted = true)
score = score.to(device);
indice = indice.to(device);
score.slice(1, 1, score.size(1)); //Tensor.slice(int64_t dim, int64_t start, int64_t end, int64_t step)
indice.slice(1, 1, indice.size(1));
return {score.cpu(), indice.cpu()}; //transfer to cpu for further calculation
}
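The intermediate variables in forward() have also been put on the GPU. However, I have no idea which tensor is still on the CPU, and the error log is not much help. I have already tried the approach from "Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _th_index_select", namely calling SimilarityModel().to(device), but that did not work. I still find this error log hard to read and would like some guidance on how to debug this kind of problem.
Error log: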
terminate called after throwing an instance of 'c10::Error'
what(): Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _th_index_select (checked_dense_tensor_unwrap at /pytorch/aten/src/ATen/Utils.h:72)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x46 (0x7fb566a27536 in /home/switchsyj/Downloads/libtorch/lib/libc10.so)
frame #1: <unknown function> + 0x101a80b (0x7fb520fa380b in /home/switchsyj/Downloads/libtorch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x105009c (0x7fb520fd909c in /home/switchsyj/Downloads/libtorch/lib/libtorch_cuda.so)
frame #3: <unknown function> + 0xf9d76b (0x7fb520f2676b in /home/switchsyj/Downloads/libtorch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x10c44e3 (0x7fb558d224e3 in /home/switchsyj/Downloads/libtorch/lib/libtorch_cpu.so)
frame #5: at::native::embedding(at::Tensor const&, at::Tensor const&, long, bool, bool) + 0x2e2 (0x7fb558870712 in /home/switchsyj/Downloads/libtorch/lib/libtorch_cpu.so)
frame #6: <unknown function> + 0x114ef9d (0x7fb558dacf9d in /home/switchsyj/Downloads/libtorch/lib/libtorch_cpu.so)
frame #7: <unknown function> + 0x1187b4d (0x7fb558de5b4d in /home/switchsyj/Downloads/libtorch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0x2bfe42f (0x7fb55a85c42f in /home/switchsyj/Downloads/libtorch/lib/libtorch_cpu.so)
frame #9: <unknown function> + 0x1187b4d (0x7fb558de5b4d in /home/switchsyj/Downloads/libtorch/lib/libtorch_cpu.so)
frame #10: <unknown function> + 0x32b63a9 (0x7fb55af143a9 in /home/switchsyj/Downloads/libtorch/lib/libtorch_cpu.so)
frame #11: torch::nn::EmbeddingImpl::forward(at::Tensor const&) + 0x71 (0x7fb55af127b1 in /home/switchsyj/Downloads/libtorch/lib/libtorch_cpu.so)
frame #12: SimilarityModel::forward(at::Tensor) + 0xa9 (0x55c96b8e5793 in ./demo)
frame #13: main + 0xaba (0x55c96b8bfe5c in ./demo)
frame #14: __libc_start_main + 0xe7 (0x7fb51edf5b97 in /lib/x86_64-linux-gnu/libc.so.6)
frame #15: _start + 0x2a (0x55c96b8bd74a in ./demo)
Aborted (core dumped)
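【Comments】:
- Are you sure this->embedding has been moved to the GPU?
- Not sure, but I noticed that the C++ API has no .device() function for the Embedding class, and I also could not put it on the GPU via .cuda() or .to(device). I suspect this->embedding is not on the GPU either (some say .to(device) is not an in-place operation), so I tried wrapping the model as a torch module with TORCH_MODULE(SimilarityModel) in SimilarityModel.h and then calling simiModel->to(device) in main.cpp. In the end, the same error still appeared. @Berriel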
Tags: c++ pytorch runtime-error embedding libtorch