【Posted】: 2019-12-20 07:19:21
【Question】:
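I am trying to create a custom datatype (which I call AttribStruct) using Open MPI's Java bindings, but when I run the program I get an invalid datatype error. I suspect this is because the dimensions of the arrays in the struct are determined at runtime.
My goal is to send n AttribStruct buffers in each send and recv operation. To do that, I concatenate all n individual buffers into one large buffer so that they are sent together. The receiving side then deconstructs the buffer.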
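To make the concatenate-and-deconstruct idea concrete, here is a minimal sketch that uses only `java.nio.ByteBuffer` and no MPI calls. The class `RecordPacking` and its methods are hypothetical names introduced for illustration. It assumes each record stores its doubles first, followed by the two int flags, with no padding; the layout that `mpi.Struct` actually produces may differ, so treat the offsets as an assumption rather than the library's behavior.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper (not part of Open MPI): packs n fixed-size records
// into one contiguous direct buffer and reads them back by offset.
public class RecordPacking {

    // Assumed layout: all doubles first, then two int flags, no padding.
    public static int recordSize(int objCount, int varCount, int constrCount) {
        return (objCount + constrCount + varCount) * Double.BYTES + 2 * Integer.BYTES;
    }

    // values[i] holds the doubles of record i, flags[i] its two int flags.
    public static ByteBuffer pack(double[][] values, int[][] flags, int recordSize) {
        ByteBuffer buf = ByteBuffer.allocateDirect(values.length * recordSize)
                                   .order(ByteOrder.nativeOrder());
        for (int i = 0; i < values.length; i++) {
            buf.position(i * recordSize);      // start of record i
            for (double v : values[i]) {
                buf.putDouble(v);
            }
            buf.putInt(flags[i][0]);
            buf.putInt(flags[i][1]);
        }
        buf.rewind();
        return buf;
    }

    // Reads the index-th double of the given record (absolute read).
    public static double readDouble(ByteBuffer buf, int recordSize, int record, int index) {
        return buf.getDouble(record * recordSize + index * Double.BYTES);
    }

    // Reads flag 0 or 1 of the given record; doubleCount is the doubles per record.
    public static int readFlag(ByteBuffer buf, int recordSize, int doubleCount,
                               int record, int which) {
        return buf.getInt(record * recordSize + doubleCount * Double.BYTES
                          + which * Integer.BYTES);
    }

    public static void main(String[] args) {
        int size = recordSize(4, 3, 0);
        double[][] values = { {1, 2, 3, 4, 5, 6, 7}, {8, 9, 10, 11, 12, 13, 14} };
        int[][] flags = { {1, 0}, {0, 1} };
        ByteBuffer buf = pack(values, flags, size);
        System.out.println("record size = " + size + " bytes");
        System.out.println("record 1, first double = " + readDouble(buf, size, 1, 0));
    }
}
```

The point of the sketch is that every record has the same byte size, so record i always starts at `i * recordSize`. That is the property the receiver relies on when it splits the large buffer back into n structs.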
Below is the AttribStruct class:
import mpi.Struct;

public class AttribStruct extends Struct {
    // Pointers to the objectives, variables, and constraints
    private final int objectives;
    private final int variables;
    private final int constraints;

    // an int that represents a boolean for whether the objectives are valid (1 true, 0 false)
    private final int validObjectiveFunctionsValues;
    private final int validConstraintsViolationValues;

    public final int objCount;
    public final int varCount;
    public final int constrCount;

    public AttribStruct(int objCount, int varCount, int constrCount) {
        this.objCount = objCount;
        this.varCount = varCount;
        this.constrCount = constrCount;
        this.objectives = addDouble(this.objCount);
        this.constraints = addDouble(this.constrCount);
        this.variables = addDouble(this.varCount);
        this.validObjectiveFunctionsValues = addInt();
        this.validConstraintsViolationValues = addInt();
    }

    @Override
    protected Data newData() {
        return new Data();
    }

    public class Data extends Struct.Data {
        /*
         * Getters go here
         */
        /*
         * Setters go here
         */
    } // End -- Data
}
Here is an example of how I send multiple structs:
AttribStruct attr = new AttribStruct(4, 3, 0);
/*
* Build buffer goes here
*/
attr.getType().commit();
MPI.COMM_WORLD.iSend(toSend, n, attr.getType(), target, 0);
attr.getType().free();
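Here toSend is the n AttribStruct buffers concatenated together, n is the number of attribute structs I am sending, attr is an instance of the AttribStruct class, target is the node I am communicating with, and 0 is just a placeholder tag.
Here is a rough example of how the target node receives the message: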
AttribStruct attr = new AttribStruct(4, 3, 0);
attr.getType().commit();
MPI.COMM_WORLD.recv(msgBuffer, n, attr.getType(), MASTER_RANK, 0);
attr.getType().free();
/*
* Deconstruct buffer goes here
*/
However, when I run the program, I get the following error message:
[node01:13219] *** An error occurred in MPI_Recv
[node01:13219] *** reported by process [203161601,0]
[node01:13219] *** on communicator MPI_COMM_WORLD
[node01:13219] *** MPI_ERR_TYPE: invalid datatype
[node01:13219] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[node01:13219] *** and potentially your MPI job)
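Does my overall strategy make sense? If so, can you help me figure out what I am doing wrong? Let me know if you need more details, since I left out some code to reduce clutter.
【Discussion】: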