分配矩阵时的 Fortran Seg 错误答案

【问题标题】：Fortran Seg Fault when assigning Matrices分配矩阵时的 Fortran Seg 错误
【发布时间】：2015-06-19 18:50:32
【问题描述】：

[更新] 代码和一些句子已更改以反映我在第二条评论中解释的实现。代码应该使用下面的行编译，但是，我有一个较旧的 gfortran，可能看不到您可能会看到的一些错误。

gfortran BLU_implementation_copy.f90 -o BLU_implementation_copy.x

我在使用 Fortran 90 运行时遇到了令人难以置信的不一致 seg fault 错误。

我的代码的总体目标是将随机生成的、复杂的、对称的、对角占优的矩阵分解成可以轻松并行化的块。最后可以找到精简版（相关功能除外）。

在函数 mat2tiles 中将原始矩阵分解为块（图块）时遇到错误。具体来说，这行代码一直失败：

o(i,j)=cell(x(a:min(i*Nblock,M), b:min(j*Nblock, N)))

这是将输入矩阵 X 的一个块分配给索引 i,j 处的矩阵 o。它之所以做到这一点，是因为 o 的每个索引都是通过派生类型的矩阵 NblockxNblock：

type cell complex*16 :: block(Nblock,Nblock) end type cell

type(cell) :: o(M/Nblock+rem,N/Nblock+rem)

对于某些矩阵大小和图块大小，一切正常。对于较大的尺寸，错误开始出现在接近矩阵尺寸的图块尺寸中，并最终达到矩阵尺寸的一半。例如，我发现这种情况的最小实例是一个 20x20 的矩阵，其 tile 大小为 18。但这不会始终如一地发生。如果我尝试所有正常运行并建立到该大小的较小矩阵，它就会运行。如果我做一个更大的尺寸来分段故障，然后用 18 运行 20x20，它会分段故障。我发现的最小一致参数是 25x25，平铺大小为 23。但即使是这个大小，如果我打印出在 o(i,j) 分配行之后失败的块：

print*, o(2,1)%block

它突然毫无问题地贯穿整个事情。取出打印语句使其再次出现段错误。最后，最烦人的一个 [已修复 - 请参阅我的第二条评论]，如果我做一个 2000x2000 的矩阵，其 tile 大小为 1000 或更大，我在进入函数后会出现 seg 错误（这意味着它发生在可变分配）。这让我相信这个问题可能源于矩阵的分配方式，特别是因为我使用的是派生类型。但是当我尝试诊断矩阵的大小和内容时，一切似乎都很正常。

program BLU_implementation
    implicit none
    integer :: rem
    integer, parameter :: M=20, N=20, Nblock=19
    real*8 :: start, finish
    complex*16 :: A(M,N)

    type cell
    complex*16 :: block(Nblock,Nblock)
    end type cell

!determines if cell matrix doesn't need to be bigger
    rem=1
    if (modulo(M,Nblock)==0) then
        rem=0
    endif

    call cpu_time(start)
    call functionCalling(A, M, N, Nblock, rem)
    call cpu_time(finish)
    print*, 'overall time:', finish-start, 'seconds'

contains
!==================================================================================================================
subroutine functionCalling(A, M, N, Nblock, rem)
    implicit none
    integer :: IPIV, INFO, M, N, Nblock, rem
    real*8 :: start, finish
    complex*16 :: A(M,N)

    type(cell) :: C(M/Nblock+rem,N/Nblock+rem)

    call cpu_time(start)

    A= CSPDmatrixFill(A,M,N)

    call cpu_time(finish)
    print*, 'matrix fill time:', finish-start, 'seconds'


    call cpu_time(start)

    C= mat2tiles(A,M,N,Nblock,rem) 

    call cpu_time(finish)
    print*, 'tiling time:', finish-start, 'seconds'

end subroutine

!===================================================================================================================
! generates a complex, symmetric, positive-definite matrix (based off of another's code)

function CSPDmatrixFill(A, M, N) result (Matrix)

! initialization  
      implicit none
      integer :: i, j
      integer :: M, N
      real*8 :: x, xi
      complex*16 :: A(M, N), Matrix(M, N), EYE(M, N), MT(N, M)

      EYE=0
      forall(j=1:M) EYE(j,j)=1

! execution

      call random_seed

      do i=1, M
        do j=1, N
          call random_number (x)
            call random_number(xi)
            Matrix(i,j) = cmplx(x,xi)
       end do
      end do  

! construct a symmetric matrix ( O(n^2) )
      call Mtranspose(Matrix, M, N)
      Matrix = Matrix+MT  

! make positive definite (diagonally dominant)
        Matrix = Matrix + N*EYE

end function CSPDmatrixFill

!======================================================================================================
subroutine Mtranspose(A, i, j)
! takes a matrix and the two parameters used to make the matrix: A(i,j)
! returns a matrix with switched indices: A(j,i)
      implicit none

      integer :: i, j
      complex*16 :: A(i,j), MT(j,i)

      MT=A(j,i)
      return
end subroutine Mtranspose

!=======================================================================================================
!MAT2TILES - breaks up an array into a cell array of adjacent sub-arrays of equal sizes
!
! O=mat2tiles(X,M,N,Nblock)
!
!will produce a cell array o containing adjacent chunks of the array X(M,N)
!with each chunk of dimensions NblockxNblock. If Nblock does
!not divide evenly into size(X,i), then the chunks at the upper boundary of
!X along dimension i will have bogus values that do not affect the factorization
!in the places where the matrix doesn't occupy. (according to older versions. Might have changed with some edits)
!
function mat2tiles(X,M,N,Nblock,rem) result(o)

! initialization  
      implicit none
      integer :: a, b, i, j, M, N, Nblock, rem
      complex*16 :: X(M,N)

      type(cell) :: o(M/Nblock+rem,N/Nblock+rem)

! diagnostic print statements
        print*,size(o(1,1)%block), size(o(1,2)%block), size(o(2,1)%block), size(o(2,2)%block)
        print*, 'got to start'

! turn matrix x into cell matrix o

      do j=1, N/Nblock+rem
        if (j==1) then
          b=j
        else 
          b=b+Nblock
        endif       
        do i=1, M/Nblock+rem
          if (i==1) then
            a=i
          else 
            a=a+Nblock
          endif
! diagnostic print statement
          print*, 'writing to o: i:', i, 'j:', j, 'i*Nblock:', i*Nblock, 'j*Nblock:', j*Nblock, 'min of i:', min(i*Nblock, M), &
                    'min of j:', min(j*Nblock, N), 'a:', a, 'b:', b

          o(i,j)=cell(x(a:min(i*Nblock,M), b:min(j*Nblock, N))) 
        enddo
      enddo
! diagnostic print statement
      print*, 'got to end'
      return 
end function mat2tiles

!==================================================================================================
end program

【问题讨论】：

您提供的代码甚至无法编译！我收到大量错误消息。请先解决这些问题，也许您的问题会在此过程中得到解决。
@AlexanderVogt 嗨，它为我编译了以下行：gfortran BLU_implementation_copy.f90 -o BLU_implementation_copy.x ~/lapack/liblapack.a ~/lapack/librefblas.a
大家好！更新：我发现了部分问题。我的编译器有点旧，没有发现错误。在另一台计算机上，编译器在 CSPDmatrix fill 的函数行中发现了错误，我在其中为函数定义了“复杂”。这与正确的数据类型相冲突，通过将其取出，运行 2000x2000 和 1000 块大小的案例。但是，其他情况仍然存在段错误。

标签： matrix segmentation-fault fortran derived-types

【解决方案1】：

在将问题缩小到使用数字与相同数字的变量之后，发现 Fortran 不喜欢将矩阵分配给不同维度的矩阵，即使它适合其中。这很奇怪，因为 M、N 和 Nblock 的较小值完美地解决了这个问题。

解决方案只是为i=1 和j=1 定义o(i,j)%block(1:Nblock,1:Nblock)=x(dim1:dim2,etc) 而不是o(i,j)=cell(x(dim1:dim2,etc)，并为i 和j 的所有情况分别稍微修改实例。

正确的代码（适用于所有矩阵大小和图块大小的情况）如下所示：

if (i==M/Nblock+rem .AND. j==N/Nblock+rem .AND. rem==1) then 
    o(i,j)%block(1:M-(i-1)*Nblock,1:N-(j-1)*Nblock)=x(a:min(i*Nblock,M), b:min(j*Nblock, N))
else if (i==M/Nblock+rem .AND. j/=N/Nblock+rem .AND. rem==1) then
    o(i,j)%block(1:M-(i-1)*Nblock,1:Nblock)=x(a:min(i*Nblock,M), b:min(j*Nblock, N))
else if (i/=M/Nblock+rem .AND. j==N/Nblock+rem .AND. rem==1) then
    o(i,j)%block(1:Nblock,1:N-(j-1)*Nblock)=x(a:min(i*Nblock,M), b:min(j*Nblock, N))
else
    o(i,j)%block(1:Nblock,1:Nblock)=x(a:min(i*Nblock,M), b:min(j*Nblock, N))
end if

通过这一更正，我的代码可以在新旧版本的 gfortran 上运行。然而，有趣的是，gfortran 的 Mac OS X 版本从未遇到此段错误，只有 Linux 版本。

【讨论】：