【Question title】: Don't understand, IndexError: too many indices for array
【Posted】: 2017-10-05 23:21:50
【Question】:

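My task is to remove longitude/latitude coordinates whenever the distance between points falls within a given threshold (5 km, 10 km, or 30 km). This is for modelling purposes, to avoid clustering of points. I am using the haversine formula to measure distance.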
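For reference, the haversine distance mentioned above can be sketched as a small standalone function. This is a minimal illustration written for Python 3, not the asker's code; the names `haversine_km` and `EARTH_RADIUS_KM` are made up for this example, and the radius of 6371 km is the value used in the script further down.

```python
import math

# Mean Earth radius in km, the same constant the posted script uses.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of latitude along a meridian is roughly 111.19 km.
print(round(haversine_km(0.0, 0.0, 0.0, 1.0), 2))
```

Keeping the distance calculation in its own function makes it easy to check against a known value before it is used inside the filtering loops.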

Below is my initial code. It loads the geometry records from the point shapefile, converts them to an array, then compares each pair of coordinates and measures the distance between them. After that it should remove the longitude/latitude pairs that are close to each other, but I am stuck at this step.

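My plan is to update the list of coordinate pairs and iterate again using the new set of pairs.

Running the script below gives me this error:

IndexError: too many indices for array

It seems the indices in the iteration are not being updated: the loop still uses the indices from the first pass.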

import math, easygui, shapefile, itertools, os
import pandas as pd
import numpy as np

filepath = easygui.fileopenbox()

input_dist = int(raw_input("Distance Filter Value?: "))
input_crop = raw_input("what crop?: ")

directory = os.path.split(filepath)[0]

def dist_haversine(shp,input_dist,input_crop):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    """

    r = shapefile.Reader(shp)
    idx = np.arange(len(r.records()))
    coordinates = []
    for i in idx:
        geom = r.shape(i)
        coordinates.append(geom.points[0])    

    acoords = np.array(coordinates)

    for r,n in itertools.izip(acoords[:,0],acoords[:,1]):

        coordinates_ = []

        for i,j in itertools.izip(acoords[:,0],acoords[:,1]):

            lon1=r
            lat1=n
            lon2=i
            lat2=j

            lon1, lat1, lon2, lat2 = map(math.radians, [lon1, lat1, lon2, lat2])

            # haversine formula
            dlon = lon2 - lon1 
            dlat = lat2 - lat1 
            a = math.sin(dlat/2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon/2)**2
            c = 2 * math.asin(math.sqrt(a)) 
            km = c*6371 #/1000.0

            if km > input_dist:
                coordinates_.append([i,j])

        coordinates[:] = coordinates_
        acoords = np.array(coordinates)

    df_coords_ = pd.DataFrame(coordinates).drop_duplicates().values
    df_coords = pd.DataFrame(df_coords_, columns=['Lon','Lat'])

    df_coords.insert(0, 'Crop', input_crop)  

    return df_coords.to_csv(os.path.split(directory)[0] + "\\" + "%s_distFilter_%skm.csv" % (input_crop, input_dist), sep=",", index=None)

Traceback:

File "<ipython-input-3-8e88eba2ab54>", line 1, in <module>
  dist_haversine(filepath,input_dist,input_crop)
File "<ipython-input-2-d43a1f1da26a>", line 20, in dist_haversine
  for i,j in itertools.izip(acoords[:,0],acoords[:,1]):
IndexError: too many indices for array
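
One possible reading of this traceback, offered as an assumption rather than a confirmed diagnosis: the script rebuilds `acoords` from the filtered list on every outer pass, and if a pass filters out every point, `np.array` of an empty list is one-dimensional, so `acoords[:,0]` has one index too many. The sketch below reproduces that behaviour with made-up coordinates; `as_coord_array` is a hypothetical helper, not part of the original script.

```python
import numpy as np


def as_coord_array(pairs):
    """Return an (n, 2) float array, even when `pairs` is empty (hypothetical helper)."""
    return np.asarray(pairs, dtype=float).reshape(-1, 2)


# Two made-up (lon, lat) pairs give a proper 2-D array, so column indexing works.
two_d = np.array([[120.9, 14.6], [121.0, 14.7]])
print(two_d.shape)  # (2, 2)

# An empty list, as left behind when every point has been filtered out,
# becomes a 1-D array of shape (0,).
empty = np.array([])
print(empty.shape)  # (0,)

try:
    empty[:, 0]
except IndexError as exc:
    print("IndexError:", exc)

# Forcing two columns keeps the array 2-D, so column indexing stays valid.
safe = as_coord_array([])
print(safe.shape)  # (0, 2)
```

If this is what happens with the real data, reshaping to two columns removes the exception but not the underlying problem that the loop can discard every remaining point, including ones that should be kept.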

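【Comments】:

  • Hi, please post the full exception. It might also help to "simplify" the problem and add some sample data. The problem seems to be about NumPy arrays rather than about coordinates.
  • Hi, this is the full error message: Traceback (most recent call last): /n File "", line 1, in dist_haversine(filepath,input_dist,input_crop) /n File "", line 20, in dist_haversine for i,j in itertools.izip(acoords[:,0],acoords[:,1]): /n IndexError: too many indices for array /n I uploaded the files to Dropbox: dropbox.com/s/2s25tlkzbif3d54/sample_rice.tar.gz?dl=0
  • Any workaround?
  • If you are willing to post a minimal reproducible example, someone may be willing to help you.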

Tags: python filter distance


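【Solution 1】:

Here is the initial solution I used to filter the points. It works and is reasonably fast for datasets of 1,000 to 3,000 points. However, filtering 50,000 points takes 2.5 to 3 hours to complete.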

def dist_haversine(filepath,input_dist,input_crop):
    """
    Drop every point that lies within input_dist km (haversine distance)
    of a point that is kept, then write the remaining points to a CSV file.
    """

    r = shapefile.Reader(filepath)
    idx = np.arange(len(r.records()))
    coordinates = []
    for i in idx:
        geom = r.shape(i)
        coordinates.append(geom.points[0])       

    acoords = np.array(coordinates)

    index = []        
    for r,n,l in itertools.izip(acoords[:,0],acoords[:,1],idx):
        if l in index:
            continue
        else:
            for i,j,k in itertools.izip(acoords[:,0],acoords[:,1], idx):
                if k in index:
                    continue

                else:

                    lon1=r
                    lat1=n
                    lon2=i
                    lat2=j

                    coord_check = ((lon1 == lon2) & (lat1 == lat2))*1

                    if coord_check == 1:
                        continue

                    else:
                        lon1, lat1, lon2, lat2 = map(math.radians, [lon1, lat1, lon2, lat2])

                        # haversine formula
                        dlon = lon2 - lon1 
                        dlat = lat2 - lat1 
                        a = math.sin(dlat/2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon/2)**2
                        c = 2 * math.asin(math.sqrt(a)) 
                        km = c*6371 #/1000.0

                    if km < input_dist:
                        if k in index:
                            continue
                        else:
                            index.append(k)

    filterList = [i for j, i in enumerate(coordinates) if j not in index]

    df_coords = pd.DataFrame(filterList, columns=['Lon','Lat'])

    df_coords.insert(0, 'Crop', input_crop)  

    return df_coords.to_csv(directory + "\\" + "%s_distFilter_%skm.csv" % (input_crop, input_dist), sep=",", index=None)
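
The nested loops above call the haversine formula once per pair of points in Python and test `k in index` against a growing list, which is a plausible reason for the long run time on 50,000 points. A hedged sketch of the same greedy idea with NumPy follows: each kept point is compared against all points in one array operation, and removed points are tracked in a boolean mask. It is written for Python 3, the function name `thin_points` and the sample coordinates are illustrative, and it differs from the code above in one respect: exact duplicates of a kept point are dropped too.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # same radius as in the code above


def thin_points(lon, lat, min_dist_km):
    """Return a boolean mask of points to keep (greedy, in input order)."""
    lon = np.radians(np.asarray(lon, dtype=float))
    lat = np.radians(np.asarray(lat, dtype=float))
    keep = np.ones(lon.size, dtype=bool)

    for i in range(lon.size):
        if not keep[i]:
            continue  # already removed by an earlier kept point
        # Haversine distance from point i to every point at once.
        dlon = lon - lon[i]
        dlat = lat - lat[i]
        a = np.sin(dlat / 2) ** 2 + np.cos(lat[i]) * np.cos(lat) * np.sin(dlon / 2) ** 2
        km = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))
        close = km < min_dist_km
        close[i] = False  # never drop the reference point itself
        keep &= ~close

    return keep


# Made-up sample: the second point is about 1.1 km from the first.
lons = [0.0, 0.0, 0.0]
lats = [0.0, 0.01, 1.0]
print(thin_points(lons, lats, 5))
```

The returned mask can be applied to the coordinate array (for example `acoords[mask]`) before building the DataFrame. The outer loop is still quadratic in the worst case, so this is a sketch of one way to reduce per-pair overhead, not a measured fix.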

【Comments】:
