GPS下采样答案

【问题标题】：GPS downsamplingGPS下采样
【发布时间】：2016-02-14 21:06:55
【问题描述】：

我的目标是每 100m 对我的数据进行下采样并得到第一行和最后一行

我的问题是，当我下采样时，我得到的行数比我应该得到的要少得多，而且我不知道如何得到最后一行。

希望足够清楚，有人能理解

To make this
Line 20130904_0848.nmea
$GPGGA,111936.00,5849.37538,N,01739.88263,,E,2,09,00.9,00004.43,M,0024.87,M,007,0734*42
$GPGGA,111936.00,5849.37548,N,01739.88240,,E,2,09,00.9,00004.43,M,0024.87,M,007,0734*44
$GPGGA,111936.00,5849.37556,N,01739.88216,,E,2,09,00.9,00004.43,M,0024.87,M,007,0734*48
$GPGGA,111936.00,5849.37569,N,01739.88193,,E,2,09,00.9,00004.43,M,0024.87,M,007,0734*4a
$GPGGA,111936.00,5849.37581,N,01739.88171,,E,2,09,00.9,00004.43,M,0024.87,M,007,0734*40
$GPGGA,111936.00,5849.69118,N,01739.89674,,E,2,09,00.9,00004.43,M,0024.87,M,007,0734*4c
EOL 

Line 20130904_0926.nmea
$GPGGA,111936.00,5849.67569,N,01739.98426,,E,2,09,00.9,00004.43,M,0024.87,M,007,0734*45
$GPGGA,111936.00,5849.67593,N,01739.98453,,E,2,09,00.9,00004.43,M,0024.87,M,007,0734*42
$GPGGA,111936.00,5849.67616,N,01739.98479,,E,2,09,00.9,00004.43,M,0024.87,M,007,0734*44
....

Look like this

Line 20081002-1119.nmea
58.853952   13.309779   0.00
58.853907   13.310688   101.15
58.853858   13.311593   100.72
58.853811   13.312498   100.62
58.853764   13.313402   100.59
58.853752   13.313660   28.70

EOL

Line 20081002-1119.nmea
58.853952   13.309779   0.00
58.853907   13.310688   101.15
58.853858   13.311593   100.72
58.853811   13.312498   100.62
58.853764   13.313402   100.59
...

这是我目前的代码

from math import sin, cos, sqrt, atan2, radians

coord=[]
coord1=None
def distance(coord1,coord2): #Haversin
    lat1,lon1=coord1
    lat2,lon2=coord2
    dlat = radians(lat2-lat1)
    dlon = radians(lon2-lon1)
    a = sin(dlat/2) * sin(dlat/2)
    + cos(radians(lat1))*cos(radians(lat2))*sin(dlon/2)*sin(dlon/2)
    c = 2 *atan2(sqrt(a),sqrt(1-a))
    s = (6367*c)*1000 #meter
    return s

# with open as data will close itself after reading each line. so you don't need to close it yourself

with open('asko_nav_2013.nmea', 'r') as indata:         #making a indata and outdata, r stands for reading(readcapabilities
    with open('asko_nav_out.txt', 'w') as outdata:      #w stands for write write to a new file(open for writing-you can change things)


        while True:
            line = indata.readline()
            if not line:
                break
            if line.startswith('EOL'):  #if the line starts with EOL(end of line) it writes it in the output
                outdata.writelines("EOL")
                coord1=None
            elif line.startswith('Line'): 
                LineID=line
                outdata.writelines('\n%s' %LineID)
            elif line.startswith('$GPGGA'):  #when the fist line starts with $GPGGA it splits the columns
                data=line.split(",")        #the for loop reads the file line by line



            # Importing only coordinates from asko input file (Row 2 and 4)

                # Converting the coordinates from DDMM.MMMM to DD.DDDDDD
                LAT=(data[2])
                LAT_D=LAT[0:2]               
                LATID=float(LAT_D)

                LAT_M=LAT[2:]
                LATM=float(LAT_M)
                LATIM = float(LATM) / 60.0

                latitude=(LATID + LATIM)                  

                LON=(data[4])
                LON_D=LON[1:3]
                LONGD=float(LON_D)

                LON_M=LON[3:]
                LONM=float(LON_M)
                LONGM = float(LONM) / 60.0

                longitude=(LONGD + LONGM)

                if coord1 is None:

                # The first time through the loop "coord1" is None
                    outdata.writelines('%0.6f\t%0.6f\t%s \n'%(latitude,longitude,0))
                    coord1=(latitude,longitude)
                else:
                    coord2=(latitude,longitude)
                    dist=distance(coord1,coord2)

                    if dist <100:
                        continue
                    outdata.writelines('%0.6f\t%0.6f\t%f\n' % (latitude,longitude,dist))
                    coord1=coord2

【问题讨论】：

我尽可能地改进了缩进，并删除了行首的所有>，就好像您从邮件列表中获取的一样。你能否进一步改进缩进，因为你是唯一知道哪些块属于哪里的人。
这样想，我不知道，但它看起来不错，我认为你的 cmets 缩进了一行。但最重要的部分是 if: 等中的代码块。

标签： python gps coordinates downsampling

【解决方案1】：

您的代码可以进行一些重组以使其更清晰。在距离小于 100m 的情况下，每当看到 EOL 时，您需要添加额外的写入：

from math import sin, cos, sqrt, atan2, radians    

def distance(coord1, coord2): #Haversin
    lat1,lon1=coord1
    lat2,lon2=coord2
    dlat = radians(lat2-lat1)
    dlon = radians(lon2-lon1)
    a = sin(dlat/2) * sin(dlat/2)
    + cos(radians(lat1))*cos(radians(lat2))*sin(dlon/2)*sin(dlon/2)
    c = 2 *atan2(sqrt(a),sqrt(1-a))
    s = (6367*c)*1000 #meter
    return s

def get_coordinates(data):
    # Importing only coordinates from asko input file (Row 2 and 4)
    # Converting the coordinates from DDMM.MMMM to DD.DDDDDD

    LAT = (data[2])
    LAT_D = LAT[0:2]               
    LATID = float(LAT_D)

    LAT_M = LAT[2:]
    LATM = float(LAT_M)
    LATIM = float(LATM) / 60.0

    latitude = (LATID + LATIM)                  

    LON = (data[4])
    LON_D = LON[1:3]
    LONGD = float(LON_D)

    LON_M = LON[3:]
    LONM = float(LON_M)
    LONGM = float(LONM) / 60.0

    longitude = (LONGD + LONGM)

    return (latitude, longitude)


coord1 = None

# with open as data will close itself after reading each line. so you don't need to close it yourself

with open('asko_nav_2013.nmea', 'r') as indata, open('asko_nav_out.txt', 'w') as outdata:
    for line in indata:
        if line.startswith('EOL'):  #if the line starts with EOL(end of line) it writes it in the output
            if dist < 100:
                outdata.write('%0.6f\t%0.6f\t%f\n' % (latitude, longitude, dist))
            outdata.write("\nEOL\n")
            coord1 = None   # Reset the first coordinate
        elif line.startswith('Line'): 
            outdata.write('\n%s' % line)
        elif line.startswith('$GPGGA'):  #when the fist line starts with $GPGGA it splits the columns
            data=line.split(",")        #the for loop reads the file line by line
            latitude, longitude = get_coordinates(data)

            if coord1:
                coord2 = (latitude, longitude)
                dist = distance(coord1, coord2)

                if dist >= 100:
                    outdata.write('%0.6f\t%0.6f\t%f\n' % (latitude, longitude, dist))
                    coord1 = coord2         
            else:
                # The first time through the loop "coord1" is None
                outdata.write('%0.6f\t%0.6f\t0.0 \n' % (latitude, longitude))
                coord1 = (latitude, longitude)

对于给定的输入，这会产生以下输出文件：

Line 20130904_0848.nmea
58.822923   17.664710   0.0 
58.828186   17.664946   584.888514

EOL

Line 20130904_0926.nmea
58.827928   17.666404   0.0 
58.827936   17.666413   0.870480

EOL

您还需要在检测到EOL 时重置coord1，以确保0 在第一个条目时再次显示。

由于您的示例数据似乎与您的预期输出不符，因此很难看出这是否能完全解决问题。

【讨论】：

感谢重组，这是非常需要的 :D 距离应该在 100m 左右，除了第一个和最后一个。第一个应该是 0（就是它），最后一个应该只是该行中的最后一个。这是我需要帮助才能得到的
Line 20130904_0848.nmea 的预期输出是什么？
A 确实知道最后一行，每 100m - 谢谢
你的距离函数可能有问题，我没有检查过它的逻辑。
也许吧？但我找不到另一种方法：/

【解决方案2】：

解决关于结果行少于预期的第二个问题：您提供的有关问题性质和正在处理的输入数据的信息太少。如果您的输入数据是从移动物体行进的轨迹中采样的，尤其是在运动不是纯线性的情况下，那么“每 100m”对您的输入进行采样可能意味着不同。

假设您的输入描述了通过定期测量 GPS 坐标获得的坐标，同时沿着半径小于 15m 的圆移动。那么无论您的输入提供多少数据点，您提出的解决方案的输出永远不会超过两条线，因为沿着该曲线的任何两个点的绝对距离都不能超过 100m。这或许可以解释为什么您在输出中看到的行数少于预期。

如果您的意思是在每 100 米行进处对输入进行采样，则您必须对自最后一个采样输出点以来输入样本之间的所有距离求和，并使用它而不是 dist。修改Martin重组后的代码，可以这样（为简洁省略了一些行）：

coord1 = None
coord_last = None  # holds coordinate of last input sample
dist = 0.0         # total distance travelled since coord1
# [...]
with open('asko_nav_2013.nmea', 'r') as indata, open('asko_nav_out.txt', 'w') as outdata:
    for line in indata:
    # [...]
            if coord1:
                coord2 = (latitude, longitude)
                delta = distance(coord_last, coord2)
                dist += delta
                coord_last = coord2

                if dist >= 100:
                    outdata.write('%0.6f\t%0.6f\t%f\n' % (latitude, longitude, dist))
                    coord1 = coord2
                    dist = 0.0
            else:
                # The first time through the loop "coord1" is None
                outdata.write('%0.6f\t%0.6f\t0.0 \n' % (latitude, longitude))
                coord1 = (latitude, longitude)
                coord_last = coord1
                dist = 0.0

【讨论】：

现在有更多数据，但没有我希望的那么多，但仍然比以前多，非常感谢：D
您能否详细说明为什么您认为输出中应该有更多数据点？如果您提供有关如何获取数据（移动车辆？）的更具体信息，为什么您打算减少样本数量（可视化？）以及您确定输出大小令人满意的指标，这将有很大帮助。
哦，当然！它来自一艘正在绘制海底地图的船。数据应该在 QGIS 中用于制作地图 :) 进行下采样是因为大约有 70 000 行数据，所以我在下采样后得到的 200 行有点少
感谢您解决这个问题。那么你主要对时间序列不感兴趣，在这种情况下你应该忽略我上面的建议。您可能想查看Resampling irregularly spaced data to a regular grid in Python 之类的内容。我想QGIS也有重采样过滤器。简单地丢弃测量值似乎是一种浪费，尤其是。如果您不知道“船”是否遵循常规搜索轨迹（可能有助于将其可视化！）