Timestamp from date, and formatting in pandas or CSV: answers

【Question title】: Timestamp from date, and formatting in pandas or CSV
【Posted】: 2015-12-14 03:00:16
【Question】:

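I have a function that outputs a DataFrame generated from a RINEX (GPS) file. At present I write the DataFrame out to separate per-satellite files (1-32). I would like to access the first column (either while it is still a DataFrame or in these new files) so that I can format the date as a timestamp in seconds, like this: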

Epochs                  Epochs
2014-04-27 00:00:00 ->  00000
2014-04-27 00:00:30 ->  00030
2014-04-27 00:01:00 ->  00060

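This means stripping the date and then converting hh:mm:ss to seconds. I have hit a wall trying to work out how best to access the first column (Epochs) and then convert the whole column. The code I have been working with is: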

def read_data(self, RINEXfile):
    obs_data_chunks = []

    while True:
        obss, _, _, epochs, _ = self.read_data_chunk(RINEXfile)

        if obss.shape[0] == 0:
            break

        obs_data_chunks.append(pd.Panel(
            np.rollaxis(obss, 1, 0),
            items=['G%02d' % d for d in range(1, 33)],
            major_axis=epochs,
            minor_axis=self.obs_types
        ).dropna(axis=0, how='all').dropna(axis=2, how='all'))

        obs_data_chunks_dataframe = obs_data_chunks[0]

        for sv in range(32):
            sat = obs_data_chunks_dataframe[sv, :]
            print("sat_columns: {0}".format(sat.columns[0]))  # header of the first column: L1
            sat.to_csv(('SV_{0}').format(sv+1), index_label="Epochs", sep='\t')

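Should I do this conversion on the DataFrame (i.e. on "sat"), or on the file after calling "to_csv"? I am a bit lost here. I have the same question about formatting the columns. See the poorly aligned columns below: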

Epochs  L1  L2  P1  P2  C1  S1  S2
2014-04-27 00:00:00 669486.833  530073.33   24568752.516    24568762.572    24568751.442    43.0    38.0
2014-04-27 00:00:30 786184.519  621006.551  24590960.634    24590970.218    24590958.374    43.0    38.0
2014-04-27 00:01:00 902916.181  711966.252  24613174.234    24613180.219    24613173.065    42.0    38.0
2014-04-27 00:01:30 1019689.006 802958.016  24635396.428    24635402.41 24635395.627    42.0    37.0
2014-04-27 00:02:00 1136478.43  893962.705  24657620.079    24657627.11 24657621.828    42.0    37.0

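Update: When I said I was struggling to work out how best to access the first column (Epochs), I should have added that the "sat" DataFrame does not originally have "Epochs" in its header. It only has the signals: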

L1  L2  P1  P2  C1  S1  S2

The index (date and time) has no label in the header. To get around this in my CSV output files, I "forced" the name as follows:

sat.to_csv(('SV_{0}').format(sv+1), index_label="Epochs", sep='\t')

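I would expect that, before generating the CSV files, I should be able to (but do not know how to) access this index (date and time) column and simply convert all the dates/times in one go, so that timestamps are written out.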
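One way this in-place conversion might look is sketched below. The small `sat` frame is a hypothetical stand-in for one per-satellite frame, with the epochs as its index, and the CSV is written to an in-memory buffer rather than to the `SV_n` files so the sketch stays self-contained.

```python
import io

import pandas as pd

# Hypothetical stand-in for one per-satellite frame ("sat" in the question):
# the epochs are the index, the signals are the columns.
sat = pd.DataFrame(
    {"L1": [669486.833, 786184.519], "L2": [530073.33, 621006.551]},
    index=pd.to_datetime(["2014-04-27 00:00:00", "2014-04-27 00:00:30"]),
)

# Subtract midnight of the same day from every epoch, then express the
# resulting timedeltas as whole seconds (fractions are truncated).
since_midnight = sat.index - sat.index.normalize()
sat.index = since_midnight.total_seconds().astype("int64")

# index_label names the index column in the output, as in the question.
buf = io.StringIO()
sat.to_csv(buf, index_label="Epochs", sep="\t")
print(buf.getvalue())
```

Because the whole index is converted at once, no per-row loop is needed, and passing a file name instead of `buf` to `to_csv` should behave the same way.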

Update: The epochs are generated in another function:

epochs = np.zeros(CHUNK_SIZE, dtype='datetime64[us]')

Update:

def read_data_chunk(self, RINEXfile, CHUNK_SIZE = 10000):
    obss = np.empty((CHUNK_SIZE, TOTAL_SATS, len(self.obs_types)), dtype=np.float64) * np.NaN
    llis = np.zeros((CHUNK_SIZE, TOTAL_SATS, len(self.obs_types)), dtype=np.uint8)
    signal_strengths = np.zeros((CHUNK_SIZE, TOTAL_SATS, len(self.obs_types)), dtype=np.uint8)
    epochs = np.zeros(CHUNK_SIZE, dtype='datetime64[us]')
    flags = np.zeros(CHUNK_SIZE, dtype=np.uint8)

    i = 0
    while True:
        hdr = self.read_epoch_header(RINEXfile)
        #print hdr
        if hdr is None:
            break
        epoch, flags[i], sats = hdr
        epochs[i] = np.datetime64(epoch)
        sat_map = np.ones(len(sats)) * -1
        for n, sat in enumerate(sats):
            if sat[0] == 'G':
                sat_map[n] = int(sat[1:]) - 1
        obss[i], llis[i], signal_strengths[i] = self.read_obs(RINEXfile, len(sats), sat_map)
        i += 1
        if i >= CHUNK_SIZE:
            break

    return obss[:i], llis[:i], signal_strengths[:i], epochs[:i], flags[:i]

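Update:

Apologies if my description is a bit vague. I am actually modifying code that had already been developed, and I am not a software developer, so this is a steep learning curve for me too. Let me explain further: the "Epochs" are read in another function: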

def read_epoch_header(self, RINEXfile):
    epoch_hdr = RINEXfile.readline()
    if epoch_hdr == '':
        return None

    year = int(epoch_hdr[1:3])
    if year >= 80:
        year += 1900
    else:
        year += 2000
    month = int(epoch_hdr[4:6])
    day = int(epoch_hdr[7:9])
    hour = int(epoch_hdr[10:12])
    minute = int(epoch_hdr[13:15])
    second = int(epoch_hdr[15:18])
    microsecond = int(epoch_hdr[19:25]) # Discard the least significant digits (use microseconds only).
    epoch = datetime.datetime(year, month, day, hour, minute, second, microsecond)

    flag = int(epoch_hdr[28])
    if flag != 0:
        raise ValueError("Don't know how to handle epoch flag %d in epoch header:\n%s" % (flag, epoch_hdr))

    n_sats = int(epoch_hdr[29:32])
    sats = []
    for i in range(0, n_sats):
        if ((i % 12) == 0) and (i > 0):
            epoch_hdr = RINEXfile.readline()
        sats.append(epoch_hdr[(32+(i%12)*3):(35+(i%12)*3)])

    return epoch, flag, sats

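In the read_data function above, these are added to the DataFrame. Basically I want to split this DataFrame along its satellite axis, so that each satellite file has the epochs in its first column, followed by the 7 signals. The last piece of code in read_data (below) does this: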

for sv in range(32):
    sat = obs_data_chunks_dataframe[sv, :]
    print("sat_columns: {0}".format(sat.columns[0]))  # header of the first column: L1
    sat.to_csv(('SV_{0}').format(sv+1), index_label="Epochs", sep='\t')

The problem here is (1) I want to have the first column as timestamps (so, strip the date, convert so midnight = 00000s and 23:59:59 = 86399s) not as they are now, and (2) ensure the columns are aligned, so I can eventually manipulate these further using a different class to perform other calculations i.e. L1 minus L2 plotted against time, etc.
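For point (2), tab-separated output only looks aligned when every value in a column has a similar width. A fixed-width rendering is one possible alternative. The sketch below uses `DataFrame.to_string` on a hypothetical frame whose index already holds seconds of day. It illustrates the idea under those assumptions and is not taken from the original code.

```python
import pandas as pd

# Hypothetical per-satellite frame whose index already holds seconds of day.
sat = pd.DataFrame(
    {
        "L1": [669486.833, 1019689.006],
        "P2": [24568762.572, 24635402.41],
        "S1": [43.0, 42.0],
    },
    index=pd.Index([0, 90], name="Epochs"),
)

# reset_index() turns the index into an ordinary first column, so its name
# appears on the same header row as the signal names.
aligned = sat.reset_index().to_string(index=False)
print(aligned)
```

The resulting text could be written with an ordinary `open(...).write(aligned)` and read back later with a whitespace separator, for example `pd.read_csv(path, sep=r"\s+")`.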

【Comments on the question】:

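Sorry, is 'Epochs' already datetime64 in the df?
So is your index already datetime, or is it datetime strings?
Is Epochs the index of your sat DataFrame?
@NaderHisham It's dtype='datetime64[us]' (see the update above)
Sorry, you said you generate the epochs, but did you add them as the index or as a new column?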

Tags: python csv pandas gps timestamp

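【Solution 1】:

In the end I solved part of the problem myself: in the read_epoch_header function I simply computed a variable that converts hh:mm:ss to seconds and used it as the epoch. It does not look very elegant, but it works. I just need to format the header so that it lines up with the columns (and so that the columns line up with each other). Cheers, pymat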
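The answer does not show the code, so the following is only a guess at what such a conversion might look like. The helper name `seconds_since_midnight` is hypothetical. It takes the hour, minute and second values that read_epoch_header already parses.

```python
def seconds_since_midnight(hour, minute, second):
    """Return whole seconds elapsed since 00:00:00 for a time of day."""
    return hour * 3600 + minute * 60 + second


# Zero-pad to five digits to match the layout requested in the question.
for hms in [(0, 0, 0), (0, 0, 30), (23, 59, 59)]:
    print("%05d" % seconds_since_midnight(*hms))
# prints 00000, 00030 and 86399 on separate lines
```

Note that this discards the date, so a file that runs past midnight would wrap back to 0.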

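【Comments】:

【Solution 2】:

It is much faster to do this while the data is still a df. If the dtype is datetime64, just convert to int64 and integer-divide by 10**9 to turn nanoseconds into seconds: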

In [241]:
df['Epochs'].astype(np.int64) // 10**9

Out[241]:
0    1398556800
1    1398556830
2    1398556860
3    1398556890
4    1398556920
Name: Epochs, dtype: int64

If the values are strings, convert them with to_datetime and then do the same as above:

df['Epochs'] = pd.to_datetime(df['Epochs']).astype(np.int64) // 10**9
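The values above are seconds since the Unix epoch, whereas the question asks for seconds since midnight. A possible follow-up step is sketched below with a hypothetical three-row frame. It derives the Unix seconds by subtracting the epoch rather than casting to int64, so the result does not depend on the column being stored in nanoseconds, and then takes the remainder modulo 86400, the number of seconds in a day.

```python
import pandas as pd

# Hypothetical frame with the epochs from the question as a column.
df = pd.DataFrame({"Epochs": pd.to_datetime([
    "2014-04-27 00:00:00",
    "2014-04-27 00:00:30",
    "2014-04-27 00:01:00",
])})

# Whole seconds since 1970-01-01, independent of the stored resolution.
unix_seconds = (df["Epochs"] - pd.Timestamp("1970-01-01")) // pd.Timedelta(seconds=1)

# Timezone-naive timestamps have exactly 86400 seconds per day, so the
# remainder is the number of seconds since midnight.
seconds_of_day = unix_seconds % 86400
print(seconds_of_day.tolist())  # [0, 30, 60]
```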

See related.

【Comments】:

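The index label (sat.to_csv(('SV_{0}').format(sv+1), index_label="Epochs", sep='\t')) comes from the DataFrame-to-CSV conversion. I somehow need to tell the DataFrame to use the first column. It is not indexed.
I don't really understand what you mean. Could you edit your question with raw data and the code that creates your df, plus some further explanation?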