How do I modify my Python code below to append a character to the beginning of the string in Pandas? (Answers)

【Question title】: How do I modify my Python code below to append a character to the beginning of the string in Pandas?
【Posted】: 2018-09-05 14:17:42
【Question description】:

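I am working on a data visualization assignment in which I need to take a dataset and produce certain visualizations. Consider the following about the dataset:

The columns are denoted by longitudes (a list of strings with an 'E' or 'W' attached, indicating east or west longitude respectively)
The rows are denoted by latitudes (a column of strings with an 'N' or 'S' attached, indicating north or south latitude respectively)

So I have to read the dataset and convert latitudes with an 'N' attached into positive float values and those with an 'S' attached into negative float values (the whole dataset is stored as strings).

Similarly, I have to convert longitudes with an 'E' attached into positive float values and those with a 'W' attached into negative float values.

Since I am new to Python, Pandas and NumPy, I am having a lot of difficulty achieving this. So far I have been able to convert the latitudes and longitudes from string format to float format and strip the 'N', 'S', 'E', 'W' characters. However, I cannot figure out how to make the float values positive or negative based on the character ('N', 'S', 'E', 'W') before the float conversion.
Here is the code I have written so far: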

import pandas as pd

df = pd.read_csv("Aug-2016-potential-temperature-180x188.txt", skiprows = range(7))
df.columns = ["longitude"]
df = df.longitude.str.split("\t", expand = True)
smaller = df.iloc[::10,:]

print(df.head(10), end = "\n")
print(smaller, end = "\n")
print(df.iloc[1][3], end = "\n")
print(smaller.iloc[2][175], end = "\n")

import numpy as np
import pandas as pd

data = pd.read_csv('~/documents/datasets/viz_a1/Aug-2016-potential-temperature-180x188.txt', skiprows=7)
data.columns = ['longitudes']
data = data['longitudes'].str.split('\t', expand=True)
df = data.iloc[::10,:]
df.head()

# replace 'E' with '' and 'W' with ''
df.loc[0] = df.loc[0].str.replace('E', '').str.replace('W', '')

# convert the longitude values to float values (THIS ONE WORKS)
df.loc[0] = df.loc[0][1:].astype(float)

# replace 'S' with '' and 'N' with ''
df.loc[:][0] = df.loc[:][0].str.replace('S', '').str.replace('N', '')

# convert latitude values into float values (THIS ONE DOES NOT WORK!!)
df.loc[:][0] = df.loc[:][0].astype(float)

# checking if the float values exist
print(df.loc[0][2], ' data-type ', type(df.loc[0][2])) # columns converted into float
print(df.loc[30][0], ' data-type ', type(df.loc[30][0])) # rows not converted into float

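Questions:

How do I convert the values to positive or negative floats based on the letter ('S' and 'W' for negative floats, 'E' and 'N' for positive floats)?
How do I successfully convert the latitudes to float values (the code I wrote did not convert the rows to floats, and it did not raise any error either)?

P.S. The conversion of the longitudes produced a lot of warnings. It would be great if someone could explain why I am getting these warnings and how to prevent them. (Once again, I am new to Python and Pandas!)

The dataset can be found here

Here is a screenshot of the dataset: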

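【Question comments】:

You can use startswith to check for a specific character and replace it with a positive or negative sign. pandas.pydata.org/pandas-docs/stable/generated/…
Can you provide some sample data? Just the first few rows would be enough.
@asongtoruin I have edited the question and added a picture of a Jupyter notebook showing the output of df.head()
You should not post code as a screenshot. If you post it as text, people trying to answer your question can copy it directly and work with it.

Tags: python python-3.x pandas dataframe data-processing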

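【Solution 1】:

I would pass a few more arguments to read_csv so that the resulting DataFrame has the longitude strings as its columns and the latitude strings as its index. The values in the DataFrame are then the raster data.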

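import pandas as pd

# Tab-separated file: the first column becomes the index (latitudes),
# the header row becomes the columns (longitudes)
df = pd.read_csv(r'Aug-2016-potential-temperature-180x188.txt',
                 skiprows=8, delimiter='\t', index_col=0)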

Then I would convert the longitude strings (the DataFrame's columns) to floats with the following code:

column_series = pd.Series(df.columns)
df.columns = column_series.apply(lambda x: float(x.replace('E','')) if x.endswith('E') else -float(x.replace('W','')))

After that, convert the latitude strings (the DataFrame's index) to floats with the following code:

index_series  = pd.Series(df.index)
df.index = index_series.apply(lambda x: float(x.replace('N','')) if x.endswith('N') else -float(x.replace('S','')))
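
A minimal, self-contained sketch of the approach above, assuming a tiny made-up sample in place of the real file. The sample text, the "lat" header cell and the to_signed helper are hypothetical and not part of the answer. Unlike the lambdas, the helper raises an error for a label that carries neither expected letter instead of silently treating it as negative.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample standing in for the real dataset
sample = (
    "lat\t160.5W\t159.5W\t0.5E\n"
    "88.5N\t1.0\t2.0\t3.0\n"
    "0.5S\t4.0\t5.0\t6.0\n"
)

df = pd.read_csv(io.StringIO(sample), delimiter="\t", index_col=0)


def to_signed(label, positive, negative):
    """Turn e.g. '160.5W' into -160.5 and '0.5E' into 0.5."""
    label = label.strip()
    if label.endswith(positive):
        return float(label[:-1])
    if label.endswith(negative):
        return -float(label[:-1])
    raise ValueError(f"unexpected coordinate label: {label!r}")


# Longitudes live in the column labels, latitudes in the index
df.columns = [to_signed(c, "E", "W") for c in df.columns]
df.index = [to_signed(i, "N", "S") for i in df.index]

print(df)
```

After this, both axes hold floats, so a cell can be looked up by coordinate, for example df.loc[-0.5, 0.5].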

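【Discussion】:

Thank you very much for your answer. This is the most useful solution. Just one thing: could you briefly explain how the x in the lambda function operates on the individual row and column values? I understand the logic of converting to a float based on the character, but how exactly is x getting the values to read? We are not passing any list or variable to it. That is really the only part that confuses me; I worked out the rest of the code using the online documentation.
With the .apply method you apply a function to every value of a pandas Series, and x is the function's parameter. In this case x is a single value from the Series. Applying a lambda function to your Series applies that function to each value in it. Looking at the examples here may help: pandas.pydata.org/pandas-docs/version/0.23/generated/…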
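
To make the role of x concrete, here is a small sketch with made-up labels (the labels Series and the to_longitude name are hypothetical). It suggests that Series.apply behaves like a loop that calls the function once per element and collects the results.

```python
import pandas as pd

labels = pd.Series(["10.5E", "20.5W", "30.0E"])


def to_longitude(x):
    # x is a single element of the Series, e.g. the string "20.5W"
    return float(x[:-1]) if x.endswith("E") else -float(x[:-1])


# apply calls to_longitude once for every value in the Series
via_apply = labels.apply(to_longitude)

# The same result written as an explicit loop over the values
via_loop = pd.Series([to_longitude(x) for x in labels], index=labels.index)

print(via_apply.equals(via_loop))
```

A lambda is just an unnamed version of to_longitude, so the x in the answer's lambdas receives each label in turn in the same way.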

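【Solution 2】:

This may not be the cleanest approach, but you can replace 'N' and 'E' with "", then use np.where to find the values containing 'S' or 'W', strip those letters, convert to float, and multiply by -1.

I made an example df and applied this process to its first column: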

import numpy as np
import pandas as pd

example = pd.DataFrame({'1': ['S35', 'E24', 'N45', 'W66'],
                        '2': ['E45', 'N78', 'S12', 'W22']})

example
Out[153]: 
     1    2
0  S35  E45
1  E24  N78
2  N45  S12
3  W66  W22

col = example.loc[:, '1']

# regex=True is needed for 'N|E' to be treated as a pattern rather than literal text
col = col.str.replace('N|E', "", regex=True)

col
Out[156]: 
0    S35
1     24
2     45
3    W66
Name: 1, dtype: object

# Negate the values that carry 'W' or 'S'; regex=True makes 'W|S' a pattern
example.loc[:,'1'] = np.where(col.str.contains('W|S'), col.str.replace('W|S', '', regex=True).astype('float') * -1, col)


example
Out[158]: 
    1    2
0 -35  E45
1  24  N78
2  45  S12
3 -66  W22
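
As a possible alternative on the same example data, the letter and the number can be split with a regular expression and the letter mapped to a sign. This is a different technique from the np.where approach above, sketched under the assumption that every value is a single letter followed by a number; the signed helper is hypothetical.

```python
import pandas as pd

example = pd.DataFrame({'1': ['S35', 'E24', 'N45', 'W66'],
                        '2': ['E45', 'N78', 'S12', 'W22']})


def signed(col):
    # Group 0 captures the direction letter, group 1 the numeric part
    parts = col.str.extract(r'^([NSEW])(\d+(?:\.\d+)?)$')
    # 'N' and 'E' stay positive, 'S' and 'W' become negative
    sign = parts[0].map({'N': 1.0, 'E': 1.0, 'S': -1.0, 'W': -1.0})
    return sign * parts[1].astype(float)


# Apply the conversion to every column of the frame
result = example.apply(signed)
print(result)
```

Every value comes back as a float here, including the positive ones, and a value that does not match the pattern becomes NaN rather than raising.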

【Discussion】: