【Question title】: Adjust Pandas Series Using np.where (Avoid ValueError: The truth value of a Series is ambiguous.)
【Posted】: 2018-03-08 17:39:40
【Question】:

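I have the following pandas Series, which I am trying to adjust in an if/else fashion depending on whether each value equals 'Unspecified'. I have read as much material as I could find about this common pandas error, but it hasn't helped. Can someone help me upper-case the trailing state abbreviation in each value, unless the value equals 'Unspecified'? Here is my best guess: test_series.where('Unspecified', test_series.str[:-2] + test_series.str[-2:].str.upper()) Thanks!!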

import pandas as pd

test_series = pd.Series(['Asheville, nc', 'Cowpens, nc', 'Hendersonville, nc', 'Tryon, nc',
       'Fletcher, nc', 'Franklin, nc', 'Unspecified', 'Burnsville, nc',
       'Flat rock, nc', 'Fairview, nc', 'Greeneville, tn',
       'Weaverville, nc', 'Mills river, nc', 'Lake junaluska, nc',
       'Bristol, tn', 'Calhoun, ga', 'Canton, nc', 'Whittier, nc',
       'Bostic, nc', 'Horse shoe, nc', 'Reynolds, nc', 'Marion, nc',
       'Waynesville, nc', 'Candler, nc', 'Brevard, nc', 'Highlands, nc',
       'Knoxville, tn', 'Newport, tn', 'Greenville, sc',
       'Rutherfordton, nc', 'Hickory, nc', 'Asheboro, nc', 'Swannanoa, nc',
       'Spartanburg, sc', 'Jonesville, nc', 'Gaffney, sc', 'Randleman, nc',
       'Clyde, nc', 'Tryon,nc', 'Maryville, tn', 'Rutledge, tn',
       'Morganton, nc', 'Lake lure, nc', 'Sylva, nc', 'Mars hill, nc',
       'Dawsonville, ga', 'Arden, nc', 'Chadbourn, nc', 'Taylors, sc',
       'Oakley, nc', 'Charlotte, nc', 'Black mountain, nc',
       'Leicester, nc', 'East flat rock, nc', 'Morristown, tn',
       'Talbott, tn', 'Harmony, nc', 'Sevierville, tn', 'Newland, nc',
       'Kodak, tn', 'Marshall, nc', 'Edneyville, nc', 'Morristown, nc',
       'Forest city, nc', 'Greensboro, nc', 'Spruce pine, nc',
       'Shelby, nc', 'Barnardsville, nc', 'Tazewell, tn', 'Alexander, nc',
       'Bakersville, nc', 'Mountain home, nc', 'Clarkesville, ga',
       'Chesnee, sc', 'Pineville, nc', 'Elizabethton, tn', 'Oteen, nc',
       'Liberty, sc', 'Simpsonville, sc', 'Boone, nc', 'Clayton, ga',
       'Old fort, nc', 'Bat cave, nc', 'Johnson city, tn',
       'Bryson city, nc', 'Fayetteville, nc', 'Charleston, sc',
       'Grayson, ga', 'Murphy, nc', 'Inman, sc', 'Douglas, ga',
       'Columbus, nc', 'Glenville, nc', 'Easley, sc', 'Durham, nc',
       'Mill spring, nc', 'Clinton, tn', 'Piedmont, sc', 'Hot springs, nc',
       'Waxhaw, nc', 'La follette, tn', 'Cashiers, nc', 'Etowah, nc',
       'Nebo, nc', 'Yadkinville, nc', 'Toccoa, ga', 'Monroe, nc',
       'Boiling springs, sc', 'Cornelia, nc', 'Sparta, nc', 'Cherokee, nc',
       'Harriman, tn', 'Limestone, tn', 'Kingsport, tn', 'Laurel hill, nc',
       'Andrews, nc', 'Boiling spring, sc', 'Moncks corner, sc',
       'Cullowhee, nc', 'Clover, sc', 'Waynesvile, nc',
       'Maggie valley, nc', 'Hiawasssee, ga', 'Pigeon forge, tn',
       'Unicoi, tn', 'Gray, tn', 'Rosman, nc', 'Saluda, nc', 'Benson, nc',
       'Anderson, sc', 'Penrose, nc', 'Lake toxaway, nc',
       'Cedar mountain, nc', 'Chattanooga, tn', 'Turtletown, tn',
       'Almond, nc', 'Greenwood, sc', 'Lansing, nc', 'Wartburg, tn',
       'Cherryville, nc', 'Hildebran, nc', 'Raleigh, nc',
       'Pisgah forest, nc', 'Mooresboro, nc', 'Zebulon, nc',
       'Hiawassee, ga', 'Albemarle, nc', 'Burlington, nc', 'Salisbury, nc',
       'Livingston, tn', 'Twin brooks, nc', 'Ellenboro, nc', 'Lenoir, nc',
       'Milledgeville, ga', 'Overton, tn', 'Greer, sc', 'Thomasville, nc',
       'Jonesborough, tn', 'Blairsville, ga', 'Winston-salem, nc',
       'Atlanta, ga', 'Polk, nc', 'Dandridge, tn', 'Mooresville, nc'])

【Question comments】:

  • test_series.where(test_series == 'Unspecified',...
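
The comment's point is that `Series.where` needs a boolean condition as its first argument, not the bare string 'Unspecified'. Below is a minimal sketch of what the comparison produces and of where the "truth value of a Series is ambiguous" error from the title typically comes from. The three-element series `s` is a hypothetical stand-in for `test_series`, not the asker's data.

```python
import pandas as pd

# Hypothetical small sample standing in for test_series
s = pd.Series(['Asheville, nc', 'Unspecified', 'Bristol, tn'])

# Comparing a Series to a scalar yields an element-wise boolean mask
mask = s == 'Unspecified'
print(mask.tolist())  # [False, True, False]

# Using that mask in a plain `if` asks for one True/False answer,
# which pandas refuses to guess for a multi-element Series
try:
    if mask:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

So an element-wise tool such as `Series.where` is the right fit here, with the mask passed as the condition.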

Tags: string python-3.x pandas if-statement series


【Solution 1】:

I think you want:

test_series.where(test_series == 'Unspecified', 
                  test_series.str[:-2] + test_series.str[-2:].str.upper())

Output of head(10):

0         Asheville, NC
1           Cowpens, NC
2    Hendersonville, NC
3             Tryon, NC
4          Fletcher, NC
5          Franklin, NC
6           Unspecified
7        Burnsville, NC
8         Flat rock, NC
9          Fairview, NC
dtype: object
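
A note on why this works: `Series.where(cond, other)` keeps the original value wherever `cond` is True and takes the value from `other` everywhere else, so 'Unspecified' rows are left alone and every other row gets the upper-cased suffix. Since the question title mentions `np.where`, here is a rough sketch of the same idea with `np.where(cond, x, y)`, which picks from `x` where `cond` is True and from `y` otherwise. The small series `s` and the variable names are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical small sample standing in for test_series
s = pd.Series(['Asheville, nc', 'Unspecified', 'Bristol, tn'])

# Candidate replacement: everything but the last two characters,
# plus the last two characters upper-cased
fixed = s.str[:-2] + s.str[-2:].str.upper()
is_unspecified = s == 'Unspecified'

# pandas: keep s where the mask is True, otherwise use `fixed`
via_where = s.where(is_unspecified, fixed)

# NumPy: returns a plain array, so wrap it back into a Series
via_np = pd.Series(np.where(is_unspecified, s, fixed), index=s.index)

print(via_np.tolist())  # ['Asheville, NC', 'Unspecified', 'Bristol, TN']
```

Both variants give the same values; `Series.where` keeps the index and dtype handling inside pandas, while `np.where` hands back a NumPy array that has to be re-wrapped.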

Once again, a list comprehension outperforms the .str accessor:

%timeit pd.Series([i if i == 'Unspecified' else i[:-2] + i[-2:].upper() for i in test_series])
1000 loops, best of 3: 342 µs per loop

%%timeit 
test_series.where(test_series == 'Unspecified', 
                      test_series.str[:-2] + test_series.str[-2:].str.upper())

100 loops, best of 3: 2.84 ms per loop
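
The timings above come from IPython's `%timeit` magic and will vary with the machine and the pandas version. For anyone wanting to rerun the comparison in a plain script, here is a sketch using the standard-library `timeit` module. The sample data, the function names, and the repeat count of 100 are assumptions chosen for illustration, and no expected timings are shown because they are not reproducible across setups.

```python
import timeit

import pandas as pd

# Hypothetical sample: a few values repeated to get a Series of similar size
s = pd.Series(['Asheville, nc', 'Unspecified', 'Bristol, tn', 'Greer, sc'] * 50)


def with_comprehension():
    # Plain Python loop over the values, rebuilt into a Series
    return pd.Series([i if i == 'Unspecified' else i[:-2] + i[-2:].upper() for i in s])


def with_str_accessor():
    # Vectorised-looking version built on the .str accessor
    return s.where(s == 'Unspecified', s.str[:-2] + s.str[-2:].str.upper())


# Both approaches should agree before their speed is compared
same_result = with_comprehension().tolist() == with_str_accessor().tolist()

runs = 100
t_comp = timeit.timeit(with_comprehension, number=runs)
t_str = timeit.timeit(with_str_accessor, number=runs)
print(f"list comprehension: {t_comp / runs * 1e6:.0f} µs per call")
print(f".str accessor:      {t_str / runs * 1e6:.0f} µs per call")
```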

【Comments】:
