如何让这个 Python 类定义代码不那么难看答案

【问题标题】：How to make this Python class definition code less ugly如何让这个 Python 类定义代码不那么难看
【发布时间】：2011-03-12 16:03:57
【问题描述】：

编写类定义最惯用的方法是什么？我下面的代码不可能是最好的方法。

class Course:

    crn =  course =  title =  tipe =  cr_hours =  seats =  instructor =  days =  begin =  end = location = exam = ""

    def __init__(self, pyQueryRow):
        self.crn = Course.get_column(pyQueryRow, 0)
        self.course = Course.get_column(pyQueryRow, 1)
        self.title = Course.get_column(pyQueryRow, 2)
        self.tipe = Course.get_column(pyQueryRow, 3)
        self.cr_hours = Course.get_column(pyQueryRow, 4)
        self.seats = Course.get_column(pyQueryRow, 5)
        self.instructor = Course.get_column(pyQueryRow, 6)
        self.days = Course.get_column(pyQueryRow, 7)
        self.begin = Course.get_column(pyQueryRow, 8)
        self.end = Course.get_column(pyQueryRow, 9)
        self.location = Course.get_column(pyQueryRow, 10)
        self.exam = Course.get_column(pyQueryRow, 11)

    def get_column(row, index):
        return row.find('td').eq(index).text()

[首先，python 是一门很棒的语言。这是我第一个使用 python 的项目，我已经取得了可笑的进步。]

【问题讨论】：

标签： python idioms pyquery

【解决方案1】：

def__init__(self, pyQueryRow):
    for i,attr in enumerate("crn course title tipe cr_hours seats instructor"
                            " days begin end location exam".split()):
        setattr(self, attr, self.get_column(pyQueryRow, i))

这样可以避免多次调用self.get_column

def__init__(self, pyQueryRow):
    attrs = ("crn course title tipe cr_hours seats instructor"
             " days begin end location exam".split())
    values = [td.text for td in pyQueryRow.find('td')]
    for attr, value in zip(attrs, values):
        setattr(self, attr, value)

【讨论】：

这不是很危险吗？如果您在该字符串中拼错了某个成员怎么办？
@Assaf Lavie，就像您在代码中输入错误的属性名称一样。无论哪种方式，在您尝试访问不存在的属性之前，Python 都不会抱怨。通常你应该有单元测试来捕捉这些类型的错误

【解决方案2】：

就个人而言，我会使用字典将属性映射到列号：

class Course:

    crn =  course =  title =  tipe =  cr_hours =  seats =  instructor =  days =  begin =  end = location = exam = ""

    def __init__(self, pyQueryRow):
        course_row_mapping = {
            'crn' : 0,
            'course' : 1,
            'title' : 2,
            'tipe' : 3, # You probably mean "type"?
            'cr_hours' : 4,
            'seats' : 5,
            'instructor' : 6,
            'days' : 7,
            'begin' : 8,
            'end' : 9,
            'location' : 10,
            'exam' : 11,
        }   

        for name, col in course_row_mapping.iteritems():
            setattr(self, name, Course.get_column(pyQueryRow, col))

    def get_column(row, index):
        return row.find('td').eq(index).text()

【讨论】：

【解决方案3】：

我不确定是否有“更好”的方法。你所拥有的当然是非常可读的。如果您想避免重复 Course.get_column 代码，您可以为此定义一个 lambda，例如 Matthew Flaschen 的回答。

class Course:
    def __init__(self, pyQueryRow):
        get_column = lambda index: pyQueryRow.find('td').eq(index).text()

        self.crn = get_column(0)
        self.course = get_column(1)
        self.title = get_column(2)
        self.tipe = get_column(3)
        self.cr_hours = get_column(4)
        self.seats = get_column(5)
        self.instructor = get_column(6)
        self.days = get_column(7)
        self.begin = get_column(8)
        self.end = get_column(9)
        self.location = get_column(10)
        self.exam = get_column(11)

请注意，您不需要预先将所有字段初始化为 "" 的行 - 只需将它们设置为 __init__ 即可。 编辑：事实上，正如 Matthew 所说，设置类字段，而不是实例字段 - 我完全错过了。

【讨论】：

【解决方案4】：

编辑：实际上，最好的可能是：

self.crn, self.course, self.title, self.tipe, self.cr_hours, self.seats,\ 
self.instructor, self.days, self.begin, self.end, self.location, self.exam = \ 
[pq(td).text() for td in pyQueryRow.find('td')]

假设您已将 PyQuery 导入为 pq。这完全避免了使用索引。

self.crn, self.course, self.title, self.tipe, self.cr_hours, self.seats,\ 
self.instructor, self.days, self.begin, self.end, self.location, self.exam = \
map(lambda index: get_column(pyQueryRow, index), xrange(0, 12))

或者如果你想要一个列表理解：

self.crn, self.course, self.title, self.tipe, self.cr_hours, self.seats,\ 
self.instructor, self.days, self.begin, self.end, self.location, self.exam = \
[get_column(pyQueryRow, index) for index in xrange(0, 12)]

我不知道这些是不是最惯用的，但绝对没有样板。

另外，删除crn = course =。您正在分配给类，而不是实例。

【讨论】：

我喜欢 lambda 的想法，但我不认为这实际上更具可读性，因为很难看出哪个索引映射到哪个字段。想象一下，如果您必须在中间某处添加一个 - 很容易出错。
@Evgeny，我明白你的意思了。但是由于它是基于抓取一个HTML页面，如果在中间添加一个，其他的就会向下移动。你只需要把它放在正确的两个之间并增加最大值。