【Question title】: How to serialize Keras models to use with Joblib?
【Posted】: 2017-12-07 20:09:02
【Question description】:

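I am trying to combine Keras and Joblib to train several simple models and store them in a list, so that I can project probe samples after the validation stage.

I have an implementation of the Bootstrap Aggregating (Bagging) method that uses Joblib to fit several simple binary neural network models. However, I get the following error when I try to predict: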

Traceback (most recent call last):
  File "../HFCN_openset_load.py", line 264, in <module>
    main()
  File "../HFCN_openset_load.py", line 107, in main
    pr, roc = fcnhface(args, parallel_pool)
  File "../HFCN_openset_load.py", line 194, in fcnhface
    pred = models[k][0].predict(feature_vector.reshape(1, feature_vector.shape[0]))
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 1004, in predict
    if not self.built:
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 339, in built
    return self._built
AttributeError: 'Sequential' object has no attribute '_built'

Below is the part of the code where I think the error may originate:

def getModel(input_shape,nclasses=2):
    make_keras_picklable()
    model = Sequential()
    model.add(Dense(64, activation='relu', input_shape=input_shape))
    model.add(Dropout(0.2))
    model.add(Dense(nclasses, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy'])#RMSprop()
    return model

def learn_fc_model(X, Y, split):
    boolean_label = [(split[key]+1)/2 for key in Y]
    y_train = np_utils.to_categorical(boolean_label, 2)
    model = getModel(input_shape=X[0].shape)
    model.fit(X, y_train, batch_size=40, nb_epoch=100, verbose=0)
    return (model, split)

#Training using Joblib, models is a list of tuples (ANN models, any variable)
with Parallel(n_jobs=4, verbose=15, backend='multiprocessing') as parallel_pool:
    models = parallel_pool(
        delayed(learn_fc_model) (numpy_x, numpy_y, split) for split in numpy_s
    )

#Testing
for k in range (0, len(models)):
    pred = models[k][0].predict(feature_vector.reshape(1, feature_vector.shape[0]))

The link to the full file is right here

【Question comments】:

    Tags: python serialization keras pickle joblib


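    【Solution 1】:

    Below is a simple way to fit multiple Keras models in parallel with Joblib.

    Define the basic parameters:

    • n_jobs: the number of parallel jobs

    • n_estimators: the number of models to fit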

      n_jobs, n_estimators = 4, 20
      

    Generate dummy data:

    import numpy as np

    n_class = 2
    X = np.random.uniform(0, 1, (100, 10))  # 100 samples, 10 features
    y = np.random.randint(0, n_class, 100)  # integer class labels
    

    Utility function that defines the empty (unfitted) model structure:

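    from keras.models import Sequential
    from keras.layers import Dense

    def get_model(input_shape):
        # Single softmax layer; n_class is the global defined above
        m = Sequential([Dense(n_class, input_shape=input_shape,
                              activation='softmax')])
        m.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
        return m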
    

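    Utility function for fitting multiple models. It must return a list of fitted weights rather than the models themselves, because get_weights() yields plain NumPy arrays that can be pickled and sent back from the worker processes: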

    def fit_models(n_estimators, x, y):
        
        weights = []
        for _ in range(n_estimators):
            m = get_model(input_shape=(10,))
            m.fit(x, y)
            weights.append(m.get_weights())
        
        return weights
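
Why returning weights helps can be illustrated without Keras. The sketch below is not the Keras API: `TinyModel` is a hypothetical stand-in, and its `threading.Lock` attribute is only an assumption standing in for whatever backend state makes a real model awkward to pickle. It shows that the object itself fails to pickle, while the list of arrays from `get_weights()` survives a pickle round trip (which is how results cross a process boundary) and reproduces the same predictions in a fresh model.

```python
import pickle
import threading

import numpy as np


class TinyModel:
    """Hypothetical stand-in for a Keras model (NOT the Keras API itself)."""

    def __init__(self, n_features=10, n_class=2):
        # Assumption: the lock mimics an unpicklable backend handle.
        self._lock = threading.Lock()
        self.kernel = np.zeros((n_features, n_class))
        self.bias = np.zeros(n_class)

    def get_weights(self):
        return [self.kernel.copy(), self.bias.copy()]

    def set_weights(self, weights):
        self.kernel, self.bias = [np.array(w) for w in weights]

    def predict(self, x):
        # Softmax over a single dense layer.
        logits = x @ self.kernel + self.bias
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)


def worker():
    """Plays the role of fit_models: returns plain arrays, not a model."""
    model = TinyModel()
    rng = np.random.RandomState(0)
    # Random weights stand in for the result of an actual fit.
    model.set_weights([rng.normal(size=(10, 2)), rng.normal(size=2)])
    return model.get_weights()


# Pickling the model object itself fails because of the lock.
try:
    pickle.dumps(TinyModel())
    model_is_picklable = True
except TypeError:
    model_is_picklable = False

# Pickling the weights works; rebuild the model on the receiving side.
weights = worker()
payload = pickle.dumps(weights)
restored = TinyModel()
restored.set_weights(pickle.loads(payload))

reference = TinyModel()
reference.set_weights(weights)
x = np.random.RandomState(1).uniform(0, 1, (5, 10))
same = np.allclose(reference.predict(x), restored.predict(x))
print(model_is_picklable, same)
```

The same pattern is what the answer applies to real Keras models: workers return `get_weights()`, and the parent process rebuilds each model with `get_model` and `set_weights`.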
    

    Utility function for partitioning the estimators across jobs:

    from joblib import Parallel, delayed, effective_n_jobs
    
    def _partition_estimators(n_estimators, n_jobs):
    
        # Compute the number of jobs
        n_jobs = min(effective_n_jobs(n_jobs), n_estimators)
    
        # Partition estimators between jobs
        n_estimators_per_job = np.full(n_jobs, n_estimators // n_jobs,
                                       dtype=int)
        n_estimators_per_job[:n_estimators % n_jobs] += 1
    
        return n_jobs, n_estimators_per_job.tolist()
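
To see what this partitioning produces, here is a dependency-free sketch of the same arithmetic. `partition_counts` is a hypothetical name, and unlike `_partition_estimators` it takes an explicit positive worker count instead of resolving it through `effective_n_jobs` (so values such as -1 are not handled).

```python
def partition_counts(n_estimators, n_workers):
    """Split n_estimators as evenly as possible across at most n_workers jobs."""
    # Never start more jobs than there are models to fit.
    n_workers = min(n_workers, n_estimators)
    base, extra = divmod(n_estimators, n_workers)
    # The first `extra` jobs each take one additional model.
    counts = [base + 1 if i < extra else base for i in range(n_workers)]
    return n_workers, counts


print(partition_counts(20, 4))  # (4, [5, 5, 5, 5])
print(partition_counts(10, 4))  # (4, [3, 3, 2, 2])
print(partition_counts(3, 8))   # (3, [1, 1, 1])
```

With the answer's settings (20 models, 4 jobs) every job fits 5 models; when the division is uneven, the leftover models go to the first jobs.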
    

    Run the jobs in parallel:

    import itertools

    # After this call n_estimators is a list: the number of models each job fits
    n_jobs, n_estimators = _partition_estimators(n_estimators, n_jobs)
    
    res = Parallel(n_jobs=n_jobs, verbose=1)(
        delayed(fit_models)(
            n_estimators = n_estimators[i],
            x = X,
            y = y
        ) 
        for i in range(n_jobs))
    
    all_weights = list(itertools.chain.from_iterable(res)) # get all fitted weights in a list
    all_models = [get_model((10,)) for _ in all_weights] # get empty models in a list
    # put fitted weights into empty model structures
    for w,m in zip(all_weights, all_models):
        m.set_weights(w)
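
Once the models are rebuilt, they can be used together for prediction. The answer does not say how to aggregate them, so the sketch below assumes soft voting (averaging the predicted class probabilities), which is one common choice for bagging. `bagging_predict` and `ConstantModel` are hypothetical names; `ConstantModel` only stands in for objects with a `predict` method, such as the entries of `all_models`.

```python
import numpy as np


def bagging_predict(models, x):
    """Average the class probabilities of all models (soft voting)."""
    probs = np.mean([m.predict(x) for m in models], axis=0)
    return probs, probs.argmax(axis=1)


class ConstantModel:
    """Hypothetical stand-in that always predicts the same probabilities."""

    def __init__(self, probs):
        self.probs = np.asarray(probs, dtype=float)

    def predict(self, x):
        # One row of probabilities per input sample.
        return np.tile(self.probs, (len(x), 1))


models = [ConstantModel([0.9, 0.1]),
          ConstantModel([0.6, 0.4]),
          ConstantModel([0.3, 0.7])]
probs, labels = bagging_predict(models, np.zeros((3, 10)))
print(probs[0], labels)
```

With real models the call would be `bagging_predict(all_models, X)`, assuming each model's `predict` returns an array of shape (n_samples, n_class).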
    

    A running notebook with the full example is available here

    【Discussion】:
