【Question title】: Is it possible to debug a 3rd-party python file inside a Jupyter notebook cell?
【Posted】: 2019-08-13 11:36:41
【Question description】:

set_trace() makes it possible to debug our own code inside a Jupyter notebook cell.


# Import the KNeighborsClassifier class from sklearn
from sklearn.neighbors import KNeighborsClassifier
from IPython.core.debugger import set_trace
# Import the metrics module to check the accuracy
from sklearn import metrics

# X_train, X_test, y_train and y_test are assumed to be defined in an earlier notebook cell
# Try k = 1 through 25 and record the testing accuracy
k_range = range(1, 26)
scores = {}
scores_list = []
for k in k_range:
    set_trace()  # drops into ipdb at the start of every iteration
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    y_pred = knn.predict(X_test)
    accuracy = metrics.accuracy_score(y_test, y_pred)
    scores[k] = accuracy
    scores_list.append(accuracy)

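This is part of the "KNN on Iris Dataset" source code.

This link has the entire snippet, which is 100% reproducible online.

The question is:

Is it possible to debug a 3rd-party Python file, such as classification.py, inside a Jupyter notebook cell?

In particular, is it possible to debug knn.predict() inside a Jupyter notebook cell?

It is located at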

/usr/local/lib/python3.6/dist-packages/sklearn/neighbors/classification.py
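A breakpoint in a library file needs a file path and a line number. As a rough sketch, the standard inspect module can report both for any pure-Python function. The helper name source_location is hypothetical, and json.dumps is used only as a stand-in so the snippet runs without sklearn; in the notebook one would presumably pass KNeighborsClassifier.predict instead.

```python
import inspect
import json


def source_location(func):
    """Return (path, first_line) of the file that defines a pure-Python function."""
    path = inspect.getsourcefile(func)
    # getsourcelines returns (list_of_source_lines, line_number_of_the_def)
    _, first_line = inspect.getsourcelines(func)
    return path, first_line


# json.dumps is only a stand-in for a library function such as knn.predict
path, first_line = source_location(json.dumps)
print(f"{path}:{first_line}")
```

The printed path:line pair is the form that pdb's break command accepts, although the exact path depends on the local installation.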

This piece of code

y_pred=knn.predict(["trap", X_test])

%debug

produces this error

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-28-054b4ff1b356> in <module>()
----> 1 y_pred=knn.predict(["trap", X_test])
      2 
      3 get_ipython().magic('debug')

...

Running only this line

y_pred=knn.predict(["trap", X_test])

produces this error (the long array output has been removed)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-28-054b4ff1b356> in <module>()
----> 1 y_pred=knn.predict(["trap", X_test])
      2 
      3 get_ipython().magic('debug')

1 frames
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    519                     "Reshape your data either using array.reshape(-1, 1) if "
    520                     "your data has a single feature or array.reshape(1, -1) "
--> 521                     "if it contains a single sample.".format(array))
    522 
    523         # in the future np.flexible dtypes will be handled like object dtypes

After the error occurred, I ran %debug in a new cell, which produced this output

> /usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py(521)check_array()
    519                     "Reshape your data either using array.reshape(-1, 1) if "
    520                     "your data has a single feature or array.reshape(1, -1) "
--> 521                     "if it contains a single sample.".format(array))
    522 
    523         # in the future np.flexible dtypes will be handled like object dtypes

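followed by an ipdb input prompt.

I typed up, and pdb switched to classification.py.

I set breakpoints there,

then typed up again, which switched back to the notebook cell,

and the breakpoints did not take effect.

Here is the full session log: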

> /usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py(521)check_array()
    519                     "Reshape your data either using array.reshape(-1, 1) if "
    520                     "your data has a single feature or array.reshape(1, -1) "
--> 521                     "if it contains a single sample.".format(array))
    522 
    523         # in the future np.flexible dtypes will be handled like object dtypes

ipdb> up
> /usr/local/lib/python3.6/dist-packages/sklearn/neighbors/classification.py(147)predict()
    145             Class labels for each data sample.
    146         """
--> 147         X = check_array(X, accept_sparse='csr')
    148 
1   149         neigh_dist, neigh_ind = self.kneighbors(X)

ipdb> b
Num Type         Disp Enb   Where
1   breakpoint   keep yes   at /usr/local/lib/python3.6/dist-packages/sklearn/neighbors/classification.py:149
2   breakpoint   keep yes   at /usr/local/lib/python3.6/dist-packages/sklearn/neighbors/classification.py:150
ipdb> up
> <ipython-input-22-be2dbe619b73>(2)<module>()
      1 X = ["trap", X_test]
----> 2 y_pred=knn.predict(X)

ipdb> X = X_test
ipdb> s

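【Question discussion】:

Tags: python jupyter-notebook

【Solution 1】:

Strictly speaking, this is not possible. But there is a trick. You can deliberately pass a wrong argument to the predict function so that it fails, and then call %debug to step through the code line by line. See the example below.

y_pred=knn.predict(["trap", X_test])

This tries to execute the predict method and fails, because the input is an arbitrary list rather than an array. From there you can call the %debug magic command to walk through the execution.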
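To make the trick more concrete, here is a minimal sketch of what the post-mortem debugger has to work with after such a failure. The functions check_array and predict below are hypothetical stand-ins written for this example, not sklearn's own code, and frames_of_last_error is an illustrative helper. The failing call leaves a traceback whose entries are the frames that up and down move between.

```python
import traceback


def check_array(array):
    # Hypothetical stand-in for a library's input validation step
    if not hasattr(array, "shape"):
        raise ValueError("Expected an array, got a plain list instead")
    return array


def predict(X):
    # Hypothetical stand-in for a library method that validates its input first
    X = check_array(X)
    return [0 for _ in X]


def frames_of_last_error(call, *args):
    """Run call(*args) and return the (function, line) pairs of the traceback."""
    try:
        call(*args)
    except Exception as exc:
        # extract_tb lists frames from the outermost caller to the failing line
        return [(f.name, f.lineno) for f in traceback.extract_tb(exc.__traceback__)]
    return []


frames = frames_of_last_error(predict, ["trap", 1])
names = [name for name, _ in frames]
print(names)
```

A post-mortem session starts at the last entry (the validation function here), and each up moves one entry toward the start of the list.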

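【Comments】:

I have updated the OP. The trick produced some errors; maybe I am using it incorrectly.
Sorry, I may not have explained it properly. That is the intent! You deliberately produce an error there so that you can catch it with the debugger by running %debug manually afterwards (in another cell, as a separate call).
I added a new section, "Running only this line". Given this specific error, what should I do next? Run %debug? I then end up at /usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py(521), which is not what I am trying to debug.
Of course you are at that line, because that is where sklearn validates the input and raises an error when the input is invalid. Now you have to navigate the traceback to reach the point you want. You can use the up and down commands to move up and down the traceback. Once you are at the higher frame you want to debug, you can set a breakpoint with the break command and run the code again with corrected input. More about pdb: docs.python.org/3/library/pdb.html
I can set a break even in a 3rd-party Python file, such as /usr/local/lib/python3.6/dist-packages/sklearn/neighbors/classification.py, right?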
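One detail behind the thread above: a post-mortem session inspects frames of a call that has already failed, so a breakpoint set there can only fire if the call is then run again under the debugger. As a hedged sketch of that second step, the standard pdb module can start a call directly under the debugger with runcall, which stops as soon as the function is entered. The helper run_under_pdb is hypothetical, textwrap.shorten stands in for a library function such as knn.predict, and the debugger commands are scripted here only so the example runs without typed input.

```python
import io
import pdb
import textwrap


def run_under_pdb(func, *args, commands=("args", "continue"), **kwargs):
    """Call func under pdb, feeding scripted commands instead of typed input."""
    script = io.StringIO("\n".join(commands) + "\n")
    log = io.StringIO()
    # readrc=False ignores any .pdbrc file; nosigint=True leaves signal handlers alone
    debugger = pdb.Pdb(stdin=script, stdout=log, readrc=False, nosigint=True)
    # runcall stops at the first line of func, then executes the scripted commands
    result = debugger.runcall(func, *args, **kwargs)
    return result, log.getvalue()


# textwrap.shorten is only a stand-in for a third-party function
result, log = run_under_pdb(textwrap.shorten, "Hello  world!", width=12)
print(log)
```

In a notebook the interactive equivalent would presumably be pdb.runcall(knn.predict, X_test), typed with valid input, after which break, next and step can be entered at the prompt inside the library file.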