[Posted]: 2017-08-31 07:49:34
[Question]:
I am trying to merge a URL with a relative file path, both supplied by the user. For example, given the following:
url_base = 'http://myserver.com/my/path/to/files'
path = 'path/to/files/foo.txt'
the desired output is
http://myserver.com/my/path/to/files/foo.txt
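Here the path elements common to the URL and the file path have been merged: my/path/to/files and path/to/files/foo.txt combine to give my/path/to/files/foo.txt, which is then appended to the base of the URL (scheme and host).
The closest I have been able to get is: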
# python 2.7
import os
import urlparse
from collections import OrderedDict
url_base = 'http://myserver.com/my/path/to/files'
path = 'path/to/files/foo.txt'
url = urlparse.urlparse(url_base)
print(url)
# ParseResult(scheme='http', netloc='myserver.com', path='/my/path/to/files', params='', query='', fragment='')
merge_path = os.path.join(url.path, path)
print(merge_path)
# /my/path/to/files/path/to/files/foo.txt
# take an ordered set of the path components
# this is not good because it assumes '/' is the split key
merge_path_set = list(OrderedDict.fromkeys(merge_path.split('/')))
print(merge_path_set)
# ['', 'my', 'path', 'to', 'files', 'foo.txt']
path_joined = os.path.join(*merge_path_set)
print(path_joined)
# my/path/to/files/foo.txt
# THIS DOESN'T WORK:
url_joined = urlparse.urljoin(url.netloc, path_joined)
print(url_joined)
# my/path/to/files/foo.txt
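It seems like there should be a better way to do this, one that leans on the built-in libraries rather than manually splitting on '/' and taking an ordered set as I have done here. I also have not figured out how to turn the result back into a URL for output. Any ideas?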
[Discussion]:
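One possible approach, offered only as a sketch: split both paths into segments, find the longest run where the end of the URL path equals the start of the file path, drop that run from the file path, and rebuild the URL with urlunsplit. The function name merge_url_and_path is made up for illustration, and the sketch assumes URL paths are always '/'-separated (true for URLs, so os.path is not needed). The import fallback covers both Python 3 and the Python 2.7 used in the question.

```python
try:
    from urllib.parse import urlsplit, urlunsplit  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit  # Python 2.7


def merge_url_and_path(url_base, path):
    """Append `path` to `url_base`, collapsing the longest overlap between
    the end of the URL's path and the start of `path`.

    Hypothetical helper for illustration, not a standard library function.
    """
    parts = urlsplit(url_base)
    # Ignore empty segments produced by leading, trailing, or doubled slashes.
    base_segs = [s for s in parts.path.split('/') if s]
    path_segs = [s for s in path.split('/') if s]

    # Try the largest possible overlap first and shrink until one matches.
    overlap = 0
    for n in range(min(len(base_segs), len(path_segs)), 0, -1):
        if base_segs[-n:] == path_segs[:n]:
            overlap = n
            break

    merged = '/' + '/'.join(base_segs + path_segs[overlap:])
    return urlunsplit((parts.scheme, parts.netloc, merged,
                       parts.query, parts.fragment))


url_base = 'http://myserver.com/my/path/to/files'
path = 'path/to/files/foo.txt'
print(merge_url_and_path(url_base, path))
# http://myserver.com/my/path/to/files/foo.txt
```

Unlike the ordered-set approach, this only removes segments where the two paths actually overlap, so a path that legitimately repeats a directory name (for example a/b/a) keeps all of its segments. As for the last step in the question, urljoin returned only the path because url.netloc ('myserver.com') carries no scheme, so it is not treated as an absolute base URL; rebuilding from the parsed parts with urlunsplit avoids that.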