BeautifulSoup - joining two strings, putting them on the same line

【Question title】: BeautifulSoup - joining two strings, putting them on the same line
【Posted】: 2022-07-05 23:49:50
【Question】:

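I want to extract word definitions from an online dictionary. The site's structure is a bit odd: the definitions have no tags or attributes of their own, so I use the .find_next_sibling method. I get all the text I want, but I can't find a way to join the pieces and put them on the same line. Here is my code: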

import requests
from bs4 import BeautifulSoup as bs

word = 'ក'
url = "http://dictionary.tovnah.com/?word=" + word + "&dic=headley&criteria=word"
headers = {"User-Agent" : "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"}
response = requests.get(url, headers=headers)
soup = bs(response.text, "lxml")

main = soup.find('ol', attrs={'start':'1'})
entries = main.find_all('li')
for entry in entries:
    pos = entry.find('a').find_next_sibling(text=True)
    meaning = entry.find('a').find_next_siblings(text=True)[4]
    result = pos + meaning
    
    print(result)

#            first letter of the Cambodian alphabet  

             ( n ) 
              
            
            
             neck; collar; connecting link 

             ( v ) 
              
            
            
             to build, construct, create, found; to base on; to commence, start up; to come into being

Expected result:

first letter of the Cambodian alphabet  

( n ) neck; collar; connecting link 

( v ) to build, construct, create, found; to base on; to commence, start up; to come into being

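I want to get rid of the indentation and put the part of speech (pos) before the definition (meaning). I think the way my output prints is caused by invisible HTML elements. When I put the result in a list, it shows: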

['\n\n\t\t    \n\t\t    \n\t\t     first letter of the Cambodian alphabet \u200b \u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b']
['\n\t\t     ( n ) \n\t\t      \n\t\t    \n\t\t    \n\t\t     neck; collar; connecting link \u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b']
['\n\t\t     ( v ) \n\t\t      \n\t\t    \n\t\t    \n\t\t     to build, construct, create, found; to base on; to commence, start up; to come into being \u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b\u200b']

Even as a list, I still can't find a way to get rid of all those unwanted elements. Please advise.

screenshot of the page structure

【Question comments】:

Tags: python arrays join beautifulsoup

【Solution 1】:

Use .strip() to remove leading and trailing whitespace and newlines:

import requests
from bs4 import BeautifulSoup as bs

word = 'ក'
url = "http://dictionary.tovnah.com/?word=" + word + "&dic=headley&criteria=word"
headers = {"User-Agent" : "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"}
response = requests.get(url, headers=headers)
soup = bs(response.text, "lxml")

main = soup.find('ol', attrs={'start':'1'})
entries = main.find_all('li')
for entry in entries:
    pos = entry.find('a').find_next_sibling(text=True).strip()
    meaning = entry.find('a').find_next_siblings(text=True)[4].strip()
    result = pos + meaning
    print(result)

Output:

first letter of the Cambodian alphabet  
( n )neck; collar; connecting link 
( v )to build, construct, create, found; to base on; to commence, start up; to come into being
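Two details in this output may still need attention: there is no space between the part of speech and the definition, and .strip() does not treat the zero-width space (\u200b) seen in the question's list output as whitespace, so those characters can survive at the end of each string. The following is a minimal sketch of one way to handle both, not part of the original answer. The helper names clean and join_parts are hypothetical, and the sample strings are shortened stand-ins shaped like the question's output rather than text fetched from the site.

```python
ZERO_WIDTH_SPACE = "\u200b"  # not whitespace as far as str.strip() and str.split() are concerned


def clean(text: str) -> str:
    """Drop zero-width spaces, then collapse every whitespace run to one space."""
    return " ".join(text.replace(ZERO_WIDTH_SPACE, "").split())


def join_parts(*parts: str) -> str:
    """Clean each part and join the non-empty ones with a single space."""
    cleaned = (clean(part) for part in parts)
    return " ".join(part for part in cleaned if part)


# Hypothetical sample values, shaped like the strings in the question.
pos = "\n\t\t     ( n ) \n\t\t      "
meaning = "\n\t\t    \n\t\t     neck; collar; connecting link \u200b\u200b\u200b"

print(join_parts(pos, meaning))  # ( n ) neck; collar; connecting link
```

In the loop above, this would amount to calling join_parts(pos, meaning) on the raw sibling strings instead of stripping and concatenating them. Skipping empty parts keeps an entry without a part of speech from picking up a stray leading or trailing space.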

【Comments】: