【Question title】: Swift: get the HTML value of an Amazon product image
【Posted】: 2021-04-02 23:08:19
【Question】:

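I'm trying to get the image of an Amazon product in my app. I inspected the image in the browser and found that it has the class name gc-design-img-preview. There are actually several elements with the same class, so I try to get only the first one.

This is what I've tried: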

func getAmazonImage(doc: Document) -> String {
    let images: Elements? = try? doc.getElementsByClass("gc-design-img-preview")
    
    guard (images?.first()) != nil else { return "nope" }
        
    guard  let imageUrl : String = try! images?.first()!.text() else { return "nope2" }
    
    print("image: " + imageUrl)
    
    return imageUrl
}

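However, this doesn't return anything useful; it just returns an empty String... What am I missing here? I'm using SwiftSoup; maybe there is another way to do this?

Update:

I think this is what I need, but in Swift: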

const imgSrc = document.querySelector('li.image.item.itemNo0.maintain-height.selected img').getAttribute('src')

console.log(imgSrc) // https://images-na.ssl-images-amazon.com/images/I/71y%2BUGuJl5L._SX522_.jpg

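【Comments】:

  • I went to an Amazon product page but didn't find anything with that class name. Could you provide an example page so I can investigate further?
  • @ManuelMB I tried to find it again, but I can't... not sure what I did. Anyway, the main problem is still the same: I want to get the main image of any Amazon product.

Tags: ios swift parsing html-parsing swiftsoup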


【Solution 1】:

Using SwiftSoup:

import UIKit
import SwiftSoup

class ViewController: UIViewController {

    func getProductImage(url: URL) -> String {
        var result = ""
        do {
            let html = try String(contentsOf: url, encoding: .utf8)
            let doc: Document = try SwiftSoup.parseBodyFragment(html)
            let img: Element = try doc.select("li.image.item.itemNo0.maintain-height.selected img").first()!

            let imgOuterHtml: String = try img.outerHtml()

            let chunks = imgOuterHtml.components(separatedBy: "\"")

            result = chunks[5]
        } catch {
            print(error)
        }

        return result
    }

    override func viewDidLoad() {
        super.viewDidLoad()
        guard let url = URL(string: "https://www.amazon.com/dp/B08FC6C75Y") else {
            fatalError("Can not get url")
        }
        let imgUrl = getProductImage(url: url)
        print(imgUrl)  // https://images-na.ssl-images-amazon.com/images/I/61o7ai%2BYDoL._SL1441_.jpg
    }
}

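【Comments】:

  • It doesn't work for all products. Not sure why.
  • Can you provide a URL that doesn't work?
  • Apparently Amazon doesn't return consistent results for all products, so another approach is needed. Right now I have a problem with Xcode and can't debug, because I get the following error: Cannot create Swift scratch context (couldn't load the Swift stdlib) site:developer.apple.com. I'll try to fix it and post a different solution.
  • Ah, the good old Xcode problems... I think it is actually retrieving the right element, but in my example chunks is different from what I expected, though I'm not sure. Let me know when your Xcode works again and whether you can reproduce the issue.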
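
One possible reason chunks differs between products, offered as a guess rather than a confirmed diagnosis: splitting the tag on double quotes and taking index 5 only yields the src when exactly two quoted attributes come before it. The sketch below uses two hypothetical img tags (the example.com URLs and the helper name sixthQuotedChunk are made up for illustration) to show how the same index picks a different attribute once the attribute order changes.

```swift
import Foundation

// Returns the sixth quote-delimited chunk, mirroring `chunks[5]` in the
// answer above, or nil when the tag has fewer than six chunks.
func sixthQuotedChunk(of tag: String) -> String? {
    let chunks = tag.components(separatedBy: "\"")
    return chunks.count > 5 ? chunks[5] : nil
}

// Hypothetical tags: same image URL, different attribute order.
let srcThird = "<img alt=\"A\" id=\"i\" src=\"https://example.com/a.jpg\">"
let srcSecond = "<img alt=\"\" src=\"https://example.com/a.jpg\" class=\"c\">"

print(sixthQuotedChunk(of: srcThird) ?? "nil")   // https://example.com/a.jpg
print(sixthQuotedChunk(of: srcSecond) ?? "nil")  // c
```

Reading the attribute by name avoids depending on where it sits in the tag.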
【Solution 2】:

For example, for this URL: https://www.amazon.com/DualSense-Wireless-Controller-PlayStation-5/dp/B08FC6C75Y/ref=sr_1_1?dchild=1&fst=as%3Aoff&pf_rd_i=16225016011&pf_rd_m=ATVPDKIKX0DER&pf_rd_p=03b28c2c-71e9-4947-aa06-f8b5dc8bf880&pf_rd_r=CSWVBS40MDKKJYXEJ0AH&pf_rd_s=merchandised-search-3&pf_rd_t=101&qid=1489016289&rnid=16225016011&s=videogames-intl-ship&sr=1-1

const imgSrc = document.querySelector('li.image.item.itemNo0.maintain-height.selected img').getAttribute('src')

console.log(imgSrc) // https://images-na.ssl-images-amazon.com/images/I/71y%2BUGuJl5L._SX522_.jpg

【Comments】:

  • How would I do this with SwiftSoup?
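
A minimal sketch of a SwiftSoup counterpart to the JavaScript above, assuming the same selector matches the page being parsed. The helper name firstImageSrc and the sample markup are hypothetical. It reads the src attribute by name; text() returns only the text inside an element, and an img has none, which would explain the empty String in the question.

```swift
import SwiftSoup

// Returns the `src` of the first element matching `selector`, or nil if
// parsing fails, nothing matches, or the attribute is empty.
func firstImageSrc(in html: String, selector: String) -> String? {
    guard let doc = try? SwiftSoup.parse(html),
          let img = try? doc.select(selector).first(),
          let src = try? img.attr("src"),
          !src.isEmpty else { return nil }
    return src
}

// Hypothetical markup shaped like the element the JavaScript selects.
let sample = """
<ul><li class="image item itemNo0 maintain-height selected">
<img src="https://example.com/main.jpg"></li></ul>
"""
let selector = "li.image.item.itemNo0.maintain-height.selected img"

print(firstImageSrc(in: sample, selector: selector) ?? "not found")
```

Returning an optional instead of force-unwrapping first() lets the caller handle pages where the selector does not match.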
【Solution 3】:

After updating Xcode, I can debug again:

This code works correctly for both URLs:

import UIKit
import SwiftSoup

class ViewController: UIViewController {

    func getProductImage(url: URL) -> String {
        var result = ""
        do {
            let html = try String(contentsOf: url, encoding: .utf8)
            let doc: Document = try SwiftSoup.parseBodyFragment(html)
            let img: Element = try doc.select("li.image.item.itemNo0.maintain-height.selected img").first()!
            let imgOuterHtml: String = try img.outerHtml()
            let imgUrl = getImageUrl(imgOuterHtml)
            result = imgUrl
        } catch {
            print(error)
        }
        return result
    }

    func getImageUrl(_ input: String) -> String {
        let detector = try! NSDataDetector(types: NSTextCheckingResult.CheckingType.link.rawValue)
        let matches = detector.matches(in: input, options: [], range: NSRange(location: 0, length: input.utf16.count))

        guard let range = Range(matches[0].range, in: input) else { fatalError("Can not get range") }
        let url = input[range]

        return String(url).components(separatedBy: "&")[0]
    }

    override func viewDidLoad() {
        super.viewDidLoad()

        //guard let url = URL(string: "https://www.amazon.com/dp/B08FC6C75Y") else {
        guard let url = URL(string: "https://www.amazon.com/dp/B0084DS9EE") else {
            fatalError("Can not get url")
        }
        let imgUrl = getProductImage(url: url)
        //print(imgUrl)// https://images-na.ssl-images-amazon.com/images/I/61o7ai%2BYDoL._SL1441_.jpg
        print(imgUrl)  // https://images-na.ssl-images-amazon.com/images/I/41ZmuuKMtmL._SY450_.jpg
    }
}

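【Comments】:

【Solution 4】:

For the second URL, the image is available in different resolutions. You can read it from different attributes: data-midres-replacement, data-zoom-hires, or data-a-hires. All of them are options in the code, but some have been commented out.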

    import UIKit
    import SwiftSoup
    
    class ViewController: UIViewController {
    
    func getProductImage(url: URL)-> String{
        var result = ""
        do {
            let html = try String(contentsOf: url, encoding: .utf8)
            let doc: Document = try SwiftSoup.parseBodyFragment(html)
                                                
            let img: Element = try doc.select(".image-size-wrapper.fp-image-wrapper.image-block-display-flex img").first()!  
            let src  = try img.attr("src")  
    
            if src.contains("data:image/gif;base64"){
                
                let dataMidresReplacement  = try img.attr("data-midres-replacement")
                print("dataMidresReplacement: \(dataMidresReplacement)") // dataMidresReplacement: https://images-na.ssl-images-amazon.com/images/I/41ZmuuKMtmL._AC_SY350_.jpg
                result = dataMidresReplacement
                /*
                let dataZoomHires  = try img.attr("data-zoom-hires")
                print("dataZoomHires: \(dataZoomHires)") // dataZoomHires: https://images-na.ssl-images-amazon.com/images/I/41ZmuuKMtmL._AC_SL1500_.jpg
                result = dataZoomHires
                */
                /*
                let dataHires  = try img.attr("data-a-hires")
                print("dataHires: \(dataHires)") // dataHires: https://images-na.ssl-images-amazon.com/images/I/41ZmuuKMtmL._AC_SY1000_.jpg
                result = dataHires
                */
            } else {
                result = src
            }
        }
        catch {
            print(error)
        }
        
        return result
    }
    
    override func viewDidLoad() {
        super.viewDidLoad()
        
       //guard let url = URL(string: "https://www.amazon.com/dp/B08FC6C75Y") else {
       guard let url = URL(string: "https://www.amazon.com/dp/B0084DS9EE") else {
            fatalError("Can not get url")
        }
        let imgUrl = getProductImage(url: url)
        //print(imgUrl) // https://images-na.ssl-images-amazon.com/images/I/61o7ai%2BYDoL._AC_SY350_QL15_.jpg
        print(imgUrl)  // data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
       }
    }
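
Instead of commenting options in and out, the attribute choice could be expressed as a fallback chain. The sketch below is an illustration under assumptions, not part of the answer: the helper name bestImageURL, the preference order (largest image first, judging by the size suffixes in the sample URLs above), and the example.com values are all hypothetical. It works on a plain dictionary so it runs without SwiftSoup.

```swift
import Foundation

// Attribute names from the answer above; the order is an assumed preference.
let preferredAttributes = ["data-zoom-hires", "data-a-hires", "data-midres-replacement", "src"]

// Returns the first usable value, skipping attributes that are missing,
// empty, or inline `data:` placeholders such as the 1x1 GIF.
func bestImageURL(from attributes: [String: String]) -> String? {
    for name in preferredAttributes {
        guard let value = attributes[name],
              !value.isEmpty,
              !value.hasPrefix("data:") else { continue }
        return value
    }
    return nil
}

// Hypothetical attribute values for one img element.
let attrs = [
    "src": "data:image/gif;base64,R0lGOD",
    "data-midres-replacement": "https://example.com/mid.jpg"
]
print(bestImageURL(from: attrs) ?? "not found")  // https://example.com/mid.jpg
```

With SwiftSoup, the dictionary could be filled by calling img.attr(name) for each name; the empty-string check then covers attributes that are absent.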
    

【Comments】:
