【Question title】: How to access a #document inside an iframe using puppeteer
【Posted】: 2022-05-03 23:57:55
【Question description】:

I am trying to access the document inside an iframe (the iframe has no ID). I can reach the iframe itself, but it contains a document that I need to access. This is how I get to the iframe using puppeteer:

for (const frame of page.mainFrame().childFrames()) {
  // frame.url() is the public accessor; _url is a private field
  if (frame.url().includes('hcaptcha')) {
    console.log(frame);
  }
}

After console.log(frame) I get output like this:

<ref *1> Frame {
  _url: 'https://newassets.hcaptcha.com/captcha/v1/f6912ef/static/hcaptcha-checkbox.html#id=0s6dbw6vm1m&host=discord.com&sentry=true&reportapi=https%3A%2F%2Faccounts.hcaptcha.com&recaptchacompat=true&custom=false&hl=en&tplinks=on&sitekey=f5561ba9-8f1e-40ca-9b5b-a0b3f719ef34&theme=dark',
  _detached: false,
  _loaderId: 'C5D28A44CA23DF9FE035DEA8A582CCF4',
  _lifecycleEvents: Set(10) {
    'init',
    'DOMContentLoaded',
    'commit',
    'networkAlmostIdle',
    'load',
    'firstPaint',
    'firstContentfulPaint',
    'firstMeaningfulPaintCandidate',
    'firstMeaningfulPaint',
    'networkIdle'
  },
  _frameManager: <ref *2> FrameManager {
    eventsMap: Map(5) {
      Symbol(FrameManager.FrameAttached) => [Array],
      Symbol(FrameManager.FrameDetached) => [Array],
      Symbol(FrameManager.FrameNavigated) => [Array],
      Symbol(FrameManager.LifecycleEvent) => [],
      Symbol(FrameManager.FrameNavigatedWithinDocument) => []
    },
    emitter: {
      all: [Map],
      on: [Function: on],
      off: [Function: off],
      emit: [Function: emit]
    },
    _frames: Map(3) {
      'A69F314F873F3ACC2BEA21B83F5D9CFE' => [Frame],
      '4B7CAC9D4F330D6A02516AA51A28F8FF' => [Frame],
      '2A075D2F6483D921662E2694F01C6511' => [Circular *1]
    },
    _contextIdToContext: Map(6) {
      'B80B8EC3730F5CB200976DD409258982:5' => [ExecutionContext],
      'B80B8EC3730F5CB200976DD409258982:6' => [ExecutionContext],
      'CF7C9BD621D3B33F481293094B04066E:1' => [ExecutionContext],
      'CF7C9BD621D3B33F481293094B04066E:2' => [ExecutionContext],
      '3F74F2CAB7F2F7A106B773BA6953B5DB:3' => [ExecutionContext],
      '3F74F2CAB7F2F7A106B773BA6953B5DB:4' => [ExecutionContext]
    },
    _isolatedWorlds: Set(3) {
      'B80B8EC3730F5CB200976DD409258982:__puppeteer_utility_world__',
      'CF7C9BD621D3B33F481293094B04066E:__puppeteer_utility_world__',
      '3F74F2CAB7F2F7A106B773BA6953B5DB:__puppeteer_utility_world__'
    },
    _client: CDPSession {
      eventsMap: [Map],
      emitter: [Object],
      _callbacks: Map(0) {},
      _connection: [Connection],
      _targetType: 'page',
      _sessionId: 'B80B8EC3730F5CB200976DD409258982',
      send: [AsyncFunction (anonymous)]
    },
    _page: Page {
      eventsMap: Map(0) {},
      emitter: [Object],
      _closed: false,
      _timeoutSettings: [TimeoutSettings],
      _pageBindings: Map(0) {},
      _javascriptEnabled: true,
      _workers: [Map],
      _fileChooserInterceptors: Set(0) {},
      _userDragInterceptionEnabled: false,
      _handlerMap: [WeakMap],
      _client: [CDPSession],
      _target: [Target],
      _keyboard: [Keyboard],
      _mouse: [Mouse],
      _touchscreen: [Touchscreen],
      _accessibility: [Accessibility],
      _frameManager: [Circular *2],
      _emulationManager: [EmulationManager],
      _tracing: [Tracing],
      _coverage: [Coverage],
      _screenshotTaskQueue: [TaskQueue],
      _viewport: null
    },
    _networkManager: NetworkManager {
      eventsMap: [Map],
      emitter: [Object],
      _networkEventManager: [NetworkEventManager],
      _extraHTTPHeaders: {},
      _credentials: null,
      _attemptedAuthentications: Set(0) {},
      _userRequestInterceptionEnabled: false,
      _protocolRequestInterceptionEnabled: false,
      _userCacheDisabled: false,
      _emulatedNetworkConditions: [Object],
      _client: [CDPSession],
      _ignoreHTTPSErrors: true,
      _frameManager: [Circular *2]
    },
    _timeoutSettings: TimeoutSettings {
      _defaultTimeout: null,
      _defaultNavigationTimeout: null
    },
    _mainFrame: Frame {
      _url: 'https://discord.com/login',
      _detached: false,
      _loaderId: 'B22F2848497D00E837A1AC824039C081',
      _lifecycleEvents: [Set],
      _frameManager: [Circular *2],
      _parentFrame: null,
      _id: 'A69F314F873F3ACC2BEA21B83F5D9CFE',
      _childFrames: [Set],
      _client: [CDPSession],
      _mainWorld: [DOMWorld],
      _secondaryWorld: [DOMWorld],
      _name: undefined
    }
  },
  _parentFrame: <ref *3> Frame {
    _url: 'https://discord.com/login',
    _detached: false,
    _loaderId: 'B22F2848497D00E837A1AC824039C081',
    _lifecycleEvents: Set(10) {
      'init',
      'firstPaint',
      'firstMeaningfulPaintCandidate',
      'DOMContentLoaded',
      'firstContentfulPaint',
      'load',
      'firstImagePaint',
      'networkAlmostIdle',
      'firstMeaningfulPaint',
      'networkIdle'
    },
    _frameManager: <ref *2> FrameManager {
      eventsMap: [Map],
      emitter: [Object],
      _frames: [Map],
      _contextIdToContext: [Map],
      _isolatedWorlds: [Set],
      _client: [CDPSession],
      _page: [Page],
      _networkManager: [NetworkManager],
      _timeoutSettings: [TimeoutSettings],
      _mainFrame: [Circular *3]
    },
    _parentFrame: null,
    _id: 'A69F314F873F3ACC2BEA21B83F5D9CFE',
    _childFrames: Set(2) { [Frame], [Circular *1] },
    _client: CDPSession {
      eventsMap: [Map],
      emitter: [Object],
      _callbacks: Map(0) {},
      _connection: [Connection],
      _targetType: 'page',
      _sessionId: 'B80B8EC3730F5CB200976DD409258982',
      send: [AsyncFunction (anonymous)]
    },
    _mainWorld: DOMWorld {
      _documentPromise: null,
      _contextPromise: [Promise],
      _contextResolveCallback: null,
      _detached: false,
      _waitTasks: Set(0) {},
      _boundFunctions: Map(0) {},
      _ctxBindings: Set(0) {},
      _settingUpBinding: null,
      _client: [CDPSession],
      _frameManager: [FrameManager],
      _frame: [Circular *3],
      _timeoutSettings: [TimeoutSettings]
    },
    _secondaryWorld: DOMWorld {
      _documentPromise: null,
      _contextPromise: [Promise],
      _contextResolveCallback: null,
      _detached: false,
      _waitTasks: Set(0) {},
      _boundFunctions: Map(0) {},
      _ctxBindings: Set(0) {},
      _settingUpBinding: null,
      _client: [CDPSession],
      _frameManager: [FrameManager],
      _frame: [Circular *3],
      _timeoutSettings: [TimeoutSettings]
    },
    _name: undefined
  },
  _id: '2A075D2F6483D921662E2694F01C6511',
  _childFrames: Set(0) {},
  _client: CDPSession {
    eventsMap: Map(12) {
      'Runtime.bindingCalled' => [Array],
      'Page.frameAttached' => [Array],
      'Page.frameNavigated' => [Array],
      'Page.navigatedWithinDocument' => [Array],
      'Page.frameDetached' => [Array],
      'Page.frameStoppedLoading' => [Array],
      'Runtime.executionContextCreated' => [Array],
      'Runtime.executionContextDestroyed' => [Array],
      'Runtime.executionContextsCleared' => [Array],
      'Page.lifecycleEvent' => [Array],
      'Target.attachedToTarget' => [Array],
      'Target.detachedFromTarget' => [Array]
    },
    emitter: {
      all: [Map],
      on: [Function: on],
      off: [Function: off],
      emit: [Function: emit]
    },
    _callbacks: Map(0) {},
    _connection: Connection {
      eventsMap: [Map],
      emitter: [Object],
      _lastId: 115,
      _sessions: [Map],
      _closed: false,
      _callbacks: Map(0) {},
      _url: 'ws://127.0.0.1:9222/devtools/browser/9d9c5886-5aa5-49db-9dd0-e5596725fb91',
      _delay: 0,
      _transport: [NodeWebSocketTransport]
    },
    _targetType: 'iframe',
    _sessionId: '3F74F2CAB7F2F7A106B773BA6953B5DB'
  },
  _mainWorld: DOMWorld {
    _documentPromise: null,
    _contextPromise: Promise { [ExecutionContext] },
    _contextResolveCallback: null,
    _detached: false,
    _waitTasks: Set(0) {},
    _boundFunctions: Map(0) {},
    _ctxBindings: Set(0) {},
    _settingUpBinding: null,
    _client: CDPSession {
      eventsMap: [Map],
      emitter: [Object],
      _callbacks: Map(0) {},
      _connection: [Connection],
      _targetType: 'iframe',
      _sessionId: '3F74F2CAB7F2F7A106B773BA6953B5DB'
    },
    _frameManager: <ref *2> FrameManager {
      eventsMap: [Map],
      emitter: [Object],
      _frames: [Map],
      _contextIdToContext: [Map],
      _isolatedWorlds: [Set],
      _client: [CDPSession],
      _page: [Page],
      _networkManager: [NetworkManager],
      _timeoutSettings: [TimeoutSettings],
      _mainFrame: [Frame]
    },
    _frame: [Circular *1],
    _timeoutSettings: TimeoutSettings {
      _defaultTimeout: null,
      _defaultNavigationTimeout: null
    }
  },
  _secondaryWorld: DOMWorld {
    _documentPromise: null,
    _contextPromise: Promise { [ExecutionContext] },
    _contextResolveCallback: null,
    _detached: false,
    _waitTasks: Set(0) {},
    _boundFunctions: Map(0) {},
    _ctxBindings: Set(0) {},
    _settingUpBinding: null,
    _client: CDPSession {
      eventsMap: [Map],
      emitter: [Object],
      _callbacks: Map(0) {},
      _connection: [Connection],
      _targetType: 'iframe',
      _sessionId: '3F74F2CAB7F2F7A106B773BA6953B5DB'
    },
    _frameManager: <ref *2> FrameManager {
      eventsMap: [Map],
      emitter: [Object],
      _frames: [Map],
      _contextIdToContext: [Map],
      _isolatedWorlds: [Set],
      _client: [CDPSession],
      _page: [Page],
      _networkManager: [NetworkManager],
      _timeoutSettings: [TimeoutSettings],
      _mainFrame: [Frame]
    },
    _frame: [Circular *1],
    _timeoutSettings: TimeoutSettings {
      _defaultTimeout: null,
      _defaultNavigationTimeout: null
    }
  },
  _name: ''
}

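What I want to do is solve the captcha on behalf of the human. Sometimes, however, it shows images as the challenge, and in that case I want to notify the human to solve the captcha themselves. That is a later step, though; first I need to access the document inside the iframe. Please see the image below for what the iframe element looks like.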

[Image: iframe element]

Any help or guidance is much appreciated. I don't strictly need code (it is welcome if you can provide it); what I need most is direction.

【Comments】:

    Tags: javascript iframe puppeteer document


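    【Solution 1】:

    You are almost there. The Puppeteer Frame objects you get by iterating over the frames expose their own API. With it you can get the HTML of the whole frame document via await frame.content(), or a handle to a specific element via await frame.$(selector).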
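
    The approach above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: `findFrameByUrl` and `readFrame` are hypothetical helper names introduced here for illustration, and `page` is assumed to be a Puppeteer Page that has already navigated to the site. Only `page.frames()`, `frame.url()`, `frame.content()` and `frame.$()` are Puppeteer calls.

```javascript
// Hypothetical helpers (not part of Puppeteer) built on the public Frame API.

// Return the first frame on the page whose URL contains `needle`, or null.
function findFrameByUrl(page, needle) {
  // page.frames() lists every frame attached to the page, including nested ones.
  return page.frames().find((frame) => frame.url().includes(needle)) ?? null;
}

// Read the frame's HTML and report whether `selector` matches an element in it.
async function readFrame(frame, selector) {
  const html = await frame.content();       // full HTML of the frame's document
  const element = await frame.$(selector);  // ElementHandle, or null if no match
  return { html, found: element !== null };
}
```

    A call such as `readFrame(findFrameByUrl(page, 'hcaptcha'), selector)` would then return the frame's HTML plus a flag saying whether the element exists. The selector itself depends on the markup inside the iframe, so it is left as a parameter here, and the caller should check for a null frame first.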

    【Comments】:
