【Question title】: How can I mass save a list of URLs to PDF?
【Posted】: 2019-11-24 13:28:39
【Question description】:

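I have a list of URLs for recipes that I want to save as PDFs before my subscription to the website runs out. The URLs were gathered with Xenu Link Sleuth and saved to an Excel spreadsheet, which was cleaned up, de-duplicated and exported to a tab-delimited txt file.

A friend wrote an AutoHotKey script that takes the URLs and uses the "Print to PDF" option in Chrome, but there are problems with it. Besides making my computer unusable while it runs (the script needs control of the mouse), it often stops working properly: it somehow tries to save the same URL twice even though there are no duplicate links, or it simply saves nothing.

Below is the script my friend wrote. It seemed to mostly work for him, but for me, in this state, it doesn't save anything. After the Sleep 7000 I added another Send {Enter}, so that the Save button in the Save As dialog that saves the PDF gets activated, followed by another, shorter Sleep.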

#NoEnv  ; Recommended for performance and compatibility with future AutoHotkey releases.
; #Warn  ; Enable warnings to assist with detecting common errors.
SendMode Input  ; Recommended for new scripts due to its superior speed and reliability.
SetWorkingDir %A_ScriptDir%  ; Ensures a consistent starting directory.

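Loop, read, %A_ScriptDir%\CookSpread.txt
{
    StringSplit, LineArray, A_LoopReadLine, %A_Tab%
    URL := LineArray1   ; the first tab-separated field is the URL

    Run, %URL%          ; open the URL in the default browser
    Sleep 7000          ; wait for the page to load

    Click 3407, 241     ; click the print button (screen-specific coordinates)
    Sleep 3000
    Send {Enter}        ; press Save in the print interface
    Sleep 7000

    Send ^{F4}          ; close the current tab
    Sleep 4000
}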

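I made a few modifications. One is that I reduced the time between steps (maybe that is the problem). I also added a click that selects the nutritional information so it prints along with the recipe. You can also see my second Enter near the end.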

Sleep 6000
Click 1500, 550
Sleep 500
Click 1480, 220
Sleep 3000
Send {Enter}
Sleep 1000
Send {Enter}
Sleep 1000
Send ^{F4}
Sleep 2000
}

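What I want it to do is run, take a link from the text file, open it in the browser, and move the mouse to the print button and click it, which opens the print interface. Then it should press Enter to click the Save button there, press Enter again to click the Save button in the Save As dialog, save the file, close the current Chrome tab, and start over.

What actually happens is that 20 or 30 URLs go fine, but then something goes wrong during a save: it says the file already exists and asks whether I want to overwrite it. That window stays open while the script keeps trying to carry out the remaining steps, so nothing else gets done. Eventually hundreds of tabs end up open, because the URLs keep being opened in the browser.

I'd like to know whether anyone knows how to correct this, or knows another way to accomplish it. A standalone GUI application, or one that can use my login credentials and do this in the background, would be ideal, since the AutoHotKey script makes my computer unusable while it runs. However, if someone can figure out how to get this script working, that would be good enough for me.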
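
For the background approach asked about above, one possible route is Chrome's headless mode, which can write a PDF without opening a window or touching the mouse. The following is a minimal, untested sketch in AutoHotkey v1, not a confirmed fix: the Chrome path, the "pdf" output folder and the numbered file names are assumptions, and pages behind a login would also need a signed-in browser profile, which this sketch does not handle.

```autohotkey
#NoEnv
; ASSUMPTIONS: Chrome's install path and the output folder are placeholders.
ChromePath := "C:\Program Files\Google\Chrome\Application\chrome.exe"
OutDir := A_ScriptDir . "\pdf"
FileCreateDir, %OutDir%

Loop, Read, %A_ScriptDir%\CookSpread.txt
{
    Fields := StrSplit(A_LoopReadLine, A_Tab)
    URL := Trim(Fields[1])              ; the first tab-separated field is the URL
    if (URL = "")
        continue                        ; skip blank lines
    ; One numbered file per input line, so no two URLs share a file name.
    OutFile := OutDir . "\" . Format("{:04d}", A_Index) . ".pdf"
    ; RunWait blocks until Chrome exits, so pages are printed one at a time.
    RunWait, "%ChromePath%" --headless --disable-gpu --print-to-pdf="%OutFile%" "%URL%", , Hide
}
```

No Save As dialog is involved here, so the "file already exists" prompt does not arise. The extra click that selects the nutritional information has no equivalent in this sketch.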

【Question discussion】:

    Tags: google-chrome pdf printing autohotkey


    【Solution 1】:

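    Try:

    Replace every occurrence of "D:\Downloads" in this code with the path of the folder in which the program saves the printed URLs.

    In the print settings, "Destination" must be "Save as PDF".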

    #NoEnv
    #SingleInstance Force
    SetWorkingDir %A_ScriptDir%
    
    ModernBrowsers := "Chrome_WidgetWin_0,Chrome_WidgetWin_1,MozillaWindowClass"
    LegacyBrowsers := "IEFrame,OperaWindowClass"
    
    FileCreateDir, D:\Downloads\Newly created
    FileMove, D:\Downloads\*.pdf, D:\Downloads\Newly created\, 1
    
    F1::
    If !WinExist("ahk_exe chrome.exe")
        Run, chrome.exe
    WinWait, ahk_exe chrome.exe
    Sleep, 500
    Loop, read, %A_ScriptDir%\CookSpread.txt
    {
        StringSplit, LineArray, A_LoopReadLine, %A_Tab%
        URL := LineArray1
        Run, chrome.exe "%URL%"
        Sleep, 500
        Loop
        {
            WinActivate, ahk_exe chrome.exe
            WinWaitActive, ahk_exe chrome.exe, , 1
            If !(ErrorLevel)
                 break
        }
        Loop
        {
            OutputURL := GetActiveBrowserURL()
            Sleep, 500
            If (OutputURL = "")
                 continue
            If (OutputURL = URL)
                break
        }
        Sleep, 500    
        Loop
        {
            WinActivate, ahk_exe chrome.exe
            WinWaitActive, ahk_exe chrome.exe, , 1
            If !(ErrorLevel)
                 break
        }
        Send, ^p
        Sleep, 500    
        Loop
        {
            WinActivate, ahk_exe chrome.exe
            WinWaitActive, ahk_exe chrome.exe, , 1
            If !(ErrorLevel)
                 break
        }
        Sleep, 300
        Send, {Enter}
        Sleep, 500
        WinWait, Save As ahk_exe chrome.exe   
        Loop
        {
            WinActivate, Save As ahk_exe chrome.exe
            WinWaitActive, Save As ahk_exe chrome.exe, , 1
            If !(ErrorLevel)
            {
                 Send, !s
                 break
            }
        }
        Sleep, 500
        Loop
        {
            FileMove, D:\Downloads\*.pdf, D:\Downloads\Newly created\, 1
            Sleep, 500
            If !FileExist("D:\Downloads\*.pdf")
            {
                WinActivate, ahk_exe chrome.exe
                WinWaitActive, ahk_exe chrome.exe, , 1
                If !(ErrorLevel)
                {           
                    Send, ^w
                    break 
                } 
            }   
        }
        Sleep, 500
    }
    Run D:\Downloads\Newly created
    return
    
    
    ; https://www.autohotkey.com/boards/viewtopic.php?t=3702
    ; Get the URL of the current (active) browser tab
    
    GetActiveBrowserURL(){
        global ModernBrowsers, LegacyBrowsers
        WinGetClass, sClass, A
        If sClass In % ModernBrowsers   ; %
            Return GetBrowserURL_ACC(sClass)
        Else If sClass In % LegacyBrowsers  ; %
            Return GetBrowserURL_DDE(sClass) ; empty string if DDE not supported (or not a browser)
        Else
            Return ""
    }
    
    ; "GetBrowserURL_DDE" adapted from DDE code by Sean, (AHK_L version by maraskan_user)
    ; Found at 
    ; http://autohotkey.com/board/topic/17633-/?p=434518
    
    GetBrowserURL_DDE(sClass) {
        WinGet, sServer, ProcessName, % "ahk_class " sClass     ; %
        StringTrimRight, sServer, sServer, 4
        iCodePage := A_IsUnicode ? 0x04B0 : 0x03EC ; 0x04B0 = CP_WINUNICODE, 0x03EC = CP_WINANSI
        DllCall("DdeInitialize", "UPtrP", idInst, "Uint", 0, "Uint", 0, "Uint", 0)
        hServer := DllCall("DdeCreateStringHandle", "UPtr", idInst, "Str", sServer, "int", iCodePage)
        hTopic := DllCall("DdeCreateStringHandle", "UPtr", idInst, "Str", "WWW_GetWindowInfo", "int", iCodePage)
        hItem := DllCall("DdeCreateStringHandle", "UPtr", idInst, "Str", "0xFFFFFFFF", "int", iCodePage)
        hConv := DllCall("DdeConnect", "UPtr", idInst, "UPtr", hServer, "UPtr", hTopic, "Uint", 0)
        hData := DllCall("DdeClientTransaction", "Uint", 0, "Uint", 0, "UPtr", hConv, "UPtr", hItem, "UInt", 1, "Uint", 0x20B0, "Uint", 10000, "UPtrP", nResult) ; 0x20B0 = XTYP_REQUEST, 10000 = 10s timeout
        sData := DllCall("DdeAccessData", "Uint", hData, "Uint", 0, "Str")
        DllCall("DdeFreeStringHandle", "UPtr", idInst, "UPtr", hServer)
        DllCall("DdeFreeStringHandle", "UPtr", idInst, "UPtr", hTopic)
        DllCall("DdeFreeStringHandle", "UPtr", idInst, "UPtr", hItem)
        DllCall("DdeUnaccessData", "UPtr", hData)
        DllCall("DdeFreeDataHandle", "UPtr", hData)
        DllCall("DdeDisconnect", "UPtr", hConv)
        DllCall("DdeUninitialize", "UPtr", idInst)
        csvWindowInfo := StrGet(&sData, "CP0")
        StringSplit, sWindowInfo, csvWindowInfo, `" ;"; comment to avoid a syntax highlighting issue in autohotkey.com/boards
        Return sWindowInfo2
    }
    
    GetBrowserURL_ACC(sClass) {
        global nWindow, accAddressBar
        If (nWindow != WinExist("ahk_class " sClass)) ; reuses accAddressBar if it's the same window
        {
            nWindow := WinExist("ahk_class " sClass)
            accAddressBar := GetAddressBar(Acc_ObjectFromWindow(nWindow))
        }
        Try sURL := accAddressBar.accValue(0)
        If (sURL == "") {
            WinGet, nWindows, List, % "ahk_class " sClass ; ; % In case of a nested browser window as in the old CoolNovo (TO DO: check if still needed)
            If (nWindows > 1) {
                accAddressBar := GetAddressBar(Acc_ObjectFromWindow(nWindows2))
                Try sURL := accAddressBar.accValue(0)
            }
        }
        If ((sURL != "") and (SubStr(sURL, 1, 4) != "http")) ; Modern browsers omit "http://"
            sURL := "http://" sURL
        If (sURL == "")
            nWindow := -1 ; Don't remember the window if there is no URL
        Return sURL
    }
    
    ; "GetAddressBar" based in code by uname
    ; Found at http://autohotkey.com/board/topic/103178-/?p=637687
    
    GetAddressBar(accObj) {
        Try If ((accObj.accRole(0) == 42) and IsURL(accObj.accValue(0)))
            Return accObj
        Try If ((accObj.accRole(0) == 42) and IsURL("http://" accObj.accValue(0))) ; Modern browsers omit "http://"
            Return accObj
        For nChild, accChild in Acc_Children(accObj)
            If IsObject(accAddressBar := GetAddressBar(accChild))
                Return accAddressBar
    }
    
    IsURL(sURL) {
        Return RegExMatch(sURL, "^(?<Protocol>https?|ftp)://(?<Domain>(?:[\w-]+\.)+\w\w+)(?::(?<Port>\d+))?/?(?<Path>(?:[^:/?# ]*/?)+)(?:\?(?<Query>[^#]+)?)?(?:\#(?<Hash>.+)?)?$")
    }
    
    ; The code below is part of the Acc.ahk Standard Library by Sean (updated by jethrow)
    ; Found at http://autohotkey.com/board/topic/77303-/?p=491516
    
    Acc_Init()
    {
        static h
        If Not h
            h:=DllCall("LoadLibrary","Str","oleacc","Ptr")
    }
    Acc_ObjectFromWindow(hWnd, idObject = 0)
    {
        Acc_Init()
        If DllCall("oleacc\AccessibleObjectFromWindow", "Ptr", hWnd, "UInt", idObject&=0xFFFFFFFF, "Ptr", -VarSetCapacity(IID,16)+NumPut(idObject==0xFFFFFFF0?0x46000000000000C0:0x719B3800AA000C81,NumPut(idObject==0xFFFFFFF0?0x0000000000020400:0x11CF3C3D618736E0,IID,"Int64"),"Int64"), "Ptr*", pacc)=0
        Return ComObjEnwrap(9,pacc,1)
    }
    Acc_Query(Acc) {
        Try Return ComObj(9, ComObjQuery(Acc,"{618736e0-3c3d-11cf-810c-00aa00389b71}"), 1)
    }
    Acc_Children(Acc) {
        If ComObjType(Acc,"Name") != "IAccessible"
            ErrorLevel := "Invalid IAccessible Object"
        Else {
            Acc_Init(), cChildren:=Acc.accChildCount, Children:=[]
            If DllCall("oleacc\AccessibleChildren", "Ptr",ComObjValue(Acc), "Int",0, "Int",cChildren, "Ptr",VarSetCapacity(varChildren,cChildren*(8+2*A_PtrSize),0)*0+&varChildren, "Int*",cChildren)=0 {
                Loop %cChildren%
                    i:=(A_Index-1)*(A_PtrSize*2+8)+8, child:=NumGet(varChildren,i), Children.Insert(NumGet(varChildren,i-8)=9?Acc_Query(child):child), NumGet(varChildren,i-8)=9?ObjRelease(child):
                Return Children.MaxIndex()?Children:
            } Else
                ErrorLevel := "AccessibleChildren DllCall Failed"
        }
    }
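
    Rather than editing every "D:\Downloads" by hand, the folder could be held in one variable at the top of the script. A minimal sketch of the two setup lines rewritten that way, assuming AutoHotkey v1; DownloadDir and NewDir are illustrative names, not part of the script above:

    ```autohotkey
    #NoEnv
    ; ASSUMPTION: change this one path to the folder Chrome saves PDFs into.
    DownloadDir := "D:\Downloads"
    NewDir := DownloadDir . "\Newly created"

    FileCreateDir, %NewDir%
    ; Move any PDFs already present (the final 1 overwrites files of the same name).
    FileMove, %DownloadDir%\*.pdf, %NewDir%\, 1
    ```

    The later FileMove, FileExist and Run lines would then use the same two variables, for example FileExist(DownloadDir . "\*.pdf") and Run, %NewDir%.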
    

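    【Discussion】:

    • This works a lot better than mine, although it seems to stop after 6 or so loops. I can't tell whether that's because the mouse focus changed to something else or whether it's part of the script. Once I click something in the browser to continue the save manually, it restarts on its own.
    • Thanks! I'll try the edited version when I get home. I assume the latest edit is the one to try; I noticed it was edited several times.
    • It works for the most part, but it hangs on links that redirect; that's not the script's fault. Each list has over 1,000 recipes (America's Test Kitchen and its two sister sites :)). It was getting hung up on what to do when a file already exists, but I added "Send, !y" to the loop before "Send, ^w". Either way, it's much better than the first AHK script I had.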
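
    The "Send, !y" workaround from the last comment could be made conditional, so that Alt+Y is only sent when an overwrite prompt is actually open. A minimal sketch, assuming AutoHotkey v1 and an English Windows where the prompt is titled "Confirm Save As"; HandleOverwritePrompt and TimeoutSec are hypothetical names:

    ```autohotkey
    ; Wait briefly for Chrome's overwrite prompt and answer "Yes" if it appears.
    ; ASSUMPTION: the prompt's title starts with "Confirm Save As".
    HandleOverwritePrompt(TimeoutSec := 2)
    {
        WinWait, Confirm Save As ahk_exe chrome.exe, , %TimeoutSec%
        if (ErrorLevel)          ; timed out: no prompt appeared
            return false
        WinActivate              ; acts on the window WinWait just found
        WinWaitActive, , , 1
        Send, !y                 ; Alt+Y = Yes, overwrite the existing file
        return true
    }
    ```

    It would be called right after "Send, !s" in the Save As loop, so the script neither stalls on the prompt nor sends a stray Alt+Y to the page when no prompt is shown.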