【Question title】: R: Emulating a complex form with httr
【Posted】: 2016-10-30 09:29:57
【Question】:

I am trying to use httr to get the results of that form.

After looking at the form results, I tried the following:

library(httr)
library(stringr)

r = str_c("http://www.memoiredeshommes.sga.defense.gouv.fr/fr/arkotheque/",
          "client/mdh/base_morts_pour_la_france_premiere_guerre/index.php")

q = list(
  "action" = 1,
  "todo" = "rechercher",
  "le_id"  = "",
  "multisite" = "",
  "r_c_nom" = "mo",
  "r_c_nom_like" = 1,
  "r_c_prenom" = "",
  "r_c_prenom_like" = 1,
  "r_c_naissance_jour_mois_annee_jj_debut" = "",
  "r_c_naissance_jour_mois_annee_mm_debut" = "",
  "r_c_naissance_jour_mois_annee_yyyy_debut" = 1890,
  "r_c_naissance_jour_mois_annee_jj_fin" = "",
  "r_c_naissance_jour_mois_annee_mm_fin" = "",
  "r_c_naissance_jour_mois_annee_yyyy_fin" = "",
  "r_c_id_naissance_departement" = "",
  "hidden_c_id_naissance_departement" = "",
  "r_c_id_naissance_pays" = "",
  "hidden_c_id_naissance_pays" = "",
  "r_annot_c_id_grade" = "",
  "hidden_c_id_grade" = "",
  "r_annot_c_id_unite" = "",
  "hidden_c_id_unite" = "",
  "r_annot_c_id_recrutement_bureau" = "",
  "hidden_c_id_recrutement_bureau" = "",
  "r_annot_c_classe" = "",
  "r_annot_c_recrutement_matricule" = "",
  "r_annot_c_id_naissance_lieu" = "",
  "hidden_c_id_naissance_lieu" = "",
  "r_annot_c_deces_jour_mois_annee_jj_debut" = "",
  "r_annot_c_deces_jour_mois_annee_mm_debut" = "",
  "r_annot_c_deces_jour_mois_annee_yyyy_debut" = "",
  "r_annot_c_deces_jour_mois_annee_jj_fin" = "",
  "r_annot_c_deces_jour_mois_annee_mm_fin" = "",
  "r_annot_c_deces_jour_mois_annee_yyyy_fin" = "",
  "r_annot_c_id_deces_lieu" = "",
  "hidden_c_id_deces_lieu" = "",
  "r_annot_c_deces_lieu_complement" = "",
  "r_annot_c_deces_lieu_complement_like" = 1,
  "r_annot_c_id_deces_departement" = "",
  "hidden_c_id_deces_departement" = "",
  "r_annot_c_id_deces_pays" = "",
  "hidden_c_id_deces_pays" = "",
  "r_annot_c_id_transcription_etablissement_lieu" = "",
  "hidden_c_id_transcription_etablissement_lieu" = "",
  "r_annot_c_id_transcription_etablissement_departement" = "",
  "hidden_c_id_transcription_etablissement_departement" = "",
  "r_annot_c_id_transcription_etablissement_pays" = "",
  "hidden_c_id_transcription_etablissement_pays" = ""
)

t = GET(r, query = q, verbose())
writeLines(content(t, "text", encoding = "UTF-8"), "~/Desktop/test.html")

...which does not work at all (all I get is NA).

What am I doing wrong?

【Comments on the question】:

    Tags: r get httr


    【Answer 1】:

    You could try it this way:

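    library(rvest)
    # `r` (the form URL) and `q` (the field list) are the objects defined in the question.
    url <- r
    # Note: rvest:::request_POST is an unexported internal function, so it may change between rvest versions.
    html_session(url) %>%
      rvest:::request_POST(url, body = q, encode = "form") %>%
      read_html() %>%
      html_table()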
    # [[1]]
    #             Nom                     Prénom(s) Date de naissance                   Département/Pays de naissance Détail     Images Panier Lien Fiche annotée
    # 1          MOAL                    Alain Marc        10-08-1890                                  29 - Finistère Détail Visualiser Panier  Ark           oui
    # 2          MOAL                          Jean        22-12-1890                                  29 - Finistère Détail Visualiser Panier  Ark           oui
    # 3          MOAL                  Joseph Marie        29-04-1890                                  29 - Finistère Détail Visualiser Panier  Ark           oui
    # 4        MOALIC           Pierre Joseph Marie        05-04-1890                                  29 - Finistère Détail Visualiser Panier  Ark           oui
    # ...
    

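    【Comments】:

    • Great trick, thanks a lot! May I ask (1) how you came across the request_POST function in rvest, and (2) whether there is a way to do the same thing with httr? I will try to answer (2) myself by looking at the code of request_POST. Thanks again.
    • OK, forget (2), the answer is obvious from rvest's code.
    • You're welcome. (1) Cheers to Floo0, this one helped me once too. :)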
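
Following up on point (2) in the comments: the answer above sends the fields as a form-encoded POST body rather than as a GET query string, so the same request can presumably be made with httr alone. The snippet below is only a minimal sketch under that assumption. `form_url` and `submit_form` are hypothetical names introduced here for illustration, `q` is the field list from the question, and the site may have changed since the thread was written.

```r
library(httr)

# Form endpoint taken from the question.
form_url <- paste0(
  "http://www.memoiredeshommes.sga.defense.gouv.fr/fr/arkotheque/",
  "client/mdh/base_morts_pour_la_france_premiere_guerre/index.php"
)

# Hypothetical helper (not part of httr or rvest): send the fields as a
# URL-encoded POST body instead of a GET query string.
submit_form <- function(url, fields) {
  resp <- POST(url, body = fields, encode = "form")
  stop_for_status(resp)  # turn HTTP 4xx/5xx responses into R errors
  resp
}

# Usage, with `q` being the field list defined in the question:
# resp   <- submit_form(form_url, q)
# page   <- content(resp, as = "parsed")  # HTML is parsed via xml2
# tables <- rvest::html_table(page)
```

Keeping the request in a small function makes it easy to reuse with different field values, for example a different `r_c_nom` prefix.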