【Question title】: Connect to a web page with jsoup in Java
【Posted】: 2014-08-17 00:05:58
【Question description】:

I am trying to use jsoup in Java to get the original URL of a page that I know is reached through a short URL. Example: short URL: http://wornon.tv/20602 original URL: http://www1.bloomingdales.com/shop/product/bcbgmaxazria-dress-holly-blocked-sheath?ID=985138&LinkshareID=J84DHJLQkR4-TrjplRpi_nk8..LkpiI2ZA&PartnerID=LINKSHARE&cm_mmc=LINKSHARE--n--n-_-n

Can I do this with jsoup, or should I use another tool? Thank you.

【Comments】:

    Tags: java jsoup url-redirection


    【Solution 1】:

    Try this:

    import java.io.IOException;
    import java.net.HttpURLConnection;
    
    import org.jsoup.Connection.Response;
    import org.jsoup.Jsoup;
    
    public class Test {
    
        public static void main(String[] args) throws IOException {
            // First hop: will return http://wornon.tv/out.php?z=20602
            String redirectUrl = getRedirectUrl("http://wornon.tv/20602");
            // Second hop: will return http://api.shopstyle.com/action/apiVisitRetailer?url=http%3A%2F%2Fwww1.bloomingdales.com%2Fshop%2Fproduct%2Fbcbgmaxazria-dress-holly-blocked-sheath%3FID%3D985138&pid=uid5721-3671061-71&utm_medium=widget&utm_source=Product+Link
            redirectUrl = getRedirectUrl(redirectUrl);
            System.out.println(redirectUrl);
        }
    
        // Requests the URL without following redirects and returns the Location header,
        // or null if the response is not a 301, 302, or 303 redirect.
        private static String getRedirectUrl(String url) throws IOException {
            Response response = Jsoup.connect(url).followRedirects(false).execute();
            int status = response.statusCode();
            if (status == HttpURLConnection.HTTP_MOVED_TEMP
                    || status == HttpURLConnection.HTTP_MOVED_PERM
                    || status == HttpURLConnection.HTTP_SEE_OTHER) {
                return response.header("location");
            }
            return null;
        }
    }
    
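    The URL returned by the second hop above is still an intermediate link, but its `url` query parameter appears to carry the retailer page in percent-encoded form. As a rough sketch, assuming that parameter really does hold the target page, it can be pulled out with the JDK alone. The class name `EmbeddedUrlExtractor` and the helper `queryParam` are hypothetical names used only for this illustration.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class EmbeddedUrlExtractor {

    /** Returns the decoded value of the named query parameter, or null if it is absent. */
    static String queryParam(String url, String name) {
        String rawQuery = URI.create(url).getRawQuery();
        if (rawQuery == null) {
            return null;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            if (key.equals(name)) {
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                // Decode only after splitting, so an encoded '&' or '=' inside the value is safe.
                return URLDecoder.decode(value, StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The redirect URL quoted in the answer's code comment.
        String redirect = "http://api.shopstyle.com/action/apiVisitRetailer"
                + "?url=http%3A%2F%2Fwww1.bloomingdales.com%2Fshop%2Fproduct%2Fbcbgmaxazria-dress-holly-blocked-sheath%3FID%3D985138"
                + "&pid=uid5721-3671061-71&utm_medium=widget&utm_source=Product+Link";
        // prints http://www1.bloomingdales.com/shop/product/bcbgmaxazria-dress-holly-blocked-sheath?ID=985138
        System.out.println(queryParam(redirect, "url"));
    }
}
```

    This needs Java 10 or later for `URLDecoder.decode(String, Charset)`. Whether other shorteners embed the target the same way is not something this thread confirms, so following the redirect chain remains the general approach.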

    【Comments】:

      【Solution 2】:

      Just use HttpURLConnection and, if the response code is in the 3xx range, check the Location header.

      Related questions: "What is the fastest way to resolve a shortened link to its target URL in Java?" and "How to get the complete URL address most efficiently?"
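      The HttpURLConnection approach described in this answer might look roughly like the following. This is a minimal sketch, not a definitive implementation: the class name `RedirectResolver`, the helper methods, and the `MAX_HOPS` limit of 10 are all assumptions made for illustration, and it assumes the server answers HEAD requests and sends a well-formed Location header.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class RedirectResolver {

    // Hypothetical cap on the number of hops, so a redirect loop cannot run forever.
    private static final int MAX_HOPS = 10;

    /** True for the 3xx status codes that point to another URL via the Location header. */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    /** Resolves a possibly relative Location value against the URL that returned it. */
    static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    /** Follows redirects one hop at a time and returns the last URL reached. */
    static String resolveFinalUrl(String startUrl) throws IOException {
        String current = startUrl;
        for (int hop = 0; hop < MAX_HOPS; hop++) {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(current).toURL().openConnection();
            conn.setInstanceFollowRedirects(false); // inspect each hop ourselves
            conn.setRequestMethod("HEAD");          // assumption: the server supports HEAD
            try {
                int status = conn.getResponseCode();
                String location = conn.getHeaderField("Location");
                if (!isRedirect(status) || location == null) {
                    return current; // not a redirect: this is the final URL
                }
                current = resolveLocation(current, location);
            } finally {
                conn.disconnect();
            }
        }
        return current; // gave up after MAX_HOPS redirects
    }

    public static void main(String[] args) throws IOException {
        System.out.println(resolveFinalUrl("http://wornon.tv/20602"));
    }
}
```

      Disabling automatic redirects matters here because HttpURLConnection does not follow redirects that switch between http and https on its own, so handling each hop manually covers that case too. Some servers reject HEAD; switching the request method to GET is the usual fallback.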

      【Comments】:
