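[Posted]: 2022-01-26 12:39:20
[Question]:
My university assignment is to fetch a web page from any web server by URL, using a TCP socket and an HTTP GET request.
I am not getting an HTTP/1.0 200 OK response from any server.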
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.net.InetAddress;
import java.net.Socket;
import java.net.URL;
import java.util.Scanner;

public class DCCN042 {
    public static void main(String[] args) {
        Scanner inpt = new Scanner(System.in);
        System.out.print("Enter URL: ");
        String url = inpt.next();
        TCPConnect(url);
    }

    public static void TCPConnect(String url) {
        try {
            String hostname = new URL(url).getHost();
            System.out.println("Loading contents of Server: " + hostname);
            InetAddress ia = InetAddress.getByName(hostname);
            String ip = ia.getHostAddress();
            System.out.println(ip + " is IP Adress for " + hostname);
            String path = new URL(url).getPath();
            System.out.println("Requested Path on the server: " + path);
            // Plain (unencrypted) TCP connection to port 80
            Socket socket = new Socket(ip, 80);
            // Create input and output streams to read from and write to the server
            PrintStream out = new PrintStream(socket.getOutputStream());
            BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
            // Follow the HTTP protocol: request line, header lines, then an empty line
            if (!hostname.equals(url)) {
                // Request Line
                out.println("GET " + path + " HTTP/1.1");
                out.println("Host: " + hostname);
                // Header Lines
                out.println("User-Agent: Java/13.0.2");
                out.println("Accept-Language: en-us");
                out.println("Accept: */*");
                out.println("Connection: keep-alive");
                out.println("Accept-Encoding: gzip, deflate, br");
                // Blank Line
                out.println();
            } else {
                // Request Line
                out.println("GET / HTTP/1.0");
                out.println("Host: " + hostname);
                // Header Lines
                out.println("User-Agent: Java/13.0.2");
                out.println("Accept-Language: en-us");
                out.println("Accept: */*");
                out.println("Connection: keep-alive");
                out.println("Accept-Encoding: gzip, deflate, br");
                // Blank Line
                out.println();
            }
            // Read data from the server until we finish reading the document
            String line = in.readLine();
            while (line != null) {
                System.out.println(line);
                line = in.readLine();
            }
            // Close our streams
            in.close();
            out.close();
            socket.close();
        } catch (Exception e) {
            System.out.println("Invalid URl");
            e.printStackTrace();
        }
    }
}
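I create a TCP socket to the web server using the IP address returned by InetAddress.getHostAddress() and port 80. I split the URL into its path and host name with getPath() and getHost(), and I use that same path and host name in the HTTP GET request.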
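The URL-splitting step described above can be sketched on its own, without opening a socket. This is only an illustration: the class name `RequestSketch` and its methods `portFor` and `buildRequest` are hypothetical, not part of the question's code. It assumes the port should come from the URL's scheme (80 for http, 443 for https) rather than being fixed at 80, that the query string belongs in the request target (`getFile()` instead of `getPath()`), and that an empty path should be sent as `/`.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class RequestSketch {
    // Hypothetical helper: wrap the checked exception so callers can pass plain strings.
    private static URL parse(String url) {
        try {
            return new URL(url);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + url, e);
        }
    }

    // Use the explicit port if the URL has one, otherwise the default port of its scheme.
    public static int portFor(String url) {
        URL u = parse(url);
        return u.getPort() != -1 ? u.getPort() : u.getDefaultPort();
    }

    // Build a minimal GET request; HTTP header lines end with CRLF.
    public static String buildRequest(String url) {
        URL u = parse(url);
        // getFile() is the path plus any query string; an empty target must be sent as "/".
        String target = u.getFile().isEmpty() ? "/" : u.getFile();
        return "GET " + target + " HTTP/1.1\r\n"
             + "Host: " + u.getHost() + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    public static void main(String[] args) {
        String url = "https://stackoverflow.com/questions/33015868/java-simple-http-get-request-using-tcp-sockets";
        System.out.println("Port to connect to: " + portFor(url));
        System.out.print(buildRequest(url));
    }
}
```

For an https URL this reports port 443, which is a hint that a plain `Socket` on port 80 is not talking to the endpoint the URL names.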
Response from the server:
Enter URL: https://stackoverflow.com/questions/33015868/java-simple-http-get-request-using-tcp-sockets
Loading contents of Server: stackoverflow.com
151.101.65.69 is IP Adress for stackoverflow.com
Requested Path on the server: /questions/33015868/java-simple-http-get-request-using-tcp-sockets
HTTP/1.1 301 Moved Permanently
cache-control: no-cache, no-store, must-revalidate
location: https://stackoverflow.com/questions/33015868/java-simple-http-get-request-using-tcp-sockets
x-request-guid: 5f2af765-40c2-49ca-b9a1-daa321373682
feature-policy: microphone 'none'; speaker 'none'
content-security-policy: upgrade-insecure-requests; frame-ancestors 'self' https://stackexchange.com
Accept-Ranges: bytes
Transfer-Encoding: chunked
Date: Mon, 27 Dec 2021 15:00:17 GMT
Via: 1.1 varnish
Connection: keep-alive
X-Served-By: cache-qpg1263-QPG
X-Cache: MISS
X-Cache-Hits: 0
X-Timer: S1640617217.166650,VS0,VE338
Vary: Fastly-SSL
X-DNS-Prefetch-Control: off
Set-Cookie: prov=149aa0ef-a3a6-8001-17c1-128d6d4b7273; domain=.stackoverflow.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/; HttpOnly
0
What I need is the HTML source of this web page, together with an HTTP/1.0 200 OK response.
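The response shown above is a 301 whose location header points at the same address over https, so the server is asking the client to repeat the request over TLS rather than returning the page. A minimal sketch of detecting that from the lines already being read follows. The class name `RedirectSketch` and its methods are hypothetical, and the set of status codes treated as redirects is an assumption made for illustration.

```java
import java.util.List;

public class RedirectSketch {
    // Extract the numeric code from a status line such as "HTTP/1.1 301 Moved Permanently".
    public static int statusCode(String statusLine) {
        String[] parts = statusLine.split(" ", 3);
        return Integer.parseInt(parts[1]);
    }

    // Look up a header by name; header names are case-insensitive. Returns null if absent.
    public static String header(List<String> headerLines, String name) {
        for (String line : headerLines) {
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase(name)) {
                return line.substring(colon + 1).trim();
            }
        }
        return null;
    }

    // Assumption: these are the codes this sketch treats as "follow the Location header".
    public static boolean isRedirect(int code) {
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    public static void main(String[] args) {
        // Two of the header lines from the response in the question.
        List<String> headers = List.of(
            "cache-control: no-cache, no-store, must-revalidate",
            "location: https://stackoverflow.com/questions/33015868/java-simple-http-get-request-using-tcp-sockets");
        int code = statusCode("HTTP/1.1 301 Moved Permanently");
        if (isRedirect(code)) {
            System.out.println("Redirected to: " + header(headers, "Location"));
        }
    }
}
```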
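[Comments]:
- Also, HTTPS does not communicate over plain sockets. So you should either use SSLSocket for HTTPS or find a site that does not use HTTPS.
- @geobreze, I was not using an SSL socket while requesting an "https" URL. Thank you, it worked.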
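The SSLSocket suggestion from the comments could look roughly like the sketch below. It is not the asker's final code: the class name `HttpsGetSketch` and its methods are hypothetical. It assumes port 443, sends an HTTP/1.0 request with `Connection: close` so that reading until end of stream terminates, and omits `Accept-Encoding` so the body is not compressed. A server may still answer with an `HTTP/1.1` status line or a non-200 status.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.net.ssl.SSLParameters;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class HttpsGetSketch {
    // Build the request text; header lines end with CRLF and a blank line ends the request.
    public static String request(String host, String path) {
        return "GET " + (path.isEmpty() ? "/" : path) + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    // Open a TLS connection to host:443, send the request, and print the raw response.
    public static void fetch(String host, String path) throws IOException {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        // Connect by host name (not by IP string) so the certificate can be checked against it.
        try (SSLSocket socket = (SSLSocket) factory.createSocket(host, 443)) {
            // Ask the TLS layer to verify that the certificate matches the host name.
            SSLParameters params = socket.getSSLParameters();
            params.setEndpointIdentificationAlgorithm("HTTPS");
            socket.setSSLParameters(params);

            OutputStream out = socket.getOutputStream();
            out.write(request(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Assumption: the response text is decoded as UTF-8.
            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        fetch("stackoverflow.com",
              "/questions/33015868/java-simple-http-get-request-using-tcp-sockets");
    }
}
```

Everything above the socket creation stays the same as with a plain `Socket`; only the transport changes, which is why the original request headers were not the cause of the 301.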
Tags: java html http sockets tcpclient