【Posted】: 2020-11-28 18:00:12
【Problem description】:
I have the following setup:
- Windows host
- Linux guest with Docker (in VirtualBox)
I have installed HDFS in Docker (Ubuntu, VirtualBox), using the bde2020 Hadoop images from Docker Hub. This is my docker-compose file:
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop3.2.1-java8
    container_name: namenode
    restart: always
    ports:
      - 9870:9870
      - 9000:9000
    volumes:
      - hadoop_namenode:/hadoop/dfs/name
    environment:
      - CLUSTER_NAME=test
    env_file:
      - ./hadoop.env
    networks:
      control_net:
        ipv4_address: 10.0.1.20
  datanode:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode
    restart: always
    ports:
      - 9864:9864
    volumes:
      - hadoop_datanode:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
    env_file:
      - ./hadoop.env
    networks:
      control_net:
        ipv4_address: 10.0.1.21
  resourcemanager:
    image: bde2020/hadoop-resourcemanager:2.0.0-hadoop3.2.1-java8
    container_name: resourcemanager
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode:9864"
    env_file:
      - ./hadoop.env
    networks:
      control_net:
        ipv4_address: 10.0.1.22
  nodemanager1:
    image: bde2020/hadoop-nodemanager:2.0.0-hadoop3.2.1-java8
    container_name: nodemanager
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode:9864 resourcemanager:8088"
    env_file:
      - ./hadoop.env
    networks:
      control_net:
        ipv4_address: 10.0.1.23
  historyserver:
    image: bde2020/hadoop-historyserver:2.0.0-hadoop3.2.1-java8
    container_name: historyserver
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode:9864 resourcemanager:8088"
    volumes:
      - hadoop_historyserver:/hadoop/yarn/timeline
    env_file:
      - ./hadoop.env
    networks:
      control_net:
        ipv4_address: 10.0.1.24
volumes:
  hadoop_namenode:
  hadoop_datanode:
  hadoop_historyserver:
networks:
  processing_net:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 10.0.0.0/24
          gateway: 10.0.0.1
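A quick sanity check on the addressing in this compose file: the services are given static addresses in 10.0.1.x on a network called control_net, while the only subnet shown is 10.0.0.0/24 on processing_net. The definition of control_net does not appear in the excerpt, so the following is only a sketch of how one could test whether a static address fits a given subnet, using Python's standard ipaddress module. The helper name `outside_subnet` is hypothetical.

```python
import ipaddress

# Static addresses copied from the compose file above (service name -> IP).
STATIC_IPS = {
    "namenode": "10.0.1.20",
    "datanode": "10.0.1.21",
    "resourcemanager": "10.0.1.22",
    "nodemanager1": "10.0.1.23",
    "historyserver": "10.0.1.24",
}


def outside_subnet(static_ips, subnet_cidr):
    """Return the service names whose static IP is not inside subnet_cidr."""
    subnet = ipaddress.ip_network(subnet_cidr)
    return [
        name
        for name, ip in static_ips.items()
        if ipaddress.ip_address(ip) not in subnet
    ]


# 10.0.0.0/24 is the only subnet that appears in the excerpt.
print(outside_subnet(STATIC_IPS, "10.0.0.0/24"))
```

If the real control_net is declared elsewhere with a subnet that covers 10.0.1.x, the same call with that CIDR would return an empty list.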
My hdfs-site.xml is:
<configuration>
<property><name>dfs.namenode.datanode.registration.ip-hostname-check</name><value>false</value></property>
<property><name>dfs.webhdfs.enabled</name><value>true</value></property>
<property><name>dfs.permissions.enabled</name><value>false</value></property>
<property><name>dfs.namenode.name.dir</name><value>file:///hadoop/dfs/name</value></property>
<property><name>dfs.namenode.rpc-bind-host</name><value>0.0.0.0</value></property>
<property><name>dfs.namenode.servicerpc-bind-host</name><value>0.0.0.0</value></property>
<property><name>dfs.namenode.http-bind-host</name><value>0.0.0.0</value></property>
<property><name>dfs.namenode.https-bind-host</name><value>0.0.0.0</value></property>
<property><name>dfs.client.use.datanode.hostname</name><value>true</value></property>
<property><name>dfs.datanode.use.datanode.hostname</name><value>true</value></property>
</configuration>
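To double-check which hostname-related settings a given hdfs-site.xml actually contains, the file can be read programmatically. A minimal sketch using the standard library's xml.etree.ElementTree; the function name `load_hadoop_conf` is hypothetical, and the embedded XML is a shortened copy of the file above rather than the full configuration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the hdfs-site.xml above, kept inline so the sketch is self-contained.
HDFS_SITE = """<configuration>
<property><name>dfs.webhdfs.enabled</name><value>true</value></property>
<property><name>dfs.client.use.datanode.hostname</name><value>true</value></property>
<property><name>dfs.datanode.use.datanode.hostname</name><value>true</value></property>
</configuration>"""


def load_hadoop_conf(xml_text):
    """Parse Hadoop-style <property><name/><value/></property> XML into a dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


conf = load_hadoop_conf(HDFS_SITE)
for key in ("dfs.client.use.datanode.hostname", "dfs.datanode.use.datanode.hostname"):
    print(key, "=", conf.get(key))
```

To inspect a real file instead of the inline string, the text could be read from disk first and passed to the same function.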
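If I open the web UI address in a browser on Linux (inside VirtualBox), I can access the Hadoop web UI.
From a browser on Windows (the host system, outside VirtualBox),
http://192.168.56.1:9870 works as well (I mapped this IP so that I can connect from outside VirtualBox).
The problem appears when I browse the files in the web UI and try to download one. The browser then reports that it cannot connect to the server dcfb0bf3b42c, and the address bar shows a line like this: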
http://dcfb0bf3b42c:9864/webhdfs/v1/tmp/datalakes/myJsonTest1/part-00000-0009b521-b474-49e7-be20-40f5e8b3a7b4-c000.json?op=OPEN&namenoderpcaddress=namenode:9000&offset=0
If I replace the "dcfb0bf3b42c" part with an IP address, 10.0.1.21 (from Linux) or 192.168.56.1 (from Windows), it works and the file downloads.
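Breaking the failing URL into its parts makes it easier to see which component is the problem. The sketch below uses only urllib.parse from the standard library and the URL quoted above. The host part is a 12-character hex string, which is what Docker assigns as a container's default hostname (the short container ID), and the port matches the datanode's published port 9864.

```python
from urllib.parse import parse_qs, urlsplit

# The URL shown in the browser's address bar, copied from the question.
url = (
    "http://dcfb0bf3b42c:9864/webhdfs/v1/tmp/datalakes/myJsonTest1/"
    "part-00000-0009b521-b474-49e7-be20-40f5e8b3a7b4-c000.json"
    "?op=OPEN&namenoderpcaddress=namenode:9000&offset=0"
)

parts = urlsplit(url)
query = parse_qs(parts.query)

print("host:", parts.hostname)  # the name the browser has to resolve
print("port:", parts.port)      # datanode HTTP port
print("operation:", query["op"][0])
print("namenode RPC address:", query["namenoderpcaddress"][0])
```

Only the host component needs to change for the download to work; the port, path and query string stay the same.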
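I need to automate this so that the IP does not have to be typed in by hand every time, because I need a program (Power BI) to access the HDFS data, and when it tries to read the data it fails because of the problem described above.
I am new to Hadoop. Can I solve this by editing a configuration file?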
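For a client whose code you control, the manual host swap can be automated. WebHDFS answers an op=OPEN request on the namenode with a redirect whose Location header points at a datanode, so a client can stop at that redirect, replace the unreachable hostname, and then fetch the file. The following is a minimal sketch under stated assumptions, not a fix for Power BI itself: `HOST_MAP`, `NoRedirect`, `map_host` and `read_hdfs_file` are hypothetical names, and the mapping uses the container ID and host IP from the question.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping: hostname advertised by the datanode -> address the
# client can reach. Keys must be lowercase because urlsplit lowercases hostnames.
HOST_MAP = {"dcfb0bf3b42c": "192.168.56.1"}


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so Location can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def map_host(url, host_map):
    """Replace the hostname of url if it is listed in host_map; keep the port."""
    parts = urlsplit(url)
    target = host_map.get(parts.hostname, parts.hostname)
    netloc = target if parts.port is None else f"{target}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


def read_hdfs_file(namenode_url, hdfs_path, host_map=HOST_MAP):
    """Read a file through WebHDFS, rewriting the datanode host in the redirect."""
    opener = urllib.request.build_opener(NoRedirect)
    first_url = f"{namenode_url}/webhdfs/v1{hdfs_path}?op=OPEN"
    try:
        with opener.open(first_url) as resp:
            return resp.read()  # no redirect was issued
    except urllib.error.HTTPError as err:
        if err.code not in (301, 302, 303, 307):
            raise
        location = err.headers["Location"]
    # Second request goes to the datanode, with the hostname made reachable.
    with urllib.request.urlopen(map_host(location, host_map)) as resp:
        return resp.read()
```

From the Windows host, a call would look like `read_hdfs_file("http://192.168.56.1:9870", "/tmp/some/file.json")`, where the path is a placeholder. For tools that cannot be modified, the equivalent idea is to make the advertised datanode hostname resolvable on the client machine.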
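【Discussion】:
- Could you try adding the hostname option to each container/service you use, e.g. hostname: historyserver?
- @smart_coder With those changes the file still cannot be downloaded, and the address bar shows: datanode:9864/webhdfs/v1/tmp/datalakes/myJsonTest1/…? (...) I also tried hostname: 'hereContainerIP', but it still fails, and the address bar shows: 0.0.0.10:9864/webhdfs/v1/tmp/datalakes/myJsonTest1/part-(...) (I don't know where that 0.0.0.10 comes from, since my container IPs are 10.0.1.20, 10.0.1.21, etc.)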
Tags: docker hadoop hdfs webhdfs datanode