Using the HDFS API in Hadoop 3.1

1. Development environment setup

1.1 Import the plugin into Eclipse

1.2 Configure the Hadoop reference path

Window ---> Preferences --->

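1.3 Create the Hadoop location

Use the same values as those shown for the active node on the web page.

By default it is an empty folder.

Test the location's connection by creating a directory named haha.

1.4 Test the interaction between Eclipse and HDFS

The result is as expected.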

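1.5 Create a library

1.6 Import the Hadoop 3.1 jar files into the library

1.7 Create a new Java project

Click Finish.

1.8 Add the jar files to the project

To import the library into the project, right-click the project and choose Build Path ---> Add Library, select User Library, and then pick the library created earlier.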

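1.9 Add the JUnit unit-testing library

1.10 Add the Hadoop configuration to the project

Copy the configuration files into the ha directory under conf.

Right-click the ha directory and mark it as a source root, so that the configuration files end up on the classpath.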
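
The ha directory has to be a source root because new Configuration(true) loads its default resources, core-site.xml among them, from the classpath. One way to confirm that the files are visible is to ask the class loader directly. The following is a minimal JDK-only sketch; the class name ClasspathCheck is hypothetical and not part of Hadoop, and the two file names are an assumption about which configuration files were copied.

```java
import java.net.URL;

// Hypothetical helper (not part of Hadoop): checks whether a resource
// such as core-site.xml can be found on the current classpath.
final class ClasspathCheck {

    // Returns the URL of the resource, or null when it is not on the classpath.
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    // True when the named resource is visible to the class loader.
    static boolean isOnClasspath(String resourceName) {
        return locate(resourceName) != null;
    }

    public static void main(String[] args) {
        // Assumed file names; adjust to the files actually placed in conf/ha.
        for (String name : new String[] {"core-site.xml", "hdfs-site.xml"}) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url == null ? "NOT FOUND" : url));
        }
    }
}
```

If a file is reported as NOT FOUND, the client would most likely fall back to default settings rather than connect to the cluster configured in section 1.3.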

2. Test code

TestHDFS demo:

2.1 Test creating a directory

package com.hadoop.hdfs;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class TestHDFS {
      Configuration conf;
      FileSystem fs;
      @Before
      public void conn() throws IOException{
            // Read the XML files on the classpath to build the configuration object
            conf = new Configuration(true);
            // Client object fs
            fs=FileSystem.get(conf);
      }
      @After
      public void close() throws IOException{
            fs.close();
      }
      
      // Create a directory (an existing one is deleted first)
      @Test
      public void mkdir() throws IOException{
            Path ifile=new Path("/ooxx");
            if(fs.exists(ifile)){
                  fs.delete(ifile,true);
            }
            fs.mkdirs(ifile);
      }
}

2.2 Run the unit test

Click the elephant icon in the upper-right corner and refresh the HDFS directory.

2.3 Test uploading a file

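      // Additional imports needed at the top of TestHDFS:
      //   java.io.BufferedInputStream, java.io.File, java.io.FileInputStream,
      //   java.io.InputStream, org.apache.hadoop.fs.FSDataOutputStream,
      //   org.apache.hadoop.io.IOUtils
      @Test
      public void upload() throws IOException{
            // Output stream: the target file in HDFS
            Path f= new Path("/ooxx/hello.txt");
            FSDataOutputStream output = fs.create(f);
            
            // Input stream: the local source, which must be a regular file
            InputStream input=new BufferedInputStream(new  FileInputStream(new File("c:\\nginx")));
            
            // Copy the bytes; the final true closes both streams when done
            IOUtils.copyBytes(input, output, conf,true);
            
      }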
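
To make the last argument of IOUtils.copyBytes more concrete: as far as the Hadoop API documentation describes it, the call copies the input to the output through a byte buffer and, when the flag is true, closes both streams at the end. The sketch below imitates that behaviour with plain JDK streams. CopySketch, its explicit bufferSize parameter and its return value are illustrative assumptions, not Hadoop's actual implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical stand-in that mimics a buffered stream copy with optional close.
final class CopySketch {

    // Copies everything from in to out through a fixed-size buffer and returns
    // the number of bytes copied. When close is true, both streams are closed
    // afterwards, even if the copy fails part-way.
    static long copyBytes(InputStream in, OutputStream out, int bufferSize, boolean close)
            throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
            out.flush();
        } finally {
            if (close) {
                try {
                    in.close();
                } finally {
                    out.close();
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("hello hdfs".getBytes("UTF-8"));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copyBytes(in, out, 4096, true);
        System.out.println(copied + " bytes copied: " + out.toString("UTF-8"));
    }
}
```

Because the streams are closed by the copy itself, the upload test above does not need its own close calls.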

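2.4 Create a directory and a file on the server with the hdfs client

hdfs dfs -mkdir -p /user/root
[root@node01 ~]# ls
anaconda-ks.cfg  hosts.list  test.txt  tsar-master  tsar.zip
Upload the local test.txt file to HDFS with a block size of 1 MiB (1048576 bytes). With no destination given, the file lands in the user's home directory, /user/root:
hdfs dfs -D dfs.blocksize=1048576 -put test.txt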
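
The dfs.blocksize value determines how a file is cut into blocks: a file of a given length occupies ceil(length / blocksize) blocks, and only the last block may be shorter than the block size. A small JDK-only sketch of that arithmetic follows. The class BlockMath and the larger sample sizes are made up for illustration; the 14-byte case corresponds to the test.txt block length reported in section 2.5.

```java
// Hypothetical helper illustrating how a file length maps onto fixed-size blocks.
final class BlockMath {

    // Number of blocks needed for a file of fileLen bytes; an empty file has none.
    static long blockCount(long fileLen, long blockSize) {
        if (fileLen < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLen must be >= 0 and blockSize > 0");
        }
        return (fileLen + blockSize - 1) / blockSize;
    }

    // Length in bytes of the block with the given zero-based index.
    static long blockLength(long fileLen, long blockSize, long index) {
        long count = blockCount(fileLen, blockSize);
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("no block with index " + index);
        }
        long start = index * blockSize;
        return Math.min(blockSize, fileLen - start);
    }

    public static void main(String[] args) {
        long blockSize = 1048576L; // same value as -D dfs.blocksize=1048576
        for (long fileLen : new long[] {14L, 1048576L, 2621440L}) {
            long count = blockCount(fileLen, blockSize);
            System.out.println(fileLen + " bytes -> " + count + " block(s), last block "
                    + blockLength(fileLen, blockSize, count - 1) + " bytes");
        }
    }
}
```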

2.5 Code for getting file block information

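      // Get the file's block information
      // Additional imports needed: org.apache.hadoop.fs.BlockLocation,
      //   org.apache.hadoop.fs.FileStatus, org.apache.hadoop.fs.FSDataInputStream
      @Test
      public void blk() throws IOException{
            Path i=new Path("/user/root/test.txt");
            FileStatus ifile=fs.getFileStatus(i);
            // Ask for the locations of all blocks covering bytes 0..length of the file
            BlockLocation[] fileBlockLocations =  fs.getFileBlockLocations(ifile, 0, ifile.getLen());
            for(BlockLocation b:fileBlockLocations){
                  System.out.println(b);
            }
            // Each read() returns the next byte of the file
            FSDataInputStream in=fs.open(i);
            System.out.println((char)in.read());
            System.out.println((char)in.read());
            System.out.println((char)in.read());
            System.out.println((char)in.read());
            System.out.println((char)in.read());
            System.out.println((char)in.read());
            in.close();
      }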

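Each call to read() returns the next byte, so the file is consumed sequentially, somewhat like readline, even though its data may be spread across different nodes.

In the printed block information, 0 is the offset, 14 is the size of the block, and node03 is the node where the block is located.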
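
The offset and length reported for each block are what make it possible to work out which node holds a given byte of the file. A rough JDK-only sketch of that lookup is shown below. The Block class, the three-block layout and the host names other than node03 are hypothetical and only mimic the offset, length and hosts fields that BlockLocation exposes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical lookup: which block (and therefore which host) holds a byte position.
final class BlockLookup {

    // Stand-in for the offset, length and hosts of one block.
    static final class Block {
        final long offset;
        final long length;
        final String[] hosts;

        Block(long offset, long length, String... hosts) {
            this.offset = offset;
            this.length = length;
            this.hosts = hosts.clone();
        }

        // True when position falls inside [offset, offset + length).
        boolean contains(long position) {
            return position >= offset && position < offset + length;
        }
    }

    // Returns the block holding the byte at position, or null if it lies outside the file.
    static Block find(List<Block> blocks, long position) {
        for (Block b : blocks) {
            if (b.contains(position)) {
                return b;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up layout of a 2.5 MiB file stored with 1 MiB blocks.
        List<Block> layout = Arrays.asList(
                new Block(0L, 1048576L, "node03"),
                new Block(1048576L, 1048576L, "node02"),
                new Block(2097152L, 524288L, "node04"));
        long position = 1500000L;
        Block b = find(layout, position);
        System.out.println("byte " + position + " is in the block at offset "
                + b.offset + " on " + Arrays.toString(b.hosts));
    }
}
```

With the real API, the same information would come from the getOffset(), getLength() and getHosts() methods of each BlockLocation returned by getFileBlockLocations.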