Operating HDFS with Java

2021/11/19 22:40:14

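This article shows how to operate HDFS from Java: adding the Hadoop dependencies, connecting to the cluster, and then creating and deleting directories, listing file status, and reading and writing files.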

1. In IDEA, create a new hadoop module inside the previously created bigdata project and add the required Hadoop dependencies

<dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.7.6</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.7.6</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.7.6</version>
        </dependency>


        <!-- https://mvnrepository.com/artifact/junit/junit -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.3</version>
        </dependency>
    </dependencies>

2. Connect to Hadoop and operate on HDFS

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

import java.net.URI;

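/*
 * Connect to Hadoop from Java and operate on HDFS
 */
public class HadoopAPI {
    public static void main(String[] args) throws Exception{
        // Automatically loads the Hadoop configuration files found on the classpath
        Configuration conf = new Configuration();
        // Set the replication factor for files written by this client
        conf.set("dfs.replication","1");
        // Address of the NameNode
        URI uri = new URI("hdfs://master:9000");
        // Connect to the file system; the returned object acts as the client
        FileSystem fileSystem = FileSystem.get(uri, conf);
        // Release the connection when finished
        fileSystem.close();
    }
}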
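
The address passed to `FileSystem.get` is an ordinary URI whose scheme selects the file system implementation and whose host and port identify the NameNode. As a minimal sketch using only `java.net.URI`, the class below splits such an address into its parts. The class name `HdfsUriParts` and the method `describe` are hypothetical helpers for illustration, and `master:9000` is simply the address used in this article; your cluster's host name and port may differ.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: shows which parts of the address FileSystem.get receives.
class HdfsUriParts {

    // Returns "scheme | host | port" for the given file system address.
    static String describe(String address) throws URISyntaxException {
        URI uri = new URI(address);
        return uri.getScheme() + " | " + uri.getHost() + " | " + uri.getPort();
    }

    public static void main(String[] args) throws Exception {
        // The address used in this article; replace it with your own NameNode address.
        System.out.println(describe("hdfs://master:9000"));
    }
}
```

If the host part does not resolve from the client machine (for example, `master` is missing from the local hosts file), the connection is likely to fail, so checking the parsed host first can save some debugging time.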

 

 

 

 

  

 

 

3. Use JUnit test methods for easier testing

  a. Create a directory (the mkdir command)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.Before;
import org.junit.Test;

import java.net.URI;

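/*
 * Connect to Hadoop from Java and operate on HDFS
 */
public class HadoopAPI {
    FileSystem fs;

    // Runs before each test method and initializes the connection
    @Before
    public void init() throws Exception{
        // Automatically loads the Hadoop configuration files found on the classpath
        Configuration conf = new Configuration();
        // Set the replication factor for files written by this client
        conf.set("dfs.replication","1");
        // Address of the NameNode
        URI uri = new URI("hdfs://master:9000");
        // Connect to the file system; the returned object acts as the client
        fs = FileSystem.get(uri, conf);
    }

    @Test
    public void mkdir() throws Exception{
        // Creates /javamake, including any missing parent directories
        fs.mkdirs(new Path("/javamake"));
    }
}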

 

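   b. Delete a directory recursively

    @Test
    public void delete() throws Exception{
        // With recursive=false, deleting a non-empty directory throws an IOException
//        fs.delete(new Path("/javamake"),false);
        // With recursive=true, the directory is removed together with its contents
        fs.delete(new Path("/javamake"),true);
    }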

 

 

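   c. listStatus returns the status of every entry under a directory; getFileStatus returns the status of a single path

    // Requires: import org.apache.hadoop.fs.FileStatus;
    @Test
    public void listStatus() throws Exception{
        FileStatus[] fls = fs.listStatus(new Path("/"));
        for (FileStatus fl : fls) {
            System.out.println(fl.getLen());       // length in bytes
            System.out.println(fl.getBlockSize()); // block size in bytes
            System.out.println(fl.getGroup());     // owning group
        }
    }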
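
The two numbers printed by `getLen()` and `getBlockSize()` can be combined to estimate how many blocks a file occupies: the length divided by the block size, rounded up. The sketch below shows that arithmetic in plain Java. `BlockMath` and `blockCount` are hypothetical names, not part of the Hadoop API, and the 128 MB block size is the Hadoop 2.x default for `dfs.blocksize`, assumed here because the article does not say what the cluster uses. The guard for non-positive values is there because directories typically report a length of 0.

```java
// Hypothetical helper: estimates the number of HDFS blocks from FileStatus values.
class BlockMath {

    // Number of blocks needed to hold len bytes, rounding up.
    // Returns 0 for empty files and for entries without a block size (e.g. directories).
    static long blockCount(long len, long blockSize) {
        if (len <= 0 || blockSize <= 0) {
            return 0;
        }
        return (len + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        // Assumed block size: 128 MB, the Hadoop 2.x default for dfs.blocksize.
        long blockSize = 128L * 1024 * 1024;
        // A 300 MB file needs two full blocks plus one partial block.
        System.out.println(blockCount(300L * 1024 * 1024, blockSize));
    }
}
```

Inside the loop above, the call would look like `blockCount(fl.getLen(), fl.getBlockSize())`.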

 

 

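   d. Read a file from HDFS

    // Requires imports: org.apache.hadoop.fs.FSDataInputStream,
    // java.io.BufferedReader, java.io.InputStreamReader
    @Test
    public void load() throws Exception{
        // Open the file to read on HDFS
        FSDataInputStream in = fs.open(new Path("/data/student/students.txt"));
        // Wrap the stream in a buffered character reader and read line by line
        BufferedReader br = new BufferedReader(new InputStreamReader(in));
        String line;
        while ((line=br.readLine())!=null){
            System.out.println(line);
        }

        br.close();
        in.close();
    }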
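
One detail worth noting about the read test: `InputStreamReader` without a charset argument decodes with the client's default encoding, so a UTF-8 file containing Chinese text may come out garbled on a machine whose default is something else. The sketch below passes the charset explicitly and uses try-with-resources so the stream is closed even if reading fails. It assumes the file is UTF-8 encoded, and it uses a `ByteArrayInputStream` as a stand-in for the stream returned by `fs.open`, which is also an `InputStream`. `Utf8LineReader` and `readLines` are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads all lines from a stream, decoding as UTF-8.
class Utf8LineReader {

    // Reads every line from the stream and closes it when done.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        // Closing the reader also closes the wrapped stream.
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for fs.open(...): an in-memory stream with two sample lines.
        byte[] sample = "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8);
        for (String line : readLines(new ByteArrayInputStream(sample))) {
            System.out.println(line);
        }
    }
}
```

In the test method, the same helper could be called as `readLines(fs.open(new Path("/data/student/students.txt")))`.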

 

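   e. Write a file

    // Requires imports: org.apache.hadoop.fs.FSDataOutputStream,
    // java.io.BufferedWriter, java.io.OutputStreamWriter
    @Test
    public void create() throws Exception{
        // Create /text.txt on HDFS; an existing file is overwritten by default
        FSDataOutputStream fo = fs.create(new Path("/text.txt"));
        // Wrap the stream in a buffered character writer
        BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(fo));
        bw.write("你好");
        bw.newLine();
        bw.write("大数据");
        bw.newLine();
        // Closing the writer flushes the buffered text to HDFS
        bw.close();
        fo.close();
    }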
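
The write test has the same encoding caveat as reading: `OutputStreamWriter` without a charset encodes with the client's default, and `newLine()` writes the client's line separator, so the bytes stored on HDFS can differ between a Windows and a Linux client. As a sketch under those assumptions, the helper below fixes the encoding to UTF-8 and writes `\n` explicitly. A `ByteArrayOutputStream` stands in for the stream returned by `fs.create`, which is also an `OutputStream`. `Utf8LineWriter` and `writeLines` are hypothetical names.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: writes lines to a stream as UTF-8 with "\n" line endings.
class Utf8LineWriter {

    // Writes each line followed by "\n", then flushes and closes the stream.
    static void writeLines(OutputStream out, String... lines) throws IOException {
        // Closing the writer flushes it and closes the wrapped stream.
        try (BufferedWriter bw = new BufferedWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            for (String line : lines) {
                bw.write(line);
                bw.write('\n');
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for fs.create(...): an in-memory sink.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeLines(sink, "hello", "big data");
        System.out.println(sink.size() + " bytes written");
    }
}
```

In the test method, the call would look like `writeLines(fs.create(new Path("/text.txt")), "你好", "大数据")`.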

 

 

 


