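Hadoop is written in Java, so every HDFS operation can be invoked through the Java API. The most commonly used class is FileSystem, which is also what the hadoop fs command is built on.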
I. Reading file contents
1. Reading an HDFS file with java.net.URL
import java.io.InputStream;
import java.net.URL;

import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class HdfsUrlReader {

    static {
        // Teach java.net.URL to recognize the hdfs:// scheme.
        // The factory can be set only once per JVM.
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }

    public static void main(String[] args) throws Exception {
        InputStream in = null;
        try {
            // Open the HDFS file as a stream through java.net.URL
            in = new URL("hdfs://192.168.2.50:8020/user/hadoop/outp89/part-r-00000").openStream();
            // Copy to stdout with a 4096-byte buffer; false = leave closing to the finally block
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}
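The static block matters because java.net.URL only understands the schemes it has a handler for. The following is a minimal sketch of that mechanism using only the JDK, so it runs without a Hadoop cluster. The class name SchemeHandlerDemo, the scheme "demo", and the handler that echoes the URL path are all hypothetical and stand in for what FsUrlStreamHandlerFactory does for hdfs://; this is not Hadoop's implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

public class SchemeHandlerDemo {

    // Hypothetical scheme name, used only for this illustration.
    static final String SCHEME = "demo";

    // Handler whose "connection" returns a string built from the URL path.
    static class DemoHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) {
            return new URLConnection(u) {
                @Override
                public void connect() {
                    // Nothing to connect to in this sketch.
                }

                @Override
                public InputStream getInputStream() {
                    String body = "content of " + getURL().getPath();
                    return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    private static boolean installed = false;

    // URL.setURLStreamHandlerFactory may be called at most once per JVM,
    // so guard the call. Returning null lets the JDK handle other schemes.
    static synchronized void install() {
        if (!installed) {
            URL.setURLStreamHandlerFactory(
                    protocol -> SCHEME.equals(protocol) ? new DemoHandler() : null);
            installed = true;
        }
    }

    // Reads the whole stream behind the given URL as a UTF-8 string.
    static String read(String spec) {
        install();
        try (InputStream in = new URL(spec).openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(read("demo://host/user/hadoop/sample.txt"));
    }
}
```

Because the factory can be installed only once per JVM, the URL approach does not work if another library in the same process has already set its own factory. In that situation the FileSystem class is the usual alternative.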
2. Writing a SequenceFile
SequenceFile is a binary file format supported by the Hadoop API. It serializes <Key,Value> pairs directly into the file.
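To make the idea of "serializing key/value pairs into a binary file" concrete, here is a minimal JDK-only sketch that writes int keys and string values back to back and reads them again. It is not the SequenceFile on-disk format, which additionally carries a header, sync markers, and optional compression; the class name KeyValueRecordSketch and the sample values are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class KeyValueRecordSketch {

    // Serializes each (key, value) pair as a 4-byte int followed by a
    // length-prefixed string (DataOutputStream.writeUTF).
    static byte[] write(int[] keys, String[] values) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (int i = 0; i < keys.length; i++) {
                out.writeInt(keys[i]);
                out.writeUTF(values[i]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the pairs back in order and renders each as "key<TAB>value".
    static List<String> read(byte[] data) {
        List<String> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                int key = in.readInt();
                String value = in.readUTF();
                records.add(key + "\t" + value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: descending keys, values cycling through a small array.
        String[] text = {"alpha", "beta", "gamma"};
        int[] keys = new int[6];
        String[] values = new String[6];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = keys.length - i;
            values[i] = text[i % text.length];
        }
        for (String record : read(write(keys, values))) {
            System.out.println(record);
        }
    }
}
```

In Hadoop the same roles are played by Writable types such as IntWritable and Text, which know how to write themselves to and read themselves from a binary stream.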
package hadooptest2;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileWriter {

    private static final String[] text = {
        "不忘初心",
        "砥砺前行",
        "只是测试",
    };

    public static void main(String[] args) throws Exception {
        String uri = "hdfs://192.168.2.50:8020/user/hadoop/testseq";
        Configuration conf = new Configuration();
        SequenceFile.Writer writer = null;
        try {
            FileSystem fs = FileSystem.get(URI.create(uri), conf);
            Path path = new Path(uri);
            // IntWritable is Hadoop's Writable wrapper for int
            IntWritable key = new IntWritable();
            Text value = new Text();
            // The writer must be told the key and value types.
            // (Newer Hadoop releases deprecate this overload in favor of
            // the Writer.Option-based createWriter.)
            writer = SequenceFile.createWriter(fs, conf, path, key.getClass(), value.getClass());
            for (int i = 0; i < 100; i++) {
                // In this demo the keys run from 100 down to 1
                key.set(100 - i);
                // The values cycle through text[] by i modulo text.length
                value.set(text[i % text.length]);
                writer.append(key, value);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            IOUtils.closeStream(writer);
        }
    }
}
View the file that was written:
[root@TJ1-000 ~]# hdfs dfs -text /user/hadoop/testseq
17/03/28 09:31:05 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
17/03/28 09:31:05 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
100 不忘初心
99 砥砺前行
98 只是测试
97 不忘初心
96 砥砺前行
95 只是测试
94 不忘初心
...
4 不忘初心
3 砥砺前行
2 只是测试
1 不忘初心