Custom UDTF function (one row in, many rows out)
2022/8/4 23:24:40
This article walks through a custom Hive UDTF (user-defined table-generating function), which takes one input row and emits several output rows.
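Requirements
Split a delimited string such as zs_ls_ww on a given separator and return each piece as its own row.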
Java implementation
package udtf;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

import java.util.ArrayList;
import java.util.List;

// UDTF that splits a string on a separator and emits one row per piece.
public class MyExplode extends GenericUDTF {

    @Override
    public StructObjectInspector initialize(StructObjectInspector argOIs) throws UDFArgumentException {
        // Expect exactly two arguments: the string and the separator.
        if (argOIs.getAllStructFieldRefs().size() != 2) {
            throw new UDFArgumentException("myexplode(str, separator) takes exactly two arguments");
        }

        // The output has a single column named "user".
        List<String> columnNames = new ArrayList<String>();
        columnNames.add("user");

        // That column is a Java String.
        List<ObjectInspector> objectInspectors = new ArrayList<ObjectInspector>();
        objectInspectors.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);

        return ObjectInspectorFactory.getStandardStructObjectInspector(columnNames, objectInspectors);
    }

    @Override
    public void process(Object[] args) throws HiveException {
        // Skip NULL inputs instead of failing with a NullPointerException.
        if (args[0] == null || args[1] == null) {
            return;
        }
        String str = args[0].toString();
        String separator = args[1].toString();

        // Note: String.split treats the separator as a regular expression.
        for (String s : str.split(separator)) {
            // Each forwarded list is one output row with one column.
            List<String> row = new ArrayList<String>();
            row.add(s);
            forward(row);
        }
    }

    @Override
    public void close() throws HiveException {
        // Nothing to clean up.
    }
}
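The splitting step inside process can be tried out without a Hive installation. The sketch below is only an illustration, not part of the UDTF: the class name SplitSketch and the helper splitRows are hypothetical, and unlike the UDTF above it wraps the separator in Pattern.quote so that it is matched literally. With a plain String.split, a separator such as "." is read as a regular expression and "a.b.c" would produce no pieces at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical, Hive-free sketch of the per-row splitting logic.
class SplitSketch {

    // Returns one list element per row the UDTF would forward.
    // Assumption: the separator is meant literally, so it is quoted.
    static List<String> splitRows(String input, String separator) {
        List<String> rows = new ArrayList<>();
        if (input == null || separator == null) {
            return rows; // NULL input produces no rows
        }
        for (String part : input.split(Pattern.quote(separator))) {
            rows.add(part);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(splitRows("zs_ls_ww", "_")); // prints [zs, ls, ww]
        System.out.println(splitRows("a.b.c", "."));    // prints [a, b, c]
    }
}
```

Whether to quote the separator is a design choice: quoting is safer for plain delimiters, while leaving it unquoted, as the UDTF does, lets callers pass a regular expression on purpose.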
Hive shell
hive (default)> create temporary function myexplode as "udtf.MyExplode" using jar "hdfs://node1:9000/hive_function-1.0-SNAPSHOT.jar";
Added [/tmp/10de4466-6601-49b1-b749-8b5c8c2809b2_resources/hive_function-1.0-SNAPSHOT.jar] to class path
Added resources: [hdfs://node1:9000/hive_function-1.0-SNAPSHOT.jar]
OK
Time taken: 5.442 seconds
hive (default)> create table a(name string);
OK
Time taken: 1.046 seconds
hive (default)> insert into table a values("zs_ls_ww"),("ww_ml_wb");
hive (default)> select myexplode(name, "_") from a;
OK
user
zs
ls
ww
ww
ml
wb
Time taken: 1.138 seconds, Fetched: 6 row(s)