从本质理解Oracle数据泵(expdp/impdp)原理
2022/2/21 19:30:41
本文主要是介绍从本质理解Oracle数据泵(expdp/impdp)原理,对大家解决编程问题具有一定的参考价值,需要的程序猿们随着小编来一起学习吧!
从本质理解Oracle数据泵(expdp/impdp)原理
范伟林4天前 561一、前言
相信数据泵我们会经常用到,但是没有理解其原理和普通exp/imp的区别,往往遇到一些错误就很难进行排查;
其实数据泵的本质就是后台存储过程作业,由dw进程进行工作,此进程可以并发多个运行,相关作业状态可以由dba_datapump_jobs视图查看。在终端关闭或者ctrl c啥的依然不会影响作业的运行,若要进行作业管理需要attach;
利用数据泵可以做到逻辑的备份,数据迁移,传输表空间等,DBA比较常用的工具;
下面我们做个简单案例,环境为12c版本。
二、数据导出(expdp)
-
数据的导出为数据库的存储过程作业,所以需要用到目录对象,定义导出dump文件、日志文件等的路径信息;
-
导出任务发起,作业会数据库自动创建,作业会自动创建master表记录作业信息,作业状态也会记录在dba_datapump_jobs视图
-
数据导出源端为数据库,目标端为dump文件,也就是灵活将数据库的相关对象写入到dump物理文件,理解链接关系,跟目标库是无关系的
-
当然需要用到exp_full_database角色权限,目录对象的执行读写权限
1、查看涉及的表空间信息
表:
set line 300 pages 100 col tablespace_name for a40 col owner for a30 select distinct tablespace_name,owner from dba_tables where owner in (select username from dba_users where account_status='OPEN' and username not in ('SYS','SYSTEM','DBSNMP')) order by 2;
索引:
set line 300 pages 100 col tablespace_name for a40 col owner for a30 select distinct tablespace_name,owner from dba_indexes where owner in (select username from dba_users where account_status='OPEN' and username not in ('SYS','SYSTEM','DBSNMP')) order by 2;
Note:规划需要导出的schema,找出其数据涉及的表空间,方便后续的数据导入规划
2、查看数据库逻辑大小
select sum(bytes)/1024/1024/1024 from dba_segments;
Note:也可以事先查出比较大的表,例如千万级别以上的,然后在导的过程中看进度也心里有数,也可以单独拎出来导
3、创建目录对象
准备目录 mkdir /backup/dumpfile chown oracle.oinstall /backup/dumpfile 创建 create directory mydir as '/backup/dumpfile';
Note:这里默认是用sys进行创建,普通用户需要进行授权
4、导出数据
创建参数文件
vi /home/oracle/wl_full.par USERID=system/******* DIRECTORY=dumpdir DUMPFILE=wl20220216_%U.dmp LOGFILE=wl20220216_exp.log JOB_NAME=wl_export_full LOGTIME=all SCHEMA=wl EXCLUDE=statistics COMPRESSION=all PARALLEL=8 CLUSTER=no
Note:也可以直接写导出参数,看个人习惯;参数比较多,可以得知数据泵很灵活功能强大,可以-help查看所有参数,这里列举几个参数;
- USERID 为用户凭证,普通用户操作需要授权
- DIRECTORY 为目录对象
- DUMPFILE 为导出文件名称
- LOGFILE 为导出日志名称
- JOB_NAME 为作业名称,自定义作业名称方便管理
- LOGTIME all为记录每个环节的时间
- SCHEMA 为模式名称,通俗讲也就是导出哪个业务用户的数据
- EXCLUDE 为排除不导出的内容,这里列举为统计信息,当然可以是其他的对象信息
- COMPRESSION 为导出数据的压缩级别,all为数据和元数据全部压缩,最高压缩类型,当然可以结合压缩算法COMPRESSION_ALGORITHM达到更高级别的压缩
- PARALLEL 为并行度,也就是定义dw进程的个数
- CLUSTER 为RAC特有参数,定义作业是否在每个节点运行
执行任务
expdp PARFILE=/home/oracle/wl_full.par
5、查看作业日志也状态
观察导出日志
也就是LOGFILE参数定义的日志文件
tail -1000f wl20220216_exp.log
Export: Release 12.2.0.1.0 - Production on Thu Oct 15 11:54:07 2020 Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved. ;;; Connected to: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production 15-OCT-20 11:54:19.635: ;;; ************************************************************************** 15-OCT-20 11:54:19.640: ;;; Parfile values: 15-OCT-20 11:54:19.645: ;;; parfile: parallel=8 15-OCT-20 11:54:19.650: ;;; parfile: cluster=N 15-OCT-20 11:54:19.655: ;;; parfile: schemas=wl 15-OCT-20 11:54:19.661: ;;; parfile: logtime=ALL 15-OCT-20 11:54:19.666: ;;; parfile: logfile=wl20220216_exp.log 15-OCT-20 11:54:19.671: ;;; parfile: dumpfile=wl20220216_%U.dmp 15-OCT-20 11:54:19.754: ;;; parfile: directory=mydir 15-OCT-20 11:54:19.760: ;;; ************************************************************************** 15-OCT-20 11:54:20.427: FLASHBACK automatically enabled to preserve database integrity. 15-OCT-20 11:54:21.601: Starting "SYS"."SYS_EXPORT_SCHEMA_01": "/******** AS SYSDBA" parfile=/home/oracle/wl_full.par 15-OCT-20 11:54:24.230: Processing object type SCHEMA_EXPORT/SYSTEM_GRANT 15-OCT-20 11:54:24.273: Processing object type SCHEMA_EXPORT/ROLE_GRANT 15-OCT-20 11:54:24.274: Processing object type SCHEMA_EXPORT/DEFAULT_ROLE 15-OCT-20 11:54:24.275: Processing object type SCHEMA_EXPORT/USER 15-OCT-20 11:54:25.968: Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA 15-OCT-20 11:54:26.146: Processing object type SCHEMA_EXPORT/DB_LINK 15-OCT-20 11:54:26.773: Processing object type SCHEMA_EXPORT/SYNONYM/SYNONYM 15-OCT-20 11:54:26.900: Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/FUNCTIONAL_INDEX/INDEX_STATISTICS 15-OCT-20 11:54:26.904: Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_BODY 15-OCT-20 11:54:27.075: Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS 15-OCT-20 11:54:27.222: Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS 15-OCT-20 11:54:27.621: Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE 15-OCT-20 11:54:28.311: Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA 15-OCT-20 11:54:29.010: Processing object type SCHEMA_EXPORT/TABLE/PROCACT_INSTANCE 15-OCT-20 11:54:29.999: Processing object type SCHEMA_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT 15-OCT-20 11:54:31.714: Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_SPEC 15-OCT-20 11:54:31.830: Processing object type SCHEMA_EXPORT/FUNCTION/FUNCTION 15-OCT-20 11:54:31.969: Processing object type SCHEMA_EXPORT/PROCEDURE/PROCEDURE 15-OCT-20 11:54:32.145: Processing object type SCHEMA_EXPORT/PACKAGE/COMPILE_PACKAGE/PACKAGE_SPEC/ALTER_PACKAGE_SPEC 15-OCT-20 11:54:32.157: Processing object type SCHEMA_EXPORT/FUNCTION/ALTER_FUNCTION 15-OCT-20 11:54:32.163: Processing object type SCHEMA_EXPORT/PROCEDURE/ALTER_PROCEDURE 15-OCT-20 11:54:32.582: Processing object type SCHEMA_EXPORT/TABLE/COMMENT 15-OCT-20 11:54:32.630: Processing object type SCHEMA_EXPORT/TABLE/TABLE 15-OCT-20 11:54:32.972: Processing object type SCHEMA_EXPORT/TABLE/INDEX/FUNCTIONAL_INDEX/INDEX 15-OCT-20 11:54:34.086: Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT 15-OCT-20 11:54:34.612: Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT 15-OCT-20 11:54:34.678: Processing object type SCHEMA_EXPORT/VIEW/VIEW 15-OCT-20 11:54:34.783: Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX 15-OCT-20 11:54:47.347: Processing object type SCHEMA_EXPORT/STATISTICS/MARKER 15-OCT-20 11:58:02.392: . . exported "WL"."T_TEST_RECORD" 18.99 GB 66499480 rows 15-OCT-20 11:59:22.653: . . exported "WL"."T_TEST" 30.47 GB 70834724 rows
观察作业状态
真正管理作业需要attach作业进去操作
查看作业
set line 300 pages 100 col owner_name for a20 col job_name for a30 col state for a20 select owner_name,job_name,state from dba_datapump_jobs;
OWNER_NAME JOB_NAME STATE -------------------- ------------------------------ -------------------- SYSTEM SJJ_EXPORT_FULL EXECUTING
登录作业
expdp system attach=sjj_export_full
查看作业状态
status
Export> status Job: SJJ_EXPORT_FULL Operation: EXPORT Mode: FULL State: STOP PENDING Bytes Processed: 32,384,054,664 Percent Done: 99 Current Parallelism: 8 Job Error Count: 0 Job heartbeat: 6 Dump File: /backup/dumpfile/full20220216_%u.dmp Dump File: /backup/dumpfile/full20220216_01.dmp bytes written: 15,032,143,872 Dump File: /backup/dumpfile/full20220216_02.dmp bytes written: 3,542,888,448 Dump File: /backup/dumpfile/full20220216_03.dmp bytes written: 3,009,998,848 Dump File: /backup/dumpfile/full20220216_04.dmp bytes written: 2,373,156,864 Dump File: /backup/dumpfile/full20220216_05.dmp bytes written: 3,188,301,824 Dump File: /backup/dumpfile/full20220216_06.dmp bytes written: 948,051,968 Dump File: /backup/dumpfile/full20220216_07.dmp bytes written: 37,437,628,416 Dump File: /backup/dumpfile/full20220216_08.dmp bytes written: 2,978,820,096 Worker 1 Status: Instance ID: 1 Instance name: wldb1 Host name: wldb1 Object start time: Wednesday, 16 February, 2022 20:35:08 Object status at: Wednesday, 16 February, 2022 22:03:31 Process Name: DW00 State: WORK WAITING Worker 2 Status: Instance ID: 1 Instance name: wldb1 Host name: wldb1 Access method: direct_path Object start time: Wednesday, 16 February, 2022 20:33:35 Object status at: Wednesday, 16 February, 2022 20:40:42 Process Name: DW01 State: WORK WAITING Worker 3 Status: Instance ID: 1 Instance name: wldb1 Host name: wldb1 Access method: direct_path Object start time: Wednesday, 16 February, 2022 20:33:35 Object status at: Wednesday, 16 February, 2022 21:16:26 Process Name: DW02 State: WORK WAITING Worker 4 Status: Instance ID: 1 Instance name: wldb1 Host name: wldb1 Access method: direct_path Object start time: Wednesday, 16 February, 2022 20:33:38 Object status at: Wednesday, 16 February, 2022 20:40:37 Process Name: DW03 State: WORK WAITING Worker 5 Status: Instance ID: 1 Instance name: wldb1 Host name: wldb1 Object start time: Wednesday, 16 February, 2022 20:36:11 Object status at: Wednesday, 16 February, 2022 20:38:13 Process Name: DW04 State: EXECUTING Object Schema: WL Object Name: T_TEST Object Type: DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA Completed Objects: 4 Total Objects: 939 Completed Rows: 430,816 Worker Parallelism: 1 Worker 6 Status: Instance ID: 1 Instance name: wldb1 Host name: wldb1 Access method: external_table Object start time: Wednesday, 16 February, 2022 20:33:35 Object status at: Wednesday, 16 February, 2022 20:41:10 Process Name: DW05 State: WORK WAITING Worker 7 Status: Instance ID: 1 Instance name: wldb1 Host name: wldb1 Access method: direct_path Object start time: Wednesday, 16 February, 2022 20:33:35 Object status at: Wednesday, 16 February, 2022 20:40:31 Process Name: DW06 State: WORK WAITING Worker 8 Status: Instance ID: 1 Instance name: wldb1 Host name: wldb1 Access method: direct_path Object start time: Wednesday, 16 February, 2022 20:34:00 Object status at: Wednesday, 16 February, 2022 20:40:31 Process Name: DW07 State: WORK WAITING Export>
Note:若导出日志长时间没反应,可以时不时查看作业status,观察相关对象是否在变化,判断作业是否正常。
其他操作:
STOP_JOB,停止作业,可以继续启动,dba_datapump_jobs信息依然存在
START_JOB,继续启动停止的作业
KILL_JOB,强制终止作业,dba_datapump_jobs信息会清除
三、数据导入
导入跟导出原理差不多
-
数据的导入也是数据库的存储过程作业,所以需要用到目录对象,定义dump文件、日志文件等的路径信息;
-
导入任务发起,作业会数据库自动创建,作业会自动创建master表记录作业信息,作业状态也会记录在dba_datapump_jobs视图
-
数据导入源端为dump文件,目标端为数据库,也就是灵活将dump文件的相关对象写入到目标数据库,理解链接关系,跟源库是无关系的
-
当然需要用到imp_full_database角色权限,目录对象的执行读写权限
1、创建用户和表空间
create tablespace TEST datafile size 31G autoextend on; create user TEST identified by "TEST" default tablespace TEST quota unlimited on TEST;
Note:关键是定义好表空间,用户可以不用创建
若不同表空间需要利用REMAP_TABLESPACE重新映射表空间
若需要导入不同的用户,可以利用REMAP_SCHEMA重新映射用户
当然表结构和数据也可以重新映射
2、创建目录对象
准备目录 mkdir /backup/dumpfile chown oracle.oinstall /backup/dumpfile 创建 create directory mydir as '/backup/dumpfile';
3、导入数据
创建参数文件
vi /home/oracle/imp_full.par USERID=system/******* DIRECTORY=mydir DUMPFILE=wl20220216_%U.dmp LOGFILE=wl20220216_imp.log JOB_NAME=wl_import_full LOGTIME=ALL CLUSTER=NO PARALLEL=8 TRANSFORM=DISABLE_ARCHIVE_LOGGING:Y REMAP_TABLESPACE=users:CISMT_DATA TABLE_EXISTS_ACTION=REPLACE
Note:也可以直接写导出参数,看个人习惯;参数比较多,可以得知数据泵很灵活功能强大,可以-help查看所有参数,这里列举几个参数;
- USERID 为导出用户凭证
- DIRECTORY 为目录对象
- DUMPFILE 为导出文件名称
- LOGFILE 为导入日志名称
- JOB_NAME 为作业名称,自定义作业名称方便管理
- LOGTIME all为记录每个环节的时间
- SCHEMA 为模式名称,通俗讲也就是导出哪个业务用户的数据
- EXCLUDE 为排除不导出的内容,这里列举为统计信息,当然可以是其他的对象信息
- COMPRESSION 为导出数据的压缩级别,all为数据和元数据全部压缩
- PARALLEL 为并行度,也就是定义dw进程的个数,要跟dmp文件数对应上
- CLUSTER 为RAC特有参数,定义作业是否在每个节点运行
- TRANSFORM为转换参数,DISABLE_ARCHIVE_LOGGING:Y也就是不写归档,12c新特性
- REMAP_TABLESPACE为重新映射表空间,源:目标
- TABLE_EXISTS_ACTION为当表存着如何操作,REPLACE为替换,重新创建表
执行任务
expdp PARFILE=/home/oracle/imp_full.par
4、查看作业日志也状态
观察导出日志
也就是LOGFILE参数定义的日志文件
tail -1000f wl20220216_imp.log
;;; Import: Release 12.2.0.1.0 - Production on Thu Oct 15 14:59:51 2020 Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved. ;;; Connected to: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production 15-OCT-20 14:59:59.462: ;;; ************************************************************************** 15-OCT-20 14:59:59.467: ;;; Parfile values: 15-OCT-20 14:59:59.472: ;;; parfile: table_exists_action=REPLACE 15-OCT-20 14:59:59.477: ;;; parfile: remap_tablespace=SAPME:test 15-OCT-20 14:59:59.488: ;;; parfile: transform=DISABLE_ARCHIVE_LOGGING:Y 15-OCT-20 14:59:59.493: ;;; parfile: parallel=8 15-OCT-20 14:59:59.498: ;;; parfile: cluster=N 15-OCT-20 14:59:59.503: ;;; parfile: logtime=ALL 15-OCT-20 14:59:59.508: ;;; parfile: logfile=wl20220216_imp.log 15-OCT-20 14:59:59.513: ;;; parfile: dumpfile=wl20220216_%U.dmp 15-OCT-20 14:59:59.518: ;;; parfile: directory=mydir 15-OCT-20 14:59:59.523: ;;; ************************************************************************** 15-OCT-20 15:00:01.940: Master table "SYS"."SYS_IMPORT_FULL_03" successfully loaded/unloaded 15-OCT-20 15:00:03.878: Starting "SYS"."SYS_IMPORT_FULL_03": "/******** AS SYSDBA" parfile=/home/oracle/imp_full.par 15-OCT-20 15:00:03.970: Processing object type SCHEMA_EXPORT/USER 15-OCT-20 15:00:05.109: ORA-31684: Object type USER:"WIPTEST" already exists 15-OCT-20 15:00:05.286: Processing object type SCHEMA_EXPORT/SYSTEM_GRANT 15-OCT-20 15:00:06.522: Processing object type SCHEMA_EXPORT/ROLE_GRANT 15-OCT-20 15:00:07.518: Processing object type SCHEMA_EXPORT/DEFAULT_ROLE 15-OCT-20 15:00:08.364: Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA 15-OCT-20 15:00:08.710: Processing object type SCHEMA_EXPORT/DB_LINK 15-OCT-20 15:00:09.081: Processing object type SCHEMA_EXPORT/SYNONYM/SYNONYM 15-OCT-20 15:00:11.453: Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE 15-OCT-20 15:00:12.108: Processing object type SCHEMA_EXPORT/TABLE/PROCACT_INSTANCE 15-OCT-20 15:00:14.614: Processing object type SCHEMA_EXPORT/TABLE/TABLE 15-OCT-20 15:00:25.856: Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA 15-OCT-20 15:21:09.917: . . imported "WL"."T_TEST" 18.99 GB 66499480 rows 15-OCT-20 15:26:00.295: . . imported "WL"."TEST01" 30.47 GB 70834724 rows
观察作业状态
真正管理作业需要attach作业进去操作
查看作业
set line 300 pages 100 col owner_name for a20 col job_name for a30 col state for a20 select owner_name,job_name,state from dba_datapump_jobs;
OWNER_NAME JOB_NAME STATE -------------------- ------------------------------ -------------------- SYSTEM SJJ_IMPORT_FULL EXECUTING
登录作业
expdp system attach=sjj_import_full
查看作业状态
status
Note:状态跟导入的差不多;操作同理
晋升:当然再细排查作业进度,就需要分析数据库作业会话,会话的等待事件,会话执行SQL进度。
这篇关于从本质理解Oracle数据泵(expdp/impdp)原理的文章就介绍到这儿,希望我们推荐的文章对大家有所帮助,也希望大家多多支持为之网!
- 2024-11-23增量更新怎么做?-icode9专业技术文章分享
- 2024-11-23压缩包加密方案有哪些?-icode9专业技术文章分享
- 2024-11-23用shell怎么写一个开机时自动同步远程仓库的代码?-icode9专业技术文章分享
- 2024-11-23webman可以同步自己的仓库吗?-icode9专业技术文章分享
- 2024-11-23在 Webman 中怎么判断是否有某命令进程正在运行?-icode9专业技术文章分享
- 2024-11-23如何重置new Swiper?-icode9专业技术文章分享
- 2024-11-23oss直传有什么好处?-icode9专业技术文章分享
- 2024-11-23如何将oss直传封装成一个组件在其他页面调用时都可以使用?-icode9专业技术文章分享
- 2024-11-23怎么使用laravel 11在代码里获取路由列表?-icode9专业技术文章分享
- 2024-11-22怎么实现ansible playbook 备份代码中命名包含时间戳功能?-icode9专业技术文章分享