==========================================
Autonomous Health Framework (AHF) and tfactl
==========================================
Autonomous Health Framework (AHF) can be installed as the "root" user on the server,
which provides the most functionality and allows it to run proactively as a daemon.
In this example, installation is done as the root user.
AHF installs these utilities:
orachk -> /opt/oracle.ahf/orachk/orachk
oerr -> /opt/oracle.ahf/orachk/lib/oerr.sh
tfactl
tfactl can be run with various flags to gather information about the OS, Oracle installations, filesystems, etc.
Step 1 - Installation
Unzip the software and run the ahf_setup command.
Answer the questions when prompted.
The following must be run as root:
mkdir /opt/ahf_data
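The installer states that the AHF Data Directory needs at least 5GB of free space (10GB recommended). A minimal sketch of a pre-install check follows. The helper names free_kb_for and has_free_space are hypothetical and not part of AHF, and the check assumes a POSIX df.

```shell
# Sketch only: helper names are hypothetical, not part of AHF.

# Print the free space, in kilobytes, of the filesystem holding directory $1.
free_kb_for() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Return 0 when the filesystem holding $1 has at least $2 GB free.
has_free_space() {
    dir=$1
    min_gb=$2
    free_kb=$(free_kb_for "$dir") || return 1
    [ "$free_kb" -ge $(( min_gb * 1024 * 1024 )) ]
}

# Example: has_free_space /opt/ahf_data 5 || echo "less than 5GB free" >&2
```

Running the check before ./ahf_setup avoids discovering a full filesystem in the middle of the installer prompts.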
root@my_server:/software/oracle/oracle/scripts/AHF>% ./ahf_setup
AHF Installer for Platform Linux Architecture x86_64
AHF Installation Log : /tmp/ahf_install_211400_23652_2021_07_08-15_23_08.log
Starting Autonomous Health Framework (AHF) Installation
AHF Version: 21.1.4 Build Date: 202106281226
Default AHF Location : /opt/oracle.ahf
Do you want to install AHF at [/opt/oracle.ahf] ? [Y]|N : y
AHF Location : /opt/oracle.ahf
AHF Data Directory stores diagnostic collections and metadata.
AHF Data Directory requires at least 5GB (Recommended 10GB) of free space.
Please Enter AHF Data Directory : /opt/ahf_data
Do you want to add AHF Notification Email IDs ? [Y]|N : n
Extracting AHF to /opt/oracle.ahf
Configuring TFA Services
Discovering Nodes and Oracle Resources
Successfully generated certificates.
Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
.-----------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+----------------+---------------+------+-------+------------+----------------------+
| my_server | RUNNING | 6388 | 11599 | 21.1.4.0.0 | 21140020210628122659 |
'----------------+---------------+------+-------+------------+----------------------'
Running TFA Inventory...
Adding default users to TFA Access list...
.-------------------------------------------------------------------.
| Summary of AHF Configuration |
+---------------+---------------------------------------------------+
| Parameter | Value |
+-----------------+-------------------------------------------------+
| AHF Location | /opt/oracle.ahf |
| TFA Location | /opt/oracle.ahf/tfa |
| Orachk Location | /opt/oracle.ahf/orachk |
| Data Directory | /opt/ahf_data/oracle.ahf/data |
| Repository | /opt/ahf_data/oracle.ahf/data/repository |
| Diag Directory | /opt/ahf_data/oracle.ahf/data/qanfv-1-dbs-1b/diag |
'-----------------+-------------------------------------------------'
Starting orachk scheduler from AHF ...
AHF binaries are available in /opt/oracle.ahf/bin
AHF is successfully installed
Do you want AHF to store your My Oracle Support Credentials for Automatic Upload ? Y|[N] : N
Moving /tmp/ahf_install_211400_3967_2021_07_08-15_28_16.log to /opt/ahf_data/oracle.ahf/data/qanfv-1-dbs-1b/diag/ahf/
The installation creates /opt/oracle.ahf/, with the utilities under /opt/oracle.ahf/bin:
root@my_server:/software/oracle/oracle/scripts/oracle_support>% cd /opt/oracle.ahf/bin/
root@my_server:/opt/oracle.ahf/bin>% ls -ltr
total 4
lrwxrwxrwx 1 root root 29 Mar 17 23:19 orachk -> /opt/oracle.ahf/orachk/orachk
lrwxrwxrwx 1 root root 34 Mar 17 23:19 oerr -> /opt/oracle.ahf/orachk/lib/oerr.sh
-rwxr-xr-x 1 root root 3818 Mar 17 23:19 tfactl
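After installation it can be useful to confirm that TFA is still running. The sketch below parses a status table in the layout printed by the installer. It assumes that "tfactl print status" prints the same table layout. The function names tfa_status_rows and all_tfa_running are hypothetical.

```shell
# Sketch only: function names are hypothetical, not part of AHF.

# Read a TFA status table on stdin and print "host<TAB>status" per data row.
# Border lines contain no "|" and the header row is skipped by its "Host" cell.
tfa_status_rows() {
    awk -F'|' '
        NF >= 4 && $2 !~ /Host/ {
            host = $2; status = $3
            gsub(/^[ \t]+|[ \t]+$/, "", host)
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            if (host != "") print host "\t" status
        }'
}

# Return 0 only when at least one row is present and every row says RUNNING.
all_tfa_running() {
    tfa_status_rows | awk -F'\t' '
        { n++ }
        $2 != "RUNNING" { bad = 1 }
        END { exit (bad || n == 0) }'
}

# Example (assumed table layout):
#   /opt/oracle.ahf/bin/tfactl print status | all_tfa_running || echo "TFA not running" >&2
```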
Step 2 - Execution - a general run
cd /opt/oracle.ahf/bin/
./orachk
Collections and audit checks log file is
/opt/oracle.ahf/data/qanfv-1-dbs-1a/orachk/user_root/output/orachk_qanfv-1-dbs-1a_igt_031621_152257/log/orachk.log
============================================================
Node name - qanfv-1-dbs-1a
============================================================
. . . . . .
Collecting - Database Parameters for igt database
Collecting - Database Undocumented Parameters for igt database
Collecting - List of active logon and logoff triggers for igt database
Collecting - CPU Information
Collecting - Disk I/O Scheduler on Linux
Collecting - DiskMount Information
Collecting - Kernel parameters
Collecting - Maximum number of semaphore sets on system
Collecting - Maximum number of semaphores on system
Collecting - Maximum number of semaphores per semaphore set
Collecting - Memory Information
Collecting - OS Packages
Collecting - Operating system release information and kernel version
Collecting - Patches for RDBMS Home
Collecting - Table of file system defaults
Collecting - number of semaphore operations per semop system call
Collecting - Disk Information
Collecting - ORAchk Daemon/Scheduler configuration
Collecting - Root user limits
Collecting - Verify TCP Selective Acknowledgement is enabled
Collecting - Verify no database server kernel out of memory errors
Collecting - Verify the vm.min_free_kbytes configuration
Data collections completed. Checking best practices on qanfv-1-dbs-1a.
------------------------------------------------------------
WARNING => Linux swap configuration does not meet recommendation
WARNING => Non-AWR Space consumption is greater than or equal to 50% of total SYSAUX space. for igt
WARNING => There are some application objects with STALE statistics for igt
INFO => Most recent ADR incidents for /software/oracle/122
INFO => Oracle GoldenGate failure prevention best practices
CRITICAL => The vm.min_free_kbytes configuration is not set as recommended
INFO => Oracle GoldenGate Health-Checks and Diagnostics Reports for igt
INFO => user_dump_dest has trace files older than 30 days for igt
WARNING => ORA-00600 errors found in alert log for igt
INFO => Alert log file is too big and should be rolled over periodically for igt
INFO => At some times checkpoints are not being completed for igt
WARNING => One or more redo log groups are not multiplexed for igt
WARNING => Primary database is not protected with Data Guard (standby database) for real-time data protection and availability for igt
FAIL => numa_balancing kernel parameter is not configured to 0
INFO => Important Storage Minimum Requirements for Grid & Database Homes
WARNING => OSWatcher is not running as is recommended.
FAIL => Database parameter DB_LOST_WRITE_PROTECT is not set to recommended value on igt instance
WARNING => Database parameter DB_BLOCK_CHECKING on primary is not set to the recommended value. for igt
WARNING => Consider setting the value of the parameter _cursor_obsolete_threshold to 1024 for Non-Multitenant environment which is the appropriate recommended value for igt
INFO => Operational Best Practices
INFO => Database Consolidation Best Practices
INFO => Computer failure prevention best practices
INFO => Data corruption prevention best practices
INFO => Logical corruption prevention best practices
INFO => Database/Cluster/Site failure prevention best practices
INFO => Client failover operational best practices
WARNING => Oracle patch 30712670 is not applied on RDBMS_HOME /software/oracle/122
WARNING => Oracle patch 29867728 is not applied on RDBMS_HOME /software/oracle/122
WARNING => Oracle patch 31142749 is not applied on RDBMS_HOME /software/oracle/122
WARNING => Oracle patch 26749785 is not applied on RDBMS_HOME /software/oracle/122
WARNING => Oracle patch 29302565 is not applied on RDBMS_HOME /software/oracle/122
WARNING => Oracle patch 29259068 is not applied on RDBMS_HOME /software/oracle/122
FAIL => RECYCLEBIN on PRIMARY should be set to the recommended value on igt instance
WARNING => Oracle clusterware is not being used
WARNING => RAC Application Cluster is not being used for database high availability on igt instance
FAIL => Table AUD$[FGA_LOG$] should use Automatic Segment Space Management for igt
WARNING => Flashback on PRIMARY is not configured for igt
INFO => Database failure prevention best practices
WARNING => fast_start_mttr_target has NOT been changed from default on igt instance
FAIL => Active Data Guard is not configured for igt
INFO => Parallel Execution Health-Checks and Diagnostics Reports for igt
CRITICAL => The data files should be recoverable for igt
WARNING => The UTL_SPADV package should be installed in the database. for igt
WARNING => The Streams pool is not currently set or not sized appropriately for this database instance. for igt
INFO => Oracle recovery manager(rman) best practices
INFO => Database feature usage statistics for igt
WARNING => Consider investigating changes to the schema objects such as DDLs or new object creation for igt
WARNING => Consider adding more redo log groups or increase the size of redo logs for igt
Best Practice checking completed. Checking recommended patches on qanfv-1-dbs-1a
--------------------------------------------------------------------------------
Collecting patch inventory on ORACLE_HOME /software/oracle/122
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
1 Recommended RDBMS patches for 122010 from /software/oracle/122 on qanfv-1-dbs-1a
--------------------------------------------------------------------------------
Patch# RDBMS ASM type Patch-Description
--------------------------------------------------------------------------------
31741641 no merge Database Oct 2020 Release Update 12.2.0.1.201020
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
RDBMS homes patches summary report
--------------------------------------------------------------------------------
Total patches Applied on RDBMS Applied on ASM ORACLE_HOME
--------------------------------------------------------------------------------
1 1 0 /software/oracle/122
--------------------------------------------------------------------------------
------------------------------------------------------------
Detailed report (html) - /opt/oracle.ahf/data/qanfv-1-dbs-1a/orachk/user_root/output/orachk_qanfv-1-dbs-1a_igt_031621_152257/orachk_qanfv-1-dbs-1a_igt_031621_152257.html
UPLOAD [if required] - /opt/oracle.ahf/data/qanfv-1-dbs-1a/orachk/user_root/output/orachk_qanfv-1-dbs-1a_igt_031621_152257.zip
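The orachk console output above mixes INFO, WARNING, FAIL and CRITICAL findings. A small sketch for counting them by severity from saved console output follows. The function name orachk_severity_counts is hypothetical, and the sketch assumes each finding starts with the severity followed by " =>", as in the run above.

```shell
# Sketch only: the function name is hypothetical, not part of orachk.

# Read orachk console output on stdin and print one "SEVERITY count" line
# for each of CRITICAL, FAIL, WARNING and INFO, in that order.
orachk_severity_counts() {
    awk '
        /^[ \t]*(CRITICAL|FAIL|WARNING|INFO) =>/ { count[$1]++ }
        END {
            n = split("CRITICAL FAIL WARNING INFO", order, " ")
            for (i = 1; i <= n; i++)
                printf "%s %d\n", order[i], count[order[i]]
        }'
}

# Example, with a hypothetical file holding the saved console output:
#   orachk_severity_counts < /tmp/orachk_console.txt
```

The counts are only a quick triage aid. The detailed HTML report remains the place to read each finding.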
Example A - gather info about corruption on disk
This MUST be run as the oracle user.
First, get the database name:
SELECT NAME FROM V$DATABASE;
NAME
---------
IGT
cd /opt/oracle.ahf/bin/
./tfactl diagcollect -srdc dbcorrupt
oracle@my_server:~>% cd /opt/oracle.ahf/bin
oracle@my_server:/opt/oracle.ahf/bin>% ./tfactl diagcollect -srdc dbcorrupt
Enter the Database Name [Required for this SRDC] : IGT
Scripts to be run by this srdc: ipspack get_db_opatch_info srdc_corruption_1578_info.sql hcheck.sql hcheck_pdb.sql
Components included in this srdc: OS DATABASE ASM CHMOS
Collecting data for local node(s)
Collection Id : 20210316154324qanfv-1-dbs-1a
Detailed Logging at : /opt/oracle.ahf/data/repository/srdc_dbcorrupt_collection_Tue_Mar_16_15_43_25_GMT_2021_node_local/diagcollect_20210316154324_qanfv-1-dbs-1a.log
2021/03/16 15:43:30 GMT : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2021/03/16 15:43:30 GMT : Collection Name : tfa_srdc_dbcorrupt_Tue_Mar_16_15_43_25_GMT_2021.zip
2021/03/16 15:43:30 GMT : Collecting additional diagnostic information...
2021/03/16 15:43:30 GMT : Scanning of files for Collection in progress...
2021/03/16 15:43:40 GMT : Getting list of files satisfying time range [03/16/2021 14:43:30 GMT, 03/16/2021 15:43:40 GMT]
2021/03/16 15:43:54 GMT : Collecting ADR incident files...
2021/03/16 15:43:57 GMT : Completed collection of additional diagnostic information...
2021/03/16 15:44:00 GMT : Completed Local Collection
.------------------------------------------.
| Collection Summary |
+----------------+-----------+------+------+
| Host | Status | Size | Time |
+----------------+-----------+------+------+
| qanfv-1-dbs-1a | Completed | 8MB | 30s |
'----------------+-----------+------+------'
Logs are being collected to: /opt/oracle.ahf/data/repository/srdc_dbcorrupt_collection_Tue_Mar_16_15_43_25_GMT_2021_node_local
/opt/oracle.ahf/data/repository/srdc_dbcorrupt_collection_Tue_Mar_16_15_43_25_GMT_2021_node_local/qanfv-1-dbs-1a.tfa_srdc_dbcorrupt_Tue_Mar_16_15_43_25_GMT_2021.zip
The output files are under the collection directory shown above:
qanfv-1-dbs-1a.tfa_srdc_dbcorrupt_Tue_Mar_16_15_43_25_GMT_2021.zip.txt
qanfv-1-dbs-1a.tfa_srdc_dbcorrupt_Tue_Mar_16_15_43_25_GMT_2021.zip
diagcollect_console_20210316154324_qanfv-1-dbs-1a.log
diagcollect_20210316154324_qanfv-1-dbs-1a.log
qanfv-1-dbs-1a.tfa_srdc_dbcorrupt_Tue_Mar_16_15_43_25_GMT_2021.zip is the actual output file.
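Each collection lands in its own subdirectory of the repository, so locating the newest zip by hand gets tedious after a few runs. A minimal sketch follows, assuming the layout shown above (repository/collection_dir/file.zip). The function name latest_collection_zip is hypothetical.

```shell
# Sketch only: the function name is hypothetical, not part of tfactl.

# Print the most recently modified .zip found one level below directory $1.
# Returns non-zero when no zip file exists.
latest_collection_zip() {
    newest=
    for f in "$1"/*/*.zip; do
        [ -e "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Example: latest_collection_zip /opt/oracle.ahf/data/repository
```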
Example B - gather info about corrupt UNDO tablespace
As the oracle user, run:
oracle@my_server:/opt/oracle.ahf/bin>% ./tfactl diagcollect -srdc DBAUM
SRDC diagnostic collections must be run as an oracle privileged user - not root
root@my_server:/opt/oracle.ahf/bin>% su - oracle
Last login: Thu Jul 8 15:36:43 GMT 2021
oracle@my_server:~>% oraigt
oracle@my_server:~>% cd /opt/oracle.ahf/bin/
oracle@my_server:/opt/oracle.ahf/bin>% ./tfactl diagcollect -srdc DBAUM
Enter the Database Name [Required for this SRDC] : IGT
Selected ORACLE_HOME /software/oracle/122
Enter the time of the issue [YYYY-MM-DD HH24:MI:SS,<RETURN>=ALL] :
Is there any ORA error happened on undo?[Y|N] [Required for this SRDC]: Y
Please input the ORA error number[number only] [Required for this SRDC]: 30013
Can you reproduce the issue? [Required for this SRDC]: y
Enter the full path of the SQL file which would reproduce the issue now: [Required for this SRDC]: /software/oracle/oracle/scripts/AHF/for_oracle_support/drop_undotbs1.sql
Scripts to be run by this srdc: srdc_undo_recommendation.sql srdc_undo.sql srdc_get_errorstack_trace cp_reproduce_sql_file
Components included in this srdc: DATABASE NOCHMOS OS
EXIT; -- Exit from the custom script
Collecting data for local node(s).
Collection Id : 20210708154541qanfv-1-dbs-1b
Detailed Logging at : /opt/ahf_data/oracle.ahf/data/repository/srdc_dbaum_collection_Thu_Jul_08_15_45_42_GMT_2021_node_local/diagcollect_20210708154541_qanfv-1-dbs-1b.log
2021/07/08 15:45:47 GMT : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2021/07/08 15:45:47 GMT : Collection Name : tfa_srdc_dbaum_Thu_Jul_08_15_45_42_GMT_2021.zip
2021/07/08 15:45:47 GMT : Getting list of files satisfying time range [07/08/2021 08:45:47 GMT, 07/08/2021 15:45:47 GMT]
2021/07/08 15:45:47 GMT : Collecting additional diagnostic information...
2021/07/08 15:45:58 GMT : Collecting ADR incident files...
2021/07/08 15:46:04 GMT : Completed collection of additional diagnostic information...
2021/07/08 15:46:08 GMT : Completed Local Collection
.-------------------------------------------.
| Collection Summary |
+----------------+-----------+-------+------+
| Host | Status | Size | Time |
+----------------+-----------+-------+------+
| qanfv-1-dbs-1b | Completed | 413kB | 21s |
'----------------+-----------+-------+------'
Logs are being collected to: /opt/ahf_data/oracle.ahf/data/repository/srdc_dbaum_collection_Thu_Jul_08_15_45_42_GMT_2021_node_local
/opt/ahf_data/oracle.ahf/data/repository/srdc_dbaum_collection_Thu_Jul_08_15_45_42_GMT_2021_node_local/qanfv-1-dbs-1b.tfa_srdc_dbaum_Thu_Jul_08_15_45_42_GMT_2021.zip
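Example B shows tfactl rejecting an SRDC collection with "must be run as an oracle privileged user - not root". A wrapper can catch this before tfactl starts. The sketch below is one possible guard. The names srdc_uid_allowed and run_srdc are hypothetical, and the tfactl path is the one used in this installation.

```shell
# Sketch only: function names are hypothetical, not part of tfactl.

# Return 0 when numeric uid $1 may run an SRDC collection (anything but root).
srdc_uid_allowed() {
    [ "$1" -ne 0 ]
}

# Run "tfactl diagcollect -srdc $1", refusing to start when invoked as root.
run_srdc() {
    if ! srdc_uid_allowed "$(id -u)"; then
        echo "run SRDC collections as the oracle user, not root" >&2
        return 1
    fi
    /opt/oracle.ahf/bin/tfactl diagcollect -srdc "$1"
}

# Example: run_srdc DBAUM
```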
Example C - gather info about GoldenGate
As the oracle user, run:
cd /opt/oracle.ahf/bin
./tfactl diagcollect -srdc gg_abend
oracle@my_host:~>% cd /opt/oracle.ahf/bin
oracle@my_host:/opt/oracle.ahf/bin>% ./tfactl diagcollect -srdc gg_abend
Enter the Database Name [Required for this SRDC] : IGT
Use of uninitialized value in split at /opt/oracle.ahf/tfa/bin/common/dbutil.pm line 1140.
Database Name IGT was specificed however this database has a Database Unique Name of igt.
Database Unique Name igt set for IGT.
Enter the GoldenGate Home [Required for this SRDC]: /software/oracle/1910
Enter the failed GoldenGate component name [Required for this SRDC]: EXT_P_01
Is this a Microservices Install type? [Y|N] [Required for this SRDC]: N
Components included in this collection: OS DATABASE
Preparing to execute support diagnostic scripts.
Use of uninitialized value in split at /opt/oracle.ahf/tfa/bin/common/dbutil.pm line 1140.
Use of uninitialized value in split at /opt/oracle.ahf/tfa/bin/common/dbutil.pm line 1140.
Use of uninitialized value in split at /opt/oracle.ahf/tfa/bin/common/dbutil.pm line 1140.
Use of uninitialized value in split at /opt/oracle.ahf/tfa/bin/common/dbutil.pm line 1140.
Executing DB Script ogg_12102.sql on igt with timeout of 300 seconds...
Executing OS Script cp_files_classic with timeout of 120 seconds...
Script Execution Failed. Review the collection SRDC log for details...
Executing OS Script get_gghome_opatch_info with timeout of 500 seconds...
Collecting data for the last 1 hours for this component ...
Collecting data for local node(s).
TFA is using system timezone for collection, All times shown in GMT.
Collection Id : 20220425121900phliptst-1-aps01
Detailed Logging at : /opt/ahf_data/oracle.ahf/data/repository/srdc_gg_abend_collection_Mon_Apr_25_12_19_04_GMT_2022_node_local/diagcollect_20220425121900_phliptst-1-aps01.log
2022/04/25 12:19:09 GMT : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2022/04/25 12:19:09 GMT : Collection Name : tfa_srdc_gg_abend_Mon_Apr_25_12_19_02_GMT_2022.zip
2022/04/25 12:19:09 GMT : Getting list of files satisfying time range [04/25/2022 11:19:09 GMT, 04/25/2022 12:19:09 GMT]
2022/04/25 12:19:09 GMT : Collecting additional diagnostic information...
2022/04/25 12:19:21 GMT : Collecting ADR incident files...
2022/04/25 12:19:41 GMT : Completed collection of additional diagnostic information...
2022/04/25 12:19:41 GMT : Completed Local Collection
.--------------------------------------------.
| Collection Summary |
+------------------+-----------+------+------+
| Host | Status | Size | Time |
+------------------+-----------+------+------+
| my_server | Completed | 1MB | 32s |
'------------------+-----------+------+------'
Logs are being collected to: /opt/ahf_data/oracle.ahf/data/repository/srdc_gg_abend_collection_Mon_Apr_25_12_19_04_GMT_2022_node_local
/opt/ahf_data/oracle.ahf/data/repository/srdc_gg_abend_collection_Mon_Apr_25_12_19_04_GMT_2022_node_local/phliptst-1-aps01.tfa_srdc_gg_abend_Mon_Apr_25_12_19_02_GMT_2022.zip
oracle@my_server:/opt/oracle.ahf/bin>%
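In Example C the collection completed, but one script reported "Script Execution Failed" and pointed at the SRDC log. A short sketch for spotting that in saved console output and extracting the "Detailed Logging at" path follows. The function names are hypothetical, and the message texts are taken from the transcript above.

```shell
# Sketch only: function names are hypothetical, not part of tfactl.

# Print the path that follows "Detailed Logging at :" in tfactl output on stdin.
srdc_detail_log() {
    sed -n 's/^.*Detailed Logging at : *//p' | head -n 1
}

# Return 0 when the tfactl output on stdin reports a failed SRDC script.
srdc_script_failed() {
    grep -q 'Script Execution Failed'
}

# Example, with a hypothetical file holding the saved console output:
#   srdc_script_failed < /tmp/gg_abend_console.txt &&
#       less "$(srdc_detail_log < /tmp/gg_abend_console.txt)"
```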