Mandalika's scratchpad: [Oracle Database] Unreliable AWR reports on T5 & Redo logs on F40 PCIe Cards

(1) AWR report shows bogus wait events and times on SPARC T5 servers

Here is a sample from one of the Oracle 11g R2 databases running on a SPARC T5 server with Solaris 11.1 SRU 7.5

Top 5 Timed Foreground Events

Event	Waits	Time(s)	Avg wait (ms)	% DB time	Wait Class
latch: cache buffers chains	278,727	812,447,335	2914850	13307324.15	Concurrency
library cache: mutex X	212,595	449,966,330	2116542	7370136.56	Concurrency
buffer busy waits	219,844	349,975,251	1591925	5732352.01	Concurrency
latch: In memory undo latch	25,468	37,496,800	1472310	614171.59	Concurrency
latch free	2,602	24,998,583	9607449	409459.46	Other

Reason:
Unknown. There is a pending bug 17214885 - Implausible top foreground wait times reported in AWR report.

Tentative workaround:
Disable power management as shown below.

# poweradm set administrative-authority=none

# svcadm disable power
# svcadm enable power

Verify the setting by running poweradm list.

Also disable NUMA I/O object binding by setting the following parameter in /etc/system (requires a system reboot).

set numaio_bind_objects=0

Oracle Solaris 11 added support for NUMA I/O architecture. Here is a brief explanation of NUMA I/O from Solaris 11 : What's New web page.

Non-Uniform Memory Access (NUMA) I/O : Many modern systems are based on a NUMA architecture, where each CPU or set of CPUs is associated with its own physical memory and I/O devices. For best I/O performance, the processing associated with a device should be performed close to that device, and the memory used by that device for DMA (Direct Memory Access) and PIO (Programmed I/O) should be allocated close to that device as well. Oracle Solaris 11 adds support for this architecture by placing operating system resources (kernel threads, interrupts, and memory) on physical resources according to criteria such as the physical topology of the machine, specific high-level affinity requirements of I/O frameworks, actual load on the machine, and currently defined resource control and power management policies.

Do not forget to rollback these changes after applying the fix for the database bug 17214885, when available.

(2) Redo logs on F40 PCIe cards (non-volatile flash storage)

Per the F40 PCIe card user's guide, the Sun Flash Accelerator F40 PCIe Card is designed to provide best performance for data transfers that are multiples of 8k size, and using addresses that are 8k aligned. To achieve optimal performance, the size of the read/write data should be an integer multiple of this block size and the data transferred should be block aligned. I/O operations that are not block aligned and that do not use sizes that are a multiple of the block size may suffer performance degration, especially for write operations.

Oracle redo log files default to a block size that is equal to the physical sector size of the disk, typically 512 bytes. And most of the time, database writes to the redo log in a normal functioning environment. Oracle database supports a maximum block size of 4K for redo logs. Hence to achieve optimal performance for redo write operations on F40 PCIe cards, tune the environment as shown below.

Configure the following init parameters

_disk_sector_size_override=TRUE
_simulate_disk_sectorsize=4096

Create redo log files with 4K block size
eg.,

SQL> ALTER DATABASE ADD LOGFILE '/REDO/redo.dbf' size 20G blocksize 4096;

[Solaris only] Append the following line to /kernel/drv/sd.conf (requires a reboot)

sd-config-list="ATA     3E128-TS2-550B01","disksort:false, cache-nonvolatile:true, physical-block-size:4096";

[Solaris only][F20] To enable maximum throughput from the MPT driver, append the following line to /kernel/drv/mpt.conf and reboot the system.
```
mpt_doneq_thread_n_prop=8;
```

This tip is applicable to all kinds of flash storage that Oracle sells or sold including F20/F40 PCIe cards and F5100 storage array. sd-config-list in sd.conf may need some adjustment to reflect the correct vendor id and product id.

Mandalika's scratchpad

Pages

Saturday, August 31, 2013

[Oracle Database] Unreliable AWR reports on T5 & Redo logs on F40 PCIe Cards

No comments:

Post a Comment