Mandalika's scratchpad [ Work blog @Oracle | Stock Market Notes | My Music Compositions ]

Old Posts: 09.04  10.04  11.04  12.04  01.05  02.05  03.05  04.05  05.05  06.05  07.05  08.05  09.05  10.05  11.05  12.05  01.06  02.06  03.06  04.06  05.06  06.06  07.06  08.06  09.06  10.06  11.06  12.06  01.07  02.07  03.07  04.07  05.07  06.07  08.07  09.07  10.07  11.07  12.07  01.08  02.08  03.08  04.08  05.08  06.08  07.08  08.08  09.08  10.08  11.08  12.08  01.09  02.09  03.09  04.09  05.09  06.09  07.09  08.09  09.09  10.09  11.09  12.09  01.10  02.10  03.10  04.10  05.10  06.10  07.10  08.10  09.10  10.10  11.10  12.10  01.11  02.11  03.11  04.11  05.11  07.11  08.11  09.11  10.11  11.11  12.11  01.12  02.12  03.12  04.12  05.12  06.12  07.12  08.12  09.12  10.12  11.12  12.12  01.13  02.13  03.13  04.13  05.13  06.13  07.13  08.13  09.13  10.13  11.13  12.13  01.14  02.14  03.14  04.14  05.14  06.14  07.14  09.14  10.14  11.14  12.14  01.15  02.15  03.15  04.15  06.15  09.15  12.15  01.16  03.16  04.16  05.16  06.16  07.16  08.16  09.16  12.16  01.17  02.17  03.17  04.17  06.17  07.17  08.17  09.17  10.17  12.17  01.18  02.18  03.18  04.18  05.18  06.18 


Saturday, June 30, 2018
 
Python: Exclusive File Locking on Solaris

Solaris doesn't lock open files automatically (and not just Solaris - most *nix operating systems behave the same way).

In general, when a process is about to update a file, the process is responsible for checking existing locks on the target file, acquiring a lock and releasing it after updating the file. However, not all processes cooperate and adhere to this mechanism (advisory locking), and such non-conforming behavior may lead to problems such as inconsistent or invalid data, mainly triggered by race conditions. Serialization is one possible way to prevent this, where only one process is allowed to update the target file at any time. It can be achieved with the help of the file locking mechanism available on Solaris and most other operating systems.

On Solaris, a file can be locked for exclusive access by any process with the help of the fcntl() system call. fcntl() provides control over open files, and it allows finer-grained control over locking -- for instance, we can specify whether or not the call should block while requesting an exclusive or shared lock.

The following rudimentary Python code demonstrates how to acquire an exclusive lock on a file that makes all other processes wait to get access to the file in focus.

eg.,

% cat -n xflock.py
     1  #!/bin/python
     2  import fcntl, time
     3  f = open('somefile', 'a')
     4  print 'waiting for exclusive lock'
     5  fcntl.flock(f, fcntl.LOCK_EX)
     6  print 'acquired lock at %s' % time.strftime('%Y-%m-%d %H:%M:%S')
     7  time.sleep(10)
     8  f.close()
     9  print 'released lock at %s' % time.strftime('%Y-%m-%d %H:%M:%S')

Running the above code in two terminal windows at the same time shows the following.

Terminal 1:

% ./xflock.py
waiting for exclusive lock
acquired lock at 2018-06-30 22:25:36
released lock at 2018-06-30 22:25:46

Terminal 2:

% ./xflock.py
waiting for exclusive lock
acquired lock at 2018-06-30 22:25:46
released lock at 2018-06-30 22:25:56

Notice that the process running in the second terminal was blocked waiting to acquire the lock until the process running in the first terminal released the exclusive lock.

Non-Blocking Attempt

If the requirement is not to block on exclusive lock acquisition, it can be achieved by performing a bitwise OR of LOCK_EX (acquire exclusive lock) and LOCK_NB (do not block when locking). In other words, the statement fcntl.flock(f, fcntl.LOCK_EX) becomes fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB), so the process will either get the lock or move on without blocking.

Be aware that an IOError will be raised when a lock cannot be acquired in non-blocking mode. Therefore, it is the responsibility of the application developer to catch the exception and properly deal with the situation.

The behavior changes as shown below after the inclusion of fcntl.LOCK_NB in the sample code above.

Terminal 1:

% ./xflock.py
waiting for exclusive lock
acquired lock at 2018-06-30 22:42:34
released lock at 2018-06-30 22:42:44

Terminal 2:

% ./xflock.py
waiting for exclusive lock
Traceback (most recent call last):
  File "./xflock.py", line 5, in 
    fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
IOError: [Errno 11] Resource temporarily unavailable
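
One way to deal with it is to wrap the locking attempt in a try/except block along the following lines (a minimal sketch derived from the sample script above; the filename and messages are placeholders):

#!/bin/python
import errno, fcntl, time
f = open('somefile', 'a')
try:
    # attempt to acquire an exclusive lock without blocking
    fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    print 'acquired lock at %s' % time.strftime('%Y-%m-%d %H:%M:%S')
    time.sleep(10)
except IOError as e:
    # EAGAIN/EWOULDBLOCK (Errno 11) indicates the lock is held by another process
    if e.errno in (errno.EAGAIN, errno.EACCES):
        print 'could not acquire lock; moving on without blocking'
    else:
        raise
finally:
    f.close()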





Thursday, May 31, 2018
 
Solaris 11.4: 10 Good-to-Know Features, Enhancements or Changes

  1. [Admins] Device Removal From a ZFS Storage Pool

    In addition to removing hot spares, cache and log devices, Solaris 11.4 supports removal of top-level virtual data devices (vdevs) from a zpool, with the exception of RAID-Z pools. It is also possible to cancel an in-progress remove operation.

    This enhancement will come in handy especially when dealing with overprovisioned and/or misconfigured pools.

    Ref: ZFS: Removing Devices From a Storage Pool for examples.

  2. [Developers & Admins] Bundled Software

    Bundled software packages include Python 3.5, Oracle instant client 12.2, MySQL 5.7, Cython (C-Extensions for Python), cx_Oracle Python module, Go compiler, clang (C language family frontend for LLVM) and so on.

    cx_Oracle is a Python module that enables access to Oracle Database 12c and 11g from Python applications. The Solaris packaged version, 5.2, can be used with Python 2.7 and 3.4 (a short usage sketch appears at the end of this item).

    Depending on the type of Solaris installation, not every software package may get installed by default but the above mentioned packages can be installed from the package repository on demand.

    eg.,

    # pkg install pkg:/developer/golang-17
    
    # go version
    go version devel a30c3bd1a7fcc6a48acfb74936a19b4c Fri Dec 22 01:41:25 GMT 2017 solaris/sparc64
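
    Similarly, once the cx_Oracle module mentioned above is installed, querying a database from Python 2.7 might look something like the following (a minimal sketch; the connection string, user and query are placeholders rather than values from a real system):

    #!/bin/python
    import cx_Oracle
    # placeholder easy-connect string of the form user/password@host:port/service_name
    con = cx_Oracle.connect('scott/tiger@dbhost:1521/pdb1')
    cur = con.cursor()
    cur.execute('select sysdate from dual')
    print cur.fetchone()[0]
    cur.close()
    con.close()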
    
  3. [Security] Isolating Applications with Sandboxes

    Sandboxes are isolated environments where users can run applications protected from other processes on the system, without giving those applications full access to the rest of the system. Put another way, application sandboxing is one way to protect users, applications and systems by limiting the privileges of an application to its intended functionality, thereby reducing the risk of system compromise.

    Sandboxing joins Logical Domains (LDoms) and Zones in extending the isolation mechanisms available on Solaris.

    Sandboxes are suitable for constraining both privileged and unprivileged applications. Temporary sandboxes can be created to execute untrusted processes. Only administrators with the Sandbox Management rights profile (privileged users) can create persistent, uniquely named sandboxes with specific security attributes.

    The unprivileged command sandbox can be used to create temporary or named sandboxes to execute applications in a restricted environment. The privileged command sandbox can be used to create and manage named sandboxes.

    To install security/sandboxing package, run:

    # pkg install sandboxing
    
    -OR-
    
    # pkg install pkg:/security/sandboxing
    

    Ref: Configuring Sandboxes for Project Isolation for details.

  4. New Way to Find SRU Level

    uname -v was enhanced to include SRU level. Starting with the release of Solaris 11.4, uname -v reports Solaris patch version in the format "11.<update>.<sru>.<build>.<patch>".

    # uname -v
    11.4.0.12.0
    

    The above output translates to Solaris 11 Update 4, SRU 0, Build 12, Patch 0.
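
    For scripting purposes, the version string can also be pulled apart programmatically, for example with the Python that ships with Solaris (a quick sketch; on Solaris, platform.version() returns the same version string as uname -v):

    #!/bin/python
    # split uname -v style version string (11.<update>.<sru>.<build>.<patch>) into fields
    import platform
    v = platform.version()          # e.g. '11.4.0.12.0'
    print dict(zip(('release', 'update', 'sru', 'build', 'patch'), v.split('.')))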

  5. [Cloud] Service to Perform Initial Configuration of Guest Operating Systems

    cloudbase-init service on Solaris will help speed up the guest VM deployment in a cloud infrastructure by performing initial configuration of the guest OS. Initial configuration tasks typically include user creation, password generation, networking configuration, SSH keys and so on.

    cloudbase-init package is not installed by default on Solaris 11.4. Install the package only into VM images that will be deployed in cloud environments by running:

    # pkg install cloudbase-init
    
  6. Device Usage Information

    The release of Solaris 11.4 makes it easy to identify the consumers of busy devices. Busy devices are those devices that are opened or held by a process or kernel module.

    Having access to device usage information helps with certain hotplug or fault management tasks. For example, a busy device cannot be hotplugged; knowing how a device is currently being used helps users resolve the related issue(s).

    On Solaris 11.4, prtconf -v shows pids of processes using different devices.

    eg.,

    # prtconf -v
     ...
        Device Minor Nodes:
            dev=(214,72)
                dev_path=/pci@300/pci@2/usb@0/hub@4/storage@2/disk@0,0:a
                    spectype=blk type=minor nodetype=ddi_block:channel
                    dev_link=/dev/dsk/c2t0d0s0
                dev_path=/pci@300/pci@2/usb@0/hub@4/storage@2/disk@0,0:a,raw
                    spectype=chr type=minor nodetype=ddi_block:channel
                    dev_link=/dev/rdsk/c2t0d0s0
                Device Minor Opened By:
                    proc='fmd' pid=1516
                        cmd='/usr/lib/fm/fmd/fmd'
                        user='root[0]'
     ...
    
  7. [Developers] Support for C11 (C standard revision)

    Solaris 11.4 includes support for the C11 programming language standard: ISO/IEC 9899:2011 Information technology - Programming languages - C.

    Note that the C11 standard is not part of the Single UNIX Specification yet. Solaris 11.4 supports C11 in addition to C99, providing customers with C11 support ahead of its inclusion in a future UNIX specification. That means developers can write C programs using the newest available C programming language standard on Solaris 11.4 (and later).

  8. pfiles on a coredump

    pfiles, a /proc debugging utility, has been enhanced in Solaris 11.4 to provide details about the file descriptors opened by a crashed process in addition to the files opened by a live process.

    In other words, "pfiles core" now works.

  9. Privileged Command Execution History

    A new command, admhist, was included in Solaris 11.4 to show, in human-readable form, successful system administration commands that are likely to have modified the system state. It is similar to the shell builtin "history".

    eg.,

    The following command displays the system administration events that occurred on the system today.

    # admhist -d "today" -v
    ...
    2018-05-31 17:43:21.957-07:00 root@pitcher.dom.com cwd=/ /usr/bin/sparcv9/python2.7 /usr/bin/64/python2.7 /usr/bin/pkg -R /zonepool/p6128-z1/root/ --runid=12891 remote --ctlfd=8 --progfd=13
    2018-05-31 17:43:21.959-07:00 root@pitcher.dom.com cwd=/ /usr/lib/rad/rad -m /usr/lib/rad/transport -m /usr/lib/rad/protocol -m /usr/lib/rad/module -m /usr/lib/rad/site-modules -t pipe:fd=3,exit -e 180 -i 1
    2018-05-31 17:43:22.413-07:00 root@pitcher.dom.com cwd=/ /usr/bin/sparcv9/pkg /usr/bin/64/python2.7 /usr/bin/pkg install sandboxing
    2018-05-31 17:43:22.415-07:00 root@pitcher.dom.com cwd=/ /usr/lib/rad/rad -m /usr/lib/rad/transport -m /usr/lib/rad/protocol -m /usr/lib/rad/module -m /usr/lib/rad/site-modules -t pipe:fd=3,exit -e 180 -i 1
    2018-05-31 18:59:52.821-07:00 root@pitcher.dom.com cwd=/root /usr/bin/sparcv9/pkg /usr/bin/64/python2.7 /usr/bin/pkg search cloudbase-init
    ..
    

    It is possible to narrow the results by date, time, zone and audit-tag.

    Ref: man page of admhist(8)

  10. [Developers] Process Control Library

    Solaris 11.4 includes a new process control library, libproc, which provides a high-level interface to the features of /proc. The library also provides access to information, such as symbol tables, that is useful while examining and controlling processes and threads.

    A controlling process using libproc can typically:

    • Grab another process by suspending its execution
    • Examine the state of that process
    • Examine or modify the address space of the grabbed process
    • Make that process execute system calls on behalf of the controlling process, and
    • Release the grabbed process to continue execution

    Ref: man page of libproc(3LIB) for an example and details.





Wednesday, April 25, 2018
 
Solaris 11.4: Three Zones Related Changes in 3 Minutes or Less

[ 1 ] Automatic Live Migration of Kernel Zones using sysadm Utility

Live migrate (evacuate) all kernel zones from a host system onto other systems, temporarily or permanently, with the help of the new sysadm(8) utility. In addition, it is possible to evacuate all zones, including kernel zones that are not running and native Solaris zones in the installed state.

  1. If the target host (that is, the host the zone will be migrated to) meets all evacuation requirements, set it as the destination host for one or more migrating kernel zones by setting the SMF service property evacuation/target.

    svccfg -s svc:/system/zones/zone:<migrating-zone> setprop evacuation/target=ssh://<dest-host>
    
  2. Put the source host in maintenance mode using the sysadm utility to prevent non-running zones from attaching or booting, and to prevent zones from migrating in from other hosts.

    sysadm maintain <options>
    
  3. Migrate the zones to their destination host(s) by running sysadm's evacuate subcommand.

    sysadm evacuate <options>
    
  4. Complete the system maintenance work and end the maintenance mode on the source host.

    sysadm maintain -e
    
  5. Optionally, bring the evacuated zones back to the source host.

Please refer to Evacuating Oracle Solaris Kernel Zones for detailed steps.

[ 2 ] Moving Solaris Zones across Different Storage URIs

Starting with the release of Solaris 11.4, zoneadm's move subcommand can be used to change the zonepath without moving the Solaris zone installation. In addition, the same command can be used to move a zone installation from one storage location to another across different storage URIs.

[ 3 ] ZFS Dataset Live Zone Reconfiguration

Live Zone Reconfiguration (LZR) is the ability to make changes to a running Solaris native zone configuration permanently or temporarily. In other words, LZR avoids rebooting the target zone.

Solaris 11.3 already has support for reconfiguring resources such as dedicated CPUs, capped memory and automatic networks (anets). Solaris 11.4 extends LZR support to ZFS datasets.

With the release of Solaris 11.4, privileged users should be able to add or remove ZFS datasets dynamically to and from a Solaris native zone without the need to reboot the zone.

eg.,
# zoneadm list -cv
  ID NAME             STATUS      PATH                         BRAND      IP
   0 global           running     /                            solaris    shared
   1 tstzone          running     /zonepool/tstzone           solaris    excl

    Add a ZFS filesystem to the running zone, tstzone

# zfs create zonepool/testfs

# zonecfg -z tstzone "info dataset"

# zonecfg -z tstzone "add dataset; set name=zonepool/testfs; end; verify; commit"

# zonecfg -z tstzone "info dataset"
dataset:
        name: zonepool/testfs
        alias: testfs

# zoneadm -z tstzone apply
zone 'tstzone': Checking: Modifying anet linkname=net0
zone 'tstzone': Checking: Adding dataset name=zonepool/testfs
zone 'tstzone': Applying the changes

# zlogin tstzone "zfs list testfs"
cannot open 'testfs': filesystem does not exist

# zlogin tstzone "zpool import testfs"

# zlogin tstzone "zfs list testfs"
NAME    USED  AVAIL  REFER  MOUNTPOINT
testfs   31K  1.63T    31K  /testfs

    Remove a ZFS filesystem from the running zone, tstzone

# zonecfg -z tstzone "remove dataset name=zonepool/testfs; verify; commit"

# zonecfg -z tstzone "info dataset"

# zlogin tstzone "zpool export testfs"

# zoneadm -z tstzone apply
zone 'tstzone': Checking: Modifying anet linkname=net0
zone 'tstzone': Checking: Removing dataset name=zonepool/testfs
zone 'tstzone': Applying the changes

# zlogin tstzone "zfs list testfs"
cannot open 'testfs': filesystem does not exist

# zfs destroy zonepool/testfs
#

A summary of LZR support for resources and properties in native and kernel zones can be found on this page.





Sunday, March 25, 2018
 
Solaris 11.4: Brief Introduction to Solaris Analytics

This is something I can take some credit for even though I haven't contributed in any significant way other than filing a timely enhancement request. :-)

Overview

At a high level: Solaris has quite a few observability and diagnostic tools and utilities such as vmstat, mpstat, iostat, prstat, pgstat, lockstat and dtrace to observe and diagnose CPU/core/memory/disk I/O/network utilization, locks, busy processes and threads, interrupts and so on. However, except for power users, the majority of normal users and application & system administrators are not very familiar with those tools, or savvy enough to dig through man pages and documentation to figure out the best ways to extract the diagnostic or performance information they want or need (this is likely the case across all operating environments, not just Solaris).

Solaris 11.4 attempts to improve the usability of these tools and utilities by providing an interactive browser user interface (BUI) called "Oracle Solaris Analytics". Solaris Analytics gathers event information and data samples from a variety of system and application sources, and presents a consolidated view of statistics, faults and administrative change requests in a simple, easy-to-digest manner. Users are guided through health and performance analysis to diagnose problems.

Ultimately, OS users and application and system administrators benefit from the visual representation of performance and diagnostic data and system events. For instance, with the help of Solaris Analytics, users are able to view historical information about system performance, contrast it with current performance, and correlate statistics and events from multiple sources.

Check Using Oracle Solaris 11.4 Analytics for more information and details.

Accessing the Analytics BUI

Analytics services are enabled by default, and the Solaris Web UI can be accessed on ports 443 and 6787.

Access Analytics BUI at https://<s11.4host>:<port>/solaris/ where "s11.4host" is the hostname of the system running Solaris 11.4 and port=[443|6787].

Log in as any Solaris user that is configured to log into "s11.4host".

Those who are familiar with the Oracle ZFS Storage Appliance BUI may find some similarities between these two browser interfaces.

Troubleshooting: Unable to access Analytics BUI?

Make sure that the Analytics and Web UI related SMF services are online, and that the chosen port (443 or 6787) is reachable and not blocked.

Screenshots

Note that the system was almost idle, so the screenshots don't show much interesting data. Click on each image to see it in its original size.

Default Dashboard Home

[Screenshot: Solaris Analytics BUI]

Available Views

[Screenshot: Solaris Analytics BUI]

Sample View - SMF Services

[Screenshot: Solaris Analytics BUI]





Tuesday, February 27, 2018
 
Steps to Upgrade from Solaris 11.3 to 11.4 Beta

Recently I updated one of our lab systems running Solaris 11.3 SRU 16 to Solaris 11.4 beta. I just wanted to share my experience, along with the steps I ran and the stdout/stderr messages that I captured. I followed the Updating Your Operating System to Oracle Solaris 11.4 document in the Solaris 11.4 documentation library, and the instructions worked without a hitch.

My target system has one non-global zone running, and it took a little over an hour to complete the upgrade from start to finish. I recommend setting aside at least a couple of hours for this upgrade, as various factors such as the current Solaris 11.3 SRU, the number of non-global zones running, and how the Solaris 11.4 Beta packages are being accessed have a direct impact on the overall time it takes.

Step 1: Prepare the System for Upgrade

Oracle recommends that the target system is at least at Solaris 11.3 SRU 23 level for the upgrade to succeed so the first step is to make sure that the system is running Solaris 11.3 SRU 23 or later.

eg.,
# pkg info entire | grep -i branch
           Branch: 0.175.3.16.0.3.0

The above output indicates that my lab system is running Solaris 11.3 SRU 16 - so, I have no choice but to upgrade 11.3 SRU first. Those with systems already at 11.3 SRU 23 or later can skip to Step 2: Get access to Solaris 11.4 Beta packages.

The following listing indicates that the existing/configured publishers allow me to upgrade 11.3 to the latest SRU (which was 29 at the time).

# pkg list -af entire@0.5.11-0.175.3
NAME (PUBLISHER)             VERSION                    IFO
entire (solaris)             0.5.11-0.175.3.29.0.5.0    ---
entire (solaris)             0.5.11-0.175.3.28.0.4.0    ---
...

An attempt to update my system to the latest SRU met with a failure.

# pkg update --be-name s11.3.sru29 pkg:/entire@0.5.11-0.175.3.29.0.5.0
            Packages to remove:  37
           Packages to install:  12
            Packages to update: 510
       Create boot environment: Yes
Create backup boot environment:  No

Planning linked: 0/1 done; 1 working: zone:some-ngz
Linked progress: /pkg: update failed (linked image exception(s)):

A 'sync-linked' operation failed for child 'zone:some-ngz' with an 
unexpected return value of 1 and generated the following output:

pkg sync-linked: One or more client key and certificate files have 
expired. Please update the configuration for the publishers or origins 
listed below:

Publisher: solarisstudio
  Origin URI:
    https://pkg.oracle.com/solarisstudio/release/
    ...

The root cause of this failure is that a separate repository for Solaris Studio had been configured in the non-global zone, with expired client certificates. I fixed it by removing the associated publisher origin from the non-global zone.

root@some-ngz:~# pkg publisher
PUBLISHER                   TYPE     STATUS P LOCATION
solaris        (syspub)     origin   online T 
solarisstudio  (syspub)     origin   online T 
solarisstudio  (syspub)     origin   online F https://pkg.oracle.com/solarisstudio/release/

root@some-ngz:~# pkg set-publisher -G https://pkg.oracle.com/solarisstudio/release/ solarisstudio

The SRU update went through without a problem afterward.

# pkg update --be-name s11.3.sru29 pkg:/entire@0.5.11-0.175.3.29.0.5.0
            Packages to remove:  37
           Packages to install:  12
            Packages to update: 510
       Create boot environment: Yes
Create backup boot environment:  No

Planning linked: 0/1 done; 1 working: zone:some-ngz
Linked image 'zone:some-ngz' output:
|  Packages to remove:  25
| Packages to install:  12
|  Packages to update: 237
|  Services to change:   9
`
...
Updating image state                            Done
Creating fast lookup database                   Done
Executing linked: 0/1 done; 1 working: zone:some-ngz
Executing linked: 1/1 done
Updating package cache                           3/3

A clone of s11.3.sru.16.3.0 exists and has been updated and activated.
On the next boot the Boot Environment s11.3.sru29 will be
mounted on '/'.  Reboot when ready to switch to this updated BE.

Now that the system has been upgraded to 11.3 SRU 29, it is time to boot the new boot environment (BE) in preparation for the subsequent upgrade to 11.4 beta.

# beadm list | grep sru29
s11.3.sru29        R     -          98.05G  static 2018-02-26 17:13

# shutdown -y -g 30 -i6

# pkg info entire | grep -i branch
           Branch: 0.175.3.29.0.5.0

Step 2: Get access to Solaris 11.4 Beta packages

If the target system has connectivity to the public pkg.oracle.com repository, probably the simplest option is to use the public package repository that Oracle set up to support the 11.4 beta. If a local repository is preferred for any reason (say, no external connectivity, or to control what gets installed), download the Solaris 11.4 Beta package repository file and follow the instructions in the README to set up the beta repository locally.

The rest of this section shows the steps involved in getting access to the public repository.

Step 3: Perform the Upgrade

Optionally perform a dry run update to check if there are any issues.

pkg update -nv

Finally, perform the actual update to Solaris 11.4. This is the most time-consuming step in the whole exercise.

# pkg update --accept --be-name 11.4.0 --ignore-missing \
  --reject system/input-method/ibus/anthy \
  --reject system/input-method/ibus/pinyin \
  --reject system/input-method/ibus/sunpinyin \
  --reject system/input-method/library/m17n/contrib entire@latest
            ...
            Packages to remove: 190
           Packages to install: 304
            Packages to update: 716
           Mediators to change:   8
       Create boot environment: Yes
Create backup boot environment:  No
           ...

Reboot the system to boot the new 11.4 boot environment and check the OS version.

# uname -v
11.3

# beadm list | grep R
11.4.0             R     -          113.20G static 2018-02-26 20:10

# shutdown -y -g 30 -i6

# uname -v
11.4.0.12.0

# pkg info entire | grep Version
          Version: 11.4 (Oracle Solaris 11.4.0.0.0.12.1)





Wednesday, January 31, 2018
 
Random Solaris Tips: 11.4 Beta, LDoms 3.5, Privileges, File Attributes & Disk Block Size

Solaris OS Beta

11.4 Download Location & Documentation

Recently Solaris 11.4 hit the web as a public beta, meaning anyone can download and use it in non-production environments. This is a major Solaris milestone since the release of Solaris 11.3 GA back in 2015.

Few interesting pages:


Logical Domains
Dynamic Reconfiguration
Blacklisted Resources
Command History

Dynamic Reconfiguration of Named Resources

Starting with the release of Oracle VM Server for SPARC 3.5 (aka LDoms), it is possible to dynamically reconfigure domains that have named resources assigned. Named resources are resources that are explicitly assigned to domains. For example, assigning core ids 10 & 11 and a 32 GB block of memory at physical address 0x50000000 to some domain X is a named resource assignment. The SuperCluster Engineered System is one example where named resources are explicitly assigned to guest domains.

Be aware that depending on the state of the system, domains and resources, some of the dynamic reconfiguration operations may or may not succeed.

Here are few examples that show DR functionality with named resources.

ldm remove-core cid=66,67,72,73 primary
ldm add-core cid=66,67 guest1
ldm add-mem mblock=17664M:16G,34048M:16G,50432M:16G guest2

Listing Blacklisted Resources

When FMA detects faulty resource(s), Logical Domains Manager attempts to stop using those faulty core and memory resources (no I/O resources at the moment) in all running domains. Also those faulty resources will be preemptively blacklisted so they don't get assigned to any domain.

However, if the faulty resource is currently in use, Logical Domains Manager attempts to use core or memory DR to evacuate the resource. If the attempt fails, the faulty resource is marked as "evacuation pending". All such pending faulty resources are removed and added to the blacklist when the affected guest domain is stopped or rebooted.

Starting with the release of LDoms software 3.5, blacklisted and evacuation pending resources (faulty resources) can be examined with the help of ldm's -B option.

eg.,
# ldm list-devices -B
CORE
    ID              STATUS          DOMAIN
    1               Blacklisted
    2               Evac_pending    ldg1
MEMORY
    PA              SIZE    STATUS          DOMAIN
    0xa30000000     87G     Blacklisted
    0x80000000000   128G    Evac_pending    ldg1

Check this page for some more information.

LDoms Command History

Recent releases of LDoms Manager can show the history of recently executed ldm commands with the list-history subcommand.

# ldm history
Jan 31 19:01:18 ldm ls -o domain -p
Jan 31 19:01:48 ldm list -p
Jan 31 19:01:49 ldm list -e primary
Jan 31 19:01:54 ldm history
..

The last 10 ldm commands are shown by default. The ldm set-logctl history=<value> command can be used to configure the number of commands kept in the command history. Setting the value to 0 disables the command history log.


Disks

Determine the Blocksize

The devprop command on recent versions of Solaris 11 can show the logical and physical block size of a device. The size is reported in bytes.

eg.,

The following output shows a 512-byte size for both the logical and the physical block, so it is likely a 512-byte native (512n) disk.

% devprop -v -n /dev/rdsk/c4t2d0 device-blksize device-pblksize
device-blksize=512
device-pblksize=512

Find some useful information about disk drives that exceed the common 512-byte block size here.


Security Services

Privileges

With the debugging option (+D) enabled, the ppriv command on recent versions of Solaris 11 can be used to check whether the current user has the privileges required to run a certain command.

eg.,
% ppriv -ef +D /usr/sbin/trapstat
trapstat[18998]: missing privilege "file_dac_read" (euid = 100, syscall = "faccessat") for "/devices/pseudo/trapstat@0:trapstat" at devfs_access+0x74
trapstat: permission denied opening /dev/trapstat: Permission denied

% ppriv -ef +D /usr/sbin/prtdiag
System Configuration:  Oracle Corporation  sun4v T5240
Memory size: 65312 Megabytes

================================ Virtual CPUs ================================
..

The following example examines the privileges of a running process.

# ppriv 23829  <-- pid 23829 running in a non-global zone. ppriv executed in global zone
23829:  ora_lmhb_spare31
flags = 
        E: basic,sys_mount
        I: basic,sys_mount
        P: basic,sys_mount
        L: basic,contract_event,contract_identity,contract_observer,file_chown,file_chown_self,[...]


# ppriv 18374 <-- pid 18374 and ppriv are running in the global zone
18374:  /u01/app/12.2.0.1/grid/bin/crsd.bin reboot
flags = 
        E: basic,contract_event,contract_identity,contract_observer,file_chown,[...]
        I: basic,sys_mount
        P: basic,contract_event,contract_identity,contract_observer,file_chown,file_chown_self,[...]
        L: basic,contract_event,contract_identity,contract_observer,file_chown,file_chown_self,file_dac_execute,[...]

stat

File Attributes

The stat command on Solaris and other flavors of *nix operating systems can show various attributes of a file or a file system.

The following example shows how to fetch the filename along with the file owner, the last modification time and the size in bytes.

% stat -c "%n %U %y %s" /var/tmp/perl5.zip
/var/tmp/perl5.zip twiki 2017-04-29 10:10:52.295626350 -0700 7672631

The following example demonstrates how to examine the file permissions (access rights) in octal and human-readable form.

% stat -c "%a %A %n" perl5.zip
644 -rw-r--r-- perl5.zip

All attributes of the file can be obtained by dropping the -c option and its format string.
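
The same attributes can also be fetched programmatically. A rough Python equivalent of the first stat example above might look like this (a small sketch; the path is just the placeholder file from that example):

#!/bin/python
# rough equivalent of: stat -c "%n %U %y %s" /var/tmp/perl5.zip
import os, pwd, time
path = '/var/tmp/perl5.zip'
st = os.stat(path)
print '%s %s %s %s' % (path,
                       pwd.getpwuid(st.st_uid).pw_name,
                       time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(st.st_mtime)),
                       st.st_size)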

Now let's look at an example that examines the file system status.

% stat -f /export
  File: "/export"
    ID: 4bd000a  Namelen: 255     Type: zfs
Block size: 131072     Fundamental block size: 512
Blocks: Total: 159472866  Free: 159472802  Available: 159472802
Inodes: Total: 159472810  Free: 159472802

Dropping the -f option shows the attributes of the file itself rather than the file system.

% stat /export
  File: '/export'
  Size: 3               Blocks: 3          IO Block: 512    directory
Device: 12f0001000ah/1301375156234d     Inode: 4           Links: 3
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    3/     sys)
Access: 2017-07-27 10:38:22.346222255 -0700
Modify: 2016-08-01 16:58:04.364608118 -0700
Change: 2016-08-01 16:58:04.364608118 -0700
 Birth: 2016-08-01 15:41:55.740419710 -0700





Saturday, December 30, 2017
 
Blast from the Past : The Weekend Playlist #14

Previous playlists:

    #1    #8 (50s, 60s and 70s)    |    #2    #3    #4    #5 (80s)    |    #6    #7    #9 (90s)    |    #11    #12 (00s)    |    #13 (10s) |    #10 (Instrumental)

Another 50s, 60s and 70s playlist. Audio & Widget courtesy: Spotify





Wednesday, December 13, 2017
 
osc-setcoremem: Simulation on SuperCluster Nodes

Running the osc-setcoremem simulator on live SuperCluster nodes is very similar to running it on non-SuperCluster nodes, with the exception of setting a shell variable, SSC_SCM_SIMULATE, to differentiate simulated actions from normal processing. Any SuperCluster configuration can be simulated on a live SuperCluster node, including its own configuration.

Please check the first two blog posts in this series too for some information related to osc-setcoremem simulator.

Reproducing all high-level steps below for the sake of completeness (highlighted new text in blue color for convenience).

  1. Copy osc-setcoremem v2.4 or later executable binary from any live SuperCluster environment onto the target SuperCluster SPARC system running Solaris 11.3 or later

  2. Generate base configuration file in the original live SuperCluster environment that you wish to simulate elsewhere

    eg.,
    # /opt/oracle.supercluster/bin/osc-setcoremem -g
    
     [OK] simulator config file generated
     location: /var/tmp/oscsetcorememcfg.txt
    

       For the argument list, check "SIMULATOR ARGUMENTS" section in the output of "osc-setcoremem -h|-help"

  3. If you do not have access to the live SuperCluster environment that you wish to simulate, generate a base configuration file template and edit it manually to populate the base configuration of the SuperCluster environment to be simulated. The base configuration file template can be generated on any SPARC node running Solaris 11.3, and this step does not require root privileges.

    eg.,

    To generate a base configuration containing 4 domains, run:

    % ./osc-setcoremem -g -dc 4
    
     [OK] simulator config file generated
     location: /var/tmp/oscsetcorememcfg.txt
    
    % cat /var/tmp/oscsetcorememcfg.txt
    
    #DOMAIN                          ROOT             SERVICE          SOCKET           CORE             MEMORY           HCA
    # NAME                           DOMAIN           DOMAIN           COUNT            COUNT              GB             COUNT
    #--------------------------------------------------------------------------------------------------------------------------
    primary                          YES|NO           YES|NO           <COUNT>          <COUNT>          <CAPACITY>        1|2
    ssccn-dom1                       YES|NO           YES|NO           <COUNT>          <COUNT>          <CAPACITY>        1|2
    ssccn-dom2                       YES|NO           YES|NO           <COUNT>          <COUNT>          <CAPACITY>        1|2
    ssccn-dom3                       YES|NO           YES|NO           <COUNT>          <COUNT>          <CAPACITY>        1|2
    

       Check the Guidelines page for the manual editing of base configuration file

  4. Kick off simulation with the help of the base configuration file populated in either of the last two steps. osc-setcoremem's non-interactive mode can be activated too by supplying non-interactive arguments.

    Syntax: osc-setcoremem -p <platform> -c <config_file_path> [<non-interactive_arguments>]

    It is not necessary to set the shell variable SSC_SCM_SIMULATE when starting a new simulation from scratch with the help of a base configuration file. The presence of simulator-specific options such as -p and -c eliminates the need for any hint that this is a simulation on a live SuperCluster system.

    eg.,
    % ./osc-setcoremem -p m8 -c ./oscsetcorememcfg.txt -type core -res 16/480:16/480:16/480
    
                     osc-setcoremem simulator (non-interactive)
                        v2.5  built on Oct 13 2017 11:33:52
    
     Current Configuration: SuperCluster M8
    
     +----------------------------------+-------+--------+-----------+--- MINIMUM ----+
     | DOMAIN                           | CORES | MEM GB |   TYPE    | CORES | MEM GB |
     +----------------------------------+-------+--------+-----------+-------+--------+
     | primary                          |    32 |    960 | Dedicated |     2 |     32 |
     | ssccn1-dom1                      |    32 |    960 | Dedicated |     2 |     32 |
     | ssccn1-dom2                      |    32 |    960 | Dedicated |     2 |     32 |
     | ssccn1-dom3                      |     2 |     32 |   Root    |     2 |     32 |
     +----------------------------------+-------+--------+-----------+-------+--------+
     | Parked Resources (Approx)        |    30 |    928 |    --     |    -- |   --   |
     +----------------------------------+-------+--------+-----------+-------+--------+
    
     [ INFO ] following domains will be ignored in this session.
    
     Root Domains
     ------------
     ssccn1-dom3
    
    
     CPU Granularity Preference:
    
            1. Socket
            2. Core
    
     In case of Socket granularity, proportional memory capacity is
      automatically selected for you.
    
     Choose Socket or Core [S or C] C
     ...
     ...
             DOMAIN REBOOT SUMMARY
    
     The following domains would have rebooted on a live system:
    
       ssccn1-dom2
       ssccn1-dom1
       primary
    
    
             POSSIBLE NEXT STEP
    
     Continue the simulation with updated configuration
             eg., SSC_SCM_SIMULATE=1 /osc-setcoremem [<option(s)>]
    
                     - OR -
    
     Start with an existing or brand new base configuration
             eg., /osc-setcoremem  -p [T4|T5|M6|M7|M8] -c <path_to_config_file>
    
  5. By this time the osc-setcoremem simulator would have saved the changes made to the base configuration in the previous step. You can verify this by running the osc-setcoremem executable with no options or with the "-list" option.

    Be sure to set the shell variable SSC_SCM_SIMULATE to any value before running the osc-setcoremem executable. Without this variable, osc-setcoremem shows the configuration of the underlying SuperCluster that it is running on.

    eg.,

    Changes highlighted below.

    % SSC_SCM_SIMULATE=1 ./osc-setcoremem
    
                             osc-setcoremem simulator
                        v2.5  built on Oct 13 2017 11:33:52
    
     Current Configuration: SuperCluster M8
    
     +----------------------------------+-------+--------+-----------+--- MINIMUM ----+
     | DOMAIN                           | CORES | MEM GB |   TYPE    | CORES | MEM GB |
     +----------------------------------+-------+--------+-----------+-------+--------+
     | primary                          |    16 |    480 | Dedicated |     2 |     32 |
     | ssccn1-dom1                      |    16 |    480 | Dedicated |     2 |     32 |
     | ssccn1-dom2                      |    16 |    480 | Dedicated |     2 |     32 |
     | ssccn1-dom3                      |     2 |     32 |   Root    |     2 |     32 |
     +----------------------------------+-------+--------+-----------+-------+--------+
     | Parked Resources (Approx)        |    78 |   2368 |    --     |    -- |   --   |
     +----------------------------------+-------+--------+-----------+-------+--------+
    
     [ INFO ] following domains will be ignored in this session.
    
     Root Domains
     ------------
     ssccn1-dom3
     ...
     ...
    
  6. Two options to choose from at this point:

    1. Continue simulation using the updated configuration.

        Simply running the osc-setcoremem executable with the SSC_SCM_SIMULATE shell variable set, either without any arguments or with optional non-interactive arguments, continues the simulation. This lets you simulate the move from one arbitrary configuration state (cores & memory assigned to different domains) to another.

      Syntax: SSC_SCM_SIMULATE=1 osc-setcoremem [<non-interactive_arguments>]

                 -OR-

    2. Start a brand new simulation using any base configuration file

        This is essentially step #4 above. Here we assume that the required base configuration file has been populated and is ready. Be aware that this step wipes the current modified core/memory [virtual] configuration clean and starts again with the base configuration specified in the configuration file supplied to the "-c" option.

  7. Repeat steps 2-6 to simulate different SuperCluster configurations

   A complete example can be found in Oracle SuperCluster M7/M8 Administration Guide at Run a Simulation on a SuperCluster Node





Tuesday, October 17, 2017
 
osc-setcoremem: Simulation on Non-SuperCluster Nodes

.. simulation of domain CPU and memory configuration changes, that is.

Please check the first blog post in this series too - Oracle SuperCluster: osc-setcoremem simulator.

Keep in mind that fresh/new simulations should always start with a configuration representing the base configuration of the SuperCluster node (PDom) to be simulated. The base configuration is the set of compute server CPU and memory resources that are initially allocated during a SuperCluster installation.

High-level steps:

  1. Copy osc-setcoremem v2.4 or later executable binary from any live SuperCluster environment onto the target non-supercluster SPARC system running Solaris 11.3 or later

  2. Generate base configuration file in the original live SuperCluster environment that you wish to simulate elsewhere

    eg.,
    # /opt/oracle.supercluster/bin/osc-setcoremem -g
    
     [OK] simulator config file generated
     location: /var/tmp/oscsetcorememcfg.txt
    

       For the argument list, check "SIMULATOR ARGUMENTS" section in the output of "osc-setcoremem -h|-help"

  3. If you do not have access to the live SuperCluster environment (that you wish to simulate), generate a base configuration file template and edit it manually to populate the base configuration of the SuperCluster environment to be simulated. The base configuration file template can be generated on any SPARC node running Solaris 11.3, and this step does not require root privileges.

    eg.,

    To generate a base configuration containing 4 domains, run:

    % ./osc-setcoremem -g -dc 4
    
     [OK] simulator config file generated
     location: /var/tmp/oscsetcorememcfg.txt
    
    % cat /var/tmp/oscsetcorememcfg.txt
    
    #DOMAIN                          ROOT             SERVICE          SOCKET           CORE             MEMORY           HCA
    # NAME                           DOMAIN           DOMAIN           COUNT            COUNT              GB             COUNT
    #--------------------------------------------------------------------------------------------------------------------------
    primary                          YES|NO           YES|NO           <COUNT>          <COUNT>          <CAPACITY>        1|2
    ssccn-dom1                       YES|NO           YES|NO           <COUNT>          <COUNT>          <CAPACITY>        1|2
    ssccn-dom2                       YES|NO           YES|NO           <COUNT>          <COUNT>          <CAPACITY>        1|2
    ssccn-dom3                       YES|NO           YES|NO           <COUNT>          <COUNT>          <CAPACITY>        1|2
    

       Check the Guidelines page for the manual editing of base configuration file

  4. Kick off simulation with the help of the base configuration file populated in either of the last two steps. osc-setcoremem's non-interactive mode can be activated too by supplying non-interactive arguments.

    Syntax: osc-setcoremem -p <platform> -c <config_file_path> [<non-interactive_arguments>]

    eg.,
    % ./osc-setcoremem -p m8 -c ./oscsetcorememcfg.txt -type core -res 16/480:16/480:16/480
    
                             osc-setcoremem simulator
                        v2.5  built on Oct 13 2017 11:33:52
    
     Current Configuration: SuperCluster M8
    
     +----------------------------------+-------+--------+-----------+--- MINIMUM ----+
     | DOMAIN                           | CORES | MEM GB |   TYPE    | CORES | MEM GB |
     +----------------------------------+-------+--------+-----------+-------+--------+
     | primary                          |    32 |    960 | Dedicated |     2 |     32 |
     | ssccn1-dom1                      |    32 |    960 | Dedicated |     2 |     32 |
     | ssccn1-dom2                      |    32 |    960 | Dedicated |     2 |     32 |
     | ssccn1-dom3                      |     2 |     32 |   Root    |     2 |     32 |
     +----------------------------------+-------+--------+-----------+-------+--------+
     | Parked Resources (Approx)        |    30 |    928 |    --     |    -- |   --   |
     +----------------------------------+-------+--------+-----------+-------+--------+
    
     [ INFO ] following domains will be ignored in this session.
    
     Root Domains
     ------------
     ssccn1-dom3
    
    
     CPU Granularity Preference:
    
            1. Socket
            2. Core
    
     In case of Socket granularity, proportional memory capacity is
      automatically selected for you.
    
     Choose Socket or Core [S or C] C
     ...
     ...
             DOMAIN REBOOT SUMMARY
    
     The following domains would have rebooted on a live system:
    
       ssccn1-dom2
       ssccn1-dom1
       primary
    
    
             POSSIBLE NEXT STEP
    
     Continue the simulation with updated configuration
             eg., /osc-setcoremem [<option(s)>]
    
                     - OR -
    
     Start with an existing or brand new base configuration
             eg., /osc-setcoremem  -p [T4|T5|M6|M7|M8] -c <path_to_config_file>
    
  5. By this time the osc-setcoremem simulator would have saved the changes made to the base configuration in the previous step. You can verify this by running the osc-setcoremem executable with no options or with the "-list" option.

    eg.,

    Changes highlighted below.

    % ./osc-setcoremem
    
                             osc-setcoremem simulator
                        v2.5  built on Oct 13 2017 11:33:52
    
     Current Configuration: SuperCluster M8
    
     +----------------------------------+-------+--------+-----------+--- MINIMUM ----+
     | DOMAIN                           | CORES | MEM GB |   TYPE    | CORES | MEM GB |
     +----------------------------------+-------+--------+-----------+-------+--------+
     | primary                          |    16 |    480 | Dedicated |     2 |     32 |
     | ssccn1-dom1                      |    16 |    480 | Dedicated |     2 |     32 |
     | ssccn1-dom2                      |    16 |    480 | Dedicated |     2 |     32 |
     | ssccn1-dom3                      |     2 |     32 |   Root    |     2 |     32 |
     +----------------------------------+-------+--------+-----------+-------+--------+
     | Parked Resources (Approx)        |    78 |   2368 |    --     |    -- |   --   |
     +----------------------------------+-------+--------+-----------+-------+--------+
    
     [ INFO ] following domains will be ignored in this session.
    
     Root Domains
     ------------
     ssccn1-dom3
     ...
     ...
    
  6. Two options to choose from at this point:

    1. Continue simulation using the updated configuration.

        Simply running the osc-setcoremem executable, either without any arguments or with optional non-interactive arguments, continues the simulation. This lets you simulate the move from one arbitrary configuration state (cores & memory assigned to different domains) to another.

      Syntax: osc-setcoremem [<non-interactive_arguments>]

                 -OR-

    2. Start a brand new simulation using any base configuration file

        This is essentially step #4 above. Here we assume that the required base configuration file has been populated and is ready. Be aware that this step wipes the current modified core/memory [virtual] configuration clean and starts again with the base configuration specified in the configuration file supplied to the "-c" option.

  7. Repeat steps 2-6 to simulate different SuperCluster configurations

   A complete example can be found in Oracle SuperCluster M7/M8 Administration Guide at Example: Simulating Changes on a Non-SuperCluster Node





Tuesday, September 19, 2017
 
Oracle SuperCluster: osc-setcoremem simulator

Target Audience: Oracle SuperCluster customers and Oracle technical support

.. are some of the common/recurring questions that I've heard over the last several years. Unfortunately, in most cases, questions like "Is it possible to use setcoremem to achieve my desired configuration?" don't arise until after the customer has their hands on the SuperCluster configuration they ordered. If customers had a way of figuring out beforehand what core/memory combinations are possible in the planned SuperCluster configuration, it would help them tremendously in planning, and would likely minimize frustrations and service requests later on should the tool show different possible combinations than the ones they'd prefer.

To address some of the questions and concerns similar to the ones mentioned above, the osc-setcoremem simulator was introduced in SuperCluster Exafamily update 2.4.x. The self-contained osc-setcoremem binary from Exafamily update 2.4.x can be used to run simulations on any SPARC hardware, not just SuperCluster SPARC hardware, as long as the target hardware is running Solaris 11 or later. While normal (non-simulation) execution of osc-setcoremem requires root privileges to make the intended core/memory configuration changes in one or more logical domains (LDoms), it is not necessary to use root privileges to run the simulator. In other words, normal users with limited privileges can run osc-setcoremem on any SPARC hardware, including non-SuperCluster hardware, to simulate the behavior of osc-setcoremem on a variety of supported SuperCluster T4/T5/M6/M7/M8 configurations, such as fully/half/quarter populated configurations.

Few things to keep in mind:

We will explore the simulator in the next few blog posts. Meanwhile, please check out the official documentation pages at Configuring CPU and Memory Resources (osc-setcoremem) and Run a Simulation to learn more about the functionality of osc-setcoremem and for instructions on running the simulator.

ALSO SEE:

  1. osc-setcoremem: A Brief Introduction
  2. Non-Interactive osc-setcoremem

(To be continued ..)






2004-2018 
