Wednesday, August 24, 2005

An Odyssey to Solaris 11 on Solaris Express 17

Last weekend I upgraded the OS on my Dell Inspiron 2650 laptop from Solaris 10 GA to Solaris Nevada Build 17 (aka snv_17 or Solaris 11). In fact, I had no plans to upgrade it in the near future, but an unexpected power outage put my laptop to sleep for a few hours after it had been running for 57 days continuously. So I took that opportunity to do some of the things that I had been postponing for quite a few days:
  1. Prepare the system to install OpenSolaris bits
    This is the major reason for the upgrade, but I'm not sure I'll be able to play with OpenSolaris right away, because I can't afford significant downtime on my notebook at the moment.

  2. Experiment with large pages for text and initialized data segments (aka VMPSS or MPSS for Vnodes). [TODO: detailed blog post on VMPSS]. On SPARC, VMPSS maps the text and initialized data segments of user applications and libraries with large pages by default; i.e., the user doesn't have to enable it, the way they do for MPSS for data. This was integrated into Solaris Express Build 15.

    Since large pages are not used by default on x86 systems, I enabled them by adding set use_text_largepages=1 to /etc/system once the system was ready. I can see that some processes (e.g., Xorg, X-Chat, Mozilla) are now using 4M pages.
    # pgrep Xorg
    2357

    # pmap -sx 2357 | grep 4M
    08800000 8192 8192 8192 - 4M rw--- [ heap ]
    09400000 4096 4096 4096 - 4M rw--- [ heap ]
  3. To install vold (volume daemon)
    Somehow I left this daemon out of my Solaris 10 GA installation, and have been mounting CD-ROMs manually ever since

  4. To post instructions for the system recovery from a run-time linker failure, on Solaris 10 and later
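
For the large-page check in item 2 above, grepping the pmap output by eye works, but a small helper can total it up. The following is only a sketch: the function name summarize_large_pages is made up for illustration, and it assumes the page size is the sixth column of pmap -sx output, as in the two heap lines shown.

```shell
#!/bin/sh
# Summarize large-page mappings from `pmap -sx <pid>` output read on stdin.
# Assumption: columns are address, Kbytes, RSS, Anon, Locked, Pgsz, ...
# so the page size is field 6, as in the sample output above.
# On Solaris, /usr/bin/awk is the old awk without -v support; set
# AWK=nawk (or /usr/xpg4/bin/awk) there.
AWK="${AWK:-awk}"

summarize_large_pages() {
    pgsz="${1:-4M}"    # page size to look for; 4M is what this x86 laptop reported
    "$AWK" -v want="$pgsz" '
        $6 == want { n++; kb += $2 }
        END { printf "%d mappings, %d KB mapped with %s pages\n", n, kb, want }
    '
}
```

It could be used as pmap -sx 2357 | summarize_large_pages 4M, where 2357 is the Xorg PID from the example.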
Upgrading Solaris 10 to Solaris Express build 17 (aka Solaris 11)

I have some bad news about the upgrade. First of all, there is no way to force the installer to overwrite the previously installed packages. It would be nice to have a blind upgrade option for home users, just like the blindingly fast upgrade (BFU) for OpenSolaris. The installer tried to back up most of the packages, and complained that there was not enough space to install the new/updated packages, even though I had reserved 9G of disk space for the root file system (/). It actually needs 4.5G of disk space for the complete OEM. Since it tried to back up most of the Solaris 10 GA installation, the disk space requirement went up to ~9G. To accommodate the 300M+ shortage, I sacrificed the existing swap partition.

Then I kicked off the installation and went to sleep, hoping that the installation from CD1 would be complete by morning. But to my surprise, it had stalled; there was absolutely 0% progress on the screen. I then thought of restarting the whole process by killing the installer windows. As soon as I killed the status window, the installation resumed, and it took three hours to complete the upgrade from CD1 alone.

After the reboot, many errors showed up on the screen and the system was stuck in console mode (neither asking for CD #2 nor leaving text mode). So I decided to do a clean install. Other influencing factors include (but are not limited to):
  1. The /opt/csw directory was a bit messy, with three different window managers (KDE, Enlightenment and Xfce) and hundreds of third-party packages; because of this, most of the allocated 9G of disk space was filled up, and it is not easy to clean up

  2. Multiple copies (different versions) of GLib, GTK+, Pango, etc., got installed under /usr/lib, /opt/sfw, /opt/csw directories

  3. Accidental deletion of package config files (.pc) from /usr/lib/pkgconfig directory, made it hard to build applications

  4. Parts of JDS were broken, which I never bothered to fix, a reluctance that came from its high memory requirements. JDS is pretty cool, but the GNOME components hog most of the physical memory, leaving the user no choice but to keep the number of open windows pretty low. Moreover, that makes it really hard to run on any machine with less than 512M of memory
    NPROC USERNAME  SIZE   RSS MEMORY      TIME  CPU
       37 techno    864M  559M    91%   4:54:34 8.2%
    Let's hope that the GNOME Memory Reduction project yields good results.
Solaris 11 clean installation

The really nice thing is that the installer noticed there were UFS partitions holding data, and gave me the option to preserve the partitions of my choice. I chose my home directory, and from there the installation was a piece of cake. In the end, the (clean) installation felt fast compared to the Solaris 10 GA installation.

Here are some hints for home users:
Choose:
* Solaris Interactive installation in the initial screen.
* Networked in Network Connectivity screen
* Use DHCP
* No to IPv6 & Kerberos, if running Solaris 10 is the only priority
* None to Name Service (assuming the machine is not in corporate network)
* CD/DVD to "Specify Media"
* Default Install to "Select type of install"
* No to "Do you need to override the system's default NFS version 4 domain name?"

Notes:
  1. Once the installation from CD1 is complete, system boots up from the hard disk; and it may appear that it is not going to ask for CD2. Just be patient, and wait for it to ask you about CD2

  2. You have to download the companion software CD from a different location. It can't be found in the same place where you download the Solaris OS CD/DVD images. Here's the URL: http://www.sun.com/software/solaris/freeware/s10download.xml. If the image is on the hard disk, you need not burn it to a CD; it can be mounted through a loopback file device with the help of the lofiadm tool. This can be done while the installer is waiting for the CD or for the path to the installer on a local/remote disk.
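
The loopback mount mentioned in note 2 could look roughly like this. It is a sketch under assumptions: the function names mount_iso and unmount_iso and the image path in the usage line are hypothetical, and the commands need to be run as root.

```shell
#!/bin/sh
# Attach an ISO image to a lofi block device and mount it read-only.
# The image path and mount point are placeholders; pass your own.
mount_iso() {
    image="$1"                 # lofiadm expects an absolute path
    mountpoint="${2:-/mnt}"
    # lofiadm -a prints the device it allocated, e.g. /dev/lofi/1
    dev=$(lofiadm -a "$image") || return 1
    # hsfs is the Solaris file system type for ISO 9660 / High Sierra media
    mount -F hsfs -o ro "$dev" "$mountpoint" || { lofiadm -d "$dev"; return 1; }
    echo "$image mounted on $mountpoint via $dev"
}

# Undo the above: unmount first, then release the lofi device.
unmount_iso() {
    umount "$1" && lofiadm -d "$2"
}
```

For example, mount_iso /export/home/techno/companion.iso /mnt, then point the installer at /mnt; afterwards, unmount_iso /mnt /dev/lofi/1 releases the device (use whatever device name mount_iso reported).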
Relevant resources:
  1. http://docs.sun.com/app/docs/doc/817-0544/6mgbagb1b?a=view (installation instructions)
  2. http://shots.osdir.com/slideshows/slideshow.php?release=279&slide=1 (screen shots)
Post-installation troubleshooting (Solaris & JDS)
  1. Missing JDS Linux partition
    • Solaris 11 overwrote the Master Boot Record (MBR) with its own GRUB; hence JDS Linux disappeared from the GRUB menu list. Luckily I had a backup of the GRUB config file from the Linux partition, so a simple copy/paste into /boot/grub/menu.lst did the job. Now I can boot both JDS Linux and Solaris 11

  2. Audio driver problem is back
  3. Unable to login to JDS as normal user
    • I was able to log in as root, but not as a normal user. This behavior hinted that it had something to do with permissions. With a little effort, I found that the ~/.ICEauthority file holds the X authorization records, and it must not be owned by root for a normal user to log in. Fixed it by changing the ownership of the ~/.ICEauthority file

  4. [JDS] Unable to launch any application from command line. The following error message appears on the console:
    Xlib: connection to ":0.0" refused by server
    Xlib: Invalid MIT-MAGIC-COOKIE-1 key
    Thu Aug 15 21:29:59 2005 Gtk-WARNING **: cannot open display: :0.0 at (eval 1) line 1.
    • When we try to run an X11 application, it reads the $DISPLAY variable, connects to the X11 server (local host, in this case) and provides the magic cookie by reading ~/.Xauthority

    • Fixed it by deleting ~/.Xauthority file (that's not a proper way, of course)

  5. Couldn't burn CDs. It failed with the following error:
    # /opt/sfw/bin/cdrecord -scanbus
    Warning: Using USCSI interface.
    Warning: Volume management is running, medialess managed drives are invisible.
    /opt/sfw/bin/cdrecord: No such file or directory. Cannot open '/dev/rdsk/c1t0d0s2'. Cannot open SCSI driver.
    ...

    # truss /opt/sfw/bin/cdrecord -scanbus
    ...

    open("/dev/rdsk/c1t0d0s2", O_RDONLY|O_NDELAY) Err#16 EBUSY
    ...
    • From the warning message and the truss output, it is clear that something is blocking the CD recording device, and that it is the "removable media manager" process, vold

    • For burning CDRs on Solaris:
      1. Stop the volume manager daemon vold, because it blocks the CD recorder device
           /etc/init.d/volmgt stop
      2. Burn the CD using cdrecord
      3. Restart the daemon
           /etc/init.d/volmgt start

  6. gtkam didn't recognize the digital camera
    • Fixed it by binding the device (digital camera) to ugen (USB Generic Driver)

    • There's a piece of good news on this front. The fix for RFE 6213551, "libusb support should just work", was incorporated in Solaris Express 8/05 (Nevada Build 19). With this, the system handles unknown USB devices automatically, and we should be able to plug in USB devices such as digital cameras and scanners with no manual configuration at all.
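
Going back to item 5, the three CD-burning steps (stop vold, burn, restart vold) can be wrapped so that vold is restarted even when the burn fails. This is a sketch, not a tested recipe: the function name burn_cd is made up, and the default dev=1,0,0 is an assumption that should be replaced with whatever cdrecord -scanbus reports on your system (run it while vold is stopped, as the error above shows).

```shell
#!/bin/sh
# Burn an ISO image while vold is stopped, then restart vold.
# VOLMGT and CDRECORD default to the paths used in this post.
VOLMGT="${VOLMGT:-/etc/init.d/volmgt}"
CDRECORD="${CDRECORD:-/opt/sfw/bin/cdrecord}"

burn_cd() {
    image="$1"
    device="${2:-1,0,0}"    # assumed bus,target,lun; check with -scanbus
    # vold holds the recorder open (EBUSY in the truss output), so stop it first
    "$VOLMGT" stop || return 1
    "$CDRECORD" -v dev="$device" "$image"
    status=$?
    # Restart vold whether or not the burn succeeded
    "$VOLMGT" start
    return $status
}
```

Usage would be something like burn_cd /path/to/image.iso 1,0,0 as root; the return value is cdrecord's exit status.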
Goodies
  1. Replaced the default GRUB splash screen with Chandan's Solaris Express theme
  2. Replaced the default login screen with Chandan's Open World theme for OpenSolaris
  3. Installed Evince, a document viewer, by grabbing evince-0.3.2-SunOS5.8-i386-CSW.pkg.gz from blastwave.org
Now I have a nice, shiny little Solaris desktop. As Solaris is getting better (from a home user's perspective) with each Solaris Express build, it is definitely worth the effort. For all potential new users, Solaris Express is the way to go.