Chapter 7. Frequently Asked Questions

This section provides answers to frequently asked questions associated with the NVIDIA Linux x86 Driver and its installation. Common problem diagnoses can be found in Chapter 8, Common Problems and tips for new users can be found in Appendix I, Tips for New Linux Users. Also, detailed information for specific setups is provided in the Appendices.

7.1. NVIDIA-INSTALLER

How do I extract the contents of the .run without actually installing the driver?

Run the installer as follows:

    # sh NVIDIA-Linux-x86-325.08.run --extract-only

This will create the directory NVIDIA-Linux-x86-325.08, containing the uncompressed contents of the .run file.

How can I see the source code to the kernel interface layer?

The source files to the kernel interface layer are in the kernel directory of the extracted .run file. To get to these sources, run:

    # sh NVIDIA-Linux-x86-325.08.run --extract-only
    # cd NVIDIA-Linux-x86-325.08/kernel/

How and when are the the NVIDIA device files created?

When a user-space NVIDIA driver component needs to communicate with the NVIDIA kernel module, and the NVIDIA character device files do not yet exist, the user-space component will first attempt to load the kernel module and create the device files itself.

Device file creation and kernel module loading generally require root privileges. The X driver, running within a setuid root X server, will have these privileges, but not, e.g., the CUDA driver within the environment of a normal user.

If the user-space NVIDIA driver component cannot load the kernel module or create the device files itself, it will attempt to invoke the setuid root nvidia-modprobe utility, which will perform these operations on behalf of the non-privileged driver.

See the nvidia-modprobe(1) man page, or its source code, available here: ftp://download.nvidia.com/XFree86/nvidia-modprobe/

When possible, it is recommended to use your Linux distribution's native mechanisms for managing kernel module loading and device file creation. nvidia-modprobe is provided as a fallback to work out-of-the-box in a distribution-independent way.

Whether a user-space NVIDIA driver component does so itself, or invokes nvidia-modprobe, it will default to creating the device files with the following attributes:

      UID:  0     - 'root'
      GID:  0     - 'root'
      Mode: 0666  - 'rw-rw-rw-'

Existing device files are changed if their attributes don't match these defaults. If you want the NVIDIA driver to create the device files with different attributes, you can specify them with the "NVreg_DeviceFileUID" (user), "NVreg_DeviceFileGID" (group) and "NVreg_DeviceFileMode" NVIDIA Linux kernel module parameters.

For example, the NVIDIA driver can be instructed to create device files with UID=0 (root), GID=44 (video) and Mode=0660 by passing the following module parameters to the NVIDIA Linux kernel module:

      NVreg_DeviceFileUID=0
      NVreg_DeviceFileGID=44
      NVreg_DeviceFileMode=0660

The "NVreg_ModifyDeviceFiles" NVIDIA kernel module parameter will disable dynamic device file management, if set to 0.

Why does NVIDIA not provide RPMs?

Not every Linux distribution uses RPM, and NVIDIA provides a single solution that works across all Linux distributions. NVIDIA encourages Linux distributions to repackage and redistribute the NVIDIA Linux driver in their native package management formats. These repackaged NVIDIA drivers are likely to inter-operate best with the Linux distribution's package management technology. For this reason, NVIDIA encourages users to use their distribution's repackaged NVIDIA driver, where available.

Can the nvidia-installer use a proxy server?

Yes, because the FTP support in nvidia-installer is based on snarf, it will honor the FTP_PROXY, SNARF_PROXY, and PROXY environment variables.

What is the significance of the -no-compat32 suffix on Linux-x86_64 .run files?

To distinguish between Linux-x86_64 driver package files that do or do not also contain 32-bit compatibility libraries, "-no-compat32" is be appended to the latter. NVIDIA-Linux-x86-325.08.run contains both 64-bit and 32-bit driver binaries; but NVIDIA-Linux-x86-325.08-no-compat32.run omits the 32-bit compatibility libraries.

Can I add my own precompiled kernel interfaces to a .run file?

Yes, the --add-this-kernel .run file option will unpack the .run file, build a precompiled kernel interface for the currently running kernel, and repackage the .run file, appending -custom to the filename. This may be useful, for example. if you administer multiple Linux computers, each running the same kernel.

Where can I find the source code for the nvidia-installer utility?

The nvidia-installer utility is released under the GPL. The source code for the version of nvidia-installer built with driver 325.08 is in nvidia-installer-325.08.tar.bz2 available here: ftp://download.nvidia.com/XFree86/nvidia-installer/

7.2. NVIDIA Driver

Where should I start when diagnosing display problems?

One of the most useful tools for diagnosing problems is the X log file in /var/log. Lines that begin with (II) are information, (WW) are warnings, and (EE) are errors. You should make sure that the correct config file (i.e. the config file you are editing) is being used; look for the line that begins with:

    (==) Using config file:

Also make sure that the NVIDIA driver is being used, rather than the “nv” or “vesa” driver. Search for

    (II) LoadModule: "nvidia"

Lines from the driver should begin with:

    (II) NVIDIA(0)

How can I increase the amount of data printed in the X log file?

By default, the NVIDIA X driver prints relatively few messages to stderr and the X log file. If you need to troubleshoot, then it may be helpful to enable more verbose output by using the X command line options -verbose and -logverbose, which can be used to set the verbosity level for the stderr and log file messages, respectively. The NVIDIA X driver will output more messages when the verbosity level is at or above 5 (X defaults to verbosity level 1 for stderr and level 3 for the log file). So, to enable verbose messaging from the NVIDIA X driver to both the log file and stderr, you could start X with the verbosity level set to 5, by doing the following

    % startx -- -verbose 5 -logverbose 5

What is NVIDIA's policy towards development series Linux kernels?

NVIDIA does not officially support development series kernels. However, all the kernel module source code that interfaces with the Linux kernel is available in the kernel/ directory of the .run file. NVIDIA encourages members of the Linux community to develop patches to these source files to support development series kernels. A web search will most likely yield several community supported patches.

Where can I find the tarballs?

Plain tarballs are not available. The .run file is a tarball with a shell script prepended. You can execute the .run file with the --extract-only option to unpack the tarball.

How do I tell if I have my kernel sources installed?

If you are running on a distro that uses RPM (Red Hat, Mandriva, SuSE, etc), then you can use rpm to tell you. At a shell prompt, type:

    % rpm -qa | grep kernel

and look at the output. You should see a package that corresponds to your kernel (often named something like kernel-2.6.15-7) and a kernel source package with the same version (often named something like kernel-devel-2.6.15-7 or kernel-source-2.4.22-7). If none of the lines seem to correspond to a source package, then you will probably need to install it. If the versions listed mismatch (e.g., kernel-2.6.15-7 vs. kernel-devel-2.6.15-10), then you will need to update the kernel-devel package to match the installed kernel. If you have multiple kernels installed, you need to install the kernel-devel package that corresponds to your running kernel (or make sure your installed source package matches the running kernel). You can do this by looking at the output of uname -r and matching versions.

What is SELinux and how does it interact with the NVIDIA driver ?

Security-Enhanced Linux (SELinux) is a set of modifications applied to the Linux kernel and utilities that implement a security policy architecture. When in use it requires that the security type on all shared libraries be set to 'shlib_t'. The installer detects when to set the security type, and sets it on all shared libraries it installs. The option --force-selinux passed to the .run file overrides the detection of when to set the security type.

Why does X use so much memory?

When measuring any application's memory usage, you must be careful to distinguish between physical system RAM used and virtual mappings of shared resources. For example, most shared libraries exist only once in physical memory but are mapped into multiple processes. This memory should only be counted once when computing total memory usage. In the same way, the video memory on a graphics card or register memory on any device can be mapped into multiple processes. These mappings do not consume normal system RAM.

This has been a frequently discussed topic on XFree86 mailing lists; see, for example:

http://marc.theaimsgroup.com/?l=xfree-xpert&m=96835767116567&w=2

The pmap utility described in the above thread is available in the "procps" package shipped with most recent Linux distributions, and is a useful tool in distinguishing between types of memory mappings. For example, while top may indicate that X is using several hundred MB of memory, the last line of output from the output of pmap (note that pmap may need to be run as root):

    # pmap -d `pidof X` | tail -n 1
    mapped: 161404K    writeable/private: 7260K    shared: 118056K

reveals that X is really only using roughly 7MB of system RAM (the "writeable/private" value).

Note, also, that X must allocate resources on behalf of X clients (the window manager, your web browser, etc); the X server's memory usage will increase as more clients request resources such as pixmaps, and decrease as you close X applications.

The IndirectMemoryAccess X configuration option may cause additional virtual address space to be reserved.

Why do applications that use DGA graphics fail?

The NVIDIA driver does not support the graphics component of the XFree86-DGA (Direct Graphics Access) extension. Applications can use the XDGASelectInput() function to acquire relative pointer motion, but graphics-related functions such as XDGASetMode() and XDGAOpenFramebuffer() will fail.

The graphics component of XFree86-DGA is not supported because it requires a CPU mapping of framebuffer memory. As graphics cards ship with increasing quantities of video memory, the NVIDIA X driver has had to switch to a more dynamic memory mapping scheme that is incompatible with DGA. Furthermore, DGA does not cooperate with other graphics rendering libraries such as Xlib and OpenGL because it accesses GPU resources directly.

NVIDIA recommends that applications use OpenGL or Xlib, rather than DGA, for graphics rendering. Using rendering libraries other than DGA will yield better performance and improve interoperability with other X applications.

My kernel log contains messages that are prefixed with "Xid"; what do these messages mean?

"Xid" messages indicate that a general GPU error occurred, most often due to the driver misprogramming the GPU or to corruption of the commands sent to the GPU. These messages provide diagnostic information that can be used by NVIDIA to aid in debugging reported problems.

I use the Coolbits overclocking interface to adjust my graphics card's clock frequencies, but the defaults are reset whenever X is restarted. How do I make my changes persistent?

Clock frequency settings are not saved/restored automatically by default to avoid potential stability and other problems that may be encountered if the chosen frequency settings differ from the defaults qualified by the manufacturer. You can use the command line below in ~/.xinitrc to automatically apply custom clock frequency settings when the X server is started:

    # nvidia-settings -a GPUOverclockingState=1 -a GPU2DClockFreqs=<GPU>,<MEM> -a GPU3DClockFreqs=<GPU>,<MEM>

Here <GPU> and <MEM> are the desired GPU and video memory frequencies (in MHz), respectively.

Why is the refresh rate not reported correctly by utilities that use the XRandR X extension (e.g., the GNOME "Screen Resolution Preferences" panel, `xrandr -q`, etc)?

The XRandR X extension is not presently aware of multiple display devices on a single X screen; it only sees the MetaMode bounding box, which may contain one or more actual modes. This means that if multiple MetaModes have the same bounding box, XRandR will not be able to distinguish between them.

In order to support DynamicTwinView, the NVIDIA X driver must make each MetaMode appear to be unique to XRandR. Presently, the NVIDIA X driver accomplishes this by using the refresh rate as a unique identifier.

You can use `nvidia-settings -q RefreshRate` to query the actual refresh rate on each display device.

This behavior can be disabled by setting the X configuration option "DynamicTwinView" to FALSE.

For details, see Chapter 12, Configuring Multiple Display Devices on One X Screen.

Why does starting certain applications result in Xlib error messages indicating extensions like "XFree86-VidModeExtension" or "SHAPE" are missing?

If your X config file has a Module section that does not list the "extmod" module, some X server extensions may be missing, resulting in error messages of the form:

Xlib: extension "SHAPE" missing on display ":0.0"
Xlib: extension "XFree86-VidModeExtension" missing on display ":0.0"
Xlib: extension "XFree86-DGA" missing on display ":0.0"

You can solve this problem by adding the line below to your X config file's Module section:

    Load "extmod"

Where can I find older driver versions?

Please visit ftp://download.nvidia.com/XFree86/Linux-x86/

What is the format of a PCI Bus ID?

Different tools have different formats for the PCI Bus ID of a PCI device.

The X server's "BusID" X configuration file option interprets the BusID string in the format "bus@domain:device:function" (the "@domain" portion is only needed if the PCI domain is non-zero), in decimal. More specifically,

"%d@%d:%d:%d", bus, domain, device, function

in printf(3) syntax. NVIDIA X driver logging, nvidia-xconfig, and nvidia-settings match the X configuration file BusID convention.

The lspci(8) utility, in contrast, reports the PCI BusID of a PCI device in the format "domain:bus:device.function", printing the values in hexadecimal. More specifically,

"%04x:%02x:%02x.%x", domain, bus, device, function

in printf(3) syntax. The "Bus Location" reported in the /proc/driver/nvidia/gpus/0..N/information files matches the lspci format.

How do I interpret X server version numbers?

X server version numbers can be difficult to interpret because some X.Org X servers report the versions of different things.

In 2003, X.Org created a fork of the XFree86 project's code base, which used a monolithic build system to build the X server, libraries, and applications together in one source code repository. It resumed the release version numbering where it left off in 2001, continuing with 6.7, 6.8, etc., for the releases of this large bundle of code. These version numbers are sometimes written X11R6.7, X11R6.8, etc. to include the version of the X protocol.

In 2005, an effort was made to split the monolithic code base into separate modules with their own version numbers to make them easier to maintain and so that they could be released independently. X.Org still occasionally releases these modules together, with a single version number. These releases are simply referred to as “X.Org releases”, or sometimes “katamari” releases. For example, X.Org 7.6 was released on December 20, 2010 and contains version 1.9.3 of the xorg-server package, which contains the core X server itself.

The release management changes from XFree86, to X.Org monolithic releases, to X.Org modular releases impacted the behavior of the X server's -version command line option. For example, XFree86 X servers always report the version of the XFree86 monolithic package:

XFree86 Version 4.3.0 (Red Hat Linux release: 4.3.0-2)
Release Date: 27 February 2003
X Protocol Version 11, Revision 0, Release 6.6

X servers in X.Org monolithic and early “katamari” releases did something similar:

X Window System Version 7.1.1
Release Date: 12 May 2006
X Protocol Version 11, Revision 0, Release 7.1.1

However, X.Org later modified the X server to start printing its individual module version number instead:

X.Org X Server 1.9.3
Release Date: 2010-12-13
X Protocol Version 11, Revision 0

Please keep this in mind when comparing X server versions: what looks like “version 7.x” is older than version 1.x.