Chapter 30. Configuring SLI and Multi-GPU Mosaic

The NVIDIA Linux driver contains support for NVIDIA SLI and Multi-GPU Mosaic. This technology enables you to extend a single X screen transparently across all of the available display outputs on each GPU. See below for the exact set of configurations which can be used with SLI Mosaic Mode.

The distinction between SLI and Multi-GPU is straightforward. SLI is used to leverage the processing power of GPUs across two or more graphics cards, while Multi-GPU is used to leverage the processing power of two GPUs colocated on the same graphics card. If you want to link together separate graphics cards, you should use the "SLI" X config option. Likewise, if you want to link together GPUs on the same graphics card, you should use the "MultiGPU" X config option. If you have two cards, each with two GPUs, and you wish to link them all together, you should use the "SLI" option.

If any SLI mode is enabled, applications may override which rendering mode is in use by creating an OpenGL context with the GLX_CONTEXT_MULTIGPU_ATTRIB_NV attribute. In addition, applications may gain explicit control over individual GPU rendering in SLI configurations through the GL_NV_gpu_multicast extension by creating a context with the GLX_CONTEXT_MULTIGPU_ATTRIB_MULTICAST_NV attribute. Multicast rendering in SLI Mosaic configurations requires use of the GLX_CONTEXT_MULTIGPU_ATTRIB_MULTI_DISPLAY_MULTICAST_NV attribute, which is only allowed on Quadro GPUs.

Enabling Multi-GPU

Multi-GPU is enabled by setting the "MultiGPU" option in the X configuration file; see Appendix B, X Config Options for details about the "MultiGPU" option.

The nvidia-xconfig utility can be used to set the "MultiGPU" option, rather than modifying the X configuration file by hand. For example:

    % nvidia-xconfig --multigpu=mosaic

Enabling SLI

SLI is enabled by setting the "SLI" option in the X configuration file; see Appendix B, X Config Options for details about the SLI option.

The nvidia-xconfig utility can be used to set the SLI option, rather than modifying the X configuration file by hand. For example:

    % nvidia-xconfig --sli=mosaic

Enabling SLI Mosaic Mode

The simplest way to configure SLI Mosaic Mode using a grid of monitors is to use nvidia-settings (see Chapter 24, Using the nvidia-settings Utility). The steps to perform this configuration are as follows:

  1. Connect each of the monitors you would like to use to any connector from any GPU used for SLI Mosaic Mode. If you are going to use fewer monitors than there are connectors, connect one monitor to each GPU before adding a second monitor to any GPU.

  2. Install the NVIDIA display driver set.

  3. Configure an X screen to use the "nvidia" driver on at least one of the GPUs (see Chapter 6, Configuring X for the NVIDIA Driver for more information).

  4. Start X.

  5. Run nvidia-settings. You should see a tab in the left pane of nvidia-settings labeled "SLI Mosaic Mode Settings". Note that you may need to expand the entry for the X screen you configured earlier.

  6. Check the "Use SLI Mosaic Mode" check box.

  7. Select the monitor grid configuration you'd like to use from the "display configuration" dropdown.

  8. Choose the resolution and refresh rate at which you would like to drive each individual monitor.

  9. Set any overlap you would like between the displays.

  10. Click the "Save to X Configuration File" button. NOTE: If you don't have permissions to write to your system's X configuration file, you will be prompted to choose a location to save the file. After doing so, you must copy the X configuration file into a location the X server will consider upon startup (usually /etc/X11/xorg.conf).

  11. Exit nvidia-settings and restart your X server.
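The copy described in step 10 can be sketched as a small shell function. This is only an illustration: the function name `install_xconfig` and the source path in the usage comment are hypothetical, and the destination assumes the usual /etc/X11/xorg.conf location mentioned above.

```shell
#!/bin/sh
# Sketch: install a saved X configuration file, keeping a backup of any
# existing one. Both paths are passed as arguments.
install_xconfig() {
    src=$1 dest=$2
    # Refuse to continue if the saved file does not exist.
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    # Preserve the current configuration as <dest>.backup before overwriting.
    if [ -f "$dest" ]; then
        cp -p "$dest" "${dest}.backup" || return 1
    fi
    cp "$src" "$dest"
}

# Example (hypothetical source path; typically needs root privileges):
# install_xconfig "$HOME/xorg.conf.new" /etc/X11/xorg.conf
```

Keeping a backup makes it easy to restore the previous configuration if the X server fails to start with the new file.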

Alternatively, nvidia-xconfig can be used to configure SLI Mosaic Mode via a command like nvidia-xconfig --sli=Mosaic --metamodes=METAMODES where the METAMODES string specifies the desired grid configuration. For example:

    % nvidia-xconfig --sli=Mosaic --metamodes="GPU-0.DFP-0: 1920x1024+0+0, GPU-0.DFP-1: 1920x1024+1920+0, GPU-1.DFP-0: 1920x1024+0+1024, GPU-1.DFP-1: 1920x1024+1920+1024"

will configure four DFPs in a 2x2 configuration, each running at 1920x1024, with the two DFPs on GPU-0 driving the top two monitors of the 2x2 configuration, and the two DFPs on GPU-1 driving the bottom two monitors of the 2x2 configuration.
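The offsets in a METAMODES string follow directly from the grid position of each monitor: column times width, and row times height. As a rough sketch, the string for a regular grid could be generated as shown below. The function name `mosaic_metamodes` is hypothetical, and the layout is an assumption: each GPU drives one row of the grid, with its displays named DFP-0, DFP-1, and so on from left to right. Actual display device names on a given system may differ.

```shell
#!/bin/sh
# Sketch: build a MetaModes string for a grid of identical monitors.
# Assumption: GPU-<row> drives row <row> of the grid, and its displays are
# named DFP-0, DFP-1, ... from left to right.
mosaic_metamodes() {
    width=$1 height=$2 cols=$3 rows=$4
    out=""
    row=0
    while [ "$row" -lt "$rows" ]; do
        col=0
        while [ "$col" -lt "$cols" ]; do
            # Offset of this monitor within the combined desktop.
            x=$((col * width))
            y=$((row * height))
            entry="GPU-${row}.DFP-${col}: ${width}x${height}+${x}+${y}"
            if [ -n "$out" ]; then
                out="${out}, ${entry}"
            else
                out="$entry"
            fi
            col=$((col + 1))
        done
        row=$((row + 1))
    done
    printf '%s\n' "$out"
}

# Prints the same 2x2 layout as the nvidia-xconfig example above.
mosaic_metamodes 1920 1024 2 2
```

The result can then be passed on the command line, for example with --metamodes="$(mosaic_metamodes 1920 1024 2 2)".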

For a detailed description of the MetaModes X configuration option, see Chapter 12, Configuring Multiple Display Devices on One X Screen. See Appendix C, Display Device Names for further details on GPU and display device names.

Hardware requirements

SLI functionality requires:

Other Notes and Requirements

The following other requirements apply to SLI and Multi-GPU:

30.1. Frequently Asked SLI and Multi-GPU Questions

Why does SLI or MultiGPU fail to initialize?

There are several reasons why SLI or MultiGPU may fail to initialize. Most of these should be clear from the warning message in the X log file; e.g.:

  • Unsupported bus type

  • The video link was not detected

  • GPUs do not match

  • Unsupported GPU video BIOS

  • Insufficient PCIe link width

The warning message 'Unsupported PCI topology' is likely due to problems with your Linux kernel. The NVIDIA driver must have access to the PCI Bridge (often called the Root Bridge) that each NVIDIA GPU is connected to in order to configure SLI or MultiGPU correctly. Many kernels do not properly recognize this bridge and, as a result, do not allow the NVIDIA driver to access it. See the "How can I determine if my kernel correctly detects my PCI Bridge?" FAQ below for details.
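To find these messages quickly, the X log file can be filtered for warnings and errors reported by the NVIDIA driver. The sketch below assumes the usual X.Org "(WW)" and "(EE)" line markers. The function name `nvidia_log_problems` is hypothetical, and the log path in the usage comment is a common default that may differ on your system.

```shell
#!/bin/sh
# Sketch: read an X log on stdin and print only the warning (WW) and
# error (EE) lines reported by the NVIDIA driver.
nvidia_log_problems() {
    grep -E '\((WW|EE)\) NVIDIA'
}

# Example (assumed log location; adjust for your system):
# nvidia_log_problems < /var/log/Xorg.0.log
```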

Below are some specific troubleshooting steps to help deal with SLI and MultiGPU initialization failures.

  • Make sure that ACPI is enabled in your kernel. NVIDIA's experience has been that ACPI is needed for the kernel to correctly recognize the Root Bridge. Note that in some cases, the kernel's version of ACPI may still have problems and require an update to a newer kernel.

  • Run lspci to check that multiple NVIDIA GPUs can be identified by the operating system; e.g.:

        % /sbin/lspci | grep -i nvidia
    

    If lspci does not report all the GPUs that are in your system, then this is a problem with your Linux kernel, and it is recommended that you use a different kernel.

    Please note: the lspci utility may be installed in a location other than /sbin on your system. If the above command fails with the error '/sbin/lspci: No such file or directory', try the following instead:

        % lspci | grep -i nvidia

    You may also need to install your distribution's pciutils package.

  • Make sure you have the most recent SBIOS available for your motherboard.

  • The PCI Express slots on the motherboard must provide a minimum link width. Please make sure that the PCI Express slot(s) on your motherboard meet the following requirements and that you have connected the graphics board to the correct PCI Express slot(s):

    • A dual-GPU board needs a minimum of 8 lanes (i.e. x8 or x16)

    • A pair of single-GPU boards requires one of the following supported link width combinations:

      • x16 + x16

      • x16 + x8

      • x16 + x4

      • x8 + x8
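The link width combinations listed above can be checked mechanically. The following is a minimal sketch: `pair_width_supported` encodes only the four combinations from the list, and `link_width` assumes that the kernel exposes the negotiated width in the sysfs file current_link_width. Both function names and the PCI addresses in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: check a pair of PCIe link widths against the supported
# combinations for a pair of single-GPU boards. Widths are plain lane
# counts (16, 8, 4); the order of the arguments does not matter.
pair_width_supported() {
    a=$1 b=$2
    # Put the wider link first so each combination is matched once.
    if [ "$a" -lt "$b" ]; then
        t=$a; a=$b; b=$t
    fi
    case "${a}+${b}" in
        16+16|16+8|16+4|8+8) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: sysfs reports the negotiated width of a PCI device in
# /sys/bus/pci/devices/<address>/current_link_width.
link_width() {
    cat "/sys/bus/pci/devices/$1/current_link_width"
}

# Example with hypothetical PCI addresses (take real ones from lspci -D):
# pair_width_supported "$(link_width 0000:0a:00.0)" "$(link_width 0000:81:00.0)" \
#     && echo "link widths OK" || echo "unsupported link width combination"
```

Note that this reports the width the link is currently running at, which may be lower than what the slot physically provides.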

How can I determine if my kernel correctly detects my PCI Bridge?

As discussed above, the NVIDIA driver must have access to the PCI Bridge that each NVIDIA GPU is connected to in order to configure SLI or MultiGPU correctly. The following steps will identify whether the kernel correctly recognizes the PCI Bridge:

  • Identify both NVIDIA GPUs:

        % /sbin/lspci | grep -i vga
    
        0a:00.0 VGA compatible controller: nVidia Corporation [...]
        81:00.0 VGA compatible controller: nVidia Corporation [...]
    

  • Verify that each GPU is connected to a bus connected to the Root Bridge (note that the GPUs in the above example are on buses 0a and 81):

        % /sbin/lspci -t
    

    good:

        -+-[0000:80]-+-00.0
         |           +-01.0
         |           \-0e.0-[0000:81]----00.0
        ...
         \-[0000:00]-+-00.0
                     +-01.0
                     +-01.1
                     +-0e.0-[0000:0a]----00.0
    

    bad:

        -+-[0000:81]---00.0
        ...
         \-[0000:00]-+-00.0
                     +-01.0
                     +-01.1
                     +-0e.0-[0000:0a]----00.0
    

    Note that in the first example, bus 81 is connected to Root Bridge 80, whereas in the second example there is no Root Bridge 80 and bus 81 is incorrectly connected at the base of the device tree. In the bad case, the only solution is to upgrade your kernel to one that properly detects your PCI bus layout.
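The difference between the "good" and "bad" trees can also be checked with a short script. This is a heuristic sketch only: it assumes the lspci -t output is formatted like the examples above, where a bus behind a bridge appears directly after the bridge's device.function number (for example 0e.0-[0000:81]). The function name `bus_behind_bridge` is hypothetical.

```shell
#!/bin/sh
# Sketch: read "lspci -t" output on stdin and succeed if the given bus
# hangs off a bridge device, or fail if it only appears at the base of
# the tree (or not at all).
# Assumption: the bus is printed as [0000:81] or [81], optionally as the
# start of a range such as [81-82], right after "<device>.<function>-".
bus_behind_bridge() {
    bus=$1
    grep -Eq "[0-9a-f]{2}\.[0-7]-\[([0-9a-f]{4}:)?${bus}(-[0-9a-f]{2})?\]"
}

# Example, using the bus numbers from the listing above:
# /sbin/lspci -t | bus_behind_bridge 81 && echo "good" || echo "bad"
```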