NVIDIA nForce Linux: Known Problems

 

·           2.4.x kernel hangs or some devices are not available

There are bugs in the 2.4.x kernel's ACPI support that may affect installation and/or OS boot. If you are having problems, one of the first troubleshooting steps should be to disable ACPI. This can be done with the 'acpi=off' boot line option or by modifying the BIOS settings.

If disabling ACPI fixes the boot but you wish to run with ACPI enabled, consult the other entries in this list, which document known ACPI-related problems and their workarounds.
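
When testing with and without ACPI, it helps to confirm which options the running kernel was actually booted with. The following is a minimal sketch, not part of any NVIDIA or kernel tool; the function name cmdline_has_option is hypothetical. It checks whether a kernel command line string contains an option such as 'acpi=off' as a complete, whitespace-separated token.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if the whitespace-separated kernel command line `cmdline`
 * contains `option` as a complete token (for example "acpi=off").
 * Hypothetical helper for illustration only. */
bool cmdline_has_option(const char *cmdline, const char *option)
{
    assert(cmdline != NULL && option != NULL);

    size_t optlen = strlen(option);
    if (optlen == 0)
        return false;

    const char *p = cmdline;
    while (*p != '\0') {
        /* Skip the separators in front of the next token. */
        while (*p == ' ' || *p == '\t' || *p == '\n')
            p++;

        /* Measure the token and compare it with the wanted option. */
        size_t toklen = strcspn(p, " \t\n");
        if (toklen == optlen && strncmp(p, option, optlen) == 0)
            return true;

        p += toklen;
    }
    return false;
}
```

On a running Linux system the command line can be read from /proc/cmdline and passed to this function. Matching whole tokens avoids false positives from longer options that merely contain the same text.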

·           2.4.x kernel on AMD64 system hangs during boot when ACPI is enabled

If prior to the hang the kernel console boot trace indicates that device interrupts are being steered to IRQ0, then the hang may be due to a bug in the x86_64 kernel's handling of ACPI interrupt steering. A kernel patch has been accepted to fix this bug, but it has not yet been picked up by all distributions. In this case you can work around the problem by disabling ACPI (in the BIOS, or by using the 'acpi=off' boot line option), or you can manually patch the kernel.

The patch modifies the function mp_parse_prt() in arch/x86_64/kernel/mpparse.c. You can patch this file by hand by commenting out the line:

irq = entry->link.index;

that immediately precedes the comment:

/* Dont set up the ACPI SCI because it's already up */

·           Network and other devices randomly stop working when ACPI is enabled

This problem may be caused by an incorrect ACPI table entry that leaves the timer interrupt misconfigured.

If the kernel console boot trace (viewable using dmesg) contains messages such as these:

..MP-BIOS bug: 8254 timer not connected to IOAPIC
...trying to set up timer (IRQ0) through the 8259A ... failed.
...trying to set up timer as Virtual Wire IRQ... failed.
...trying to set up timer as ExtINT IRQ... works.

then the incorrect ACPI table entry is present. On 2.6 kernels, this can be worked around by specifying the 'acpi_skip_timer_override' boot line option. An alternative workaround is to disable ACPI in the BIOS or by using the 'acpi=off' boot line option.
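
The check described above can be sketched in C as follows. This is an illustration only: the function names are hypothetical, the boot log is assumed to have been captured as a single string (for example from dmesg output), and the caller is assumed to know whether the kernel is a 2.6 kernel. The message text and the two boot options are the ones quoted in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* First line of the boot trace quoted in this section. */
static const char TIMER_BUG_MSG[] =
    "MP-BIOS bug: 8254 timer not connected to IOAPIC";

/* Return true if the captured boot log contains the timer message. */
bool boot_log_shows_timer_bug(const char *boot_log)
{
    assert(boot_log != NULL);
    return strstr(boot_log, TIMER_BUG_MSG) != NULL;
}

/* Return the boot line option suggested by this section, or NULL if
 * the log does not show the problem. `kernel_is_2_6` is supplied by
 * the caller; this sketch does not detect the kernel version. */
const char *suggest_boot_option(const char *boot_log, bool kernel_is_2_6)
{
    if (!boot_log_shows_timer_bug(boot_log))
        return NULL;
    return kernel_is_2_6 ? "acpi_skip_timer_override" : "acpi=off";
}
```

The sketch only matches the first message of the trace, since the remaining lines describe the kernel's fallback attempts rather than the fault itself.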

·           2.6.x kernel unbootable on some systems

There is a bug in the 2.6.x kernel MP table handling that prevents install and OS boot on some systems. At the time of writing, the only nForce systems known to trigger this kernel bug are nForce4 MP systems.

This bug causes memory corruption upon detection of any PCI bus numbered higher than 32, and consequently renders the system unusable very early in the install or boot process. On some systems, booting with ACPI enabled avoids the problem.

There are currently no other known workarounds for this problem, but a kernel patch for it has been accepted and is expected to be included in future distribution releases.
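
To illustrate how a high bus number can lead to memory corruption, the sketch below assumes a fixed-size table indexed by bus number. This mechanism is an assumption made for illustration and is not confirmed by this document; the names HYPOTHETICAL_MAX_BUS, bus_type, record_bus_type and lookup_bus_type are invented and do not refer to actual kernel symbols. The point is the bounds check: without it, a store for a bus numbered higher than the table allows would land outside the table.

```c
#include <assert.h>
#include <stdbool.h>

/* Highest bus number the table can hold (assumed value for this sketch). */
#define HYPOTHETICAL_MAX_BUS 32

/* One slot per bus number, 0 through HYPOTHETICAL_MAX_BUS. */
static int bus_type[HYPOTHETICAL_MAX_BUS + 1];

/* Record the type of a bus. Bus numbers that do not fit in the table
 * are rejected instead of being written past the end of bus_type[]. */
bool record_bus_type(unsigned int bus, int type)
{
    if (bus > HYPOTHETICAL_MAX_BUS)
        return false;
    bus_type[bus] = type;
    return true;
}

/* Look up a previously recorded bus type; the caller must pass a bus
 * number that fits in the table. */
int lookup_bus_type(unsigned int bus)
{
    assert(bus <= HYPOTHETICAL_MAX_BUS);
    return bus_type[bus];
}
```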

·           Older distributions missing nForce3/4 storage controller device IDs

Some older distributions do not have nForce3/4 IDE/SATA device IDs. This has the following consequences:


Red Hat Enterprise Linux 3, updates 2 and 3

32-bit:

ftp://download.nvidia.com/linux/nforce/installdriverdisk/rhel3/ia32/nvdriverdisk.tar.gz

64-bit:

ftp://download.nvidia.com/linux/nforce/installdriverdisk/rhel3/x86_64/nvdriverdisk.tar.gz


SuSE Linux Enterprise Server version 8, with Service Pack 3

32-bit:

ftp://download.nvidia.com/linux/nforce/installdriverdisk/sles8/ia32/nvdriverdisk.suse.tgz

To patch the driver by hand, two tables in drivers/ide/pci/amd74xx.c need modification. The first table is an array of struct amd_ide_chip called amd_ide_chips. Each entry is of the following form:

{ PCI_DEVICE_ID_XXXXXXXXXXX, 0xXX, AMD_UDMA_100 },

If any of the following device IDs are missing from that table:

    PCI_DEVICE_ID_NVIDIA_NFORCE3_IDE
    PCI_DEVICE_ID_NVIDIA_NFORCE3S_IDE
    PCI_DEVICE_ID_NVIDIA_NFORCE3S_SATA
    PCI_DEVICE_ID_NVIDIA_NFORCE3S_SATA2
 
    PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_IDE
    PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_SATA
    PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_SATA2
 
    PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_IDE
    PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_SATA
    PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_SATA2
 

then the amd74xx.c driver does not support the nForce3/4 IDE and SATA
controllers. To add support, make the following changes:

Step 1: Define PCI device ID macros.

Immediately before the amd_ide_chips table, add the following lines:

    #define PCI_DEVICE_ID_NVIDIA_NFORCE3_IDE        0x00d5
    #define PCI_DEVICE_ID_NVIDIA_NFORCE3S_IDE       0x00e5
    #define PCI_DEVICE_ID_NVIDIA_NFORCE3S_SATA      0x00e3
    #define PCI_DEVICE_ID_NVIDIA_NFORCE3S_SATA2     0x00ee
 
    #define PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_IDE   0x0053
    #define PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_SATA  0x0054
    #define PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_SATA2 0x0055
 
    #define PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_IDE   0x0035
    #define PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_SATA  0x0036
    #define PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_SATA2 0x003e
 

Step 2: Add entries to the end of the amd_ide_chips table (but before the terminating entry "{ 0 },").

    { PCI_DEVICE_ID_NVIDIA_NFORCE3_IDE, 0x50, AMD_UDMA_133 },
    { PCI_DEVICE_ID_NVIDIA_NFORCE3S_IDE, 0x50, AMD_UDMA_133 },
    { PCI_DEVICE_ID_NVIDIA_NFORCE3S_SATA, 0x50, AMD_UDMA_133 },
    { PCI_DEVICE_ID_NVIDIA_NFORCE3S_SATA2, 0x50, AMD_UDMA_133 },
    { PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_IDE, 0x50, AMD_UDMA_133 },
    { PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_SATA, 0x50, AMD_UDMA_133 },
    { PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_SATA2, 0x50, AMD_UDMA_133 },
    { PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_IDE, 0x50, AMD_UDMA_133 },
    { PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_SATA, 0x50, AMD_UDMA_133 },
    { PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_SATA2, 0x50, AMD_UDMA_133 },

Step 3: Add entries to the end of the amd74xx_pci_tbl table (but before the terminating entry "{ 0, },").

The second table is an array of struct pci_device_id called amd74xx_pci_tbl. Add the following entries to it for nForce3/nForce4 support:

    { PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE3_IDE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 9 },
    { PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE3S_IDE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 10 },
    { PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE3S_SATA, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 11 },
    { PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE3S_SATA2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 12 },
    { PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_IDE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 13 },
    { PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_SATA, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 14 },
    { PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_CK804_SATA2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 15 },
    { PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_IDE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 16 },
    { PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_SATA, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 17 },
    { PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE_MCP04_SATA2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 18 },

The number at the end of each entry (9 through 18) is the zero-based index of the corresponding entry in the amd_ide_chips table. For example, entry 9 of amd_ide_chips (counting from 0) should be the PCI_DEVICE_ID_NVIDIA_NFORCE3_IDE entry. If it is not, adjust the numbers in the amd74xx_pci_tbl entries accordingly.
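
The index correspondence between the two tables can be checked mechanically before rebuilding. The sketch below is a standalone illustration, not kernel code: struct chip_entry and struct pci_entry are simplified, hypothetical stand-ins for struct amd_ide_chip and struct pci_device_id, the function first_mismatch is invented for this example, and the example tables use a made-up placeholder entry alongside two of the device IDs from Step 1.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields needed for the index check are modeled here). */
struct chip_entry { unsigned short id; unsigned char base; int flags; };
struct pci_entry  { unsigned short device; unsigned long driver_data; };

/* Return the index of the first pci_tbl entry whose trailing number
 * does not select a chips[] entry with the same device ID, or -1 if
 * every entry matches. */
int first_mismatch(const struct chip_entry *chips, size_t n_chips,
                   const struct pci_entry *pci_tbl, size_t n_pci)
{
    assert(chips != NULL && pci_tbl != NULL);

    for (size_t i = 0; i < n_pci; i++) {
        unsigned long idx = pci_tbl[i].driver_data;
        if (idx >= n_chips || chips[idx].id != pci_tbl[i].device)
            return (int)i;
    }
    return -1;
}

/* Example data: one placeholder entry, then two IDs from Step 1. */
const struct chip_entry example_chips[] = {
    { 0x1111, 0x40, 0 },  /* placeholder for an existing entry */
    { 0x00d5, 0x50, 0 },  /* PCI_DEVICE_ID_NVIDIA_NFORCE3_IDE */
    { 0x00e5, 0x50, 0 },  /* PCI_DEVICE_ID_NVIDIA_NFORCE3S_IDE */
};

/* Indices agree with example_chips. */
const struct pci_entry example_pci_good[] = {
    { 0x00d5, 1 },
    { 0x00e5, 2 },
};

/* Indices are off by one, as happens when the chip table is shorter
 * or longer than expected. */
const struct pci_entry example_pci_shifted[] = {
    { 0x00d5, 2 },
    { 0x00e5, 3 },
};
```

With the example data, first_mismatch returns -1 for example_pci_good and 0 for example_pci_shifted, which is the situation in which the trailing numbers in amd74xx_pci_tbl need to be adjusted.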

Step 4: Rebuild the kernel.