Nvidia Driver Failure on GeForce GTX 1050 Ti #44284

Closed
galaxite opened this issue Aug 1, 2018 · 20 comments

Comments


galaxite commented Aug 1, 2018

Issue description

Attempting to start X on a Dell XPS 9570 laptop with an Nvidia GeForce GTX 1050 Ti graphics card fails despite multiple attempts to fix the issue. In the services.xserver.videoDrivers section of configuration.nix, I have tried loading just nvidia, just intel, and both nvidia and intel. All of these resulted in screen flickering with no X server. I have also tried both the Bumblebee and the official Nvidia PRIME solutions to this problem described here in the wiki. These resulted in the system hanging when it attempts to start the display manager, still without seeming to start the X server. I attempted to use the fix described by ambrop72 in issue #24711, but trying to rebuild NixOS resulted in an error saying that hardware.nvidia wasn't found.

Additionally, my system won't shut down because mdadm-shutdown.service hangs, and my snooping has led me to believe that this is also because the graphics card can't be shut off. I've attached my configuration.nix and hardware-configuration.nix in case they are helpful; you can see some of my failed attempts at fixing the problem commented out in the X server section.
configuration.nix.txt
hardware-configuration.nix.txt

Steps to reproduce

Attempt to start X on NixOS using a Dell XPS 9570 laptop.

Technical details

  • system: x86_64-linux
  • host os: Linux 4.14.59, NixOS, 18.09pre147696.7c58523 (Jellyfish)
  • multi-user?: no
  • sandbox: yes
  • version: nix-env (Nix) 2.0.4
  • channels(root): nixos-18.09pre147696.7c58523
  • nixpkgs: /nix/var/nix/profiles/per-user/root/channels/nixos
Member

7c6f434c commented Aug 1, 2018 via email

Member

kalbasit commented Aug 1, 2018

@galaxite I bought the same XPS last month and I decided to return it. I got most of it to work properly, but I was getting a panic during resume after suspend, and that's a deal breaker for me. I was also having issues when connecting my wide screen.

Anyway, you can find my /etc/nixos here. Make sure you are running the latest kernel.

IIRC the key settings that made it work were:

boot.kernelPackages = pkgs.linuxPackages_latest;
services.xserver.videoDrivers = ["modesetting"];
boot.blacklistedKernelModules = ["nouveau"];
i18n.consoleFont = "latarcyrheb-sun32";  # will work without it, but it will make your life easier

EDIT: Oh, as for the shutdown issue, I was able to track it down to the X server refusing to die: after the screen goes black, it just stays stuck there. You can test this by shutting down the system without ever having started X11.

Author

galaxite commented Aug 1, 2018

@kalbasit That worked! Thank you so much! I just tried suspending and resuming and didn't get a kernel panic, but that may change eventually. Will those settings have any effect on the functionality of my laptop besides disabling the nvidia graphics card?

Member

vcunat commented Aug 1, 2018

IIRC Fedora and maybe some other mainstream distros prefer the "modesetting" driver for Intel cards, so I believe it should work just fine.

Member

eadwu commented Aug 1, 2018

For the shutdown issue, you can try acpi_rev_override=5 as a kernel parameter. If I remember correctly, I got it from an Arch thread, and another source said that 5 fixed some power management problems. I have never had any major problems with NixOS on my 9570, and suspend and resume also work fine on mine.
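
On NixOS, kernel parameters are normally set through the boot.kernelParams option. After a rebuild and reboot, one way to confirm that the parameter actually reached the running kernel is to look at /proc/cmdline. A minimal sketch; the helper name has_kernel_param and its optional file argument are made up for illustration and are not part of any tool mentioned in this thread:

```shell
# has_kernel_param PARAM [CMDLINE_FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# CMDLINE_FILE defaults to /proc/cmdline; the override exists only so the
# helper can be tried against a sample file.
has_kernel_param() {
  [ -r "${2:-/proc/cmdline}" ] || return 1
  # Put every whitespace-separated word on its own line, then look for an
  # exact, fixed-string, whole-line match.
  tr -s ' \t' '\n\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

if has_kernel_param "acpi_rev_override=5"; then
  echo "acpi_rev_override=5 is active"
else
  echo "acpi_rev_override=5 is not on the kernel command line"
fi
```

Matching whole words avoids a false positive when only a prefix of the parameter is present.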

Member

kalbasit commented Aug 1, 2018

@galaxite It should work with no issues. I have not tried acpi_rev_override=5, but it might actually work for you. However, since you are fully disabling the Nvidia card, maybe you should consider another laptop that comes without one. I went with a Precision 7530 with an Intel card, 32 GB of RAM and a six-core Xeon and never looked back; it's heavier, but it's worth it.

Author

galaxite commented Aug 1, 2018

@eadwu Did you manage to get the nvidia card working on your laptop as well? If so, how?

@kalbasit I admit that I'm kind of regretting choosing a laptop that I can't use the graphics card on, but I saved up for a long time to get a new laptop and I'm not sure returning it would be easy.

Member

kalbasit commented Aug 1, 2018

@galaxite it depends. If you got it from Dell, you can return it within 30 days at no charge.

galaxite changed the title from "Nvidia Driver Failure" to "Nvidia Driver Failure on GeForce GTX 1050 Ti" on Aug 1, 2018

Member

eadwu commented Aug 1, 2018

@galaxite Yes, through ambrop72's solution, though not on my current kernel (4.18.0-rc7). If I remember correctly, at the time I was using boot.kernelPackages = pkgs.linuxPackages_latest, and my relevant PRIME config was:

{ config, pkgs, ... }:

{
  hardware = {
    nvidia = {
      modesetting = {
        enable = true;
      };

      optimus_prime = {
        enable = true;
        # values are from lspci
        # try lspci | grep -P 'VGA|3D'
        intelBusId = "PCI:0:2:0";
        nvidiaBusId = "PCI:1:0:0";
      };
    };
  };

  services = {
    xserver = {
      videoDrivers = [
        "nvidiaBeta" # nvidia should work fine as well
      ];
    };
  };
}
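
A note on the bus IDs: lspci prints addresses in hexadecimal (for example 01:00.0), while PCI:bus:device:function strings like the ones above are, as far as I know, expected in decimal. The two agree for small values such as these, but they diverge from bus 0a upward. A rough sketch of the conversion, assuming the short bus:device.function form without a PCI domain prefix; the function name to_xorg_busid is hypothetical:

```shell
# to_xorg_busid ADDR
# Converts an lspci address such as 01:00.0 into the PCI:1:0:0 form.
# The body runs in a subshell so the helper variables stay local.
to_xorg_busid() (
  addr=$1
  bus=${addr%%:*}    # text before the first colon
  rest=${addr#*:}    # device.function
  dev=${rest%%.*}
  fn=${rest#*.}
  # A 0x prefix makes printf read each field as hexadecimal.
  printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
)

to_xorg_busid 01:00.0   # prints PCI:1:0:0
to_xorg_busid 3b:00.0   # prints PCI:59:0:0
```

The first field of each matching lspci line (for instance from lspci | grep -E 'VGA|3D') is the address to pass in.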

I just tried it and it works fine on 4.17.11. Normally I keep it off unless I actually need to use it, since it averages around 30 W. nvidia-smi output:

Wed Aug  1 19:57:31 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.67                 Driver Version: 390.67                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   48C    P3    N/A /  N/A |    447MiB /  4042MiB |     25%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1511      G   ...3wga9525dcxby4-xorg-server-1.19.6/bin/X   411MiB |
|    0      1846      G   ...cnqymla-compton-0.1_beta2.5/bin/compton    35MiB |
+-----------------------------------------------------------------------------+

Author

galaxite commented Aug 2, 2018

@eadwu Do you know what kernels ambrop's solution works on? After implementing the boot.kernelPackages = pkgs.linuxPackages_latest fix above I'm currently using 4.17.11. (As a side note, how are you using 4.18.0-rc7?)

Member

eadwu commented Aug 2, 2018

I don't know the specific kernel versions, since I have only ever used boot.kernelPackages = pkgs.linuxPackages_latest and kernelPackages = pkgs.linuxPackages_testing (for 4.18.0-rc7) on my 9570. I think that when I tried the default kernelPackages in the beginning, I never got it to boot successfully until I added the i915.alpha_support kernel param, but then I switched to latest since it was easier to work with. As for the kernel versions I have tested it on, judging by my git history it should be roughly 4.17.6 and later, excluding 4.18.0 and up.

Author

galaxite commented Aug 2, 2018

@eadwu I made the changes, and I was given the error: The option hardware.nvidia defined in /etc/nixos/configuration.nix does not exist. when executing nixos-rebuild switch. Is that normal? I was able to use what I assume is the new system. Is there any way I can check if the card is running? I tried typing nvidia-smi into the terminal, but I was told the nvidia command doesn't exist.
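
When nvidia-smi is not installed, one driver-independent way to approach the "is the card running" question is to check whether any nvidia kernel module is loaded; /proc/modules lists one module per line with the module name in the first column. A minimal sketch (the function name nvidia_modules and its optional file argument are illustrative only):

```shell
# nvidia_modules [MODULES_FILE]
# Prints the names of loaded modules called nvidia or nvidia_*.
# MODULES_FILE defaults to /proc/modules.
nvidia_modules() {
  awk '$1 ~ /^nvidia(_|$)/ { print $1 }' "${1:-/proc/modules}" 2>/dev/null
}

loaded=$(nvidia_modules)
if [ -n "$loaded" ]; then
  echo "Nvidia kernel modules loaded:"
  echo "$loaded"
else
  echo "no Nvidia kernel module is loaded"
fi
```

This only shows whether the proprietary driver is loaded; it says nothing about whether the card itself is powered.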

Member

eadwu commented Aug 2, 2018

ambrop72's solution hasn't been merged into nixpkgs yet (#42846), so you need a local nixpkgs checkout to use it. You then reference it through nixos-rebuild -I nixpkgs=PATH_TO_NIXPKGS. Since I normally don't use the channels anymore, I made an alias for it, nixos-rebuild-local:

which nixos-rebuild-local
nixos-rebuild-local: aliased to nixos-rebuild -I nixpkgs=/home/xxxx/Downloads/nixpkgs

Clone the repo (EDIT: easier instructions?):

git clone https://github.com/NixOS/nixpkgs-channels.git
cd nixpkgs-channels
git checkout CHANNEL
git remote add upstream https://github.com/NixOS/nixpkgs.git
git remote -v
git fetch upstream pull/42846/head
git cherry-pick f26153754a1b6ac0d72adde9c75e1473463b4dbb
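
Before rebuilding, it may be worth confirming that the checkout really contains the new option, since a failed cherry-pick would otherwise only show up later as the "does not exist" error. A rough sketch that simply searches the NixOS module tree for the option name; the function name has_prime_option is hypothetical, and the sketch assumes the PR declares optimus_prime somewhere under nixos/modules:

```shell
# has_prime_option CHECKOUT_DIR
# Succeeds if any file under CHECKOUT_DIR/nixos/modules mentions optimus_prime.
has_prime_option() (
  grep -rqF 'optimus_prime' "$1/nixos/modules" 2>/dev/null
)

# Run from inside the checkout, or pass its path as the first argument.
if has_prime_option "${1:-.}"; then
  echo "optimus_prime option found in checkout"
else
  echo "optimus_prime option not found; the cherry-pick may not have applied"
fi
```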

Author

galaxite commented Aug 2, 2018

@eadwu I see, will that affect my ability to do other system upgrades or install other packages?

Member

eadwu commented Aug 2, 2018

It shouldn't impact much. If you want to update to the latest channel version, just run git rebase COMMIT_SHA after git fetch upstream master, and then rerun nixos-rebuild -I nixpkgs=PATH_TO_NIXPKGS. I look up the SHAs here: https://howoldis.herokuapp.com/.

Author

galaxite commented Aug 2, 2018

@eadwu I modified the commands you used a bit (because using them didn't work for me and also because the nixos github pages recommend something else), and it did not work. I'm not very experienced with the git workflow, so I'm just going to list out the commands I used:

git clone git@github.com:NixOS/nixpkgs.git
cd nixpkgs
git remote add channels git://github.com/NixOS/nixpkgs-channels.git
git remote update channels
git checkout channels/nixos-unstable
git remote add upstream https://github.com/NixOS/nixpkgs.git
git fetch upstream pull/42846/head
git cherry-pick f26153754a1b6ac0d72adde9c75e1473463b4dbb
cd
sudo nixos-rebuild -I /home/[my name]/nixpkgs switch

Attempting to edit my configuration.nix file to use ambrop's solution after that and rebuilding in the same way as above still doesn't work and returns the hardware.nvidia defined in /etc/nixos/configuration.nix does not exist error.

Member

eadwu commented Aug 2, 2018

Try sudo nixos-rebuild -I nixpkgs=/home/[my name]/nixpkgs switch. With that method you don't need to add upstream, since the origin remote already points to the same URL. So something like this:

git clone git@github.com:NixOS/nixpkgs.git
cd nixpkgs
git remote add channels git://github.com/NixOS/nixpkgs-channels.git
git remote update channels
git checkout channels/nixos-unstable
git fetch origin pull/42846/head
git cherry-pick f26153754a1b6ac0d72adde9c75e1473463b4dbb
cd
sudo nixos-rebuild -I nixpkgs=/home/[my name]/nixpkgs switch

To update your repository later, fetch and then rebase whenever the channel has moved:

cd nixpkgs
git fetch channels nixos-unstable
git rebase channels/nixos-unstable

Using Nvidia Prime requires a reboot as well if I'm not mistaken.
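
On why the nixpkgs= prefix matters: as I understand the Nix search-path rules, an entry of the form name=path maps <name> directly to that path, while a bare path is treated as a directory in which <name> is looked up as a child. A bare -I /home/[my name]/nixpkgs would therefore point <nixpkgs> at /home/[my name]/nixpkgs/nixpkgs, which does not exist, so the channel is used instead. A toy model of that rule, not Nix's actual implementation; candidate_for is a hypothetical helper:

```shell
# candidate_for NAME ENTRY
# Prints the path that one search-path entry proposes for <NAME>,
# or nothing if the entry is a prefixed entry for a different name.
candidate_for() (
  name=$1
  entry=$2
  case $entry in
    "$name"=*) printf '%s\n' "${entry#*=}" ;;   # name=path: use path directly
    *=*)       ;;                                # other=path: no candidate
    *)         printf '%s\n' "$entry/$name" ;;   # bare dir: look for dir/name
  esac
)

candidate_for nixpkgs "/home/user/nixpkgs"           # prints /home/user/nixpkgs/nixpkgs
candidate_for nixpkgs "nixpkgs=/home/user/nixpkgs"   # prints /home/user/nixpkgs
```

On a real system, nix-instantiate --find-file nixpkgs with the same -I argument should show which path Nix actually resolves.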

Author

galaxite commented Aug 3, 2018

@eadwu That worked perfectly! Just one last question before I close the issue: installing individual packages with this kind of local setup would entail something like nix-env -f /home/[my name]/nixpkgs -iA nixos.[package], right?

Member

eadwu commented Aug 3, 2018

nix-env -f /home/[my name]/nixpkgs -iA [package] would use the local nixpkgs. But since the only change you made to your nixpkgs is merging the PR, it doesn't really matter whether or not you use -f, because the PR didn't add any new packages that would be useful to you. If you want to be consistent, just alias nix-env and nixos-rebuild to nix-env -f /home/[my name]/nixpkgs and nixos-rebuild -I nixpkgs=/home/[my name]/nixpkgs. Upgrading will require you to rebase the local repository manually, though.
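
The two aliases could also be written as shell functions, so that extra arguments pass through unchanged. A sketch, assuming the checkout lives at $HOME/nixpkgs; the NIXPKGS_LOCAL variable, the function names, and the DRY_RUN switch are all illustrative and not something Nix provides:

```shell
# Location of the local checkout; override by exporting NIXPKGS_LOCAL.
NIXPKGS_LOCAL=${NIXPKGS_LOCAL:-$HOME/nixpkgs}

# run CMD [ARGS...]
# With DRY_RUN=1, print the command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# nixos-rebuild against the local checkout (needs root when not a dry run).
nixos_rebuild_local() {
  run nixos-rebuild -I "nixpkgs=$NIXPKGS_LOCAL" "$@"
}

# nix-env against the local checkout.
nix_env_local() {
  run nix-env -f "$NIXPKGS_LOCAL" "$@"
}

# Show what would be executed without touching the system.
DRY_RUN=1
nixos_rebuild_local switch
nix_env_local -iA hello
```

Unsetting DRY_RUN makes the same functions run the real commands.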

galaxite closed this as completed on Aug 3, 2018

Member

eadwu commented Aug 11, 2018

So I started to mess around with Bumblebee again (since Nvidia PRIME had a noticeable delay for me compared to the iGPU) and got it to "work". I consider that a win, since it uses about 5 W less than Nvidia PRIME: roughly 7-8 W at idle, as measured (cough, approximated) with powertop. The approach I used is outlined in the second option in the specific comment here. I can't seem to disable the graphics card on boot, even though lsmod shows none of the Nvidia drivers (open or closed) being loaded, and running modprobe -r on the drivers after they are loaded doesn't turn the card off either.

Some specifics on my current setup
Nvidia driver version: 390.77 (396.51 seems to have some DRM problems)
uname -r: 4.18.0-rc8

Keep in mind powerManagement.powertop.enable should be false (default value).

{
  hardware = {
    bumblebee = {
      enable = true;
      pmMethod = "none";
    };
  };

  # Not sure if this is needed
  # Probably was when I initially started testing it, not sure now
  services = {
    xserver = {
      videoDrivers = [
        "modesetting"
      ];
    };
  };
}

Though if you really want to use powertop, you can use the following (taken from here)

optirun glxgears &
nix-shell -p powertop --run 'sudo powertop --auto-tune'
pgrep glxgears | xargs kill

EDIT: Funnily enough, using the workaround for powertop --auto-tune lets modprobe -r $(lsmod | grep 'nvidia ' | awk '{print $4}') nvidia work correctly (though the default state is still bad, so it should be the first thing you run after logging in). I now get comparable battery consumption (~10 W idle), though it is still ~2 W higher, which I will charitably attribute to decreasing the polybar interval from 0.5 to 0.1 (although that probably isn't the real cause).
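
For readers unfamiliar with the lsmod columns: the fourth column ("Used by") of the nvidia row is a comma-separated list of the modules that depend on it, which is what the awk '{print $4}' part of the command above extracts. A small sketch that splits that list into one module per line; the function name nvidia_dependents is made up for illustration:

```shell
# nvidia_dependents
# Reads lsmod-style text on stdin and prints the modules named in the
# "Used by" column of the row for the nvidia module, one per line.
nvidia_dependents() {
  awk '$1 == "nvidia" && NF >= 4 {
         n = split($4, deps, ",")
         for (i = 1; i <= n; i++) if (deps[i] != "") print deps[i]
       }'
}

# Only query the live system when lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
  lsmod | nvidia_dependents
fi
```

Printing the dependents first makes it easier to see what modprobe -r will be asked to unload before nvidia itself.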
