A simple guide to getting a lot of VRAM for a low price
At the time of writing, it is possible to get a Tesla M40 24GB for $179 on eBay, which is a low price for the amount of VRAM you get. Right now, getting the same amount of VRAM on an RTX 3090 would cost me about 14,000 NOK here in Norway, which is about $1,438.
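To put the two prices side by side, here is a small Python sketch of the cost per gigabyte. It uses only the prices quoted above and the 24 GB that both cards carry; the function name `usd_per_gb` is just an illustrative helper, and prices will of course differ when you read this.

```python
# Prices quoted in this post (USD). Both cards have 24 GB of VRAM.
M40_PRICE_USD = 179
RTX_3090_PRICE_USD = 1438
VRAM_GB = 24


def usd_per_gb(price_usd: float, vram_gb: int) -> float:
    """Return the price per gigabyte of VRAM."""
    return price_usd / vram_gb


m40 = usd_per_gb(M40_PRICE_USD, VRAM_GB)
rtx = usd_per_gb(RTX_3090_PRICE_USD, VRAM_GB)

print(f"Tesla M40 24GB: ${m40:.2f} per GB")   # $7.46 per GB
print(f"RTX 3090:       ${rtx:.2f} per GB")   # $59.92 per GB
print(f"Price ratio:    {rtx / m40:.1f}x")    # 8.0x
```

This compares VRAM capacity only; it says nothing about speed, power draw, or the extra parts needed for cooling.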
Requirements
- Tesla M40 24GB.
- Integrated graphics or second GPU for video output (Tesla M40 24GB doesn’t have any video output).
- 3 x 92mm fans (two Noctua NF-B9 and one Noctua NF-A9 were used).
- PCI bracket for NVIDIA Tesla M40.
- NVIDIA Dual 8 to 8 Graphics Power Cable (2×8-pin PCIe to 1×8-pin CPU).
- T6 Torx Screwdriver.
- Aluminum Foil Tape.
- Pliers.
- Zip ties.
01 – BIOS
- Enable integrated graphics in BIOS if you are not using a dedicated graphics card for video output.
- Enable Above 4G decoding in BIOS.
02 – Modification
Remove the cover with the T6 Torx screwdriver, and replace the PCI bracket.
Figure 1. The GPU.
Figure 2. T6 torx screws.
Use pliers to bend the heat sink fins up. Don’t try to cut the bends as I did on my first try.
Figure 3. Don’t do this….
Figure 4. All heat sink fins bent.
Attach the fans with zip ties, then tape over the gaps with aluminum foil tape so that the airflow is even across the card and exits at the bracket end.
Figure 5. Fan attached.
Install the GPU in the computer, then connect the NVIDIA Dual 8 to 8 Graphics Power Cable to the PSU and connect the power for the fans.
Figure 6. Installed.
03 – For gaming or Machine learning?
- For gaming, download the Quadro M6000 driver from the NVIDIA driver website, then follow steps 04 and 05.
- For machine learning, download the Data Center / Tesla – M40 driver for Windows from the NVIDIA driver website.
Figure 7. And it is ready for use.
04 – For Gaming
Switch from TCC mode to WDDM mode to be able to run games. NVIDIA describes the two modes here: https://docs.nvidia.com/gameworks/content/developertools/desktop/tesla_compute_cluster.htm
First, run the command below in CMD to get the GPU's ID:
nvidia-smi -L
Figure 8. cmd output.
Then run the command below with your own GPU ID to switch to WDDM (-dm 0 selects WDDM, -dm 1 selects TCC):
nvidia-smi -g GPU-5aa247f7-fa8b-48be-5e02-801848fb6df7 -dm 0
Then reboot.
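If you would rather not copy the ID by hand, the two steps above can be sketched in Python. This is only a rough sketch: the `sample` line is my assumption of what `nvidia-smi -L` prints (one line per GPU with the UUID in parentheses), and `find_gpu_uuid` and `wddm_command` are hypothetical helper names, not part of any NVIDIA tool. The sketch only builds the command; it does not run it.

```python
import re
from typing import List, Optional

# Assumed line format of `nvidia-smi -L`, for example:
#   GPU 0: Tesla M40 24GB (UUID: GPU-5aa247f7-fa8b-48be-5e02-801848fb6df7)
GPU_LINE = re.compile(r"^GPU (\d+): (.+) \(UUID: (GPU-[0-9a-fA-F-]+)\)$")


def find_gpu_uuid(listing: str, name_fragment: str) -> Optional[str]:
    """Return the UUID of the first listed GPU whose name contains name_fragment."""
    for line in listing.splitlines():
        match = GPU_LINE.match(line.strip())
        if match and name_fragment in match.group(2):
            return match.group(3)
    return None


def wddm_command(uuid: str) -> List[str]:
    """Build the switch command from this post (-dm 0 selects WDDM)."""
    return ["nvidia-smi", "-g", uuid, "-dm", "0"]


# Sample listing (assumed format, using the ID from this post).
sample = "GPU 0: Tesla M40 24GB (UUID: GPU-5aa247f7-fa8b-48be-5e02-801848fb6df7)"

uuid = find_gpu_uuid(sample, "Tesla M40")
if uuid is not None:
    print(" ".join(wddm_command(uuid)))
```

To use it for real, you would feed it the actual output of `nvidia-smi -L` and run the printed command from an administrator CMD window.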
05 – Assign your game to use the Tesla M40 24GB
- Go to Start → Settings → System → Display.
- Click on “Graphics Settings”.
Figure 9. Settings.
- Find the .exe of the game.
- Then click Options → Specific GPU → NVIDIA Tesla M40 24GB.
- (I’m using my card only for ML, so where the screenshot says Radeon RX 480 under High performance, it is supposed to say NVIDIA Tesla M40 24GB.)
Figure 10. Graphics settings.
It is now possible to game on the Tesla M40 24GB.
06 – Benchmark
hashcat benchmark:
CUDA API (CUDA 11.7)
====================
* Device #1: Tesla M40 24GB, 22867/22975 MB, 24MCU
HIP API (HIP 4.4)
=================
* Device #2: Radeon (TM) RX 480 Graphics, skipped
OpenCL API (OpenCL 2.1 AMD-APP (3380.6)) - Platform #1 [Advanced Micro Devices, Inc.]
=====================================================================================
* Device #3: Radeon (TM) RX 480 Graphics, skipped
Benchmark relevant options:
===========================
* --backend-devices=1
* --optimized-kernel-enable
-------------------
* Hash-Mode 0 (MD5)
-------------------
Speed.#1.........: 17132.5 MH/s (92.64ms) @ Accel:128 Loops:1024 Thr:512 Vec:8
----------------------
* Hash-Mode 100 (SHA1)
----------------------
Speed.#1.........: 5789.4 MH/s (68.74ms) @ Accel:64 Loops:512 Thr:512 Vec:1
---------------------------
* Hash-Mode 1400 (SHA2-256)
---------------------------
Speed.#1.........: 2053.5 MH/s (48.50ms) @ Accel:8 Loops:512 Thr:1024 Vec:1
---------------------------
* Hash-Mode 1700 (SHA2-512)
---------------------------
Speed.#1.........: 694.6 MH/s (71.92ms) @ Accel:8 Loops:1024 Thr:256 Vec:1
-------------------------------------------------------------
* Hash-Mode 22000 (WPA-PBKDF2-PMKID+EAPOL) [Iterations: 4095]
-------------------------------------------------------------
Speed.#1.........: 296.2 kH/s (81.16ms) @ Accel:32 Loops:256 Thr:512 Vec:1
-----------------------
* Hash-Mode 1000 (NTLM)
-----------------------
Speed.#1.........: 29538.6 MH/s (53.51ms) @ Accel:128 Loops:1024 Thr:512 Vec:8
---------------------
* Hash-Mode 3000 (LM)
---------------------
Speed.#1.........: 14998.2 MH/s (52.98ms) @ Accel:1024 Loops:1024 Thr:32 Vec:1
--------------------------------------------
* Hash-Mode 5500 (NetNTLMv1 / NetNTLMv1+ESS)
--------------------------------------------
Speed.#1.........: 15562.0 MH/s (51.03ms) @ Accel:32 Loops:1024 Thr:1024 Vec:2
----------------------------
* Hash-Mode 5600 (NetNTLMv2)
----------------------------
Speed.#1.........: 1124.7 MH/s (88.92ms) @ Accel:16 Loops:512 Thr:512 Vec:1
--------------------------------------------------------
* Hash-Mode 1500 (descrypt, DES (Unix), Traditional DES)
--------------------------------------------------------
Speed.#1.........: 613.0 MH/s (81.55ms) @ Accel:64 Loops:1024 Thr:32 Vec:1
------------------------------------------------------------------------------
* Hash-Mode 500 (md5crypt, MD5 (Unix), Cisco-IOS $1$ (MD5)) [Iterations: 1000]
------------------------------------------------------------------------------
Speed.#1.........: 5496.3 kH/s (54.24ms) @ Accel:16 Loops:1000 Thr:1024 Vec:1
----------------------------------------------------------------
* Hash-Mode 3200 (bcrypt $2*$, Blowfish (Unix)) [Iterations: 32]
----------------------------------------------------------------
Speed.#1.........: 10112 H/s (73.47ms) @ Accel:64 Loops:32 Thr:12 Vec:1
--------------------------------------------------------------------
* Hash-Mode 1800 (sha512crypt $6$, SHA512 (Unix)) [Iterations: 5000]
--------------------------------------------------------------------
Speed.#1.........: 98478 H/s (64.89ms) @ Accel:128 Loops:512 Thr:512 Vec:1
--------------------------------------------------------
* Hash-Mode 7500 (Kerberos 5, etype 23, AS-REQ Pre-Auth)
--------------------------------------------------------
Speed.#1.........: 255.3 MH/s (48.82ms) @ Accel:256 Loops:64 Thr:32 Vec:1
-------------------------------------------------
* Hash-Mode 13100 (Kerberos 5, etype 23, TGS-REP)
-------------------------------------------------
Speed.#1.........: 254.3 MH/s (49.01ms) @ Accel:256 Loops:64 Thr:32 Vec:1
---------------------------------------------------------------
* Hash-Mode 15300 (DPAPI masterkey file v1) [Iterations: 23999]
---------------------------------------------------------------
Speed.#1.........: 49970 H/s (82.67ms) @ Accel:64 Loops:128 Thr:512 Vec:1
---------------------------------------------------------------
* Hash-Mode 15900 (DPAPI masterkey file v2) [Iterations: 12899]
---------------------------------------------------------------
Speed.#1.........: 25209 H/s (73.85ms) @ Accel:4 Loops:512 Thr:512 Vec:1
------------------------------------------------------------------
* Hash-Mode 7100 (macOS v10.8+ (PBKDF2-SHA512)) [Iterations: 1023]
------------------------------------------------------------------
Speed.#1.........: 306.2 kH/s (75.72ms) @ Accel:64 Loops:31 Thr:512 Vec:1
---------------------------------------------
* Hash-Mode 11600 (7-Zip) [Iterations: 16384]
---------------------------------------------
Speed.#1.........: 231.3 kH/s (48.95ms) @ Accel:16 Loops:4096 Thr:128 Vec:1
------------------------------------------------
* Hash-Mode 12500 (RAR3-hp) [Iterations: 262144]
------------------------------------------------
Speed.#1.........: 36165 H/s (83.47ms) @ Accel:4 Loops:16384 Thr:512 Vec:1
--------------------------------------------
* Hash-Mode 13000 (RAR5) [Iterations: 32799]
--------------------------------------------
Speed.#1.........: 25617 H/s (59.27ms) @ Accel:8 Loops:256 Thr:1024 Vec:1
-----------------------------------------------------------------------
* Hash-Mode 6211 (TrueCrypt RIPEMD160 + XTS 512 bit) [Iterations: 1999]
-----------------------------------------------------------------------
Speed.#1.........: 186.7 kH/s (60.71ms) @ Accel:8 Loops:128 Thr:1024 Vec:1
-----------------------------------------------------------------------------------
* Hash-Mode 13400 (KeePass 1 (AES/Twofish) and KeePass 2 (AES)) [Iterations: 24569]
-----------------------------------------------------------------------------------
Speed.#1.........: 23356 H/s (87.19ms) @ Accel:4 Loops:512 Thr:1024 Vec:1
----------------------------------------------------------------
* Hash-Mode 6800 (LastPass + LastPass sniffed) [Iterations: 499]
----------------------------------------------------------------
Speed.#1.........: 1363.9 kH/s (57.54ms) @ Accel:4 Loops:499 Thr:1024 Vec:1
--------------------------------------------------------------------
* Hash-Mode 11300 (Bitcoin/Litecoin wallet.dat) [Iterations: 200459]
--------------------------------------------------------------------
Speed.#1.........: 3115 H/s (53.27ms) @ Accel:256 Loops:128 Thr:1024 Vec:1
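The speeds above are printed in mixed units (H/s, kH/s, MH/s), which makes them awkward to compare at a glance. Here is a minimal Python sketch that converts a "Speed" line into plain hashes per second. The regular expression is written against the lines shown above; the GH/s entry is an assumption for faster cards, and `hashes_per_second` is just an illustrative name.

```python
import re

# Multipliers for the unit prefixes; GH/s does not appear in the
# listing above and is included as an assumption.
UNIT_FACTORS = {"H/s": 1, "kH/s": 1e3, "MH/s": 1e6, "GH/s": 1e9}

# Matches lines such as:
#   Speed.#1.........: 17132.5 MH/s (92.64ms) @ Accel:128 ...
SPEED_LINE = re.compile(r"^Speed\.#\d+\.*:\s+([\d.]+)\s+([kMG]?H/s)")


def hashes_per_second(line: str) -> float:
    """Convert one hashcat 'Speed' line to hashes per second."""
    match = SPEED_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a Speed line: {line!r}")
    value, unit = match.groups()
    return float(value) * UNIT_FACTORS[unit]


md5 = hashes_per_second(
    "Speed.#1.........: 17132.5 MH/s (92.64ms) @ Accel:128 Loops:1024 Thr:512 Vec:8"
)
bcrypt = hashes_per_second(
    "Speed.#1.........: 10112 H/s (73.47ms) @ Accel:64 Loops:32 Thr:12 Vec:1"
)
print(f"MD5 is about {md5 / bcrypt:,.0f} times faster than bcrypt on this card")
```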
07 – Conclusion
I ended up using a Radeon RX 480 as my dedicated GPU, with the Tesla M40 24GB connected through a PCI riser, and it works fine. While running DeepFaceLab for over 24 hours, the maximum temperature reached 86°C, and the idle temperature was around 33°C.
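If you want to keep an eye on the temperature of your own build, here is one way it could be done in Python. This is a sketch, not how the numbers above were measured: it assumes `nvidia-smi` is on the PATH and uses its `--query-gpu=temperature.gpu` option, which prints one value in °C per NVIDIA GPU. The function names are hypothetical.

```python
import subprocess
from typing import List

# Asks nvidia-smi for the GPU temperature only, one bare number per line.
QUERY = [
    "nvidia-smi",
    "--query-gpu=temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_temperatures(output: str) -> List[int]:
    """Turn the query output into a list of temperatures in °C."""
    return [int(line) for line in output.splitlines() if line.strip()]


def read_temperatures() -> List[int]:
    """Run nvidia-smi once and return one temperature per NVIDIA GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_temperatures(result.stdout)
```

Calling `read_temperatures()` in a loop during a long job and keeping the highest value seen would give you a maximum temperature like the one reported above.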