NVMe to USB Adapter - Stability Issues

I have been having some stability issues with my EON. After what appears to be a random period of uptime, the EON just stops responding. When I discover it in this state, here is what I find…

  • It responds to pings.
  • VNC is stopped.
  • Cannot connect via SSH.
  • The OLED stops cycling screens.
  • Nothing will launch from the GUI.

I am able to perform an orderly shutdown via the power switch and when it comes back up, I cannot find anything in any of the logs. It’s like it just stops writing to the NVMe drive completely. Sometimes it’s up for over a week and other times, it does this after a few hours.

To troubleshoot further, I connected a KVM and display and waited for it to do its thing. I left a journalctl -f running to see if anything logged right before this. It just did it again and nothing in the log. From the console, nothing would launch. The default icons (browser, file manager, and terminal) in the menu bar were replaced with some strange “dead” icons.

I’m suspecting the NVMe to USB adapter I’m using is the issue. It’s as if the boot/rootfs just “goes away” and the EON tries to keep running. I pulled the adapter and plugged it into another Pi and it would not mount. I plugged it into my (Windows) laptop and it went through a “found new hardware/hardware removed” cycle about six times in just a few seconds. It ended up “thinking” the drive was plugged in but neither the GUI nor diskpart saw it. At this point, no matter where I plug it in, nothing sees it.
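If the drive really is dropping off the USB bus, the kernel usually says so, but once the root filesystem is gone nothing can be written to the on-disk journal. One rough way around that is to stream the kernel log to a second machine before the hang and keep only the USB-related lines. This is a sketch, not a tested recipe for the EON: the host name `eon.local`, the user `pi`, and the function name are hypothetical, and the patterns are what I would expect from a typical Linux kernel, not lines taken from this thread.

```shell
#!/bin/sh
# usb_events: read kernel log lines on stdin and keep only those that
# suggest a USB device dropped out, was reset, or returned I/O errors.
# The patterns are assumptions about typical kernel wording.
usb_events() {
    grep -E -i --line-buffered \
        'USB disconnect|reset (high|super)[- ]?speed|uas_eh|I/O error'
}

# Example (run on a second machine, started before the hang):
#   ssh pi@eon.local 'sudo dmesg --follow' | usb_events | tee eon-usb.log
```

Because the log ends up on the second machine, whatever the kernel printed right before the hang should survive even if the EON can no longer write to its own drive.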

I searched the forum threads here and found a few mentions of similar behavior where the EON just stops responding and only a power cycle will bring it back. @BlackRose67 had a similar experience. I wonder if this is the same issue.

This is the adapter I’m using. I’ve read a number of threads regarding complaints about heat from these adapters so I did sandwich a thermal pad between the drive and the adapter. I suspect the thing may have cooked itself. I wonder if the use case for these adapters is something like a one time data recovery and they are just not built to run continuously. Maybe I just got a bad one…don’t really know.

So…anyone have similar experiences?

I always attributed the issues I was having to using a version of OMV 6 that was still having the kinks worked out or an OMV update had issues.

I have had issues where I would do an OMV update while I was SSHd into the machine, and the SSH session would just stop responding.

I never thought to consider that it could be the NVME adapter that was causing the issue.
The USB NVME adapter I’m using has a JMicron chipset, and the S.M.A.R.T. interface doesn’t see it as a drive to monitor, so I don’t see any temperature data for it.

I’m kinda glad to read others have this issue… except… I’m not using NVME… I have a USB hard drive plugged into the internal usb

I think you’re on the right track with the temperature of the NVME. NVME is heat sensitive–the hotter it gets the worse it performs, which is why modern motherboards integrate cooling systems for NVME disks.

I’ve got the same NVME-to-USB adapter paired with a Silicon Power A80-based NVME. I haven’t had any issues with it, but I’m also using the entire case as a heatsink for the NVME.

What I did was turn the whole unit off and let it cool to room temperature, then put the included thermal pad (came with NVME) onto the side of the adapter facing the metal interior wall of the EON and push the adapter lightly until the thermal pad made contact with the interior wall.

Then I used some scotch tape to tape the top edge of the adapter to the interior wall, so it remained in contact.

Once the device is turned back on, it’ll effectively glue itself to the interior wall, and the drive can use the case itself as a heatsink. You can remove the tape at that point and the drive won’t move.

As far as figuring out if the NVME is overheating (which could lead to it shutting down and disappearing), see if you can read its temperature easily enough with sensors or smartctl, and then pair that with watch to monitor the temps. Chaining a bunch of commands together into something watch can parse without errors is more trouble than it’s worth (that is, I can’t figure it out :P), so I created this one-line script and made it executable:

#!/bin/sh
# nvmeTempWatch.sh
# Print only the temperature line from the drive's S.M.A.R.T. report.
smartctl -a /dev/nvme0n1 | grep -i temperature:

Then:
sudo watch ./nvmeTempWatch.sh
should let you monitor the temperature and see if it’s spiking when you crash.
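For what it’s worth, watch hands its argument to `sh -c`, so quoting the whole pipeline should also work without a script: `sudo watch -n 5 'smartctl -a /dev/nvme0n1 | grep -i temperature:'`. If you would rather have a timestamped history to look at after a crash, here is a minimal sketch. The function names and log path are made up for illustration, and it assumes smartctl prints a `Temperature:  NN Celsius` line, as it does for NVMe drives. Behind a USB bridge the device is more likely to show up as /dev/sdX and may need a `-d` type passed to smartctl.

```shell
#!/bin/sh
# parse_nvme_temp: read smartctl output on stdin and print the number
# from the first "Temperature:  NN Celsius" line.
parse_nvme_temp() {
    awk '/^Temperature:/ { print $2; exit }'
}

# log_nvme_temp DEVICE LOGFILE [INTERVAL]
# Append "date time temperature" to LOGFILE every INTERVAL seconds
# (default 10). Needs root for smartctl. Runs until interrupted.
log_nvme_temp() {
    dev=$1
    log=$2
    interval=${3:-10}
    while :; do
        temp=$(smartctl -a "$dev" | parse_nvme_temp)
        printf '%s %s\n' "$(date '+%F %T')" "${temp:-unknown}" >> "$log"
        sleep "$interval"
    done
}
```

Called as, say, `log_nvme_temp /dev/nvme0n1 /mnt/other/nvme-temp.log 10` (the path is hypothetical). Point the log at something other than the drive under test if you can, otherwise the last readings before a dropout may never reach the disk.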

konradwalsh wrote:

“I’m kinda glad to read others have this issue… except… I’m not using NVME… I have a USB hard drive plugged into the internal usb”

That sounds like the USB header inside the EON isn’t delivering enough steady power to keep the USB drive happy. You might also want to check and make sure it’s not overheating.

I’ve definitely run into portable USB SSDs that aren’t stable running off a single USB port on a Pi 4 because of power issues. Could be a similar issue with the EON. I’m not sure how those USB ports are in terms of power delivery.
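One cheap check for the power theory, assuming the Pi inside the EON runs Raspberry Pi OS with vcgencmd available: `vcgencmd get_throttled` returns a bit mask in which bit 0 means under-voltage right now and bit 16 means under-voltage has occurred since boot. It reports on the Pi’s own supply rather than on what a USB port delivers to a drive, so treat it as a hint, not proof. The helper name below is made up for illustration.

```shell
#!/bin/sh
# undervoltage_status VALUE
# Decode the under-voltage bits of a get_throttled value such as
# "throttled=0x50005" (the "throttled=" prefix is optional).
undervoltage_status() {
    value=${1#throttled=}
    if [ $(( $value & 0x1 )) -ne 0 ]; then
        echo "under-voltage now"
    elif [ $(( $value & 0x10000 )) -ne 0 ]; then
        echo "under-voltage since boot"
    else
        echo "no under-voltage flagged"
    fi
}

# Example on the Pi itself:
#   undervoltage_status "$(vcgencmd get_throttled)"
```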

One thing I failed to mention is that for weeks, I ran my EON booting from an SSD connected to one of these and did not experience any of this behavior. I only changed over to the NVMe for a less cluttered interior.

As a test, I have pulled the NVMe from the EON, duplicated it to another drive, and now have my EON booting from that drive connected to the adapter above. I have it plugged into the USB jack intended for the NVMe. I suspect that it will be stable as it was before.

Just wanted to follow-up on my comment about the JMicron device.

I checked the documentation again and this time found the extra command needed to get the S.M.A.R.T. data and temperature info, at least from the command line.

sudo smartctl -a /dev/sdc1 -d sntjmicron

Now I need to figure out how to get that info into OMV so it shows on its S.M.A.R.T. monitoring page.

Is your adapter close enough to the interior wall of the EON to do this with the thermal pad that was included with the USB adapter or did you have to coax it a little? My pad was only 1mm thick.

I did have to apply a gentle pressure to get the pad to touch the wall. Not enough that anything made any creaking noise or that it felt like something was going to break.

Maybe try a 2mm or 3mm pad in lieu of forcing it too far? I’m not sure how thick the pad can be before it stops working properly.

Actually, that is not entirely true: parts of the NVMe are more comfortable around the 50 C mark, according to LTT and my own experience.

I use one of these NVMe to USB3 to run PiHole off a Pi 400. It has been running for 4 months now, with issues only during a heat spell of 25-32 C with no wind. On the other hand, my PC also heat throttled during that week.

To be fair, that entire enclosure is a heatsink and if you are running on a 400, I am assuming the enclosure is sitting in open air. I am talking about these bare adapters that have literally nothing to help with heat dissipation and are installed inside the EON enclosure.

NVME SSDs are annoying because the NVME controller IC needs a different degree of cooling than the memory chip(s) do, and what is optimal for one is not always awesome for the other.

Instead of letting this drive me crazy, I just watch the reported drive temperature with, e.g., sensors and make sure it doesn’t get within 10 C of the specified maximum temperature under a sustained load benchmarking test.
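That 10 C rule of thumb is easy to script. A minimal sketch, assuming you already have the current temperature as a whole number in Celsius (from sensors or smartctl) and that you look up the maximum in the drive’s datasheet yourself; the function name and the example numbers are hypothetical.

```shell
#!/bin/sh
# temp_margin_ok CURRENT MAX [MARGIN]
# Succeeds (exit 0) if CURRENT is more than MARGIN degrees below MAX.
# MARGIN defaults to 10. All values are whole degrees Celsius.
temp_margin_ok() {
    current=$1
    max=$2
    margin=${3:-10}
    [ "$current" -lt $(( $max - $margin )) ]
}

# Example with a hypothetical 70 C maximum:
#   temp_margin_ok 58 70 && echo "fine" || echo "too close to the limit"
```

Run it against the readings taken during a sustained benchmark, since that is when the drive gets closest to its limit.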