Hi!

First of all sorry if this is the wrong place to ask, if it is, please point me to a better suited channel!

Anyway I’ve got this old 2TB HDD attached to a rpi 4b, it worked flawlessly until now, the last few days it started disconnecting randomly…

If i reboot it mounts back again.

This is the df output:

/dev/sdb1       1.8T  535G  1.2T  31% /mnt/2tb

And this is sudo dmesg | grep sdb (the device is sdb ofc).

[   14.970908] sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[   14.978857] sd 1:0:0:0: [sdb] 4096-byte physical blocks
[   14.984484] sd 1:0:0:0: [sdb] Write Protect is off
[   14.989382] sd 1:0:0:0: [sdb] Mode Sense: 43 00 00 00
[   14.989684] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   15.044802] sd 1:0:0:0: [sdb] Preferred minimum I/O size 4096 bytes
[   15.051196] sd 1:0:0:0: [sdb] Optimal transfer size 33553920 bytes not a multiple of preferred minimum block size (4096 bytes)
[   15.065585]  sdb: sdb1
[   15.068403] sd 1:0:0:0: [sdb] Attached SCSI disk
[   22.631983] EXT4-fs (sdb1): recovery complete
[   22.660922] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Quota mode: none.

The device has an external power supply of its own, so it’s not a power issue… This setup worked for a couple of years.

I cannot see anything wrong here, pheraps is the HDD which is going bad?

  • seaQueue@lemmy.world
    link
    fedilink
    English
    arrow-up
    11
    ·
    edit-2
    3 days ago

    Don’t just look at sdb hits in the log. Open up that entire session in journalctl kernel mode (journalctl -k -bN where N is the session number in session history) and find the context surrounding the drive dropping and reconnecting.

    You’ll probably find that something caused a USB bus reset or a similar event before the drive dropped and reconnected. if you find nothing like that try switching power supplies for the HDD and/or switching USB ports until you can move the drive to a different USB root port. Use lsusb -t and swap ports until the drive is attached beneath a different root port. You might have a neighboring USB device attached to the bus that’s causing issues for other devices attached to the same root port (it happens, USB devices or drivers sometimes behave badly.)

    Always look at the context of the event when you’re troubleshooting a failure like this, don’t just drill down on the device messages. Most of the time the real cause of the issue preceded the symptom by a bit of time.

    • hendrik@palaver.p3x.de
      link
      fedilink
      English
      arrow-up
      2
      ·
      4 days ago

      Very good answer. I’ve also spent some time analyzing some red herrings when it was something else like a bad cable or connector. And by the way, you can use the same keys in journalctl as in the usual pager (less(?)) so hit / and search for ‘unmount’, ‘disconnect’, etc. And then scroll through the log and find out what led to the situation.