Ubuntu 18.10 constantly freezing and filesystem gets corrupted
EDIT: I updated my kernel from 4.19 to 5.1.14 and it seems to have fixed the issue. Before, it would crash or mess up the filesystem at least 5 to 6 times in a single day; in the 2 days since, I have only had to reboot once.
My laptop is an Acer Predator Helios 300, Intel® Core™ i7-7700HQ, 16GiB RAM, NVIDIA GTX1060 6GB (nvidia-driver-410) running Ubuntu 18.04 and Windows in dual-boot (although I almost never used Windows).
Ubuntu crashes randomly. I do web development and usually after 40 minutes or so, my laptop starts getting slower. Opening the shell takes longer, file writes are slower, mouse starts getting slow etc... After some time it just freezes completely. I have to force reboot it.
When I reboot it, it drops to the initramfs prompt, where I run fsck /dev/sda2, say yes to all fixes until it finishes, and then reboot it.
Sometimes the reboot works; sometimes the filesystem gets remounted read-only as soon as the OS loads, even after the fixes. Sometimes I have to do this up to 5 times in a row. Needless to say, this is really frustrating and is slowing down my work (I am a web dev).
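To confirm that a slowdown really is the root filesystem having flipped to read-only, the mount options can be checked directly. A minimal sketch, assuming the standard /proc/mounts layout (mount point in field 2, options in field 4); the function name mount_mode is made up for illustration:

```shell
#!/usr/bin/env bash
# Print "ro" or "rw" for a mount point, based on a mounts table.
# $1 = mount point (default /), $2 = mounts file (default /proc/mounts).
# Returns non-zero if the mount point is not listed.
mount_mode() {
    local mountpoint="${1:-/}" table="${2:-/proc/mounts}"
    awk -v mp="$mountpoint" '
        # Keep the last matching entry, since later mounts shadow earlier ones.
        $2 == mp { opts = $4 }
        END {
            if (opts == "") exit 1
            n = split(opts, o, ",")
            # Match the option exactly so errors=remount-ro is not mistaken for ro.
            for (i = 1; i <= n; i++) if (o[i] == "ro") { print "ro"; exit 0 }
            print "rw"
        }' "$table"
}
```

Running mount_mode with no arguments checks / on the live system. If it prints ro, the kernel has most likely hit a filesystem error and remounted the partition to protect it.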
smartctl output:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 000 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0032 100 100 010 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 947
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 2143
171 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
172 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
173 Unknown_Attribute 0x0032 095 095 000 Old_age Always - 78
174 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 299
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
194 Temperature_Celsius 0x0022 068 037 000 Old_age Always - 32 (Min/Max 13/63)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 100 100 000 Old_age Always - 0
202 Unknown_SSD_Attribute 0x0030 095 095 001 Old_age Offline - 5
206 Unknown_SSD_Attribute 0x000e 100 100 000 Old_age Always - 0
246 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 17550641040
247 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 550398567
248 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 440978387
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033 000 000 000 Pre-fail Always - 2041
210 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
What should I do?
Also, I should mention that I didn't include the kernel.log output because something is wrong with my touchpad that fills up kernel.log with lines like these:
Jun 24 10:06:40 mehdisaffar-Predator-G3-571 kernel: [24335.295971] i2c_hid i2c-ELAN0501:01: i2c_hid_get_input: incomplete report (14/65535)
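Since the touchpad messages drown out everything else, one way to still pull something useful out of the kernel log is to filter them out first. A minimal sketch, assuming the noise always contains i2c_hid as in the line above; the function name storage_errors and the keyword list are illustrative guesses, not an exhaustive set of storage-related messages:

```shell
#!/usr/bin/env bash
# Read kernel log lines on stdin, drop the i2c_hid touchpad noise, and keep
# only lines that look related to disk or filesystem trouble.
# The keyword list is an assumption chosen for illustration.
storage_errors() {
    grep -v 'i2c_hid' \
        | grep -iE 'ata[0-9]+(\.[0-9]+)?:|nvme|EXT4-fs|I/O error|remount'
}

# Usage on a live system, assuming the log lives at Ubuntu's usual path:
#   storage_errors < /var/log/kern.log | tail -n 50
```

Whatever survives the filter around the time of a freeze is probably the part worth posting.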
Check your SSD firmware
It's very important to check the firmware version of your SSD (or NVMe drive). In the terminal, run sudo lshw -C disk. It will tell you the firmware version, and then you can go to the manufacturer's website and check for updates.
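Since smartctl is already installed, it can report the firmware version as well. A small sketch, assuming the SSD is /dev/sda as in the question and that smartctl -i prints a "Firmware Version:" line for the drive; the helper name firmware_version is made up for illustration:

```shell
#!/usr/bin/env bash
# Extract the firmware version from `smartctl -i` output read on stdin.
# Splits each line on the colon plus any following spaces and prints the value.
firmware_version() {
    awk -F': *' '/^Firmware Version:/ { print $2 }'
}

# Usage (needs root to query the drive):
#   sudo smartctl -i /dev/sda | firmware_version
```

Compare the printed string against the latest firmware listed on the drive manufacturer's support page.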
Check your BIOS
In the terminal, run sudo dmidecode -s bios-version, and then go to the manufacturer's website to check for a newer BIOS.
Current BIOS is 1.22 dated 4/1/2019. See https://www.acer.com/ac/en/US/content/support-product/7213?b=1
Make sure you have enough swap
In the terminal, run free -h and make sure that you have at least a 2G swap partition or /swapfile.
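If you would rather have a pass/fail answer than read the free -h table by eye, the same number is available in /proc/meminfo. A rough sketch, assuming the usual "SwapTotal: N kB" line; the function name swap_total_mib and the 2048 MiB default just mirror the 2G suggestion above:

```shell
#!/usr/bin/env bash
# Print total swap in MiB (rounded) from a meminfo-style file and return
# non-zero when it is below a minimum.
# $1 = meminfo file (default /proc/meminfo), $2 = minimum in MiB (default 2048).
swap_total_mib() {
    local meminfo="${1:-/proc/meminfo}" min_mib="${2:-2048}"
    awk -v min="$min_mib" '
        # SwapTotal is reported in kB; round to the nearest MiB because a
        # 2G swapfile shows up slightly under 2097152 kB.
        $1 == "SwapTotal:" { mib = int($2 / 1024 + 0.5); found = 1 }
        END {
            if (!found) exit 2
            print mib
            exit (mib >= min ? 0 : 1)
        }' "$meminfo"
}

# Usage:
#   swap_total_mib && echo "swap looks sufficient" || echo "swap is below 2G"
```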
Check your cabling
If the SSD is an internal drive, check the condition of the SATA cables and make sure that they're tightly connected at both ends.
If the SSD is external, make sure that you're using a USB3 port if the drive enclosure is USB3. Also keep in mind that the USB cable, the enclosure, and the enclosure's power supply can all be suspects.
Check your memory
Go to http://www.memtest.org or https://www.memtest86.com/ (use the second link to get the latest free version), then download and run memtest to test your memory. Let it complete at least one full pass of all 4/4 of the tests to confirm that the memory is good.