FieldDB / FieldDB

An offline/online field database which adapts to its user's terminology and I-Language. http://fielddb.github.io

Home Page: http://lingsync.org

Monitor is DOWN: Corpus Urls

cesine opened this issue

The monitor Corpus Urls (https://www.example.org/lingllama/communitycorpus/search/nay) is currently DOWN (HTTP 502 - Bad Gateway)

Event timestamp: 2022-01-11 09:00:34 UTC+0

The monitor flapped between up and down over the next few days, then stayed down.

Event timestamp: 2022-01-13 08:06:55 UTC+0

Screen Shot 2022-01-16 at 12 30 14 PM

Screen Shot 2022-01-16 at 9 49 54 AM

You are receiving this email because instance i-cd18014d in region US East (N. Virginia) has failed an instance or a system status check for at least 2 period(s) of 60 seconds at "Saturday 15 January, 2022 18:44:10 UTC". You can view status check details about this instance by navigating to the EC2 console at https://us-east-1.console.aws.amazon.com/ec2/home?region=us-east-1#s=Instances, selecting the instance and clicking on the Status Check tab.

Information about troubleshooting instances with failed status checks:
http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/TroubleshootingInstances.html.

Information about EC2 Status Checks:
http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/monitoring-system-instance-status-check.html.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/TroubleshootingInstances.html#troubleshooting-retrieve-system-logs

Warning: fsck not present, so skipping root file system
[    3.598927] EXT4-fs (xvda1): INFO: recovery required on readonly filesystem
[    3.603679] EXT4-fs (xvda1): write access will be enabled during recovery
[   25.285677] random: nonblocking pool is initialized
[   46.196362] EXT4-fs (xvda1): recovery complete
[   46.216055] EXT4-fs (xvda1): mounted filesystem with ordered data mode. Opts: (null)
done.
Begin: Running /scripts/local-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
[   54.709812] systemd[1]: systemd 229 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN)
[   54.729838] systemd[1]: Detected virtualization xen.
[   54.735296] systemd[1]: Detected architecture x86-64.

Welcome to Ubuntu 16.04.2 LTS!

[   55.065910] systemd[1]: Set hostname to <ip-172-...>.
[   71.267298] systemd[1]: Listening on Journal Socket.
[  OK  ] Listening on Journal Socket.
[   71.278989] systemd[1]: Listening on Journal Socket (/dev/log).
[  OK  ] Listening on Journal Socket (/dev/log).
[   71.291853] systemd[1]: Reached target Encrypted Volumes.
[  OK  ] Reached target Encrypted Volumes.
[   71.303841] systemd[1]: Reached target User and Group Name Lookups.
[  OK  ] Reached target User and Group Name Lookups.
[   71.317200] systemd[1]: Listening on fsck to fsckd communication Socket.
[  OK  ] Listening on fsck to fsckd communication Socket.
[  177.022745] cloud-init[1880]: Cloud-init v. 0.7.8 finished at Sun, 16 Jan 2022 00:55:17 +0000. Datasource DataSourceEc2.  Up 176.98 seconds
[  478.786881] Out of memory: Kill process 1174 (mysqld) score 14 or sacrifice child
[  478.790790] Killed process 1174 (mysqld) total-vm:1114412kB, anon-rss:121004kB, file-rss:0kB
[  478.868680] Out of memory: Kill process 1895 (mysqld) score 15 or sacrifice child
[  478.873989] Killed process 1895 (mysqld) total-vm:1114412kB, anon-rss:121396kB, file-rss:820kB
[  480.128376] Out of memory: Kill process 2123 (paster) score 8 or sacrifice child
[  480.132622] Killed process 2123 (paster) total-vm:1039552kB, anon-rss:64256kB, file-rss:1644kB
[  547.326978] Out of memory: Kill process 2086 (paster) score 8 or sacrifice child
[  547.330826] Killed process 2086 (paster) total-vm:1039552kB, anon-rss:64260kB, file-rss:1772kB
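
To see at a glance which processes the OOM killer is hitting, the "Killed process" lines in the console log above can be tallied. This is a rough bash sketch, not something that was run during the incident; the function name `summarize_oom_kills` is made up for illustration, and it assumes process names without spaces.

```shell
# Tally OOM-killer victims from kernel log text read on stdin.
# Matches lines of the form "Killed process <pid> (<name>) ..." as in the log above.
# Assumption: process names contain no spaces.
summarize_oom_kills() {
  awk '
    /Killed process [0-9]+ \(/ {
      # Locate the "Killed process" tokens; the PID and "(name)" follow them.
      for (i = 2; i <= NF - 2; i++) {
        if ($(i - 1) == "Killed" && $i == "process") {
          name = $(i + 2)
          gsub(/[()]/, "", name)   # strip the surrounding parentheses
          count[name]++
        }
      }
    }
    END {
      for (name in count) printf "%s %d\n", name, count[name]
    }
  ' | sort
}

# Possible usage on the instance (may need sudo):
#   dmesg | summarize_oom_kills
```

Fed the four "Killed process" lines above, this would report two kills each for mysqld and paster.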

Investigate volume usage

https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#metricsV2:graph=~(view~'timeSeries~stacked~false~region~'us-east-1~start~'-P7D~end~'P0D~metrics~(~(~'AWS*2fEBS~'VolumeWriteOps~'VolumeId~'vol-090977f8cd2c9cb65)~(~'.~'VolumeQueueLength~'.~'.)~(~'.~'VolumeWriteBytes~'.~'.)~(~'.~'VolumeTotalReadTime~'.~'.)~(~'.~'BurstBalance~'.~'.)~(~'.~'VolumeReadBytes~'.~'.)~(~'.~'VolumeTotalWriteTime~'.~'.)~(~'.~'VolumeReadOps~'.~'.)~(~'.~'VolumeIdleTime~'.~'.)));query=~'*7bAWS*2fEBS*2cVolumeId*7d

/dev/sda1

Screen Shot 2022-01-16 at 10 23 03 AM

/dev/sdg

Screen Shot 2022-01-16 at 10 22 35 AM

Resized sda1 to 20GB

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/recognize-expanded-volume-linux.html

$ df -hT
Filesystem     Type      Size  Used Avail Use% Mounted on
udev           devtmpfs  3.9G     0  3.9G   0% /dev
tmpfs          tmpfs     799M  8.6M  790M   2% /run
/dev/xvda1     ext4       20G   12G  6.9G  64% /
tmpfs          tmpfs     3.9G     0  3.9G   0% /dev/shm
tmpfs          tmpfs     5.0M     0  5.0M   0% /run/lock
tmpfs          tmpfs     3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/xvdg      ext4       50G   45G  2.5G  95% /data
cgmfs          tmpfs     100K     0  100K   0% /run/cgmanager/fs
tmpfs          tmpfs     799M     0  799M   0% /run/user/1000

$ lsblk
NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda    202:0    0  20G  0 disk 
└─xvda1 202:1    0  20G  0 part /
xvdg    202:96   0  50G  0 disk /data
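
Since /data is sitting at 95% in the `df` output above, a small check like the following could flag nearly full filesystems before the next outage. It is only a sketch: the function name `filesystems_over` and the 90% default are assumptions, it expects POSIX-format `df -P` output (not `df -hT`, which adds a Type column), and it does not handle mount points containing spaces.

```shell
# Print mount points whose usage is at or above a threshold (default 90%).
# Reads `df -P` output on stdin so it can be tried without touching real disks.
# Assumption: mount points contain no spaces.
filesystems_over() {
  local threshold="${1:-90}"
  awk -v limit="$threshold" '
    NR > 1 {                      # skip the header row
      use = $5                    # "Capacity" column, e.g. "95%"
      sub(/%$/, "", use)
      if (use + 0 >= limit + 0) printf "%s %s%%\n", $6, use
    }
  '
}

# Possible usage:
#   df -P | filesystems_over 90
```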

What is taking space on sda1?

# du -a /var | sort -n -r | head -n 20 
5600524	/var
4853280	/var/lib
4578600	/var/lib/mysql
3634880	/var/lib/mysql/ike_hiver_2015old
3621968	/var/lib/mysql/ike_hiver_2015old/formbackup.MYD
635908	/var/lib/mysql/ike_field_methodsold
628428	/var/lib/mysql/ike_field_methodsold/formbackup.MYD
582152	/var/log
174836	/var/lib/apt
174768	/var/lib/apt/lists
140576	/var/cache
137348	/var/log/auth.log.1
106296	/var/log/syslog.1
98480	/var/log/auth.log
92980	/var/log/kern.log
85116	/var/log/letsencrypt
67260	/var/cache/apt
61804	/var/cache/apt-xapian-index
61800	/var/cache/apt-xapian-index/index.5
50488	/var/lib/dpkg

What is taking space on /data?

$ sudo du -a /data | sort -n -r | head -n 20
46428996	/data
19321668	/data/couchdb
18272260	/data/oldapps
8478820	/data/fielddbhome
3929380	/data/oldapps/ike_hiver_2015old.dump
3150544	/data/fielddbhome/logs
2254652	/data/oldapps/yale_fall_2016_field_methodsold
2252428	/data/oldapps/yale_fall_2016_field_methodsold/log
2252424	/data/oldapps/yale_fall_2016_field_methodsold/log/paster-old.log
2250804	/data/oldapps/malagasy_uoft_lec_5101old
2248936	/data/oldapps/transylvanian_saxonold
2247080	/data/oldapps/malagasy_uoft_lec_5101old/log
2247076	/data/oldapps/malagasy_uoft_lec_5101old/log/paster-old.log
2245908	/data/oldapps/transylvanian_saxonold/log
2245904	/data/oldapps/transylvanian_saxonold/log/paster-old.log
2229804	/data/oldapps/sghold
2229648	/data/oldapps/cakold
2229484	/data/oldapps/sghold/log
2229480	/data/oldapps/sghold/log/paster-old.log
2229352	/data/oldapps/cakold/log
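
The `du -a` listings above repeat the same usage three times (directory, its log directory, the log file inside it), which pushes other large directories out of the top 20. A depth-limited variant might give a clearer picture. This is a sketch assuming a `du` that supports `-d` (GNU coreutils and BSD both do); the function name `largest_dirs` and its defaults are made up for illustration.

```shell
# List the largest directories under a path, limited to a given depth.
# Arguments: <path> [depth, default 1] [rows, default 20]
#   -x  stay on one filesystem
#   -k  report sizes in 1K blocks
#   -d  limit how deep directories are reported
largest_dirs() {
  local path="$1"
  local depth="${2:-1}"
  local rows="${3:-20}"
  du -xk -d "$depth" "$path" 2>/dev/null | sort -rn | head -n "$rows"
}

# Possible usage:
#   sudo bash -c "$(declare -f largest_dirs); largest_dirs /data 2"
```

With depth 1 on /data this should show one line each for couchdb, oldapps and fielddbhome rather than every nested log file.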

After turning off apache2, nginx is able to bind to port 80 and the services are back up.
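
For next time, a quick way to confirm which process is holding a port is to filter `ss -ltnp` output. The helper below is a sketch, not what was actually run here; `listeners_on_port` is a hypothetical name, and it assumes the process column is the last field, which `ss` only fills in for other users' processes when run as root.

```shell
# Show the local address and process info for sockets listening on a TCP port.
# Reads `ss -ltnp` output on stdin.
# Assumption: the last column holds the process info (requires root for -p to show it).
listeners_on_port() {
  local port="$1"
  awk -v port="$port" '
    NR > 1 {                                   # skip the header row
      for (i = 1; i <= NF; i++) {
        # Match a field ending in ":<port>", e.g. 0.0.0.0:80 or [::]:80
        if ($i ~ (":" port "$")) { print $i, $NF; next }
      }
    }
  '
}

# Possible usage:
#   sudo ss -ltnp | listeners_on_port 80
```

If apache2 shows up in the output for port 80, nginx will fail to bind until apache2 is stopped or moved to another port.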