nodejs / build

Better build and test infra for Node.

`test-equinix-ubuntu2204-x64-2` is down

anonrig opened this issue

While onboarding, we deleted workspace folders and rebooted the machine. Unfortunately, it's been down ever since. I don't have access to Equinix. I'd appreciate it if someone with access could reboot the machine from the Equinix UI.

cc @UlisesGascon

Machine: https://ci.nodejs.org/computer/test%2Dequinix%2Dubuntu2204%2Dx64%2D2/
Ref: nodejs/jenkins-alerts#678

Let me see if I can restore it from the Equinix dashboard.

@UlisesGascon It looks like you had been invited to join the Node.js org in Equinix Metal but hadn't accepted the invite (I resent it just in case).

In the meantime I'm going to attempt to fix this machine. If you cannot log in to the machine via ssh normally, try the out-of-band console in Equinix Metal (rightmost ">_" icon).

It will give you an ssh command to copy/paste -- you'll need to add -i <path to nodejs_build_test ssh key>.
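
As a minimal sketch of that step, the copied command can be rewritten to include the key. The console command and key path below are hypothetical placeholders, not values from this thread; the helper name `with_identity` is also made up for illustration.

```python
import shlex


def with_identity(ssh_command: str, key_path: str) -> str:
    """Return ssh_command with `-i key_path` inserted right after `ssh`."""
    parts = shlex.split(ssh_command)
    if not parts or parts[0] != "ssh":
        raise ValueError("expected a command starting with 'ssh'")
    # Options must come before the destination, so insert directly after `ssh`.
    return shlex.join([parts[0], "-i", key_path, *parts[1:]])


# Hypothetical console command and key location, for illustration only.
print(with_identity("ssh someid@sos.example.invalid", "/path/to/nodejs_build_test"))
```

Paste the real command from the Equinix Metal console in place of the placeholder and point `key_path` at wherever your copy of the nodejs_build_test key lives.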

I've done this and got:

[SOS Session Ready. Use ~? for help.]
[Note: You may need to press RETURN or Ctrl+L to get a prompt.]











Shell>

i.e. the machine is waiting for input at a UEFI prompt. We've seen that before on the arm64 machines but this is the first time, I think, that I've seen it on x64. I've typed exit to quit the shell prompt and will see if the machine is able to restart.

Crap, it's back at the prompt again 😞:

UEFI Interactive Shell v2.2
EDK II
UEFI v2.70 (American Megatrends, 0x0005000D)
Mapping table
     BLK0: Alias(s):
          PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x0,0xFFFF,0x0)
     BLK4: Alias(s):
          PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x1,0xFFFF,0x0)
     BLK1: Alias(s):
          PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x0,0xFFFF,0x0)/HD(1,GPT,DD089E04-788C-445B-BC66-5E18A1A25F6A,0x800,0x1000)
     BLK2: Alias(s):
          PciRoot(0x0)/Pci(0x17,0x0)/S80F7516-00B3-4C50-9C28-78357809439A,0x1800,0x3CF000)
     BLK3: Alias(s):
          PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x0,0xFFFF,0x0)/HD(3,GPT,B601B131-43B2-45B4-8A38-594A2AFDC545,0x3D0800,0x37A72E8F)



Press ESC in 1 seconds to skip startup.nsh or any other key to continue.
Shell>

I've requested that the machine be reinstalled in the Equinix Metal console. I'll rerun the Ansible setup on it.

@richardlau what's the procedure for getting access to the Equinix console?

You have to be in @nodejs/build-infra.

I've tried to re-run Ansible on the reinstalled machine, but it's failing:

TASK [jenkins-workspace : Initialize Git repository] ************************************************************************************************************************************************************
fatal: [test-equinix-ubuntu2204-x64-2]: FAILED! => {"changed": false, "cmd": "/usr/bin/git clone --bare https://github.com/nodejs/node /home/binary_tmp/binary_tmp.git", "msg": "Cloning into bare repository '/home/binary_tmp/binary_tmp.git'...\nfatal: Invalid path '/home/iojs/build': Permission denied", "rc": 128, "stderr": "Cloning into bare repository '/home/binary_tmp/binary_tmp.git'...\nfatal: Invalid path '/home/iojs/build': Permission denied\n", "stderr_lines": ["Cloning into bare repository '/home/binary_tmp/binary_tmp.git'...", "fatal: Invalid path '/home/iojs/build': Permission denied"], "stdout": "", "stdout_lines": []}

PLAY RECAP ******************************************************************************************************************************************************************************************************
test-equinix-ubuntu2204-x64-2 : ok=39   changed=1    unreachable=0    failed=1    skipped=112  rescued=0    ignored=0
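
For anyone scripting around runs like this, here is a rough sketch of reading the counters out of a PLAY RECAP line to spot hosts that failed. The function name `parse_recap_line` is hypothetical, and the parsing assumes the `host : key=value ...` layout shown in the output above.

```python
import re


def parse_recap_line(line: str) -> tuple[str, dict[str, int]]:
    """Split a PLAY RECAP line into (host, {counter: value})."""
    # Assumes the host name itself contains no ':' (true for the line above).
    host, _, counters = line.partition(":")
    stats = {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", counters)}
    return host.strip(), stats


line = (
    "test-equinix-ubuntu2204-x64-2 : ok=39   changed=1    unreachable=0    "
    "failed=1    skipped=112  rescued=0    ignored=0"
)
host, stats = parse_recap_line(line)
if stats.get("failed", 0) or stats.get("unreachable", 0):
    print(f"{host}: run did not complete cleanly ({stats})")
```

This only flags that the run failed; the actual cause is still in the task output (here, the `Permission denied` from the `git clone --bare` step).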

I don't really have the time to pursue this further -- perhaps someone else with test level access could take a look. For now the machine is marked offline in Jenkins.

Since the machine was reinstalled you may need to use ssh-keygen -R <ip address> to remove the old host identifier before you can log into the reinstalled test-equinix-ubuntu2204-x64-2.
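
A small sketch of that cleanup step, in case it needs to be scripted. The helper names are hypothetical and the address is a documentation-range placeholder, not the machine's real IP.

```python
import subprocess


def forget_host_command(host: str) -> list[str]:
    """Build the ssh-keygen invocation that drops `host` from known_hosts."""
    return ["ssh-keygen", "-R", host]


def forget_host(host: str) -> None:
    """Run ssh-keygen -R for `host`, raising if the command fails."""
    subprocess.run(forget_host_command(host), check=True)


# Placeholder address (192.0.2.0/24 is reserved for documentation).
print(forget_host_command("192.0.2.10"))
```

Calling `forget_host(...)` with the machine's actual address removes the stale entry, so the next ssh connection prompts for the reinstalled host's new key instead of refusing to connect.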

It looks like you had been invited to join the Node.js org in Equinix Metal but hadn't accepted the invite (I resent it just in case).

Thanks @richardlau! I was able to accept it now.

Side note: I was able to SSH into the machine, so I think that we can try to re-ansible the machine 🤔