belanasaikiran/oneapi-devsummit-sea-2023

Welcome

Thank you for choosing to attend the AI Workshop @ oneAPI Devsummit 2023. This GitHub repository holds the artifacts needed to participate in the hands-on AI workshop.

Objectives

Get hands-on experience with the new Intel® Developer Cloud (Beta).
Explore the optimizations delivered through the Intel® Extension for TensorFlow* (ITEX); see the ITEX docs.
Understand the specifics of effectively utilizing the following Intel® hardware for AI workloads: (1) the 4th Generation Intel® Xeon® Scalable Processor (codenamed Sapphire Rapids) and (2) the Intel® Data Center GPU Max 1100 (codenamed Ponte Vecchio).

Pre-requisites

You have registered and can log in to the Intel® Developer Cloud (IDC).
Yet to register on IDC? This guide helps you get started.
Your laptop has a basic ssh client installed.
Most Linux distributions and macOS come with an ssh client pre-installed.
If you are on Microsoft Windows, open Command Prompt and verify that the commands 'ssh' and 'ssh-keygen' work. If it says 'command not recognized', you can install an ssh client such as MobaXterm* or PuTTY*.
You have access to the oneAPI Discord channel.
This Discord channel can help resolve your queries during and after the workshop.
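
The ssh client check above can also be scripted. This is a minimal sketch, assuming a POSIX-style shell (Linux, macOS, or a Unix-like shell on Windows); the helper name `missing_tools` is made up for illustration and is not part of the workshop material.

```shell
#!/usr/bin/env bash
# Print the name of every listed command that is NOT found on PATH.
# `missing_tools` is a hypothetical helper name used only for this sketch.
missing_tools() {
  local tool
  for tool in "$@"; do
    # `command -v` succeeds only when the shell can resolve the command.
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools ssh ssh-keygen)"
if [ -z "$missing" ]; then
  echo "ssh client tools found"
else
  echo "missing tools:"
  echo "$missing"
fi
```

If anything is reported as missing, install an ssh client before continuing.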

Getting started with IDC

Note : Have you already ssh'ed on to the head node? If so, you can skip this section.

Once registered on IDC, perform the following steps to access the IDC "Scheduled access" nodes.

Sign in to https://cloud.intel.com.
Post your ssh public key on your IDC profile.
If you already have a key under $HOME/.ssh/id_rsa.pub, you can use that key.
If not, generate a key-pair using the ssh-keygen command (press Enter to accept blank defaults).
Visit the 'View Instances' tab and ensure that there are no running instances.
Go to the 'Launch Instance' tab and launch the 'Schedule Access' instance (it's the first option in the list).
Go to the 'View Instances' tab and check that the instance you created is listed there.
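
The key step above (reuse an existing key, otherwise generate one) could be sketched as follows. The helper name `ensure_ssh_key` is hypothetical, and the sketch assumes an RSA key at the path mentioned above with an empty passphrase, which is what pressing Enter at the ssh-keygen prompts would give you.

```shell
#!/usr/bin/env bash
# Create an RSA key pair at the given path unless a public key already exists.
# `ensure_ssh_key` is a hypothetical helper name used only for this sketch.
ensure_ssh_key() {
  local key_path="${1:-$HOME/.ssh/id_rsa}"
  if [ -f "${key_path}.pub" ]; then
    echo "reusing existing key: ${key_path}.pub"
  else
    mkdir -p "$(dirname "$key_path")"
    # -N "" sets an empty passphrase (same as pressing Enter at the prompts).
    ssh-keygen -t rsa -f "$key_path" -N "" >/dev/null
    echo "generated new key: ${key_path}.pub"
  fi
}
```

Call `ensure_ssh_key` with no arguments to use the default path, then paste the contents of the `.pub` file into your IDC profile.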

Create an SSH config file.

Create a file named 'config' at the path $HOME/.ssh/config. Copy the contents below and replace the username with your own.

Host myidc
Hostname idcbetabatch.eglb.intel.com
# Change the User value to reflect your username obtained in step 4
User uXXXXXX
ServerAliveInterval 60
ServerAliveCountMax 10
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null

Open a command prompt and log in with 'ssh myidc'.

Note: The above steps assume that your laptop is connected to the open Internet and is NOT behind a corporate VPN/proxy. Additional steps, as highlighted in this guide, might be needed to get it working behind a proxy.
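
The config file described above can also be written from the shell. This is a minimal sketch; the helper name `write_idc_config` and its arguments are hypothetical, and it appends to the file, so running it twice would create a duplicate entry.

```shell
#!/usr/bin/env bash
# Append the workshop host entry to an SSH config file.
# `write_idc_config` is a hypothetical helper name used only for this sketch.
write_idc_config() {
  local user="$1"                          # your IDC username (uXXXXXX)
  local config="${2:-$HOME/.ssh/config}"   # target file, defaults to the usual path
  mkdir -p "$(dirname "$config")"
  cat >> "$config" <<EOF
Host myidc
    Hostname idcbetabatch.eglb.intel.com
    User $user
    ServerAliveInterval 60
    ServerAliveCountMax 10
    StrictHostKeyChecking no
    UserKnownHostsFile=/dev/null
EOF
  # Restrict the file to its owner; ssh refuses configs that others can write to.
  chmod 600 "$config"
}
```

For example, `write_idc_config uXXXXXX` (with your real username) followed by `ssh myidc`.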

Getting started on AI workshop

SSH into the IDC head node.
```
ssh myidc
```
Request a compute node.
```
srun -p pvc-shared --pty /bin/bash
```

Clone this repository and change directory.

git clone https://github.com/vishnumadhu365/oneapi-devsummit-sea-2023.git
cd oneapi-devsummit-sea-2023


Prepare environment.
Note: This step can take 15 to 20 minutes to complete and needs to be executed only once.
```
source prepare_env.sh
```
If everything goes well, you should see the Jupyter logs, which include 2 links.
Note down the following: (1) the IP address of the Jupyter server (starting 10.10.10.x) and (2) its port number (starting 88xx). (3) Copy to a notepad the link starting with 127.0.0.1:88xx/tree?token=........
In a new terminal, create an ssh tunnel to the Jupyter server.
```
ssh -L port-number:ip-address:port-number myidc
```
Sample ssh command: ssh -L 88xx:10.10.10.x:88xx myidc
Use the IP address and port number from step 5.
Open a browser on your laptop and go to the URL copied earlier (starting with 127.0.0.1:88xx).
The browser opens a Jupyter workspace with the ipynb notebook files.
You are all set to run through the exercises in the ipynb notebooks.
What should you do if the terminal window is closed by mistake or the SSH connection is interrupted?
You can resume your work by repeating the above 8 steps, except for step 4, where you instead run
```
source resume_env.sh
```
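
The tunnel command in the steps above can be assembled from the two values noted from the Jupyter logs. This is a minimal sketch; the helper name `tunnel_cmd` is hypothetical, and the IP address and port in the example call are placeholders, not real workshop values.

```shell
#!/usr/bin/env bash
# Build the port-forwarding command from the values printed in the Jupyter logs.
# `tunnel_cmd` is a hypothetical helper name used only for this sketch.
tunnel_cmd() {
  local ip="$1" port="$2" host="${3:-myidc}"
  # -L local_port:remote_ip:remote_port forwards 127.0.0.1:port on the laptop
  # to the Jupyter server on the compute node, going through the head node.
  printf 'ssh -L %s:%s:%s %s\n' "$port" "$ip" "$port" "$host"
}

# Example values only; substitute the IP address and port from your own logs.
tunnel_cmd 10.10.10.5 8888
```

The function only prints the command, so you can check it before running it in a new terminal.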

Common issues

I was running the ipynb notebook, then the terminal exited abruptly. How do I resume my work?
Log in to the head node >> Use srun to get onto a compute node >> Navigate to the cloned repo directory >> Run 'source resume_env.sh'. As before, this prints the Jupyter Notebook link >> Repeat steps 5, 6 and 7 in the 'Getting started on AI workshop' section above.
The GPU notebook has been running for more than 10 minutes and seems to be stuck. What should I do?
The likely cause is overutilization of the GPU. Within the GPU notebook, go to Kernel >> Restart and Clear All Output, then manually run the cells from the top, this time choosing a different GPU device based on the GPU frequency table (pick the one with the lowest frequency).
Facing random errors?
Try again after restarting the Jupyter kernel. Also, note that the notebook is designed to be executed top to bottom without skipping any cells.
If nothing works, feel free to reach out to the presenters onsite or post the issue on Discord for assistance.
The srun job is hung waiting in the queue.
There is a limit of one running job per user. Check the queue to see if there are any orphaned jobs under your user ID. Delete the job with the scancel command and try srun again.
Running sycl-ls does not list any GPUs.
GPUs are available only on the compute nodes and not on the head node. Check the bash prompt and verify that you are on a compute node.
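
The orphaned-job cleanup described above could be previewed with a small helper. This is a sketch under assumptions: the scheduler's `squeue` command is assumed to be available on the head node alongside `srun` and `scancel`, and the helper name `cancel_cmds` is hypothetical. It only prints the `scancel` commands so that you can review them before running any.

```shell
#!/usr/bin/env bash
# Turn a list of job IDs (one per line on stdin) into scancel commands.
# `cancel_cmds` is a hypothetical helper name used only for this sketch.
cancel_cmds() {
  local job_id
  while read -r job_id; do
    # Skip blank lines; print one scancel command per job ID.
    [ -n "$job_id" ] && printf 'scancel %s\n' "$job_id"
  done
  return 0
}

# List your own job IDs without a header (-h) and preview the cleanup commands.
if command -v squeue >/dev/null 2>&1; then
  squeue -h -u "$USER" -o '%i' | cancel_cmds
else
  echo "squeue not found: run this on the IDC head node"
fi
```

Run the printed `scancel` command for any job you no longer need, then try srun again.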

Legal Notices and Disclaimers

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at www.intel.com.
Cost reduction scenarios described including recommendations are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.
Any forecasts of goods and services needed for Intel’s operations are provided for discussion purposes only. Intel will have no liability to make any purchase in connection with forecasts published in this document.
Intel technologies may require enabled hardware, software or service activation.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.
Performance tests are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit www.intel.com/benchmarks.

* Other names and brands may be claimed as the property of others.

Your costs and results may vary.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
Copyright 2023 Intel Corporation.
