# SIG/HPC meeting 2023-10-19

## Attendees:
* Sherif
* Stack
* Alan Marshall
* Jeremy Siadal

## Discussions:

Stack asked about automating the process for building slurm packages. Sherif explained how the packaging process works and how we can improve it by using upstream monitoring tools.

Jeremy suggested starting work on an HPC kernel for Rocky; it will be mostly based on the standard Rocky kernel with a different configuration file.

Stack found a problem with slurmrestd and will look into it for next week.

## Action items:
* Sherif to create a repo for the HPC kernel, kernel-hpc-node
* Jeremy to get the ball rolling with the Intel GPU driver
* Stack to fix the slurm REST daemon and integrate it with openQA
* Sherif: staging repo for HPC

## Old business:

## 2023-10-09:
* None for this meeting; however, we should be working on old business action items

## 2023-09-21:
* Sherif: Get the SIG for drivers
* Sherif: Check the names of NVIDIA drivers "open, dkms and closed source"
* Chris: Benchmark NVIDIA open vs closed source

## 2023-09-07:
* Sherif: Reach out to the AI SIG to check on hosting the NVIDIA drivers that CIQ would like to contribute - Done and waiting to hear from them -

## 2023-08-24:
* Sherif: To push the testing repo file to the release package
* Sherif: testing / merging the_real_swa scripts

## 2023-08-10:
* Sherif: Looking into the openQA testing - Pending

## 2023-07-27:
* Sherif: Reach out to jose-d about pmix - Done, no feedback yet -
* Greg: to reach out to openPBS and cloud charly
* Sherif: To update slurm23 to latest - Done -

## 2023-07-13:
* Sherif needs to update the wiki - Done
* Sherif to look into MPI stack
* Chris will send Sherif a link with intro

## 2023-06-29:
* Sherif to release slurm23 sources - Done
* Stack and Sherif working on the HPC list
* Sherif to email Jeremy the slurm23 source URL - Done

## 2023-06-15:
* Sherif to look into the openHPC slurm spec file - Pending on Sherif
* We need to get lists of centres and HPC that are moving to Rocky to make a blog post and PR

## 2023-06-01:
* Get a list of packages from Jeremy to pick up from openHPC - Done
* Greg / Sherif talk in Rocky / RESF about generic SIG for common packages such as chaintools
* Plan the openHPC demo Chris / Sherif - Done
* Finalise the slurm package with naming / configuration - Done

## 2023-05-18:
* Get a demo / technical talk after 4 weeks "Sherif can arrange that with Chris" - Done
* Getting a list of packages that openHPC would like to move to distros "Jeremy will be point of contact if we need those in a couple of weeks" - Done

## 2023-05-04:
* Start building slurm - Ongoing, slowed down a bit by the R9.2 and R8.8 releases; however, packages are built and some minor configurations need to be fixed -
* Start building apptainer - on hold -
* Start building singularity - on hold -
* Start building warewulf - on hold -
* Sherif: check about forums - done, we can have our own section if we want, can be discussed over the chat -

## 2023-04-20:
* Reach out to other communities “Greg” - ongoing -
* Reaching out to different sites that use Rocky for HPC “Stack will ping a few of them and others as well -Group effort-”
* Reaching out to hardware vendors - nothing done yet -
* Statistic / public registry for sites / HPC to add themselves if they want - nothing done yet -