Guidance on Installing CPE using the Package Repository

No Official Support

Please note that these are not official installation instructions. To install a complete version of CPE, follow the official download and installation instructions for CSM, HPCM, or HPE Cray XD2000 systems. For questions or discussion, join our Slack Workspace and post in the #hpe-cray-programming-environment channel.

HPCM based Systems

Instructions

  1. Enable repositories, including the following:

For systems using COS Base or SLES:

  • SLE Module Basesystem

  • SLE Module HPC

For systems using RHEL:

  • RHEL BaseOS

  • RHEL AppStream

  • RHEL CRB

  • RHEL EPEL

2. Create a local repository from the CPE RPMs downloaded from the public repository.

admin# cm repo add --custom <CPE_REPO> <CPE_REPO_PATH>

3. Verify that the new repository was created.

admin# cm repo show

In the output, the repository names are on the left side of the colon.

  4. (Optional) Add the new repository to a repository group.

a. Display the available groups on the system.

admin# cm repo group show

In the output, find <REPO_GROUP> in the list. This is the repository group for the user services and compute nodes where the CPE is to be installed.

b. Add the CPE repository to the repository group.

admin# cm repo group add <REPO_GROUP> --repos <CPE_REPO>
  5. (Optional for testing) Install an RPM from the repository onto a running node.

a. Display a list of nodes on the system.

admin# cm node show
b. Install the desired RPMs to a node using the cm-cli.

For systems using COS Base or SLES:

admin# cm node zypper -n <NODE> --repo-group <REPO_GROUP> install <RPMs>

For systems using RHEL:

admin# cm node dnf -n <NODE> --repo-group <REPO_GROUP> install <RPMs>
  6. Display a list of the available images.

To list all available images:

admin# cm image show

To list which images are assigned to nodes:

admin# cm node show -I -n <node>

In the output, find <IMAGE_NAME> within the list of images used for the compute, login, and service nodes where the CPE will be installed.

7. (Optional) Copy an existing image to a new one. Copying the image may take some time.

admin# cm image copy -o <EXISTING_IMAGE> -i <CPE_IMAGE>

8. Install the CPE RPMs into the image using the cm-cli.

admin# cm image <dnf | zypper> --image <CPE_IMAGE> --repos <CPE_REPO> --duk 'install <rpm1> <rpm2> ...'

9. Create a squashfs image using the following:

admin# mksquashfs /opt/clmgr/image/images/<CPE_IMAGE> <NEW_IMAGE_NAME>.squashfs -noappend -b 1M -comp xz -Xbcj x86 -no-xattrs -wildcards -e '!(opt)' 'opt/!(cray)' 'opt/cray/!(pe)'

10. Activate the newly created squashfs by copying it to the following location:

admin# cp <NEW_IMAGE_NAME>.squashfs /opt/clmgr/image/images_rw_nfs/pe/PE/<VERSION>/cpe-base-<OS>.<ARCH>-<VERSION>.squashfs

<OS> must be in the format "sles15spX" or "rhelXX"
<ARCH> must be "x86_64" or "aarch64"
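
For example, for a SLES 15 SP5 x86_64 image of the 24.07 release (illustrative values only):

admin# cp cpe-sles15sp5.x86_64-24.07.squashfs /opt/clmgr/image/images_rw_nfs/pe/PE/24.07/cpe-base-sles15sp5.x86_64-24.07.squashfs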

11. Install “cm-pe-integration” and “cpe-support” into the compute image:

admin# cm image <dnf | zypper> --image <COMPUTE_IMAGE> --repos <CPE_REPO> --duk 'install cm-pe-integration cpe-support'

12. Modify /opt/clmgr/image/images/<COMPUTE_IMAGE>/etc/cray-pe.d/pe_releases so that it contains the same <VERSION> as the newly created squashfs image:

admin# vi /opt/clmgr/image/images/<COMPUTE_IMAGE>/etc/cray-pe.d/pe_releases
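
After editing, the file should contain just the version string; for example, for a 24.07 release:

admin# cat /opt/clmgr/image/images/<COMPUTE_IMAGE>/etc/cray-pe.d/pe_releases
24.07
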
  13. The CPE installation is now complete. Make any additional changes to the compute image and move forward to booting the nodes with the new image.

CSM based Systems

Procedure to install and configure CPE from downloadable RPMs.

  1. Get new rpms into Nexus

  2. Prepare new CPE recipe

  3. Build CPE image

  4. Prepare to create the image

  5. Get the IMS recipe to build

  6. Verify Creation

  7. Clean up the creation environment

  8. Trim the squashfs

  9. Update CPE deployment

Get new rpms into Nexus

  1. Get or create an RPM list for the new CPE release.

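    A minimal sketch of an RPM list file, one package file name per line (example names reused from the Nexus upload output shown later in this procedure):

    cray-mpixlate-1.0.5-libs-intel20221-ucx-0-6.sles15sp5.x86_64.rpm
    cpe-intel-24.07-20240627185640_901d2ff69c23-4.sles15sp5.x86_64.rpm
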
  2. Fetch RPMs from the RPM repository.

    URL=https://<email-address>:<token>@update1.linux.hpe.com/repo/cpe/<version>/<type>/<os_ver>
    

    Run a script like the following to download the RPMs with wget (the $ARCH component is appended by the script):

    ARCH=x86_64
    #ARCH=aarch64
    echo "working on $ARCH"
    mkdir $ARCH
    cd $ARCH
    # Download each RPM named in the list (one file name per line)
    for rpm in $(cat ../<RPM_LIST>)
    do
      wget $URL/$ARCH/$rpm
    done
    cd ..
    
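    As an optional sanity check (a sketch, assuming one RPM name per line in <RPM_LIST>), compare the number of files downloaded against the length of the list:

    ls $ARCH | wc -l
    cat <RPM_LIST> | wc -l
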
  3. Load rpms into Nexus.

    1. Find install-3p.sh from the previous CPE installation. Look in the $MEDIA_DIR of the previous $ACTIVITY_NAME that was used to install the previous CPE with IUF.

      INSTALL3P=/etc/cray/upgrade/csm/media/update-products-<VERSION>/<CPE_VERSION>/install-3p.sh
      
    2. Load rpms into Nexus for the $ARCH variable.

      VERSION=<CPE_VERSION>
      $INSTALL3P $ARCH $VERSION
      

      Example output:

      ________________________________________
      Loading install dependencies... OK
      
      ________________________________________
      Getting Nexus credentials... OK
      
      ________________________________________
      Looking for yum/hosted repository: cpe-24.07-sles15-sp5-test... NOT found...Creating yum/hosted repository: cpe-24.07-sles15-sp5-test...201 OK created
      
      ________________________________________
      Uploading assets in /etc/cray/upgrade/csm/media/cpe-rpms-only/x86_64 to cpe-24.07-sles15-sp5-test...
      /data/cray-mpixlate-1.0.5-libs-intel20221-ucx-0-6.sles15sp5.x86_64.rpm  204 373219 0.522975 713645
      /data/cray-mpixlate-1.0.5-libs-intel20221-ucx-0-6.sles15sp5.x86_64.rpm  created
      /data/cpe-intel-24.07-20240627185640_901d2ff69c23-4.sles15sp5.x86_64.rpm        204 7111 0.552895 12861
      /data/cpe-intel-24.07-20240627185640_901d2ff69c23-4.sles15sp5.x86_64.rpm        created
      
      [...]
      
      /data/cce-18.0.0-202406031331.98280ea519a0c-4.sles15sp4.x86_64.rpm      204 1000181919 53.479475 18702164
      /data/cce-18.0.0-202406031331.98280ea519a0c-4.sles15sp4.x86_64.rpm      created
      real    0m 57.89s
      user    0m 8.40s
      sys     0m 13.53s
      Ok
      
      ________________________________________
      Triggering rebuild of Yum repodata for cpe-24.07-sles15-sp5-test...Getting rebuild Yum repodata tasks... OK
      Creating task to rebuild Yum repodata for cpe-24.07-sles15-sp5-test... OK
      {
        "name": "create_task",
        "result": "QuartzTaskInfo{jobKey=nexus.523140ca-6b80-455e-948e-b9ec5e5884aa, state=WAITING, taskState=QuartzTaskState{taskConfiguration={.updated=2024-07-09T19:31:05.312Z, .name=Rebuild repodata - cpe-24.07-sles15-sp5-test, .recoverable=false, .enabled=true, .id=450e1ebf-2766-4aef-877e-31068f87eecf, .typeName=Repair - Rebuild Yum repository metadata (repodata), .visible=true, repositoryName=cpe-24.07-sles15-sp5-test, .typeId=repository.yum.rebuild.metadata, .exposed=true, .created=2024-07-09T19:31:05.312Z}, schedule=Manual{properties={schedule.type=manual}}, nextExecutionTime=Sun Aug 17 07:12:55 UTC 292278994}, taskFuture=null, removed=false}"
      }
      Getting rebuild Yum repodata tasks... OK
      Running rebuild task 450e1ebf-2766-4aef-877e-31068f87eecf... OK
      {
        "id": "450e1ebf-2766-4aef-877e-31068f87eecf",
        "name": "Rebuild repodata - cpe-24.07-sles15-sp5-test",
        "type": "repository.yum.rebuild.metadata",
        "message": "Rebuild metadata for cpe-24.07-sles15-sp5-test",
        "currentState": "WAITING",
        "lastRunResult": "OK",
        "nextRun": null,
        "lastRun": "2024-07-09T19:31:05.432+00:00"
      }
      ________________________________________
      Waiting for Nexus to create repository metadata for cpe-24.07-sles15-sp5-test...
      cpe-24.07-sles15-sp5-test repo metadata is not ready yet...
      cpe-24.07-sles15-sp5-test repo metadata is not ready yet...
      cpe-24.07-sles15-sp5-test repo metadata is not ready yet...
      cpe-24.07-sles15-sp5-test repo metadata is not ready yet...
      OK - cpe-24.07-sles15-sp5-test repo metadata exists
      
      OK
      
      ________________________________________
      Cleaning up install dependencies...
      Untagged: docker.io/library/cray-nexus-setup:cpe-23.12-sles15-sp5
      Deleted: 616e9cf15e8eefb4b499b46810112765c46763fb927cc23f423fc9d0b7be0c52
      Untagged: docker.io/library/skopeo:cpe-23.12-sles15-sp5
      Deleted: 55c5f847c8b86b47376d1de6226e86295020db0e22a5a79786ec13b2c6fbba96
      OK
      
      ________________________________________
      Done
      

Prepare new CPE recipe

  1. Find all recipes in IMS.

    cray ims recipes list --format json  > cray.ims.recipes.list.json
    
  2. Search for the previous cpe-barebones-sles15sp5 recipe. The name will contain $ARCH and the previous CPE release version.

    For example:

    cpe-barebones-sles15sp5.x86_64-23.12.3
    
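
    One way to search the saved JSON for candidate recipe names (a sketch, assuming jq is available and that the list output is a JSON array of recipe records):

    jq -r '.[].name' cray.ims.recipes.list.json | grep cpe-barebones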

    Set these variables

    ARCH=x86_64
    PREVIOUS_VERSION=23.12.3
    
  3. Extract the old recipe from IMS.

    cray ims recipes describe --format json cpe-barebones-sles15sp5.$ARCH-$PREVIOUS_VERSION
    

    Note the path field.

    Example output:

    {
      "arch": "x86_64",
      "created": "2024-03-25T21:59:36.367188+00:00",
      "id": "1a9deae0-ea13-4774-80af-26cb91c29f25",
      "link": {
        "etag": "bf247036c0ab36d2a59a6e4d3f26f374",
        "path": "s3://ims/recipes/1a9deae0-ea13-4774-80af-26cb91c29f25/recipe.tar.gz",
        "type": "s3"
      },
      "linux_distribution": "sles15",
      "name": "cpe-barebones-sles15sp5.x86_64-24.3.0",
      "recipe_type": "kiwi-ng",
      "require_dkms": false,
      "template_dictionary": []
    }
    
  4. Save a copy of the old recipe from S3.

    Use the path field from the IMS recipe to extract the recipe file from the S3 ims bucket into a local file.

    cray artifacts get ims recipes/1a9deae0-ea13-4774-80af-26cb91c29f25/recipe.tar.gz cpe-barebones-sles15sp5.$ARCH-$PREVIOUS_VERSION.recipe.tgz
    
  5. Create a directory and expand the old recipe.

    mkdir <IMAGE_NAME>.$ARCH-$VERSION
    cd <IMAGE_NAME>.$ARCH-$VERSION
    tar xpf ../<IMAGE_NAME>.$ARCH-$PREVIOUS_VERSION.recipe.tgz
    
  6. Edit the new recipe.

    1. Save a copy of config.xml.

      cp -p config.xml config.xml.orig
      
    2. Edit the config.xml file.

      vi config.xml
      

      There are several changes to make in the config.xml file.

      1. Image name

      2. Version

      3. CPE repository stanza

      4. COS or COS/USS repository stanzas

      5. WLM stanza

      6. Choice of meta-packages to be installed. aarch64 has two to choose from, while x86_64 has five: cpe-base plus the optional third-party compiler support bundles (cpe-aocc, cpe-amd, cpe-intel, cpe-nvidia), as shown in the diff below. Not all are needed on every system.

      When the editing is complete, the changes should look like the following diff.

      diff config.xml.orig config.xml
      

      Example output:

      3c3
      < <image schemaversion="6.8" name="cpe-barebones-sles15sp5">
      ---
      > <image schemaversion="6.8" name="cpe-barebones-sles15sp5-24.07-test">
      13c13
      <         <version>23.12</version>
      ---
      >         <version>24.07-test</version>
      27,29c27,29
      <     <!-- CPE 23.13 SLES15SP5 -->
      <     <repository type="rpm-md" alias="cpe-23.12-sles15-sp5" priority="5" imageinclude="true">
      <         <source path="https://packages.local/repository/cpe-23.12-sles15-sp5/"/>
      ---
      >     <!-- CPE 24.07 SLES15SP5 -->
      >     <repository type="rpm-md" alias="cpe-24.07-sles15-sp5-test" priority="5" imageinclude="true">
      >         <source path="https://packages.local/repository/cpe-24.07-sles15-sp5-test/"/>
      32,33c32,37
      <     <repository type="rpm-md" alias="cos-sle-15sp5-cn" priority="1" imageinclude="true">
      <         <source path="https://packages.local/repository/cos-2.6-sle-15sp5-compute/"/>
      ---
      >     <repository type="rpm-md" alias="cos-base-3.0.0-26-sle-15.5" priority="1" imageinclude="true">
      >           <source path="https://packages.local/repository/cos-base-3.0.0-26-sle-15.5/"/>
      >     </repository>
      >     <!-- uss SLES15SP5 CN -->
      >     <repository type="rpm-md" alias="uss-1.0.0-61-cos-base-3.0" priority="1" imageinclude="true">
      >           <source path="https://packages.local/repository/uss-1.0.0-61-cos-base-3.0/"/>
      36,37c40,41
      <     <repository type="rpm-md" alias="slingshot-host-software-cos-sle-15sp5-cn" priority="1" imageinclude="true">
      <         <source path="https://packages.local/repository/cos-2.6-net-sle-15sp5-compute-shs-2.1/"/>
      ---
      >     <repository type="rpm-md" alias="slingshot-host-software-2.1-cos-3.0-x86_64-sle15-sp5-cn-cassini" priority="1" imageinclude="true">
      >           <source path="https://packages.local/repository/slingshot-host-software-2.1-cos-3.0-x86_64-sle15-sp5-cn-cassini"/>
      39,41c43,49
      <     <!-- WLM PBS SLES15sp5 CN -->
      <     <repository type="rpm-md" alias="wlm-pbs-sle-15sp5-cn" priority="3" imageinclude="false">
      <         <source path="https://packages.local/repository/wlm-pbs-1.2-sle-15sp5-compute/"/>
      ---
      >     <!-- WLM Slurm SLES15sp5 CN -->
      >     <repository type="rpm-md" alias="slurm-2.0.5-23.02.6-sle-15.5" priority="3" imageinclude="false">
      >           <source path="https://packages.local/repository/slurm-2.0.5-23.02.6-sle-15.5/"/>
      >     </repository>
      >     <!-- Nvidia driver default -->
      >     <repository type="rpm-md" alias="https://packages.local/repository/nvidia-driver-default" priority="3" imageinclude="false">
      >           <source path="https://packages.local/repository/nvidia-driver-default/"/>
      107,110d114
      <     <!-- SUSE SLE INSTALLER 15 SP5 Updates -->
      <     <repository type="rpm-md" alias="SUSE-SLE-INSTALLER-15-SP5-x86_64-Updates" priority="2" imageinclude="false">
      <         <source path="https://packages.local/repository/SUSE-SLE-INSTALLER-15-SP5-x86_64-Updates/"/>
      <     </repository>
      141,145c149,155
      < <!--        <package name="cpe-base-23.12"/>   -->
      < <!--        <package name="cpe-aocc-23.12"/>   -->
      < <!--        <package name="cpe-amd-23.12"/>    -->
      < <!--        <package name="cpe-intel-23.12"/>  -->
      < <!--        <package name="cpe-nvidia-23.12"/> -->
      ---
      > <!--        <package name="cpe-base-24.07"/>   -->
      >         <package name="cpe-base-24.07"/>
      > <!--        <package name="cpe-aocc-24.07"/>   -->
      > <!--        <package name="cpe-amd-24.07"/>    -->
      > <!--        <package name="cpe-intel-24.07"/>  -->
      > <!--        <package name="cpe-nvidia-24.07"/> -->
      >         <package name="cpe-nvidia-24.07"/>
      
    3. Save a copy of config.sh.

      cp -p config.sh config.sh.orig
      
    4. Edit the config.sh file.

      vi config.sh
      

      Comment out the section that would remove CCE and GCC from the image. When the editing is complete, the changes should look like the following diff.

      diff config.sh.orig config.sh
      

      Example output:

      111,123c111,123
      < # remove cce and gcc from non-base image
      < if ! [[ ${kiwi_iname} =~ base ]]; then
      <   if [[ -d /opt/cray/pe/cce ]]; then
      <     cce_version=$(ls /opt/cray/pe/cce/ | head -1)
      <     rpm -eh --nodeps cce-${cce_version}
      <   fi
      <   if [[ -d /opt/cray/pe/gcc ]]; then
      <     for version in `find /opt/cray/pe/gcc -maxdepth 1 -regex ".*[0-9]+" | cut -d/ -f6`; do
      <       rpm -eh --nodeps --noscripts cpe-gcc-${version}
      <     done
      <     rm -rf /opt/cray/pe/gcc-libs
      <   fi
      < fi
      ---
      > ## remove cce and gcc from non-base image
      > #if ! [[ ${kiwi_iname} =~ base ]]; then
      > #  if [[ -d /opt/cray/pe/cce ]]; then
      > #    cce_version=$(ls /opt/cray/pe/cce/ | head -1)
      > #    rpm -eh --nodeps cce-${cce_version}
      > #  fi
      > #  if [[ -d /opt/cray/pe/gcc ]]; then
      > #    for version in `find /opt/cray/pe/gcc -maxdepth 1 -regex ".*[0-9]+" | cut -d/ -f6`; do
      > #      rpm -eh --nodeps --noscripts cpe-gcc-${version}
      > #    done
      > #    rm -rf /opt/cray/pe/gcc-libs
      > #  fi
      > #fi
      139c139
      < exit 0
      \ No newline at end of file
      ---
      > exit 0
      
    5. (Optional) Remove the original files so they will not appear in the new recipe.

      rm config.sh.orig config.xml.orig
      
    6. Set environment variables for the image root name and for the file that will contain the archive of the image recipe.

      IMAGE_ROOT=<IMAGE_NAME>.$ARCH-$VERSION
      ARTIFACT_FILE=$IMAGE_ROOT.recipe.tgz
      
    7. Create the new tar file.

      tar czf ../$ARTIFACT_FILE .
      cd ..
      
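      To double-check the archive contents before uploading (optional):

      tar tzf $ARTIFACT_FILE | head
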
    8. Upload new recipe to IMS.

      cray ims recipes create --name "<IMAGE_NAME>.$ARCH-$VERSION" \
         --recipe-type kiwi-ng --linux-distribution <LINUX_IMAGE_NAME> \
         --arch $ARCH --require-dkms False --format toml
      

      Example output:

      created = "2024-07-09T18:52:25.118518+00:00"
      id = "c8be1727-bd59-4639-87a5-f0e2bd05b594"
      recipe_type = "kiwi-ng"
      linux_distribution = "sles15"
      require_dkms = false
      arch = "x86_64"
      name = "cpe-barebones-sles15sp5.x86_64-cpe-24.07-test"
      template_dictionary = []
      
    9. Create a variable for the id value in the returned data.

      IMS_RECIPE_ID=c8be1727-bd59-4639-87a5-f0e2bd05b594
      
    10. Upload the customized recipe to S3.

      It is suggested as a best practice that the S3 object name start with recipes/ and contain the IMS recipe ID, in order to remove ambiguity.

      cray artifacts create ims recipes/$IMS_RECIPE_ID/$ARTIFACT_FILE $ARTIFACT_FILE
      

      Example output:

      artifact = "recipes/c8be1727-bd59-4639-87a5-f0e2bd05b594/cpe-barebones-sles15sp5.x86_64-cpe-24.07-test.recipe.tgz"
      Key = "recipes/c8be1727-bd59-4639-87a5-f0e2bd05b594/cpe-barebones-sles15sp5.x86_64-cpe-24.07-test.recipe.tgz"
      
    11. Update the IMS recipe record with the S3 path to the recipe archive.

      cray ims recipes update $IMS_RECIPE_ID --link-type s3 \
        --link-path s3://ims/recipes/$IMS_RECIPE_ID/$ARTIFACT_FILE --format toml
      

      Example output:

      arch = "x86_64"
      created = "2024-07-09T18:52:25.118518+00:00"
      id = "7b79dfd7-c107-41b6-b1de-c3a8b513214c"
      linux_distribution = "sles15"
      name = "cpe-barebones-sles15sp5.x86_64-cpe-24.07-test"
      recipe_type = "kiwi-ng"
      require_dkms = false
      template_dictionary = []
      
      [link]
      etag = ""
      path = "s3://ims/recipes/7b79dfd7-c107-41b6-b1de-c3a8b513214c/cpe-barebones-sles15sp5.x86_64-cpe-24.07-test.recipe.tgz"
      type = "s3"
      

Build CPE image

  1. Check for an existing IMS public key ID.

    Skip this step if it is known that a public key associated with the user account being used was not previously uploaded to the IMS service.

    The following query may return multiple public key records. The correct one will have a name value including the current username in use.

    cray ims public-keys list
    

    Example output excerpt:

    [[results]]
    public_key = "ssh-rsa <PUBLIC_KEY>"
    id = "<PUBLIC_ID>"
    name = "username public key"
    created = "<DATE>"
    

    If a public key associated with the username in use is not returned, proceed to the next step. If a public key associated with the username does exist, create a variable for the IMS public key id value in the returned data and then proceed to step 3.

    IMS_PUBLIC_KEY_ID=<PUBLIC_ID>
    
  2. Upload the SSH public key to the IMS service.

    Skip this step if an IMS public key record has already been created for the account being used.

    The IMS debug/configuration shell relies on passwordless SSH. This SSH public key needs to be uploaded to IMS to enable interaction with the image customization environment later in this procedure.

    Replace the username value with the actual username being used on the system when setting the public key name.

    cray ims public-keys create --name "username public key" --public-key ~/.ssh/id_rsa.pub
    

    Example output:

    public_key = "ssh-rsa <PUBLIC_KEY>"
    id = "<PUBLIC_ID>"
    name = "username public key"
    created = "<DATE>"
    

    If successful, create a variable for the IMS public key id value in the returned data.

    IMS_PUBLIC_KEY_ID=<PUBLIC_ID>
    
  3. Confirm the IMS recipe needed to build the image has been created.

    cray ims recipes list | grep $IMS_RECIPE_ID
    

    Example output excerpt:

    id = "<PUBLIC_ID>"
    path = "s3://ims/recipes/<ID>/<IMAGE_RECIPE>.recipe.tgz"
    
  4. Create an IMS job record and start the image creation job.

    After building an image, IMS will automatically upload any build artifacts (root file system, kernel and initrd) to the artifact repository, and associate them with IMS.

    cray ims jobs create \
     --job-type create \
     --image-root-archive-name $IMAGE_ROOT \
     --artifact-id $IMS_RECIPE_ID \
     --public-key-id $IMS_PUBLIC_KEY_ID \
     --enable-debug False
    

    Example output:

    kubernetes_pvc = "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-job-claim"
    kernel_parameters_file_name = "kernel-parameters"
    initrd_file_name = "initrd"
    job_type = "create"
    id = "fee571ee-ed50-4b32-8381-7dc3a01508ce"
    status = "creating"
    kubernetes_namespace = "ims"
    job_mem_size = 8
    kubernetes_job = "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-create"
    artifact_id = "c8be1727-bd59-4639-87a5-f0e2bd05b594"
    require_dkms = false
    arch = "x86_64"
    build_env_size = 30
    enable_debug = false
    created = "2024-07-09T19:17:14.444420+00:00"
    kubernetes_configmap = "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-configmap"
    kubernetes_service = "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-service"
    image_root_archive_name = "cpe-barebones-sles15sp5.x86_64-cpe-24.07-test"
    kernel_file_name = "vmlinuz"
    public_key_id = "c1e5d809-27b6-4aea-8a61-72ed4a070791"
    

    If successful, create variables for the IMS job id and kubernetes_job values in the returned data.

    IMS_JOB_ID=fee571ee-ed50-4b32-8381-7dc3a01508ce
    IMS_KUBERNETES_JOB=cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-create
    
  5. Describe the image create job.

    kubectl -n ims describe job $IMS_KUBERNETES_JOB
    

    Example output:

    Name:             cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-create
    Namespace:        ims
    
    [...]
    
    Events:
      Type     Reason            Age   From            Message
      ----     ------            ----  ----            -------
      Normal   SuccessfulCreate  79s   job-controller  Created pod: cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-create-5w8m9
      Warning  PolicyViolation   49s   kyverno-scan    policy disallow-privileged-containers/autogen-privileged-containers fail: validation error: Privileged mode is disallowed. The fields spec.containers[*].securityContext.privileged and spec.initContainers[*].securityContext.privileged must be unset or set to `false`. rule autogen-privileged-containers failed at path /spec/template/spec/initContainers/3/securityContext/privileged/
    

    If successful, create a variable for the pod name that was created above, displayed in the Events section.

    POD=cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-create-5w8m9
    
  6. Watch the logs from the fetch-recipe, wait-for-repos, build-ca-rpm, build-image, and buildenv-sidecar containers to monitor the image creation process.

    Use kubectl and the returned pod name from the previous step to retrieve this information.
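
    To review all five containers in sequence after the job finishes, a loop like the following can be used (a convenience sketch; the individual commands below stream each container's log in turn):

    for c in fetch-recipe wait-for-repos build-ca-rpm build-image buildenv-sidecar; do
      kubectl -n ims logs $POD -c $c
    done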

    The fetch-recipe container is responsible for fetching the recipe archive from S3 and uncompressing the recipe.

    kubectl -n ims logs -f $POD -c fetch-recipe
    

    Example output:

    INFO:/scripts/fetch.py:IMS_JOB_ID=fee571ee-ed50-4b32-8381-7dc3a01508ce
    INFO:/scripts/fetch.py:Setting job status to 'fetching_recipe'.
    INFO:ims_python_helper:image_set_job_status: {{ims_job_id: fee571ee-ed50-4b32-8381-7dc3a01508ce, job_status: fetching_recipe}}
    INFO:ims_python_helper:PATCH https://api-gw-service-nmn.local/apis/ims/jobs/fee571ee-ed50-4b32-8381-7dc3a01508ce status=fetching_recipe
    INFO:/scripts/fetch.py:Fetching recipe http://rgw.local:8080/ims/recipes/2233c82a-5081-4f67-bec4-4b59a60017a6/my_recipe.tgz?AWSAccessKeyId=GQZKV1HAM80ZFDZJFFS7&Expires=1586891507&Signature=GzRzuTWo3p5CoKHzT2mIuPQXLGM%3D
    INFO:/scripts/fetch.py:Saving file as '/mnt/recipe/recipe.tgz'
    INFO:/scripts/fetch.py:Verifying md5sum of the downloaded file.
    INFO:/scripts/fetch.py:Successfully verified the md5sum of the downloaded file.
    INFO:/scripts/fetch.py:Uncompressing recipe into /mnt/recipe
    INFO:/scripts/fetch.py:Deleting compressed recipe /mnt/recipe/recipe.tgz
    INFO:/scripts/fetch.py:Done
    

    The wait-for-repos container will ensure that any HTTP/HTTPS repositories referenced by the Kiwi-NG recipe can be accessed and are available. This helps ensure that the image will be built successfully. If 301 responses are returned instead of 200 responses, that does not indicate an error.

    kubectl -n ims logs -f $POD -c wait-for-repos
    

    Example output:

    2019-05-17 09:53:47,381 - INFO    - __main__ - Recipe contains the following repos: ['http://api-gw-service-nmn.local/repositories/sle15-Module-Basesystem/', 'http://api-gw-service-nmn.local/repositories/sle15-Product-SLES/', 'http://api-gw-service-nmn.local/repositories/cray-sle15']
    2019-05-17 09:53:47,381 - INFO    - __main__ - Attempting to get http://api-gw-service-nmn.local/repositories/sle15-Module-Basesystem/repodata/repomd.xml
    2019-05-17 09:53:47,404 - INFO    - __main__ - 200 response getting http://api-gw-service-nmn.local/repositories/sle15-Module-Basesystem/repodata/repomd.xml
    2019-05-17 09:53:47,404 - INFO    - __main__ - Attempting to get http://api-gw-service-nmn.local/repositories/sle15-Product-SLES/repodata/repomd.xml
    2019-05-17 09:53:47,431 - INFO    - __main__ - 200 response getting http://api-gw-service-nmn.local/repositories/sle15-Product-SLES/repodata/repomd.xml
    2019-05-17 09:53:47,431 - INFO    - __main__ - Attempting to get http://api-gw-service-nmn.local/repositories/cray-sle15/repodata/repomd.xml
    2019-05-17 09:53:47,458 - INFO    - __main__ - 200 response getting http://api-gw-service-nmn.local/repositories/cray-sle15/repodata/repomd.xml
    

    The build-ca-rpm container creates an RPM with the private root CA certificate for the system. This RPM is installed automatically by Kiwi-NG to ensure that Kiwi can securely talk to the Nexus repositories when building the image root.

    kubectl -n ims logs -f $POD -c build-ca-rpm
    

    Example output:

    cray_ca_cert-1.0.1/
    cray_ca_cert-1.0.1/etc/
    cray_ca_cert-1.0.1/etc/cray/
    cray_ca_cert-1.0.1/etc/cray/ca/
    cray_ca_cert-1.0.1/etc/cray/ca/certificate_authority.crt
    Building target platforms: noarch
    Building for target noarch
    Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.mFCHpo
    + umask 022
    + cd /root/rpmbuild/BUILD
    + cd /root/rpmbuild/BUILD
    + rm -rf cray_ca_cert-1.0.1
    + /bin/gzip -dc /root/rpmbuild/SOURCES/cray_ca_cert-1.0.1.tar.gz
    + /bin/tar -xof -
    + STATUS=0
    + '[' 0 -ne 0 ]
    + cd cray_ca_cert-1.0.1
    + /bin/chmod -Rf a+rX,u+w,g-w,o-w .
    + RPM_EC=0
    + jobs -p
    + exit 0
    Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.HoCJDo
    + umask 022
    + cd /root/rpmbuild/BUILD
    + cd cray_ca_cert-1.0.1
    + RPM_EC=0
    + jobs -p
    + exit 0
    Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.dMaoEk
    + umask 022
    + cd /root/rpmbuild/BUILD
    + cd cray_ca_cert-1.0.1
    + install -d '/root/rpmbuild/BUILDROOT/cray_ca_cert-1.0.1-1.%{_arch}/usr/share/pki/trust/anchors'
    + install -m 644 /etc/cray/ca/certificate_authority.crt '/root/rpmbuild/BUILDROOT/cray_ca_cert-1.0.1-1.%{_arch}/usr/share/pki/trust/anchors/cray_certificate_authority.crt'
    + RPM_EC=0
    + jobs -p
    + exit 0
    Processing files: cray_ca_cert-1.0.1-1.noarch
    Provides: cray_ca_cert = 1.0.1-1
    Requires(interp): /bin/sh
    Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1 rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PayloadFilesHavePrefix) <= 4.0-1
    Requires(post): /bin/sh
    Checking for unpackaged file(s): /usr/lib/rpm/check-files /root/rpmbuild/BUILDROOT/cray_ca_cert-1.0.1-1.%{_arch}
    Wrote: /root/rpmbuild/RPMS/noarch/cray_ca_cert-1.0.1-1.noarch.rpm
    Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.FpNpgK
    + umask 022
    + cd /root/rpmbuild/BUILD
    + cd cray_ca_cert-1.0.1
    + RPM_EC=0
    + jobs -p
    + exit 0
    

    The build-image container builds the recipe using the Kiwi-NG tool.

    kubectl -n ims logs -f $POD -c build-image
    

    Example output:

    + RECIPE_ROOT_PARENT=/mnt/recipe
    + IMAGE_ROOT_PARENT=/mnt/image
    + PARAMETER_FILE_BUILD_FAILED=/mnt/image/build_failed
    + PARAMETER_FILE_KIWI_LOGFILE=/mnt/image/kiwi.log
    
    [...]
    
    + kiwi-ng --logfile=/mnt/image/kiwi.log --type tbz system build --description /mnt/recipe --target /mnt/image
    [ INFO    ]: 16:14:31 | Loading XML description
    [ INFO    ]: 16:14:31 | --> loaded /mnt/recipe/config.xml
    [ INFO    ]: 16:14:31 | --> Selected build type: tbz
    [ INFO    ]: 16:14:31 | Preparing new root system
    [ INFO    ]: 16:14:31 | Setup root directory: /mnt/image/build/image-root
    [ INFO    ]: 16:14:31 | Setting up repository http://api-gw-service-nmn.local/repositories/sle15-Module-Basesystem/
    [ INFO    ]: 16:14:31 | --> Type: rpm-md
    [ INFO    ]: 16:14:31 | --> Translated: http://api-gw-service-nmn.local/repositories/sle15-Module-Basesystem/
    [ INFO    ]: 16:14:31 | --> Alias: SLES15_Module_Basesystem
    [ INFO    ]: 16:14:32 | Setting up repository http://api-gw-service-nmn.local/repositories/sle15-Product-SLES/
    [ INFO    ]: 16:14:32 | --> Type: rpm-md
    [ INFO    ]: 16:14:32 | --> Translated: http://api-gw-service-nmn.local/repositories/sle15-Product-SLES/
    [ INFO    ]: 16:14:32 | --> Alias: SLES15_Product_SLES
    [ INFO    ]: 16:14:32 | Setting up repository http://api-gw-service-nmn.local/repositories/cray-sle15
    [ INFO    ]: 16:14:32 | --> Type: rpm-md
    [ INFO    ]: 16:14:32 | --> Translated: http://api-gw-service-nmn.local/repositories/cray-sle15
    [ INFO    ]: 16:14:32 | --> Alias: DST_built_rpms
    
    [...]
    
    [ INFO    ]: 16:19:19 | Calling images.sh script
    [ INFO    ]: 16:19:55 | Creating system image
    [ INFO    ]: 16:19:55 | Creating XZ compressed tar archive
    [ INFO    ]: 16:21:31 | --> Creating archive checksum
    [ INFO    ]: 16:21:51 | Export rpm packages metadata
    [ INFO    ]: 16:21:51 | Export rpm verification metadata
    [ INFO    ]: 16:22:09 | Result files:
    [ INFO    ]: 16:22:09 | --> image_packages: /mnt/image/cray-sles15-barebones.x86_64-1.0.1.packages
    [ INFO    ]: 16:22:09 | --> image_verified: /mnt/image/cray-sles15-barebones.x86_64-1.0.1.verified
    [ INFO    ]: 16:22:09 | --> root_archive: /mnt/image/cray-sles15-barebones.x86_64-1.0.1.tar.xz
    [ INFO    ]: 16:22:09 | --> root_archive_md5: /mnt/image/cray-sles15-barebones.x86_64-1.0.1.md5
    + rc=0
    + '[' 0 -ne 0 ']'
    + exit 0
    

    The buildenv-sidecar container determines if the Kiwi-NG build was successful or not.

  • If the Kiwi-NG build completed successfully, the image root, kernel, and initrd artifacts are uploaded to the artifact repository.

  • If the Kiwi-NG build failed to complete successfully, an optional SSH debug shell is enabled so the image build can be debugged.

kubectl -n ims logs -f $POD -c buildenv-sidecar

Example output:

Not running user shell for successful create action
Copying SMS CA Public Certificate to target image root
+ IMAGE_ROOT_PARENT=/mnt/image
+ IMAGE_ROOT_DIR=/mnt/image/build/image-root
+ KERNEL_FILENAME=vmlinuz
+ INITRD_FILENAME=initrd
+ IMAGE_ROOT_ARCHIVE_NAME=sles15_barebones_image
+ echo Copying SMS CA Public Certificate to target image root
+ mkdir -p /mnt/image/build/image-root/etc/cray
+ cp -r /etc/cray/ca /mnt/image/build/image-root/etc/cray/
+ mksquashfs /mnt/image/build/image-root /mnt/image/sles15_barebones_image.sqsh
Parallel mksquashfs: Using 4 processors
Creating 4.0 filesystem on /mnt/image/sles15_barebones_image.sqsh, block size 131072.
[===========================================================\] 26886/26886 100%

Exportable Squashfs 4.0 filesystem, gzip compressed, data block size 131072
    compressed data, compressed metadata, compressed fragments, compressed xattrs

[...]

+ python -m ims_python_helper image upload_artifacts sles15_barebones_image 7de80ccc-1e7d-43a9-a6e4-02cad10bb60b \
        -v -r /mnt/image/sles15_barebones_image.sqsh -k /mnt/image/image-root/boot/vmlinuz \
        -i /mnt/image/image-root/boot/initrd
{
    "ims_image_artifacts": [
        {
            "link": {
                "etag": "4add976679c7e955c4b16d7e2cfa114e-32",
                "path": "s3://boot-images/d88521c3-b339-43bc-afda-afdfda126388/rootfs",
                "type": "s3"
            },
            "md5": "94165af4373e5ace3e817eb4baba2284",
            "type": "application/vnd.cray.image.rootfs.squashfs"
        },
        {
            "link": {
                "etag": "f836412241aae79d160556ed6a4eb4d4",
                "path": "s3://boot-images/d88521c3-b339-43bc-afda-afdfda126388/kernel",
                "type": "s3"
            },
            "md5": "f836412241aae79d160556ed6a4eb4d4",
            "type": "application/vnd.cray.image.kernel"
        },
        {
            "link": {
                "etag": "ec8793c07f94e59a2a30abdb1bd3d35a-4",
                "path": "s3://boot-images/d88521c3-b339-43bc-afda-afdfda126388/initrd",
                "type": "s3"
            },
            "md5": "86832ee3977ca0515592e5d00271d2fe",
            "type": "application/vnd.cray.image.initrd"
        },
        {
            "link": {
                "etag": "13af343f3e76b0f8c7fbef7ee3588ac1",
                "path": "s3://boot-images/d88521c3-b339-43bc-afda-afdfda126388/manifest.json",
                "type": "s3"
            },
            "md5": "13af343f3e76b0f8c7fbef7ee3588ac1",
            "type": "application/json"
        }
    ],
    "ims_image_record": {
        "created": "2018-12-17T22:59:43.264129+00:00",
        "id": "d88521c3-b339-43bc-afda-afdfda126388",
        "name": "sles15_barebones_image"
        "link": {
            "etag": "13af343f3e76b0f8c7fbef7ee3588ac1",
            "path": "s3://boot-images/d88521c3-b339-43bc-afda-afdfda126388/manifest.json",
            "type": "s3"
        },
    },
    "ims_job_record": {
        "artifact_id": "2233c82a-5081-4f67-bec4-4b59a60017a6",
        "build_env_size": 10,
        "created": "2018-11-21T18:22:53.409405+00:00",
        "enable_debug": false,
        "id": "fee571ee-ed50-4b32-8381-7dc3a01508ce",
        "image_root_archive_name": "sles15_barebones_image",
        "initrd_file_name": "initrd",
        "job_type": "create",
        "arch": "x86_64"
        "require_dkms": False
        "kernel_file_name": "vmlinuz",
        "kubernetes_configmap": "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-configmap",
        "kubernetes_job": "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-create",
        "kubernetes_service": "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-service",
        "public_key_id": "a252ff6f-c087-4093-a305-122b41824a3e",
        "resultant_image_id": "d88521c3-b339-43bc-afda-afdfda126388",
        "ssh_port": 0,
        "status": "creating"
    },
    "result": "success"
}

[...]

IMPORTANT: The IMS image creation workflow automatically copies the NCN Certificate Authority’s public certificate to /etc/cray/ca/certificate_authority.crt within the image root being built. This can be used to enable secure communications between the NCN and the client node.

If the image creation operation fails, the build artifacts will not be uploaded to S3. If enable_debug is set to true, then the IMS creation job will enable a debug SSH shell that is accessible by one or more dynamic host names. The user needs to know if they will SSH from inside or outside the Kubernetes cluster to determine which host name to use. Typically, customers access the system from outside the Kubernetes cluster using the Customer Access Network (CAN).

  7. If no errors are observed, skip ahead to step 16 ("Verify that the new image was created correctly").

    Otherwise, proceed to the following step to debug the failure.

  8. Use the IMS_JOB_ID to look up the ID of the newly created image.

    There may be multiple records returned. Ensure that the correct record is selected in the returned data.

    cray ims jobs describe $IMS_JOB_ID
    

    Example output:

    status = "waiting_on_user"
    enable_debug = false
    kernel_file_name = "vmlinuz"
    artifact_id = "4e78488d-4d92-4675-9d83-97adfc17cb19"
    build_env_size = 10
    job_type = "create"
    kubernetes_service = "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-service"
    kubernetes_job = "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-create"
    id = "fee571ee-ed50-4b32-8381-7dc3a01508ce"
    image_root_archive_name = "my_customized_image"
    initrd_file_name = "initrd"
    created = "2018-11-21T18:22:53.409405+00:00"
    kubernetes_namespace = "ims"
    arch = "x86_64"
    require_dkms = false
    public_key_id = "a252ff6f-c087-4093-a305-122b41824a3e"
    kubernetes_configmap = "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-configmap"
    [[ssh_containers]]
    status = "pending"
    jail = false
    name = "debug"
    
    [ssh_containers.connection_info."cluster.local"]
    host = "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-service.ims.svc.cluster.local"
    port = 22
    [ssh_containers.connection_info.customer_access]
    host = "fee571ee-ed50-4b32-8381-7dc3a01508ce.ims.cmn.shasta.cray.com" <<-- Note this host
    port = 22 <<-- Note this port
    

    If successful, create variables for the SSH connection information.

    IMS_SSH_HOST=fee571ee-ed50-4b32-8381-7dc3a01508ce.ims.cmn.shasta.cray.com
    IMS_SSH_PORT=22
    
  9. Connect to the IMS debug shell.

    To access the debug shell, SSH to the container using the private key that matches the public key used to create the IMS job.

    IMPORTANT: The following command will not work when run on a node within the Kubernetes cluster.

    ssh -p $IMS_SSH_PORT root@$IMS_SSH_HOST
    

    Example output:

    Last login: Tue Sep  4 18:06:27 2018 from gateway
    [root@POD ~]#
    
  10. Investigate using the IMS debug shell.

  11. Change to the /mnt/image/ directory.

    [root@POD ~]# cd /mnt/image/
    
  12. Access the image root.

    [root@POD image]# chroot image-root/
    
  13. Investigate inside the image debug shell.

  14. Exit the image root.

    :/ # exit
    [root@POD image]#
    
  15. Touch the complete file once investigations are complete.

    [root@POD image]# touch /mnt/image/complete
    
  16. Verify that the new image was created correctly.

    cray ims jobs describe $IMS_JOB_ID
    

    Example output:

    status = "success"
    enable_debug = false
    kernel_file_name = "vmlinuz"
    artifact_id = "2233c82a-5081-4f67-bec4-4b59a60017a6"
    build_env_size = 10
    job_type = "create"
    kubernetes_service = "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-service"
    kubernetes_job = "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-customize"
    id = "fee571ee-ed50-4b32-8381-7dc3a01508ce"
    image_root_archive_name = "sles15_barebones_image"
    resultant_image_id = "d88521c3-b339-43bc-afda-afdfda126388"
    initrd_file_name = "initrd"
    arch = "x86_64"
    require_dkms = false
    created = "2018-11-21T18:22:53.409405+00:00"
    kubernetes_namespace = "ims"
    public_key_id = "a252ff6f-c087-4093-a305-122b41824a3e"
    kubernetes_configmap = "cray-ims-fee571ee-ed50-4b32-8381-7dc3a01508ce-configmap"
    

    If successful, create a variable for the IMS resultant_image_id.

    IMS_RESULTANT_IMAGE_ID=d88521c3-b339-43bc-afda-afdfda126388
    
  17. Verify that the new IMS image record exists.

    cray ims images describe $IMS_RESULTANT_IMAGE_ID
    

    Example output:

    created = "2018-12-17T22:59:43.264129+00:00"
    id = "d88521c3-b339-43bc-afda-afdfda126388"
    name = "sles15_barebones_image"
    
    [link]
    path = "s3://boot-images/d88521c3-b339-43bc-afda-afdfda126388/manifest.json"
    etag = "180883770442235de747e9d69855f269"
    type = "s3"
    
  18. Delete the IMS job record.

    cray ims jobs delete $IMS_JOB_ID
    

    Deleting the job record will delete the underlying Kubernetes job, service, and ConfigMap that were created when the job record was submitted.

    Jobs created with the flag --enable-debug true will remain in a ‘Running’ state and continue to consume Kubernetes resources until the job is manually completed, or deleted. If there are enough ‘Running’ IMS jobs on the system it may not be possible to schedule more pods on worker nodes.
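
    To spot lingering IMS jobs that may still be consuming resources, the jobs in the ims namespace can be listed with standard kubectl:

    kubectl -n ims get jobs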

  19. The image created by IMS contains some content that should be removed.

    1. Get the image created by IMS

      cray artifacts get boot-images $IMS_RESULTANT_IMAGE_ID/rootfs rootfs
      
    2. Mount the SquashFS file.

      mkdir mnt
      mount -t squashfs rootfs mnt -o ro,loop
      
    3. Create a new SquashFS, excluding content that is not needed for CPE.

      (cd mnt; mksquashfs . ../CPE-base.$ARCH-<CPE_VERSION>.squashfs -comp xz -Xbcj x86 -no-xattrs -wildcards -e '!(opt)' 'opt/!(cray)' 'opt/cray/!(pe)' )
      
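      Once the new SquashFS has been written, the read-only loop mount can be released (tidy-up matching the mnt directory created above):

      umount mnt
      rmdir mnt
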
    4. Create a new IMS image registration and save the id field in an environment variable.

      cray ims images create --name CPE-base-$ARCH-<CPE_VERSION> --format toml
      

      Example output:

      created = "2024-07-12T18:03:01.854467+00:00"
      id = "ef21fbd7-a47e-44e1-a64c-7cd2033bbcb2"
      arch = "x86_64"
      name = "CPE-base-x86_64-24.07-test"
      

      Keep track of the new image id:

      export NEW_IMAGE_ID=ef21fbd7-a47e-44e1-a64c-7cd2033bbcb2
      
    5. Upload the new image to S3 using the ID from the previous step.

      cray artifacts create boot-images ${NEW_IMAGE_ID}/rootfs CPE-base.$ARCH-<CPE_VERSION>.squashfs --format toml
      

      Example output:

      artifact = "ef21fbd7-a47e-44e1-a64c-7cd2033bbcb2/rootfs"
      Key = "ef21fbd7-a47e-44e1-a64c-7cd2033bbcb2/rootfs"
      
    6. Get the S3 generated etag value for the uploaded artifact.

      Display S3 values for uploaded image.

      cray artifacts describe boot-images ${NEW_IMAGE_ID}/rootfs --format toml
      

      Example output:

      [artifact]
      AcceptRanges = "bytes"
      LastModified = "2024-07-12T18:04:57+00:00"
      ContentLength = 2891317248
      ETag = "\"451ffdf92a8286b60ab180cd4b2dbe50-345\""
      ContentType = "binary/octet-stream"
      
      [artifact.Metadata]
      md5sum = "4d879b8b57585aca50da53b739e6be47"
      

      Note that when adding the etag to the IMS manifest below, remove the quotation marks from the etag value. So, for the above artifact, the etag would be 451ffdf92a8286b60ab180cd4b2dbe50-345.

      NEW_ETAG=451ffdf92a8286b60ab180cd4b2dbe50-345
      
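      The same value can be captured without manual editing (a sketch that assumes the JSON output of cray artifacts describe mirrors the TOML structure shown above):

      NEW_ETAG=$(cray artifacts describe boot-images ${NEW_IMAGE_ID}/rootfs --format json | jq -r '.artifact.ETag' | tr -d '"')
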
    7. Obtain the md5sum of the SquashFS image. (The kernel and initrd stanzas will be removed from the manifest later, so only the SquashFS checksum is needed.)

      md5sum CPE-base.$ARCH-<CPE_VERSION>.squashfs
      

      Example output:

      4d879b8b57585aca50da53b739e6be47  CPE-base.x86_64-24.07-test.squashfs
      
    8. Print out all the IMS details about the current image.

      Use the IMS image ID saved earlier in $IMS_RESULTANT_IMAGE_ID.

      cray ims images describe $IMS_RESULTANT_IMAGE_ID --format toml
      

      Example output:

      arch = "x86_64"
      created = "2024-07-11T22:06:49.075885+00:00"
      id = "4dc925c6-8cb0-4e04-b59c-1388f8bc7a9b"
      name = "cpe-barebones-sles15sp5.x86_64-cpe-24.07-sles15-sp5-test"
      
      [link]
      etag = "07c0f55e67a8ae1765e07342e56cac44"
      path = "s3://boot-images/4dc925c6-8cb0-4e04-b59c-1388f8bc7a9b/manifest.json"
      type = "s3"
      
    9. Use the path of the manifest.json file to download that JSON to a local file.

      cray artifacts get boot-images $IMS_RESULTANT_IMAGE_ID/manifest.json old-manifest.json
      cat old-manifest.json
      

      Example output:

      {
          "artifacts": [
              {
                  "link": {
                      "etag": "afb54e51d1237224a18bd515db8ef3f2-521",
                      "path": "s3://boot-images/4dc925c6-8cb0-4e04-b59c-1388f8bc7a9b/rootfs",
                      "type": "s3"
                  },
                  "md5": "fa70be6ef1e73001abe25f82bc2b754b",
                  "type": "application/vnd.cray.image.rootfs.squashfs"
              },
              {
                  "link": {
                      "etag": "c4cb2ced940b5413d5a70dac763ad80d-2",
                      "path": "s3://boot-images/4dc925c6-8cb0-4e04-b59c-1388f8bc7a9b/kernel",
                      "type": "s3"
                  },
                  "md5": "c62fddb31e9e5be2013ffa540db8f39c",
                  "type": "application/vnd.cray.image.kernel"
              },
              {
                  "link": {
                      "etag": "d41d8cd98f00b204e9800998ecf8427e",
                      "path": "s3://boot-images/4dc925c6-8cb0-4e04-b59c-1388f8bc7a9b/initrd",
                      "type": "s3"
                  },
                  "md5": "d41d8cd98f00b204e9800998ecf8427e",
                  "type": "application/vnd.cray.image.initrd"
              }
          ],
          "created": "2024-07-11 22:07:29.260078",
          "version": "1.0"
      }
      

      Alternatively, a manifest.json can be created from scratch.

    10. Modify manifest.json.

      1. Edit the new-manifest.json file.

        cp old-manifest.json new-manifest.json
        vi new-manifest.json
        
        1. Remove the stanzas for the kernel and initrd.

        2. The rootfs stanza should not end with a comma character, since the later stanzas for kernel and initrd have been removed.

        3. Update the value of the created field in the manifest with the output of this command:

          date '+%Y%m%d%H%M%S'
          
        4. Replace the path, md5, and etag values of the rootfs with the values obtained in substeps above.

      2. Verify that the modified JSON file is still valid.

        cat new-manifest.json | jq
        
      3. Upload the updated manifest.json file.

        cray artifacts create boot-images ${NEW_IMAGE_ID}/manifest.json new-manifest.json
        
      4. Update the IMS image to use the new-manifest.json file.

        cray ims images update ${NEW_IMAGE_ID} \
                --link-type s3 --link-path s3://boot-images/${NEW_IMAGE_ID}/manifest.json \
                --link-etag $NEW_ETAG --format toml
        

        Example output:

        arch = "x86_64"
        created = "2024-07-12T18:03:01.854467+00:00"
        id = "ef21fbd7-a47e-44e1-a64c-7cd2033bbcb2"
        name = "CPE-base-x86_64-24.07-test"
        
        [link]
        etag = "451ffdf92a8286b60ab180cd4b2dbe50-345"
        path = "s3://boot-images/ef21fbd7-a47e-44e1-a64c-7cd2033bbcb2/manifest.json"
        type = "s3"
        

Update CPE deployment

  1. Update pe_deploy.yml in cpe-config-management repository in VCS.

    This section assumes that the integration-23.12.3 branch has been checked out from the cpe-config-management repository in VCS. Use the most appropriate recent version.

    Change to the directory where the git checkout was done.

    1. Copy pe_deploy.yml.

      cp -p pe_deploy.yml pe_deploy.yml.orig
      
    2. In the roles section, add a new line similar to the one shown below, using the IMS image ID of the recently built SquashFS ($NEW_IMAGE_ID) as the img_id value and the image name as img_name. The img_name value is descriptive only.

      vi pe_deploy.yml
      diff pe_deploy.yml.orig pe_deploy.yml
      

      Example output:

      41a42
      >     - { role: cray.pe_deploy, img_name: CPE-base.x86_64-24.07-test, img_id: ef21fbd7-a47e-44e1-a64c-7cd2033bbcb2, when: not cray_cfs_image }
      
    3. Obtain the password for the crayvcs user from the Kubernetes secret for use in the git pull command.

      kubectl get secret -n services vcs-user-credentials --template={{.data.vcs_password}} | base64 --decode && echo
      
    4. Commit the change to VCS.

      git add pe_deploy.yml
      git commit -m "Adding new CPE image CPE-base.<IMAGE_NAME>"
      git push
      

      The pe_deploy.yml.orig file can be removed if no longer needed.

    5. Find the git commit ID.

      git log | head
      

      Example output:

      commit fb297b8d51ded75212fe75dfc417e681fc5895ab (HEAD -> integration-23.12.3, origin/integration-23.12.3)
      Author: root <root@creek-ncn-m002.local>
      Date:   Fri Jul 12 19:11:40 2024 +0000
      
          Adding new CPE image CPE-base.x86_64-24.07-test
      
    6. Update the CFS configuration applied to compute nodes.

      cray cfs configurations describe --format json compute-<CPE_VERSION> > compute-<CPE_VERSION>.json
      vi compute-<CPE_VERSION>.json
      
      • Add the new git commit ID.

      • Remove the lines that contain lastUpdated and name.

      • Remember to remove the comma on the line above the deleted name line.
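
      Alternatively, both fields can be stripped in one step with jq (a sketch, assuming jq is available; the resulting file can then be edited to add the commit ID before the update). The same approach applies to the UAN configuration below.

      jq 'del(.lastUpdated, .name)' compute-<CPE_VERSION>.json > compute-<CPE_VERSION>-new.json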

    7. Check validity with jq.

      cat compute-<CPE_VERSION>.json | jq
      
    8. Update the CFS configuration for compute nodes.

      cray cfs configurations update --format json compute-<CPE_VERSION> --file compute-<CPE_VERSION>.json
      
    9. Check the configuration being applied to compute nodes.

      kubectl -n services --sort-by=.metadata.creationTimestamp get pods | grep cfs
      
    10. Update the CFS configuration applied to UANs.

      If the configuration name used in this command is applied to booted nodes, CFS will start configuring the booted nodes.

      cray cfs configurations describe --format json uan-<CPE_VERSION> > uan-<CPE_VERSION>.json
      vi uan-<CPE_VERSION>.json
      
      • Add the new git commit ID.

      • Remove the lines that contain lastUpdated and name.

      • Remember to remove the comma on the line above the deleted name line.

    11. Check validity with jq.

      cat uan-<CPE_VERSION>.json | jq
      
    12. Update the CFS configuration for UANs.

      If the configuration name used in this command is applied to booted nodes, CFS will start configuring the booted nodes.

      cray cfs configurations update --format json uan-<CPE_VERSION> --file uan-<CPE_VERSION>.json
      
    13. Check the configuration being applied to the UANs.

      kubectl -n services --sort-by=.metadata.creationTimestamp get pods | grep cfs